cnn-prelu-imagenet

A Convolutional Neural Network (CNN) trained on ImageNet-1k with PReLU activation.

Repository: github.com/chrisjob1021/model-examples

Model Description

This is a ResNet-style CNN architecture featuring:

  • Activation Function: PReLU
  • Number of Classes: 1000
  • Architecture: Deep residual network with bottleneck blocks
  • Training Dataset: ImageNet-1k
  • Code: See cnn/ directory in the repository for training scripts and model implementation

Key Features

  • Residual connections for better gradient flow
  • Batch normalization for training stability
  • PReLU activation for learnable non-linearity (see the sketch after this list)
  • Manual and built-in convolution implementations for educational purposes
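
For reference, PReLU computes f(x) = max(0, x) + a · min(0, x), where the negative slope a is learned during training (per channel or shared). A minimal sketch using PyTorch's built-in module:

import torch
import torch.nn as nn

# PReLU: f(x) = max(0, x) + a * min(0, x), with learnable slope a.
act = nn.PReLU(num_parameters=64, init=0.25)  # one learnable slope per channel
x = torch.randn(1, 64, 56, 56)
y = act(x)  # negative inputs are scaled by the learned slope instead of zeroed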

Training Details

  • Epochs: 300
  • Global Steps: 375,300
  • Final Training Loss: 0.8986
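
As a sanity check, 375,300 steps over 300 epochs works out to 1,251 optimizer steps per epoch; at the effective batch size of 1024 (see Training Procedure below), that is roughly 1.28M images per epoch, consistent with the size of the ImageNet-1k training set.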

Evaluation Results

  • Top-1 Accuracy: 78.01%
  • Top-5 Accuracy: 93.89%
  • Top-1 Error: 21.99%
  • Top-5 Error: 6.11%
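
Top-k accuracy counts a prediction as correct when the true label appears among the model's k highest-scoring classes. A minimal sketch of how such numbers are computed (the helper name is illustrative):

import torch

def topk_accuracy(logits: torch.Tensor, targets: torch.Tensor, ks=(1, 5)):
    # logits: (N, 1000) class scores; targets: (N,) integer labels
    _, pred = logits.topk(max(ks), dim=1)    # (N, max_k) top class ids
    correct = pred.eq(targets.unsqueeze(1))  # (N, max_k) hit matrix
    return [correct[:, :k].any(dim=1).float().mean().item() for k in ks]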

Comparison with ImageNet Benchmarks

Model          Top-1     Top-5     Parameters   Year   Notes
AlexNet        57.0%     80.3%     60M          2012   First deep CNN
VGG-16         71.5%     90.1%     138M         2014   Deep with small filters
ResNet-50      76.0%     93.0%     25M          2015   Baseline
ResNet-152     78.3%     94.3%     60M          2015   Deeper variant
Inception-v3   78.0%     93.9%     24M          2015   Multi-scale
This model     78.01%    93.89%    ~23M         2025   PReLU

Key Achievement: +2.01 percentage points in Top-1 accuracy over the ResNet-50 baseline

Usage

from prelu_cnn import CNN

# Load the model
model = CNN.from_pretrained(
    "your-username/cnn-prelu-imagenet",
    use_prelu=True,
    num_classes=1000
)

# Use for inference
import torch
from torchvision import transforms
from PIL import Image

# Prepare image
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

image = Image.open("path/to/image.jpg").convert("RGB")  # ensure 3 channels
input_tensor = transform(image).unsqueeze(0)

# Get prediction
model.eval()
with torch.no_grad():
    output = model(input_tensor)
    probabilities = torch.nn.functional.softmax(output[0], dim=0)
    top5_prob, top5_catid = torch.topk(probabilities, 5)
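
To map the predicted class indices to human-readable labels, load an ImageNet-1k class list in index order (the file name below is an assumption; use whatever label file accompanies your setup):

# Hypothetical label file: one ImageNet-1k class name per line, in index order.
with open("imagenet_classes.txt") as f:
    categories = [line.strip() for line in f]

for prob, catid in zip(top5_prob, top5_catid):
    print(f"{categories[catid]}: {prob.item():.4f}")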

Training Procedure

This model was trained on ImageNet-1k with the optimization, augmentation, and regularization settings below:

Optimization

  • Optimizer: AdamW (weight_decay=0.02)
  • Learning Rate: 0.1 with 10-epoch warmup
  • Schedule: Cosine annealing
  • Batch Size: 1024 effective (512 per GPU × 2 gradient accumulation steps)
  • Mixed Precision: fp16 for efficiency
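
A minimal sketch of this optimization setup in PyTorch, assuming the scheduler is stepped once per epoch (the model variable and the warmup start factor are stand-ins; the repository's training script may differ):

import torch
from torch.optim.lr_scheduler import CosineAnnealingLR, LinearLR, SequentialLR

model = torch.nn.Linear(2048, 1000)  # stand-in for the CNN
optimizer = torch.optim.AdamW(model.parameters(), lr=0.1, weight_decay=0.02)

warmup_epochs, total_epochs = 10, 300
scheduler = SequentialLR(
    optimizer,
    schedulers=[
        LinearLR(optimizer, start_factor=0.01, total_iters=warmup_epochs),  # warmup
        CosineAnnealingLR(optimizer, T_max=total_epochs - warmup_epochs),   # cosine decay
    ],
    milestones=[warmup_epochs],
)

for epoch in range(total_epochs):
    # ... one epoch of fp16 training with gradient accumulation ...
    scheduler.step()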

Data Augmentation

  • MixUp (α=0.2, prob=50%): Linearly combines pairs of images and labels [Zhang et al., 2017]; see the sketch after this list
  • CutMix (α=1.0, prob=50%): Replaces image regions with patches from other images [Yun et al., 2019]
  • RandAugment (magnitude=9, std=0.5): Automated augmentation policy [Cubuk et al., 2020]
  • Standard: RandomResizedCrop (224×224), random horizontal flip, color jitter, random erasing (prob=0.25)
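
A minimal sketch of MixUp under these settings (the 50% application probability is left to the caller; the helper name is illustrative):

import torch
import torch.nn.functional as F

def mixup(x, y, alpha=0.2, num_classes=1000):
    # x: (N, C, H, W) batch of images; y: (N,) integer labels
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    mixed_x = lam * x + (1 - lam) * x[perm]    # blend image pairs
    y1 = F.one_hot(y, num_classes).float()
    mixed_y = lam * y1 + (1 - lam) * y1[perm]  # blend labels the same way
    return mixed_x, mixed_y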

Regularization

  • Stochastic depth (drop_path_rate=0.1); see the sketch after this list
  • Label smoothing (via MixUp/CutMix)
  • Weight decay
  • Batch normalization
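
A minimal sketch of stochastic depth (DropPath), which randomly drops an entire residual branch per sample during training:

import torch

def drop_path(x: torch.Tensor, drop_prob: float, training: bool) -> torch.Tensor:
    # Zero the whole residual branch for a random subset of samples,
    # rescaling the survivors so the expected output is unchanged.
    if drop_prob == 0.0 or not training:
        return x
    keep_prob = 1.0 - drop_prob
    mask = torch.rand(x.shape[0], *([1] * (x.ndim - 1)), device=x.device) < keep_prob
    return x * mask.to(x.dtype) / keep_prob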

Model Architecture

ResNet-50 with PReLU

CNN(
  conv1: [ConvAct(3 → 64, 7×7, stride=2) + MaxPool(3×3, stride=2)]
    Input: 224×224 → 112×112 → 56×56

  conv2_x: 3× BottleneckBlock(64 → 64 → 256)
    56×56 (no downsampling)

  conv3_x: 4× BottleneckBlock(256 → 128 → 512)
    56×56 → 28×28 (first block stride=2)

  conv4_x: 6× BottleneckBlock(512 → 256 → 1024)
    28×28 → 14×14 (first block stride=2)

  conv5_x: 3× BottleneckBlock(1024 → 512 → 2048)
    14×14 → 7×7 (first block stride=2)

  avgpool: AdaptiveAvgPool2d(1×1)
    7×7 → 1×1

  fc: Linear(2048 → 1000)
)

Total Layers: 50 (1 + 3×3 + 4×3 + 6×3 + 3×3 = 49 conv + 1 fc)

Key Features

  • PReLU Activation: Learnable negative slope for adaptive non-linearity
  • Bottleneck Blocks: 1×1 → 3×3 → 1×1 design (4× parameter reduction); see the sketch after this list
  • Residual Connections: Skip connections for deep network training
  • ReZero Scaling: Learnable residual scaling (initialized at 0)
  • Stochastic Depth: Linear decay DropPath (0.0 → 0.1)
  • Batch Normalization: Momentum=0.01 for stable statistics
  • Global Average Pooling: Spatial invariance, zero parameters
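
A minimal sketch of how these pieces might fit together in one bottleneck block (an illustration of the features above, not the repository's exact implementation):

import torch
import torch.nn as nn

class Bottleneck(nn.Module):
    # 1x1 -> 3x3 -> 1x1 bottleneck with PReLU, BatchNorm (momentum=0.01),
    # and ReZero-style residual scaling initialized at zero.
    def __init__(self, in_ch, mid_ch, out_ch, stride=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 1, bias=False),
            nn.BatchNorm2d(mid_ch, momentum=0.01),
            nn.PReLU(mid_ch),
            nn.Conv2d(mid_ch, mid_ch, 3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(mid_ch, momentum=0.01),
            nn.PReLU(mid_ch),
            nn.Conv2d(mid_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch, momentum=0.01),
        )
        self.alpha = nn.Parameter(torch.zeros(1))  # ReZero: residual branch starts "off"
        self.shortcut = (
            nn.Identity()
            if stride == 1 and in_ch == out_ch
            else nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch, momentum=0.01),
            )
        )
        self.act = nn.PReLU(out_ch)

    def forward(self, x):
        return self.act(self.shortcut(x) + self.alpha * self.body(x))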

References

Citation

If you use this model, please cite:

@misc{cnn_prelu_imagenet,
  title={cnn-prelu-imagenet: CNN with PReLU for ImageNet Classification},
  year={2025},
  publisher={HuggingFace Hub},
}

Original PReLU Paper

This model uses PReLU activation. Please also cite the original paper:

@inproceedings{he2015delving,
  title={Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification},
  author={He, Kaiming and Zhang, Xiangyu and Ren, Shaoqing and Sun, Jian},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={1026--1034},
  year={2015}
}

License

This model is released under the MIT License.
