DHCD: Devanagari Handwritten Character Recognition
This repository contains a custom Convolutional Neural Network (CNN) trained from scratch to recognize 46 classes of handwritten Devanagari characters and digits. The model is implemented in PyTorch and achieves approximately 98.42% test accuracy on the Devanagari Handwritten Character Dataset (DHCD).
Model Overview
- Model Type: Custom CNN (Model A)
- Task: Image Classification (OCR of handwritten characters)
- Input: 32×32 grayscale images (white foreground on black background)
- Output: 46 class scores (logits); apply softmax to obtain probabilities
- Framework: PyTorch
The architecture is intentionally designed for small, fixed-size inputs, aggressively reducing spatial dimensions to 1×1 before classification to enforce global feature learning.
Architecture
Layer progression:
- Input: 32×32×1
- Conv(64, 5×5) → LRN → ReLU → MaxPool
- Conv(128, 5×5) → LRN → ReLU → MaxPool
- Conv(256, 5×5) → LRN → ReLU → (1×1×256)
- Flatten → Dropout(0.5) → Linear(256 → 46)
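The reduction to 1×1 follows directly from the layer sizes. A minimal sketch of the arithmetic, assuming unpadded, stride-1 convolutions and 2×2 max pooling (the PyTorch defaults for `nn.Conv2d(..., 5)` and `nn.MaxPool2d(2)` as used in the Quickstart); the helper names are illustrative:

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2):
    """Output width of non-overlapping max pooling (stride equals kernel)."""
    return size // kernel


def trace_spatial_size(size=32):
    """Follow the side length of the feature map through the three blocks."""
    sizes = [size]
    size = pool_out(conv_out(size, 5))   # block 1: 32 -> 28 -> 14
    sizes.append(size)
    size = pool_out(conv_out(size, 5))   # block 2: 14 -> 10 -> 5
    sizes.append(size)
    size = conv_out(size, 5)             # block 3: 5 -> 1 (no pooling)
    sizes.append(size)
    return sizes


print(trace_spatial_size())  # [32, 14, 5, 1]
```

Because the last convolution leaves a 1×1 map with 256 channels, flattening yields exactly the 256 features the final linear layer expects.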
Dataset
Dataset: Devanagari Handwritten Character Dataset (DHCD)
Total Images: ~92,000
Classes: 46
- Digits: 0–9 (10)
- Consonants: 36 basic Devanagari characters
Image Size: 32×32 pixels
Color Space: Grayscale
Writers: Multiple native writers
Dataset Split
The dataset was split in a class-balanced manner:
- Training: ~80%
- Validation: ~10%
- Test: ~10%
Each split preserves an approximately equal number of samples per class. The validation set was used for convergence monitoring and hyperparameter tuning, while the test set was held out entirely for final evaluation.
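The exact splitting procedure is not given here. One way to obtain a class-balanced 80/10/10 split is to shuffle and partition each class separately. The sketch below does this over a list of labels; the function name, seed, and fractions are illustrative assumptions, not the author's code:

```python
import random
from collections import defaultdict


def stratified_split(labels, train_frac=0.8, val_frac=0.1, seed=0):
    """Split sample indices into train/val/test, class by class.

    Returns three lists of indices into `labels`. Each class contributes
    the same fractions to every split, so class balance is preserved.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = int(len(indices) * train_frac)
        n_val = int(len(indices) * val_frac)
        train.extend(indices[:n_train])
        val.extend(indices[n_train:n_train + n_val])
        test.extend(indices[n_train + n_val:])   # remainder goes to test
    return train, val, test


# Toy example: 46 classes with 20 samples each.
labels = [c for c in range(46) for _ in range(20)]
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))  # 736 92 92
```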
Training Details
- Optimizer: SGD with Momentum
- Epochs: 20
- Loss Function: Cross-Entropy Loss
- Regularization: Dropout (p = 0.5)
- Normalization: Local Response Normalization (LRN)
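These settings map onto standard PyTorch components. A minimal sketch of one training epoch, assuming hypothetical values for the learning rate and momentum (neither is stated above) and a `loader` that yields `(images, labels)` batches:

```python
import torch
import torch.nn as nn


def make_optimizer(model, lr=0.01, momentum=0.9):
    """SGD with momentum; lr and momentum are assumed values, not the author's."""
    return torch.optim.SGD(model.parameters(), lr=lr, momentum=momentum)


def train_one_epoch(model, loader, optimizer, device="cpu"):
    """Run one pass over `loader` and return the mean training loss."""
    criterion = nn.CrossEntropyLoss()   # expects raw logits and integer labels
    model.train()                       # enables dropout
    total_loss, total_samples = 0.0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * images.size(0)
        total_samples += images.size(0)
    return total_loss / total_samples
```

Calling `train_one_epoch` 20 times, with a validation pass after each call, would follow the schedule listed above in outline.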
Performance
The model performs consistently well across most characters, with minor confusion only between visually similar glyphs.
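No per-class figures are reported here. One way to see which glyphs are confused is to count misclassified (true, predicted) pairs on the test set. The sketch below assumes two equal-length sequences of integer class indices; the function names and toy data are illustrative:

```python
from collections import Counter


def most_confused_pairs(true_labels, predicted_labels, top_n=5):
    """Return the most frequent (true, predicted) misclassifications."""
    errors = Counter(
        (t, p) for t, p in zip(true_labels, predicted_labels) if t != p
    )
    return errors.most_common(top_n)


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Toy data: class 3 is mistaken for class 8 twice, class 1 for class 7 once.
y_true = [3, 3, 3, 1, 1, 5, 8]
y_pred = [8, 8, 3, 7, 1, 5, 8]
print(most_confused_pairs(y_true, y_pred))  # [((3, 8), 2), ((1, 7), 1)]
print(round(accuracy(y_true, y_pred), 3))   # 0.571
```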
Usage
⚠️ Important: Because this is a custom architecture, the model class must be defined exactly as used during training before loading the weights.
Quickstart
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.transforms as transforms
from PIL import Image


class ModelA(nn.Module):
    def __init__(self, num_classes=46):
        super().__init__()
        # Block 1: 1x32x32 -> 64x28x28 -> (pool) 64x14x14
        self.conv1 = nn.Conv2d(1, 64, 5)
        self.lrn1 = nn.LocalResponseNorm(5)
        self.pool1 = nn.MaxPool2d(2)
        # Block 2: 64x14x14 -> 128x10x10 -> (pool) 128x5x5
        self.conv2 = nn.Conv2d(64, 128, 5)
        self.lrn2 = nn.LocalResponseNorm(5)
        self.pool2 = nn.MaxPool2d(2)
        # Block 3: 128x5x5 -> 256x1x1 (no pooling)
        self.conv3 = nn.Conv2d(128, 256, 5)
        self.lrn3 = nn.LocalResponseNorm(5)
        self.dropout = nn.Dropout(0.5)
        self.fc = nn.Linear(256, num_classes)

    def forward(self, x):
        x = self.pool1(F.relu(self.lrn1(self.conv1(x))))
        x = self.pool2(F.relu(self.lrn2(self.conv2(x))))
        x = F.relu(self.lrn3(self.conv3(x)))
        x = x.view(x.size(0), -1)  # flatten to (batch, 256)
        x = self.dropout(x)
        return self.fc(x)  # raw class scores (logits)


# Load the trained weights and switch to inference mode.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = ModelA().to(device)
model.load_state_dict(torch.load("dhcd-model.pth", map_location=device))
model.eval()
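The Quickstart stops once the weights are loaded. `ModelA.forward` returns the output of the final linear layer, so turning it into class probabilities takes one softmax call. A minimal sketch, assuming the input is already a float tensor of shape (N, 1, 32, 32) preprocessed the same way as the training data (the preprocessing itself is not specified here); `predict_top_k` is an illustrative helper, not part of the repository:

```python
import torch
import torch.nn.functional as F


@torch.no_grad()
def predict_top_k(model, batch, k=3):
    """Return (probabilities, class indices) of the k most likely classes.

    `batch` is assumed to be a float tensor of shape (N, 1, 32, 32).
    Both returned tensors have shape (N, k), sorted by probability.
    """
    model.eval()                       # disable dropout for inference
    logits = model(batch)              # raw scores, shape (N, num_classes)
    probs = F.softmax(logits, dim=1)   # convert scores to probabilities
    return probs.topk(k, dim=1)        # (values, indices)
```

With the `model` and `device` from the Quickstart, a call would look like `top_p, top_i = predict_top_k(model, images.to(device))`, where `images` is a hypothetical preprocessed batch.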
Limitations
- Requires exact 32×32 input resolution
- Sensitive to color inversion
- Cannot process multi-character images
- Performance may degrade on unseen handwriting styles
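The first two limitations can be checked before inference. A minimal sketch, assuming the image is given as a 2D list of 0–255 grayscale values and using the brightness of the border pixels as a heuristic for the background colour; the threshold and function names are illustrative and not part of the original pipeline:

```python
def border_mean(pixels):
    """Mean intensity of the outermost ring of pixels."""
    height, width = len(pixels), len(pixels[0])
    border = [
        pixels[r][c]
        for r in range(height)
        for c in range(width)
        if r in (0, height - 1) or c in (0, width - 1)
    ]
    return sum(border) / len(border)


def prepare(pixels, size=32, threshold=127):
    """Validate the resolution and return a white-on-black copy."""
    if len(pixels) != size or any(len(row) != size for row in pixels):
        raise ValueError(f"expected a {size}x{size} image")
    if border_mean(pixels) > threshold:          # light background: invert
        return [[255 - value for value in row] for row in pixels]
    return [row[:] for row in pixels]
```

The border heuristic is a guess about polarity and can fail when a stroke touches the image edge, so it is best treated as a sanity check.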
Ethical Considerations
- Trained only on handwritten Devanagari characters
- No personal or sensitive data used
- Not intended for critical decision-making systems
Reproducibility
To reproduce results:
- Use the provided dhcd.ipynb notebook
- Apply identical preprocessing
- Match the architecture definition exactly
Citation
Devanagari Handwritten Character Recognition using a Custom CNN,
Arpit Gaur, 2025
License
Released under the MIT License.



