Practical Guide to Building a Handwritten Digit Recognition System in Python

Bayram EKER
2 min read · May 10, 2024

Digitizing handwritten numbers efficiently requires robust image capture, preprocessing, and machine learning techniques. This guide details a Python implementation using OpenCV and Keras to develop a system capable of recognizing handwritten digits from webcam images.

Step 1: Capturing Images with OpenCV

Start by setting up the environment to capture images from a webcam. Each captured frame shows one handwritten digit and is saved in a folder named after that digit, so the folder name can serve as the label later on.

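import cv2
import os

def capture_images():
    num_samples = 100
    output_dir = "dataset"

    # Open the default webcam once instead of reopening it for every frame.
    cap = cv2.VideoCapture(0)
    if not cap.isOpened():
        raise RuntimeError("Could not open the webcam")

    try:
        for digit in range(10):
            os.makedirs(os.path.join(output_dir, str(digit)), exist_ok=True)
            for i in range(num_samples):
                ret, frame = cap.read()
                if not ret:
                    continue  # skip frames the camera failed to deliver
                img_path = os.path.join(output_dir, str(digit), f"{digit}_{i}.png")
                cv2.imwrite(img_path, frame)
    finally:
        # Always free the camera, even if capturing fails midway.
        cap.release()
        cv2.destroyAllWindows()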
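
Because a camera can drop frames, it may be worth checking how many images actually landed in each digit folder before moving on. The following is a minimal sketch; `count_samples` is a hypothetical helper name, and it assumes the `dataset/<digit>/*.png` layout used above.

```python
import os

def count_samples(output_dir="dataset"):
    """Return {digit: number of .png files} for the ten digit folders."""
    counts = {}
    for digit in range(10):
        digit_dir = os.path.join(output_dir, str(digit))
        if os.path.isdir(digit_dir):
            counts[digit] = sum(
                1 for name in os.listdir(digit_dir)
                if name.lower().endswith(".png")
            )
        else:
            # A missing folder means nothing was captured for this digit.
            counts[digit] = 0
    return counts
```

Calling `count_samples()` after a capture run and looking for digits with fewer images than expected is a quick way to spot an incomplete dataset.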

Step 2: Image Preprocessing

Convert the images to grayscale and resize them to 28x28 pixels. Every image then has the same shape and a single channel, which is the input format the CNN in Step 4 expects.

import cv2
import os

def preprocess_images(input_dir, output_dir, size=(28, 28)):
    for digit in range(10):
        os.makedirs(os.path.join(output_dir, str(digit)), exist_ok=True)
        for img_name in os.listdir(os.path.join(input_dir, str(digit))):
            img_path = os.path.join(input_dir, str(digit), img_name)
            # Load directly as a single-channel grayscale image.
            img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
            if img is None:
                continue  # skip files OpenCV cannot read
            img = cv2.resize(img, size)
            cv2.imwrite(os.path.join(output_dir, str(digit), img_name), img)
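
One thing to keep in mind: webcam frames are usually not square (640x480 is a common size, though that is an assumption about your camera), so resizing them straight to 28x28 squashes the digit horizontally. A possible refinement, not part of the original pipeline, is to crop a centered square first. The helper names below are hypothetical.

```python
def center_square_box(height, width):
    """Return (top, left, side) of the largest centered square in a frame."""
    side = min(height, width)
    top = (height - side) // 2
    left = (width - side) // 2
    return top, left, side

def crop_center_square(img):
    """Crop an image array (as returned by cv2.imread) to a centered square."""
    top, left, side = center_square_box(*img.shape[:2])
    return img[top:top + side, left:left + side]
```

If you want to try it, `img = crop_center_square(img)` would go just before the `cv2.resize` call. This only helps when the digit is roughly centered in the frame.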

Step 3: Splitting the Dataset

Split the preprocessed images into a training set and a test set (80% / 20% by default), so the model can be evaluated on images it did not see during training.

import os
import shutil

def split_dataset(input_dir, train_dir, test_dir, train_split=0.8):
    for digit in range(10):
        os.makedirs(os.path.join(train_dir, str(digit)), exist_ok=True)
        os.makedirs(os.path.join(test_dir, str(digit)), exist_ok=True)

        # os.listdir returns files in arbitrary order; sort so the split is reproducible.
        images = sorted(os.listdir(os.path.join(input_dir, str(digit))))
        num_train = int(len(images) * train_split)
        train_images, test_images = images[:num_train], images[num_train:]

        for img_name in train_images:
            shutil.copy(os.path.join(input_dir, str(digit), img_name), os.path.join(train_dir, str(digit), img_name))
        for img_name in test_images:
            shutil.copy(os.path.join(input_dir, str(digit), img_name), os.path.join(test_dir, str(digit), img_name))
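
Taking the first 80% of the file list means the test set always consists of whichever files come last in the listing, for example the frames captured at the end of a session. If that is a concern, one option is to shuffle with a fixed seed before slicing. This is a sketch only; `shuffled_split` and the default seed are illustrative choices, not part of the original code.

```python
import random

def shuffled_split(names, train_split=0.8, seed=42):
    """Shuffle file names reproducibly, then split into (train, test) lists."""
    names = sorted(names)               # start from a stable order
    random.Random(seed).shuffle(names)  # a fixed seed gives the same shuffle every run
    num_train = int(len(names) * train_split)
    return names[:num_train], names[num_train:]
```

Inside `split_dataset`, the slicing step could then be replaced by `train_images, test_images = shuffled_split(images, train_split)`.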

Step 4: Training the Model with Keras

Define and train a convolutional neural network (CNN) to classify the digits. This section includes the model architecture and training process.

import os
import cv2
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Flatten, Conv2D, MaxPooling2D
from keras.utils import to_categorical

def train_model(train_dir, test_dir):
    X_train, y_train = load_dataset(train_dir)
    X_test, y_test = load_dataset(test_dir)

    # One convolution + pooling stage followed by a small dense classifier.
    model = Sequential([
        Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)),
        MaxPooling2D(pool_size=(2, 2)),
        Flatten(),
        Dense(128, activation='relu'),
        Dense(10, activation='softmax')
    ])

    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    model.fit(X_train, y_train, epochs=5, batch_size=32, validation_data=(X_test, y_test))
    return model

def load_dataset(data_dir):
    images, labels = [], []
    for label in sorted(os.listdir(data_dir)):
        if not label.isdigit():
            continue  # ignore anything that is not a digit folder
        for image_file in os.listdir(os.path.join(data_dir, label)):
            image_path = os.path.join(data_dir, label, image_file)
            image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
            if image is None:
                continue  # skip files OpenCV cannot read
            image = cv2.resize(image, (28, 28))
            images.append(image)
            labels.append(int(label))
    # Scale pixels to [0, 1] and add the channel axis the CNN expects.
    X = np.array(images, dtype="float32").reshape(-1, 28, 28, 1) / 255.0
    # Fix the number of classes so the one-hot width is always 10.
    y = to_categorical(labels, num_classes=10)
    return X, y
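
To get a feel for the architecture above, it can help to work out the layer shapes and parameter counts by hand. The sketch below does that arithmetic in plain Python, assuming Keras defaults for the layers as written (no padding and stride 1 for `Conv2D`, stride equal to the pool size for `MaxPooling2D`). The helper functions are hypothetical and exist only for this calculation.

```python
def conv2d_valid(h, w, c_in, filters, k):
    """Output shape and parameter count of a k x k convolution, no padding, stride 1."""
    out_shape = (h - k + 1, w - k + 1, filters)
    params = (k * k * c_in + 1) * filters  # kernel weights plus one bias per filter
    return out_shape, params

def maxpool(h, w, c, p):
    """Output shape of p x p max pooling with stride p (no parameters)."""
    return (h // p, w // p, c)

def dense(n_in, n_out):
    """Parameter count of a fully connected layer (weights plus biases)."""
    return (n_in + 1) * n_out

shape, p_conv = conv2d_valid(28, 28, 1, filters=32, k=3)  # (26, 26, 32), 320 params
shape = maxpool(*shape, p=2)                              # (13, 13, 32)
flat = shape[0] * shape[1] * shape[2]                     # 5408 values after Flatten
p_dense1 = dense(flat, 128)                               # 692352 params
p_dense2 = dense(128, 10)                                 # 1290 params
total = p_conv + p_dense1 + p_dense2

print(shape, flat, total)  # (13, 13, 32) 5408 693962
```

Almost all of the parameters sit in the first dense layer, which is typical for a network with a single convolution stage. Calling `model.summary()` on the trained model should report the same totals.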

Wrapping Up

That’s it for this quick guide on building a handwritten digit recognition system with Python, OpenCV, and Keras. We’ve walked through the entire pipeline from capturing images to training a CNN. I hope you find this guide useful as you embark on your own projects in computer vision and machine learning.

For the complete code and to see this project in action, check out the GitHub repository: Handwriting OCR on GitHub.

Happy coding, and here’s to building incredible models that can see and understand the world a bit better!
