AI Practitioner

Posts

Showing posts with the label PyTorch

Image Classification: Fine-Tune ResNet18 on Kaggle Dataset (PyTorch + Lightning)

Image Classification: Fine-Tuning ResNet-18 on Kaggle's Lions vs Cheetahs Dataset

Image classification is a fundamental task in computer vision where the goal is to assign a label or class to an input image. It is widely used in various domains such as medical imaging, autonomous driving, wildlife monitoring, and security. A typical image classification pipeline involves feeding an image into a neural network model, which processes the input and outputs class probabilities corresponding to predefined categories.

What is the ImageNet Dataset?

ImageNet is one of the most influential datasets in the history of computer vision. It contains over 14 million labeled images across more than 20,000 categories, with a popular subset of 1,000 categories used in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Models trained on ImageNet learn powerful visual features that generalize well to many downstream tasks, making them a popular choice for transfer learning an...
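The last step of the pipeline described in this excerpt, turning a network's raw outputs into class probabilities, can be sketched in a few lines of plain Python. This is only an illustration and not the post's own code: the logit values are made up, and the two class names are assumed from the dataset title.

```python
import math


def softmax(logits):
    """Convert raw network outputs (logits) into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical values for illustration only: class names assumed from the
# "Lions vs Cheetahs" title, logits invented rather than produced by a model.
class_names = ["lion", "cheetah"]
logits = [2.0, 0.5]

probs = softmax(logits)
predicted = class_names[probs.index(max(probs))]  # class with highest probability
print(dict(zip(class_names, probs)), "->", predicted)
```

In a real fine-tuning setup the logits would come from the model's final linear layer, with one output per class.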

Image Augmentation in Computer Vision using PyTorch Transforms v2

Why Image Augmentation is Essential in Deep Learning

In computer vision, image augmentation plays a critical role in improving the generalization of deep neural networks. By artificially expanding the diversity of the training dataset through transformations that preserve the label, image augmentation helps reduce overfitting and increases model robustness. Especially for convolutional neural networks (CNNs) and vision transformers (ViTs), which learn hierarchical and spatial features, input variability introduced by augmentation forces the model to learn more invariant and meaningful representations. This is analogous to improving the mutual information between relevant features and output predictions while discarding noise.

Common Image Augmentation Techniques and Parameter Descriptions

1. RandomHorizontalFlip

Purpose: Introduces horizontal symmetry by flipping the image left-to-right with a certain probability.

    from torchvision.transforms import v2 as transforms
    transform ...
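To make the idea of a random left-to-right flip concrete, here is a minimal pure-Python sketch of the operation itself. It is not the torchvision implementation and does not reconstruct the truncated snippet above; the helper name `random_horizontal_flip` and the tiny nested-list "image" are hypothetical, chosen only to show how a flip probability `p` controls a label-preserving transformation.

```python
import random


def random_horizontal_flip(image, p=0.5, rng=random):
    """Flip a row-major image (list of rows) left-to-right with probability p.

    Hypothetical helper for illustration; not a torchvision function.
    """
    if rng.random() < p:
        return [row[::-1] for row in image]  # reverse each row = mirror horizontally
    return image


# A 2x3 toy "image" where each number stands in for a pixel value.
image = [[1, 2, 3],
         [4, 5, 6]]

always = random_horizontal_flip(image, p=1.0)  # flip is guaranteed
never = random_horizontal_flip(image, p=0.0)   # flip never happens
print(always)
print(never)
```

Because a mirrored lion is still a lion, the label attached to the image does not change, which is what makes this kind of augmentation safe for classification.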


RoFormer and Rotary Position Embedding: Revolutionizing Positional Encoding in Transformers

RF-DETR: Overcoming the Limitations of DETR in Object Detection

Understanding SentencePiece: A Language-Independent Tokenizer for AI Engineers
