Posts

Showing posts from April, 2025

Image Augmentation in Computer Vision using PyTorch Transforms v2

Why Image Augmentation is Essential in Deep Learning

In computer vision, image augmentation plays a critical role in improving the generalization of deep neural networks. By artificially expanding the diversity of the training dataset through transformations that preserve the label, image augmentation helps reduce overfitting and increases model robustness. Especially for convolutional neural networks (CNNs) and vision transformers (ViTs), which learn hierarchical and spatial features, the input variability introduced by augmentation forces the model to learn more invariant and meaningful representations. This is analogous to improving the mutual information between relevant features and output predictions while discarding noise.

Common Image Augmentation Techniques and Parameter Descriptions

1. RandomHorizontalFlip
Purpose: Introduces horizontal symmetry by flipping the image left-to-right with a certain probability.
from torchvision.transforms import v2 as transforms transform ...

SVD and Truncated SVD Explained: Theory, Python Examples, and Applications in Machine Learning & Deep Learning

Singular Value Decomposition (SVD) is a matrix factorization method widely used in mathematics, engineering, and economics. Since it is a crucial concept applied in accelerating matrix computations and data compression, it is worth studying at least once.

SVD (Singular Value Decomposition) Theory

Singular Value Decomposition (SVD) is a matrix factorization technique applicable to any real or complex matrix. Any matrix A (m×n) can be decomposed as follows:

$A = U \Sigma V^T$

U: Orthogonal matrix composed of left singular vectors $(m \times m)$
$\Sigma$: Diagonal matrix $(m \times n)$ with singular values on the diagonal
$V^T$: Transpose of the matrix of right singular vectors $(n \times n)$

The singular values represent the energy or information content of matrix A, enabling tasks like dimensionality reduction or noise filtering.

Truncated SVD

Truncated SVD approximates the original matrix using only the top k singular values and corresponding singular vectors: $A \approx...
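The decomposition and its rank-k truncation can be sketched with NumPy (the matrix here is an arbitrary example):

```python
import numpy as np

A = np.arange(12, dtype=float).reshape(4, 3)      # any m x n matrix
U, s, Vt = np.linalg.svd(A, full_matrices=False)  # A = U @ diag(s) @ Vt

# Full reconstruction recovers A up to floating-point error.
assert np.allclose(A, U @ np.diag(s) @ Vt)

# Truncated SVD: keep only the top-k singular triplets.
k = 1
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
print(np.linalg.norm(A - A_k))  # approximation error
```

By the Eckart-Young theorem, the Frobenius error of this rank-k approximation equals the norm of the discarded singular values, which is why keeping the largest ones preserves the most "energy".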

FixMatch Explained: A Simple Yet Powerful Algorithm for Semi-Supervised Learning

Paper Link: https://arxiv.org/pdf/2001.07685

What Problem Does FixMatch Address?

FixMatch is a semi-supervised learning (SSL) algorithm designed to solve two long-standing technical challenges using a unified and simple framework. In many real-world machine learning applications, labeled data is expensive and time-consuming to obtain, while unlabeled data is abundant. FixMatch addresses this imbalance by combining two powerful ideas in SSL:

Consistency Regularization: The assumption that a model should produce consistent predictions when the input undergoes small augmentations or perturbations.
Pseudo-Labeling: Treating high-confidence predictions on unlabeled data as if they were ground-truth labels for training purposes.

While previous SSL methods often combined these ideas through complex architectures or training pipelines, FixMatch simplifies the process using a confidence threshold and a two-stage data augmentation strategy to achieve state-of-the-art performance ...
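The core of FixMatch's unlabeled loss can be sketched in a few lines, assuming `logits_weak` and `logits_strong` are the model's outputs on weakly and strongly augmented views of the same unlabeled batch (the function name and the dummy logits are illustrative):

```python
import torch
import torch.nn.functional as F

def fixmatch_unlabeled_loss(logits_weak, logits_strong, threshold=0.95):
    # Pseudo-label from the weak view; no gradients flow through it.
    probs = torch.softmax(logits_weak.detach(), dim=-1)
    max_probs, pseudo_labels = probs.max(dim=-1)
    mask = (max_probs >= threshold).float()  # keep only confident predictions
    # Cross-entropy on the strong view against the pseudo-labels, masked.
    loss = F.cross_entropy(logits_strong, pseudo_labels, reduction="none")
    return (loss * mask).mean()

logits_weak = torch.randn(8, 10)   # dummy logits: 8 samples, 10 classes
logits_strong = torch.randn(8, 10)
loss = fixmatch_unlabeled_loss(logits_weak, logits_strong)
print(loss)
```

The confidence mask is what keeps early, noisy pseudo-labels from dominating training: low-confidence samples simply contribute zero loss until the model becomes sure about them.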

Cosine Similarity vs. Cosine Distance Explained with PyTorch Examples | Applications in Deep Learning

1. What is Cosine Similarity?

Cosine similarity is a metric used to measure the similarity in direction between two vectors, regardless of their magnitude. It is widely used in tasks like text similarity analysis, sentence embedding comparison, and image embedding evaluation. The key idea is that the metric focuses on the angle (or direction) rather than the vector length.

Formula: cos_sim(A, B) = (A · B) / (||A|| * ||B||)

Here, A and B are input vectors, · denotes the dot product, and ||A|| is the norm (magnitude) of vector A. The cosine similarity value ranges from -1 to 1. A value close to 1 means the vectors are pointing in a similar direction, while a value close to -1 indicates they are pointing in opposite directions.

2. What is Cosine Distance?

Cosine distance is derived from cosine similarity and represents the dissimilarity between vectors. It is defined as follows:

cos_dist(A, B) = 1 - cos_sim(A, B)

The cosine distance ranges from 0 to 2. A...
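Both quantities take one line each in PyTorch; the vectors below are chosen so one is just a scaled copy of the other, making the direction identical:

```python
import torch
import torch.nn.functional as F

a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([2.0, 4.0, 6.0])  # same direction, double the magnitude

cos_sim = F.cosine_similarity(a, b, dim=0)  # angle-only comparison
cos_dist = 1.0 - cos_sim                    # derived dissimilarity
print(cos_sim.item(), cos_dist.item())      # ~1.0 and ~0.0
```

Because the magnitudes cancel in the formula, scaling either vector leaves the result unchanged, which is exactly why cosine similarity suits embedding comparison where norms vary.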

A Comprehensive Guide to Semi-Supervised Learning in Computer Vision: Algorithms, Comparisons, and Techniques

Introduction to Semi-Supervised Learning

Semi-Supervised Learning is a deep learning technique that utilizes a small amount of labeled data together with a large amount of unlabeled data. Traditional Supervised Learning uses only labeled data for training, but acquiring labeled data is often difficult and time-consuming. In contrast, Semi-Supervised Learning improves model performance by utilizing unlabeled data, achieving better results with less labeling effort in real-world scenarios. This approach is particularly advantageous in computer vision tasks such as image classification, object detection, and video analysis. When labeled data is scarce in large-scale image datasets, Semi-Supervised Learning can effectively enhance model performance using unlabeled data.

Technical Background: The core techniques of Semi-Supervised Learning are Consistency Regularization and Pseudo-labeling. Consistency Regularization encourages the model to make consistent predictions on au...
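A toy consistency-regularization term makes the idea concrete, assuming `logits_a` and `logits_b` come from the model on two augmented views of the same unlabeled input (the function name and dummy tensors are illustrative):

```python
import torch
import torch.nn.functional as F

def consistency_loss(logits_a, logits_b):
    # Squared difference between the two predicted class distributions;
    # minimizing it pushes the model toward augmentation-invariant outputs.
    return F.mse_loss(torch.softmax(logits_a, dim=-1),
                      torch.softmax(logits_b, dim=-1))

logits_a = torch.randn(4, 10)                    # predictions on view 1
logits_b = logits_a + 0.1 * torch.randn(4, 10)   # slightly perturbed view 2
print(consistency_loss(logits_a, logits_b))      # small positive scalar
```

This term needs no labels at all, which is what lets the large unlabeled pool contribute gradient signal alongside the small supervised loss.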

Depth-First Search (DFS) Algorithm Explained with Examples and Applications

What is Depth-First Search (DFS)?

Depth-First Search (DFS) is a fundamental algorithm for traversing or searching tree and graph data structures. The algorithm explores as far as possible along each branch before backtracking, making it suitable for pathfinding, topological sorting, and cycle detection tasks.

DFS Algorithm Explanation

DFS can be implemented using either recursion (implicit call stack) or an explicit stack data structure. The core idea is:
1. Start at the root (or any arbitrary node for a graph).
2. Visit a node and mark it as visited.
3. Recursively or iteratively visit all the adjacent unvisited nodes.

Algorithm Steps (for Binary Tree - Recursive)
1. Visit the current node.
2. Recursively traverse the left subtree.
3. Recursively traverse the right subtree.

Algorithm Steps (for General Graph - Iterative with Stack)
1. Push the start node onto the stack and mark it as visited.
2. While the stack is not empty:
   - Pop a node from the stack.
   - Process the node.
   - Push all unvisited adjacent nodes...
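The two variants described above can be sketched as follows, on a small adjacency-list graph (the graph itself is an arbitrary example):

```python
def dfs_recursive(graph, node, visited=None):
    """Recursive DFS over an adjacency-list graph; returns visit order."""
    if visited is None:
        visited = []
    visited.append(node)                 # visit and mark
    for neighbor in graph[node]:
        if neighbor not in visited:
            dfs_recursive(graph, neighbor, visited)
    return visited

def dfs_iterative(graph, start):
    """Iterative DFS with an explicit stack."""
    visited, stack, seen = [], [start], {start}
    while stack:
        node = stack.pop()               # LIFO order gives depth-first behavior
        visited.append(node)             # process the node
        for neighbor in graph[node]:
            if neighbor not in seen:     # mark when pushed to avoid duplicates
                seen.add(neighbor)
                stack.append(neighbor)
    return visited

graph = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
print(dfs_recursive(graph, "A"))  # ['A', 'B', 'D', 'C']
print(dfs_iterative(graph, "A"))  # ['A', 'C', 'B', 'D']
```

Note that the two variants may visit neighbors in different orders (the stack reverses them), but both are valid depth-first traversals.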

A* Search Algorithm - Detailed Explanation with Python Example

What is the A* Search Algorithm?

A* Search (pronounced "A-star") is one of the most popular and powerful pathfinding and graph traversal algorithms. It finds the shortest path between nodes using both the cost to reach a node and a heuristic estimate of the remaining cost to the goal.

Key Concepts
g(n): The actual cost from the start node to the current node $n$.
h(n): The heuristic estimated cost from node $n$ to the goal.
f(n): The total estimated cost of the cheapest solution through node $n$, calculated as $f(n) = g(n) + h(n)$.

A* Algorithm Steps
1. Initialize the open set with the start node.
2. Initialize a map to record the lowest cost to reach each node ($g$ value).
3. While the open set is not empty:
   - Pick the node with the lowest $f(n)$ value.
   - If the node is the goal, reconstruct and return the path.
   - Else, for each neighbor:
     - Calculate the tentative $g$ score.
     - If this score is better than previously recorded, update it.
     - Set the neighbor's parent to the current node.
     - If the ...
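The steps above can be sketched on a small weighted graph; the graph, costs, and the zero heuristic here are illustrative (with h = 0 the search degrades gracefully to Dijkstra's algorithm):

```python
import heapq

def a_star(graph, start, goal, h):
    """graph: node -> list of (neighbor, edge_cost); h: heuristic function."""
    open_set = [(h(start), start)]   # priority queue ordered by f(n) = g(n) + h(n)
    g = {start: 0}                   # lowest known cost to reach each node
    parent = {}
    while open_set:
        _, node = heapq.heappop(open_set)   # node with the lowest f(n)
        if node == goal:                    # reconstruct path by walking parents
            path = [node]
            while node in parent:
                node = parent[node]
                path.append(node)
            return path[::-1], g[goal]
        for neighbor, cost in graph[node]:
            tentative = g[node] + cost      # tentative g score via this node
            if tentative < g.get(neighbor, float("inf")):
                g[neighbor] = tentative     # better route found: record it
                parent[neighbor] = node
                heapq.heappush(open_set, (tentative + h(neighbor), neighbor))
    return None, float("inf")               # goal unreachable

graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 1), ("D", 5)],
         "C": [("D", 1)], "D": []}
path, cost = a_star(graph, "A", "D", h=lambda n: 0)
print(path, cost)  # ['A', 'B', 'C', 'D'] 3
```

For A* to guarantee the optimal path, h(n) must be admissible (never overestimate the true remaining cost); the zero heuristic trivially satisfies this.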