In programming, a Singleton is a design pattern that ensures a class has only one instance during the entire lifetime of a program, and provides a global access point to that instance.
Singletons are widely used to control access to an expensive or shared resource, such as a database connection, application configuration, or a heavy machine learning model that should be loaded only once.
Why Use Singleton?
- Efficient memory usage
- Controlled access to a resource
- Ensures consistency across your application
Simple Singleton Implementation in Python
class SingletonMeta(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            instance = super().__call__(*args, **kwargs)
            cls._instances[cls] = instance
        return cls._instances[cls]


class SingletonExample(metaclass=SingletonMeta):
    def __init__(self):
        print("Initializing SingletonExample")


# Usage
a = SingletonExample()
b = SingletonExample()
print(a is b)  # True
Here, no matter how many times you instantiate SingletonExample, it will always return the same object!
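One consequence of this metaclass is worth seeing concretely: because `__call__` returns the cached instance, `__init__` runs only on the first call, and any arguments passed on later calls are silently ignored. Here is a minimal sketch of that behavior. The `AppConfig` class and its `env` parameter are hypothetical names used only for illustration.

```python
class SingletonMeta(type):
    """Same metaclass as above: caches one instance per class."""
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class AppConfig(metaclass=SingletonMeta):
    """Hypothetical config holder, used only for illustration."""
    init_calls = 0  # counts how many times __init__ actually runs

    def __init__(self, env):
        AppConfig.init_calls += 1
        self.env = env


first = AppConfig("production")
second = AppConfig("staging")  # argument is ignored; the cached instance is returned

print(first is second)       # True
print(second.env)            # production
print(AppConfig.init_calls)  # 1
```

If later callers need different settings, a Singleton keyed only by class is probably the wrong tool; pass the configuration once at startup instead.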
Real-World Example: Singleton for PyTorch Model Loading
In ML projects, model loading can be slow and memory-intensive. If your app loads the same model multiple times, both startup time and memory usage suffer. Using a Singleton ensures only one copy is loaded and reused.
PyTorch Singleton Model Loader Example
import torch
import torch.nn as nn


class SingletonMeta(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            instance = super().__call__(*args, **kwargs)
            cls._instances[cls] = instance
        return cls._instances[cls]


class ModelLoader(metaclass=SingletonMeta):
    def __init__(self, model_path):
        # Runs only on the first ModelLoader(...) call
        self.model = self.load_model(model_path)

    def load_model(self, model_path):
        print(f"Loading model from {model_path}...")
        # The architecture must match the one used when the checkpoint was saved
        model = nn.Sequential(
            nn.Linear(10, 20),
            nn.ReLU(),
            nn.Linear(20, 1)
        )
        model.load_state_dict(torch.load(model_path, map_location="cpu"))
        model.eval()  # inference mode
        return model

    def predict(self, input_tensor):
        with torch.no_grad():
            return self.model(input_tensor)


# Usage Example
if __name__ == "__main__":
    loader1 = ModelLoader("model.pth")
    loader2 = ModelLoader("model.pth")  # returns the cached instance, no reload
    print(f"loader1 is loader2: {loader1 is loader2}")
    dummy_input = torch.randn(1, 10)
    output = loader1.predict(dummy_input)
    print(f"Prediction: {output}")
Key Points
- The ModelLoader class loads the model only once.
- loader1 and loader2 are the same object.
- Efficient use of memory and faster prediction serving.
Sample Output
Loading model from model.pth...
loader1 is loader2: True
Prediction: tensor([[...]])
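Note that the cache in `SingletonMeta` is keyed by class only, so a later call such as `ModelLoader("other.pth")` would return the loader already built for `model.pth`. If an app needs one loader per checkpoint, one option is to include the constructor arguments in the cache key. The sketch below is an assumption rather than part of the original design: `PerArgsSingletonMeta` and `FakeModelLoader` are hypothetical names, and the loader only records the path instead of calling `torch.load`, so the example runs without PyTorch.

```python
class PerArgsSingletonMeta(type):
    """Hypothetical variant: one instance per (class, arguments) combination."""
    _instances = {}

    def __call__(cls, *args, **kwargs):
        # Arguments must be hashable to serve as part of the cache key.
        key = (cls, args, tuple(sorted(kwargs.items())))
        if key not in cls._instances:
            cls._instances[key] = super().__call__(*args, **kwargs)
        return cls._instances[key]


class FakeModelLoader(metaclass=PerArgsSingletonMeta):
    """Stand-in for ModelLoader; records the path instead of loading weights."""
    load_count = 0  # counts how many "loads" actually happen

    def __init__(self, model_path):
        FakeModelLoader.load_count += 1
        self.model_path = model_path


a = FakeModelLoader("model.pth")
b = FakeModelLoader("model.pth")  # same path: cached instance
c = FakeModelLoader("other.pth")  # different path: new instance

print(a is b)                      # True
print(a is c)                      # False
print(FakeModelLoader.load_count)  # 2
```

One caveat of this sketch: positional and keyword forms of the same argument produce different keys, so `FakeModelLoader("model.pth")` and `FakeModelLoader(model_path="model.pth")` would be cached separately.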
Final Tip
In production ML services (like APIs or edge devices): load once, serve fast, and save memory. A Singleton is especially useful for large models like BERT, ResNet, and other deep architectures, where every extra load is expensive.
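In a multithreaded API server, two requests can reach the `if cls not in cls._instances` check at the same time and both trigger a load. The metaclass shown earlier does not guard against this. Below is a minimal sketch of one common remedy, a lock with a second check inside it. `ThreadSafeSingletonMeta` and `SlowResource` are hypothetical names, and the `time.sleep` call merely stands in for a slow model load.

```python
import threading
import time


class ThreadSafeSingletonMeta(type):
    """Hypothetical thread-safe variant of the Singleton metaclass."""
    _instances = {}
    # RLock so that one singleton's __init__ may construct another in the same thread
    _lock = threading.RLock()

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:          # fast path, no locking once created
            with cls._lock:
                if cls not in cls._instances:  # re-check while holding the lock
                    cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class SlowResource(metaclass=ThreadSafeSingletonMeta):
    """Stand-in for a heavy model; counts how often it is initialized."""
    init_calls = 0

    def __init__(self):
        time.sleep(0.05)  # simulate a slow load
        SlowResource.init_calls += 1


results = []


def worker():
    results.append(SlowResource())


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(SlowResource.init_calls)                # 1
print(all(r is results[0] for r in results))  # True
```

This only protects threads within one process. Servers that run several worker processes will still load one copy of the model per process.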