Technical overview of ModelForge's modular architecture.
ModelForge v3 uses a clean, modular architecture based on SOLID principles:
┌─────────────────────────────────────────────────────┐
│                Web Interface (React)                │
└────────────────────┬────────────────────────────────┘
                     │
                     │ HTTP/REST
                     ▼
┌─────────────────────────────────────────────────────┐
│              FastAPI Application Layer              │
│  ┌──────────────┐   ┌──────────────┐  ┌──────────┐  │
│  │  Finetuning  │   │    Models    │  │Playground│  │
│  │    Router    │   │    Router    │  │  Router  │  │
│  └──────┬───────┘   └──────┬───────┘  └────┬─────┘  │
└─────────┼──────────────────┼───────────────┼────────┘
          │                  │               │
          │  Dependency Injection (FastAPI)  │
          ▼                  ▼               ▼
┌─────────────────────────────────────────────────────┐
│                    Service Layer                    │
│  ┌──────────────┐   ┌──────────────┐  ┌──────────┐  │
│  │   Training   │   │    Model     │  │ Hardware │  │
│  │   Service    │   │   Service    │  │ Service  │  │
│  └──────┬───────┘   └──────┬───────┘  └────┬─────┘  │
└─────────┼──────────────────┼───────────────┼────────┘
          │                  │               │
          ▼                  ▼               ▼
┌─────────────────────────────────────────────────────┐
│                Business Logic Layer                 │
│                                                     │
│   ┌─────────────────┐      ┌────────────────────┐   │
│   │    Provider     │      │      Strategy      │   │
│   │     Factory     │      │      Factory       │   │
│   │                 │      │                    │   │
│   │ ┌─────────────┐ │      │ ┌────────────────┐ │   │
│   │ │ HuggingFace │ │      │ │      SFT       │ │   │
│   │ │  Provider   │ │      │ │    Strategy    │ │   │
│   │ └─────────────┘ │      │ └────────────────┘ │   │
│   │ ┌─────────────┐ │      │ ┌────────────────┐ │   │
│   │ │   Unsloth   │ │      │ │     QLoRA      │ │   │
│   │ │  Provider   │ │      │ │    Strategy    │ │   │
│   │ └─────────────┘ │      │ └────────────────┘ │   │
│   └─────────────────┘      └────────────────────┘   │
│                                                     │
│   ┌─────────────────┐      ┌────────────────────┐   │
│   │   Evaluation    │      │    Quantization    │   │
│   │     System      │      │      Factory       │   │
│   └─────────────────┘      └────────────────────┘   │
└─────────────────────────────────────────────────────┘
             │                          │
             ▼                          ▼
┌─────────────────────────────────────────────────────┐
│                     Data Layer                      │
│   ┌──────────────────┐     ┌────────────────────┐   │
│   │     Database     │     │    File Manager    │   │
│   │     Manager      │     │                    │   │
│   │   (SQLAlchemy)   │     │  - Datasets        │   │
│   │                  │     │  - Checkpoints     │   │
│   │  - Models        │     │  - Logs            │   │
│   │  - Training      │     └────────────────────┘   │
│   └──────────────────┘                              │
└─────────────────────────────────────────────────────┘
Location: ModelForge/routers/
Responsibility: HTTP request handling
Files:
- finetuning_router.py - Training endpoints
- models_router.py - Model management
- playground_router.py - Inference testing
- hub_management_router.py - Model hub operations
Pattern: Thin controllers, delegate to services
Location: ModelForge/services/
Responsibility: Core business logic
Files:
- training_service.py - Training orchestration
- model_service.py - Model CRUD operations
- hardware_service.py - Hardware detection
Pattern: Service layer with dependency injection
Location: ModelForge/providers/
Responsibility: Model and tokenizer loading
Files:
- __init__.py - Provider protocol
- huggingface_provider.py - HuggingFace implementation
- unsloth_provider.py - Unsloth implementation
- provider_factory.py - Provider creation
Pattern: Protocol + Factory
Location: ModelForge/strategies/
Responsibility: Training algorithm implementation
Files:
- __init__.py - Strategy protocol
- sft_strategy.py - Supervised fine-tuning
- qlora_strategy.py - Quantized LoRA
- rlhf_strategy.py - RLHF
- dpo_strategy.py - DPO
- strategy_factory.py - Strategy creation
Pattern: Strategy + Factory
Location: ModelForge/database/
Responsibility: Data persistence
Files:
- models.py - SQLAlchemy models
- database_manager.py - DB operations
Pattern: Repository with ORM
Location: ModelForge/evaluation/
Responsibility: Training evaluation
Files:
- metrics.py - Task-specific metrics
- dataset_validator.py - Dataset validation
Implementation: FastAPI's Depends()
Example:
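from fastapi import APIRouter, Depends

from ..dependencies import get_training_service
from ..services.training_service import TrainingService
# TrainingConfig is the Pydantic request schema defined under ModelForge/schemas/.

router = APIRouter()


@router.post("/start_training")
async def start_training(
    config: TrainingConfig,
    service: TrainingService = Depends(get_training_service),
):
    # The router stays thin: FastAPI validates the body, the service does the work.
    return service.train_model(config.model_dump())

Used for: Providers and Strategies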
Example:
class ProviderFactory:
    _providers = {
        "huggingface": HuggingFaceProvider,
        "unsloth": UnslothProvider,
    }

    @classmethod
    def create_provider(cls, provider_name: str):
        provider_class = cls._providers.get(provider_name)
        if not provider_class:
            raise ProviderError(f"Unknown provider: {provider_name}")
        return provider_class()

Used for: Training algorithms
Example:
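from typing import Protocol


class TrainingStrategy(Protocol):
    def prepare_model(self, model, config): ...
    def prepare_dataset(self, dataset, tokenizer, config): ...
    # Further create_trainer parameters are elided here and shown as **kwargs.
    def create_trainer(self, model, dataset, **kwargs): ...

Used for: Defining contracts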
Example:
from typing import Protocol


class ModelProvider(Protocol):
    # Parameters beyond model_id are elided here and shown as **kwargs.
    def load_model(self, model_id: str, **kwargs): ...
    def load_tokenizer(self, model_id: str, **kwargs): ...
    def validate_model_access(self, model_id: str, **kwargs): ...
    def get_provider_name(self) -> str: ...

- User submits training request via UI
- React Frontend sends POST to /api/start_training
- FastAPI Router receives request, validates with Pydantic
- Router injects TrainingService via dependency
- TrainingService orchestrates:
  - Validates dataset
  - Creates provider from ProviderFactory
  - Loads model via provider
  - Creates strategy from StrategyFactory
  - Prepares model and dataset via strategy
  - Creates trainer and starts training
- Training runs with callbacks for progress
- Results saved to database and file system
- Response returned to user
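The orchestration steps above can be sketched end to end with stand-in objects. This is a minimal illustration, not ModelForge's actual `TrainingService`: `StubProvider`, `StubStrategy`, `StubTrainer`, the `on_progress` callback, and the `"text"` dataset field are all hypothetical, and strings stand in for real models and tokenizers.

```python
class StubProvider:
    """Stand-in for a provider built by ProviderFactory (hypothetical)."""

    def load_model(self, model_id, config):
        # Mirrors the "Return (model, tokenizer)" step; strings replace real objects.
        return f"model:{model_id}", f"tokenizer:{model_id}"


class StubTrainer:
    """Pretends to train: one step per example, reporting progress each step."""

    def __init__(self, model, dataset):
        self.model = model
        self.dataset = dataset

    def train(self, on_progress):
        total = len(self.dataset)
        for step in range(1, total + 1):
            on_progress(step, total)
        return {"status": "success", "steps": total}


class StubStrategy:
    """Stand-in for a strategy built by StrategyFactory (hypothetical)."""

    def prepare_model(self, model, config):
        return f"{model}+adapters"

    def prepare_dataset(self, dataset, tokenizer, config):
        return [row["text"] for row in dataset]

    def create_trainer(self, model, dataset, config):
        return StubTrainer(model, dataset)


def train_model(config, dataset, provider, strategy, on_progress):
    """Run the orchestration steps in the order listed above."""
    # Validate the dataset before any expensive work.
    if not dataset or any("text" not in row for row in dataset):
        raise ValueError("dataset must be non-empty and every row needs a 'text' field")
    # Load the model via the provider.
    model, tokenizer = provider.load_model(config["model_id"], config)
    # Prepare model and dataset via the strategy.
    model = strategy.prepare_model(model, config)
    prepared = strategy.prepare_dataset(dataset, tokenizer, config)
    # Create the trainer and start training with a progress callback.
    trainer = strategy.create_trainer(model, prepared, config)
    return trainer.train(on_progress)


progress = []
result = train_model(
    {"model_id": "demo"},
    [{"text": "hello"}, {"text": "world"}],
    StubProvider(),
    StubStrategy(),
    lambda step, total: progress.append((step, total)),
)
print(result)  # {'status': 'success', 'steps': 2}
```

Passing the provider and strategy in as arguments keeps the function easy to test; in the real service they would presumably come from the two factories.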
User Request
↓
ProviderFactory.create_provider(provider_name)
↓
Provider.load_model(model_id, config)
↓
Provider-specific implementation
↓
Return (model, tokenizer)
- Create a class implementing the ModelProvider protocol
- Register it in ProviderFactory._providers
- That's it! No other changes needed.
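As a rough sketch of those steps, the snippet below defines a made-up provider and registers it. `LocalDirProvider` and its return values are hypothetical, and the factory is re-declared here (starting empty) only so the example runs on its own.

```python
class ProviderError(Exception):
    """Raised when a provider name is not registered."""


class ProviderFactory:
    # Starts empty in this sketch; the real factory already lists its built-in providers.
    _providers = {}

    @classmethod
    def create_provider(cls, provider_name):
        provider_class = cls._providers.get(provider_name)
        if not provider_class:
            raise ProviderError(f"Unknown provider: {provider_name}")
        return provider_class()


class LocalDirProvider:
    """Hypothetical provider that pretends to load models from a local folder."""

    def load_model(self, model_id, **kwargs):
        return {"model_id": model_id, "source": "local_dir"}

    def load_tokenizer(self, model_id, **kwargs):
        return {"tokenizer_for": model_id}

    def validate_model_access(self, model_id, **kwargs):
        return True

    def get_provider_name(self):
        return "local_dir"


# Register the class under the name callers will pass to create_provider.
ProviderFactory._providers["local_dir"] = LocalDirProvider

provider = ProviderFactory.create_provider("local_dir")
print(provider.get_provider_name())  # local_dir
```

Because `ModelProvider` is a Protocol, the new class needs no base class; matching the method names is enough.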
- Create a class implementing the TrainingStrategy protocol
- Register it in StrategyFactory._strategies
- That's it! No other changes needed.
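One way to catch an incomplete strategy at registration time is to mark the protocol `runtime_checkable` and check instances as they are created. The sketch below assumes a `create_strategy` classmethod mirroring `create_provider`, a `config` third argument to `create_trainer`, and a made-up `FrozenStrategy`; none of these are confirmed by the ModelForge source.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class TrainingStrategy(Protocol):
    def prepare_model(self, model: Any, config: dict) -> Any: ...
    def prepare_dataset(self, dataset: Any, tokenizer: Any, config: dict) -> Any: ...
    def create_trainer(self, model: Any, dataset: Any, config: dict) -> Any: ...


class StrategyError(Exception):
    """Raised for unknown or malformed strategies."""


class StrategyFactory:
    # Starts empty in this sketch; the real factory lists its built-in strategies.
    _strategies = {}

    @classmethod
    def create_strategy(cls, name):  # assumed name, mirrors create_provider
        strategy_class = cls._strategies.get(name)
        if strategy_class is None:
            raise StrategyError(f"Unknown strategy: {name}")
        strategy = strategy_class()
        # isinstance() on a runtime_checkable Protocol only checks that the
        # methods exist, not their signatures.
        if not isinstance(strategy, TrainingStrategy):
            raise StrategyError(f"{strategy_class.__name__} does not implement TrainingStrategy")
        return strategy


class FrozenStrategy:
    """Hypothetical strategy that leaves the model untouched."""

    def prepare_model(self, model, config):
        return model

    def prepare_dataset(self, dataset, tokenizer, config):
        return list(dataset)

    def create_trainer(self, model, dataset, config):
        return {"model": model, "examples": len(dataset)}


StrategyFactory._strategies["frozen"] = FrozenStrategy
strategy = StrategyFactory.create_strategy("frozen")
print(type(strategy).__name__)  # FrozenStrategy
```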
- Add a task-specific formatter in services/training_service.py
- Add metrics in evaluation/metrics.py
- Update schema validation
ModelForgeException (base)
├── ProviderError
├── StrategyError
├── DatasetValidationError
├── TrainingError
├── ConfigurationError
├── HardwareError
└── DatabaseError
All exceptions are caught by FastAPI error handlers and converted to appropriate HTTP responses.
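The hierarchy and its conversion to HTTP responses might look roughly like the following. The class names come from the tree above, but the status codes and the `to_http_response` helper are illustrative assumptions, not ModelForge's actual mapping.

```python
class ModelForgeException(Exception):
    """Base class for all ModelForge errors."""


class ProviderError(ModelForgeException): pass
class StrategyError(ModelForgeException): pass
class DatasetValidationError(ModelForgeException): pass
class TrainingError(ModelForgeException): pass
class ConfigurationError(ModelForgeException): pass
class HardwareError(ModelForgeException): pass
class DatabaseError(ModelForgeException): pass


# Hypothetical mapping: anything not listed falls back to 500.
STATUS_BY_EXCEPTION = (
    (DatasetValidationError, 422),
    (ConfigurationError, 400),
    (HardwareError, 503),
)


def to_http_response(exc):
    """Turn a ModelForge exception into a (status_code, body) pair."""
    for exc_type, status in STATUS_BY_EXCEPTION:
        if isinstance(exc, exc_type):
            break
    else:
        status = 500
    return status, {"error": type(exc).__name__, "detail": str(exc)}


print(to_http_response(DatasetValidationError("missing 'text' column")))
# (422, {'error': 'DatasetValidationError', 'detail': "missing 'text' column"})
```

In a FastAPI app, a helper like this would typically be called from a handler registered with `@app.exception_handler(ModelForgeException)`.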
Location: ModelForge/schemas/
Validation and serialization of configuration.
- HUGGINGFACE_TOKEN - HuggingFace API token
- MODELFORGE_DB_PATH - Custom database path
- MODELFORGE_DISABLE_TENSORBOARD - Disable TensorBoard
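A small loader for these variables could look like the sketch below. The `Settings` class, the default database path, and the set of values treated as "true" are assumptions for illustration; ModelForge's own parsing may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_DB_PATH = "modelforge.db"  # hypothetical default, not taken from the project


@dataclass(frozen=True)
class Settings:
    huggingface_token: Optional[str]
    db_path: str
    tensorboard_enabled: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the three documented variables from an environment mapping."""
    disable = env.get("MODELFORGE_DISABLE_TENSORBOARD", "").strip().lower()
    return Settings(
        # An empty token is treated the same as a missing one.
        huggingface_token=env.get("HUGGINGFACE_TOKEN") or None,
        db_path=env.get("MODELFORGE_DB_PATH", DEFAULT_DB_PATH),
        tensorboard_enabled=disable not in {"1", "true", "yes"},
    )


settings = load_settings({"MODELFORGE_DB_PATH": "/tmp/forge.db"})
print(settings.db_path)  # /tmp/forge.db
```

Accepting the environment as a parameter keeps the function testable without touching the real process environment.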
Test individual components in isolation:
def test_provider_factory():
    provider = ProviderFactory.create_provider("huggingface")
    assert provider.get_provider_name() == "huggingface"

Test component interactions:
def test_training_flow():
    # mock_db, mock_file_manager, and config are assumed to be prepared test doubles.
    service = TrainingService(mock_db, mock_file_manager)
    result = service.train_model(config)
    assert result["status"] == "success"

Use dependency injection for easy mocking:
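from unittest.mock import MagicMock

# spec= makes the mock reject attributes that DatabaseManager does not define.
mock_db = MagicMock(spec=DatabaseManager)
service = TrainingService(mock_db, file_manager)

SQLAlchemy connection pool: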
- Pool size: 10
- Max overflow: 20
- Recycle: 3600 seconds
Lazy loading: models and datasets are loaded only when needed.
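Loading on first use can be sketched with `functools.cached_property`. The `LazyModel` wrapper and `fake_loader` below are hypothetical stand-ins; how ModelForge defers loading internally is not shown in this document.

```python
from functools import cached_property


class LazyModel:
    """Defers the expensive load until .model is first accessed."""

    def __init__(self, model_id, loader):
        self.model_id = model_id
        self._loader = loader

    @cached_property
    def model(self):
        # Runs on first access only; the result is then cached on the instance.
        return self._loader(self.model_id)


load_calls = []


def fake_loader(model_id):
    # Stand-in for a real model load; records each call so laziness is visible.
    load_calls.append(model_id)
    return {"weights_for": model_id}


handle = LazyModel("demo-model", fake_loader)
print(load_calls)  # []
handle.model
handle.model
print(load_calls)  # ['demo-model']
```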
Gradient checkpointing: reduces memory use at the cost of extra compute.
Mixed precision: bf16/fp16 for faster training on modern GPUs.
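Choosing between bf16 and fp16 usually depends on what the GPU supports. The helper below is a hypothetical sketch of that decision; it takes plain booleans so it runs without a GPU, and the returned keys match the `bf16` and `fp16` flags of Hugging Face `TrainingArguments`. How ModelForge makes this choice is not stated here.

```python
def precision_flags(cuda_available: bool, bf16_supported: bool) -> dict:
    """Pick at most one mixed-precision mode; use full precision without a GPU."""
    if not cuda_available:
        return {"bf16": False, "fp16": False}
    if bf16_supported:
        # Prefer bf16 where the hardware supports it.
        return {"bf16": True, "fp16": False}
    return {"bf16": False, "fp16": True}


print(precision_flags(cuda_available=True, bf16_supported=False))
# {'bf16': False, 'fp16': True}
```

With PyTorch installed, the two inputs would typically come from `torch.cuda.is_available()` and `torch.cuda.is_bf16_supported()`.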
All inputs are validated via Pydantic schemas.
SQLAlchemy ORM queries are parameterized, which guards against SQL injection.
File paths are validated and sandboxed.
HuggingFace tokens are stored in environment variables, never in code.
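Path sandboxing of the kind mentioned above is commonly done by resolving the requested path and checking that it still sits under an allowed base directory. The sketch below shows that general technique with `pathlib`; `resolve_inside` and the example base directory are hypothetical, not ModelForge's actual implementation.

```python
from pathlib import Path


def resolve_inside(base_dir, user_path):
    """Return user_path resolved under base_dir, or raise if it escapes."""
    base = Path(base_dir).resolve()
    # An absolute user_path replaces base when joined, so the check below still catches it.
    candidate = (base / user_path).resolve()
    if candidate != base and base not in candidate.parents:
        raise ValueError(f"path escapes sandbox: {user_path}")
    return candidate


safe = resolve_inside("/srv/modelforge-data", "datasets/train.jsonl")
print(safe.name)  # train.jsonl
```

Resolving before comparing matters: a plain string prefix check would accept inputs such as `datasets/../../secrets.txt`.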
Current:
- Single-GPU training
- Single-process server
- SQLite database
Future:
- Multi-GPU support (already structured for it)
- Distributed training
- PostgreSQL for production
- Redis for caching
- Kubernetes deployment
| Metric | Value |
|---|---|
| Code Duplication | 0% |
| Cyclomatic Complexity | Low (< 10 per function) |
| Test Coverage | (To be added) |
| Type Hints | Extensive |
| Documentation | Comprehensive |
See Contributing Guide for:
- Code style guidelines
- Testing requirements
- PR process
Understanding the architecture makes contributing easy! Read the code in ModelForge/ to see it in action.