ModelForge v3 Architecture

Technical overview of ModelForge's modular architecture.

Architecture Overview

ModelForge v3 uses a clean, modular architecture based on SOLID principles:

┌─────────────────────────────────────────────────────┐
│                   Web Interface (React)              │
└────────────────────┬────────────────────────────────┘
                     │
                     │ HTTP/REST
                     ▼
┌─────────────────────────────────────────────────────┐
│              FastAPI Application Layer               │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────┐  │
│  │  Finetuning  │  │    Models    │  │Playground│  │
│  │    Router    │  │    Router    │  │  Router  │  │
│  └──────┬───────┘  └──────┬───────┘  └────┬─────┘  │
└─────────┼──────────────────┼───────────────┼────────┘
          │                  │               │
          │ Dependency Injection (FastAPI)   │
          ▼                  ▼               ▼
┌─────────────────────────────────────────────────────┐
│                  Service Layer                       │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────┐  │
│  │   Training   │  │    Model     │  │ Hardware │  │
│  │   Service    │  │   Service    │  │ Service  │  │
│  └──────┬───────┘  └──────┬───────┘  └────┬─────┘  │
└─────────┼──────────────────┼───────────────┼────────┘
          │                  │               │
          ▼                  ▼               ▼
┌─────────────────────────────────────────────────────┐
│             Business Logic Layer                     │
│                                                      │
│  ┌─────────────────┐      ┌────────────────────┐   │
│  │  Provider       │      │  Strategy          │   │
│  │  Factory        │      │  Factory           │   │
│  │                 │      │                    │   │
│  │ ┌─────────────┐ │      │ ┌────────────────┐ │   │
│  │ │ HuggingFace │ │      │ │      SFT       │ │   │
│  │ │  Provider   │ │      │ │   Strategy     │ │   │
│  │ └─────────────┘ │      │ └────────────────┘ │   │
│  │ ┌─────────────┐ │      │ ┌────────────────┐ │   │
│  │ │   Unsloth   │ │      │ │     QLoRA      │ │   │
│  │ │  Provider   │ │      │ │   Strategy     │ │   │
│  │ └─────────────┘ │      │ └────────────────┘ │   │
│  └─────────────────┘      └────────────────────┘   │
│                                                      │
│  ┌─────────────────┐      ┌────────────────────┐   │
│  │  Evaluation     │      │  Quantization      │   │
│  │  System         │      │  Factory           │   │
│  └─────────────────┘      └────────────────────┘   │
└──────────────────────────────────────────────────────┘
          │                              │
          ▼                              ▼
┌─────────────────────────────────────────────────────┐
│                  Data Layer                          │
│  ┌──────────────────┐      ┌────────────────────┐  │
│  │  Database        │      │  File Manager      │  │
│  │  Manager         │      │                    │  │
│  │  (SQLAlchemy)    │      │  - Datasets        │  │
│  │                  │      │  - Checkpoints     │  │
│  │  - Models        │      │  - Logs            │  │
│  │  - Training      │      └────────────────────┘  │
│  └──────────────────┘                               │
└──────────────────────────────────────────────────────┘

Core Components

1. Routers (API Layer)

Location: ModelForge/routers/

Responsibility: HTTP request handling

Files:

finetuning_router.py - Training endpoints
models_router.py - Model management
playground_router.py - Inference testing
hub_management_router.py - Model hub operations

Pattern: Thin controllers, delegate to services

2. Services (Business Logic)

Location: ModelForge/services/

Responsibility: Core business logic

Files:

training_service.py - Training orchestration
model_service.py - Model CRUD operations
hardware_service.py - Hardware detection

Pattern: Service layer with dependency injection

3. Providers (Model Loading)

Location: ModelForge/providers/

Responsibility: Model and tokenizer loading

Files:

__init__.py - Provider protocol
huggingface_provider.py - HuggingFace implementation
unsloth_provider.py - Unsloth implementation
provider_factory.py - Provider creation

Pattern: Protocol + Factory

4. Strategies (Training Algorithms)

Location: ModelForge/strategies/

Responsibility: Training algorithm implementation

Files:

__init__.py - Strategy protocol
sft_strategy.py - Supervised fine-tuning
qlora_strategy.py - Quantized LoRA
rlhf_strategy.py - RLHF
dpo_strategy.py - DPO
strategy_factory.py - Strategy creation

Pattern: Strategy + Factory

5. Database Layer

Location: ModelForge/database/

Responsibility: Data persistence

Files:

models.py - SQLAlchemy models
database_manager.py - DB operations

Pattern: Repository with ORM

6. Evaluation System

Location: ModelForge/evaluation/

Responsibility: Training evaluation

Files:

metrics.py - Task-specific metrics
dataset_validator.py - Dataset validation

Design Patterns

Dependency Injection

Implementation: FastAPI's Depends()

Example:

from fastapi import APIRouter, Depends
from ..dependencies import get_training_service

router = APIRouter()

@router.post("/start_training")
async def start_training(
    config: TrainingConfig,
    service: TrainingService = Depends(get_training_service),
):
    return service.train_model(config.model_dump())

Factory Pattern

Used for: Providers and Strategies

Example:

class ProviderFactory:
    _providers = {
        "huggingface": HuggingFaceProvider,
        "unsloth": UnslothProvider,
    }
    
    @classmethod
    def create_provider(cls, provider_name: str):
        provider_class = cls._providers.get(provider_name)
        if not provider_class:
            raise ProviderError(f"Unknown provider: {provider_name}")
        return provider_class()

Strategy Pattern

Used for: Training algorithms

Example:

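from typing import Protocol

class TrainingStrategy(Protocol):
    def prepare_model(self, model, config): ...
    def prepare_dataset(self, dataset, tokenizer, config): ...
    # Remaining create_trainer parameters are elided here; **kwargs stands in for them.
    def create_trainer(self, model, dataset, **kwargs): ...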

Protocol (Interface)

Used for: Defining contracts

Example:

from typing import Protocol

class ModelProvider(Protocol):
    # Parameters beyond model_id are elided here; **kwargs stands in for them.
    def load_model(self, model_id: str, **kwargs): ...
    def load_tokenizer(self, model_id: str, **kwargs): ...
    def validate_model_access(self, model_id: str, **kwargs): ...
    def get_provider_name(self) -> str: ...
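
Protocols are checked structurally, so a provider does not have to inherit from ModelProvider; it only needs matching methods. The sketch below illustrates this with simplified signatures that take only model_id and a hypothetical EchoProvider. Neither class is taken from the ModelForge code base.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ModelProvider(Protocol):
    """Simplified stand-in for the protocol; extra parameters are omitted."""

    def load_model(self, model_id: str): ...
    def load_tokenizer(self, model_id: str): ...
    def validate_model_access(self, model_id: str): ...
    def get_provider_name(self) -> str: ...


class EchoProvider:
    """Hypothetical provider: it does not inherit from ModelProvider."""

    def load_model(self, model_id: str):
        return f"model:{model_id}"

    def load_tokenizer(self, model_id: str):
        return f"tokenizer:{model_id}"

    def validate_model_access(self, model_id: str):
        return True

    def get_provider_name(self) -> str:
        return "echo"


class NotAProvider:
    """Missing most of the required methods, so the structural check fails."""

    def get_provider_name(self) -> str:
        return "broken"


# runtime_checkable only verifies that the methods exist, not their signatures.
print(isinstance(EchoProvider(), ModelProvider))  # True
print(isinstance(NotAProvider(), ModelProvider))  # False
```

Static type checkers perform the stricter signature check; the runtime check is mainly useful as a guard when registering providers.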

Data Flow

Training Request Flow

1. User submits training request via UI
2. React Frontend sends POST to /api/start_training
3. FastAPI Router receives request, validates with Pydantic
4. Router injects TrainingService via dependency
5. TrainingService orchestrates:
   - Validates dataset
   - Creates provider from ProviderFactory
   - Loads model via provider
   - Creates strategy from StrategyFactory
   - Prepares model and dataset via strategy
   - Creates trainer and starts training
6. Training runs with callbacks for progress
7. Results saved to database and file system
8. Response returned to user
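
The orchestration step can be sketched as a single function that wires the two factories together. This is a minimal illustration with stub classes, not ModelForge's actual TrainingService: the config keys (provider, strategy, model_id), the create_strategy method name, and every stub class are assumptions made for the example.

```python
class StubProvider:
    """Stand-in for a real provider; returns placeholder objects."""

    def load_model(self, model_id):
        return {"model_id": model_id}

    def load_tokenizer(self, model_id):
        return f"tokenizer-for-{model_id}"


class StubTrainer:
    def __init__(self, model, dataset):
        self.model, self.dataset = model, dataset

    def train(self):
        return {"status": "success", "examples": len(self.dataset)}


class StubStrategy:
    """Stand-in for a training strategy with the three protocol methods."""

    def prepare_model(self, model, config):
        return {**model, "prepared": True}

    def prepare_dataset(self, dataset, tokenizer, config):
        return [row.strip() for row in dataset]

    def create_trainer(self, model, dataset, **kwargs):
        return StubTrainer(model, dataset)


class ProviderFactory:
    _providers = {"stub": StubProvider}

    @classmethod
    def create_provider(cls, name):
        return cls._providers[name]()


class StrategyFactory:
    _strategies = {"stub": StubStrategy}

    @classmethod
    def create_strategy(cls, name):  # method name is an assumption
        return cls._strategies[name]()


def train_model(config, dataset):
    # Validate the dataset (here: only reject an empty one).
    if not dataset:
        raise ValueError("dataset is empty")
    # Create the provider and load the model and tokenizer through it.
    provider = ProviderFactory.create_provider(config["provider"])
    model = provider.load_model(config["model_id"])
    tokenizer = provider.load_tokenizer(config["model_id"])
    # Create the strategy and let it prepare the model and dataset.
    strategy = StrategyFactory.create_strategy(config["strategy"])
    model = strategy.prepare_model(model, config)
    dataset = strategy.prepare_dataset(dataset, tokenizer, config)
    # Create the trainer and start training.
    return strategy.create_trainer(model, dataset).train()


result = train_model(
    {"provider": "stub", "strategy": "stub", "model_id": "demo"},
    [" hello ", " world "],
)
print(result)  # {'status': 'success', 'examples': 2}
```

Because the service only talks to the factories and the protocol methods, swapping the provider or strategy is a change to the config, not to train_model.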

Model Loading Flow

User Request
    ↓
ProviderFactory.create_provider(provider_name)
    ↓
Provider.load_model(model_id, config)
    ↓
Provider-specific implementation
    ↓
Return (model, tokenizer)

Extension Points

Adding a Provider

Create class implementing ModelProvider protocol
Register in ProviderFactory._providers
That's it! No other changes needed.
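
As a rough illustration of those two steps, the sketch below registers a hypothetical LocalProvider in a stand-in ProviderFactory (a trimmed copy of the factory shown under Factory Pattern, with a plain ValueError in place of ProviderError). The provider's method bodies are placeholders.

```python
class ProviderFactory:
    """Trimmed stand-in for the real factory."""

    _providers = {}

    @classmethod
    def create_provider(cls, provider_name: str):
        provider_class = cls._providers.get(provider_name)
        if not provider_class:
            raise ValueError(f"Unknown provider: {provider_name}")
        return provider_class()


# Step 1: implement the ModelProvider protocol (no base class needed).
class LocalProvider:
    def load_model(self, model_id: str):
        return f"model loaded from ./models/{model_id}"  # placeholder

    def load_tokenizer(self, model_id: str):
        return f"tokenizer loaded from ./models/{model_id}"  # placeholder

    def validate_model_access(self, model_id: str):
        return True

    def get_provider_name(self) -> str:
        return "local"


# Step 2: register it under the name callers will pass to create_provider.
ProviderFactory._providers["local"] = LocalProvider

provider = ProviderFactory.create_provider("local")
print(provider.get_provider_name())  # local
```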

Adding a Strategy

Create class implementing TrainingStrategy protocol
Register in StrategyFactory._strategies
That's it! No other changes needed.

Adding a Task

Add task-specific formatter in services/training_service.py
Add metrics in evaluation/metrics.py
Update schema validation

Error Handling

Exception Hierarchy

ModelForgeException (base)
├── ProviderError
├── StrategyError
├── DatasetValidationError
├── TrainingError
├── ConfigurationError
├── HardwareError
└── DatabaseError

Error Handler

All exceptions are caught by FastAPI exception handlers and converted to appropriate HTTP responses.
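
One way to picture the conversion is a lookup from exception class to HTTP status code. The status codes below are assumptions chosen for illustration, not ModelForge's actual mapping, and only part of the hierarchy is reproduced. In a FastAPI application, a function like the hypothetical to_http_response could be called from a handler registered with app.add_exception_handler.

```python
class ModelForgeException(Exception):
    """Base class for all application errors."""


class ProviderError(ModelForgeException): ...
class DatasetValidationError(ModelForgeException): ...
class ConfigurationError(ModelForgeException): ...
class HardwareError(ModelForgeException): ...


# Hypothetical mapping; anything not listed falls back to 500.
STATUS_CODES = {
    DatasetValidationError: 422,
    ConfigurationError: 400,
    ProviderError: 502,
}


def to_http_response(exc: Exception) -> tuple[int, dict]:
    """Return (status_code, JSON body) for an exception."""
    # Walk the class hierarchy so subclasses inherit their parent's status.
    for exc_type in type(exc).__mro__:
        if exc_type in STATUS_CODES:
            status = STATUS_CODES[exc_type]
            break
    else:
        status = 500
    return status, {"error": type(exc).__name__, "detail": str(exc)}


print(to_http_response(DatasetValidationError("missing 'text' column")))
print(to_http_response(HardwareError("no GPU found")))
```

Keeping the mapping in one table means a new exception type needs only one new entry, or none if the 500 fallback is acceptable.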

Configuration Management

Pydantic Schemas

Location: ModelForge/schemas/

Validation and serialization of configuration.

Environment Variables

HUGGINGFACE_TOKEN - HuggingFace API token
MODELFORGE_DB_PATH - Custom database path
MODELFORGE_DISABLE_TENSORBOARD - Disable TensorBoard
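
A minimal sketch of reading these variables in one place. The Settings class, the load_settings function, the default database path, and the set of values treated as "true" are all assumptions for illustration; only the three variable names come from the list above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    huggingface_token: Optional[str]
    db_path: str
    tensorboard_enabled: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    # Assumption: "1", "true" or "yes" (case-insensitive) disables TensorBoard.
    disable = env.get("MODELFORGE_DISABLE_TENSORBOARD", "").strip().lower()
    return Settings(
        huggingface_token=env.get("HUGGINGFACE_TOKEN"),
        # Assumption: placeholder default, not the application's real default path.
        db_path=env.get("MODELFORGE_DB_PATH", "modelforge.db"),
        tensorboard_enabled=disable not in {"1", "true", "yes"},
    )


settings = load_settings({"MODELFORGE_DISABLE_TENSORBOARD": "true"})
print(settings.db_path, settings.tensorboard_enabled)  # modelforge.db False
```

Passing the environment as a mapping keeps the function easy to test without touching the real process environment.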

Testing Strategy

Unit Tests

Test individual components in isolation:

def test_provider_factory():
    provider = ProviderFactory.create_provider("huggingface")
    assert provider.get_provider_name() == "huggingface"

Integration Tests

Test component interactions:

def test_training_flow():
    service = TrainingService(mock_db, mock_file_manager)
    result = service.train_model(config)
    assert result["status"] == "success"

Mocking

Use dependency injection for easy mocking:

from unittest.mock import MagicMock

mock_db = MagicMock(spec=DatabaseManager)  # spec= limits the mock to DatabaseManager's real attributes
service = TrainingService(mock_db, file_manager)
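
A self-contained version of the same idea, using stand-in DatabaseManager and TrainingService classes. Their methods (such as save_run) are hypothetical and exist only so the example runs on its own.

```python
from unittest.mock import MagicMock


class DatabaseManager:
    """Stand-in for the real class; save_run is a hypothetical method."""

    def save_run(self, run: dict) -> None:
        raise RuntimeError("would hit a real database")


class TrainingService:
    """Stand-in service that receives its collaborators via the constructor."""

    def __init__(self, db, file_manager):
        self.db = db
        self.file_manager = file_manager

    def train_model(self, config: dict) -> dict:
        result = {"status": "success", "model": config["model_id"]}
        self.db.save_run(result)
        return result


def test_training_records_run():
    # spec= makes the mock reject attributes DatabaseManager does not have.
    mock_db = MagicMock(spec=DatabaseManager)
    service = TrainingService(mock_db, file_manager=MagicMock())

    result = service.train_model({"model_id": "demo"})

    assert result["status"] == "success"
    mock_db.save_run.assert_called_once_with(result)


test_training_records_run()
```

Because the service never constructs its own DatabaseManager, the test needs no patching: the mock is simply passed in.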

Performance Considerations

Connection Pooling

SQLAlchemy connection pool:

Pool size: 10
Max overflow: 20
Recycle: 3600 seconds
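
These values correspond to keyword arguments of SQLAlchemy's create_engine. The sketch below shows how they might be passed; the make_engine helper and the explicit QueuePool choice are assumptions, not a copy of ModelForge's database_manager.py.

```python
# Pool settings taken from the list above.
POOL_OPTIONS = {
    "pool_size": 10,       # connections kept open in the pool
    "max_overflow": 20,    # extra connections allowed beyond pool_size under load
    "pool_recycle": 3600,  # seconds after which a connection is replaced
}


def make_engine(url: str):
    """Create an engine with the pool settings above (requires SQLAlchemy)."""
    from sqlalchemy import create_engine
    from sqlalchemy.pool import QueuePool

    # QueuePool is requested explicitly because some SQLite configurations
    # default to a pool class that does not accept pool_size/max_overflow.
    return create_engine(url, poolclass=QueuePool, **POOL_OPTIONS)


# Upper bound on simultaneous connections: pool_size + max_overflow.
print(POOL_OPTIONS["pool_size"] + POOL_OPTIONS["max_overflow"])  # 30
```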

Lazy Loading

Models and datasets are loaded only when needed.
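
One common way to get this behavior in Python is functools.cached_property. The sketch below is an illustration of the pattern with a hypothetical LazyModelHandle and a fake loader; it is not taken from the ModelForge code.

```python
from functools import cached_property


class LazyModelHandle:
    """Defers an expensive load until the model is first accessed."""

    def __init__(self, model_id: str, loader):
        self.model_id = model_id
        self._loader = loader

    @cached_property
    def model(self):
        # Runs once on first access; the result is cached on the instance.
        return self._loader(self.model_id)


calls = []


def fake_loader(model_id):
    calls.append(model_id)  # record each real load
    return f"weights for {model_id}"


handle = LazyModelHandle("demo", fake_loader)
print(len(calls))  # 0: nothing loaded yet
handle.model
handle.model
print(len(calls))  # 1: loaded once, then reused
```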

Gradient Checkpointing

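Reduces activation memory by recomputing intermediate activations during the backward pass instead of storing them, at the cost of extra compute.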

Mixed Precision

bf16/fp16 mixed precision for faster training and lower memory use on GPUs that support it.
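
A small sketch of how the choice between the two formats might be made. The precision_flags function is hypothetical; with PyTorch its inputs could come from torch.cuda.is_available() and torch.cuda.is_bf16_supported(), and the returned keys match the bf16 and fp16 flags of Hugging Face TrainingArguments. Whether ModelForge selects precision this way is an assumption.

```python
def precision_flags(cuda_available: bool, bf16_supported: bool) -> dict:
    """Choose mixed-precision flags, preferring bf16 when the GPU supports it."""
    if not cuda_available:
        return {"bf16": False, "fp16": False}  # stay in full precision on CPU
    if bf16_supported:
        return {"bf16": True, "fp16": False}
    return {"bf16": False, "fp16": True}


print(precision_flags(cuda_available=True, bf16_supported=True))
print(precision_flags(cuda_available=True, bf16_supported=False))
```

bf16 is preferred here because it keeps the exponent range of fp32, which usually makes loss scaling unnecessary.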

Security

Input Validation

All inputs are validated via Pydantic schemas.

SQL Injection Prevention

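Queries go through the SQLAlchemy ORM, which passes values as bound parameters instead of interpolating them into SQL text. Raw SQL strings built from user input would bypass this protection.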
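
The protection comes from parameter binding, which the ORM uses when it generates queries. The sketch below shows the difference using the standard-library sqlite3 module rather than SQLAlchemy, purely so it runs without extra dependencies; the table and its contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE models (name TEXT)")
conn.execute("INSERT INTO models VALUES ('llama'), ('mistral')")

malicious = "nope' OR '1'='1"

# Unsafe: the input becomes part of the SQL text and changes the query's logic.
unsafe = conn.execute(
    f"SELECT name FROM models WHERE name = '{malicious}'"
).fetchall()

# Safe: the input is passed as a bound parameter and treated purely as a value.
safe = conn.execute(
    "SELECT name FROM models WHERE name = ?", (malicious,)
).fetchall()

print(len(unsafe), len(safe))  # 2 0
```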

File Access

File paths are validated and sandboxed.
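
A typical way to enforce this kind of sandbox is to resolve the requested path and check that it still lies under an allowed base directory. The sketch below assumes a hypothetical resolve_in_sandbox helper and an example base directory; it requires Python 3.9 or later for Path.is_relative_to and is not ModelForge's actual implementation.

```python
from pathlib import Path


def resolve_in_sandbox(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting anything that escapes it."""
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()
    # resolve() collapses ".." segments and symlinks before the containment check.
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes sandbox: {user_path}")
    return candidate


print(resolve_in_sandbox("/tmp/modelforge", "datasets/train.jsonl"))
try:
    resolve_in_sandbox("/tmp/modelforge", "../../etc/passwd")
except ValueError as exc:
    print(exc)
```

Absolute inputs are rejected as well, because joining an absolute path onto the base discards the base.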

Token Security

HuggingFace tokens are read from the environment (HUGGINGFACE_TOKEN) and never stored in code.

Scalability

Current Limitations

Single-GPU training
Single-process server
SQLite database

Future Improvements

Multi-GPU support (already structured for it)
Distributed training
PostgreSQL for production
Redis for caching
Kubernetes deployment

Code Quality Metrics

Metric	Value
Code Duplication	0%
Cyclomatic Complexity	Low (< 10 per function)
Test Coverage	(To be added)
Type Hints	Extensive
Documentation	Comprehensive

Contributing

See Contributing Guide for:

Code style guidelines
Testing requirements
PR process

Understanding the architecture makes contributing easy! Read the code in ModelForge/ to see it in action.
