A Python implementation of GPU scheduling algorithms for containerized workloads, including a Fragmentation Gradient Descent (FGD) scheduler inspired by the paper "Beware of Fragmentation: Scheduling GPU-Sharing Workloads with Fragmentation Gradient Descent".
This project simulates GPU cluster scheduling with support for:
- Fractional GPU allocation (0.1 to 1.0 GPU units)
- Multi-GPU requests (>1.0 GPU units requiring full GPUs)
- CPU and GPU co-scheduling
- Fragmentation-aware scheduling algorithms
- Pod: Represents a containerized workload with CPU and GPU resource requests
- GPU: Models individual GPU resources with fractional allocation support
- Node: Represents a compute node with multiple GPUs and CPU resources
- Cluster: Collection of nodes representing the entire cluster
- PodQueue: FIFO queue of pending pods with workload distribution statistics
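A minimal sketch of what the Pod and GPU models might look like; the field and method names here are illustrative, not necessarily the actual attributes in src/Pod.py and src/GPU.py:

```python
from dataclasses import dataclass

@dataclass
class Pod:
    """Containerized workload with CPU and GPU requests (illustrative sketch)."""
    name: str
    cpu_request: int     # integer CPU units
    gpu_request: float   # 0.0 = CPU-only, 0.1-1.0 = fractional, >1.0 = multi-GPU

class GPU:
    """Single GPU supporting fractional allocation (illustrative sketch)."""
    def __init__(self, capacity: float = 1.0):
        self.capacity = capacity
        self.allocated = 0.0

    @property
    def free(self) -> float:
        return self.capacity - self.allocated

    def can_allocate(self, amount: float) -> bool:
        return self.allocated + amount <= self.capacity + 1e-9

    def allocate(self, amount: float) -> None:
        self.allocated += amount
```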
- Algorithm: Assigns each pod to the first node that can accommodate it
- Complexity: O(n × m) where n = pods, m = nodes
- Use case: Baseline comparison, fast scheduling
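A condensed sketch of the first-fit loop; `can_serve` and `serve` are assumed node methods, and the actual implementation in src/heuristics/FirstFitScheduler.py may differ:

```python
def first_fit(pods, nodes):
    """Assign each pod to the first node that can accommodate it (sketch)."""
    placements = {}
    for pod in pods:
        for node in nodes:
            if node.can_serve(pod):        # assumed feasibility check
                node.serve(pod)            # commit the allocation
                placements[pod.name] = node.name
                break                      # stop at the first fit
    return placements
```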
- Algorithm: Selects the node that minimizes fragmentation increase
- Complexity: O(n × m × k) where k = average pods in distribution
- Features:
  - Fragmentation-aware placement decisions
  - Expected fragmentation calculation based on workload distribution
  - Optimized with hypothetical serving (no deep copying)
The system implements fragmentation scoring based on the FGD paper:
- GPU Fragmentation: Unused GPU capacity that cannot serve future workloads
- Expected Fragmentation: Probability-weighted fragmentation across workload distribution
- Fragmentation Delta: Change in fragmentation after hypothetical pod placement
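A simplified sketch of the idea behind expected fragmentation (not the paper's full definition; the real functions live in src/utils/fragmentation_utils.py, and the interfaces below are assumptions):

```python
def node_fragmentation(node, pod) -> float:
    """Free GPU capacity on `node` too small to serve `pod`'s GPU request (simplified sketch)."""
    if pod.gpu_request <= 0.0:
        return 0.0  # CPU-only pods ignored in this simplification
    return sum(gpu.free for gpu in node.gpus if gpu.free < pod.gpu_request)

def expected_fragmentation(node, workload_distribution) -> float:
    """Probability-weighted fragmentation over the pod-type distribution (sketch)."""
    return sum(prob * node_fragmentation(node, pod)
               for pod, prob in workload_distribution)
```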
- Hypothetical Serving: Simulates resource allocation without state mutation
- Utility Functions: Pure functions for fragmentation calculations
- Efficient Data Structures: Uses deque for O(1) queue operations
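A sketch of the hypothetical-serving pattern: allocate in place, measure, then roll back instead of deep-copying node state. It builds on the expected_fragmentation sketch above, and the `serve`/`release` method names are assumptions:

```python
def fragmentation_delta(node, pod, workload_distribution) -> float:
    """Change in expected fragmentation if `pod` were placed on `node` (sketch)."""
    before = expected_fragmentation(node, workload_distribution)
    node.serve(pod)                        # hypothetically allocate in place
    after = expected_fragmentation(node, workload_distribution)
    node.release(pod)                      # roll back; no deep copy needed
    return after - before
```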
git clone <repository-url>
cd gpu-scheduler
pip install -r requirements.txt  # if requirements.txt exists

Run the main scheduling script from the root directory:

python schedule.py

To test the scheduler with your own workload:
- Format your input as a CSV with cpu_request,gpu_request columns.
- Replace or extend the data loading function in utils/load_data.py.
- Update the schedule.py logic if needed.
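A minimal loading sketch for a CSV in that layout; the `load_pods` helper and `PodSpec` type are illustrative, not the actual API of utils/load_data.py:

```python
import csv
from collections import namedtuple

PodSpec = namedtuple("PodSpec", ["name", "cpu_request", "gpu_request"])  # illustrative type

def load_pods(csv_path: str):
    """Read cpu_request,gpu_request rows into simple pod specs (sketch)."""
    with open(csv_path, newline="") as f:
        return [PodSpec(f"pod-{i}", int(row["cpu_request"]), float(row["gpu_request"]))
                for i, row in enumerate(csv.DictReader(f))]
```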
Example custom CSV:
cpu_request,gpu_request
2,0.5
4,1.0
1,0.0
8,2.0

src/
├── Pod.py # Pod resource model
├── GPU.py # GPU resource management
├── Node.py # Compute node with multiple GPUs
├── Cluster.py # Cluster of nodes
├── PodQueue.py # Pod queue with workload distribution
├── heuristics/
│ ├── BaseScheduler.py # Abstract scheduler base class
│ ├── FirstFitScheduler.py # First-fit scheduling algorithm
│ └── FGDScheduler.py # Fragmentation gradient descent
└── utils/
├── fragmentation_utils.py # Pure fragmentation calculation functions
└── load_data.py # Data loading utilities
schedule.py # Main evaluation script
- CPU Request: Integer CPU units required
- GPU Request: Float GPU units (0.0 to N.0)
  - 0.0: CPU-only workload
  - 0.1-1.0: Fractional GPU sharing
  - >1.0: Multi-GPU requiring full GPUs
- CPU Capacity: Total CPU units available
- GPU List: Collection of individual GPU resources
- Scheduling Constraints:
  - Multi-GPU requests require full GPUs
  - Fractional requests can share GPUs
  - CPU requirements must be satisfied
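A sketch of how these constraints might translate into a node-level feasibility check; the attribute and method names are assumptions, not the exact API of src/Node.py:

```python
import math

def can_serve(node, pod) -> bool:
    """Check CPU and GPU constraints for placing `pod` on `node` (sketch)."""
    if node.free_cpu < pod.cpu_request:
        return False
    if pod.gpu_request == 0.0:                        # CPU-only workload
        return True
    if pod.gpu_request <= 1.0:                        # fractional request can share a GPU
        return any(gpu.free >= pod.gpu_request for gpu in node.gpus)
    fully_free = sum(1 for gpu in node.gpus if gpu.allocated == 0.0)
    return fully_free >= math.ceil(pod.gpu_request)   # multi-GPU request needs whole GPUs
```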
The evaluation script tracks several metrics:
- Schedule Success Rate: Percentage of pods successfully placed
- Average Time per Pod: Scheduling latency per workload
- Fragmentation Score: Resource utilization efficiency
- Cluster Utilization: Overall resource usage
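For example, the first two metrics can be derived from the scheduling loop roughly as sketched below; the `scheduler.schedule(pod)` interface is an assumption, not the exact bookkeeping in schedule.py:

```python
import time

def evaluate(scheduler, pods):
    """Run a scheduler over a pod list and report basic metrics (sketch)."""
    start, scheduled = time.perf_counter(), 0
    for pod in pods:
        if scheduler.schedule(pod) is not None:   # assumed: returns the chosen node or None
            scheduled += 1
    elapsed = time.perf_counter() - start
    print(f"Scheduled {scheduled} out of {len(pods)} pods.")
    print(f"Scheduling success rate: {100 * scheduled / len(pods):.2f}%")
    print(f"Average time per pod: {elapsed / len(pods):.6f} seconds")
```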
- Surprisingly, although the two schedulers choose different nodes for the initial pods, they make similar placement decisions towards the end of the queue. Additionally, although fragmentation-aware scheduling should by design lead to a higher scheduling success rate, this is not achieved here.
- Unsurprisingly, FGD is slower than First Fit by roughly two orders of magnitude, since node evaluations are not done in parallel, which adds a factor of N (nodes evaluated per pod) to the algorithm's complexity.
Further work is necessary to determine the reason for the identical scheduling success rates.
First Fit Results:
Scheduled 6973 out of 8152 pods.
Scheduling success rate: 85.54%
Average time per pod: 0.000695 seconds
Average time per pod: 0.695 milliseconds
openb-node-0123 serves openb-pod-0000
openb-node-0123 serves openb-pod-0001
openb-node-0124 serves openb-pod-0002
openb-node-0124 serves openb-pod-0003
…….
openb-node-0123 serves openb-pod-8106
openb-node-0123 serves openb-pod-8113
openb-node-0123 serves openb-pod-8114
0 left to schedule
FGD Results:
Scheduled 6973 out of 8152 pods.
Scheduling success rate: 85.54%
Average time per pod: 0.075204 seconds
Average time per pod: 75.204 milliseconds
openb-node-1328 serves openb-pod-0000
openb-node-0356 serves openb-pod-0001
openb-node-1329 serves openb-pod-0002
…..
openb-node-0123 serves openb-pod-8106
openb-node-0123 serves openb-pod-8113
openb-node-0123 serves openb-pod-8114
0 left to schedule
For each pending pod:
- Calculate current fragmentation for each eligible node
- Simulate pod placement (hypothetical serving)
- Calculate fragmentation after placement
- Select node with minimum fragmentation increase
- Avoids deep copies by using in-place hypothetical simulation
- Leverages workload distribution statistics to estimate expected fragmentation
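Put together, the per-pod selection step can be sketched as follows, building on the can_serve and fragmentation_delta sketches above (names are illustrative):

```python
def fgd_select(pod, nodes, workload_distribution):
    """Pick the node whose hypothetical placement increases fragmentation least (sketch)."""
    best_node, best_delta = None, float("inf")
    for node in nodes:
        if not can_serve(node, pod):
            continue
        delta = fragmentation_delta(node, pod, workload_distribution)
        if delta < best_delta:
            best_node, best_delta = node, delta
    return best_node                       # None if no eligible node exists
```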
This implementation is based on the research paper:
- Title: "Beware of Fragmentation: Scheduling GPU-Sharing Workloads with Fragmentation Gradient Descent"
- Key Insight: GPU fragmentation significantly impacts cluster utilization
- Solution: Fragmentation-aware scheduling reduces resource waste
- Further evaluation to determine the reason for the identical scheduling success rates
- Support for GPU type constraints
- Dynamic workload arrival patterns and requeuing
- Distributed scheduling simulation with actual parallelization of the nodes
- Plotting of scheduling trajectory
