This repository was archived by the owner on Jan 24, 2026. It is now read-only.

cjhCoder7/AdaptiveLLM


AdaptiveLLM: A Framework for Selecting Optimal Cost-Efficient LLM for Code-Generation Based on CoT Length

Introduction

Large Language Models (LLMs) have advanced code generation but struggle to balance performance and inference costs across diverse tasks. Dynamically selecting the optimal LLM based on task difficulty and resource constraints offers a solution, yet existing methods are resource-intensive, costly, and rely on human-annotated difficulty labels, which are often unavailable or misaligned with LLMs' perception.

We introduce AdaptiveLLM, a framework that dynamically selects the optimal LLM for each task by automatically assessing task difficulty. It estimates difficulty from the Chain-of-Thought (CoT) lengths produced by reasoning models, clusters tasks into three difficulty levels via k-means, and fine-tunes CodeBERT to embed difficulty-aware features. An XGBoost classifier then selects the best model for each task, optimizing the performance-cost trade-off.
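The difficulty-labelling step described above can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes each task is represented by a single scalar CoT length, uses a small hand-written one-dimensional k-means with k = 3, and the level names `easy`, `medium`, and `hard` as well as the sample lengths are hypothetical.

```python
from statistics import mean


def kmeans_1d(values, k=3, max_iter=100):
    """Cluster scalar values into k groups with plain Lloyd's k-means."""
    ordered = sorted(values)
    # Deterministic initialisation: spread the starting centroids over the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(max_iter):
        # Assignment step: each value joins the group of its nearest centroid.
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            groups[nearest].append(v)
        # Update step: move each centroid to the mean of its group.
        updated = [mean(g) if g else centroids[j] for j, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def difficulty_label(cot_length, centroids, names=("easy", "medium", "hard")):
    """Map a CoT length to the (hypothetical) name of its nearest centroid."""
    nearest = min(range(len(centroids)), key=lambda j: abs(cot_length - centroids[j]))
    return names[nearest]


# Made-up CoT lengths (in tokens) for nine tasks.
cot_lengths = [120, 150, 180, 900, 1000, 1100, 4000, 4500, 5200]
centroids = kmeans_1d(cot_lengths)
labels = [difficulty_label(n, centroids) for n in cot_lengths]
print(labels)
```

Sorting the centroids makes the cluster with the shortest reasoning traces the lowest difficulty level, which matches the idea that longer CoT indicates a harder task.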

Repository Structure

Baseline/

This folder stores the experimental code for the baseline method, ComplexityNet. In the ComplexityNet framework, the model pool consists of CodeLlama, GPT-3.5, and GPT-4o, and the selector is a fine-tuned Qwen2.5-7B-Instruct.

Consistency_Check/

compare/

This folder contains the box plot comparison between difficulty annotations based on CoT length and human-annotated difficulty levels. We conducted the comparison on two datasets: LeetCodeSample and CodeContests.

confusion_matrix/

This folder contains the confusion matrix comparing difficulty annotations based on CoT length with human-annotated difficulty levels, aimed at exploring the differences between the two classification methods. The comparison was also performed on the LeetCodeSample and CodeContests datasets.
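A comparison of this kind can be sketched as below. The label values and sample data are hypothetical, and the repository's own scripts may build the matrix differently; the sketch only shows how CoT-based labels and human labels for the same tasks can be tabulated against each other.

```python
from collections import Counter


def confusion_matrix(human_labels, cot_labels, levels):
    """Rows are human-annotated levels, columns are CoT-length-based levels."""
    counts = Counter(zip(human_labels, cot_labels))
    return [[counts[(h, c)] for c in levels] for h in levels]


# Hypothetical difficulty levels and annotations for six tasks.
levels = ["easy", "medium", "hard"]
human = ["easy", "easy", "medium", "hard", "hard", "medium"]
cot = ["easy", "medium", "medium", "hard", "medium", "medium"]

matrix = confusion_matrix(human, cot, levels)
# Fraction of tasks on which the two annotation schemes agree (the diagonal).
agreement = sum(matrix[i][i] for i in range(len(levels))) / len(human)

for level, row in zip(levels, matrix):
    print(level, row)
print("agreement:", round(agreement, 3))
```

Off-diagonal cells show where the two schemes disagree, for example tasks that humans rated hard but that the reasoning model solved with a medium-length CoT.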

Processed_Data/

This folder contains the original datasets as well as the datasets annotated with difficulty labels based on CoT length.

  • prompts_en_extra_is_freeform.jsonl: HumanEval dataset
  • prompts_python_en_test.jsonl: CodeContests dataset
  • prompts.jsonl: LeetCodeSample dataset

K-means/

This folder contains the combination of the three datasets, annotated with CoT-based difficulty labels by the DeepSeek-R1-Distill-Qwen-32B model.

Result/

This folder contains the generation results and code produced by invoking the models from the model pool on the three datasets.

Thinking_Length/

This folder contains the CoT lengths generated by the DeepSeek R1 distilled models with parameter sizes of 1.5B, 7B, 14B, and 32B, along with their corresponding clustering results.

Train/

This folder contains the code and results for fine-tuning CodeBERT and training the XGBoost classifier.

  • CodeBert_finetune.py: Training code for fine-tuning CodeBERT.
  • data_split.py: Code for splitting the dataset into training and testing sets.
  • score.py: Formula for calculating the cost-performance score of models.
  • Classifier.py: Code for training the XGBoost classifier.
  • test_data.jsonl: Test dataset.
  • train_data.jsonl: Training dataset.
  • predictions_1.jsonl: Prediction results from AdaptiveLLM.
  • predictions_2.jsonl: Prediction results from AdaptiveLLM (without fine-tuning).
  • xgboost_model_1.pkl: Trained XGBoost classifier from the AdaptiveLLM framework.
  • xgboost_model_2.pkl: Trained XGBoost classifier from AdaptiveLLM (without fine-tuning).

Model candidate pool

| LLM | Size | Link | Price (USD per M tokens) |
| --- | --- | --- | --- |
| Yi-Coder-1.5B-Chat | 1.5B | https://huggingface.co/01-ai/Yi-Coder-1.5B-Chat | 0.14 |
| Qwen2.5-Coder-1.5B-Instruct | 1.5B | https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct | 0.14 |
| CodeLlama-7b-Instruct-hf | 7B | https://huggingface.co/meta-llama/CodeLlama-7b-Instruct-hf | 0.42 |
| starcoder2-15b-instruct-v0.1 | 15B | https://huggingface.co/bigcode/starcoder2-15b-instruct-v0.1 | 0.72 |
| deepseek-coder-v2-lite-instruct | 16B | https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct | 0.72 |
| Codestral-22B-v0.1 | 22B | https://huggingface.co/mistralai/Codestral-22B-v0.1 | 0.95 |
| deepseek-coder-33b-instruct | 33B | https://huggingface.co/deepseek-ai/deepseek-coder-33b-instruct | 1.26 |
| Qwen2.5-Coder-32B-Instruct | 32B | https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct | 1.26 |
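To make the price column concrete, the sketch below converts a token count into an inference cost using the listed prices. It is only an arithmetic illustration: the helper name `inference_cost` and the 2,000-token example are hypothetical, and the sketch assumes a single flat price per million tokens, since the table does not say whether input and output tokens are priced differently.

```python
# Prices in USD per million tokens, copied from the table above.
PRICE_PER_M_TOKENS = {
    "Yi-Coder-1.5B-Chat": 0.14,
    "Qwen2.5-Coder-1.5B-Instruct": 0.14,
    "CodeLlama-7b-Instruct-hf": 0.42,
    "starcoder2-15b-instruct-v0.1": 0.72,
    "deepseek-coder-v2-lite-instruct": 0.72,
    "Codestral-22B-v0.1": 0.95,
    "deepseek-coder-33b-instruct": 1.26,
    "Qwen2.5-Coder-32B-Instruct": 1.26,
}


def inference_cost(model, n_tokens):
    """Hypothetical helper: cost in USD of processing n_tokens with the given model."""
    return PRICE_PER_M_TOKENS[model] * n_tokens / 1_000_000


# Cost of a hypothetical 2,000-token request on the cheapest and priciest tiers.
small = inference_cost("Yi-Coder-1.5B-Chat", 2000)
large = inference_cost("Qwen2.5-Coder-32B-Instruct", 2000)
print(f"small: ${small:.5f}, large: ${large:.5f}, ratio: {large / small:.1f}x")
```

Under these prices the largest models cost nine times as much per token as the smallest ones, which is the gap a per-task selector tries to exploit.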

Reasoning model

| LLM | Size | Link |
| --- | --- | --- |
| DeepSeek-R1-Distill-Qwen-1.5B | 1.5B | https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B |
| DeepSeek-R1-Distill-Qwen-7B | 7B | https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B |
| DeepSeek-R1-Distill-Qwen-14B | 14B | https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B |
| DeepSeek-R1-Distill-Qwen-32B | 32B | https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B |
| DeepSeek-R1 | 671B | https://huggingface.co/deepseek-ai/DeepSeek-R1 |
