Implement progress bar and multi-connection downloads#16196
ericcurtin wants to merge 1 commit into ggml-org:master from
Conversation
For llama-server pulling Signed-off-by: Eric Curtin <eric.curtin@docker.com>
FYI, if you want a smoother/more compact progress bar you can use fractional blocks. Example implementation for training: llama.cpp/ggml/src/ggml-opt.cpp, lines 935 to 958 in 0889589.
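The fractional-block approach referenced above can be sketched roughly like this (a minimal standalone sketch, not the actual ggml-opt.cpp code; `progress_bar` is a hypothetical helper name):

```cpp
#include <cstdint>
#include <string>

// Build a progress bar of `width` cells for fraction `done` in [0, 1].
// Full cells use U+2588; the boundary cell uses one of the eighth-block
// glyphs (U+2589..U+258F) for sub-cell resolution.
static std::string progress_bar(double done, int width) {
    static const char * eighths[9] = {" ", "▏", "▎", "▍", "▌", "▋", "▊", "▉", "█"};
    if (done < 0.0) done = 0.0;
    if (done > 1.0) done = 1.0;
    const int64_t total_eighths = (int64_t)(done * width * 8 + 0.5);
    const int     full          = (int)(total_eighths / 8);
    const int     rem           = (int)(total_eighths % 8);
    std::string bar;
    for (int i = 0; i < full; i++) bar += eighths[8];
    if (full < width) {
        bar += eighths[rem];                               // partial boundary cell
        for (int i = full + 1; i < width; i++) bar += eighths[0];
    }
    return bar;
}
```

Whether the eighth-block glyphs render correctly on all Windows terminals is exactly the portability question raised below.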
@JohannesGaessler I was thinking Docker-style because of portability. Do you know if this works on Windows?
Hey :) I'm currently in the middle of a larger rework of this exact section, with the goal of removing the cURL dependency completely. You can see the work in progress here: #16185. I also made a very simple progress bar 😆 To avoid getting tangled in merge conflicts, how about we get my refactor merged first to create a clean base, and then circle back to integrate the improvements from your patch? Let me know what you think.
Let me know when the PR is ready for review; I would like to test and review it.
I'll take time to review in the next few days. It's quite important, as multi-threaded downloading can significantly improve download speed. In the meantime, @ericcurtin, could you confirm whether the user-reported bugs from the original PR are resolved? I'm also thinking of a way to test this, maybe extending the server tests and hiding the new test behind a flag so it doesn't run automatically on CI.
I think for simplification we can adopt this style: it's just plain ASCII, so it should be OK on all systems. Other styles can be implemented in another PR.
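A plain-ASCII bar in the style suggested here could look roughly like the sketch below (`ascii_bar` is a hypothetical helper name, not code from the PR):

```cpp
#include <string>

// Render a plain-ASCII progress bar, e.g. "[=====     ] 50%".
// Avoids Unicode entirely, so it works on any terminal.
static std::string ascii_bar(double done, int width) {
    if (done < 0.0) done = 0.0;
    if (done > 1.0) done = 1.0;
    const int filled = (int)(done * width + 0.5);
    std::string bar = "[";
    bar.append(filled, '=');          // completed portion
    bar.append(width - filled, ' ');  // remaining portion
    bar += "] " + std::to_string((int)(done * 100 + 0.5)) + "%";
    return bar;
}
```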
Yeah, I kinda abandoned this for now until @angt was happy. It can improve performance quite a bit: I think @xenoscopic saw something like a 25% performance increase with similar code for DockerHub. A lot of people see zero improvement too, but I guess it depends on your networking setup; you can simply be throttled by bandwidth. In my apartment I see zero improvement.
I think I caught everything, except maybe the file-closing-before-rename one; I don't remember addressing that.
Sounds like a great idea. Or just use llama-pull to test pulling in an isolated way.
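The multi-connection downloading discussed in this thread generally works by splitting the file into byte ranges, one per connection, each fetched with an HTTP Range request (inclusive bounds, per RFC 9110). A minimal sketch of the range computation, assuming a hypothetical `split_ranges` helper rather than the actual PR code:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// An inclusive byte range, mapping to an HTTP header like
// "Range: bytes=first-last".
struct byte_range { int64_t first; int64_t last; };

// Split a file of `size` bytes into up to `n` contiguous ranges,
// one per download connection.
static std::vector<byte_range> split_ranges(int64_t size, int n) {
    std::vector<byte_range> ranges;
    if (size <= 0 || n <= 0) return ranges;
    const int64_t chunk = (size + n - 1) / n;  // ceiling division
    for (int64_t off = 0; off < size; off += chunk) {
        ranges.push_back({ off, std::min(off + chunk, size) - 1 });
    }
    return ranges;
}
```

Each range can then be fetched on its own connection and written at its offset; as noted above, the speedup depends on whether the bottleneck is per-connection throttling or total bandwidth.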