dynamic estimate of required memory usage #438
Green-Sky wants to merge 3 commits into ggml-org:master
hold up, need to fix perplexity. update: still investigating.
@Green-Sky UB is hard to fix, I really appreciate it! I'll try this PR tomorrow. Before that, let me make an immature suggestion: think about the situation where a new segmentation fault occurs again and still takes time to fix.
it is only UB if you run without address sanitizer 😉
so, 32GiB are not enough to run perplexity (defaults) on 7B q4_0. edit: with context 1024. edit: #407 changes how this works
for some reason @ggerganov pushed 4870e45 👀
Unfortunately, this still doesn't fix the memory allocation issues :( From what I can tell, it pretty much wraps stuff in vectors and adds an assertion to force the code to fail rather than segfaulting.
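For readers unfamiliar with the pattern described in the comment above, here is a minimal sketch of what "wrap the buffer in a vector and fail instead of segfaulting" can look like. It is an illustration only: `Pool` and `alloc` are hypothetical names, not ggml's actual context or API, and alignment is ignored to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixed-size bump allocator (NOT ggml's real memory pool).
struct Pool {
    std::vector<std::uint8_t> buf;  // owning storage, sized once up front
    std::size_t used = 0;           // bytes handed out so far

    explicit Pool(std::size_t capacity) : buf(capacity) {}

    // Hands out `size` bytes, or nullptr when the pool is too small.
    // The bounds check replaces a silent write past the end of the buffer.
    void * alloc(std::size_t size) {
        assert(used <= buf.size());      // invariant: never past the end
        if (size > buf.size() - used) {
            return nullptr;              // caller decides how to report it
        }
        void * p = buf.data() + used;
        used += size;
        return p;
    }
};
```

A check like this only turns an out-of-bounds write into a clean failure; as the comment notes, it does not make the pool large enough, which is what a better size estimate would have to do.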
@ggerganov promised a memory overhaul here #407 (comment) so i am closing this pr.
Runs smoothly, thanks!
officially replaced by #473
uses observations made in #213 and replaces it.
fixes
ggml_new_tensor_impl: not enough space in the context's memory pool
and resulting segfaults.
this is still as much of a hack as it was before, but this time it is working.
this could potentially fix a bunch of issues. (fixes #153)