This repository was archived by the owner on Jun 24, 2024. It is now read-only.

Sync to llama.cpp + GGML version as of 20230407 09:57 AM UTC.#119

Merged
philpax merged 2 commits into rustformers:main from KerfuffleV2:feat-update-ggml
Apr 7, 2023

Conversation

@KerfuffleV2 (Contributor) commented Apr 7, 2023

This would have been a pretty interesting first issue.

This also includes nuking the increased_determinism option since it's no longer needed.

I confirmed that it's possible to change the thread/batch sizes with deterministic results. I haven't done any extensive performance testing, but there didn't seem to be any obvious issues.

Other people should probably do some testing before this gets merged as it's a fairly complicated change. I can't say I really understood what I was porting across.

Closes #118
Closes #67
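The determinism claim above can be sanity-checked mechanically: run the same seeded prompt under different thread/batch settings and compare outputs. The sketch below is only an illustration of that check, not part of this PR; `run_inference` is a hypothetical stand-in, and any flags named in its comments are assumptions rather than the project's actual CLI options.

```shell
#!/bin/sh
# Determinism check: run the same prompt with different thread/batch
# settings and confirm the generated text is byte-identical.
#
# NOTE: run_inference is a hypothetical stand-in. Replace its body with
# your real inference invocation, e.g. something along the lines of:
#   your-llm-binary --seed 42 --threads "$1" --batch-size "$2" -p "test prompt"
run_inference() {
  # Stand-in: always emits the same text, as a truly deterministic
  # backend should for a fixed seed regardless of $1 (threads) / $2 (batch).
  echo "deterministic output for seed 42"
}

out_a=$(run_inference 4 8)
out_b=$(run_inference 8 32)

if [ "$out_a" = "$out_b" ]; then
  echo "deterministic: outputs match across thread/batch settings"
else
  echo "NOT deterministic: outputs differ"
fi
```

With a real backend substituted in, any divergence between the two runs would indicate that results still depend on thread count or batch size.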

@philpax (Collaborator) commented Apr 7, 2023

Seems pretty reasonable from what I can see; will need to test, but I have no objections to merging otherwise.

@philpax (Collaborator) commented Apr 7, 2023

@KerfuffleV2 (Contributor, Author) commented

Can do! By the way, the contributing thing refers to CREDITS.md, I didn't know what it meant until now.

@philpax (Collaborator) commented Apr 7, 2023

> Can do! By the way, the contributing thing refers to CREDITS.md, I didn't know what it meant until now.

made it clearer, ty

@philpax (Collaborator) left a review comment


Tested on macOS M1 and Windows x86-64 with no issues. Let's do it 🚀

@philpax philpax merged commit 89a1e70 into rustformers:main Apr 7, 2023


Development

Successfully merging this pull request may close these issues:

- Update to the latest llama.cpp ggml
- Making results independent from threadcount/batch size (from llama.cpp)
