Support tool use and add example #174
Force-pushed from 8c484a2 to 84e2f64.
This looks really interesting!
mlx-swift-examples.xcodeproj/project.xcworkspace/xcshareddata/swiftpm/Package.resolved (outdated)
Force-pushed from 3959ba1 to be440a8.
This is almost ready, but some more changes in swift-transformers might be necessary: huggingface/swift-transformers#169
Force-pushed from aa1afcc to 559a44c.
Force-pushed from 559a44c to 50c3529.
This is now ready for review. I've added the
```swift
    }
    .task {
        self.prompt = llm.modelConfiguration.defaultPrompt
```
This should be put back once the prompt override is removed
I'm not sure what you mean.
Above you had hard-coded the prompt for development -- once that part is reverted, this should go back in (the per-model prompt).
Okay, I've reverted that part.
```swift
Toggle(isOn: $llm.includeWeatherTool) {
    Text("Include \"get current weather\" tool")
}
.frame(maxWidth: 350, alignment: .leading)
```
I think this is a great way to see the tool integration in action. I wonder if it should be included in this generic UI? I guess it is an example and people who use this as the basis for their own app can either include it or not, depending on their needs.
```swift
// let modelConfiguration = ModelRegistry.llama3_1_8B_4bit
// let modelConfiguration = ModelRegistry.mistral7B4bit
// let modelConfiguration = ModelRegistry.qwen2_5_7b
let modelConfiguration = ModelRegistry.qwen2_5_1_5b
```
Changing this to qwen is fine (it comes in at 1.7G) but please update the comment and remove the commented out parts.
| "required": ["location"], | ||
| ] as [String: any Sendable], | ||
| ] as [String: any Sendable], | ||
| ] as [String: any Sendable] |
This is pretty unwieldy :-) I see it matches the type in Transformers -- it is too bad we don't have a struct that could be used to declare this and convert to that type. Perhaps an exercise for future refactoring.
I looked at using a struct, but it seems even more unwieldy, although it would be more type-safe. Perhaps someone else can find a more elegant solution than me.
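For illustration, a typed declaration might look something like the sketch below. This is only a rough idea, not code from the PR: `ToolParameter`, `ToolSpec`, and the conversion property are hypothetical names, and the exact dictionary shape expected by the chat template is assumed to match the nested `[String: any Sendable]` literal shown above.

```swift
import Foundation

// Hypothetical sketch: typed declarations that flatten into the
// nested [String: any Sendable] shape used above. Not part of the PR.
struct ToolParameter {
    let name: String
    let type: String
    let description: String
    let required: Bool
}

struct ToolSpec {
    let name: String
    let description: String
    let parameters: [ToolParameter]

    // Convert to the dictionary form expected by the chat template.
    var asDictionary: [String: any Sendable] {
        [
            "type": "function",
            "function": [
                "name": name,
                "description": description,
                "parameters": [
                    "type": "object",
                    "properties": Dictionary(uniqueKeysWithValues: parameters.map {
                        ($0.name, ["type": $0.type, "description": $0.description] as [String: any Sendable])
                    }),
                    "required": parameters.filter(\.required).map(\.name),
                ] as [String: any Sendable],
            ] as [String: any Sendable],
        ]
    }
}

let weatherTool = ToolSpec(
    name: "get_current_weather",
    description: "Get the current weather in a given location",
    parameters: [
        ToolParameter(name: "location", type: "string",
                      description: "The city and state, e.g. San Francisco, CA",
                      required: true)
    ]
)
```

The trade-off is as noted: more type safety and less bracket noise at the call site, at the cost of an extra layer of conversion code.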
| static public let qwen205b4bit = ModelConfiguration( | ||
| id: "mlx-community/Qwen1.5-0.5B-Chat-4bit", | ||
| overrideTokenizer: "PreTrainedTokenizer", | ||
| defaultPrompt: "why is the sky blue?" |
Should this be removed? It is older, but people might still refer to this.
I've added it back, although I think it's fine to remove outdated models, since the pattern is clear from the existing examples.
@DePasqualeOrg this looks really cool! I added a few comments on removing some of the debug/dev code. It looks like there is a conflict in UserInput.swift now (I think from the merge of the video processing for VLMs).
Force-pushed from 4c64908 to a4d206e.
Thanks! I resolved the merge conflict and made the changes you suggested.
Force-pushed from a4d206e to d0f57b4.
davidkoski left a comment:
Awesome work, thank you!


This demonstrates tool use (function calling), which is now supported in my PR to Swift Jinja. You should make sure that you have the latest `tokenizer_config.json` file for each model, since in some cases function calling was added in a recent update.

The following are some examples of responses to the prompt "What's the weather in Paris today?" in LLMEval. A `get_current_weather` function is provided to the model in the prompt constructed with the chat template.

Llama 3.1 8B
Llama 3.2 3B
Qwen 2.5 7B
Mistral 7B
Performance
Llama 3.1 8B and 3.2 3B with the current chat templates from `tokenizer_config.json` tend to always respond with a function call, even when not appropriate. My proposed change to the chat template helps, but the models still sometimes respond with calls to non-existent functions. In general, the prompts provided in the Llama chat templates are far from optimal, and I think the models' performance could be further improved simply by using better prompts (for example, "Knowledge cutoff date" instead of "Cutting Knowledge Date").

Qwen 2.5 7B and Mistral 7B do a better job of calling functions only when appropriate.
Handling the function call
If I understand correctly, the app would need to parse the JSON function call, stop generating after the function call, call the function with the parsed arguments, append a message containing the function's output, and then generate a user-facing response from the updated messages.
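That flow could be sketched roughly as below. This is only an illustration of the loop, not code from the PR: `generate` and `getCurrentWeather` are hypothetical stubs standing in for the real LLM generation call and tool implementation, and the message-role names are assumptions.

```swift
import Foundation

// The shape of a parsed tool call emitted by the model.
struct ToolCall: Codable {
    let name: String
    let arguments: [String: String]
}

// Stub: in a real app this would run the LLM over the messages.
func generate(messages: [[String: String]]) -> String {
    messages.contains { $0["role"] == "tool" }
        ? "The weather in Paris today is 12°C and cloudy."
        : #"{"name": "get_current_weather", "arguments": {"location": "Paris"}}"#
}

// Stub: the actual tool implementation.
func getCurrentWeather(location: String) -> String {
    "{\"location\": \"\(location)\", \"temperature\": \"12°C\", \"conditions\": \"cloudy\"}"
}

func respond(to userPrompt: String) -> String {
    var messages: [[String: String]] = [["role": "user", "content": userPrompt]]

    // 1. Generate until the model either answers or emits a tool call.
    let output = generate(messages: messages)

    // 2. Try to parse the output as a JSON function call.
    if let data = output.data(using: .utf8),
       let call = try? JSONDecoder().decode(ToolCall.self, from: data),
       call.name == "get_current_weather" {

        // 3. Call the function with the parsed arguments.
        let result = getCurrentWeather(location: call.arguments["location"] ?? "")

        // 4. Append the call and its output, then generate again
        //    to produce the user-facing response.
        messages.append(["role": "assistant", "content": output])
        messages.append(["role": "tool", "content": result])
        return generate(messages: messages)
    }

    // No tool call: the output is already the final answer.
    return output
}
```

The key points the sketch captures are stopping generation at the tool call, executing the tool outside the model, and feeding the result back as a message for a second generation pass.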