Peter Albert issues

Results: 9 issues of Peter Albert

Model Request: Blenderbot 2.0

# 🌟 New model addition ## Model description Facebook released Blenderbot 2.0, a chatbot that builds on RAG and Blenderbot 1.0. It can save interactions for later reference and use...

New model

Include run id in train.log

Added a --run-id flag whose value appears in the log at each training step. The default value is a timestamp of the start of training. This allows for easy identification...

cla signed
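A minimal sketch of the idea described in this PR, not the actual metaseq code: a `--run-id` argument that defaults to a start-of-run timestamp and is included in every per-step log line. The timestamp format and the helper names `build_parser` and `format_step` are assumptions made for illustration.

```python
import argparse
import time


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a --run-id flag (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--run-id",
        # Assumed format; the default is computed once, when the parser is built.
        default=time.strftime("%Y%m%d-%H%M%S"),
        help="identifier added to every training log line",
    )
    return parser


def format_step(run_id: str, step: int, loss: float) -> str:
    """Format one training-step log line that carries the run id."""
    return f"run_id={run_id} step={step} loss={loss:.4f}"


if __name__ == "__main__":
    args = build_parser().parse_args()
    for step, loss in enumerate([2.5, 2.1, 1.9], start=1):
        print(format_step(args.run_id, step, loss))
```

Because every line carries the same identifier, log lines from overlapping or restarted runs can be separated by searching train.log for one run id.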

Unify sweep.py and slurm.py between metaseq and metaseq-internal

Continuation of https://github.com/facebookresearch/metaseq/pull/476 . As metaseq-internal's unification PR was not merged, a few other features were added to metaseq-internal's sweep and slurm. I brought these over into metaseq. Note: Gpu...

cla signed

Allow to skip batches during training

Allows skipping ranges of batches during training.

cla signed
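One way skipping ranges of batches could work, sketched under assumptions: the range syntax (`"10-20,35"`, inclusive, zero-based) and the names `parse_ranges` and `skip_batches` are hypothetical and are not taken from the PR.

```python
from typing import Iterable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")


def parse_ranges(spec: str) -> List[Tuple[int, int]]:
    """Parse a spec such as "10-20,35" into inclusive (start, end) pairs."""
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        start, _, end = part.partition("-")
        # A single index like "35" becomes the one-element range (35, 35).
        ranges.append((int(start), int(end or start)))
    return ranges


def skip_batches(batches: Iterable[T], ranges: List[Tuple[int, int]]) -> Iterator[T]:
    """Yield batches whose zero-based index falls outside every range."""
    for idx, batch in enumerate(batches):
        if any(lo <= idx <= hi for lo, hi in ranges):
            continue
        yield batch


if __name__ == "__main__":
    kept = list(skip_batches(range(7), parse_ranges("2-3,5")))
    print(kept)
```

Wrapping the data iterator keeps the training loop unchanged. Batches in a skipped range are still drawn from the underlying iterator, so data order for the remaining batches is preserved.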

Training starts hanging at the beginning if dataset is too small - on AWS

## 🐛 Bug When starting a training run, the model hangs at the first forward pass. This happened when I used the small book dataset used in gpu_test/test_training_integrity.py....

bug

Rewrite of the load_checkpoint function

This is a rewrite of how we determine which checkpoint to load when starting/restarting a training run. (Originally there was also a refactor of how our different checkpoint paths are...

cla signed

[Bug]: Anthropic tool calls with images is broken

### What happened? When you use Anthropic's new tool-call functionality and add an image to your message (following OpenAI's message format), the message conversion to...

bug

Allow multiple tools in anthropic response

Currently, litellm's Anthropic response handling extracts only the first tool and ignores the rest. This PR parses the remaining tools as well.
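The change can be illustrated with a small sketch. This is not litellm's implementation. It assumes the response content is a list of dict blocks in the shape of Anthropic's Messages API, where tool calls appear as blocks with `"type": "tool_use"` plus `id`, `name`, and `input`, and it converts each one to an OpenAI-style tool call. The function name `extract_tool_calls` is hypothetical.

```python
import json
from typing import Any, Dict, List


def extract_tool_calls(content_blocks: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Collect every tool_use block, not just the first one."""
    tool_calls = []
    for block in content_blocks:
        if block.get("type") != "tool_use":
            continue  # text blocks and other block types are ignored here
        tool_calls.append(
            {
                "id": block["id"],
                "type": "function",
                "function": {
                    "name": block["name"],
                    # OpenAI-style tool calls carry arguments as a JSON string.
                    "arguments": json.dumps(block["input"]),
                },
            }
        )
    return tool_calls


if __name__ == "__main__":
    content = [
        {"type": "text", "text": "Checking both cities."},
        {"type": "tool_use", "id": "t1", "name": "get_weather", "input": {"city": "Paris"}},
        {"type": "tool_use", "id": "t2", "name": "get_weather", "input": {"city": "Rome"}},
    ]
    for call in extract_tool_calls(content):
        print(call["id"], call["function"]["name"], call["function"]["arguments"])
```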

Error while clustering

After starting the clustering I get this error: ``` [local/evol1][1 shards] map "extract_text" to "('prompt__cluster',)": 100%|████████████████████████████████████████████████████████████████████████████████| 319/319 [00:00