apr chat has repetitive responses #189

@alfredodeza

I suspect this was introduced by the changes in #170, possibly in how EOS markers are handled, but I could be wrong:

apr chat qwen2.5-1.5b-instruct-q4_k_m.apr

=== Model Chat (APR Format) ===

Using APR v2 format with mmap (Native Library Mandate)

  Model: qwen2.5-1.5b-instruct-q4_k_m.apr
  Chat Template: ChatML
  Temperature: 0.7
  Top-P: 0.9
  Max Tokens: 512

Commands:
  /quit     Exit the chat
  /clear    Clear conversation history
  /system   Set system prompt
  /help     Show help

════════════════════════════════════════════════════════════

Loading model...
Loaded APR format in 0.15s (1113.2 MB)
Loaded tokenizer: tokenizer.json (151936 tokens)
Detected Raw chat template
You: hey
[AprV2ModelCuda] Built indexed weights for 28 layers
[AprV2ModelCuda] Pre-cached 5596 MB of weights on GPU (28 layers, 197 quantized, 112 F32 tensors)
[AprV2ModelCuda] Cached embedding table: 890 MB
[APR CUDA: NVIDIA GeForce RTX 4090 (24077 MB VRAM)]
Assistant: ,VILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLE:VILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLE:VILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLE:VILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLE:VILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLE:VILLEVILLEVILLEVILLEVILLEVILLE:VILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLE:VILLE:VILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLE:VILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILL
EVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLE:VILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLE:VILLE:VILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLE:VILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLEVILLE

You:
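
To make the EOS hypothesis concrete, here is a minimal sketch of the kind of stop check I have in mind. This is not the actual apr code: `generate`, `fake_model`, and the token ids (7 for an ordinary token, 2 for EOS) are all hypothetical and only illustrate how a missing or mismatched EOS id would let generation run until Max Tokens.

```python
from typing import Callable, List, Set


def generate(
    next_token: Callable[[List[int]], int],
    prompt_ids: List[int],
    eos_ids: Set[int],
    max_tokens: int,
) -> List[int]:
    """Greedy-style loop that stops at the first EOS id or at max_tokens."""
    context = list(prompt_ids)
    output: List[int] = []
    for _ in range(max_tokens):
        token = next_token(context)
        # If the real EOS id is missing from eos_ids, this never fires
        # and the loop only ends when max_tokens is exhausted.
        if token in eos_ids:
            break
        output.append(token)
        context.append(token)
    return output


def fake_model(context: List[int]) -> int:
    """Hypothetical stand-in for the model: emits EOS (id 2) exactly once."""
    return 2 if len(context) == 4 else 7


# EOS id configured correctly: generation stops after three tokens.
stopped = generate(fake_model, [1], {2}, max_tokens=8)

# EOS id not recognized: generation runs all the way to max_tokens.
runaway = generate(fake_model, [1], set(), max_tokens=8)

print(len(stopped), len(runaway))
```

If something like this is what regressed, the output above would be consistent with the stop check never matching, although it would not by itself explain why the tokens are garbage from the very first one.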
