DebuggerCafe - Deep Learning, Machine Learning, Artificial Intelligence

Fine-Tuning Phi-3.5 Vision Instruct

In this article, we fine-tune the Phi-3.5 Vision Instruct model on a receipt OCR dataset. We use Hugging Face libraries and train a LoRA adapter. ...

Object Detection with DEIMv2

Sovit Ranjan Rath, December 1, 2025

In this article, we explore the DEIMv2 object detection model based on the DINOv3 and HGNetv2 backbones, and run inference on images and videos. ...

Introduction to Moondream3 and Tasks

Sovit Ranjan Rath, November 24, 2025

In this article, we cover Moondream3, the latest iteration in the Moondream VLM family. We cover the model architecture and carry out inference on the different tasks that it supports. ...

DINOv3 with RetinaNet Head for Object Detection

Sovit Ranjan Rath, November 17, 2025

In this article, we modify the DINOv3 backbone with a RetinaNet head for object detection. We train the model on the Pascal VOC dataset and carry out inference. ...

Object Detection with DINOv3

Sovit Ranjan Rath, November 10, 2025

In this article, we modify the DINOv3 model for object detection and train it on the Pascal VOC detection dataset. We discuss the model creation, training, and inference in detail. ...