Leo-bsb/README.md

👋 Hi, I'm Leonardo Braga

Data Science & AI Undergraduate | Generative AI & Data Engineering Intern
📍 Brasília, Brazil | 📧 | LinkedIn

I build production-ready AI and data engineering solutions, from scalable LLM applications and retrieval-augmented generation (RAG) pipelines to robust ETL workflows with SAP and modern data stacks. I transform complex data challenges into efficient systems that drive business impact.


🚀 Featured Projects

πŸ” Legal Document Information Extraction System

LLM-powered PDF parsing • Streamlit dashboard • SQLite persistence • Batch processing

Live Demo

Python, LLM, Streamlit, Polars, SQLite, PDF Processing

  • Engineered a modular system for large-scale legal PDF parsing and structured data extraction
  • Optimized batch processing to handle documents up to 200MB with minimal memory footprint
  • Implemented schema validation and auto-repair mechanisms to ensure data integrity
  • Delivered a production-ready UI enabling real-time analytics and manual review workflows

🤖 SAP Data Services AI Assistant

RAG pipeline • Domain-specific intent detection • Streamlit UI

Live Demo

RAG, Sentence-Transformers, Streamlit, FAISS, SAP BODS

  • Developed a custom chatbot tailored for SAP BODS users, overcoming generic LLM limitations
  • Integrated semantic search and intent classification for precise, domain-aware support
  • Enabled fluent, natural Portuguese interactions aligned with SAP ecosystem documentation

⏱️ Intelligent Delivery Time Predictor

XGBoost • Model Explainability • Streamlit UI

Live Demo

XGBoost, SHAP, Streamlit, Scikit-learn, Pandas

  • Delivered highly accurate delivery time predictions using structured business features
  • Applied SHAP for interpretable models, providing actionable insights to stakeholders
  • Built a user-friendly interface for real-time predictions and scenario analysis

🍽️ Generative AI for Food Industry

Sentiment Analysis • Text Generation • Multi-modal Applications

Food Bot
Dish Generator

Gemini AI, Polars, Sentiment Analysis, Text Generation

  • Built a sentiment-aware Food Review Reply Bot generating empathetic responses to customer feedback
  • Created a persuasive Dish Description Generator to boost engagement in food delivery apps
  • Designed efficient data pipelines with Polars for scalable multi-modal applications

📚 Intelligent Book Recommendation System

Hybrid Neural Network • Collaborative Filtering • Deep Learning

Live Demo

PyTorch, Neural Networks, Collaborative Filtering, Hugging Face

  • Designed a hybrid recommendation engine combining deep learning with collaborative filtering
  • Developed scalable workflows in PyTorch for personalized book suggestions
  • Documented the entire process through a comprehensive Medium tutorial

📊 Fast EDA & Data Visualization Tools

Interactive Dashboards • Healthcare Analytics • Automated Reporting

Fast EDA
Healthcare

PyGWalker, Polars, Streamlit, Plotly, Gradio

  • Delivered instant exploratory data analysis with an interactive Fast EDA dashboard
  • Developed a visual analytics platform for Brazilian live births (SINASC 2023 data)
  • Authored an interactive tutorial for PyGWalker enabling fast, code-light EDA

πŸ› οΈ Technical Stack

Programming & Query Languages: Python SQL C Bash

AI & Machine Learning: PyTorch Transformers Hugging Face XGBoost Scikit-learn LLMs RAG Embeddings

Data Engineering & Processing: SAP Data Services (BODS) Pentaho Polars Pandas SQL Databases ETL/ELT

Data Visualization & Apps: Streamlit Gradio Plotly PyGWalker Power BI

Tools & Platforms: Git Docker Linux Hugging Face Spaces Google Colab

Certifications: Oracle Generative AI Professional Oracle Data Science Professional


💼 Professional Experience

Data Migration Intern | First Decision (Apr 2025 – Present)

  • Delivered complex ETL migrations for enterprise clients, handling end-to-end data workflows
  • Developed and optimized pipelines leveraging SAP BODS, Pentaho, and SQL databases
  • Gained hands-on experience managing data migrations for large-scale market players

📚 Education

B.Sc. Data Science & Artificial Intelligence | IESB (2023–2026)
Core Courses: Machine Learning, Data Mining, Big Data, Statistical Modeling, Deep Learning


🌐 Connect With Me

LinkedIn | GitHub | Hugging Face

Pinned

  1. delivery-time-predictor (Public, Python)
  2. sap-bot (Public, Python)
  3. generative-ai-response-for-food-review (Public, Python)
  4. book-recommendation-system (Public, Jupyter Notebook)
  5. generative-ai-for-persuasive-dish-descriptions (Public)
  6. legal-extraction-system (Public, Python)