🏥 DeepPlantAI - Advanced Organ Transplant Prediction Platform

Inspiration

Thousands of patients undergo organ transplants every year, with widely varying outcomes. Current risk assessment relies heavily on clinical experience and basic scoring systems such as MELD/PELD, which do not capture the full complexity of patient factors. We set out to use machine learning to produce more accurate, data-driven mortality predictions that could help clinicians make better-informed decisions and potentially save more lives.

What it does

DeepPlantAI is an AI-powered clinical decision support system that predicts post-transplant outcomes for organ transplant patients. The platform:

  • Predicts liver transplant mortality risk using advanced Gradient Boosted Decision Trees (XGBoost/LightGBM)
  • Supports multiple prediction tasks: Binary mortality, survival status, time categories, and risk stratification
  • Provides HCC recurrence prediction for hepatocellular carcinoma patients using ensemble models
  • Offers interactive visualizations including performance metrics, feature importance analysis, and risk distribution charts
  • Enables real-time risk assessment through an intuitive Streamlit web dashboard
  • Supports comprehensive data processing with 60+ clinical features from multiple organ transplant datasets

How we built it

Full-stack architecture combining Streamlit with advanced ML:

  • Frontend: Streamlit-based interactive dashboard with custom medical UI components and visualizations
  • Backend: Python-based ML pipeline with comprehensive data processing and model training
  • ML Pipeline: Gradient Boosted Decision Trees (XGBoost/LightGBM) with comprehensive feature engineering
  • Data Processing: Robust handling of 60+ clinical features including demographics, lab values, donor characteristics, and transplant factors
  • Model Training: Jupyter notebooks for exploratory data analysis and model development
  • Multi-organ Support: Liver transplant focus with extensible architecture for other organ types
  • Deployment: Streamlit-based deployment with integrated Python backend and ML models

Key Features

🎯 Liver Transplant Mortality Prediction

  • Multiple Prediction Tasks: Binary mortality, survival status, time categories, risk assessment
  • Advanced ML Models: XGBoost and LightGBM with optimized hyperparameters
  • Comprehensive Features: 31+ clinical variables including MELD scores, lab values, donor characteristics
  • High Accuracy: 85-90% prediction accuracy on test data
  • Real-time Predictions: Instant risk assessment with confidence scores
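
To illustrate how a predicted probability could become a risk category and a confidence score, here is a minimal sketch. The cut-offs, tier labels, and the definition of confidence are hypothetical and not taken from the project; real thresholds would need clinical validation.

```python
from bisect import bisect_right

# Hypothetical cut-offs and labels, chosen only for illustration.
THRESHOLDS = (0.10, 0.30, 0.60)
LABELS = ("low", "moderate", "high", "very high")


def risk_tier(probability: float) -> str:
    """Map a predicted mortality probability in [0, 1] to a risk tier."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability must be in [0, 1], got {probability}")
    # bisect_right counts how many thresholds are <= probability.
    return LABELS[bisect_right(THRESHOLDS, probability)]


def confidence(probability: float) -> float:
    """Assumed definition: distance from 0.5, rescaled to [0, 1]."""
    return abs(probability - 0.5) * 2


for p in (0.05, 0.45, 0.95):
    print(f"p={p:.2f} tier={risk_tier(p)} confidence={confidence(p):.2f}")
```

Keeping the thresholds in one tuple makes them easy to review and adjust without touching the mapping logic.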

🧬 HCC Recurrence Prediction

  • Specialized Model: Ensemble approach combining XGBoost and Neural Networks
  • Tumor-specific Features: HCC characteristics, vascular invasion, tumor size
  • 2-year Recurrence Prediction: Binary classification for cancer recurrence risk
  • Clinical Integration: Seamlessly integrated with mortality prediction models
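
The write-up does not say how the XGBoost and neural network outputs are combined. One common option is weighted soft voting, sketched below under that assumption; the function names, the weight, and the example scores are all hypothetical.

```python
from typing import Sequence


def soft_vote(
    gbdt_probs: Sequence[float],
    nn_probs: Sequence[float],
    gbdt_weight: float = 0.5,
) -> list[float]:
    """Blend two models' recurrence probabilities with a weighted average."""
    if len(gbdt_probs) != len(nn_probs):
        raise ValueError("both models must score the same patients")
    if not 0.0 <= gbdt_weight <= 1.0:
        raise ValueError("gbdt_weight must be in [0, 1]")
    nn_weight = 1.0 - gbdt_weight
    return [gbdt_weight * g + nn_weight * n for g, n in zip(gbdt_probs, nn_probs)]


def classify(probs: Sequence[float], threshold: float = 0.5) -> list[int]:
    """Turn blended probabilities into 0/1 recurrence labels."""
    return [int(p >= threshold) for p in probs]


# Made-up scores for three patients, for illustration only.
blended = soft_vote([0.80, 0.20, 0.55], [0.60, 0.10, 0.35], gbdt_weight=0.5)
labels = classify(blended)
print(blended, labels)
```

In practice the weight would likely be tuned on a validation set rather than fixed at 0.5.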

📊 Interactive Dashboard

  • Model Performance: Real-time accuracy metrics, ROC curves, confusion matrices
  • Live Predictions: CSV upload and instant prediction capabilities
  • Feature Analysis: Interactive charts showing feature importance and correlations
  • Risk Visualization: Distribution charts and risk stratification displays
  • Professional UI: Medical-grade interface with comprehensive reporting
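
The performance metrics shown on such a dashboard all derive from the confusion matrix. The sketch below is dependency-free and uses made-up labels; it is not the project's code. It shows how accuracy, sensitivity, and specificity could be computed for a binary mortality task.

```python
from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, int]:
    """Count true/false positives and negatives for binary labels (1 = event)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1:
            counts["tp" if pred == 1 else "fn"] += 1
        else:
            counts["fp" if pred == 1 else "tn"] += 1
    return counts


def summarize(counts: dict[str, int]) -> dict[str, float]:
    """Derive summary metrics, returning 0.0 where a denominator is empty."""
    tp, fp, tn, fn = (counts[k] for k in ("tp", "fp", "tn", "fn"))
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


# Illustrative labels only; not results from the project.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 0, 1, 0, 1]
counts = confusion_counts(y_true, y_pred)
metrics = summarize(counts)
print(counts, metrics)
```

Reporting sensitivity and specificity next to accuracy is useful when one outcome is much rarer than the other, since accuracy alone can look high while most events are missed.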

🔧 Advanced Data Processing

  • Multi-source Integration: Combines data from 6+ CSV files
  • Robust Encoding: Handles multiple file encodings (UTF-8, CP1252, Latin-1/ISO-8859-1)
  • Feature Engineering: Hybrid categorical encoding (One-Hot + Label Encoding)
  • Data Validation: Comprehensive error handling and fallback mechanisms
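
The encoding fallback could be sketched as follows. This is a minimal standard-library example under assumptions, not the project's loader: the function name `read_csv_rows` and the encoding order are hypothetical. The same idea applies when passing `encoding=` to `pandas.read_csv`.

```python
import csv
import io
from pathlib import Path

# Order matters: latin-1 can decode any byte sequence, so it must come last.
ENCODINGS = ("utf-8", "cp1252", "latin-1")


def read_csv_rows(path, encodings=ENCODINGS):
    """Return (rows, encoding_used), trying each encoding until one decodes."""
    raw = Path(path).read_bytes()
    for encoding in encodings:
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # Fall back to the next candidate encoding.
        # newline="" lets the csv module handle line endings inside fields.
        rows = list(csv.DictReader(io.StringIO(text, newline="")))
        return rows, encoding
    raise ValueError(f"could not decode {path} with any of {encodings}")
```

Returning the encoding that succeeded makes it possible to log which files needed a fallback. Note that a successful decode does not prove the encoding was the right one, so spot-checking non-ASCII fields is still worthwhile.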

Technical Architecture

Data Pipeline

Raw Data (OPTN/UNOS) → Data Processing → Feature Engineering → Model Training → Real-time Predictions
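
For the Feature Engineering stage, the hybrid categorical encoding mentioned earlier might work along these lines. This sketch assumes the split is made by cardinality (one-hot for columns with few levels, integer labels for the rest), which the write-up does not confirm; the column names and the `max_one_hot` parameter are hypothetical.

```python
def hybrid_encode(rows, categorical, max_one_hot=5):
    """One-hot encode low-cardinality columns; label-encode the rest."""
    encoded = [dict(row) for row in rows]  # Copy so the input is not mutated.
    for col in categorical:
        levels = sorted({row[col] for row in rows})
        if len(levels) <= max_one_hot:
            # One indicator column per level, replacing the original column.
            for out, row in zip(encoded, rows):
                del out[col]
                for level in levels:
                    out[f"{col}_{level}"] = int(row[col] == level)
        else:
            # Integer codes in sorted order of the levels.
            codes = {level: i for i, level in enumerate(levels)}
            for out, row in zip(encoded, rows):
                out[col] = codes[row[col]]
    return encoded


# Made-up records for illustration only.
rows = [
    {"age": 54, "donor_type": "deceased", "diagnosis": "HCV"},
    {"age": 61, "donor_type": "living", "diagnosis": "NASH"},
    {"age": 47, "donor_type": "deceased", "diagnosis": "ALD"},
]
encoded = hybrid_encode(rows, ["donor_type", "diagnosis"], max_one_hot=2)
print(encoded[0])
```

Label codes impose an arbitrary order on the levels. Tree-based models generally tolerate that, while one-hot columns avoid the issue at the cost of a wider feature matrix.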

Model Stack

  • Primary: Gradient Boosted Decision Trees (XGBoost/LightGBM)
  • Secondary: Neural Networks for HCC recurrence
  • Ensemble: Combined approaches for enhanced accuracy
  • Validation: 5-fold cross-validation for robust evaluation
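
To show what 5-fold cross-validation does to the data, here is a dependency-free sketch of the splitting and averaging logic. It is an illustration only: the helper names are hypothetical, and in practice a library splitter such as scikit-learn's `KFold` or `StratifiedKFold` would normally be used.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


def cross_validate(fit_and_score, n_samples, k=5):
    """Average a user-supplied fit_and_score(train_idx, test_idx) over k folds."""
    scores = [fit_and_score(tr, te) for tr, te in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


for train_idx, test_idx in k_fold_indices(23, k=5):
    print(len(train_idx), len(test_idx))
```

The `fit_and_score` callback is where a model would be trained on the training indices and scored on the held-out fold. For an outcome with imbalanced classes, a stratified split that preserves the class ratio in every fold is usually the safer choice.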

Key Clinical Features

  • Patient Demographics: Age, Gender, BMI, Ethnicity
  • Disease Severity: MELD/PELD scores, Lab values (Albumin, Creatinine, Bilirubin, INR)
  • Donor Characteristics: Donor age, BMI, cause of death, ECD status
  • Transplant Factors: Cold ischemic time, donor type, procedure type
  • Clinical Status: Ascites, encephalopathy, diabetes, previous transplants
  • HCC Features: Tumor characteristics, vascular invasion, recurrence markers

Challenges we ran into

  • Data complexity: Large, multi-source medical datasets with missing values and encoding issues
  • Feature engineering: Selecting and processing 60+ clinical features while maintaining medical relevance
  • Model interpretability: Balancing prediction accuracy with explainable AI for clinical use
  • Integration complexity: Seamlessly connecting the ML backend with the Streamlit frontend
  • Real-time performance: Ensuring fast prediction responses for clinical workflow integration
  • Data validation: Handling inconsistent data formats across multiple organ transplant datasets

Accomplishments that we're proud of

  • Built a complete end-to-end system from data processing to user interface
  • Built a robust ML pipeline with comprehensive feature engineering and model validation
  • Created intuitive visualizations that make complex medical data accessible to clinicians
  • Implemented real-time prediction API with proper error handling and fallback mechanisms
  • Successfully integrated modern web technologies with advanced machine learning
  • Developed a scalable architecture that can be extended to additional organ types and models
  • Achieved 85-90% accuracy on liver transplant mortality prediction
  • Built specialized HCC recurrence prediction with ensemble modeling

What we learned

  • Medical AI requires careful validation and interpretability for clinical acceptance
  • Feature engineering is crucial - the right features matter more than complex algorithms
  • User experience is key - even the best ML model needs an intuitive interface
  • Robust error handling is essential when dealing with real-world medical data
  • Integration challenges between different technology stacks require careful planning
  • Clinical workflows need fast, reliable predictions that fit into existing processes
  • Multi-organ datasets provide rich context but require sophisticated data processing

What's next for DeepPlantAI

  • Expand to more organ types (kidney, heart, lung) with specialized prediction models
  • Add more advanced ML models including deep learning approaches
  • Implement user authentication and role-based access for different user types
  • Add batch prediction capabilities for analyzing multiple patients
  • Integrate with hospital systems and electronic health records
  • Implement model retraining capabilities for continuous improvement
  • Add survival analysis with Kaplan-Meier curves and Cox regression
  • Develop mobile application for point-of-care predictions

Getting Started

Prerequisites

pip install -r src/dashboard/requirements.txt

Running the Dashboard

# Navigate to dashboard directory
cd src/dashboard

# Run the Streamlit dashboard
streamlit run dashboard.py

The dashboard will be available at: http://localhost:8501

Running the Models

# Navigate to models directory
cd src/Models

# Run liver mortality prediction
python run_liver_mortality_gbdt.py

# Run HCC recurrence prediction
python hcc_recurrence_prediction.py

Project Structure

67ers_Hackgt/
├── src/
│   ├── dashboard/           # Streamlit web interface
│   │   ├── dashboard.py     # Main dashboard application
│   │   └── requirements.txt # Dashboard dependencies
│   ├── Models/              # ML models and training
│   │   ├── liver_mortality_gbdt.py      # Main mortality prediction
│   │   ├── hcc_recurrence_prediction.py # HCC recurrence model
│   │   └── requirements_gbdt.txt        # Model dependencies
│   └── data_prep/           # Data processing utilities
├── data/                    # Transplant datasets
│   ├── csv_data/           # Processed CSV files
│   └── dictionaries/       # Data dictionaries
└── README.md               # This file

Medical Disclaimer

  • Research Only: This tool is for research and educational purposes
  • Not Clinical: Not intended for clinical decision making
  • Professional Review: Always consult medical professionals for clinical decisions

License

This project was built by the 67ers team for a hackathon and is available under the MIT License.


Built with ❤️ by the 67ers Team for HackGT 2024
