Predictra is a full-stack web application that enables users to upload CSV datasets, perform AI-powered predictive analytics, and explore data through interactive visualizations and machine learning models.
- Dataset Management: Upload, browse, and manage CSV datasets with thumbnail previews
- AI-Powered Predictions: Train neural network models using PyTorch for regression tasks
- Real-time Training Visualization: Monitor training progress with live loss graphs via WebSocket
- Data Distribution Analysis: Interactive histograms and statistical summaries for all dataset columns
- Interactive Chat Assistant: Get insights about your data distributions through an AI chatbot
- Modern UI/UX: Beautiful, responsive interface with light/dark theme support
- Automatic Data Preprocessing: Handles both numeric and categorical features with encoding
- Dataset Search: Quickly find datasets with integrated search functionality
Frontend:
- React 19 - UI library
- React Router v7 - Client-side routing
- Chart.js + react-chartjs-2 - Data visualization
- Create React App - Build tooling and development server

Backend:
- FastAPI - Modern Python web framework
- PyTorch - Deep learning library for neural networks
- scikit-learn - Data preprocessing and train/test splitting
- NumPy - Numerical computations
- WebSocket - Real-time training loss streaming
- Uvicorn - ASGI web server

Machine learning pipeline:
- Custom CSV Cleaner - Automatic feature detection and encoding
- Dynamic Neural Network - Configurable ANN architecture with dropout regularization
Predictra/
├── api/
│   └── main.py                  # FastAPI backend application
├── frontend/
│   ├── src/
│   │   ├── components/          # React components
│   │   │   ├── LibraryPage.jsx
│   │   │   ├── AnalysisPage.jsx
│   │   │   ├── TrainingGraph.jsx
│   │   │   ├── DataVisualization.jsx
│   │   │   ├── ChatBot.jsx
│   │   │   └── ...
│   │   ├── contexts/            # React contexts
│   │   ├── config.js            # API configuration
│   │   └── App.js               # Main app component
│   └── package.json
├── util/
│   ├── csvCleaner.py            # CSV preprocessing utility
│   └── createNeuralNet.py       # Neural network creation
├── datasets/                    # CSV dataset storage
├── thumbnails/                  # Dataset thumbnail images
└── venv/                        # Python virtual environment
- Python 3.12+ (or compatible Python 3.x)
- Node.js 16+ and npm
- Git (optional, for cloning)
- Navigate to the project directory:
      cd Predictra
- Activate the virtual environment:
      source venv/bin/activate   # On macOS/Linux
      venv\Scripts\activate      # On Windows
- Install Python dependencies (python-multipart is needed for file uploads, and uvicorn[standard] provides WebSocket support):
      pip install fastapi "uvicorn[standard]" torch scikit-learn numpy python-multipart
  If you prefer to install from a requirements file, create requirements.txt with the following contents and run pip install -r requirements.txt:
      fastapi>=0.104.0
      uvicorn[standard]>=0.24.0
      torch>=2.0.0
      scikit-learn>=1.3.0
      numpy>=1.24.0
      pydantic>=2.0.0
      python-multipart>=0.0.6
      websockets>=12.0
- Prepare directories:
      mkdir -p datasets thumbnails
- Start the FastAPI server:
      uvicorn api.main:app --reload --host 0.0.0.0 --port 8000
  The API will be available at:
  - API Base: http://localhost:8000
  - Interactive Docs: http://localhost:8000/docs
  - Alternative Docs: http://localhost:8000/redoc
- Navigate to the frontend directory:
      cd frontend
- Install Node.js dependencies:
      npm install
- Configure API endpoint (if needed): edit frontend/src/config.js to match your backend URL:
      BASE_URL: "http://localhost:8000"
- Start the development server:
      npm start
  The React app will open at http://localhost:3000
- Click "Choose CSV File" on the homepage
- Select a CSV file from your computer
- Click "Upload" to upload to the server
- The dataset will appear in your library after upload
- Click on any dataset card to open the Analysis Page
- The page automatically fetches and displays dataset headers
- Select a target field (column) you want to predict
- View distribution visualizations for all columns
- On the Analysis Page, select your target field (what you want to predict)
- Configure training parameters:
  - Epochs: Number of full passes over the training data (default: 10)
  - Test Size: Proportion of data held out for testing (default: 0.1)
- Click "Train Model"
- Monitor training progress in real-time via the Training Graph
- Training loss updates stream via WebSocket every 2 epochs
- After training completes, scroll to the Prediction Section
- Fill in feature values based on the form generated from your dataset
- For categorical fields, select from available options
- For numeric fields, enter numeric values
- Click "Predict" to get your prediction
- View the predicted value and processed feature information
- Click "View Distributions" to analyze column distributions
- View histograms for numeric columns
- See category counts for categorical columns
- Interact with the ChatBot to ask questions about distributions
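
To make the distribution view more concrete, here is a minimal standard-library sketch of the two summaries described above: equal-width histogram bins for a numeric column and value counts for a categorical one. The function names and the default bin count are illustrative assumptions, not the backend's actual implementation.

```python
from collections import Counter


def histogram(values, bins=10):
    """Return (bin_start, bin_end, count) tuples for equal-width bins."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values identical: a single degenerate bin holds everything.
        return [(lo, hi, len(values))]
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so the maximum value falls into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return [(lo + i * width, lo + (i + 1) * width, c) for i, c in enumerate(counts)]


def category_counts(values):
    """Return {category: count}, most frequent category first."""
    return dict(Counter(values).most_common())
```

For example, `histogram([1, 2, 3, 4], bins=2)` splits the range 1 to 4 into two bins of width 1.5, each holding two values.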
- GET /libraries - List all available datasets
- GET /libraries/{library_name} - Get specific dataset info
- POST /upload - Upload a new CSV file
- POST /rescan - Rescan datasets folder
- GET /analyze?dataset_name={name} - Get dataset headers
- GET /dataset-distribution?dataset_name={name} - Get distribution data
- POST /train - Start model training
- GET /model-info - Get trained model information
- POST /predict - Make a prediction with feature values
- WS /training-loss - WebSocket endpoint for live training loss updates
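
As a rough illustration of how the read-only HTTP endpoints could be called from a script, the sketch below uses only the Python standard library. The helper names `endpoint` and `get_json` are hypothetical, the base URL assumes the default local server, and whether `dataset_name` expects the `.csv` extension is an assumption. Request and response schemas are not listed here, so check the interactive docs at http://localhost:8000/docs for the actual fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the backend runs locally on the default port.
BASE_URL = "http://localhost:8000"


def endpoint(path, **params):
    """Build a full URL for an API path, appending query parameters if given."""
    query = f"?{urlencode(params)}" if params else ""
    return f"{BASE_URL}{path}{query}"


def get_json(path, **params):
    """Send a GET request and decode the JSON response body."""
    with urlopen(endpoint(path, **params)) as resp:
        return json.load(resp)


# Example calls (require a running server):
#   get_json("/libraries")
#   get_json("/analyze", dataset_name="housing.csv")
```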
The default model uses a multi-layer perceptron with:
- Input Layer: Dynamic size based on dataset features
- Hidden Layer 1: 64 neurons + ReLU + Dropout (0.2)
- Hidden Layer 2: 64 neurons + ReLU + Dropout (0.2)
- Hidden Layer 3: 32 neurons + ReLU + Dropout (0.2)
- Output Layer: 1 neuron (for regression)
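
The layer list above could be expressed in PyTorch roughly as follows. This is a sketch based only on the sizes listed here, not the contents of util/createNeuralNet.py; the function name `build_regressor` and its parameters are hypothetical.

```python
import torch
from torch import nn


def build_regressor(n_features, hidden=(64, 64, 32), dropout=0.2):
    """Stack Linear + ReLU + Dropout blocks, ending in a single regression output."""
    layers = []
    in_dim = n_features
    for width in hidden:
        layers += [nn.Linear(in_dim, width), nn.ReLU(), nn.Dropout(dropout)]
        in_dim = width
    layers.append(nn.Linear(in_dim, 1))  # one output neuron for regression
    return nn.Sequential(*layers)


# Example: a dataset with 10 input features (illustrative number).
model = build_regressor(n_features=10)
model.eval()  # disable dropout for inference
with torch.no_grad():
    out = model(torch.randn(5, 10))  # batch of 5 rows -> 5 predictions
n_params = sum(p.numel() for p in model.parameters())
```

With 10 input features this network has 6,977 trainable parameters (704 + 4,160 + 2,080 + 33 across the four linear layers); only the first layer's size changes with the dataset.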
- Feature Detection: Automatically identifies numeric vs categorical columns
- Categorical Encoding: Label encoding with stored mappings
- Data Scaling: StandardScaler applied to both features and target
- Train/Test Split: Configurable split ratio (default: 0.2)
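
The preprocessing steps above can be sketched in plain Python. This is an illustrative approximation, not the code in util/csvCleaner.py: the function names are hypothetical, the numeric check (every value parses as a float) is an assumed heuristic, and the scaling mirrors what scikit-learn's StandardScaler computes (mean and population standard deviation).

```python
from statistics import fmean, pstdev


def is_numeric(values):
    """Treat a column as numeric if every value parses as a float (assumed heuristic)."""
    try:
        for v in values:
            float(v)
        return True
    except ValueError:
        return False


def label_encode(values):
    """Map each category to an integer code and return the mapping for reuse."""
    mapping = {label: i for i, label in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def standardize(values):
    """Scale to zero mean and unit variance; return the statistics for reuse."""
    mean = fmean(values)
    std = pstdev(values) or 1.0  # constant column: avoid division by zero
    return [(v - mean) / std for v in values], mean, std
```

Keeping the returned mapping, mean, and standard deviation matters at prediction time: new inputs have to be encoded and scaled with the statistics from training, and a prediction on a scaled target has to be converted back with `prediction * std + mean`.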
- Optimizer: Adam (learning rate: 0.001, weight decay: 1e-5)
- Loss Function: Mean Squared Error (MSE)
- Batch Size: 32
- WebSocket Updates: Averaged loss sent every 2 epochs
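
A training loop using the settings above might look like the following sketch. It is not the code in api/main.py: the `train` function is hypothetical, the list of reports stands in for the WebSocket messages, and "averaged loss" is assumed to mean the mean of the per-epoch losses since the previous update.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train(model, X, y, epochs=10, report_every=2):
    """Train with Adam + MSE and collect an averaged loss every `report_every` epochs."""
    loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=1e-5)
    loss_fn = nn.MSELoss()
    reports, window = [], []
    for epoch in range(1, epochs + 1):
        model.train()
        total, seen = 0.0, 0
        for xb, yb in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(xb), yb)
            loss.backward()
            optimizer.step()
            total += loss.item() * len(xb)  # weight batch loss by batch size
            seen += len(xb)
        window.append(total / seen)  # mean loss for this epoch
        if epoch % report_every == 0:
            # In the app, this value would be pushed over the WebSocket.
            reports.append((epoch, sum(window) / len(window)))
            window.clear()
    return reports


# Usage with random stand-in data and a single linear layer as the model.
torch.manual_seed(0)
model = nn.Linear(4, 1)
reports = train(model, torch.randn(100, 4), torch.randn(100, 1), epochs=10)
```

With 10 epochs this produces five reports, at epochs 2, 4, 6, 8, and 10.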
The project includes several example datasets:
- housing.csv - Housing price prediction
- heart.csv - Heart disease data
- breast_cancer.csv - Medical classification data
- lebron.csv - Basketball statistics
- crop_production.csv - Agricultural data
- And more...