Galaxy Redshift Prediction

This project uses data from the Sloan Digital Sky Survey (SDSS) to build a machine learning model that predicts the redshift of galaxies from their photometric magnitudes in the u, g, r, i, and z bands. The model is built with TensorFlow and demonstrates how a Multi-Layer Perceptron (MLP) can be applied to predicting astronomical measurements.

Project Overview

The dataset comprises photometric properties and redshifts for approximately 1M galaxies, with the aim of training a machine learning model to understand and predict how these properties correlate with redshift. The model could potentially be used to estimate redshifts for other astronomical data, assisting in cosmological studies.

Data Description

The data are extracted from the SDSS online database with a SQL query and include:

Photometric magnitudes in five different bands (u, g, r, i, z).
Spectroscopic redshifts and their errors.
Metadata such as right ascension (ra), declination (dec), and object identifiers.
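As a rough illustration of how such a catalog might be turned into model inputs, the sketch below reads a CSV and separates the five magnitudes from the spectroscopic redshift. The column names (objid, redshift, redshift_err) and the inline sample rows are assumptions made for the example; the actual file on Zenodo may use different names, so adjust them to match the download.

```python
import io

import pandas as pd

# Hypothetical stand-in for the downloaded catalog. Column names and values
# are made up for illustration and may not match the real file.
SAMPLE_CSV = """objid,ra,dec,u,g,r,i,z,redshift,redshift_err
1,150.1,2.2,19.8,18.6,17.9,17.5,17.2,0.08,0.0001
2,150.3,2.4,21.2,19.9,18.8,18.3,18.0,0.21,0.0002
3,151.0,2.1,22.5,20.7,19.3,18.7,18.4,0.35,0.0003
"""

BANDS = ["u", "g", "r", "i", "z"]


def load_catalog(source):
    """Read the catalog and return (features, target) as NumPy arrays."""
    df = pd.read_csv(source)
    # Drop rows with a missing magnitude or a missing redshift.
    df = df.dropna(subset=BANDS + ["redshift"])
    X = df[BANDS].to_numpy(dtype="float32")
    y = df["redshift"].to_numpy(dtype="float32")
    return X, y


# In practice `source` would be the path to the downloaded CSV file.
X, y = load_catalog(io.StringIO(SAMPLE_CSV))
print(X.shape, y.shape)
```

Here the band magnitude z and the spectroscopic redshift are kept in separate columns (z and redshift) to avoid a name clash.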

Getting Started

Download data from Zenodo

Dependencies

Ensure you have the following installed:

Python 3.8 or above
TensorFlow 2.x
Pandas
NumPy
Scikit-learn
Matplotlib

MLP Regression and Classification

The Python notebooks demonstrate the complete process of using photometric data from the Sloan Digital Sky Survey (SDSS) to predict the redshift of galaxies using machine learning techniques.

The redshift prediction problem is explored under two formulations:

Regression, where the redshift is treated as a continuous variable and predicted directly using a Multi-Layer Perceptron (MLP).
Classification, where the redshift range is discretized into bins and the model is trained to classify galaxies into redshift intervals. The classification notebook also includes the regression case above.
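The classification formulation depends on how redshifts are mapped to bins. A minimal sketch of one way to do this with NumPy is shown below, assuming equal-width bins. The function names, the bin range, and the number of bins are all hypothetical and are not taken from the notebooks.

```python
import numpy as np


def make_redshift_bins(z_min, z_max, n_bins):
    """Return n_bins + 1 equally spaced bin edges between z_min and z_max."""
    return np.linspace(z_min, z_max, n_bins + 1)


def redshift_to_class(z, edges):
    """Map each redshift to an integer label in 0 .. n_bins - 1.

    Only the interior edges are passed to np.digitize, so values below the
    first edge or above the last edge fall into the first or last bin.
    """
    return np.digitize(z, edges[1:-1])


def class_to_redshift(labels, edges):
    """Map class labels back to a point estimate, here the bin center."""
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers[labels]


edges = make_redshift_bins(0.0, 0.4, 4)      # assumed range, four bins
z = np.array([0.05, 0.12, 0.33, 0.45])
labels = redshift_to_class(z, edges)
z_from_class = class_to_redshift(labels, edges)
print(labels, z_from_class)
```

Mapping labels back to bin centers gives a redshift estimate whose resolution is limited by the bin width, which is one reason the two formulations can differ in accuracy.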

Both models are evaluated and compared in terms of predictive accuracy and error metrics, offering insights into the effectiveness of each approach.
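The text above does not list the specific error metrics. As an illustration, the sketch below computes a few quantities that are commonly reported for photometric redshifts: the RMSE, the mean normalized residual, the normalized median absolute deviation, and the outlier fraction. The function name and the 0.15 outlier threshold are assumptions for the example and may not match what the notebooks use.

```python
import numpy as np


def photoz_metrics(z_pred, z_true, outlier_threshold=0.15):
    """Summary statistics for predicted vs. spectroscopic redshifts."""
    z_pred = np.asarray(z_pred, dtype=float)
    z_true = np.asarray(z_true, dtype=float)
    # Normalized residual: (z_pred - z_true) / (1 + z_true)
    dz = (z_pred - z_true) / (1.0 + z_true)
    return {
        "rmse": float(np.sqrt(np.mean((z_pred - z_true) ** 2))),
        "bias": float(np.mean(dz)),
        # 1.4826 * MAD approximates the standard deviation for Gaussian errors
        "sigma_nmad": float(1.4826 * np.median(np.abs(dz - np.median(dz)))),
        "outlier_fraction": float(np.mean(np.abs(dz) > outlier_threshold)),
    }


# Toy values for illustration only; the last object is a deliberate outlier.
z_true = np.array([0.10, 0.20, 0.30, 0.40])
z_pred = np.array([0.11, 0.19, 0.30, 0.80])
metrics = photoz_metrics(z_pred, z_true)
print(metrics)
```

The same function can be applied to the regression output directly and to the classification output after converting predicted bins back to redshift values, which puts both approaches on a common scale.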

Uncertainty-Aware Photo-z (MLP)

Notebook: MLP_PhotoZ_SDSS_Uncertainty_github.ipynb

This notebook extends the basic MLP photo-z workflow by producing predictive uncertainties alongside the point redshift estimate. It is designed for SDSS photometry (ugriz) with spectroscopic ground truth and complements the regression and classification notebooks already in this repo.

What it does

Trains an MLP on SDSS photometric features to predict photometric redshift (\hat{z}).
Estimates uncertainty for each prediction, exposing both:
- Epistemic component via stochastic forward passes (dropout at inference).
- Aleatoric component via a heteroscedastic Gaussian head that learns a data-dependent variance (optional).
Outputs per-object results and diagnostics useful for downstream cosmology tasks.
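The two uncertainty components listed above can be combined in several ways. The sketch below shows one common recipe in plain NumPy, under stated assumptions: the network is run T times with dropout active, and each pass returns a mean and a variance per galaxy. The spread of the means across passes is taken as the epistemic part, and the average predicted variance as the aleatoric part. The Gaussian negative log-likelihood that a heteroscedastic head would typically minimize is included as well. The function names and array shapes are hypothetical, and this is not confirmed to be the notebook's exact implementation.

```python
import numpy as np


def gaussian_nll(z_true, mean, log_var):
    """Per-object Gaussian negative log-likelihood with a learned log-variance."""
    return 0.5 * (
        log_var + (z_true - mean) ** 2 / np.exp(log_var) + np.log(2.0 * np.pi)
    )


def combine_passes(means, variances):
    """Combine T stochastic forward passes, each of shape (T, N).

    Returns the point estimate, the epistemic variance, the aleatoric
    variance, and the total standard deviation, all with shape (N,).
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    z_hat = means.mean(axis=0)            # average prediction over passes
    epistemic = means.var(axis=0)         # disagreement between passes
    aleatoric = variances.mean(axis=0)    # average predicted noise variance
    total_std = np.sqrt(epistemic + aleatoric)
    return z_hat, epistemic, aleatoric, total_std


# Toy example: 3 passes for 2 galaxies (values are made up).
means = np.array([[0.10, 0.30], [0.12, 0.30], [0.14, 0.30]])
variances = np.full_like(means, 0.0004)
z_hat, epistemic, aleatoric, total_std = combine_passes(means, variances)
print(z_hat, total_std)
```

In this toy input the second galaxy gets the same mean in every pass, so its epistemic variance is zero and its total uncertainty comes only from the aleatoric term. If the heteroscedastic head is disabled, the variances array can be filled with zeros and only the epistemic term remains.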
