Releases: data-nih/tcia
ucsd-ptgbm_4
Automated release for ucsd-ptgbm_4
ucsd-ptgbm_3
Automated release for ucsd-ptgbm_3
ucsd-ptgbm_2
Automated release for ucsd-ptgbm_2
ucsd-ptgbm_1
Automated release for ucsd-ptgbm_1
UCSD-PTGBM – Post-Treatment High-Grade Glioma Multimodal MRI Dataset
This dataset provides a large-scale, post-operative glioblastoma MRI cohort with advanced imaging modalities, expert annotations, and clinical outcomes, designed to support research in tumor progression, treatment response, and radiogenomics.
- Source: https://www.cancerimagingarchive.net/collection/ucsd-ptgbm/
- DOI: https://doi.org/10.7937/fwv2-dt74
Dataset Overview
- Subjects: 178 patients
- Timepoints: 243 imaging sessions
- Cancer type: High-grade glioma (glioblastoma)
- Mean age: 56 ± 13 years
- Sex: 116 male, 62 female
- Total size: ~44.98 GB
- Scanner: 3T clinical MRI (GE systems)
Imaging Modalities
Structural MRI
- Pre- and post-contrast T1-weighted (3D IR-SPGR)
- T2-weighted FLAIR
Diffusion MRI (Multishell RSI)
- b-values: 0, 500, 1500, 4000 s/mm²
- Multi-direction acquisition
- Enables cellularity mapping via Restricted Spectrum Imaging (RSI)
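Multishell analyses usually start by grouping volumes by their nominal b-value, because the values stored in a `.bval` file can deviate slightly from the nominal shell. The following is a minimal sketch of that grouping; the tolerance of 100 s/mm² and the example b-values are assumptions for illustration and are not taken from the dataset.

```python
from collections import defaultdict

# Nominal shells listed for this dataset (s/mm^2).
NOMINAL_SHELLS = (0, 500, 1500, 4000)


def assign_shell(bval, shells=NOMINAL_SHELLS, tolerance=100):
    """Return the nominal shell closest to bval, or None if none is within tolerance."""
    nearest = min(shells, key=lambda shell: abs(shell - bval))
    return nearest if abs(nearest - bval) <= tolerance else None


def group_volumes_by_shell(bvals, shells=NOMINAL_SHELLS, tolerance=100):
    """Map each nominal shell to the indices of the volumes acquired at it."""
    groups = defaultdict(list)
    for index, bval in enumerate(bvals):
        groups[assign_shell(bval, shells, tolerance)].append(index)
    return dict(groups)


# Hypothetical b-values, as they might appear in a .bval file.
example_bvals = [0, 495, 505, 1500, 1495, 4000, 3990, 0]
shell_indices = group_volumes_by_shell(example_bvals)
```

Volumes that match no shell are collected under the key `None`, which makes unexpected b-values easy to spot before any model fitting.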
Perfusion Imaging
- DSC (Dynamic Susceptibility Contrast)
- ASL (Arterial Spin Labeling)
Key Features
- Post-operative dataset (rare compared to pre-op datasets)
- Multishell diffusion MRI for separating:
  - Tumor cellularity
  - Edema / free water
- Expert-validated tumor segmentation
  - Based on BraTS standards
  - Manually refined by neuroradiologists
- Cellular tumor annotations
  - Enhancing cellular tumor (ECT)
  - Non-enhancing cellular tumor (NECT)
  - Total cellular tumor (TCT)
- Clinical and molecular data
  - IDH mutation status
  - MGMT promoter methylation
  - Overall survival (OS)
  - Progression-free survival (PFS)
Dataset Composition
Imaging Data
- Multimodal MRI (NIfTI format)
- Co-registered to 1 mm isotropic MNI space
Segmentations
- Multi-compartment tumor labels
- Radiologist-approved voxelwise annotations
Clinical Data
- Demographics
- Diagnosis and treatment
- Follow-up outcomes
Additional Files
- b-values / b-vectors
- Negative case categorization
Cohort Details
- Residual or recurrent tumor: 192 timepoints
- Post-treatment changes only: 51 timepoints, including:
  - Pseudoprogression
  - Radiation necrosis
  - Non-specific treatment effects
- Survival data available: subset of 94 patients
Preprocessing
- RSI cellularity modeling via linear mixture modeling
- Beam-forming filter for signal refinement
- DSC processed with leakage-corrected CBV
- Skull stripping via nnUNet-based pipeline
- All modalities registered to MNI space
Scientific Applications
- Post-treatment tumor assessment
- Differentiation of recurrence vs treatment effects
- Radiogenomics and biomarker discovery
- AI/ML for segmentation and prognosis
- Diffusion-based tumor microstructure modeling
License
- Creative Commons Attribution 4.0 (CC BY 4.0)
Citation
Gagnon et al., 2025
The UCSD-PTGBM dataset (Version 3)
https://doi.org/10.7937/fwv2-dt74
Notes
- One of the few publicly available post-operative glioma datasets
- Includes advanced diffusion + perfusion imaging, uncommon in open datasets
- Particularly valuable for clinical translation and longitudinal analysis
participants.tsv Fields
The participants.tsv file includes scanner, demographic, diagnostic, molecular, treatment, and outcome variables for each case.
| Data Collection Name | Data Descriptor / Metadata Name |
|---|---|
| ID | TCIA ID |
| Brats Subject ID | ID used in the BraTS 2024 challenge |
| Magnetic Field Strength | Magnetic field strength of the scanner |
| Manufacturer | Manufacturer of the scanner |
| Manufacturer's Model Name | Model name of the scanner |
| Patient's Age | At time of scan (in years) |
| Sex at birth | M/F |
| Race | American Indian or Alaska Native / Asian / Native Hawaiian or Pacific Islander / Black or African American / White / Unknown / Not Reported |
| Ethnicity | Hispanic or Latino / Not Hispanic or Latino / Unknown / Not Reported |
| Primary Diagnosis | Glioblastoma; Oligoastrocytoma; Astrocytoma, IDH-Mutant, Grade 4; Anaplastic Astrocytoma |
| Days from Acquisition to Date of initial surgery, treatment or diagnosis | Either date of surgery, or other initial treatment if the patient did not undergo surgery at the time of diagnosis, or date of diagnosis if the patient did not receive any treatment |
| Days from Acquisition to Date of last surgery prior to scan if different | Date of last surgery prior to scan if different from the date of initial surgery |
| WHO 2021 Diagnosis | Diagnostic classification according to WHO 2021 criteria |
| Non WHO 2021 Diagnosis | Diagnostic classification prior to WHO 2021 |
| Grade | Grade according to criteria at the time of diagnosis |
| MGMT | MGMT methylation status, either methylated or unmethylated |
| IDH | IDH mutation status, either mutated or wild type |
| 1p19q | 1p19q codeletion status, either codeleted or intact (= not 1p19q codeleted) |
| ATRX | ATRX status. Loss = indication of mutation; intact = ATRX retained, no indication of mutation |
| Days from Acquisition to Date of last follow-up | Date of last follow-up from the clinical record |
| Days from Acquisition to Date of death | Date of death |
| Overall survival | Days between initial surgery and death |
| Progression free survival | Days between initial surgery and first sign of progression |
| Surgery | Yes/No |
| Number of surgeries | Number of surgeries |
| Surgery extend | GTR / STR / Biopsy |
| Radiation | Yes/No |
| Number of radiation courses | Number of radiation courses |
| Days from Acquisition to Date of first radiation | Date of first radiation |
| Days from Acquisition to Date of last radiation prior to scan | Date of last radiation prior to scan |
| 1st Chemo type | Type of first chemotherapy agent |
| Days from Acquisition to Date of 1st chemo start | Date of first chemotherapy treatment start |
| Avastin® (bevacizumab) | Yes/No |
| Days from Acquisition to Date of last Avastin treatment prior to scan | Date of last Avastin treatment prior to scan |
| Other treatment prior to scan | Yes/No |
| Days from Acquisition to Other treatment dates | Date of other treatment if known |
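As a rough illustration of how these fields can be used, the sketch below reads a tab-separated table with the standard library, selects IDH wild-type, MGMT-methylated cases, and summarizes overall survival. The column names follow the table above, but the case IDs, the values, and the exact spelling of the category strings are invented for the example and should be checked against the real file.

```python
import csv
import io
from statistics import median

# Hypothetical excerpt: column names follow the field table above, but the
# IDs and values are invented for illustration.
SAMPLE_TSV = (
    "ID\tIDH\tMGMT\tOverall survival\n"
    "case-001\twild type\tmethylated\t620\n"
    "case-002\twild type\tunmethylated\t410\n"
    "case-003\tmutated\tmethylated\t\n"
    "case-004\twild type\tmethylated\t480\n"
)


def read_participants(handle):
    """Read a tab-separated participants table into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def overall_survival_days(rows):
    """Return overall survival as integers, skipping rows where it is empty."""
    return [
        int(row["Overall survival"])
        for row in rows
        if row["Overall survival"].strip()
    ]


rows = read_participants(io.StringIO(SAMPLE_TSV))
subset = [
    row for row in rows
    if row["IDH"] == "wild type" and row["MGMT"] == "methylated"
]
subset_os = overall_survival_days(subset)
median_os = median(subset_os)
```

For the actual file, the same function can be called on `open("participants.tsv", newline="")` in place of the in-memory sample. Empty survival fields are skipped rather than treated as zero, since survival data are only available for a subset of patients.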
UPENN-GBM – Multi-parametric MRI for De Novo Glioblastoma (University of Pennsylvania Health System)
The UPENN-GBM collection provides multi-parametric magnetic resonance imaging (mpMRI) scans and corresponding clinical, histopathologic, and radiomic data from 630 patients with de novo glioblastoma (GBM).
The dataset was curated by the University of Pennsylvania Health System and released through The Cancer Imaging Archive (TCIA) as a comprehensive, open-access imaging resource for studying glioblastoma biology, segmentation reproducibility, and radiogenomic biomarkers.
Each subject includes co-registered and skull-stripped mpMRI scans together with automated and manually corrected tumor segmentation labels that delineate histologically distinct subregions (enhancing core, necrotic core, edema, etc.). These segmentations were refined and approved by expert board-certified neuroradiologists, enabling quantitative analyses without repeated manual annotation.
The dataset also provides a large panel of radiomic features, clinical outcomes, and molecular data, supporting cross-disciplinary research linking imaging, histology, and genomics. For a subset of cases, matched H&E-stained whole-slide histopathology images from resected tumor tissue are available, enabling radiology–pathology correlation.
Overview
- Dataset name: UPENN-GBM (Multi-parametric MRI for De Novo Glioblastoma)
- Institution: University of Pennsylvania Health System
- Repository: The Cancer Imaging Archive (TCIA)
- Species: Human
- Subjects: 630 patients
- Cancer type: Glioblastoma multiforme (GBM)
- Data types: MRI (DICOM/NIfTI), segmentation labels, histopathology, demographics, molecular and radiomic features
- Total size: ~357 GB
- Version: 2 (Updated 2022-10-24)
- DOI: 10.7937/TCIA.709X-DN49
- License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Imaging and Data Modalities
| Data Type | Format | Description | Size | Access |
|---|---|---|---|---|
| MRI Images | DICOM | mpMRI scans including T1, T1-Gd, T2, and FLAIR | 139.4 GB | Download via NBIA Data Retriever |
| MRI + Segmentation | NIfTI | Co-registered mpMRI with tumor and whole-brain segmentation labels | 69 GB | Requires IBM Aspera Connect |
| Histopathology Images | NDPI | Digitized H&E slides from resected tumors | 149 GB | Requires IBM Aspera Connect |
| Clinical Data | CSV | Demographics, molecular tests, and outcomes | 64.9 KB | Direct download |
| Radiomic Features | ZIP/CSV | CaPTk-extracted intensity, texture, and morphologic features | 15.4 MB | Direct download |
| Radiology–Pathology Mapping | CSV | Links imaging and histology IDs | 2.5 KB | Direct download |
| Acquisition Parameters | CSV | MRI scanner and sequence details | 194 KB | Direct download |
| Data Availability per Subject | CSV | File completeness summary | 125 KB | Direct download |
| Radiomic Parameter File | CSV | CaPTk configuration reference | 3.8 KB | Direct download |
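When planning disk space, it can help to confirm that the per-file sizes in the table add up to the stated total of about 357 GB. The sketch below sums the listed sizes, assuming decimal units (1 GB = 1000 MB = 1,000,000 KB); that unit convention is an assumption, not something the source states.

```python
# Sizes as listed in the table above: (value, unit).
SIZES = {
    "MRI Images": (139.4, "GB"),
    "MRI + Segmentation": (69.0, "GB"),
    "Histopathology Images": (149.0, "GB"),
    "Clinical Data": (64.9, "KB"),
    "Radiomic Features": (15.4, "MB"),
    "Radiology-Pathology Mapping": (2.5, "KB"),
    "Acquisition Parameters": (194.0, "KB"),
    "Data Availability per Subject": (125.0, "KB"),
    "Radiomic Parameter File": (3.8, "KB"),
}

# Assumed decimal conversion factors to gigabytes.
UNIT_TO_GB = {"GB": 1.0, "MB": 1e-3, "KB": 1e-6}

total_gb = sum(value * UNIT_TO_GB[unit] for value, unit in SIZES.values())
print(f"Total download size: {total_gb:.1f} GB")
```

The three large components (DICOM, NIfTI, and histopathology) account for essentially all of the total; the CSV and ZIP files together add less than 0.02 GB.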
Study Description
This dataset integrates clinical, imaging, and molecular data to enable large-scale computational and translational research in glioblastoma.
All MRI volumes were preprocessed (skull stripping and co-registration) prior to segmentation, which was performed via an automated pipeline followed by manual expert correction.
Derived features include:
- Intensity and histogram-based measures
- Volumetric and morphological statistics
- Textural parameters (GLCM, GLRLM, etc.)
- Radiomic descriptors consistent with CaPTk and IBSI standards
The dataset supports:
- Benchmarking of automated tumor segmentation algorithms
- Radiogenomic association studies linking imaging phenotypes to molecular subtypes
- Outcome prediction (e.g., overall survival, progression-free survival)
- Radiology–pathology correlation and cross-modality feature harmonization
Data Access
All data are publicly available through The Cancer Imaging Archive (TCIA).
Use of NBIA Data Retriever or IBM Aspera Connect is required for large downloads.
Version 2 Updates (October 2022):
- Added digitized histopathology (NDPI format)
- Added radiology–pathology mapping CSV
- Harmonized radiomic feature files and metadata
Citation
Bakas, S., Sako, C., Akbari, H., Bilello, M., Sotiras, A., Shukla, G., Rudie, J. D., Flores Santamaria, N., Fathi Kazerooni, A., Pati, S., Rathore, S., Mamourian, E., Ha, S. M., Parker, W., Doshi, J., Baid, U., Bergman, M., Binder, Z. A., Verma, R., … Davatzikos, C. (2021).
Multi-parametric magnetic resonance imaging (mpMRI) scans for de novo Glioblastoma (GBM) patients from the University of Pennsylvania Health System (UPENN-GBM) (Version 2).
The Cancer Imaging Archive.
https://doi.org/10.7937/TCIA.709X-DN49
Users must include this citation in all publications derived from this dataset.
Usage Policy
This collection is released under CC BY 4.0, allowing sharing and adaptation for any purpose (including commercial), provided appropriate attribution is given.
All users must comply with the TCIA Data Usage Policy and Restrictions.
Acknowledgments
This dataset was developed through the collaboration of the University of Pennsylvania Health System, The Cancer Imaging Archive (TCIA), and the National Cancer Institute’s Cancer Imaging Program (CIP).
We thank the contributing radiologists, data scientists, and patients for enabling open-access cancer imaging research.
© 2025 The Cancer Imaging Archive (TCIA).
Prepared for redistribution under data-others/disease/upenn-gbm by the Pittsburgh Fiber Data Hub.
UCSF-PDGM – University of California San Francisco Preoperative Diffuse Glioma MRI
The UCSF-PDGM collection provides preoperative multi-parametric brain MRI and matched molecular, clinical, and follow-up data for adult patients with histopathologically confirmed WHO grade II–IV diffuse gliomas. All patients were scanned at the University of California San Francisco (UCSF) using a standardized 3T MRI protocol that emphasizes predominantly 3D acquisitions and includes advanced diffusion (HARDI) and perfusion (ASL) imaging.
In total, the dataset comprises 495 subjects (501 MRI exams) with harmonized mpMRI, tumor segmentations (aligned to BraTS standards), and curated IDH and MGMT biomarker status. This resource is designed to support AI and quantitative imaging research in areas such as automated tumor segmentation, radiogenomics, survival prediction, and treatment response modeling. All data are fully preoperative; prior tumor treatment is an exclusion criterion (biopsy allowed).
Overview
- Dataset name: UCSF-PDGM – UCSF Preoperative Diffuse Glioma MRI
- Institution: University of California San Francisco
- Repository: The Cancer Imaging Archive (TCIA)
- Species: Human
- Subjects: 495 patients (501 exams)
- Tumor types: WHO grade II–IV diffuse glioma
- Data types: MRI (NIfTI), bval/bvec, tumor segmentations, clinical and molecular data
- Total size: ~142 GB (imaging + annotations)
- Latest version: Version 5 (updated 2025-05-30)
- DOI: 10.7937/TCIA.BDGF-8V37
- License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Study Population and Biomarkers
- Population: Adult patients with histopathologically confirmed grade II–IV diffuse gliomas
- Inclusion: Preoperative MRI, initial tumor resection, genetic testing at a single center (2015–2021)
- Exclusion: Any prior brain tumor treatment (except biopsy)
Genetic biomarkers:
- IDH mutation status available for all tumors
- MGMT promoter methylation available for grade III–IV gliomas
- 1p/19q codeletion reported for a subset of cases
Grade distribution (501 cases):
- Grade II: 55 cases (11%)
- Grade III: 42 cases (9%)
- Grade IV: 403 cases (80%)
There is a consistent male predominance (~56–60%) across grades. IDH mutations are common in lower-grade gliomas (83% of grade II, 67% of grade III) and rare in grade IV (8%). MGMT hypermethylation is present in ~63% of grade IV gliomas.
Imaging Protocol
All MRIs were acquired preoperatively on a 3.0T GE Discovery 750 scanner using an 8-channel head coil. The standardized protocol includes:
- Structural:
  - 3D T2-weighted
  - 3D T2/FLAIR-weighted
  - Susceptibility-weighted imaging (SWI)
  - Pre- and post-contrast T1-weighted (3D)
- Diffusion:
  - 2D 55-direction HARDI diffusion sequence
  - Derived maps: DWI, FA, MD, AD, RD (via FSL Eddy + DTIFIT)
- Perfusion:
  - 3D arterial spin labeling (ASL) perfusion imaging
Gadolinium-based contrast agents used:
- Gadobutrol (Gadovist): 0.1 mL/kg
- Gadoterate (Dotarem): 0.2 mL/kg
Image Pre-processing and Tumor Segmentation
Pre-processing:
- HARDI data corrected with FSL Eddy (eddy current correction with outlier replacement; no topup)
- Tensor fitting with FSL DTIFIT (simple least squares)
- All contrasts registered and resampled to each subject’s T2/FLAIR space (1 mm isotropic) using ANTs non-linear registration
- Skull stripping performed with a public deep-learning model: https://github.com/ecalabr/brain_mask/
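The tensor-derived maps listed for this collection (FA, MD, AD, RD) are all functions of the three eigenvalues of the fitted diffusion tensor. The sketch below applies the standard textbook definitions to a single voxel. It is not FSL's implementation, and the eigenvalues used are illustrative rather than taken from the dataset.

```python
import math


def dti_scalars(l1, l2, l3):
    """Compute MD, AD, RD, and FA from eigenvalues sorted so that l1 >= l2 >= l3."""
    md = (l1 + l2 + l3) / 3.0   # mean diffusivity
    ad = l1                     # axial diffusivity: largest eigenvalue
    rd = (l2 + l3) / 2.0        # radial diffusivity: mean of the two smaller ones
    spread = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    magnitude = l1 ** 2 + l2 ** 2 + l3 ** 2
    # Fractional anisotropy: 0 for isotropic diffusion, 1 for a single direction.
    fa = math.sqrt(1.5 * spread / magnitude) if magnitude > 0 else 0.0
    return {"MD": md, "AD": ad, "RD": rd, "FA": fa}


# Illustrative eigenvalues in mm^2/s (hypothetical values).
isotropic = dti_scalars(1.0e-3, 1.0e-3, 1.0e-3)
anisotropic = dti_scalars(1.7e-3, 0.3e-3, 0.3e-3)
```

With equal eigenvalues FA is zero, while the second example, with one dominant eigenvalue, gives an FA of roughly 0.8.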
Tumor segmentation:
- Multicompartment segmentation performed as part of the BraTS 2021 pipeline
- Initial automated segmentation using an ensemble of prior BraTS-winning models
- Manual corrections by trained radiologists, with final approval by two expert reviewers
- Segmented compartments:
- Enhancing tumor
- Non-enhancing / necrotic tumor
- FLAIR hyperintense abnormality (“edema” region)
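A common first use of these labels is to compute the volume of each compartment. The sketch below counts voxels per label in a flattened segmentation and converts the counts to millilitres, using the 1 mm isotropic voxel size stated above. The label values (1, 2, 4) follow the BraTS 2021 convention and are an assumption here; they should be verified against the segmentation files. The example array is synthetic.

```python
from collections import Counter

# Assumed label convention (BraTS 2021 style); verify against the dataset files.
LABEL_NAMES = {
    1: "non-enhancing / necrotic tumor",
    2: "FLAIR hyperintense abnormality",
    4: "enhancing tumor",
}


def compartment_volumes_ml(labels, voxel_volume_mm3=1.0, names=LABEL_NAMES):
    """Return the volume in mL of each named compartment in a flat label sequence."""
    counts = Counter(labels)
    return {
        name: counts.get(value, 0) * voxel_volume_mm3 / 1000.0
        for value, name in names.items()
    }


# Synthetic flattened segmentation: background plus three compartments.
example_labels = [0] * 9000 + [1] * 500 + [2] * 2500 + [4] * 1000
volumes = compartment_volumes_ml(example_labels)
```

For a real NIfTI segmentation, the flat label sequence would come from the image array after loading it with a NIfTI reader, and the voxel volume from the image header.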
These labels support:
- Benchmarking of segmentation algorithms
- Radiomics and radiogenomics analyses
- Survival and progression modeling
Data Access
All data are hosted on TCIA and are publicly accessible.
Version 5 changes (2025-05-30):
- Fixed a header issue in DTI_eddy_noreg by providing NIfTI files in original orientation and spacing (post-FSL eddy, prior to further processing)
- Added rotated bvecs for each exam (FSL eddy outputs)
Download resources:
| Content | Data Type | Format | Subjects | License | Access |
|---|---|---|---|---|---|
| Images & annotations | MR images + segmentations | NIfTI + BVEC | 495 | CC BY 4.0 | Download via IBM Aspera (142 GB) |
| Clinical data | Demographic, molecular, diagnosis, follow-up | CSV | 495 | CC BY 4.0 | Direct CSV download |
| bval files | Diffusion b-values | BVAL/ZIP | — | CC BY 4.0 | Direct download |
| bvec files | Rotated diffusion b-vectors | BVEC/ZIP | — | CC BY 4.0 | Direct download |
Access details and download links are available on the TCIA collection page.
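The bval and bvec files use the plain-text FSL layout: one row of b-values, and three rows (x, y, z) of gradient components with one column per volume. The sketch below parses both and flags diffusion-weighted volumes whose gradient vector is not close to unit length, which is a quick sanity check after downloading. The example values, the b0 threshold, and the tolerance are assumptions for illustration.

```python
import math


def parse_bvals(text):
    """Parse whitespace-separated b-values into a list of floats."""
    return [float(token) for token in text.split()]


def parse_bvecs(text):
    """Parse FSL-style bvecs (three rows: x, y, z) into a list of (x, y, z) tuples."""
    rows = [
        [float(token) for token in line.split()]
        for line in text.strip().splitlines()
        if line.strip()
    ]
    if len(rows) != 3 or len({len(row) for row in rows}) != 1:
        raise ValueError("expected three rows of equal length")
    return list(zip(*rows))


def non_unit_gradients(bvals, bvecs, b0_threshold=50.0, tolerance=1e-2):
    """Return indices of diffusion-weighted volumes whose b-vector is not unit length."""
    if len(bvals) != len(bvecs):
        raise ValueError("bvals and bvecs describe different numbers of volumes")
    bad = []
    for index, (bval, vector) in enumerate(zip(bvals, bvecs)):
        if bval <= b0_threshold:
            continue  # b0 volumes carry no meaningful direction
        norm = math.sqrt(sum(component ** 2 for component in vector))
        if abs(norm - 1.0) > tolerance:
            bad.append(index)
    return bad


# Hypothetical file contents; the last vector is deliberately not normalized.
BVAL_TEXT = "0 2000 2000 2000\n"
BVEC_TEXT = "0 1 0 0.6\n0 0 1 0.0\n0 0 0 0.7\n"

bvals = parse_bvals(BVAL_TEXT)
bvecs = parse_bvecs(BVEC_TEXT)
bad = non_unit_gradients(bvals, bvecs)
```

The same functions can be applied to the contents of a downloaded file by passing the result of `open(path).read()`.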
External Tools and Resources
- Skull stripping model: https://github.com/ecalabr/brain_mask/
- Related datasets and benchmarks:
  - RSNA-ASNR-MICCAI BraTS 2021 challenge dataset
Citation
Users must cite the dataset as:
Calabrese, E., Villanueva-Meyer, J., Rudie, J., Rauschecker, A., Baid, U., Bakas, S., Cha, S., Mongan, J., Hess, C. (2022).
The University of California San Francisco Preoperative Diffuse Glioma MRI (UCSF-PDGM) (Version 5) [dataset].
The Cancer Imaging Archive.
https://doi.org/10.7937/TCIA.BDGF-8V37
Usage Policy
The UCSF-PDGM collection is distributed under CC BY 4.0, allowing reuse and adaptation (including commercial use) with appropriate attribution.
Users must comply with the TCIA Data Usage Policy and Restrictions.
Source
- TCIA collection page – UCSF-PDGM: (search “UCSF-PDGM” on The Cancer Imaging Archive)
- About TCIA: https://www.cancerimagingarchive.net/
Prepared for redistribution under data-others/disease/ucsf-pdgm by the Pittsburgh Fiber Data Hub.