Sebastian Pölsterl

AI Researcher

AstraZeneca

About Me

I’m an AI researcher in the computational pathology, Oncology R&D team at AstraZeneca and an open-source enthusiast working on machine learning for biomedical applications. My research interests are time-to-event analysis (survival analysis) and causal inference. Previously, I worked at the lab for Artificial Intelligence in Medical Imaging at the Technical University of Munich and The Institute of Cancer Research, London. I’m the author of scikit-survival, a machine learning library for survival analysis built on top of scikit-learn.

Interests

  • Time-to-event analysis
  • Causal inference
  • Heterogeneous data
  • Biomedical applications
  • Deep learning

Education

  • PhD in Computer Science, 2016

    Technische Universität München

  • MSc in Bioinformatics, 2011

    Ludwig-Maximilians-Universität & Technische Universität München

  • BSc in Bioinformatics, 2008

    Ludwig-Maximilians-Universität & Technische Universität München

Recent Posts

scikit-survival 0.26.0 released

I am pleased to announce that scikit-survival 0.26.0 has been released.

This is a maintenance release that adds support for Python 3.14 and includes updates to make scikit-survival compatible with new versions of pandas and osqp. It adds support for the pandas string dtype and for copy-on-write, which is going to become the default with pandas 3. In addition, sksurv.preprocessing.OneHotEncoder now supports converting columns with the object dtype.

scikit-survival 0.25.0 with improved documentation released

I am pleased to announce that scikit-survival 0.25.0 has been released.

This release adds support for scikit-learn 1.7, in addition to version 1.6. However, the most significant changes in this release affect the documentation. The API documentation has been completely overhauled to improve clarity and consistency. I hope this marks a significant improvement for users new to scikit-survival.

One of the biggest pain points for users seems to be understanding which metric can be used to evaluate the performance of a given estimator. The user guide now summarizes the different options.

scikit-survival 0.24.0 released

It’s my pleasure to announce the release of scikit-survival 0.24.0.

A highlight of this release is the addition of cumulative_incidence_competing_risks(), which implements a non-parametric estimator of the cumulative incidence function in the presence of competing risks. In addition, the release adds support for scikit-learn 1.6, including support for missing values in ExtraSurvivalTrees.

Analysis of Competing Risks

In classical survival analysis, the focus is on the time until a specific event occurs. If no event is observed during the study period, the time of the event is considered censored. A common assumption is that censoring is non-informative, meaning that censored subjects have a similar prognosis to those who were not censored.
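To make the role of censoring concrete, here is a minimal sketch of the Kaplan-Meier estimator in plain Python. The estimator is not named in the text above; it is used here only as a standard illustration, and the function name `kaplan_meier` and the toy data are hypothetical. Under the non-informative censoring assumption, a censored subject counts as at risk up to its censoring time and then leaves the risk set without lowering the survival curve.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] for each distinct time with an observed event.

    times:  observed time per subject (event time or censoring time)
    events: True if the event was observed, False if the subject was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects that share the same observed time.
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_leaving += 1
            i += 1
        if n_events > 0:
            # Only observed events reduce the survival probability.
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without a drop in the curve.
        n_at_risk -= n_leaving
    return curve


# Hypothetical toy data: five subjects, two of them censored.
times = [2, 3, 3, 5, 8]
events = [True, False, True, True, False]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.2f}")
```

The subject censored at time 8 never produces a step in the curve; it only enlarges the risk set for the earlier event times.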

scikit-survival 0.23.0 released

I am pleased to announce the release of scikit-survival 0.23.0.

This release adds support for scikit-learn 1.4 and 1.5, which includes missing value support for RandomSurvivalForest. For more details on missing values support, see the section in the release announcement for 0.23.0.

Moreover, this release fixes critical bugs. When fitting SurvivalTree, the sample_weight is now correctly considered when computing the log-rank statistic for each split. This change also affects RandomSurvivalForest and ExtraSurvivalTrees which pass sample_weight to the individual trees in the ensemble. Therefore, the outputs produced by SurvivalTree, RandomSurvivalForest, and ExtraSurvivalTrees will differ from previous releases.

scikit-survival 0.22.0 released

I am pleased to announce the release of scikit-survival 0.22.0. The highlights for this release include

Projects

scikit-survival: machine learning for time-to-event analysis

scikit-survival is a Python module for survival analysis built on top of scikit-learn. It allows doing survival analysis while …