Welcome to the code behind Clusters in Focus: A Simple and Robust Detail-On-Demand Dashboard for Patient Data! This work was presented at the EG VCBM 2025 at TU Delft.
The paper is available on the EG Digital Library (including a direct link to the PDF file).
Exploring tabular datasets to understand how different feature pairs partition data into meaningful cohorts is crucial in domains such as biomarker discovery, yet comparing clusters across multiple feature pair projections is challenging. We introduce Clusters in Focus, an interactive visual analytics dashboard designed to address this gap.
Clusters in Focus employs a three-panel coordinated view: a Data Panel offers multiple perspectives (tabular, heatmap, condensed with histograms / SHAP values) for initial data exploration; a Selection Panel displays the 2D clustering (K-Means / DBSCAN) for a user-selected feature pair; and a novel Cluster Similarity Panel features two switchable views for comparing clusters. A ranked list enables the identification of top-matching feature pairs, while an interactive similarity matrix with reordering capabilities allows for the discovery of global structural patterns and groups of related features. This dual-view design supports both focused querying and broad visual exploration.
A use case on a Parkinson's disease speech dataset demonstrates the tool's effectiveness in revealing relationships between different feature pairs characterizing the same patient subgroup.
To run the application:
docker compose build
docker compose up
This will start a web server (on port 80), which you can access via your browser.
To recreate the screenshot above (Figure 1 in the paper), open http://localhost in Firefox.
You should now be able to see the interface.
In the top-right dropdown, select parkinsonsdata.csv, which should already be available in the pre-loaded database.
You should now be able to see a tabular view of the data. In order to get to the heatmap view, click on the list icon in the top-right corner of Panel 1 (the Data Panel).
Now, for example, select the two columns MDVP:FO(HZ) and MDVP:FLO(HZ).
Panel 2 should display a scatter plot of all entries within the dataset across these two features.
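To give a rough idea of what clustering a single feature pair involves, here is a minimal pure-Python sketch of K-Means (Lloyd's algorithm) on 2D points. It is not the dashboard's implementation: the function name `kmeans_2d`, the deterministic initialisation, and the toy coordinates are all assumptions made for illustration.

```python
from math import dist


def kmeans_2d(points, k, iterations=50):
    """Plain Lloyd's algorithm on 2D points; returns one cluster label per point.

    Hypothetical helper for illustration only, not taken from the dashboard.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # Assumed deterministic initialisation: take k points spread evenly
    # through the input order, so repeated runs give the same result.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]

    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels


# Toy values standing in for two numeric columns (not real dataset entries).
points = [
    (119.0, 74.0), (122.0, 78.0), (117.0, 72.0),
    (198.0, 190.0), (202.0, 195.0), (205.0, 188.0),
]
labels = kmeans_2d(points, k=2)
```

Each label identifies the cluster a row belongs to for this particular feature pair; the set of row indices sharing a label is what a later cluster comparison would operate on.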
Click on one of the data points to open Panel 3, which presents a similarity analysis of all other possible clusters in the same dataset that share a high overlap of entries with the selected one.
This similarity is based on the Jaccard index.
You can switch between a list and a matrix view, again in the top-right corner. There, a higher color intensity corresponds to a higher Jaccard index, enabling a robust workflow for cluster re-identification.
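For reference, the Jaccard index of two sets A and B is |A ∩ B| / |A ∪ B|, so it is 1 for identical sets and 0 for disjoint ones. The sketch below shows how a selected cluster could be compared against candidate clusters and ranked, assuming each cluster is represented as a set of row indices. The names `jaccard` and `rank_clusters`, the key layout, and the toy sets are hypothetical and do not come from the dashboard's code.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two collections of row indices."""
    a, b = set(a), set(b)
    union = a | b
    # Convention assumed here: two empty clusters have similarity 0.
    return len(a & b) / len(union) if union else 0.0


def rank_clusters(selected, candidates):
    """Return (key, score) pairs sorted by descending Jaccard index.

    `candidates` maps an arbitrary key, here (feature_x, feature_y, cluster_id),
    to the set of row indices in that cluster.
    """
    scored = [(key, jaccard(selected, rows)) for key, rows in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy example: row indices are made up for illustration.
selected = {1, 2, 3, 4}
candidates = {
    ("feature_a", "feature_b", 0): {1, 2, 3},
    ("feature_a", "feature_c", 1): {3, 4, 5, 6},
    ("feature_b", "feature_c", 0): {7, 8},
}
ranking = rank_clusters(selected, candidates)
```

In this toy example the first candidate shares three of four rows with the selection and scores 0.75, so it ranks first; the disjoint candidate scores 0 and ranks last.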
For the presentation and demonstration of an exemplary use case, we used the following dataset:
https://www.kaggle.com/datasets/debasisdotcom/parkinson-disease-detection/data
If you found this tool helpful for your work, please cite it as follows:
@inproceedings{2025-clusters-in-focus,
booktitle = {Eurographics Workshop on Visual Computing for Biology and Medicine},
editor = {Garrison, Laura and Krueger, Robert},
title = {{Clusters in Focus: A Simple and Robust Detail-On-Demand Dashboard for Patient Data}},
author = {Schilcher, Lukas and Waldert, Peter and Kantz, Benedikt and Schreck, Tobias},
year = {2025},
publisher = {The Eurographics Association},
issn = {2070-5786},
isbn = {978-3-03868-276-9},
doi = {10.2312/vcbm.20251250}
}