This repository hosts the code for the paper:
GeneticPrism: Multifaceted Visualization of Citation-based Scholarly Research Evolution
🔗 Online System: https://genetic-flow.com / https://geneticflow.ye-sun.com/
🎞️ Demo Video: https://youtu.be/zVbM7lgA6Ig
See the User Manual and Appendix in the Wiki pages.
Understanding the evolution of scholarly research is essential for academic decision-making (e.g., research planning and frontier exploration). Existing platforms like Google Scholar rely on abstract numerical indicators lacking contextual depth, while visualization approaches rarely leverage curated self-citation data to depict individual scholars’ evolution.
This work introduces:
- A 3D prism metaphor visualizing scholars’ research profiles
- A scroll metaphor visualizing structured topic evolution via streamgraphs and inter-topic flow maps
- Six-degree-impact glyphs highlighting interdisciplinary breakthroughs
- Evaluations through case studies (Turing Award laureates, visualization venues) and user studies
The dataset is processed from open-source academic graphs:
- v1 (up to Sept. 2022): Microsoft Academic Graph (MAG) processed into the GF Graph (see the KDD’23 paper and its GitHub repo)
- v2 (up to Oct. 2024): MAG fused with OpenAlex
🔗 Download v2 dataset: Hugging Face. Due to the dataset's size, it is divided into two compressed archives.
- The csv.tar.gz archive contains CSV files covering all research fields except Artificial Intelligence (AI). After extraction, place these CSV files directly in your project root directory.
- The AI.tar.gz archive contains only AI-related data; extract its CSV files into the project's csv/ directory.
The system remains fully functional if only one archive (either AI or Non-AI) is installed, enabling flexible data management based on research needs.
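The extraction steps above can also be scripted. The following is a minimal sketch, not part of the repository: the function names `extract_archive` and `install_datasets` and the `download_dir` argument are hypothetical, and it assumes that each archive unpacks into the layout this section describes (csv.tar.gz into the project root, AI.tar.gz into csv/). Archives that are absent are skipped, matching the note that either one alone is sufficient.

```python
import tarfile
from pathlib import Path


def extract_archive(archive: Path, dest: Path) -> None:
    """Unpack a .tar.gz archive into dest, creating dest if needed."""
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        if hasattr(tarfile, "data_filter"):
            # Use the safer "data" filter on Python versions that provide it.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)


def install_datasets(project_root: Path, download_dir: Path) -> list[str]:
    """Extract whichever of the two archives is present; return their names."""
    # Destinations follow the instructions above (the archive layout is assumed).
    targets = {
        "csv.tar.gz": project_root,         # all fields except AI
        "AI.tar.gz": project_root / "csv",  # AI data goes under csv/
    }
    installed = []
    for name, dest in targets.items():
        archive = download_dir / name
        if archive.is_file():
            extract_archive(archive, dest)
            installed.append(name)
    return installed
```

For example, `install_datasets(Path("GeneticPrism"), Path.home() / "Downloads")` would return the list of archives it found and extracted; the download location here is only an illustration.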
After extraction, the csv/ directory should sit in the project root, with a layout like this:
GeneticPrism/
├── csv/
│ ├── AI/ # Contains AI-related data
│ │ ├── links/
│ │ ├── papers/
│ │ ├── paperIDDistribution.csv
│ │ ├── top_field_authors.csv
│ │ └── field_leaves.csv
│ └── <field> # Contains other research fields
│ ├── links/
│ ├── papers/
│ └── ...
├── manage.py
└── ...

conda create -n GFVis python=3.11
conda activate GFVis
pip install -r requirements.txt

Option A: Direct run
python manage.py runserver 0.0.0.0:9001

Option B: Background run (persistent)
nohup python manage.py runserver 0.0.0.0:9001 2>&1 &

- Access the system at: http://<your-ip>:9001
- Use Ctrl + C to terminate direct runs
- Monitor background processes via tail -f nohup.out
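
Before starting the server, it can help to confirm that the csv/ directory matches the layout shown earlier. The sketch below is not part of the repository; `missing_entries` and `check_csv_root` are hypothetical helpers. It assumes every field directory contains the same entries listed for AI/ (the source elides the contents of the other fields, so treat a "missing" report for them as a hint rather than an error).

```python
from pathlib import Path

# Entries listed under csv/AI/ in the directory tree; assumed for other fields.
EXPECTED_DIRS = ("links", "papers")
EXPECTED_FILES = (
    "paperIDDistribution.csv",
    "top_field_authors.csv",
    "field_leaves.csv",
)


def missing_entries(field_dir: Path) -> list[str]:
    """Return the expected subdirectories and files absent from field_dir."""
    missing = [d + "/" for d in EXPECTED_DIRS if not (field_dir / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (field_dir / f).is_file()]
    return missing


def check_csv_root(csv_root: Path) -> dict[str, list[str]]:
    """Map each field directory under csv_root to its missing entries."""
    if not csv_root.is_dir():
        return {}
    return {
        p.name: missing_entries(p)
        for p in sorted(csv_root.iterdir())
        if p.is_dir()
    }


if __name__ == "__main__":
    # Run from the project root, next to manage.py.
    for field, missing in check_csv_root(Path("csv")).items():
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{field}: {status}")
```

An empty result means no field directories were found under csv/, which usually indicates that neither archive has been extracted yet.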