🌍 Landslide Prediction Using Tree-Based Models

A data science competition project from Virginia Tech that uses spatial and environmental data to predict the region where a rainfall-triggered landslide is likely to occur. The models are decision-tree ensembles: Random Forest and XGBoost.

🧪 Research Question

Given the environmental conditions and the characteristics of a landslide, in which region did it occur?

The goal is to use features such as rainfall, location, and date to accurately classify landslide occurrences into distinct regions of risk.

📊 Dataset

Name: Global Landslide Catalog (GLC)
Source: NASA Open Data Portal
Years Covered: 2007–2015
Size: 6,788 rows × 35 columns
Purpose: Identify rainfall-triggered landslides worldwide

Direct CSV Download

🧠 Models Used

🌳 Random Forest

Achieved 68.5% accuracy on test data
Hyperparameters: ntree = 300, mtry = 18
Further tuning did not yield significant improvement

⚡ XGBoost

Achieved 69.5% accuracy without hyperparameter tuning
Training error (0.02) was far below the test error (about 0.31), which suggests potential overfitting
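
The overfitting concern can be made concrete by comparing the reported training error with the test error implied by the reported accuracy. The sketch below uses only the two XGBoost figures above and assumes the 0.02 training error is a misclassification rate on the same scale as accuracy; the function name is illustrative.

```python
def generalization_gap(train_error: float, test_accuracy: float) -> float:
    """Return test error minus training error (larger means more overfitting)."""
    test_error = 1.0 - test_accuracy
    return test_error - train_error


# Figures reported above for XGBoost; assumed to be misclassification rates.
gap = generalization_gap(train_error=0.02, test_accuracy=0.695)
print(f"test error = {1.0 - 0.695:.3f}, gap = {gap:.3f}")
```

A gap this large relative to the training error is the usual sign that regularization or early stopping is worth trying.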

📌 Region Definition

Landslide regions were defined using a 100-mile radius around events
Each cluster contains at least 5 observations
86 distinct regions were identified as classification targets
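
One way to turn a radius rule like this into labels is sketched below. It is an illustration, not the team's actual procedure: it links any two events within 100 miles of each other (haversine distance), treats each connected group as a cluster, and keeps clusters with at least 5 events. The function names and sample coordinates are hypothetical.

```python
import math
from collections import defaultdict

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def assign_regions(points, radius_miles=100.0, min_size=5):
    """Return one region id per point, or None if the point's cluster is too small."""
    n = len(points)
    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Link every pair of events that lie within the radius (O(n^2) pairs).
    for i in range(n):
        for j in range(i + 1, n):
            if haversine_miles(*points[i], *points[j]) <= radius_miles:
                parent[find(i)] = find(j)

    members = defaultdict(list)
    for i in range(n):
        members[find(i)].append(i)

    labels = [None] * n
    next_id = 0
    for root in sorted(members):  # sorted for deterministic region ids
        if len(members[root]) >= min_size:
            for i in members[root]:
                labels[i] = next_id
            next_id += 1
    return labels


# Hypothetical coordinates: five nearby events plus two isolated ones.
sample = [
    (37.20, -80.40), (37.30, -80.50), (37.10, -80.30),
    (37.25, -80.45), (37.15, -80.35), (10.0, 120.0), (-33.0, 151.0),
]
labels = assign_regions(sample)
print(labels)
```

The pairwise loop is quadratic, so on a few thousand events a spatial index or a library clustering routine would be a faster choice.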

📈 Time Analysis

Most landslides occurred in July and August
This contrasts with the months where a peak was expected, such as March and April
Training Data: 2007–2012 (4,138 observations)
Testing Data: 2012–2015 (2,644 observations)
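
A time-based split and a per-month count can be sketched as follows. The `event_date` field name, the cutoff date, and the toy records are assumptions for illustration; the exact boundary within 2012 is not stated above.

```python
from collections import Counter
from datetime import date


def temporal_split(records, cutoff):
    """Split records into (train, test): events before the cutoff go to train."""
    train = [r for r in records if r["event_date"] < cutoff]
    test = [r for r in records if r["event_date"] >= cutoff]
    return train, test


def events_per_month(records):
    """Count events by calendar month (1 = January, 12 = December)."""
    return Counter(r["event_date"].month for r in records)


# Toy records, not taken from the catalog.
records = [
    {"event_date": date(2008, 7, 14)},
    {"event_date": date(2010, 8, 2)},
    {"event_date": date(2011, 3, 20)},
    {"event_date": date(2013, 7, 9)},
    {"event_date": date(2014, 8, 30)},
]
cutoff = date(2012, 7, 1)  # hypothetical boundary inside 2012

train, test = temporal_split(records, cutoff)
counts = events_per_month(records)
print(len(train), len(test), counts.most_common(2))
```

Splitting on time rather than at random keeps later events out of training, which mirrors how the model would be used on future landslides.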

🛠️ Why Tree-Based Models?

Handle both discrete and continuous variables
Perform well with spatial features (e.g., latitude and longitude)
More robust against noise and overfitting compared to linear models

📚 References

👥 Team

Hokie Hackers — Virginia Tech

Ted Li
Devanshu Khadka
Drew Keely
Nami Jain

📫 Contact

Devanshu Khadka
LinkedIn
📧 khadkadevanshu@gmail.com

📜 License

For academic use only. Contact authors for reuse or collaboration.
