Inspiration
Our team participated in the NUS SDS Datathon, where we delved into data science theory, applications, and soft skills like time management and teamwork. The event provided us with a dataset containing information on approximately 30,000 companies in Singapore.
What it does
We set out to build a predictive model that forecasts future sales for these companies. By analyzing trends, patterns, and likely sales drivers in the dataset, the model is meant to produce forecasts that support strategic decision-making, inventory management, and resource allocation.
How we built it
- Data Processing: We prepared and cleaned the dataset, handling NaN values and redundant columns.
- Exploratory Data Analysis (EDA): We examined correlations between numerical variables, selected features using CatBoost, and developed hypotheses based on domain knowledge.
- Feature Engineering: We created new feature columns based on statistical significance tests and incorporated NLP techniques for textual data encoding.
- Modeling and Evaluation: We built our final predictive model with a CatBoost Regressor and evaluated its performance using K-Fold cross-validation.
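
The evaluation step can be illustrated with a minimal sketch of K-Fold cross-validation. This is not our competition code: it uses only the Python standard library, toy data, and a placeholder mean-predictor (`fit_mean_baseline`) standing in for the CatBoost Regressor. All function names here are hypothetical and chosen for illustration.

```python
import random
from statistics import mean


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle row indices and split them into k nearly equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_mean_baseline(X_train, y_train):
    """Placeholder model (assumption): always predicts the mean training target.

    A real regressor would be swapped in here, as long as it returns a
    function that maps a list of rows to a list of predictions.
    """
    avg = mean(y_train)
    return lambda X: [avg for _ in X]


def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    return mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)]) ** 0.5


def cross_validate(X, y, fit, k=5, seed=0):
    """Train on k-1 folds, score on the held-out fold, repeat for each fold."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        preds = predict([X[i] for i in held_out])
        scores.append(rmse([y[i] for i in held_out], preds))
    return scores


# Toy data (not the datathon dataset): one numeric feature, linear target.
X = [[float(i)] for i in range(20)]
y = [2.0 * row[0] + 1.0 for row in X]

scores = cross_validate(X, y, fit_mean_baseline, k=5)
print("per-fold RMSE:", [round(s, 3) for s in scores])
print("mean RMSE:", round(mean(scores), 3))
```

Averaging the per-fold scores gives a steadier estimate of out-of-sample error than a single train/test split, because every row is held out exactly once.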
Challenges we ran into
Data cleaning, feature selection, and model optimization were all difficult because the dataset is sparse and contains high-cardinality categorical features.
Accomplishments that we're proud of
- Developing statistically significant hypotheses backed by strong domain knowledge.
- Enhancing model fit through feature selection and engineering techniques.
- Gaining insights into the importance of EDA and iterative hypothesis testing in data science projects.
What we learned
- The predictive model can be refined further through more advanced feature engineering techniques.
- Experimenting with different modeling algorithms may improve forecast accuracy.
- Data science methodologies take continuous learning and practice, which we plan to carry into future projects.
What's next for DataDestroyers
Turning this into a consulting company.