Risk Radar

Example of home page
Example of displayed tables

Inspiration

The increasing number of fraudulent transactions in today's financial landscape motivated us to create a solution that would empower individuals and businesses to quickly identify and act on suspicious activities. Our aim was to create an accessible and efficient tool to help improve financial security and reduce fraud risks.

What it does

The tool uses machine learning to analyze transaction data and estimate the likelihood of fraud. It identifies potentially fraudulent activity from patterns in the data and assigns a risk score to each transaction. High-risk transactions are flagged for further review so that users can act in time to prevent financial losses. Users have two options. Pressing the generate button demonstrates how the model works: it creates a new CSV file with our data generation algorithm, based on the demographic we are targeting, then runs the file through the model and displays the results. Alternatively, the upload button lets users submit their own CSV file, which is run through the model and displayed in the same way.
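The flagging step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `flag_high_risk`, the column names, the sample scores, and the 0.5 threshold are all assumptions made for the example.

```python
import pandas as pd


def flag_high_risk(transactions, risk_scores, threshold=0.5):
    """Attach a risk score to each transaction and flag those at or above the threshold."""
    result = transactions.copy()
    result["risk_score"] = list(risk_scores)
    result["flagged"] = result["risk_score"] >= threshold
    # Sort so the riskiest transactions appear first for review.
    return result.sort_values("risk_score", ascending=False).reset_index(drop=True)


# Hypothetical transactions and scores, for illustration only.
transactions = pd.DataFrame({
    "transaction_id": [1, 2, 3],
    "amount": [12.50, 940.00, 48.75],
})
scores = [0.05, 0.91, 0.40]

report = flag_high_risk(transactions, scores, threshold=0.5)
print(report)
```

In practice the scores would come from the model rather than a hard-coded list, and the threshold would be tuned to balance missed fraud against false alarms.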

How we built it

The project is built in Python, using pandas and numpy for both data generation and processing. We used scikit-learn to train and evaluate a Random Forest model, which predicts the probability of fraud for each transaction. The trained model is saved with joblib and deployed in a Streamlit application to make it user-friendly. We generated our training data using faker.
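As a rough sketch of the training and saving steps, the snippet below fits a Random Forest on made-up data, reads the fraud probability from `predict_proba`, and round-trips the model through joblib. The features, the labeling rule, the hyperparameters, and the file name `fraud_model.joblib` are assumptions for illustration and do not reflect the project's real dataset or settings.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical features: transaction amount and hour of day.
n = 2000
amount = rng.exponential(scale=50.0, size=n)
hour = rng.integers(0, 24, size=n)
X = np.column_stack([amount, hour])

# Hypothetical labeling rule, used only so the sketch has something to learn:
# larger purchases made in the early hours are marked as fraud.
y = ((amount > 75) & (hour < 8)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# predict_proba returns one column per class; column 1 is the fraud class (label 1).
fraud_probability = model.predict_proba(X_test)[:, 1]
print("held-out accuracy:", model.score(X_test, y_test))

# Save the fitted model to disk and load it back, as an app would at startup.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "fraud_model.joblib")
    joblib.dump(model, path)
    restored = joblib.load(path)

restored_probability = restored.predict_proba(X_test)[:, 1]
```

Saving the fitted model means the application only has to load it and call `predict_proba`, rather than retraining on every run.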

Challenges we ran into

There was a learning curve in getting a grasp of both Python's syntax and the libraries relevant to the project. We also had trouble reformatting our datasets so that the fraud detection model could learn from them, particularly when handling missing values and encoding categorical features. Finally, we struggled to find data to train our model. To solve this, we created a data generation algorithm modeled on the spending habits of an average college student, using figures taken directly from the Federal Reserve Bank of Atlanta.
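To make the two data problems above concrete, here is a small sketch that generates synthetic transactions, injects a few missing values, then imputes them and one-hot encodes the categorical columns. It uses only numpy and pandas rather than faker, and the merchant types, categories, and spending distribution are placeholders, not the Federal Reserve figures or the project's actual generator.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 200

# Hypothetical category values and spending scale, chosen only for illustration.
merchant_types = ["grocery", "restaurant", "online", "gas"]
categories = ["food", "entertainment", "transport", "school"]

seconds_in_30_days = 30 * 24 * 3600
raw = pd.DataFrame({
    "amount": rng.gamma(shape=2.0, scale=15.0, size=n).round(2),
    "merchant_type": rng.choice(merchant_types, size=n),
    "timestamp": pd.Timestamp("2024-01-01")
    + pd.to_timedelta(rng.integers(0, seconds_in_30_days, size=n), unit="s"),
    "category": rng.choice(categories, size=n),
})

# Knock out a few values to imitate the gaps found in real uploads.
missing_rows = rng.choice(n, size=10, replace=False)
raw.loc[missing_rows[:5], "amount"] = np.nan
raw.loc[missing_rows[5:], "merchant_type"] = np.nan


def prepare_features(df):
    """Fill missing values and turn categorical columns into numeric indicators."""
    out = df.copy()
    # Numeric gaps: use the median so a few large purchases do not skew the fill value.
    out["amount"] = out["amount"].fillna(out["amount"].median())
    # Categorical gaps: keep them as their own explicit "unknown" level.
    out["merchant_type"] = out["merchant_type"].fillna("unknown")
    out["category"] = out["category"].fillna("unknown")
    # Tree models cannot use raw timestamps, so keep only the hour of day.
    out["hour"] = out["timestamp"].dt.hour
    out = out.drop(columns=["timestamp"])
    return pd.get_dummies(out, columns=["merchant_type", "category"], dtype=int)


features = prepare_features(raw)
print(features.head())
```

One-hot encoding is only one option here. Ordinal or target encoding would also work with a Random Forest, and the source does not say which approach the team settled on.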

Accomplishments that we're proud of and what we've learned

Collectively, we are proud that we successfully built a robust model that identifies potentially fraudulent transactions. Many of us learned new technologies along the way and gained a firmer footing in understanding and applying machine learning techniques to real-world problems. We learned how to build a Random Forest model and process complex categorical data.

What's next for Risk Radar

We aim to further refine our models by expanding the identifiers and features that contribute to fraud detection. By incorporating additional data points, such as user behavior patterns and location-based analytics, we plan to create a more comprehensive feature set for our machine learning model. We also want to make user uploads more flexible, for example by letting users convert a PDF to a CSV file or convert their own CSV file to the format our model needs, since right now a file must be in a specific format for the model to read it. To improve accessibility further, we could add more ways to display the data or an AI chatbot assistant that advises the user on what to do next based on the fraud detection report.

Built With

faker
joblib
numpy
pandas
python
scikit-learn
streamlit

Submitted to

Knight Hacks VII

Created by

I contributed to the program that generated data sets based on:
1) Transaction amount
2) Merchant type
3) Timestamp
4) Transaction category

I also wrote the write-up.

Ivie Imhonde
I contributed by brainstorming and developing various approaches to detect fraud, identifying different potential fraud scenarios to improve our model's detection capabilities. I contributed to building the machine learning model, but focused primarily on Streamlit integration, ensuring a smooth user interface.

nickiesethi Sethi
I worked all around the project, but I spent the most time implementing the machine learning model our project uses. I also integrated our model, data generation algorithm, and Streamlit app to get our project working as intended.

Haresh Palli
eariz01 Aristizabal

Updates

Ivie Imhonde started this project — Oct 05, 2024 10:05 PM EDT
