Link to GitHub site: alan-tapper.github.io/datathon2022
Social networks are everywhere, but they do not grow in a vacuum. Finding likely connections between members of a network is therefore crucial to keeping it healthy. To this end, we present a model that trains on a given social network and tries to predict connections between any two given members of that network.
The model uses random forest regression: it compares the features of two nodes and, based on the features they have in common, predicts how likely it is that an edge connects them.
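As a rough illustration of how node features can be turned into training data for an edge regressor, the sketch below builds one row per node pair. It is not the project's actual code: the function names (`pair_features`, `build_training_rows`), the binary feature encoding, the element-wise AND used to capture shared features, and the sampling of non-edges as negative examples are all assumptions made for the example. It uses only the Python standard library; the resulting rows could then be handed to any random forest regressor.

```python
import random


def pair_features(feat_u, feat_v):
    """Element-wise AND of two binary feature vectors (hypothetical encoding):
    the result is 1 wherever both nodes share a feature."""
    return [a & b for a, b in zip(feat_u, feat_v)]


def build_training_rows(node_features, edges, num_negatives, seed=0):
    """Return (features, label) rows: label 1.0 for known edges,
    0.0 for randomly sampled node pairs that are not connected."""
    rng = random.Random(seed)
    edge_set = {frozenset(e) for e in edges}
    nodes = sorted(node_features)

    # Guard against asking for more non-edges than the graph contains.
    max_pairs = len(nodes) * (len(nodes) - 1) // 2
    if num_negatives > max_pairs - len(edge_set):
        raise ValueError("not enough unconnected pairs to sample from")

    # Positive examples: one row per known edge, in input order.
    rows = [
        (pair_features(node_features[u], node_features[v]), 1.0)
        for u, v in edges
    ]

    # Negative examples: distinct random pairs with no edge between them.
    seen = set()
    while len(seen) < num_negatives:
        u, v = rng.sample(nodes, 2)
        pair = frozenset((u, v))
        if pair in edge_set or pair in seen:
            continue
        seen.add(pair)
        rows.append((pair_features(node_features[u], node_features[v]), 0.0))
    return rows


# Toy network (made-up data): four members, four binary features each.
node_features = {
    "a": [1, 0, 1, 1],
    "b": [1, 0, 1, 0],
    "c": [0, 1, 0, 0],
    "d": [0, 1, 0, 1],
}
edges = [("a", "b"), ("c", "d")]
rows = build_training_rows(node_features, edges, num_negatives=2)
```

A regressor trained on rows like these outputs a score between 0 and 1 for an unseen pair, which can be read as how likely that connection is.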
The model was built using TensorFlow, an open-source machine learning framework with a Python API. We also used pandas for efficient storage and manipulation of data. Because of limited computational power, we restricted training to a random subset of 2,500 features and 20,000 randomly selected edges.
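The subsampling step described above might look something like the following sketch. It is written with the standard library's `random` module so that it runs anywhere; with pandas, `DataFrame.sample` would be the natural equivalent. The helper names (`subsample`, `project`), the fixed seed, and the toy data are hypothetical. Only the limits of 2,500 features and 20,000 edges come from the write-up.

```python
import random


def subsample(feature_names, edges, max_features, max_edges, seed=0):
    """Pick a random subset of feature names and of edges, without replacement.
    If fewer items exist than requested, everything is kept."""
    rng = random.Random(seed)
    kept_features = rng.sample(feature_names, min(max_features, len(feature_names)))
    kept_edges = rng.sample(edges, min(max_edges, len(edges)))
    return kept_features, kept_edges


def project(node_features, feature_names, kept_features):
    """Reduce each node's feature vector to the kept feature columns only."""
    index = {name: i for i, name in enumerate(feature_names)}
    cols = [index[name] for name in kept_features]
    return {node: [vec[c] for c in cols] for node, vec in node_features.items()}


# Toy data (made up): 10 features and 50 edges, cut down to 4 and 20.
# The write-up's limits would be max_features=2500 and max_edges=20000.
feature_names = [f"f{i}" for i in range(10)]
edges = [(i, i + 1) for i in range(50)]
kept_features, kept_edges = subsample(feature_names, edges, max_features=4, max_edges=20)

node_features = {0: list(range(10))}
projected = project(node_features, feature_names, kept_features)
```

Fixing the seed keeps the subset reproducible between runs, which makes it easier to compare models trained on the same reduced data.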
The major challenges we ran into were limited computational power and sparsity in the data set. A highly sparse data set calls for a large amount of training data, but a large amount of data results in very long computation times. In the future, we hope to find a better method for choosing which features to train on and to use more powerful machines.
Looking forward, we would like to fine-tune our model so that it can take in more data, eventually training it on a fully fleshed-out social network.