Gitner
Tinder but for issues.
The MLH Fellowship is a brand new experience for most of the people here. At first, everyone is out of their comfort zone, and impostor syndrome strikes when you least expect it: can you handle a certain issue, or should you just become a janitor? If that sounds familiar, you might be this project's target audience. gitner (git + Tinder) matches you with the issue from a given project that best fits you and your relevant experience!
Discord Bot
The Discord bot is the best means of integration with the MLH Fellowship!
Works simply by typing: !gitner match <username>
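As a rough illustration of how a bot could recognise that command, here is a minimal sketch of the parsing step only. It is not the project's actual bot code: the function name `parse_match_command` is hypothetical, and a real bot would call something like it from its Discord message handler.

```python
from typing import Optional

PREFIX = "!gitner"  # command prefix taken from the usage line above


def parse_match_command(message: str) -> Optional[str]:
    """Return the username from '!gitner match <username>', or None.

    Hypothetical helper: it only parses text and does not talk to Discord.
    """
    parts = message.strip().split()
    if len(parts) != 3:
        return None
    prefix, action, username = parts
    if prefix != PREFIX or action != "match":
        return None
    return username
```

Keeping the parsing separate from the Discord client makes it easy to test without a running bot.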
GitHub Crawler
The crawler makes heavy use of the GitHub API to build the graph of users and their interactions with different repositories and issues.
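The graph in question can be pictured as a bipartite structure with users on one side and repositories on the other. The sketch below shows one possible in-memory representation, assuming the crawler has already produced (user, repo) pairs; the function name and the sample data are hypothetical, and no GitHub API calls are made here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_interaction_graph(
    events: Iterable[Tuple[str, str]],
) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Build both sides of a bipartite user-repo graph from (user, repo) pairs."""
    user_to_repos = defaultdict(set)
    repo_to_users = defaultdict(set)
    for user, repo in events:
        user_to_repos[user].add(repo)
        repo_to_users[repo].add(user)
    return dict(user_to_repos), dict(repo_to_users)


# Hypothetical sample data; a real crawler would collect these pairs
# from the GitHub API.
sample_events = [
    ("alice", "org/repo-a"),
    ("alice", "org/repo-b"),
    ("bob", "org/repo-a"),
    ("alice", "org/repo-a"),  # repeated interactions collapse into one edge
]
user_to_repos, repo_to_users = build_interaction_graph(sample_events)
```

Storing both directions up front makes it cheap to look up either a user's repositories or a repository's contributors later on.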
Matching/Recommendation Backend
For license and commercial issues, please contact Yida. We aim to build a more concise and appropriate recommendation backend using a graph convolutional network. Specifically, LightGCN is adopted as the matching algorithm; it learns user and item embeddings and the potential relationships between them. Furthermore, we rely on a parallelized version of LightGCN to improve inference speed. We thank the original authors of the paper.
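To give a feel for what LightGCN computes, here is a small pure-Python sketch of its propagation rule: each layer only aggregates neighbour embeddings with symmetric degree normalisation, and the final embedding is the average over all layers. This is an illustration for readability, not the parallelized implementation used in this project; the function names and the toy embeddings are assumptions, and the training loop is left out entirely.

```python
import math
from typing import Dict, List, Set, Tuple

Vector = List[float]
Embeddings = Dict[int, Vector]


def lightgcn_propagate(
    user_items: Dict[int, Set[int]],
    user_emb: Embeddings,
    item_emb: Embeddings,
    num_layers: int,
) -> Tuple[Embeddings, Embeddings]:
    """Return layer-averaged user and item embeddings."""
    # Item-side adjacency, needed for the degree of each item.
    item_users: Dict[int, Set[int]] = {i: set() for i in item_emb}
    for u, items in user_items.items():
        for i in items:
            item_users[i].add(u)

    dim = len(next(iter(user_emb.values())))
    cur_u = {u: list(v) for u, v in user_emb.items()}
    cur_i = {i: list(v) for i, v in item_emb.items()}
    # Running sums over layers, starting with the layer-0 embeddings.
    sum_u = {u: list(v) for u, v in user_emb.items()}
    sum_i = {i: list(v) for i, v in item_emb.items()}

    for _ in range(num_layers):
        nxt_u = {u: [0.0] * dim for u in cur_u}
        nxt_i = {i: [0.0] * dim for i in cur_i}
        for u, items in user_items.items():
            for i in items:
                # Symmetric normalisation: 1 / sqrt(deg(u) * deg(i)).
                weight = 1.0 / math.sqrt(len(items) * len(item_users[i]))
                for d in range(dim):
                    nxt_u[u][d] += weight * cur_i[i][d]
                    nxt_i[i][d] += weight * cur_u[u][d]
        cur_u, cur_i = nxt_u, nxt_i
        for u in sum_u:
            for d in range(dim):
                sum_u[u][d] += cur_u[u][d]
        for i in sum_i:
            for d in range(dim):
                sum_i[i][d] += cur_i[i][d]

    scale = 1.0 / (num_layers + 1)  # uniform average over layers 0..K
    final_u = {u: [x * scale for x in v] for u, v in sum_u.items()}
    final_i = {i: [x * scale for x in v] for i, v in sum_i.items()}
    return final_u, final_i


def score(user_vec: Vector, item_vec: Vector) -> float:
    """Predicted preference: inner product of the final embeddings."""
    return sum(a * b for a, b in zip(user_vec, item_vec))
```

To recommend items for a user, one would compute `score` against every item the user has not interacted with yet and keep the highest-scoring ones.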
Numerical Performance
We report the performance claimed in the LightGCN paper, where benchmarks are reported on 4 publicly available datasets. These quantitative results satisfy our need for a recommender system that matches GitHub issues/repos to MLH students.

Crucial Components
Users and items should be provided as training data. In our scenario, a user is a GitHub user, while items are the open-source projects/repos with which each user has interacted before.
Architecture Diagram
Organize your training data
There are 2 crucial components in the GitHub issue matching process: users and issues. We need training data that includes the history of repos/issues/PRs with which each user has interacted. Assume that we have 6 users with indices 0, 1, 2, 3, 4, 5 and 20 repos with indices 0, 1, 2, 3, ..., 19:
train.txt - Train file.
- Each line is a user with her/his positive interactions with items: userID\t a list of itemID\n.
test.txt - Test file (positive instances).
- Each line is a user with her/his positive interactions with items: userID\t a list of itemID\n.
- Note that we treat all unobserved interactions as negative instances when reporting performance.
user_list.txt - User file.
- Each line is a pair (org_id, remap_id) for one user, where org_id and remap_id represent the ID of the user in the original dataset and in our dataset, respectively.
item_list.txt - Item file.
- Each line is a pair (org_id, remap_id) for one item, where org_id and remap_id represent the ID of the item in the original dataset and in our dataset, respectively.
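The file layouts above can be produced from crawled interactions along the following lines. This is a minimal sketch under stated assumptions: the helper names and the sample interactions are hypothetical, and the single-space delimiter is a guess exposed as the `sep` parameter, so check it against what the data loader in LightGCN.py actually expects before using it.

```python
from typing import Dict, List


def remap_ids(original_ids: List[str]) -> Dict[str, int]:
    """Assign consecutive integer IDs (0, 1, 2, ...) in first-seen order."""
    mapping: Dict[str, int] = {}
    for org_id in original_ids:
        if org_id not in mapping:
            mapping[org_id] = len(mapping)
    return mapping


def format_id_list(mapping: Dict[str, int], sep: str = " ") -> List[str]:
    """Lines for user_list.txt / item_list.txt: org_id then remap_id."""
    return [f"{org_id}{sep}{remap_id}" for org_id, remap_id in mapping.items()]


def format_interactions(
    interactions: Dict[str, List[str]],
    user_map: Dict[str, int],
    item_map: Dict[str, int],
    sep: str = " ",
) -> List[str]:
    """Lines for train.txt / test.txt: userID followed by its itemIDs."""
    lines = []
    for user, repos in interactions.items():
        item_ids = sorted({item_map[repo] for repo in repos})
        fields = [str(user_map[user])] + [str(i) for i in item_ids]
        lines.append(sep.join(fields))
    return lines


# Hypothetical sample interactions (GitHub login -> repos interacted with).
interactions = {
    "alice": ["org/repo-a", "org/repo-b"],
    "bob": ["org/repo-a"],
}
user_map = remap_ids(list(interactions))
item_map = remap_ids([r for repos in interactions.values() for r in repos])

train_lines = format_interactions(interactions, user_map, item_map)
user_lines = format_id_list(user_map)
item_lines = format_id_list(item_map)
```

Each list of lines can then be written to its file with `"\n".join(lines)`. Splitting interactions between train.txt and test.txt is a separate step not shown here.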
Gitner experiment:
For our team members:
cd recommend
python3 LightGCN.py
For others:
python3 LightGCN.py --dataset gitner --regs "[1e-4]" --embed_size 2 --layer_size "[64,64,64,64]" --lr 0.001 --batch_size 256 --epoch 100 --test_flag full
Lessons Learned
We learned about Graph Neural Networks and their application in recommender systems, how to make Discord bots, and how to properly distribute tasks among people without dependencies! (Microservices!!)
