Gitner architectural diagram
An example of the working bot

Gitner

Tinder but for issues.

The MLH Fellowship is a brand-new experience for most of the people here. At first, everyone is out of their comfort zone, and impostor syndrome strikes when you least expect it: you wonder whether you can handle a certain issue or whether you should just become a janitor. If that sounds familiar, you might be this project's target audience. gitner (git + Tinder) matches you with the issue from a given project that best fits you and your relevant experience!

Discord Bot

The Discord bot is the most convenient way to integrate with the MLH Fellowship. To use it, simply type: !gitner match <username>
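As a rough illustration of how a bot could recognize that command, here is a minimal Python sketch. The function name `parse_match_command` is hypothetical, and the actual bot may be implemented differently (the project also lists node.js); only the `!gitner match <username>` syntax comes from the description above.

```python
from typing import Optional

COMMAND_PREFIX = "!gitner"  # prefix taken from the usage line above


def parse_match_command(message: str) -> Optional[str]:
    """Return the username from '!gitner match <username>', or None.

    Hypothetical helper: it only parses text and does not talk to Discord.
    """
    parts = message.strip().split()
    if len(parts) != 3:
        return None
    prefix, action, username = parts
    if prefix != COMMAND_PREFIX or action != "match":
        return None
    return username


print(parse_match_command("!gitner match octocat"))  # octocat
print(parse_match_command("hello there"))            # None
```

A real bot would call a helper like this from its message handler and pass the username on to the matching backend.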

GitHub Crawler

The crawler makes heavy use of the GitHub API to build a graph of users and their interactions with different repositories and issues.
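The graph-building step can be pictured with a small sketch. It assumes the crawler has already produced (user, repo) interaction records; the record shape, the sample names, and the function `build_interaction_graph` are illustrative assumptions, and no GitHub API calls are made here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_interaction_graph(
    records: Iterable[Tuple[str, str]],
) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Build both sides of a bipartite user <-> repo graph."""
    user_to_repos: Dict[str, Set[str]] = defaultdict(set)
    repo_to_users: Dict[str, Set[str]] = defaultdict(set)
    for user, repo in records:
        user_to_repos[user].add(repo)
        repo_to_users[repo].add(user)
    return dict(user_to_repos), dict(repo_to_users)


# Hypothetical records, shaped as a crawler might collect them.
sample = [
    ("alice", "org/repo-a"),
    ("alice", "org/repo-b"),
    ("bob", "org/repo-a"),
    ("alice", "org/repo-a"),  # duplicates collapse into one edge
]
users, repos = build_interaction_graph(sample)
print(sorted(users["alice"]))
print(sorted(repos["org/repo-a"]))
```

Keeping both directions of the graph makes it cheap to look up a user's repos and a repo's contributors, which is what a neighborhood-based recommender needs.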

Matching/Recommendation Backend

For license and commercial issues, please contact Yida.

We aim to build a more concise and appropriate recommendation backend using a graph convolutional network. Specifically, LightGCN is adopted as the matching algorithm; it learns user and item embeddings and the potential relationships between them. Furthermore, we build on a parallelized version of LightGCN to improve inference speed. We thank the original authors of the LightGCN paper.
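To give a feel for what LightGCN computes, the following is a minimal pure-Python sketch of its propagation step: each layer replaces a node's embedding with a symmetrically normalized sum of its neighbors' embeddings, and the final embedding is the mean over all layers. This is not the project's TensorFlow code; the function names and the toy graph are assumptions, and training (learning the initial embeddings) is left out entirely.

```python
import math
from typing import Dict, List

Vector = List[float]


def lightgcn_propagate(
    neighbors: Dict[int, List[int]],
    embeddings: Dict[int, Vector],
    num_layers: int,
) -> Dict[int, Vector]:
    """Propagate embeddings over the graph and average the layer outputs.

    `neighbors` must be symmetric: if b is listed under a, a is listed under b.
    """
    dim = len(next(iter(embeddings.values())))
    current = {n: list(v) for n, v in embeddings.items()}
    total = {n: list(v) for n, v in embeddings.items()}  # layer 0 contribution
    for _ in range(num_layers):
        nxt: Dict[int, Vector] = {}
        for node in current:
            acc = [0.0] * dim
            for nb in neighbors.get(node, []):
                # Symmetric normalization: 1 / sqrt(deg(node) * deg(nb)).
                w = 1.0 / math.sqrt(len(neighbors[node]) * len(neighbors[nb]))
                for d in range(dim):
                    acc[d] += w * current[nb][d]
            nxt[node] = acc
        current = nxt
        for node in total:
            for d in range(dim):
                total[node][d] += current[node][d]
    # Uniform layer weights: mean over layers 0..num_layers.
    return {n: [x / (num_layers + 1) for x in v] for n, v in total.items()}


def score(user_vec: Vector, item_vec: Vector) -> float:
    """Match score is the inner product of the final embeddings."""
    return sum(a * b for a, b in zip(user_vec, item_vec))


# Toy graph (hypothetical): users 0 and 1, items 10 and 11.
graph = {0: [10, 11], 1: [11], 10: [0], 11: [0, 1]}
init = {0: [1.0, 0.0], 1: [0.0, 1.0], 10: [0.5, 0.5], 11: [0.2, 0.8]}
final = lightgcn_propagate(graph, init, num_layers=2)
print(score(final[1], final[10]))  # score for an item user 1 has not seen
```

Ranking all unseen items by this score for a given user yields the recommendations.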

Numerical Performance

We refer to the performance reported in the LightGCN paper, where benchmarks are reported on 4 publicly available datasets. These quantitative results satisfy our need for a recommender system that matches GitHub issues/repos to MLH students.

Crucial Components

Users and items must be provided as training data. In our scenario, a user is a GitHub user, while the items are the open-source projects/repos with which each user has interacted before.

Architecture Diagram

Link: Gitner Architecture Diagram

Organize your training data

There are 2 crucial components in the GitHub issue matching process: users and issues. We need training data that includes the history of repos/issues/PRs with which each user has interacted. Assume that we have 6 users with indices 0, 1, 2, 3, 4, 5 and 20 repos with indices 0, 1, 2, 3, ..., 19. The data is organized into the following files:

train.txt
- Train file.
- Each line is a user with her/his positive interactions with items: userID\t a list of itemID\n.
test.txt
- Test file (positive instances).
- Each line is a user with her/his positive interactions with items: userID\t a list of itemID\n.
- Note that here we treat all unobserved interactions as the negative instances when reporting performance.
user_list.txt
- User file.
- Each line is a pair (org_id, remap_id) for one user, where org_id and remap_id represent the ID of the user in the original and our datasets, respectively.
item_list.txt
- Item file.
- Each line is a pair (org_id, remap_id) for one item, where org_id and remap_id represent the ID of the item in the original and our datasets, respectively.
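The file layout above can be produced with a short sketch like the one below. It remaps original IDs to consecutive integers and formats one line per user. The function names and sample data are hypothetical, and the separator between item IDs (a single space here) is an assumption, since the description only specifies the tab after the user ID; check what your data loader expects.

```python
from typing import Dict, List, Tuple


def remap_ids(original_ids: List[str]) -> Dict[str, int]:
    """Assign consecutive integer IDs in first-seen order."""
    mapping: Dict[str, int] = {}
    for org_id in original_ids:
        if org_id not in mapping:
            mapping[org_id] = len(mapping)
    return mapping


def format_interactions(
    interactions: Dict[str, List[str]],
    user_sep: str = "\t",  # from the format description above
    item_sep: str = " ",   # assumption: not specified in the description
) -> Tuple[List[str], Dict[str, int], Dict[str, int]]:
    """Return train/test style lines plus the user and item ID mappings."""
    user_map = remap_ids(list(interactions))
    item_map = remap_ids([i for items in interactions.values() for i in items])
    lines = []
    for user, items in interactions.items():
        item_ids = item_sep.join(str(item_map[i]) for i in items)
        lines.append(f"{user_map[user]}{user_sep}{item_ids}")
    return lines, user_map, item_map


# Hypothetical sample: GitHub logins mapped to the repos they interacted with.
history = {
    "alice": ["org/repo-a", "org/repo-b"],
    "bob": ["org/repo-a"],
}
lines, user_map, item_map = format_interactions(history)
for line in lines:
    print(repr(line))
# user_list.txt / item_list.txt rows are (org_id, remap_id) pairs.
for org_id, remap_id in user_map.items():
    print(org_id, remap_id)
```

Writing `lines` to train.txt and the two mappings to user_list.txt and item_list.txt gives the four-file layout once a held-out split is written to test.txt in the same way.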

Gitner experiment:

For our team members:

cd recommend

python3 LightGCN.py

For others:

python3 LightGCN.py --dataset gitner --regs "[1e-4]" --embed_size 2 --layer_size "[64,64,64,64]" --lr 0.001 --batch_size 256 --epoch 100 --test_flag full

Lessons Learned

We learned about Graph Neural Networks and their application in recommender systems. We also learned how to make Discord bots and how to properly distribute tasks among people without dependencies! (Microservices!!)

Made by Viktor, Yida and Harshit

Built With

discord
flask
node.js
python
tensorflow

Submitted to

MLH Fellowship Halfway Hackathon - Batch 1

Created by

Worked on the REST API, creating an API endpoint to integrate and serve the graph convolution deep learning recommender system.

Harshit Singhai
I'm in charge of the user-project matching backend, which produces satisfying matches between our members and existing GitHub repos/issues. It's an optimised parametric model that can be improved further once we have sufficient data.

Yida Wang
Ph.D. student in machine learning and computer vision
I worked on the Discord bot and the GitHub API crawler, as well as the organization of the team :)

Viktor Velev

Updates

Viktor Velev started this project — Nov 20, 2020 05:06 PM EST
