Inspiration

A web app that finds the literature to get you from what you know to what you want to know as efficiently as possible. Become familiar with a new field, understand a collaborator's publication or check key publications for your literature review - all quickly and efficiently.

What it does

Think of all possible knowledge as a (very) high dimensional hyperspace. Any publication will occupy some subspace of this hyperspace. Conventional literature searches return the publications that intersect some subspace, usually defined by keywords. More recent social media/bot approaches try to interpret the subspace the user is familiar with and suggest publications that sit at or just outside the boundary of this familiarity. Our service bridges the gap between the subspace of knowledge the user already knows and the subspace they are searching for. It does this by presenting the minimum set of publications the user needs to traverse between these subspaces.

How we built it

Each publication's subspace coverage is approximated as a node, and the hyperspace is projected onto a network in which the distances between nodes approximate the distances within the hyperspace.

These distance estimates are an aggregation of several factors, including whether the papers are directly connected by a citation or reference (first degree connection), how many cited/referencing papers they share (second degree connections) and the overlap of their keywords. Further factors could include text analysis of the abstract or full text, author input, expert input, user input or others.
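
As a rough illustration of how such factors might be combined, here is a minimal sketch. It is not the project's actual formula: the paper record layout (`id`, `links`, `keywords`), the use of Jaccard overlap and the three weights are all assumptions chosen only to make the idea concrete.

```python
def jaccard(a, b):
    """Overlap of two collections as |A ∩ B| / |A ∪ B| (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def paper_distance(p, q, w_direct=0.5, w_shared=0.3, w_keywords=0.2):
    """Hypothetical distance in [0, 1] between two paper records.

    Each record is assumed to be a dict with:
      "id"       - a unique identifier
      "links"    - ids of papers it cites or is cited by
      "keywords" - its keywords
    The weights are illustrative assumptions and sum to 1.
    """
    # First degree connection: one paper cites or references the other.
    direct = 1.0 if (q["id"] in p["links"] or p["id"] in q["links"]) else 0.0
    # Second degree connections: shared cited/referencing papers.
    shared = jaccard(p["links"], q["links"])
    # Keyword overlap.
    keywords = jaccard(p["keywords"], q["keywords"])

    similarity = w_direct * direct + w_shared * shared + w_keywords * keywords
    return 1.0 - similarity


paper_a = {"id": "a", "links": {"b", "c"}, "keywords": {"graphs", "search"}}
paper_b = {"id": "b", "links": {"a", "c"}, "keywords": {"search", "ranking"}}
paper_z = {"id": "z", "links": {"y"}, "keywords": {"chemistry"}}

close = paper_distance(paper_a, paper_b)  # connected, shared links and keywords
far = paper_distance(paper_a, paper_z)    # nothing in common
```

With these assumed weights, two papers that share nothing get the maximum distance of 1, and every shared factor pulls the distance down towards 0.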

A shortest path is then found on the network. Edges whose distance is below a threshold have unit weight, and edges whose distance is above the threshold have weights penalised above unity. This threshold models the user's ability to move outside their field of knowledge, so papers they would find comfortable to learn from are included but papers they would find challenging are penalised.
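
The thresholded weighting and path search can be sketched as follows. This is an illustrative sketch rather than the prototype's code: the linear penalty, the default `threshold` and `penalty` values and the function names are assumptions, and a plain-Python Dijkstra search is used here only to keep the example dependency-free.

```python
import heapq


def edge_weight(distance, threshold=0.5, penalty=10.0):
    """Unit weight at or below the threshold, penalised above unity beyond it.

    The linear penalty is an assumption; any increasing function would do.
    """
    if distance <= threshold:
        return 1.0
    return 1.0 + penalty * (distance - threshold)


def reading_path(distances, known, target, threshold=0.5, penalty=10.0):
    """Return the cheapest list of papers from `known` to `target`, or None.

    `distances` maps an undirected pair (paper_a, paper_b) to a distance.
    """
    # Build an adjacency list with thresholded weights.
    graph = {}
    for (a, b), d in distances.items():
        w = edge_weight(d, threshold, penalty)
        graph.setdefault(a, []).append((b, w))
        graph.setdefault(b, []).append((a, w))

    # Dijkstra's algorithm with a binary heap.
    best = {known: 0.0}
    previous = {}
    queue = [(0.0, known)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, w in graph.get(node, []):
            new_cost = cost + w
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))

    if target not in best:
        return None  # no chain of papers connects the two

    # Walk back from the target to recover the reading order.
    path = [target]
    while path[-1] != known:
        path.append(previous[path[-1]])
    return path[::-1]


distances = {
    ("A", "B"): 0.2,
    ("B", "C"): 0.3,
    ("C", "D"): 0.4,
    ("A", "D"): 0.9,  # a direct but challenging jump
}
gentle = reading_path(distances, "A", "D")               # ['A', 'B', 'C', 'D']
direct = reading_path(distances, "A", "D", penalty=0.0)  # ['A', 'D']
```

In this toy network the direct jump from A to D is penalised, so the search prefers three comfortable steps. Setting `penalty=0.0` removes the penalty and the search returns the direct jump instead.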

Challenges we ran into

  • Collecting a dataset that is both large enough and representative
  • Data cleaning
  • Compatibility between React and Flask

Accomplishments that we're proud of

  • A working prototype with real world data

What we learned

  • NetworkX is great and easy to use
  • Don't be the guy not doing anything at the end who gets stuck with documentation.

What's next for knowledge-direct

  • Incorporate NLP to generate better distances between publications
  • Add some sort of assessment of paper breadth/quality/accessibility
  • Transition to more scalable technology
  • Apply the approach to a bigger data set
