Inspiration

Idealo offered a clear, concise problem to solve at HackHPI: detect whether a product offering on their site was "good", i.e. one that accurately matches what it is selling, or "bad", i.e. one that may be misleading or inappropriately titled. We attempted to solve this using machine learning methods we already knew or learnt about at the hackathon.

What it does

Our aim was to make a system that could look at a particular product offering and its associated attributes and classify it as a "good" or "bad" offering. Unfortunately, we weren't able to finish it.

How we built it (or wanted to)

As this was a binary classification task, we started by using a support vector machine (SVM) to classify our data - we simply used the one from scikit-learn. We then had to find ways of embedding product offerings so that the SVM could classify them. We had wanted to use a neural net to learn embeddings for the categories associated with products (left unimplemented), and to use scikit-learn to embed the semantic similarity of string data associated with the product, e.g. the similarity between the category and the title (also unimplemented). We also had to translate Idealo's data from German into English, using TextBlob.
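A minimal sketch of the category/title similarity idea feeding an SVM (not the hackathon code). It assumes TF-IDF cosine similarity as the only feature; the toy offers, the labels, the helper name similarity_features and the choice of C are all made up for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

# Hypothetical toy offers: (category, title, label), 1 = "good", 0 = "bad".
OFFERS = [
    ("running shoes", "lightweight running shoes for men", 1),
    ("running shoes", "trail running shoes size 42", 1),
    ("coffee machines", "espresso coffee machine with grinder", 1),
    ("coffee machines", "filter coffee machine 1.2 l", 1),
    ("running shoes", "usb charging cable 2 m", 0),
    ("running shoes", "garden hose 20 m", 0),
    ("coffee machines", "phone case black", 0),
    ("coffee machines", "bicycle helmet adult", 0),
]


def similarity_features(categories, titles, vectorizer):
    """Return one feature per offer: cosine similarity of category and title."""
    cat_vecs = vectorizer.transform(categories)
    title_vecs = vectorizer.transform(titles)
    # TfidfVectorizer L2-normalises rows by default, so the row-wise dot
    # product of the two matrices is the cosine similarity.
    sims = np.asarray(cat_vecs.multiply(title_vecs).sum(axis=1)).ravel()
    return sims.reshape(-1, 1)


categories = [c for c, _, _ in OFFERS]
titles = [t for _, t, _ in OFFERS]
labels = [y for _, _, y in OFFERS]

# Fit the vocabulary on every string we have, then build the feature matrix.
vectorizer = TfidfVectorizer().fit(categories + titles)
X_train = similarity_features(categories, titles, vectorizer)

# A large C (an assumption) makes the SVM separate this tiny toy set cleanly.
classifier = SVC(kernel="linear", C=1000.0).fit(X_train, labels)

# Two unseen, made-up offers: one matching its category, one not.
new_categories = ["running shoes", "coffee machines"]
new_titles = ["running shoes for women", "garden hose 30 m"]
X_new = similarity_features(new_categories, new_titles, vectorizer)
predictions = classifier.predict(X_new).tolist()
print(predictions)
```

A single similarity score is a very weak feature; a real version would add more of them, such as learned category embeddings, price or other product attributes.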

Challenges we ran into

We ran into a lot! Some of us had issues with our computers or our paths, or were new to ML. Throw these all together and you get a slower development process. Oh, and hunting for various APIs (such as one for translation) proved to be quite tedious.

What we learned

Chris: I've fiddled with Python, but this event gave me the first chance to seriously use it. Also, the only ML I had done up until this point was all in Octave, so it was nice to learn about some useful libraries. Daniel: I learnt to use SVMs and how to embed data (that was hard!). Floren: how to sleep less than three hours in three days. Fin: SVMs, semantic analysis.
