Inspiration

Sika® is a specialty chemicals company. Like many other industry leaders, it lags behind the technological and digital curve.

Imagine: files are still shared across silos and teams through PDFs and emails. Millions of files containing knowledge and data are spread across a vast network of subsidiaries. Even paying customers have a hard time accessing relevant product information.

To fill in the communication gaps and provide common access to information, we introduce SikaSeek.

What it does

SikaSeek is a Sika® knowledge base powered by AI. We indexed almost 3000 files (.pdf, .docx) containing Sika® brochures, product data sheets, and safety data sheets, along with a number of official Sika® webpages, and connected the index to a large language model to make the data accessible through a simple search bar.

What we actually do:

  • Natural language processing. You can ask questions about Sika® and its products in everyday language. Powered by GPT-3.5.
  • Get answers about anything. Whether you are looking for the safety data sheet for Sika® Microcrete-3000 or just a guide on how to fix your toilet, we have got you covered.
  • Multilingual support. SikaSeek is capable of understanding and responding to messages in your language.
  • Providing data sources. SikaSeek attaches relevant documents and URLs to every search result, so you can check the accuracy of the answer yourself.
  • Search indexed files. The current index consists of ~3000 files and many webpages.
  • Suits different users. It can be used both internally and externally.
  • Vector storage. Embed and retrieve unstructured data in various formats: .pdf, .docx, .jpg, etc.
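
To illustrate the idea behind vector storage and source attribution, here is a minimal sketch in plain Python. It is not the SikaSeek implementation: the word-count "embedding" is a toy stand-in for a real embedding model, and the file names and contents are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, index: dict, top_k: int = 2) -> list:
    """Return up to top_k (source, score) pairs, best match first."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, vec), source) for source, vec in index.items()]
    scored.sort(reverse=True)
    # Drop documents that share no terms with the query.
    return [(source, round(score, 3)) for score, source in scored[:top_k] if score > 0]


# Hypothetical documents; a real index would hold text extracted from the files.
docs = {
    "microcrete_sds.pdf": "Safety data sheet for Microcrete repair mortar: hazards, handling, storage",
    "sealant_brochure.docx": "Brochure for bathroom sealant: sealing joints around a toilet or sink",
    "flooring_guide.pdf": "Guide to industrial flooring systems and epoxy coatings",
}
index = {source: embed(text) for source, text in docs.items()}

results = search("safety sheet for Microcrete", index)
for source, score in results:
    print(source, score)
```

Each result carries the name of the file it came from, which is what lets the answer be shown together with its sources.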

How we built it

  • LlamaIndex is used to facilitate data ingestion, structuring and retrieval.
  • Python and FastAPI are used to build the backend.
  • TypeScript and Next.js are used for the frontend.
  • DigitalOcean is used for hosting and deploying our solution.
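
As a rough illustration of how retrieved passages might be handed to the language model, the sketch below assembles a prompt with numbered sources. The function name, the prompt wording, and the sample passage are assumptions made for this example; in the project itself this step is handled through LlamaIndex.

```python
def build_prompt(question: str, chunks: list[tuple[str, str]]) -> str:
    """Build a grounded prompt from (source, text) pairs returned by retrieval."""
    context = "\n".join(
        f"[{number}] ({source}) {text}"
        for number, (source, text) in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "Cite sources by their bracketed number and reply in the language of the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical retrieved passage.
prompt = build_prompt(
    "How should the product be stored?",
    [("microcrete_sds.pdf", "Store in a dry place in the original sealed packaging.")],
)
print(prompt)
```

Numbering the passages makes it possible to map citations in the answer back to the documents shown to the user.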

Challenges we ran into

  • Gaining an overall understanding of how to process different file formats and store the retrieved knowledge.
  • Learning how to tune an LLM and apply it to a specific use case, such as a knowledge base.
  • Using the OpenAI API without exceeding rate limits, while staying within the subscription budget.
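
One common way to cope with rate limits is to retry with exponential backoff. The sketch below shows the pattern in plain Python; `RateLimitError` is a hypothetical stand-in for whatever error the API client raises on a rate-limit response, and the default retry count and delays are illustrative rather than the values we used.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error an API client raises when rate limited."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponentially growing delays."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            # Wait base_delay * 2^attempt seconds plus random jitter, then retry.
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
```

A caller would wrap each request, for example `call_with_backoff(lambda: ask_model(question))`, where `ask_model` is a placeholder for the actual API call. Caching answers to repeated questions is another way to keep spending within budget.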

Accomplishments that we're proud of

Developing a working solution that improves information accessibility for Sika®, saves time and resources, and provides multilingual support.

What we learned

On top of developing this solution and learning how to make the tools listed above work, we have definitely improved the way we talk to ChatGPT in order to get the most accurate results possible.

What's next for SikaSeek

  • Support for searching through other file formats.
  • Collecting data about user queries and results to improve knowledge sharing. For example, if a specific product is searched for often but Sika® lacks information about it, the need for that information will be detected.
  • Document uploading for authorized Sika® employees.
  • Voice request processing.
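
The query analytics idea from the list above could look roughly like the sketch below. This is planned work, not something we have built: the log format, the thresholds, and the topic names are all assumptions made for illustration.

```python
from collections import Counter


def find_knowledge_gaps(query_log, min_searches=3, min_score=0.3):
    """Return topics that are searched often but mostly get weak matches.

    query_log is an iterable of (topic, best_match_score) pairs, where the
    score is the relevance of the best document found for that query.
    """
    searches = Counter()
    weak_results = Counter()
    for topic, best_score in query_log:
        searches[topic] += 1
        if best_score < min_score:
            weak_results[topic] += 1
    return sorted(
        topic
        for topic, count in searches.items()
        if count >= min_searches and weak_results[topic] / count > 0.5
    )


# Hypothetical log entries: (topic, relevance of the best retrieved document).
log = [
    ("product A", 0.10), ("product A", 0.20), ("product A", 0.05),
    ("product B", 0.90), ("product B", 0.80), ("product B", 0.85),
    ("product C", 0.10),
]
gaps = find_knowledge_gaps(log)
print(gaps)
```

In this sample, only "product A" is reported: it is searched three times and never gets a good match, while "product C" is searched too rarely to count.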
