Inspiration
I have been playing around with LLMs (Large Language Models) and am impressed by how accurately they handle the tasks they are given. I mostly use them for asking questions or summarizing information, and along the way I realized that LLMs are also good at extracting information and returning it in a specified format such as JSON. Combined with OCR and a document database such as MongoDB, the trio is complete and can be used to extract and store information from any type of document.
What it does
The end product of this project is an API that can be called to extract data from a document and store the result in a configured MongoDB collection. The solution consists of three parts: a frontend, a backend API, and the extraction API.
- An administrator uses the frontend to configure a prompt for each document type they want to extract data from, and to specify where in MongoDB the extracted information is stored. The frontend calls the backend to perform these administration tasks. You can find a link to the demo page in the GitHub README.
- A ready-to-use API can then be called by an integrator of another app to extract data from any document type that has already been configured. See the documentation on GitHub or the demo video for how to use the API. You can find a link to the demo API in the GitHub README.
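To make the flow above concrete, here is a minimal sketch of how an integrator might call the extraction API. The endpoint path, request fields, and configuration shape are assumptions for illustration; the real names are defined by the deployment described in the GitHub README.

```python
import json
import urllib.request

# Hypothetical shape of an administrator's configuration for one document
# type (stored via the backend): the extraction prompt and the target
# MongoDB database/collection. Field names are illustrative.
EXAMPLE_CONFIG = {
    "document_type": "invoice",
    "prompt": "Extract invoice_number, date, and total as JSON.",
    "mongodb_database": "extractions",
    "mongodb_collection": "invoices",
}

def build_extract_request(document_type: str, s3_key: str) -> bytes:
    """Build the JSON body an integrator would POST to the extraction API."""
    return json.dumps(
        {"document_type": document_type, "s3_key": s3_key}
    ).encode()

def call_extract_api(base_url: str, document_type: str, s3_key: str) -> dict:
    """POST a document reference to the (hypothetical) /extract endpoint
    and return the extracted fields the API sends back."""
    req = urllib.request.Request(
        f"{base_url}/extract",
        data=build_extract_request(document_type, s3_key),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The key point is that the integrator only names a pre-configured document type and a document location; the prompt and the MongoDB destination were already set up by the administrator.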
How we built it
We built the solution using various AWS services and MongoDB.
- In the frontend, we use JavaScript and React.
- In the backend, we use Python Flask, the LangChain community document loader for Amazon Textract, PyMongo for MongoDB, and boto3 for AWS services such as S3, Bedrock, Secrets Manager, and Textract.
- The API uses the same tech stack as the backend.
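The backend flow (Textract OCR via the LangChain loader, a prompt sent to Bedrock, a PyMongo write) can be sketched as follows. This is a pure-logic sketch: function and field names are illustrative, not the project's actual code, and the AWS/MongoDB calls are noted in comments.

```python
import json

def build_llm_prompt(configured_prompt: str, ocr_text: str) -> str:
    """Combine the admin-configured prompt with OCR output.

    In the real backend, ocr_text would come from LangChain's Amazon
    Textract document loader reading a file stored in S3.
    """
    return (
        f"{configured_prompt}\n\n"
        "Return ONLY a JSON object.\n\n"
        f"Document text:\n{ocr_text}"
    )

def parse_llm_reply(reply: str) -> dict:
    """Parse the model's JSON reply into a dict ready for MongoDB.

    In the real backend, reply would come from a Bedrock model invoked
    via boto3, and the resulting dict would be written to the configured
    collection with PyMongo's insert_one().
    """
    return json.loads(reply)

# Example of a single extraction round-trip (values are made up):
prompt = build_llm_prompt(
    "Extract invoice_number and total as JSON.",
    "INVOICE #42 ... TOTAL: 99.50",
)
doc = parse_llm_reply('{"invoice_number": "42", "total": 99.50}')
```

Because the prompt is configuration rather than code, supporting a new document type only requires the administrator to add a new prompt and a MongoDB destination; no backend changes are needed.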
Challenges we ran into
- We started this project late, so we had to prioritize and quickly deliver an MVP in time to submit on Devpost. This prioritization also forced us to leave out some basic features such as authentication.
- Originally, we also planned to build a demo app that uses the API to process a certain type of document, but due to time constraints we decided not to.
Accomplishments that we're proud of
We built this project in only three days (not full time each day) and successfully proved that we could create a configurable document extraction system leveraging OCR and Bedrock.
What we learned
LangChain, Bedrock, MongoDB, Textract, and React.
What's next for Document to MongoDB
See the What's Next section in the GitHub README.
Built With
- amazon-web-services
- atlas
- bedrock
- javascript
- langchain
- mongodb
- python
- react