Inspired by how cumbersome it can be to sift through thrift store books, or even to catalog your own bookshelf, we set out to create ScannaBook to make both tasks easier.
ScannaBook takes in an image of a bookshelf and extracts the text from each book spine to get the title and author (if available). It then queries the Google Books API to retrieve the average rating and description for each book so that it can then be presented in a more digestible format on ScannaBook's home page.
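The lookup step described above can be sketched as follows. This is a minimal illustration, not ScannaBook's actual code: the helper names `build_query_url` and `summarize_volume` are hypothetical, and the sketch assumes the public Google Books volumes endpoint with its `intitle:` and `inauthor:` search keywords.

```python
from urllib.parse import urlencode

# Public Google Books volumes search endpoint.
GOOGLE_BOOKS_URL = "https://www.googleapis.com/books/v1/volumes"


def build_query_url(title, author=None, max_results=1):
    """Build a search URL from a spine's title and optional author.

    Hypothetical helper for illustration only.
    """
    terms = [f"intitle:{title}"]
    if author:
        terms.append(f"inauthor:{author}")
    params = {"q": " ".join(terms), "maxResults": max_results}
    return f"{GOOGLE_BOOKS_URL}?{urlencode(params)}"


def summarize_volume(item):
    """Pick the fields shown to the user out of one search result item."""
    info = item.get("volumeInfo", {})
    return {
        "title": info.get("title"),
        "authors": info.get("authors", []),
        "description": info.get("description"),
        # averageRating is missing for books that have no ratings yet.
        "average_rating": info.get("averageRating"),
    }
```

The URL can be fetched with `requests.get(url).json()`, and each entry of the response's `items` list passed to `summarize_volume`. Missing ratings and descriptions come back as `None`, so the frontend can decide how to display them.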
For our frontend, we used TypeScript/React to create the components and the overall layout of our webpage. For the backend, we used AWS services and Pillow to extract the text, the Google Books API to give us information about each book, and Gemini to format the output to our liking. We then connected all of this to the frontend using Flask to create our fully working application.
We overcame challenges while learning the AWS console: incorporating Rekognition and S3, creating IAM profiles, and integrating these services into our project.
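For readers new to these services, a text-detection call against an image stored in S3 might look like the sketch below. It assumes `client` behaves like `boto3.client("rekognition")`; the function name `detect_spine_lines` and the confidence threshold are illustrative choices, not taken from the project.

```python
def detect_spine_lines(client, bucket, key, min_confidence=80.0):
    """Return LINE-level text detections for an image stored in S3.

    `client` is assumed to behave like boto3.client("rekognition").
    Hypothetical helper for illustration only.
    """
    response = client.detect_text(
        Image={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    lines = []
    for detection in response.get("TextDetections", []):
        # Rekognition reports both LINE and WORD entries; keep whole lines only.
        if detection["Type"] != "LINE":
            continue
        # Skip low-confidence detections, which are often noise.
        if detection["Confidence"] < min_confidence:
            continue
        lines.append({
            "text": detection["DetectedText"],
            "confidence": detection["Confidence"],
            # Box coordinates are ratios of the image width and height.
            "box": detection["Geometry"]["BoundingBox"],
        })
    return lines
```

Passing the client in as a parameter keeps the function easy to exercise with a stub, which helps when IAM credentials are not yet configured. With real credentials, the call would be along the lines of `detect_spine_lines(boto3.client("rekognition"), "my-bucket", "shelf.jpg")`, where the bucket and key names are placeholders.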
We explored ways to clean up book spine text before querying the Google Books API, cropping individual books with Pillow to keep text from neighboring spines from bleeding in and to remove noise.
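A rough sketch of the cropping idea is shown below. It assumes each spine's location arrives as a Rekognition-style bounding box, whose `Left`, `Top`, `Width`, and `Height` values are ratios of the image size, and that `image` is a Pillow `Image`. The function names and the `margin` parameter are hypothetical.

```python
def to_pixel_box(box, width, height, margin=0):
    """Convert a ratio-based bounding box to a (left, top, right, bottom) pixel box.

    `margin` pads the box by that many pixels on every side, clamped to the image.
    """
    left = max(0, round(box["Left"] * width) - margin)
    top = max(0, round(box["Top"] * height) - margin)
    right = min(width, round((box["Left"] + box["Width"]) * width) + margin)
    bottom = min(height, round((box["Top"] + box["Height"]) * height) + margin)
    return (left, top, right, bottom)


def crop_spine(image, box, margin=0):
    """Crop one spine out of a Pillow image (anything with .size and .crop works)."""
    width, height = image.size
    return image.crop(to_pixel_box(box, width, height, margin))
```

With Pillow, an image loaded via `Image.open("shelf.jpg")` (a placeholder file name) can be passed straight to `crop_spine`. Running text detection again on each crop means every result contains text from a single book only.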
To separate titles and authors reliably, we used Gemini to structure the extracted text as clean JSON rather than relying solely on regex. This also made the data easier to send to the frontend.
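One way to handle the model's reply is sketched below. The prompt wording, the `title` and `author` keys, and the helper names are assumptions made for illustration; the actual call to Gemini is left out, so `reply` stands for whatever text the model returned.

```python
import json

# Markdown code fence that models sometimes wrap around JSON output.
FENCE = "`" * 3


def build_prompt(spine_texts):
    """Ask for a strict JSON array so the reply can be parsed without regex."""
    listing = "\n".join(f"- {text}" for text in spine_texts)
    return (
        "Each line below is OCR text from one book spine. "
        'Reply with only a JSON array of objects with keys "title" and "author" '
        "(use null when the author is missing).\n" + listing
    )


def parse_books(reply):
    """Turn the model's reply into a list of {"title", "author"} dicts."""
    text = reply.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (with its optional language tag)
        # and everything from the closing fence onward.
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of books")
    books = []
    for entry in data:
        title = (entry.get("title") or "").strip()
        if not title:
            continue  # a book without a title cannot be looked up
        books.append({"title": title, "author": entry.get("author") or None})
    return books
```

Validating the reply before it reaches the frontend means a malformed response fails loudly in the backend instead of producing empty cards in the UI.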
ScannaBook excited us from the brainstorming phase, when we had many ambitious ideas for UI elements and novel features, and for how they could interact with one another in fun, creative, and user-friendly ways. One of our ideas was to add a rotating carousel of the book spine images extracted using Rekognition. The spine images could also be paired with cover images to create a semi-three-dimensional representation of the books on our site.
Frontend
- Install dependencies for the Next.js app:
npm install
# or
yarn install
Backend
- Install Python dependencies for OCR, API calls, and cloud integration:
pip install -r requirements.txt
OR
pip install boto3 google-auth google-auth-oauthlib google-api-python-client pillow requests python-dotenv
This is a Next.js project bootstrapped with create-next-app.
First, run the development server:
npm run dev
# or
yarn dev
# or
pnpm dev
# or
bun dev
Open http://localhost:3000 with your browser to see the result.
In a separate terminal, run the Python text extraction script that powers backend processing:
python3 extract_text.py
This script handles OCR and metadata extraction while the Next.js frontend runs. Make sure both the Next.js app and the Python script are running simultaneously for full functionality.
Ben Cave | Pedro Gomez | Timothy Jeon | Ren-Zhi Zheng

