renzzheng/ScannaBook

 
 

Inspiration

Inspired by how cumbersome it can be to sift through thrift store books, or to even catalog your own personal bookshelf, we set out to create ScannaBook to aid in this venture.

What it does

ScannaBook takes in an image of a bookshelf and extracts the text from each book spine to get the title and author (if available). It then queries the Google Books API to retrieve the average rating and description for each book so that it can then be presented in a more digestible format on ScannaBook's home page.
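The lookup step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the public Google Books `volumes` endpoint is queried with `intitle:` and `inauthor:` filters, and the function names (`build_query_url`, `summarize_first_match`, `lookup_book`) are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GOOGLE_BOOKS_URL = "https://www.googleapis.com/books/v1/volumes"


def build_query_url(title, author=None):
    """Build a volumes search URL; the author filter is added only if present."""
    query = f"intitle:{title}"
    if author:
        query += f" inauthor:{author}"
    return f"{GOOGLE_BOOKS_URL}?{urlencode({'q': query, 'maxResults': 1})}"


def summarize_first_match(response):
    """Pull the fields shown on the home page from a parsed API response."""
    items = response.get("items") or []
    if not items:
        return None  # no match for this spine text
    info = items[0].get("volumeInfo", {})
    return {
        "title": info.get("title"),
        "authors": info.get("authors", []),
        "average_rating": info.get("averageRating"),  # missing for unrated books
        "description": info.get("description"),
    }


def lookup_book(title, author=None):
    """Perform the request (needs network access)."""
    with urlopen(build_query_url(title, author), timeout=10) as resp:
        return summarize_first_match(json.load(resp))
```

Keeping the URL building and response parsing separate from the network call makes both easy to check without hitting the API. Note that `averageRating` and `description` are optional in the response, so the caller should expect `None` for either.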

How we built it

For the frontend, we used TypeScript and React to build the components and overall layout of the webpage. For the backend, we used AWS services and Pillow to extract the text, the Google Books API to retrieve information about each book, and Gemini to format the output to our liking. Flask connects all of this to the frontend to form the complete working application.

Challenges we ran into

We had to learn the AWS console from scratch: setting up Rekognition and S3, creating IAM profiles, and integrating these services into our project.
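For readers new to these services, a text detection call against an image stored in S3 might look like the sketch below. It assumes boto3's `detect_text` operation with AWS credentials already configured; the bucket and key arguments, the confidence threshold, and the helper names are illustrative and not taken from the project.

```python
def filter_lines(response, min_confidence=80.0):
    """Keep LINE-level detections at or above the confidence threshold."""
    lines = []
    for det in response.get("TextDetections", []):
        if det.get("Type") == "LINE" and det.get("Confidence", 0.0) >= min_confidence:
            lines.append({
                "text": det["DetectedText"],
                # Bounding box values are ratios of the image width and height.
                "box": det["Geometry"]["BoundingBox"],
            })
    return lines


def detect_spine_lines(bucket, key, min_confidence=80.0):
    """Run Rekognition on an S3 object (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so filter_lines works without AWS set up

    client = boto3.client("rekognition")
    response = client.detect_text(
        Image={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    return filter_lines(response, min_confidence)
```

Rekognition reports both `LINE` and `WORD` detections for the same text, so filtering on `LINE` avoids counting each word twice.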

We explored ways to clean up the book spine text before querying the Google Books API, cropping individual books with Pillow so that text from neighboring spines would not bleed together and to remove noise.
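One way to do this kind of cropping is sketched below. It assumes each spine comes with a Rekognition-style bounding box whose `Left`, `Top`, `Width`, and `Height` values are ratios of the image size; the `padding` parameter and both function names are hypothetical additions.

```python
def bbox_to_pixels(bbox, width, height):
    """Convert a ratio-based bounding box to a (left, top, right, bottom) pixel box."""
    left = round(bbox["Left"] * width)
    top = round(bbox["Top"] * height)
    right = round((bbox["Left"] + bbox["Width"]) * width)
    bottom = round((bbox["Top"] + bbox["Height"]) * height)
    # Clamp to the image, since detected boxes can extend slightly past the edges.
    return (
        max(0, min(left, width)),
        max(0, min(top, height)),
        max(0, min(right, width)),
        max(0, min(bottom, height)),
    )


def crop_spine(image, bbox, padding=0):
    """Crop one spine from a Pillow image, optionally widening the box by padding pixels."""
    width, height = image.size
    left, top, right, bottom = bbox_to_pixels(bbox, width, height)
    box = (
        max(0, left - padding),
        max(0, top - padding),
        min(width, right + padding),
        min(height, bottom + padding),
    )
    return image.crop(box)
```

With Pillow installed, usage would be along the lines of `crop_spine(Image.open("shelf.jpg"), box, padding=4)` after `from PIL import Image`, where `shelf.jpg` is a placeholder file name.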

To separate titles from authors efficiently, we used Gemini to structure the data as clean JSON rather than relying solely on regex to clean the text. This also made the results easier to send to the frontend.
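The sketch below shows one way the model's reply could be turned into validated records before it reaches the frontend. It makes several assumptions: that the model is prompted to return a JSON array of objects with `title` and `author` keys, and that the reply may arrive wrapped in a Markdown code fence. The call to Gemini itself is omitted, and the function names are hypothetical.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker


def strip_code_fence(text):
    """Remove a surrounding Markdown code fence, if the reply has one."""
    text = text.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line, which may carry a language tag.
        text = text.split("\n", 1)[1] if "\n" in text else ""
        if text.rstrip().endswith(FENCE):
            text = text.rstrip()[: -len(FENCE)]
    return text.strip()


def parse_books_reply(reply_text):
    """Parse the reply into a list of {"title", "author"} dicts."""
    data = json.loads(strip_code_fence(reply_text))
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of books")
    books = []
    for entry in data:
        if not isinstance(entry, dict):
            continue  # skip anything that is not a book record
        title = str(entry.get("title") or "").strip()
        if not title:
            continue  # a record without a title cannot be looked up
        author = entry.get("author")
        books.append({
            "title": title,
            "author": str(author).strip() if author else None,
        })
    return books
```

Validating the structure here means a malformed reply fails loudly in the backend instead of producing broken entries on the page.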

What's next for ScannaBook

ScannaBook excited us from the brainstorming phase onward: we had many ambitious ideas for UI elements and novel features, and for how they could interact in fun, creative, and user-friendly ways. One idea was to add a rotating carousel of the book spine images cropped from the Rekognition results. Each spine image could also be paired with a cover image to create a semi-three-dimensional representation of the books on our site.


Dependencies

Frontend

  • Install dependencies for the Next.js app:
npm install
# or
yarn install

Backend

  • Install Python dependencies for OCR, API calls, and cloud integration:
pip install -r requirements.txt

OR

pip install boto3 google-auth google-auth-oauthlib google-api-python-client pillow requests python-dotenv

This is a Next.js project bootstrapped with create-next-app.

Getting Started

First, run the development server:

npm run dev
# or
yarn dev
# or
pnpm dev
# or
bun dev

Open http://localhost:3000 with your browser to see the result.

In a separate terminal, run the Python text extraction script that powers backend processing:

python3 extract_text.py

This script handles OCR and metadata extraction while the Next.js frontend runs. Make sure both the Next.js app and the Python script are running simultaneously for full functionality.


Ben Cave | Pedro Gomez | Timothy Jeon | Ren-Zhi Zheng

About

The simplest way to digitize your physical library. Scan your books to unlock a world of digital notes, search, and accessibility.

Languages

  • TypeScript 62.6%
  • Python 33.8%
  • JavaScript 2.0%
  • CSS 1.6%