GitHub - renzzheng/ScannaBook: The simplest way to digitize your physical library. Scan your books to unlock a world of digital notes, search, and accessibility.

Inspiration

Sifting through thrift store books, or even cataloging your own bookshelf, can be cumbersome. We built ScannaBook to make both easier.

What it does

ScannaBook takes an image of a bookshelf and extracts the text from each book spine to get the title and, if available, the author. It then queries the Google Books API for each book's average rating and description and presents the results in a more digestible format on ScannaBook's home page.
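As a rough illustration of the lookup step, the sketch below builds a Google Books volumes search URL from a title and optional author, then picks the rating and description out of a response. The function names `build_query_url` and `summarize_volume` are hypothetical and are not taken from the ScannaBook code; the project's actual query format may differ.

```python
from urllib.parse import urlencode

# Public Google Books volumes search endpoint.
GOOGLE_BOOKS_URL = "https://www.googleapis.com/books/v1/volumes"


def build_query_url(title, author=None, max_results=1):
    """Build a volumes search URL for one book spine (hypothetical helper)."""
    terms = [f"intitle:{title}"]
    if author:
        terms.append(f"inauthor:{author}")
    params = {"q": " ".join(terms), "maxResults": max_results}
    return f"{GOOGLE_BOOKS_URL}?{urlencode(params)}"


def summarize_volume(response):
    """Return the fields the home page needs from a volumes response dict.

    Returns None when the search produced no items.
    """
    items = response.get("items") or []
    if not items:
        return None
    info = items[0].get("volumeInfo", {})
    return {
        "title": info.get("title"),
        "authors": info.get("authors", []),
        # averageRating and description are optional in the API response.
        "averageRating": info.get("averageRating"),
        "description": info.get("description"),
    }
```

The URL can be fetched with `requests.get(url).json()` and the resulting dict passed to `summarize_volume`. Books without ratings or descriptions come back with `None` for those fields.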

How we built it

For our frontend, we used TypeScript/React to create the components and the overall layout of our webpage. For the backend, we used AWS services and Pillow to extract the text, the Google Books API to retrieve information about each book, and Gemini to format the output to our liking. We then connected all of this to the frontend using Flask to create our fully working application.

Challenges we ran into

We had to learn the AWS console from scratch: incorporating Rekognition and S3, creating IAM profiles, and integrating these services into our project.
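For readers new to Rekognition, here is a minimal sketch of reading a text detection response. It assumes the response dict follows the documented `DetectText` shape (a `TextDetections` list whose entries carry `DetectedText`, `Type`, and `Confidence`). The helper name `extract_lines` and the confidence threshold are illustrative choices, not the project's.

```python
def extract_lines(response, min_confidence=80.0):
    """Collect line-level text from a Rekognition DetectText response.

    The response would typically come from a call such as:
        boto3.client("rekognition").detect_text(
            Image={"S3Object": {"Bucket": bucket, "Name": key}}
        )
    where bucket and key are placeholders for an uploaded shelf photo.
    """
    lines = []
    for detection in response.get("TextDetections", []):
        # Rekognition reports both whole lines and individual words;
        # keep only the lines to avoid duplicated text.
        if detection.get("Type") != "LINE":
            continue
        if detection.get("Confidence", 0.0) < min_confidence:
            continue
        lines.append(detection["DetectedText"])
    return lines
```

Filtering on `Type` avoids counting each word twice, since every `WORD` detection is also part of a parent `LINE`.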

We explored ways to clean up the book spine text before querying the Google Books API. Cropping each book out of the shelf image with Pillow kept text from neighboring spines from bleeding into the result and removed noise from the text.
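One way the cropping step could work is sketched below, assuming each spine's location is given as a Rekognition-style `BoundingBox` with `Left`, `Top`, `Width`, and `Height` expressed as fractions of the image size. The helper `bbox_to_crop_box` and its `padding` parameter are hypothetical; the project may locate spines differently.

```python
def bbox_to_crop_box(bbox, image_width, image_height, padding=0):
    """Convert a normalized bounding box to a pixel crop box.

    Returns (left, upper, right, lower), the tuple order that
    Pillow's Image.crop expects, clamped to the image bounds.
    """
    left = int(bbox["Left"] * image_width) - padding
    top = int(bbox["Top"] * image_height) - padding
    right = int((bbox["Left"] + bbox["Width"]) * image_width) + padding
    bottom = int((bbox["Top"] + bbox["Height"]) * image_height) + padding
    return (
        max(left, 0),
        max(top, 0),
        min(right, image_width),
        min(bottom, image_height),
    )
```

With Pillow, the result can be used as `image.crop(bbox_to_crop_box(bbox, *image.size))`. A small `padding` keeps letters at the edge of a spine from being cut off, at the cost of admitting a little of the neighboring spines.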

To separate titles from authors efficiently, we used Gemini to structure the extracted text as clean JSON rather than relying solely on regex cleanup. This also made the data easier to send to the frontend.
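The sketch below shows one way to consume such model output defensively. It assumes the model is asked to return a JSON array of objects with `title` and `author` keys, which is a schema invented here for illustration and not necessarily the one ScannaBook uses. It also tolerates the Markdown code fence that language models often wrap around JSON.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker


def parse_book_list(model_text):
    """Parse a model reply into a list of {"title", "author"} dicts.

    Assumes the reply is a JSON array, optionally wrapped in a
    Markdown code fence. Entries without a title are dropped.
    """
    text = model_text.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (it may carry a language tag).
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rstrip()
        if text.endswith(FENCE):
            text = text[: -len(FENCE)]
    books = []
    for entry in json.loads(text):
        title = (entry.get("title") or "").strip()
        if not title:
            continue
        # Authors are optional: spines do not always show one.
        author = (entry.get("author") or "").strip() or None
        books.append({"title": title, "author": author})
    return books
```

Validating the structure before passing it on means a malformed reply fails loudly in the backend (`json.loads` raises `json.JSONDecodeError`) instead of reaching the frontend.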

What's next for ScannaBook

ScannaBook excited us from the brainstorming phase onward: we had many ambitious ideas for UI elements and novel features, and for how they could interact in fun, creative, and user-friendly ways. One idea was to add a rotating carousel of the book spine images cropped using Rekognition. Each spine image could also be paired with a cover image to create a semi-three-dimensional representation of the books on our site.

Dependencies

Frontend

Install dependencies for the Next.js app:

npm install
# or
yarn install

Backend

Install Python dependencies for OCR, API calls, and cloud integration:

pip install -r requirements.txt

Or install the packages directly:

pip install boto3 google-auth google-auth-oauthlib google-api-python-client pillow requests python-dotenv

This is a Next.js project bootstrapped with create-next-app.

Getting Started

First, run the development server:

npm run dev
# or
yarn dev
# or
pnpm dev
# or
bun dev

Open http://localhost:3000 with your browser to see the result.

In a separate terminal, run the Python text extraction script that powers backend processing:

python3 extract_text.py

This script handles OCR and metadata extraction while the Next.js frontend runs. Make sure both the Next.js app and the Python script are running simultaneously for full functionality.

Ben Cave | Pedro Gomez | Timothy Jeon | Ren-Zhi Zheng
