Inspiration

I picked up machine knitting a few years ago because it allows you to create garments in a semi-automated way. Knitting machines are no longer produced, but the creative community around them is huge. As the machines disappeared, so did most of the magazines that provided users with learning tips and patterns. As a machine knitter myself, I constantly struggle to find patterns for what I want to make, and it is a big creative blocker. Ravelry (a community site for fellow knitters, crocheters, etc.) has a small batch of patterns to follow, but the real gold mine is in the vintage machine knitting magazines that have been scanned and are now available online (see https://mkmanuals.com). There are hundreds of magazines containing probably thousands of patterns, but in their current format (PDFs) it is impossible to quickly find a desired design. In the world of machine knitting, vast arrays of tacit and implicit knowledge are trapped in old documents.

What it does

A simple frontend, https://yarnyard-frontend.vercel.app/, provides a quick way to visually browse the designs available in the magazines we processed. Category buttons let you filter by garment type. Clicking on an image shows its source and redirects you to the source magazine. The text extracted alongside the image is also displayed.

How we built it

For this demonstration, we used a batch of 40 magazines.

We applied YOLOv7 for extracting the relevant images and built a custom image classifier for garment types, trained on data taken from the Ravelry API using a ResNet-18 backbone.
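The extraction step can be sketched as a filter over YOLOv7 detections: photos of modelled garments tend to be large, high-confidence "person" boxes, while knitting charts and diagrams are not. The `Detection` type, thresholds, and function name below are illustrative assumptions, not the project's actual code:

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """One YOLOv7 detection: class label, confidence, and pixel box."""
    label: str
    confidence: float
    box: tuple  # (x1, y1, x2, y2)

def select_design_crops(detections, min_conf=0.6, min_area=90_000):
    """Keep detections likely to be photos of modelled garments.

    Heuristic sketch: 'person' boxes above a confidence and size
    threshold are usually design photos; small or low-confidence
    boxes tend to be charts, diagrams, or page furniture.
    """
    crops = []
    for det in detections:
        x1, y1, x2, y2 = det.box
        area = (x2 - x1) * (y2 - y1)
        if det.label == "person" and det.confidence >= min_conf and area >= min_area:
            crops.append(det.box)
    return crops
```

The selected boxes can then be cut out with an image library (e.g. Pillow's `Image.crop`) before being passed to the garment-type classifier.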

The text from the PDFs is extracted via OCR using Tesseract.
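A minimal sketch of that step, assuming the `pdf2image` and `pytesseract` packages (plus the Tesseract binary) are available; the cleanup helper is a hypothetical post-processing pass, not part of the project's pipeline:

```python
import re

def ocr_pdf(pdf_path: str) -> str:
    """Render each PDF page to an image and run Tesseract over it.

    pdf2image and pytesseract are imported lazily so the cleanup
    helper below stays dependency-free.
    """
    from pdf2image import convert_from_path
    import pytesseract

    pages = convert_from_path(pdf_path, dpi=300)
    return "\n".join(pytesseract.image_to_string(page) for page in pages)

def clean_ocr_text(text: str) -> str:
    """Light post-processing for noisy OCR output: drop stray junk
    characters and collapse runs of whitespace."""
    text = re.sub(r"[^\w\s.,;:()'\"/-]", " ", text)
    return re.sub(r"\s+", " ", text).strip()
```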

The extracted data is pushed to GCS buckets and a PostgreSQL database.
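One way that loading step could be organised — the bucket layout, `designs` table, and column names here are assumptions for illustration, not the project's actual schema:

```python
def blob_path(magazine_id: str, page: int, image_idx: int) -> str:
    """Deterministic GCS object name for an extracted image.

    Assumed layout: one folder per magazine, one object per crop.
    """
    return f"magazines/{magazine_id}/page{page:03d}/img{image_idx:02d}.png"

# Parameterized insert for a hypothetical `designs` table; with
# psycopg2 you would run `cursor.execute(INSERT_DESIGN, row)`.
INSERT_DESIGN = (
    "INSERT INTO designs (magazine_id, page, gcs_path, garment_type, ocr_text) "
    "VALUES (%s, %s, %s, %s, %s)"
)

def design_row(magazine_id: str, page: int, image_idx: int,
               garment_type: str, ocr_text: str) -> tuple:
    """Assemble the parameter tuple matching INSERT_DESIGN."""
    return (magazine_id, page, blob_path(magazine_id, page, image_idx),
            garment_type, ocr_text)
```

Keeping the GCS path in the database row is what lets the frontend jump from a thumbnail straight back to the source image and magazine.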

The frontend is deployed on Vercel.

Challenges we ran into

We tried deep layout parsing for PDF layout extraction, but it did not give good results. The source magazines contain many images, including technical graphs and knitting charts, and it was not easy to filter those out using deep layout parsing.

Our initial scope was too big for the time given. We first wanted to find images of designs and match each one to its exact instructions, but this turned out to be more complex than we initially thought, most probably due to the varying layouts of the different magazines.

We built some simple classifiers for garment types using Bing and Ravelry data, but none of them gave great results, so a better classifier is definitely needed.

The quality of the OCRed text is poor.

Accomplishments that we're proud of

  • We found a fast way to extract images showing designs
  • We have a working prototype of a frontend
  • We have high-quality extracted images of the people modelling designs in the magazines
  • We have fast display of the content that allows easy browsing and locating designs within magazines
  • We have a process for expanding and improving the text using an LLM

What we learned

We learnt some new APIs and tested some new models. We played with Discord, Cohere, Assembly, Uberduck, and OpenAI.

What's next for YarnYard

  • Putting the extraction process into a data pipeline and processing the remaining magazines
  • Improving OCR by training our own model
  • Replacing text abbreviations and making instructions more user-friendly using an AI language model; there is an example on the website of what can be done
  • Finding the matching instructions for a design image
  • Building a better classifier for garment type
  • Adding free-text search
  • Extracting images with knitting charts and building a search site
  • Generating knitting patterns from text commands
  • Building your own knitting pattern from images, lowering the skill barrier to making new patterns
  • Search by image and search by pattern with a mobile application
  • Capturing audio and browsing the myriad of online instruction on YouTube; there is a lot of great old content, but it is scattered around and very poorly indexed or searchable
  • A pattern store and sales
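The abbreviation-expansion idea could start as a simple glossary lookup before an LLM smooths the result into fluent prose. A sketch using a few standard knitting abbreviations; the real glossary would be far larger and magazine-specific:

```python
import re

# Common knitting abbreviations -- a small, standard subset.
ABBREVIATIONS = {
    "k": "knit",
    "p": "purl",
    "st": "stitch",
    "sts": "stitches",
    "tog": "together",
    "yo": "yarn over",
    "rep": "repeat",
    "beg": "beginning",
}

def expand_abbreviations(instruction: str) -> str:
    """Replace whole-word abbreviations with their expansions.

    A rule-based baseline; an LLM pass could then rewrite the
    expanded text into natural, beginner-friendly instructions.
    """
    def repl(match):
        word = match.group(0)
        return ABBREVIATIONS.get(word.lower(), word)
    return re.sub(r"[A-Za-z]+", repl, instruction)
```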
