JsonGenius is a robust, self-hosted, AI-powered scraping API written in Go that makes it easy to extract structured data (defined by a JSON Schema) from any webpage. It uses Chromium to render pages like a normal browser, so it works on complex sites.
With JsonGenius and Docker, you can set up a scraping API to pull data from sites in just a few minutes. Define the schema you want, send a URL, and JsonGenius will return extracted data matching your schema. This makes it easy to collect and work with all kinds of web data.
How to use it:
1. Clone JsonGenius from GitHub and navigate to the jsongenius directory:
git clone https://github.com/semanser/jsongenius
cd jsongenius
2. Set your OpenAI API key as an environment variable:
export OPEN_AI_KEY=...
3. Run docker-compose up. The API will then be available at http://localhost:3001.
4. To scrape a page, send a POST request with its URL and a JSON Schema describing the data you want extracted:
curl -X POST -H "Content-Type: application/json" -d '{
  "url": "/path/to/",
  "schema": {
    "type": "object",
    "properties": {
      "products": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "name": {
              "type": "string",
              "description": "The product name"
            },
            "price": {
              "type": "number",
              "description": "The price of the product in USD"
            }
          }
        }
      }
    }
  }
}' http://localhost:3001/lookup
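
The same request can also be sent from a script. The following is a minimal Python sketch using only the standard library, not an official client. It assumes the endpoint and port from the curl example and that the API replies with a JSON body. The response shape is not documented here, so the result is returned as-is. The names build_request, scrape, API_URL and PRODUCT_SCHEMA are hypothetical and introduced only for illustration.

```python
import json
import urllib.request

# Assumption: same host, port and path as in the curl example.
API_URL = "http://localhost:3001/lookup"

# The schema from the curl example, written as a Python dict.
PRODUCT_SCHEMA = {
    "type": "object",
    "properties": {
        "products": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {
                        "type": "string",
                        "description": "The product name",
                    },
                    "price": {
                        "type": "number",
                        "description": "The price of the product in USD",
                    },
                },
            },
        }
    },
}


def build_request(page_url, schema, api_url=API_URL):
    """Build the POST request without sending it (hypothetical helper)."""
    body = json.dumps({"url": page_url, "schema": schema}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def scrape(page_url, schema, api_url=API_URL, timeout=120):
    """Send the request and return the decoded JSON response.

    Assumption: the API answers with a JSON body. Its exact structure
    is not specified here, so it is returned unchanged.
    """
    request = build_request(page_url, schema, api_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the containers running, calling scrape("/path/to/", PRODUCT_SCHEMA) sends the same payload as the curl command, where "/path/to/" stands in for the page URL just as it does above. The timeout default is a guess, chosen generously because rendering and extraction may take a while.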









