Overview

My program is a web crawler and simple search engine written in Go. It crawls the URLs it is given, extracts the words on each page, and stores them in a SQL database. Users can then search for specific words, and the program returns links to the pages most relevant to those search terms.

Web Crawling: Efficiently crawls web pages, extracts words, and stores them in a SQL database.
Search Engine: Allows users to search for specific terms and retrieves the most relevant pages.
Sitemap Handling: Supports fetching and parsing sitemaps to discover new URLs for crawling.
Robots.txt Compliance: Respects website crawling policies defined in robots.txt files.
TF-IDF Calculation: Implements Term Frequency-Inverse Document Frequency (TF-IDF) to rank pages based on search terms.
Image Search: Searches for images related to the given search terms.
