crawlzone
Crawlzone is a fast asynchronous internet crawling framework for PHP.
Bumps [guzzlehttp/guzzle](https://github.com/guzzle/guzzle) from 7.4.1 to 7.4.5. Release notes (sourced from guzzlehttp/guzzle's releases): see the change log for releases 7.4.5, 7.4.4, 7.4.3...
Bumps [guzzlehttp/psr7](https://github.com/guzzle/psr7) from 2.1.0 to 2.2.1. Release notes (sourced from guzzlehttp/psr7's releases): see the change log for releases 2.2.1, 2.2.0, 2.1.2...
How can the crawled results be saved when running in command-line mode?
Would it be possible to exclude subdomains from the scan while correctly handling second-level TLDs such as *.co.uk? (Maybe via an array of TLDs only.)
Is there any way to use a Closure for the deny and allow options? I think a Closure/function should be supported so that URLs can be checked against a database, etc. See https://github.com/spatie/crawler#filtering-certain-urls
Hi, is it possible to replace the default link extractor? Maybe by removing the default ExtractAndQueueLinks extension and re-adding it with my own link extractor?
Create a handler that can execute JavaScript on the pages.
Bumps [guzzlehttp/psr7](https://github.com/guzzle/psr7) from 2.1.0 to 2.5.0. Release notes (sourced from guzzlehttp/psr7's releases): see the change log for releases 2.5.0, 2.4.5, 2.4.4...