crawlzone

Crawlzone is a fast asynchronous internet crawling framework for PHP.

Results 10 crawlzone issues

Bumps [guzzlehttp/guzzle](https://github.com/guzzle/guzzle) from 7.4.1 to 7.4.5. Release notes Sourced from guzzlehttp/guzzle's releases. Release 7.4.5 See change log for changes. Release 7.4.4 See change log for changes. Release 7.4.3 See change...

dependencies

Bumps [guzzlehttp/psr7](https://github.com/guzzle/psr7) from 2.1.0 to 2.2.1. Release notes Sourced from guzzlehttp/psr7's releases. 2.2.1 See change log for changes. 2.2.0 See change log for changes. 2.1.2 See change log for changes....

dependencies

In command-line mode, how can I save the crawled results?

Would it be possible to exclude subdomains from the scan, taking second-level TLDs such as *.co.uk into account? (Maybe via an array containing only TLDs.)

Is there any way to use a Closure for the deny and allow options? I think we should use a Closure/function for that so we can check against a database, etc. See https://github.com/spatie/crawler#filtering-certain-urls

enhancement

Hi, is it possible to replace the default link extractor? Maybe by removing the default ExtractAndQueueLinks extension and re-adding it with my own link extractor?

enhancement

Disable crawling of mailto links

bug

Create a handler that can execute JavaScript on the pages.

enhancement

Bumps [guzzlehttp/psr7](https://github.com/guzzle/psr7) from 2.1.0 to 2.5.0. Release notes Sourced from guzzlehttp/psr7's releases. 2.5.0 See change log for changes. 2.4.5 See change log for changes. 2.4.4 See change log for changes....

dependencies