Walkthrough
Adds per-source HTTP request tracking: a new `Requests` counter on each source, incremented before every HTTP/API call and reported through `Statistics()`, plus a Requests column in the runner's stats output.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Runner
    participant Source
    participant HTTP
    Runner->>Source: Run()
    Source->>Source: s.requests = 0
    loop per request
        Source->>Source: s.requests++ (or atomic.Add)
        Source->>HTTP: perform HTTP/API call
        HTTP-->>Source: response
        Source->>Source: parse results, s.results++ / s.errors++ (atomic where used)
    end
    Source-->>Runner: Statistics{Errors, Results, Requests, TimeTaken, Skipped}
    Runner->>Runner: format line with Requests column
    Runner-->>Console: print stats table (includes Requests)
    Note right of Runner: New column added to overall output
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 1
In pkg/subscraping/sources/gitlab/gitlab.go:
- Around lines 94-98: Concurrent goroutines incrementing `s.requests` in the goroutine started in the anonymous func cause a data race. Change the `Source.requests` field to `int64` and import `sync/atomic`, replace the direct increment `s.requests++` with `atomic.AddInt64(&s.requests, 1)` inside the goroutine, and update `Statistics()` (or any other reader) to use `atomic.LoadInt64(&s.requests)` so reads and writes are atomic and race-free.
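The fix shape described above can be sketched as follows. This is a minimal, self-contained illustration, not the real gitlab source: the `Source` type here is a stand-in, and `run`/`Requests` are hypothetical names showing where the atomic add and load belong.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Source is a stand-in for the real type: a counter shared by
// concurrent goroutines must be an int64 updated via sync/atomic.
type Source struct {
	requests int64
}

// run spawns workers that each record one request, mirroring the
// goroutine fan-out the review flags as racy with a plain increment.
func (s *Source) run(workers int) {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&s.requests, 1) // replaces the racy s.requests++
		}()
	}
	wg.Wait()
}

// Requests reads the counter atomically, as Statistics() should.
func (s *Source) Requests() int64 {
	return atomic.LoadInt64(&s.requests)
}

func main() {
	s := &Source{}
	s.run(100)
	fmt.Println(s.Requests()) // 100
}
```

Run with `go run -race` to confirm the data race is gone; the plain `s.requests++` version fails the race detector under the same load.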
⚠️ Caution: Some comments are outside the diff and can't be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
pkg/subscraping/sources/redhuntlabs/redhuntlabs.go (1)
60-115: Pre-existing issue: page 1 is requested twice when pagination is needed.

When `ResultCount > pageSize`, the pagination logic has a pre-existing inefficiency:
- Lines 60-62: the initial request fetches page 1
- Line 79: pagination is detected as needed
- Lines 81-115: the loop starts from `page = 1` and re-requests page 1 at line 88
- The subdomains from the initial request (lines 60-62) are never processed in the pagination branch

This results in an unnecessary duplicate request for page 1. The loop at line 81 should start from `page = 2` instead, and the first request's subdomains should be processed before entering the pagination loop.

Note: This is a pre-existing bug not introduced by this PR. The request tracking feature correctly counts all requests, including the duplicate.
🔎 Suggested fix to eliminate duplicate page 1 request
```diff
 			session.DiscardHTTPResponse(resp)
+
+			// Process subdomains from the first request
+			for _, subdomain := range response.Subdomains {
+				select {
+				case <-ctx.Done():
+					return
+				case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}:
+					s.results++
+				}
+			}
+
 			if response.Metadata.ResultCount > pageSize {
 				totalPages := (response.Metadata.ResultCount + pageSize - 1) / pageSize
-				for page := 1; page <= totalPages; page++ {
+				for page := 2; page <= totalPages; page++ {
 					select {
 					case <-ctx.Done():
 						return
 					default:
 					}
 					getUrl = fmt.Sprintf("%s?domain=%s&page=%d&page_size=%d", baseUrl, domain, page, pageSize)
 					s.requests++
 					resp, err := session.Get(ctx, getUrl, "", requestHeaders)
 					if err != nil {
 						results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: fmt.Errorf("encountered error: %v; note: if you get a 'limit has been reached' error, head over to https://devportal.redhuntlabs.com", err)}
 						session.DiscardHTTPResponse(resp)
 						s.errors++
 						return
 					}
 					err = jsoniter.NewDecoder(resp.Body).Decode(&response)
 					if err != nil {
 						results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
 						session.DiscardHTTPResponse(resp)
 						s.errors++
 						continue
 					}
 					session.DiscardHTTPResponse(resp)
 					for _, subdomain := range response.Subdomains {
 						select {
 						case <-ctx.Done():
 							return
 						case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}:
 							s.results++
 						}
 					}
 				}
-			} else {
-				for _, subdomain := range response.Subdomains {
-					select {
-					case <-ctx.Done():
-						return
-					case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}:
-						s.results++
-					}
-				}
 			}
```

pkg/subscraping/sources/github/github.go (1)
152-237: Critical: Race condition on request counter.

The `s.requests` counter is incremented at line 173 inside a goroutine spawned at line 164. Multiple goroutines increment this counter concurrently without any synchronization (mutex or atomic operations), causing a data race. Similarly, `s.results` is incremented at lines 202 and 220 within the same goroutines, creating additional race conditions.

🔎 Recommended fix using sync/atomic
Add atomic imports and change the counter types:
```diff
 import (
 	"bufio"
 	"context"
 	"fmt"
 	"net/http"
 	"net/url"
 	"regexp"
 	"strconv"
 	"strings"
 	"sync"
+	"sync/atomic"
 	"time"
```

Update the Source struct to use atomic-friendly types:

```diff
 type Source struct {
 	apiKeys   []string
 	timeTaken time.Duration
-	errors    int
-	results   int
-	requests  int
+	errors    int32
+	results   int32
+	requests  int32
 	skipped   bool
 }
```

Replace all increments with atomic operations:

```diff
-s.requests++
+atomic.AddInt32(&s.requests, 1)
```

```diff
-s.results++
+atomic.AddInt32(&s.results, 1)
```

Update Statistics() to return atomic values:

```diff
 func (s *Source) Statistics() subscraping.Statistics {
 	return subscraping.Statistics{
-		Errors:    s.errors,
-		Results:   s.results,
-		Requests:  s.requests,
+		Errors:    int(atomic.LoadInt32(&s.errors)),
+		Results:   int(atomic.LoadInt32(&s.results)),
+		Requests:  int(atomic.LoadInt32(&s.requests)),
 		TimeTaken: s.timeTaken,
 		Skipped:   s.skipped,
 	}
 }
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (54)
- pkg/runner/stats.go
- pkg/subscraping/sources/alienvault/alienvault.go
- pkg/subscraping/sources/anubis/anubis.go
- pkg/subscraping/sources/bevigil/bevigil.go
- pkg/subscraping/sources/bufferover/bufferover.go
- pkg/subscraping/sources/builtwith/builtwith.go
- pkg/subscraping/sources/c99/c99.go
- pkg/subscraping/sources/censys/censys.go
- pkg/subscraping/sources/certspotter/certspotter.go
- pkg/subscraping/sources/chaos/chaos.go
- pkg/subscraping/sources/chinaz/chinaz.go
- pkg/subscraping/sources/commoncrawl/commoncrawl.go
- pkg/subscraping/sources/crtsh/crtsh.go
- pkg/subscraping/sources/digitalyama/digitalyama.go
- pkg/subscraping/sources/digitorus/digitorus.go
- pkg/subscraping/sources/dnsdb/dnsdb.go
- pkg/subscraping/sources/dnsdumpster/dnsdumpster.go
- pkg/subscraping/sources/dnsrepo/dnsrepo.go
- pkg/subscraping/sources/domainsproject/domainsproject.go
- pkg/subscraping/sources/driftnet/driftnet.go
- pkg/subscraping/sources/facebook/ctlogs.go
- pkg/subscraping/sources/fofa/fofa.go
- pkg/subscraping/sources/fullhunt/fullhunt.go
- pkg/subscraping/sources/github/github.go
- pkg/subscraping/sources/gitlab/gitlab.go
- pkg/subscraping/sources/hackertarget/hackertarget.go
- pkg/subscraping/sources/hudsonrock/hudsonrock.go
- pkg/subscraping/sources/intelx/intelx.go
- pkg/subscraping/sources/leakix/leakix.go
- pkg/subscraping/sources/merklemap/merklemap.go
- pkg/subscraping/sources/netlas/netlas.go
- pkg/subscraping/sources/onyphe/onyphe.go
- pkg/subscraping/sources/profundis/profundis.go
- pkg/subscraping/sources/pugrecon/pugrecon.go
- pkg/subscraping/sources/quake/quake.go
- pkg/subscraping/sources/rapiddns/rapiddns.go
- pkg/subscraping/sources/reconcloud/reconcloud.go
- pkg/subscraping/sources/reconeer/reconeer.go
- pkg/subscraping/sources/redhuntlabs/redhuntlabs.go
- pkg/subscraping/sources/riddler/riddler.go
- pkg/subscraping/sources/rsecloud/rsecloud.go
- pkg/subscraping/sources/securitytrails/securitytrails.go
- pkg/subscraping/sources/shodan/shodan.go
- pkg/subscraping/sources/sitedossier/sitedossier.go
- pkg/subscraping/sources/thc/thc.go
- pkg/subscraping/sources/threatbook/threatbook.go
- pkg/subscraping/sources/threatcrowd/threatcrowd.go
- pkg/subscraping/sources/threatminer/threatminer.go
- pkg/subscraping/sources/virustotal/virustotal.go
- pkg/subscraping/sources/waybackarchive/waybackarchive.go
- pkg/subscraping/sources/whoisxmlapi/whoisxmlapi.go
- pkg/subscraping/sources/windvane/windvane.go
- pkg/subscraping/sources/zoomeyeapi/zoomeyeapi.go
- pkg/subscraping/types.go
🧰 Additional context used
🧬 Code graph analysis (13)
The following files were analyzed against related `Source` struct definitions in sibling sources:
- pkg/subscraping/sources/reconcloud/reconcloud.go
- pkg/subscraping/sources/riddler/riddler.go
- pkg/subscraping/sources/threatcrowd/threatcrowd.go
- pkg/subscraping/sources/netlas/netlas.go
- pkg/subscraping/sources/anubis/anubis.go
- pkg/subscraping/sources/fullhunt/fullhunt.go
- pkg/subscraping/sources/crtsh/crtsh.go
- pkg/subscraping/sources/bevigil/bevigil.go
- pkg/subscraping/sources/redhuntlabs/redhuntlabs.go
- pkg/subscraping/sources/reconeer/reconeer.go
- pkg/subscraping/sources/commoncrawl/commoncrawl.go
- pkg/subscraping/sources/hudsonrock/hudsonrock.go
- pkg/subscraping/sources/digitorus/digitorus.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Test Builds (ubuntu-latest)
- GitHub Check: Test Builds (windows-latest)
- GitHub Check: Test Builds (macOS-latest)
- GitHub Check: release-test
- GitHub Check: Analyze (go)
🔇 Additional comments (64)
pkg/subscraping/sources/quake/quake.go (1)
39-39: LGTM! Request tracking implementation is correct. The request counter is properly declared, initialized, incremented before each API call (including paginated requests), and exposed via `Statistics()`.
Also applies to: 48-48, 74-74, 159-159
pkg/subscraping/sources/rapiddns/rapiddns.go (1)
22-22: LGTM! Request tracking correctly handles pagination. The implementation correctly increments the counter for each page request in the pagination loop, ensuring accurate tracking across all API calls.
Also applies to: 30-30, 46-46, 121-121
pkg/subscraping/sources/threatbook/threatbook.go (1)
33-33: LGTM! Request tracking implementation is correct. The counter is properly initialized, incremented before the API call, and included in the statistics output.
Also applies to: 42-42, 56-56, 133-133
pkg/subscraping/sources/waybackarchive/waybackarchive.go (1)
20-20: LGTM! Request tracking implementation is correct. The implementation follows the established pattern: field declaration, initialization, increment before the HTTP call, and statistics reporting.
Also applies to: 28-28, 36-36, 103-103
pkg/subscraping/sources/certspotter/certspotter.go (1)
25-25: LGTM! Request tracking correctly handles both initial and paginated requests. The implementation properly increments the counter for both the initial issuances call (line 51) and subsequent paginated requests (line 94), ensuring accurate tracking across all API interactions.
Also applies to: 34-34, 51-51, 94-94, 159-159
pkg/subscraping/sources/intelx/intelx.go (1)
46-46: LGTM! Request tracking correctly implemented. The implementation properly tracks all HTTP requests:
- Initial POST request (line 90) to initiate the search
- Subsequent GET requests (line 118) in the polling loop
The use of a plain `int` counter is appropriate since `Run()` uses a single goroutine.
Also applies to: 60-60, 90-90, 118-118, 186-186
pkg/subscraping/sources/zoomeyeapi/zoomeyeapi.go (1)
30-30: LGTM! Paginated request tracking implemented correctly. The request counter is properly incremented inside the pagination loop (line 74), ensuring all page requests are tracked. The counter is reset at the start of `Run()` and correctly exposed via `Statistics()`.
Also applies to: 39-39, 74-74, 138-138
pkg/subscraping/sources/driftnet/driftnet.go (1)
31-31: LGTM! Concurrent request tracking with proper synchronization. Excellent use of `atomic.Int32` for the request counter. Since `Run()` spawns multiple goroutines (one per endpoint at lines 86-88), the atomic operations (`Store`, `Add`, `Load`) correctly prevent race conditions when tracking concurrent requests.
Also applies to: 69-69, 145-145, 124-124
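The typed-atomic pattern praised here can be sketched as below. This is an illustrative stand-in, not the real driftnet code: the endpoint names and the `Run` signature are hypothetical, and `atomic.Int32` requires Go 1.19+.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Source uses atomic.Int32, which provides race-free Store/Add/Load
// without managing raw pointers with atomic.AddInt32 etc.
type Source struct {
	requests atomic.Int32
}

// Run fans out one goroutine per endpoint, each recording its request.
func (s *Source) Run(endpoints []string) {
	s.requests.Store(0) // reset the counter per run
	var wg sync.WaitGroup
	for range endpoints {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.requests.Add(1) // one increment per endpoint request
		}()
	}
	wg.Wait()
}

func main() {
	s := &Source{}
	s.Run([]string{"scan", "domain", "cert"}) // hypothetical endpoints
	fmt.Println(s.requests.Load())            // 3
}
```

A zero-value `atomic.Int32` is ready to use, so no constructor is needed; the only care point is that the struct must not be copied after first use.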
pkg/subscraping/sources/domainsproject/domainsproject.go (1)
21-21: LGTM! Straightforward request tracking implementation. The implementation correctly tracks the single API request made per `Run()` invocation. The counter is incremented before the HTTP request (line 55) and properly exposed via `Statistics()`.
Also applies to: 40-40, 55-55, 137-137
pkg/subscraping/sources/threatminer/threatminer.go (1)
25-25: LGTM! Clean request tracking implementation. The implementation correctly tracks the single API request made per `Run()` invocation. The counter is incremented before the HTTP call (line 41) and properly included in the `Statistics()` return value.
Also applies to: 33-33, 41-41, 99-99
pkg/subscraping/sources/hudsonrock/hudsonrock.go (1)
29-29: LGTM! Request tracking correctly implemented. The request tracking follows the established pattern consistently:
- Field added to struct
- Initialized at Run() start
- Incremented before the HTTP call
- Exposed via Statistics()
The placement before the API call (line 45) correctly tracks attempted requests, which is appropriate for quota monitoring.
Also applies to: 37-37, 45-45, 105-105
pkg/subscraping/sources/shodan/shodan.go (1)
20-20: LGTM! Request tracking correctly handles pagination. The implementation properly tracks each paginated request:
- Counter incremented inside the pagination loop (line 60)
- Each page fetch is counted individually
- Early exit when API key is missing (lines 46-49) correctly avoids incrementing
This ensures accurate request counts for quota tracking even with multiple pages.
Also applies to: 37-37, 60-60, 133-133
pkg/subscraping/sources/leakix/leakix.go (1)
19-19: LGTM! Request tracking correctly implemented. The implementation follows the established pattern consistently:
- Field initialization and increment placement are correct
- Request counted before the HTTP call (line 45)
- Properly integrated with optional API key handling
Also applies to: 28-28, 45-45, 106-106
pkg/subscraping/sources/whoisxmlapi/whoisxmlapi.go (1)
36-36: LGTM! Request tracking correctly implemented. The implementation properly handles the required API key check:
- Early exit when no API key (lines 54-56) occurs before the request counter increment
- Counter incremented immediately before the HTTP call (line 59)
- Follows the consistent pattern across all sources
Also applies to: 45-45, 59-59, 119-119
pkg/subscraping/sources/alienvault/alienvault.go (1)
26-26: LGTM! Request tracking correctly implemented. The implementation follows the established pattern:
- API key validation (lines 44-48) correctly exits before incrementing when key is missing
- Request counter incremented before the HTTP call (line 50)
- Consistent with all other sources in this PR
Excellent consistency across all reviewed sources - the uniform implementation pattern will aid maintainability.
Also applies to: 36-36, 50-50, 116-116
pkg/subscraping/types.go (1)
29-35: LGTM! Clean addition to Statistics struct. The `Requests` field is correctly added to track HTTP request counts per source. The field type and naming are appropriate.
pkg/subscraping/sources/hackertarget/hackertarget.go (1)
19-19: LGTM! Request tracking correctly implemented. The implementation follows a clean pattern:
- Adds a `requests` field as the counter
- Initializes to 0 at Run start
- Increments before the HTTP request
- Exposes via Statistics()
The counter increments before the actual HTTP call, which correctly tracks attempted requests (useful for quota monitoring even if requests fail).
Also applies to: 28-28, 45-45, 115-115
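The single-goroutine pattern that recurs across these sources can be sketched as follows. The `Source` type, `Run` signature, and URL list below are illustrative stand-ins for the real per-source code, which issues actual HTTP calls via its session.

```go
package main

import "fmt"

// Statistics mirrors the shape of the reported metrics; only the
// Requests field matters for this sketch.
type Statistics struct {
	Requests int
}

// Source holds a plain int counter, which is safe because Run below
// executes in a single goroutine.
type Source struct {
	requests int
}

// Run resets the counter, then increments it before each attempted
// call, so failed requests are still counted toward quota usage.
func (s *Source) Run(urls []string) {
	s.requests = 0 // initialize at Run start
	for _, u := range urls {
		s.requests++ // count the attempt before the call
		_ = u        // the real source would issue an HTTP request here
	}
}

// Statistics exposes the counter, matching the Statistics() step
// of the pattern.
func (s *Source) Statistics() Statistics {
	return Statistics{Requests: s.requests}
}

func main() {
	s := &Source{}
	s.Run([]string{"page1", "page2", "page3"})
	fmt.Println(s.Statistics().Requests) // 3
}
```

Incrementing before the call (rather than after a successful response) is the design choice the review highlights: the counter reflects attempted requests, which is what quota monitoring needs.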
pkg/subscraping/sources/onyphe/onyphe.go (1)
39-39: LGTM! Pagination correctly handled. The request tracking is correctly implemented for this paginated source:
- Counter increments inside the pagination loop (lines 67-134)
- Each page request is tracked separately
- Total requests accurately reflect all API calls made during enumeration
This is the correct pattern for sources that make multiple requests.
Also applies to: 48-48, 78-78, 165-165
pkg/subscraping/sources/facebook/ctlogs.go (1)
81-81: LGTM! Request tracking correctly implemented for paginated source. The implementation properly handles pagination:
- Counter increments inside the pagination loop (lines 107-148)
- Each page request is tracked
- The `FetchAccessToken` HTTP call (line 43) is intentionally not counted, which is correct since it occurs during initialization via `AddApiKeys`, not during the enumeration phase
Also applies to: 90-90, 113-113, 198-198
pkg/subscraping/sources/builtwith/builtwith.go (1)
38-38: LGTM! Request tracking correctly implemented. The implementation follows the established pattern consistently:
- Adds tracking field and initializes at Run start
- Increments before the HTTP request (line 61)
- Properly exposes the counter via Statistics()
Also applies to: 47-47, 60-60, 118-118
pkg/subscraping/sources/bufferover/bufferover.go (1)
30-30: LGTM! Request tracking implementation is correct. The request counter follows the established pattern consistently:
- Field added to Source struct
- Initialized in Run()
- Incremented before the HTTP call in getData()
- Exposed via Statistics()
This correctly tracks request attempts regardless of success or failure.
Also applies to: 39-39, 60-60, 137-137
pkg/subscraping/sources/threatcrowd/threatcrowd.go (1)
26-26: LGTM! Request tracking correctly implemented. The implementation follows the established pattern:
- Counter field added to Source struct
- Properly initialized at the start of Run()
- Incremented immediately before the HTTP request
- Exposed through Statistics()
The placement at line 50 (before `session.Client.Do`) correctly tracks the request attempt.
Also applies to: 34-34, 50-50, 128-128
pkg/subscraping/sources/fofa/fofa.go (1)
30-30: LGTM! Request tracking implementation is correct. The implementation is consistent with the PR-wide pattern:
- Counter field properly added to Source struct
- Initialized to 0 at Run() start
- Incremented before the API call at line 60
- Returned through Statistics()
The single request tracks correctly with the bulk retrieval approach (size=10000 on line 61).
Also applies to: 44-44, 60-60, 137-137
pkg/subscraping/sources/chinaz/chinaz.go (1)
20-20: LGTM! Request tracking correctly implemented. The implementation follows the established pattern consistently:
- Counter field added to Source struct
- Properly initialized in Run()
- Incremented before the HTTP call (line 43)
- Exposed via Statistics()
All changes align with the PR objectives for tracking API requests.
Also applies to: 29-29, 43-43, 109-109
pkg/subscraping/sources/sitedossier/sitedossier.go (1)
26-26: LGTM! Request tracking correctly handles pagination. The implementation properly tracks requests in a paginated scenario:
- Counter field added to Source struct
- Initialized in Run()
- Incremented at the start of enumerate() before each HTTP call
- Correctly counts multiple requests when pagination occurs (recursive enumerate at line 86)
- Exposed through Statistics()
This accurately tracks the total number of API requests across all pages.
Also applies to: 34-34, 55-55, 116-116
pkg/subscraping/sources/thc/thc.go (1)
27-27: LGTM! Request tracking implementation is correct. The `requests` counter is properly declared, initialized, incremented before each API call, and exposed via `Statistics()`. The placement of the increment at line 68 correctly tracks each paginated request.
Also applies to: 42-42, 68-68, 129-129
pkg/subscraping/sources/windvane/windvane.go (1)
41-41: LGTM! Request tracking implementation is correct. The `requests` counter is properly declared, initialized, incremented before each paginated API call, and exposed via `Statistics()`.
Also applies to: 49-49, 77-77, 156-156
pkg/subscraping/sources/netlas/netlas.go (1)
43-43: LGTM! Request tracking implementation is correct. The `requests` counter properly tracks both API calls (count query and bulk download). The increments at lines 68 and 126 are correctly placed before their respective HTTP requests.
Also applies to: 51-51, 68-68, 126-126, 203-203
pkg/subscraping/sources/gitlab/gitlab.go (1)
25-25: Sequential request tracking is correct, but see the concurrency issue at line 97. The field declaration, initialization, sequential increment at line 71, and statistics exposure are all implemented correctly. However, the concurrent increment at line 97 requires fixing.
Also applies to: 41-41, 71-71, 180-180
pkg/runner/stats.go (1)
27-27: LGTM! Statistics output formatting is correct. The new "Requests" column is properly integrated into both the format string and header. The column width (10 chars) and separator length (68 chars) are consistent with the overall layout.
Also applies to: 32-32
pkg/subscraping/sources/pugrecon/pugrecon.go (1)
35-35: LGTM! Request tracking correctly implemented. The request counting follows the correct pattern: field declaration, initialization, increment before the HTTP call, and exposure via `Statistics()`. The implementation accurately tracks the single API request made to the pugrecon service.
Also applies to: 44-44, 76-76, 156-156
pkg/subscraping/sources/virustotal/virustotal.go (1)
33-33: LGTM! Pagination-aware request tracking correctly implemented. The implementation correctly increments the request counter inside the pagination loop (line 65), ensuring accurate tracking when VirusTotal returns multiple pages of results. This is essential for monitoring actual API quota consumption.
Also applies to: 42-42, 65-65, 131-131
pkg/subscraping/sources/reconeer/reconeer.go (1)
25-25: LGTM! Request tracking correctly implemented. The implementation follows the established pattern and accurately tracks the single API request to the Reconeer service.
Also applies to: 33-33, 51-51, 112-112
pkg/subscraping/sources/riddler/riddler.go (1)
18-18: LGTM! Request tracking correctly implemented. The implementation follows the established pattern and accurately tracks the single HTTP request to the Riddler service.
Also applies to: 26-26, 34-34, 97-97
pkg/subscraping/sources/chaos/chaos.go (1)
19-19: The implementation is correct. The Chaos API's `GetSubdomains` endpoint makes a single HTTP request that returns all subdomains in one response with no pagination. Incrementing the request counter once before calling `GetSubdomains()` accurately reflects the actual API usage.
Likely an incorrect or invalid review comment.
pkg/subscraping/sources/digitorus/digitorus.go (1)
21-21: LGTM! Request tracking correctly implemented. The request counter is properly added to the Source struct, initialized at the start of Run, incremented before the HTTP call, and exposed via `Statistics()`. The implementation follows the consistent pattern across all sources.
Also applies to: 29-29, 37-37, 105-105
pkg/subscraping/sources/censys/censys.go (1)
78-78: LGTM! Request tracking correctly handles pagination. The request counter is properly implemented with correct pagination handling. Each HTTP request in the pagination loop (up to `maxCensysPages`) is counted by incrementing `s.requests` at line 141 before the API call at line 142. This ensures accurate tracking of all API requests made to Censys.
Also applies to: 87-87, 141-141, 226-226
pkg/subscraping/sources/crtsh/crtsh.go (1)
32-32: LGTM! Request tracking correctly distinguishes HTTP from database queries. The implementation correctly counts only HTTP requests. The `getSubdomainsFromSQL` method (which performs direct database queries via a PostgreSQL connection) does not increment the counter, while `getSubdomainsFromHTTP` appropriately increments at line 150 before the HTTP call at line 151. This properly reflects the feature's intent to track HTTP/API requests.
Also applies to: 40-40, 150-150, 218-218
pkg/subscraping/sources/dnsdumpster/dnsdumpster.go (1)
28-28: LGTM! Request tracking correctly implemented. The request counter is properly integrated following the standard pattern: field added to Source struct, initialized at Run start, incremented before the HTTP call at line 52, and exposed via `Statistics()`. The implementation is correct and consistent with other sources.
Also applies to: 37-37, 51-51, 109-109
pkg/subscraping/sources/reconcloud/reconcloud.go (1)
32-32: LGTM! Request tracking correctly implemented. The request counter follows the consistent pattern across all sources: added to the Source struct, initialized at the start of `Run`, incremented before the HTTP call at line 49, and properly exposed via `Statistics()`. The implementation is correct.
Also applies to: 40-40, 48-48, 108-108
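The pattern these comments keep approving can be sketched in isolation. This is a minimal sketch with a hypothetical `fakeGet` helper, not the project's real session API: reset the counter in `Run`, increment it before every request so failed requests are counted too, and read it back in `Statistics()`.

```go
package main

import "fmt"

// Source mirrors the per-source stats fields discussed above.
type Source struct {
	requests int
	errors   int
	results  int
}

// fakeGet is a stand-in for the real HTTP call (hypothetical helper).
func fakeGet(url string) error { return nil }

// Run resets the counter, then increments it before every request,
// so requests that later fail are still counted.
func (s *Source) Run(urls []string) {
	s.requests = 0
	for _, u := range urls {
		s.requests++ // increment before the call, mirroring the reviewed sources
		if err := fakeGet(u); err != nil {
			s.errors++
			continue
		}
		s.results++
	}
}

// Statistics exposes the counters, like the Statistics() methods under review.
func (s *Source) Statistics() (requests, errors, results int) {
	return s.requests, s.errors, s.results
}

func main() {
	s := &Source{}
	s.Run([]string{"a", "b", "c"})
	req, _, _ := s.Statistics()
	fmt.Println(req)
}
```

Because everything runs in one goroutine here, a plain `int` is sufficient; the sources that fan out to goroutines need the atomic variant discussed later in the review.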
pkg/subscraping/sources/securitytrails/securitytrails.go (4)
34-34: LGTM! Request tracking field added. The `requests` field is correctly added to track HTTP requests made by this source.

43-43: LGTM! Request counter initialized. The counter is properly reset at the start of each run.
71-71: LGTM! All HTTP requests tracked correctly. The implementation correctly increments the request counter before:
- The initial POST request (line 71)
- Each pagination GET request (line 75)
- The retry GET request on 403 errors (line 80)
This ensures accurate tracking of all API calls made during enumeration.
Also applies to: 75-75, 80-80
164-164: LGTM! Request metric exposed in statistics. The `Requests` field is correctly included in the statistics output.

pkg/subscraping/sources/profundis/profundis.go (1)
21-21: LGTM! Request tracking implemented correctly. The implementation follows the correct pattern:
- Field declared in struct (line 21)
- Counter initialized at run start (line 30)
- Incremented before the HTTP POST request (line 57)
- Exposed via Statistics() (line 129)
Also applies to: 30-30, 57-57, 129-129
pkg/subscraping/sources/anubis/anubis.go (1)
20-20: LGTM! Request tracking implemented correctly. The implementation follows the correct pattern:
- Field declared in struct (line 20)
- Counter initialized at run start (line 28)
- Incremented before the HTTP GET request (line 36)
- Exposed via Statistics() (line 100)
Also applies to: 28-28, 36-36, 100-100
pkg/subscraping/sources/c99/c99.go (1)
21-21: LGTM! Request tracking implemented correctly. The implementation follows the correct pattern:
- Field declared in struct (line 21)
- Counter initialized at run start (line 40)
- Incremented before the HTTP GET request (line 55)
- Exposed via Statistics() (line 125)
Also applies to: 40-40, 55-55, 125-125
pkg/subscraping/sources/redhuntlabs/redhuntlabs.go (1)
31-31: LGTM! Request tracking implemented correctly. The implementation follows the correct pattern:
- Field declared in struct (line 31)
- Counter initialized at run start (line 39)
- Incremented before the initial request (line 61) and each pagination request (line 88)
- Exposed via Statistics() (line 157)
The request tracking accurately counts all HTTP calls made during enumeration, including pagination.
Also applies to: 39-39, 61-61, 88-88, 157-157
pkg/subscraping/sources/commoncrawl/commoncrawl.go (4)
35-35: LGTM: Request tracking field added. The `requests` field is correctly added to the Source struct, following the same pattern as other sources in the codebase.
43-51: LGTM: Request counter properly initialized and incremented. The counter is correctly initialized to 0 at the start of the `Run` method and incremented before the initial HTTP request to the index URL.
135-136: LGTM: Per-subdomain request tracking implemented correctly. The request counter is correctly incremented before each HTTP call in the `getSubdomains` loop, ensuring all API requests are tracked.
119-126: LGTM: Statistics properly exposes request count. The Statistics method correctly includes the `Requests` field, making the request count available for the `-stats` output.

pkg/subscraping/sources/fullhunt/fullhunt.go (1)
26-26: LGTM: Request tracking correctly implemented. The request tracking implementation is correct:
- Field added to Source struct (line 26)
- Initialized at Run start (line 34)
- Incremented before the HTTP request (line 48)
- Exposed via Statistics (line 104)
Also applies to: 34-34, 48-48, 104-104
pkg/subscraping/sources/dnsdb/dnsdb.go (3)
46-46: LGTM: Request tracking properly initialized. The request counter is correctly added to the struct, initialized, and incremented before the `getMaxOffset` call (which internally makes an HTTP request to the rate_limit endpoint).
Also applies to: 55-55, 75-76
98-99: LGTM: Paginated requests properly tracked. The request counter is correctly incremented for each paginated request in the loop.
198-206: LGTM: Statistics properly exposes request count. The Statistics method correctly includes the `Requests` field in the returned structure.
pkg/subscraping/sources/dnsrepo/dnsrepo.go (1)
20-20: LGTM: Request tracking correctly implemented. The request tracking is properly implemented across all necessary points:
- Field declaration (line 20)
- Initialization (line 32)
- Increment before HTTP request (line 55)
- Exposure via Statistics (line 118)
Also applies to: 32-32, 55-56, 118-118
pkg/subscraping/sources/github/github.go (3)
46-46: LGTM: Request counter field and initialization. The `requests` field is properly added to the Source struct and initialized at the start of the `Run` method.
Also applies to: 55-55
91-92: LGTM: Initial search request tracked correctly. The request counter is correctly incremented before the initial GitHub search API call.
280-288: Statistics method structure is correct, but affected by race condition. While the Statistics method properly includes the `Requests` field, the value will be inaccurate due to the race condition flagged in the previous comment.
pkg/subscraping/sources/digitalyama/digitalyama.go (1)
19-19: LGTM! The request tracking implementation follows the established pattern across all source files:
- Field added to struct
- Initialization in `Run`
- Increment before the HTTP request
- Exposure via `Statistics()`

Also applies to: 38-38, 53-53, 137-137
pkg/subscraping/sources/bevigil/bevigil.go (1)
24-24: LGTM! Request tracking implementation is consistent with the codebase pattern and correctly placed before the HTTP request.
Also applies to: 32-32, 48-48, 112-112
pkg/subscraping/sources/merklemap/merklemap.go (1)
24-24: LGTM! Request tracking correctly handles pagination by incrementing in `fetchPage`, ensuring each page request is counted. Implementation is consistent with the codebase pattern.

Also applies to: 33-33, 102-102, 159-159
pkg/subscraping/sources/rsecloud/rsecloud.go (1)
27-27: LGTM! Request tracking correctly handles the dual-endpoint pagination pattern. The increment inside `fetchSubdomains` properly counts all requests across both "active" and "passive" endpoints and their respective pages.

Also applies to: 36-36, 60-60, 128-128
use atomic.Int32 for counter fields to prevent data races when multiple goroutines concurrently increment requests/results/errors
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/subscraping/sources/github/github.go (1)
154-238: Incomplete error counting in concurrent item processing. When multiple goroutines in `proccesItems` encounter errors (line 180), they're collected in `errChan`, but only the first error is returned (lines 231-235) and counted in statistics (lines 127-129). If several items fail concurrently, the error count will be underreported.

Consider incrementing the error counter atomically within the goroutine when an error occurs, similar to how results are tracked.
🔎 Suggested fix to track all concurrent errors
```diff
 resp, err := session.SimpleGet(ctx, rawURL(responseItem.HTMLURL))
 if err != nil {
+	s.errors.Add(1)
 	if resp != nil && resp.StatusCode != http.StatusNotFound {
 		session.DiscardHTTPResponse(resp)
 	}
-	errChan <- err
 	return
 }
```

Then you can simplify the error channel handling or remove it if it's no longer needed for propagation.
🧹 Nitpick comments (1)
pkg/subscraping/sources/gitlab/gitlab.go (1)
95-126: Pre-existing bug: `defer wg.Done()` placement can cause deadlock. The `defer wg.Done()` at line 124 is registered after potential early returns (e.g., line 106). If the goroutine returns before reaching line 124, `wg.Done()` is never called, causing `wg.Wait()` at line 147 to hang indefinitely.

Move the defer to the beginning of the goroutine to ensure it's always registered.
🔎 Proposed fix
```diff
 for _, it := range items {
 	go func(item item) {
+		defer wg.Done()
 		// The original item.Path causes 404 error because the Gitlab API is expecting the url encoded path
 		fileUrl := fmt.Sprintf("https://gitlab.com/api/v4/projects/%d/repository/files/%s/raw?ref=%s", item.ProjectId, url.QueryEscape(item.Path), item.Ref)
 		s.requests.Add(1)
 		resp, err := session.Get(ctx, fileUrl, "", headers)
 		if err != nil {
 			if resp == nil || (resp != nil && resp.StatusCode != http.StatusNotFound) {
 				session.DiscardHTTPResponse(resp)
 				results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
 				s.errors.Add(1)
 				return
 			}
 		}
 		if resp.StatusCode == http.StatusOK {
 			scanner := bufio.NewScanner(resp.Body)
 			for scanner.Scan() {
 				line := scanner.Text()
 				if line == "" {
 					continue
 				}
 				for _, subdomain := range domainRegexp.FindAllString(line, -1) {
 					results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}
 					s.results.Add(1)
 				}
 			}
 			session.DiscardHTTPResponse(resp)
 		}
-		defer wg.Done()
 	}(it)
 }
```
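Why the placement matters can be shown with a toy reproduction (hypothetical `process` and `runAll` names, unrelated to the gitlab source): when `wg.Done` is deferred at the top of the goroutine, an early `return` on error still releases the WaitGroup; deferred at the bottom, any early return would leave `wg.Wait()` blocked forever.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// process fails for even inputs, forcing an early return in the goroutine.
func process(n int) error {
	if n%2 == 0 {
		return fmt.Errorf("item %d failed", n)
	}
	return nil
}

// runAll launches one goroutine per item. Because wg.Done is deferred
// FIRST, the early return on error still releases the WaitGroup; with
// the defer at the bottom of the goroutine (as in the bug), wg.Wait()
// would hang whenever process failed.
func runAll(items []int) int32 {
	var wg sync.WaitGroup
	var ok atomic.Int32
	for _, it := range items {
		wg.Add(1)
		go func(n int) {
			defer wg.Done() // registered before any return path
			if err := process(n); err != nil {
				return // early exit is now safe
			}
			ok.Add(1)
		}(it)
	}
	wg.Wait()
	return ok.Load()
}

func main() {
	fmt.Println(runAll([]int{1, 2, 3, 4, 5})) // the 3 odd items succeed
}
```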
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- pkg/subscraping/sources/github/github.go
- pkg/subscraping/sources/gitlab/gitlab.go
🧰 Additional context used
🧬 Code graph analysis (2)
pkg/subscraping/sources/github/github.go (1)
- pkg/subscraping/sources/gitlab/gitlab.go: `Source` (21-28)

pkg/subscraping/sources/gitlab/gitlab.go (1)
- pkg/subscraping/sources/github/github.go: `Source` (42-49)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Test Builds (macOS-latest)
- GitHub Check: Test Builds (ubuntu-latest)
- GitHub Check: Test Builds (windows-latest)
- GitHub Check: Analyze (go)
🔇 Additional comments (9)
pkg/subscraping/sources/gitlab/gitlab.go (4)
21-28: LGTM! Atomic counters properly address the race condition. The switch from primitive `int` to `atomic.Int32` for the `errors`, `results`, and `requests` fields correctly resolves the data race flagged in the previous review. This is consistent with the GitHub source implementation.

40-42: LGTM! Proper initialization of atomic counters using `Store(0)` ensures clean state at the start of each run.

72-76: LGTM! The request counter is correctly incremented before the API call, and the error counter is updated atomically on failure.

177-185: LGTM! The `Statistics()` method correctly uses `Load()` to safely read the atomic counter values.

pkg/subscraping/sources/github/github.go (5)
15-15: LGTM! Atomic counters prevent race conditions. The introduction of `sync/atomic` and the conversion of counter fields to `atomic.Int32` correctly addresses the race condition with concurrent goroutines processing items (spawned at line 165).

Also applies to: 45-47

54-56: LGTM! Counter initialization is correct. Using `Store(0)` properly resets all atomic counters at the start of each enumeration run.

92-92: LGTM! Request tracking is correctly implemented. Both the initial GitHub search request and per-item raw file fetches are properly tracked with atomic increments before the HTTP calls.

Also applies to: 174-174

203-203: LGTM! Results tracking is thread-safe and accurate. The atomic increments correctly track each subdomain result found, both from file content and text matches, with proper synchronization for concurrent processing.

Also applies to: 221-221

281-289: LGTM! Statistics method correctly reports atomic counter values. The method properly uses `Load()` to safely read the atomic counters and returns the complete statistics including the new `Requests` field.
Mzack9999
left a comment
lgtm
$ go run . -d scanme.sh -stats
...
[INF] Enumerating subdomains for scanme.sh
honey.scanme.sh
demo.scanme.sh
[INF] Found 2 subdomains for scanme.sh in 439 milliseconds 62 microseconds
[INF] Printing source statistics for scanme.sh
Source Duration Results Requests Errors
────────────────────────────────────────────────────────────────────
anubis 437ms 0 1 0
builtwith 0s 0 0 0
crtsh 366ms 2 0 0
digitorus 111ms 0 1 0
driftnet 33ms 0 4 4
leakix 210ms 0 1 1
thc 383ms 0 1 0
virustotal 0s 0 0 0
...
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
pkg/subscraping/sources/gitlab/gitlab.go (1)
95-126: Deadlock: `defer wg.Done()` placed after early return. The `defer wg.Done()` at line 124 is never reached if the goroutine returns early at line 106. This causes `wg.Wait()` at line 147 to block indefinitely, resulting in a deadlock when any file fetch fails with a non-404 error.

🐛 Proposed fix: move defer to start of goroutine
```diff
 for _, it := range items {
 	go func(item item) {
+		defer wg.Done()
 		// The original item.Path causes 404 error because the Gitlab API is expecting the url encoded path
 		fileUrl := fmt.Sprintf("https://gitlab.com/api/v4/projects/%d/repository/files/%s/raw?ref=%s", item.ProjectId, url.QueryEscape(item.Path), item.Ref)
 		s.requests.Add(1)
 		resp, err := session.Get(ctx, fileUrl, "", headers)
 		if err != nil {
 			if resp == nil || (resp != nil && resp.StatusCode != http.StatusNotFound) {
 				session.DiscardHTTPResponse(resp)
 				results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
 				s.errors.Add(1)
 				return
 			}
 		}
 		if resp.StatusCode == http.StatusOK {
 			scanner := bufio.NewScanner(resp.Body)
 			for scanner.Scan() {
 				line := scanner.Text()
 				if line == "" {
 					continue
 				}
 				for _, subdomain := range domainRegexp.FindAllString(line, -1) {
 					results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}
 					s.results.Add(1)
 				}
 			}
 			session.DiscardHTTPResponse(resp)
 		}
-		defer wg.Done()
 	}(it)
 }
```

pkg/subscraping/sources/redhuntlabs/redhuntlabs.go (1)
79-88: Pre-existing pagination issue: page 1 fetched twice. The initial request (lines 60-62) fetches page 1, but when pagination is needed (line 79), the loop starts at `page := 1` (line 81), causing page 1 to be fetched again at line 89. This wastes an API call and may confuse users monitoring request counts.

The request tracking added by this PR is correct and accurately counts the duplicate request, but the underlying pagination logic should be fixed.
♻️ Suggested fix for pagination logic
Option 1: Start pagination loop from page 2 (preferred if initial request subdomains are already processed):
```diff
 session.DiscardHTTPResponse(resp)
 if response.Metadata.ResultCount > pageSize {
 	totalPages := (response.Metadata.ResultCount + pageSize - 1) / pageSize
-	for page := 1; page <= totalPages; page++ {
+	for page := 2; page <= totalPages; page++ {
 		select {
 		case <-ctx.Done():
 			return
```

Option 2: Skip initial request if pagination is needed (fetch all pages in loop):
```diff
 session.DiscardHTTPResponse(resp)
+
+// Process initial page response only if no pagination needed
 if response.Metadata.ResultCount > pageSize {
+	// Fetch all pages via pagination
 	totalPages := (response.Metadata.ResultCount + pageSize - 1) / pageSize
 	for page := 1; page <= totalPages; page++ {
 		// ... pagination logic
 	}
 } else {
+	// Process subdomains from initial request
 	for _, subdomain := range response.Subdomains {
```
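The page arithmetic in both options can be checked in isolation. This sketch (hypothetical helper names) shows the same ceiling division and why Option 1's loop can start at page 2: the initial request has already returned page 1.

```go
package main

import "fmt"

// totalPages is the same ceiling division used in the redhuntlabs source:
// (resultCount + pageSize - 1) / pageSize.
func totalPages(resultCount, pageSize int) int {
	return (resultCount + pageSize - 1) / pageSize
}

// remainingPages lists the pages still to fetch after the initial
// request has already returned page 1 (Option 1 above).
func remainingPages(resultCount, pageSize int) []int {
	var pages []int
	for page := 2; page <= totalPages(resultCount, pageSize); page++ {
		pages = append(pages, page)
	}
	return pages
}

func main() {
	// 250 results at 100 per page: 3 pages total, pages 2 and 3 remain.
	fmt.Println(totalPages(250, 100))     // 3
	fmt.Println(remainingPages(250, 100)) // [2 3]
}
```

When the results fit in one page, `remainingPages` is empty and no extra requests are made, which also keeps the new request counter accurate.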
🤖 Fix all issues with AI agents
In @pkg/subscraping/sources/c99/c99.go:
- Line 21: Change the requests, errors, and results fields in the c99 Source
struct from plain int to atomic.Int32 (as in gitlab.go/github.go/driftnet.go)
to make the stats thread-safe: import sync/atomic, increment the counters with
Add(1) in Run(), and read them with Load() in Statistics(). The zero value of
atomic.Int32 is ready to use, so no extra initialization is needed; add int
conversions where Statistics() returns the counts.
In @pkg/subscraping/sources/fofa/fofa.go:
- Line 30: Replace the plain int requests field on the struct with
atomic.Int32 (from sync/atomic), update all increments in Run() or any
goroutine to requests.Add(1), and change all reads (e.g., in Statistics()) to
requests.Load() so every access is thread-safe; update any code that
previously treated requests as a plain int.
In @pkg/subscraping/sources/hackertarget/hackertarget.go:
- Around line 42-44: Remove the duplicated API-key append that mutates
htSearchUrl a second time (the assignment using fmt.Sprintf with randomApiKey)
so the URL isn’t built with a duplicate or empty apikey parameter; ensure the
single API-key append remains only in the earlier logic that already handles
randomApiKey (or, alternatively, gate any append with randomApiKey != ""), and
leave the s.requests++ increment as-is before the HTTP call.
🧹 Nitpick comments (1)
pkg/subscraping/sources/bufferover/bufferover.go (1)
30-30: Consider using atomic operations for thread safety. While this implementation follows the existing pattern for the `errors` and `results` counters, there's a potential data race when `Statistics()` is called concurrently while the `Run` goroutine modifies the counter. Since the AI summary indicates some sources use atomics, consider using `atomic.Int32` or `atomic.Int64` for consistency and guaranteed thread safety.

🔒 Proposed refactor using atomic.Int32
```diff
+import (
+	"sync/atomic"
+)
+
 type Source struct {
 	apiKeys   []string
 	timeTaken time.Duration
 	errors    int
 	results   int
-	requests  int
+	requests  atomic.Int32
 	skipped   bool
 }
```

Then update the increment at line 60:
```diff
-	s.requests++
+	s.requests.Add(1)
```

And the statistics exposure at line 141:
```diff
-	Requests: s.requests,
+	Requests: int(s.requests.Load()),
```

Note: Initialization at line 39 can be removed since `atomic.Int32` is zero-initialized.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (53)
- pkg/subscraping/sources/alienvault/alienvault.go
- pkg/subscraping/sources/anubis/anubis.go
- pkg/subscraping/sources/bevigil/bevigil.go
- pkg/subscraping/sources/bufferover/bufferover.go
- pkg/subscraping/sources/builtwith/builtwith.go
- pkg/subscraping/sources/c99/c99.go
- pkg/subscraping/sources/censys/censys.go
- pkg/subscraping/sources/certspotter/certspotter.go
- pkg/subscraping/sources/chaos/chaos.go
- pkg/subscraping/sources/chinaz/chinaz.go
- pkg/subscraping/sources/commoncrawl/commoncrawl.go
- pkg/subscraping/sources/crtsh/crtsh.go
- pkg/subscraping/sources/digitalyama/digitalyama.go
- pkg/subscraping/sources/digitorus/digitorus.go
- pkg/subscraping/sources/dnsdb/dnsdb.go
- pkg/subscraping/sources/dnsdumpster/dnsdumpster.go
- pkg/subscraping/sources/dnsrepo/dnsrepo.go
- pkg/subscraping/sources/domainsproject/domainsproject.go
- pkg/subscraping/sources/driftnet/driftnet.go
- pkg/subscraping/sources/facebook/ctlogs.go
- pkg/subscraping/sources/fofa/fofa.go
- pkg/subscraping/sources/fullhunt/fullhunt.go
- pkg/subscraping/sources/github/github.go
- pkg/subscraping/sources/gitlab/gitlab.go
- pkg/subscraping/sources/hackertarget/hackertarget.go
- pkg/subscraping/sources/hudsonrock/hudsonrock.go
- pkg/subscraping/sources/intelx/intelx.go
- pkg/subscraping/sources/leakix/leakix.go
- pkg/subscraping/sources/merklemap/merklemap.go
- pkg/subscraping/sources/netlas/netlas.go
- pkg/subscraping/sources/onyphe/onyphe.go
- pkg/subscraping/sources/profundis/profundis.go
- pkg/subscraping/sources/pugrecon/pugrecon.go
- pkg/subscraping/sources/quake/quake.go
- pkg/subscraping/sources/rapiddns/rapiddns.go
- pkg/subscraping/sources/reconcloud/reconcloud.go
- pkg/subscraping/sources/reconeer/reconeer.go
- pkg/subscraping/sources/redhuntlabs/redhuntlabs.go
- pkg/subscraping/sources/riddler/riddler.go
- pkg/subscraping/sources/rsecloud/rsecloud.go
- pkg/subscraping/sources/securitytrails/securitytrails.go
- pkg/subscraping/sources/shodan/shodan.go
- pkg/subscraping/sources/sitedossier/sitedossier.go
- pkg/subscraping/sources/thc/thc.go
- pkg/subscraping/sources/threatbook/threatbook.go
- pkg/subscraping/sources/threatcrowd/threatcrowd.go
- pkg/subscraping/sources/threatminer/threatminer.go
- pkg/subscraping/sources/virustotal/virustotal.go
- pkg/subscraping/sources/waybackarchive/waybackarchive.go
- pkg/subscraping/sources/whoisxmlapi/whoisxmlapi.go
- pkg/subscraping/sources/windvane/windvane.go
- pkg/subscraping/sources/zoomeyeapi/zoomeyeapi.go
- pkg/subscraping/types.go
🚧 Files skipped from review as they are similar to previous changes (28)
- pkg/subscraping/sources/builtwith/builtwith.go
- pkg/subscraping/sources/merklemap/merklemap.go
- pkg/subscraping/sources/alienvault/alienvault.go
- pkg/subscraping/sources/fullhunt/fullhunt.go
- pkg/subscraping/sources/anubis/anubis.go
- pkg/subscraping/sources/zoomeyeapi/zoomeyeapi.go
- pkg/subscraping/sources/crtsh/crtsh.go
- pkg/subscraping/sources/certspotter/certspotter.go
- pkg/subscraping/sources/facebook/ctlogs.go
- pkg/subscraping/sources/digitalyama/digitalyama.go
- pkg/subscraping/sources/domainsproject/domainsproject.go
- pkg/subscraping/sources/profundis/profundis.go
- pkg/subscraping/sources/reconeer/reconeer.go
- pkg/subscraping/sources/rsecloud/rsecloud.go
- pkg/subscraping/sources/securitytrails/securitytrails.go
- pkg/subscraping/sources/shodan/shodan.go
- pkg/subscraping/sources/windvane/windvane.go
- pkg/subscraping/types.go
- pkg/subscraping/sources/rapiddns/rapiddns.go
- pkg/subscraping/sources/sitedossier/sitedossier.go
- pkg/subscraping/sources/github/github.go
- pkg/subscraping/sources/digitorus/digitorus.go
- pkg/subscraping/sources/pugrecon/pugrecon.go
- pkg/subscraping/sources/leakix/leakix.go
- pkg/subscraping/sources/hudsonrock/hudsonrock.go
- pkg/subscraping/sources/threatcrowd/threatcrowd.go
- pkg/subscraping/sources/riddler/riddler.go
- pkg/subscraping/sources/chaos/chaos.go
🧰 Additional context used
🧬 Code graph analysis (6)
pkg/subscraping/sources/netlas/netlas.go (10)
- pkg/subscraping/sources/anubis/anubis.go (1): `Source` (16-21)
- pkg/subscraping/sources/c99/c99.go (1): `Source` (16-23)
- pkg/subscraping/sources/censys/censys.go (1): `Source` (73-80)
- pkg/subscraping/sources/certspotter/certspotter.go (1): `Source` (20-27)
- pkg/subscraping/sources/chaos/chaos.go (1): `Source` (14-21)
- pkg/subscraping/sources/chinaz/chinaz.go (1): `Source` (15-22)
- pkg/subscraping/sources/commoncrawl/commoncrawl.go (1): `Source` (31-36)
- pkg/subscraping/sources/digitalyama/digitalyama.go (1): `Source` (14-21)
- pkg/subscraping/sources/dnsdb/dnsdb.go (1): `Source` (41-48)
- pkg/subscraping/sources/dnsdumpster/dnsdumpster.go (1): `Source` (23-30)
pkg/subscraping/sources/reconcloud/reconcloud.go (16)
- pkg/subscraping/sources/alienvault/alienvault.go (1): `Source` (22-29)
- pkg/subscraping/sources/anubis/anubis.go (1): `Source` (16-21)
- pkg/subscraping/sources/bevigil/bevigil.go (1): `Source` (19-26)
- pkg/subscraping/sources/bufferover/bufferover.go (1): `Source` (25-32)
- pkg/subscraping/sources/builtwith/builtwith.go (1): `Source` (33-40)
- pkg/subscraping/sources/c99/c99.go (1): `Source` (16-23)
- pkg/subscraping/sources/censys/censys.go (1): `Source` (73-80)
- pkg/subscraping/sources/certspotter/certspotter.go (1): `Source` (20-27)
- pkg/subscraping/sources/chaos/chaos.go (1): `Source` (14-21)
- pkg/subscraping/sources/chinaz/chinaz.go (1): `Source` (15-22)
- pkg/subscraping/sources/commoncrawl/commoncrawl.go (1): `Source` (31-36)
- pkg/subscraping/sources/crtsh/crtsh.go (1): `Source` (28-33)
- pkg/subscraping/sources/digitalyama/digitalyama.go (1): `Source` (14-21)
- pkg/subscraping/sources/digitorus/digitorus.go (1): `Source` (17-22)
- pkg/subscraping/sources/dnsdb/dnsdb.go (1): `Source` (41-48)
- pkg/subscraping/sources/dnsdumpster/dnsdumpster.go (1): `Source` (23-30)
pkg/subscraping/sources/commoncrawl/commoncrawl.go (16)
- pkg/subscraping/sources/alienvault/alienvault.go (1): `Source` (22-29)
- pkg/subscraping/sources/bufferover/bufferover.go (1): `Source` (25-32)
- pkg/subscraping/sources/censys/censys.go (1): `Source` (73-80)
- pkg/subscraping/sources/certspotter/certspotter.go (1): `Source` (20-27)
- pkg/subscraping/sources/crtsh/crtsh.go (1): `Source` (28-33)
- pkg/subscraping/sources/dnsdumpster/dnsdumpster.go (1): `Source` (23-30)
- pkg/subscraping/sources/github/github.go (1): `Source` (42-49)
- pkg/subscraping/sources/hackertarget/hackertarget.go (1): `Source` (14-21)
- pkg/subscraping/sources/intelx/intelx.go (1): `Source` (41-48)
- pkg/subscraping/sources/securitytrails/securitytrails.go (1): `Source` (29-36)
- pkg/subscraping/sources/shodan/shodan.go (1): `Source` (15-22)
- pkg/subscraping/sources/sitedossier/sitedossier.go (1): `Source` (22-27)
- pkg/subscraping/sources/threatcrowd/threatcrowd.go (1): `Source` (22-27)
- pkg/subscraping/sources/threatminer/threatminer.go (1): `Source` (21-26)
- pkg/subscraping/sources/virustotal/virustotal.go (1): `Source` (28-35)
- pkg/subscraping/sources/waybackarchive/waybackarchive.go (1): `Source` (16-21)
pkg/subscraping/sources/gitlab/gitlab.go (3)
- pkg/subscraping/types.go (2): `Result` (95-100), `Error` (108-108)
- pkg/subscraping/sources/anubis/anubis.go (1): `Source` (16-21)
- pkg/subscraping/sources/github/github.go (1): `Source` (42-49)
pkg/subscraping/sources/bevigil/bevigil.go (16)
- pkg/subscraping/sources/alienvault/alienvault.go (1): `Source` (22-29)
- pkg/subscraping/sources/anubis/anubis.go (1): `Source` (16-21)
- pkg/subscraping/sources/bufferover/bufferover.go (1): `Source` (25-32)
- pkg/subscraping/sources/builtwith/builtwith.go (1): `Source` (33-40)
- pkg/subscraping/sources/c99/c99.go (1): `Source` (16-23)
- pkg/subscraping/sources/censys/censys.go (1): `Source` (73-80)
- pkg/subscraping/sources/certspotter/certspotter.go (1): `Source` (20-27)
- pkg/subscraping/sources/chaos/chaos.go (1): `Source` (14-21)
- pkg/subscraping/sources/chinaz/chinaz.go (1): `Source` (15-22)
- pkg/subscraping/sources/commoncrawl/commoncrawl.go (1): `Source` (31-36)
- pkg/subscraping/sources/crtsh/crtsh.go (1): `Source` (28-33)
- pkg/subscraping/sources/digitalyama/digitalyama.go (1): `Source` (14-21)
- pkg/subscraping/sources/digitorus/digitorus.go (1): `Source` (17-22)
- pkg/subscraping/sources/dnsdb/dnsdb.go (1): `Source` (41-48)
- pkg/subscraping/sources/dnsdumpster/dnsdumpster.go (1): `Source` (23-30)
- pkg/subscraping/sources/dnsrepo/dnsrepo.go (1): `Source` (15-22)
pkg/subscraping/sources/redhuntlabs/redhuntlabs.go (11)
- pkg/subscraping/sources/anubis/anubis.go (1): `Source` (16-21)
- pkg/subscraping/sources/censys/censys.go (1): `Source` (73-80)
- pkg/subscraping/sources/certspotter/certspotter.go (1): `Source` (20-27)
- pkg/subscraping/sources/chaos/chaos.go (1): `Source` (14-21)
- pkg/subscraping/sources/chinaz/chinaz.go (1): `Source` (15-22)
- pkg/subscraping/sources/commoncrawl/commoncrawl.go (1): `Source` (31-36)
- pkg/subscraping/sources/crtsh/crtsh.go (1): `Source` (28-33)
- pkg/subscraping/sources/digitalyama/digitalyama.go (1): `Source` (14-21)
- pkg/subscraping/sources/digitorus/digitorus.go (1): `Source` (17-22)
- pkg/subscraping/sources/dnsdb/dnsdb.go (1): `Source` (41-48)
- pkg/subscraping/sources/dnsdumpster/dnsdumpster.go (1): `Source` (23-30)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Test Builds (windows-latest)
- GitHub Check: Test Builds (macOS-latest)
- GitHub Check: Test Builds (ubuntu-latest)
- GitHub Check: Analyze (go)
🔇 Additional comments (74)
pkg/subscraping/sources/netlas/netlas.go (1)
43-43: LGTM! Request tracking implemented correctly. The request tracking implementation follows the established pattern consistently:
- Field added to the `Source` struct (line 43)
- Properly initialized to 0 at the start of `Run` (line 51)
- Incremented immediately before each HTTP request (lines 68, 126)
- Exposed in the `Statistics()` method (line 207)

The tracking accurately captures both HTTP requests made by this source (GET for domain count, POST for domain download). Since all operations execute sequentially within a single goroutine, no atomic operations are needed.
Also applies to: 51-51, 68-68, 126-126, 207-207
pkg/subscraping/sources/chinaz/chinaz.go (4)
20-20: LGTM! Field declaration follows existing pattern. The `requests` field is consistent with the existing `errors` and `results` tracking fields.

27-29: LGTM! Proper initialization of tracking fields. The `requests` counter is correctly initialized alongside `errors` and `results` at the start of each `Run` invocation.

43-44: LGTM! Request counter correctly placed. The increment is properly positioned immediately before the HTTP request, ensuring all request attempts are counted regardless of outcome.

109-117: LGTM! Statistics correctly expose request count. The `Requests` field is properly included in the `Statistics` return value, consistent with other tracked metrics.

pkg/subscraping/sources/waybackarchive/waybackarchive.go (3)
20-20: LGTM! Request counter properly declared and initialized. The `requests` field follows the same pattern as the existing `errors` and `results` counters, and is correctly initialized at the start of `Run()`.

Also applies to: 28-28

36-36: Correct placement of request counter increment. The counter is incremented immediately before the HTTP call, ensuring both successful and failed requests are tracked.

107-107: Request count properly exposed in statistics. The `Requests` field is correctly included in the `Statistics` return, enabling the `-stats` flag to display request counts as intended.

pkg/subscraping/sources/gitlab/gitlab.go (5)
12-12: Atomic counters correctly address the race condition. The migration to `atomic.Int32` for the `errors`, `results`, and `requests` fields aligns with the pattern used in `github.go` and properly addresses the previously identified race condition from concurrent goroutine access.

Also applies to: 24-26

40-42: Counter reset ensures clean state per invocation. Initializing all atomic counters to zero at the start of `Run()` ensures accurate statistics for each enumeration run.

72-72: Atomic increments in the main enumerate flow look correct. Request and error counters are properly incremented using atomic operations in the synchronous parts of `enumerate()`.

Also applies to: 76-76, 87-87

139-139: Atomic error increment is correct here.

181-188: Statistics correctly reads atomic counters. Using `.Load()` to read and casting to `int` for the Statistics struct is the correct approach for safely retrieving atomic values.

pkg/subscraping/sources/driftnet/driftnet.go (4)
31-31: LGTM! Consistent with existing counter fields. The `requests` field follows the same `atomic.Int32` pattern as the existing `errors` and `results` counters, ensuring thread-safe access across the concurrent goroutines spawned for each endpoint.

69-69: LGTM! Proper initialization before concurrent access. The counter is correctly reset at the start of each `Run` invocation, before the goroutines are spawned, ensuring a clean state and matching the initialization pattern of the error and result counters.

150-150: LGTM! Request tracking correctly placed. The counter increment is positioned immediately before the HTTP request, ensuring all API calls are tracked (including failed ones). Since `runSubsource` is invoked once per endpoint in concurrent goroutines, the atomic increment safely counts all requests across the 4 configured endpoints.

129-129: LGTM! Statistics correctly expose the request count. The atomic `Load()` safely reads the request count and returns it as part of the statistics, matching the pattern used for errors and results.

pkg/subscraping/sources/threatbook/threatbook.go (4)
33-33: LGTM: Request tracking field added. The `requests` field is correctly declared and follows the same pattern as other statistics fields in the struct.

40-42: LGTM: Initialization is correct. The `requests` counter is properly initialized to 0 at the start of each `Run` invocation, consistent with other statistics fields.

56-57: LGTM: Request increment correctly placed. The request counter is incremented immediately before the API call, which correctly tracks the request regardless of success or failure. This aligns with the PR's objective to monitor actual API usage.

131-139: LGTM: Statistics correctly expose request count. The `Requests` field is properly included in the returned `Statistics`, enabling the `-stats` flag to display request counts as described in the PR objectives.

pkg/subscraping/sources/reconcloud/reconcloud.go (4)
32-32: LGTM! Consistent field addition. The `requests` field addition follows the established pattern across all sources in this PR and is correctly positioned alongside other tracking metrics.

38-40: LGTM! Proper initialization. The request counter is correctly initialized to 0 at the start of `Run`, consistent with the existing error and result counters.

48-48: LGTM! Correct request tracking. The request counter is incremented at the appropriate location, immediately before the HTTP request, accurately tracking the single API call made by this source.

107-114: LGTM! Statistics correctly updated. The `Statistics()` method now returns the request count, properly surfacing the tracked metric for the `-stats` flag output as intended by the PR objectives.

pkg/subscraping/sources/commoncrawl/commoncrawl.go (5)
35-35: LGTM! Field addition follows established pattern. The `requests int` field is consistent with other sources and appropriate for single-goroutine access.
41-43: LGTM! Proper initialization. The `requests` field is correctly initialized alongside other metrics at the start of `Run`.
51-52: LGTM! Initial request correctly tracked. The increment before `session.SimpleGet(ctx, indexURL)` accurately counts the index fetch request.
139-140: LGTM! Subdomain search requests correctly tracked. The increment before `session.Get(...)` accurately counts each subdomain search request made in the loop.
127-127: LGTM! Statistics correctly updated. The `Requests` field is properly included in the statistics output, enabling request tracking for the `-stats` flag.

pkg/subscraping/sources/threatminer/threatminer.go (4)
25-25: LGTM! The `requests` field is appropriately added to track HTTP requests, consistent with the existing metric fields.
31-33: LGTM! The initialization of `s.requests` is consistent with the existing pattern for `s.errors` and `s.results`.
41-42: LGTM! Correct placement for request tracking. The increment occurs immediately before the HTTP request, ensuring all attempted requests are counted (including those that may fail). This is the correct behavior for tracking API quota usage.
98-105: LGTM! The `Requests` field is properly included in the Statistics return value, consistent with the other tracked metrics.

pkg/subscraping/sources/thc/thc.go (3)
27-27: LGTM! Field addition and initialization are correct. The `requests` field addition to the Source struct and its initialization in `Run()` follow the established pattern for other metrics (errors, results). The field is appropriately scoped as unexported. Also applies to: 42-42
68-68: LGTM! Request increment is correctly placed. The counter is incremented before the HTTP POST call, ensuring all request attempts are tracked (including those that may fail). The placement within the pagination loop correctly accounts for multiple requests when fetching paginated results.
133-133: LGTM! Statistics exposure is consistent. The `Requests` field is properly exposed in the `Statistics()` method return value, following the same pattern as other metrics.

pkg/subscraping/sources/intelx/intelx.go (1)
46-46: LGTM! Request tracking implemented consistently. The request counter is properly initialized, incremented before each HTTP call (1 POST + N GETs in the polling loop), and exposed via `Statistics()`. The implementation follows the established pattern used for error and result tracking across all sources.
Also applies to: 60-60, 90-90, 118-118, 190-190
pkg/subscraping/sources/virustotal/virustotal.go (1)
33-33: LGTM! Request tracking correctly handles pagination. The implementation properly counts each paginated request in the cursor-based loop and follows the consistent pattern established across all sources. The counter is initialized, incremented before each HTTP call, and exposed via `Statistics()`.
Also applies to: 42-42, 65-65, 135-135
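The increment-before-call pattern used in these paginated sources can be sketched as follows. This is an illustrative simplification: `enumerate` and `fetch` are hypothetical stand-ins, not subfinder's actual `session.Get` API.

```go
package main

import "fmt"

// enumerate simulates a cursor-paginated source: each loop iteration counts
// one request *before* the call, so failed attempts would still be tracked.
func enumerate(pages int) (requests, results int) {
	cursor := 0
	for {
		requests++ // count the attempt before the "HTTP call"
		items, next := fetch(cursor, pages)
		results += items
		if next < 0 {
			return requests, results
		}
		cursor = next
	}
}

// fetch fakes a paginated response: two results per page, plus the next
// cursor, or -1 when there are no more pages.
func fetch(cursor, pages int) (items, next int) {
	if cursor+1 >= pages {
		return 2, -1
	}
	return 2, cursor + 1
}

func main() {
	req, res := enumerate(3)
	fmt.Printf("requests=%d results=%d\n", req, res)
}
```

The key property is that the request count matches the number of fetch attempts, not the number of successful pages.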
pkg/subscraping/sources/dnsdb/dnsdb.go (1)
46-46: LGTM! Request tracking accounts for both offset lookup and data retrieval. The implementation correctly counts both the initial rate limit/offset request (line 75, executed within `getMaxOffset()` at line 215) and each subsequent data retrieval request (line 98) in the pagination loop. The pattern is consistent with other sources. Also applies to: 55-55, 75-75, 98-98, 206-206
pkg/subscraping/sources/bevigil/bevigil.go (1)
24-24: LGTM! Request tracking follows the established pattern. The implementation is straightforward and consistent with the pattern used across all other sources in this PR. The counter is properly initialized, incremented before the single HTTP request, and exposed via `Statistics()`. This matches the approach used in the 15+ sources shown in the relevant code snippets.
Also applies to: 32-32, 48-48, 116-116
pkg/subscraping/sources/censys/censys.go (1)
78-78: LGTM! Request tracking implemented correctly. The request counter follows the established pattern for error and result tracking in this source. The increment is correctly placed before the HTTP call (line 141), ensuring all request attempts are counted, including those that may fail. The initialization and statistics exposure are properly handled. Also applies to: 87-87, 141-141, 230-230

pkg/subscraping/sources/dnsdumpster/dnsdumpster.go (4)
28-28: LGTM: Request tracking field added. The `requests` field is properly added to the Source struct and follows the same pattern as the existing `errors` and `results` fields.
37-37: LGTM: Proper initialization. The `requests` counter is correctly initialized to zero at the start of each `Run()` execution, consistent with the existing error and result counters.
51-51: LGTM: Request increment correctly placed. The increment is properly positioned immediately before the HTTP call and after API key validation. This ensures that only actual HTTP requests are counted, and skipped sources don't increment the counter.
113-113: LGTM: Statistics properly updated. The `Requests` field is correctly included in the returned statistics, enabling the `-stats` flag to display per-source request counts as intended by the PR.

pkg/subscraping/sources/quake/quake.go (4)
39-39: LGTM! Field addition is consistent with existing metrics. The `requests` field follows the same pattern as `errors` and `results`, and is appropriately unexported for encapsulation.
48-48: LGTM! Proper initialization alongside other counters. Explicitly resetting the counter at the start of `Run()` ensures accurate tracking across multiple invocations.
74-74: LGTM! Request counting is accurate for paginated API calls. The increment is correctly placed before the HTTP call within the pagination loop, ensuring all requests (including failed ones) are tracked. This accurately reflects API usage for quota management.
157-165: LGTM! Statistics correctly includes request count. The `Requests` field is properly returned alongside other metrics, enabling the `-stats` flag output as intended by the PR objectives.

pkg/subscraping/sources/hackertarget/hackertarget.go (3)
19-19: LGTM: Request tracking field added. The `requests` field is appropriately added to track HTTP requests. Since the `Run` method operates in a single goroutine, a simple `int` without atomic operations is sufficient.
28-28: LGTM: Request counter initialization. The request counter is correctly initialized at the start of the `Run` method.
117-117: LGTM: Statistics correctly expose request count. The `Requests` field is properly included in the returned statistics.

pkg/subscraping/sources/redhuntlabs/redhuntlabs.go (5)
31-31: LGTM! Request tracking field follows established pattern. The `requests int` field addition is consistent with the pattern used across other sources (anubis, commoncrawl, crtsh, etc.).
39-39: LGTM! Proper initialization. Initializing `s.requests = 0` at the start of `Run()` ensures a clean state for each enumeration, consistent with the existing pattern for errors and results.
61-61: LGTM! Request increment correctly placed. The increment before the initial HTTP request correctly tracks the API call.
88-88: LGTM! Pagination request increment correctly placed. The increment before each paginated HTTP request correctly tracks API calls in the pagination loop.
155-162: LGTM! Statistics output correctly includes request count. The `Requests: s.requests` field is properly included in the returned statistics, making the tracked request count available to the runner's stats output.

pkg/subscraping/sources/dnsrepo/dnsrepo.go (4)
20-20: Field addition looks good. The `requests` field is correctly added to track HTTP request counts, following the same pattern as the existing `errors` and `results` fields.
32-32: Initialization is correct. The `requests` counter is properly reset to 0 at the start of each `Run()` invocation, ensuring accurate per-run tracking.
55-55: Request increment is correctly placed. The counter is incremented immediately before the HTTP request, ensuring it accurately reflects the number of API calls attempted regardless of success or failure.
118-126: Statistics correctly exposes the request count. The `Requests` field is properly returned in the statistics payload and follows the established pattern used across the vast majority of sources in this codebase (50 out of 53 sources use plain `int` fields without synchronization).

pkg/subscraping/sources/onyphe/onyphe.go (4)
39-39: LGTM! Field addition is consistent with existing counters. The `requests` field follows the same pattern as `errors` and `results`, maintaining consistency within the Source struct.
48-48: LGTM! Proper initialization. The counter is correctly reset at the start of each Run invocation, consistent with the initialization pattern for `errors` and `results`.
78-78: LGTM! Request counting correctly placed. The increment is positioned inside the pagination loop right before each HTTP call, ensuring all requests (including those that may fail) are accurately counted. This placement correctly tracks requests across multiple pages.
169-169: LGTM! Statistics correctly populated. The request count is properly included in the returned Statistics struct, making the tracked data available to consumers.
pkg/subscraping/sources/whoisxmlapi/whoisxmlapi.go (4)
36-36: LGTM! Field addition follows existing pattern. The `requests` field is added consistently with the existing `errors` and `results` fields in the struct.
45-45: LGTM! Initialization is correctly placed. The `s.requests` counter is properly initialized before the goroutine starts, consistent with the pattern for `errors` and `results`.
117-125: LGTM! Statistics output correctly includes the new field. The `Requests` field is properly included in the `Statistics()` return value, maintaining consistency with the other tracked metrics.
pkg/subscraping/sources/c99/c99.go (1)
21-21: Request tracking implementation looks correct. The request counter is properly initialized, incremented before the HTTP request, and exposed via `Statistics()`. The placement ensures that only actual request attempts are counted, correctly skipping the increment when no API key is available (lines 48-52).
Also applies to: 40-40, 55-55, 129-129
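The guard described above (no API key, no counted request) can be sketched like this. `doRequest` and its signature are hypothetical simplifications for illustration, not c99's actual code:

```go
package main

import "fmt"

// doRequest models the c99-style key check: when no API key is configured
// the source returns early, so the request counter is never incremented.
func doRequest(keys []string, requests *int) bool {
	if len(keys) == 0 {
		return false // skipped: no key, no request counted
	}
	*requests++ // increment just before the would-be HTTP call
	return true
}

func main() {
	requests := 0
	doRequest(nil, &requests)               // skipped, not counted
	doRequest([]string{"key-a"}, &requests) // counted
	fmt.Println("requests =", requests)
}
```

This keeps the stats honest for sources that are skipped entirely for lack of credentials.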
pkg/subscraping/sources/bufferover/bufferover.go (3)
37-39: LGTM! Initialization follows the established pattern. The counter is properly initialized to 0 at the start of `Run()`, consistent with the existing `errors` and `results` initialization.
60-60: LGTM! Request tracking correctly counts all attempts. The increment is placed before the HTTP request and counts all attempts (including failures), which is appropriate for tracking API usage and quota consumption as specified in the PR objectives.
141-141: LGTM! Statistics exposure is correct. The `Requests` field is properly exposed in the `Statistics()` method, enabling the `-stats` flag to display request counts as intended.
```go
timeTaken time.Duration
errors    int
results   int
requests  int
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check atomic usage patterns across subscraping sources
echo "=== Sources using atomic types for requests/errors/results ==="
rg -n "atomic\.(Int32|Int64|Int)" pkg/subscraping/sources/ -A 2 -B 2 | head -100
echo -e "\n=== Sources with plain int fields (like c99.go) ==="
rg -n "^\s+(requests|errors|results)\s+int\s*$" pkg/subscraping/sources/ -C 2 | head -100
echo -e "\n=== File structure of c99.go to understand the context ==="
head -50 pkg/subscraping/sources/c99/c99.go

Repository: projectdiscovery/subfinder
Length of output: 8530
🏁 Script executed:
#!/bin/bash
# Examine c99.go more thoroughly
echo "=== Full c99.go file ==="
wc -l pkg/subscraping/sources/c99/c99.go
echo -e "\n=== Full Run() method and Statistics() method ==="
cat -n pkg/subscraping/sources/c99/c99.go | sed -n '33,130p'

Repository: projectdiscovery/subfinder
Length of output: 3169
🏁 Script executed:
#!/bin/bash
# Check how Statistics() is used and when it's called
echo "=== Check Statistics() usage in the codebase ==="
rg -n "Statistics()" --type go -B 2 -A 2 | head -50
echo -e "\n=== Check for -stats flag implementation ==="
rg -n "stats" pkg/subscraping/ --type go -i | grep -i flag | head -20
echo -e "\n=== Check if consumers wait for channel to close before calling Statistics() ==="
rg -n "Range\|<-" pkg/subscraping/ --type go -A 1 -B 1 | grep -A 3 -B 3 Statistics | head -30

Repository: projectdiscovery/subfinder
Length of output: 2864
🏁 Script executed:
#!/bin/bash
# Check Runner implementation to understand timing of Statistics() calls
echo "=== Check agent.go or runner implementation ==="
fd -t f "agent|runner" pkg/subscraping/ --type go | head -5
echo -e "\n=== Look at PassiveAgent interface and implementation ==="
rg -n "GetStatistics|passiveAgent" pkg/ --type go -B 2 -A 5 | head -80
echo -e "\n=== Check how sources are run in the enumeration flow ==="
rg -n "\.Run\(" pkg/subscraping/ --type go -B 2 -A 3 | head -60

Repository: projectdiscovery/subfinder
Length of output: 5326
🏁 Script executed:
#!/bin/bash
# Examine the complete enumeration flow
echo "=== Check enumerate.go around the GetStatistics call ==="
sed -n '50,210p' pkg/runner/enumerate.go | cat -n
echo -e "\n=== Check PassiveAgent.EnumerateSubdomainsWithCtx implementation ==="
sed -n '1,80p' pkg/passive/passive.go | cat -n

Repository: projectdiscovery/subfinder
Length of output: 10137
Use atomic types for thread-safe statistics to match other sources.
The requests, errors, and results fields are modified in the Run() goroutine and read by Statistics(). While the current enumeration flow synchronizes these accesses via wg.Wait() before calling GetStatistics(), this is not a defensive pattern and creates inconsistency with other sources.
Three sources already use atomic.Int32 for these fields: gitlab.go, github.go, and driftnet.go. Consider migrating c99.go to use atomic.Int32 for consistency and to guard against future code changes that may not maintain the current synchronization guarantee.
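The suggested migration can be sketched as below. This is a minimal, self-contained illustration of the `atomic.Int32` counter pattern, assuming a simplified Source with only the counters; subfinder's real struct has more fields.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Source holds per-source stats as atomic counters, safe to bump from
// concurrent goroutines (the pattern gitlab/github/driftnet already use).
type Source struct {
	requests atomic.Int32
	results  atomic.Int32
}

func (s *Source) run(endpoints int) {
	var wg sync.WaitGroup
	for i := 0; i < endpoints; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.requests.Add(1) // safe across goroutines, unlike s.requests++
			s.results.Add(2)  // pretend each endpoint yields two results
		}()
	}
	wg.Wait()
}

func main() {
	var s Source
	s.run(4)
	fmt.Printf("requests=%d results=%d\n", s.requests.Load(), s.results.Load())
}
```

`Add` and `Load` pair writes and reads without needing the runner's `wg.Wait()` to double as a memory barrier, which is exactly the defensiveness the comment asks for.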
🤖 Prompt for AI Agents
In @pkg/subscraping/sources/c99/c99.go at line 21, The fields requests, errors,
and results in the c99 source struct should be changed from plain int to
atomic.Int32 (like gitlab.go/github.go/driftnet.go) to make stats thread-safe:
replace their declarations with atomic.Int32, import sync/atomic (or use the
atomic package type), increment them using Add(1) in Run(), and read them using
Load() in Statistics(); ensure zero-value initialization is fine and update any
casts when returning numbers to match the Statistics() return types.
```go
timeTaken time.Duration
errors    int
results   int
requests  int
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if other sources use atomic operations for request tracking
echo "=== Searching for request tracking in sources ==="
rg -n 'requests\s+(int|atomic)' pkg/subscraping/sources/ --type go
echo ""
echo "=== Checking for atomic operations in request handling ==="
rg -n 'atomic\..*[Rr]equest|sync\.Mutex.*[Rr]equest' pkg/subscraping/sources/ --type go

Repository: projectdiscovery/subfinder
Length of output: 3516
🏁 Script executed:
#!/bin/bash
# Examine the FOFA source implementation
echo "=== FOFA source implementation ==="
cat -n pkg/subscraping/sources/fofa/fofa.go | sed -n '25,65p'
echo ""
echo "=== FOFA Statistics method ==="
cat -n pkg/subscraping/sources/fofa/fofa.go | sed -n '135,145p'

Repository: projectdiscovery/subfinder
Length of output: 1999
🏁 Script executed:
#!/bin/bash
# Check how other sources implement request tracking to understand concurrent access patterns
echo "=== Checking Run() method signatures in sources ==="
rg -n 'func \(.*Source\) Run\(' pkg/subscraping/sources/ --type go -A 3 | head -40
echo ""
echo "=== Check for goroutine spawning patterns ==="
rg -n 'go ' pkg/subscraping/sources/fofa/fofa.go --type go -B 2 -A 2

Repository: projectdiscovery/subfinder
Length of output: 3448
Use atomic.Int32 for thread-safe request counting to match safer implementations in the codebase.
The request tracking logic is correct, but the plain int field creates a data race: Statistics() can be called while the Run() goroutine is active, resulting in unsafe concurrent access. While most sources in the codebase currently use plain int, newer implementations (gitlab, github, driftnet) use atomic.Int32, indicating the codebase is moving toward thread-safe counters.
Replace the plain int with atomic.Int32 for requests and use atomic.AddInt32() to increment the counter for consistency and safety.
🤖 Prompt for AI Agents
In @pkg/subscraping/sources/fofa/fofa.go at line 30, Replace the plain int field
named requests with atomic.Int32 on the struct (use sync/atomic's atomic.Int32
type), update all increments in Run() (or any goroutine) to use requests.Add(1)
/ atomic.AddInt32(&requests, 1) style, and change any reads (e.g., in
Statistics()) to use requests.Load() / atomic.LoadInt32(&requests) so all
accesses are thread-safe; ensure you import sync/atomic or use the atomic.Int32
methods consistently and update any type assumptions where requests was
previously treated as an int.
```go
htSearchUrl = fmt.Sprintf("%s&apikey=%s", htSearchUrl, randomApiKey)

s.requests++
```
Critical: Duplicate API key appending breaks URL construction.
Lines 42-43 duplicate the API key logic already handled in lines 38-39, causing two issues:
- When `randomApiKey` is not empty: The API key gets appended twice, creating an invalid URL with duplicate `apikey` parameters.
- When `randomApiKey` is empty: Line 42 appends `&apikey=` with an empty value, breaking the request.
Lines 42-43 should be removed entirely.
🐛 Proposed fix
```diff
 if randomApiKey != "" {
 	htSearchUrl = fmt.Sprintf("%s&apikey=%s", htSearchUrl, randomApiKey)
 }
-htSearchUrl = fmt.Sprintf("%s&apikey=%s", htSearchUrl, randomApiKey)
-
 s.requests++
 resp, err := session.SimpleGet(ctx, htSearchUrl)
```

Note: The `s.requests++` increment on line 44 is correctly placed before the HTTP call and is safe without atomics since it executes in a single goroutine.
🤖 Prompt for AI Agents
In @pkg/subscraping/sources/hackertarget/hackertarget.go around lines 42 - 44,
Remove the duplicated API-key append that mutates htSearchUrl a second time (the
assignment using fmt.Sprintf with randomApiKey) so the URL isn’t built with a
duplicate or empty apikey parameter; ensure the single API-key append remains
only in the earlier logic that already handles randomApiKey (or, alternatively,
gate any append with randomApiKey != ""), and leave the s.requests++ increment
as-is before the HTTP call.
Summary
- Adds a `Requests` field to the `Statistics` struct to track HTTP requests per source
- Adds a `Requests` column to the `-stats` output table

This helps users monitor API usage and debug quota consumption issues.
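For reference, the resulting struct looks roughly like this. Field names follow the PR and the review comments; the exact ordering and any additional fields in subfinder's real `subscraping.Statistics` may differ.

```go
package main

import (
	"fmt"
	"time"
)

// Statistics approximates the per-source stats payload with the new
// Requests field alongside the existing metrics.
type Statistics struct {
	TimeTaken time.Duration
	Errors    int
	Results   int
	Requests  int
	Skipped   bool
}

func main() {
	stats := Statistics{
		TimeTaken: 2 * time.Second,
		Results:   42,
		Requests:  3,
	}
	fmt.Printf("requests=%d results=%d\n", stats.Requests, stats.Results)
}
```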
Example Output
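A hypothetical illustration of the new column (source names, values, and column order here are made up; the actual layout comes from the runner's stats table):

```text
Source          Duration   Results   Requests   Errors
crtsh           1.2s       120       1          0
hackertarget    0.8s       45        1          0
virustotal      2.1s       60        3          0
```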
Test plan
- Run `subfinder -d example.com -stats` and verify request counts appear

Closes #1698
Related to #1562
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.