Conversation
---
""" WalkthroughThis change modernizes code across several files by updating loop constructs, string formatting, and map type usage to leverage more contemporary Go idioms. Adjustments include replacing traditional for-loops with range-based loops, using Changes
**Sequence Diagram(s)**

```mermaid
sequenceDiagram
    participant Caller
    participant ModernizedFunction
    Caller->>ModernizedFunction: Call function (with modernized loop or formatting)
    ModernizedFunction->>ModernizedFunction: Use range-based loop or fmt.Appendf
    ModernizedFunction-->>Caller: Return result
```
**Assessment against linked issues: Out-of-scope changes**

No out-of-scope changes found.
---
Actionable comments posted: 2
🔭 Outside diff range comments (4)
v2/pkg/resolve/resolve.go (1)

`76-88`: **Wildcard check loop breaks on older toolchains**

Same issue with `maxWildcardChecks` (a constant `int`): `range` over an integer requires Go 1.22+. For a portable version, use:

```diff
- for range maxWildcardChecks {
+ for i := 0; i < maxWildcardChecks; i++ {
```

v2/pkg/testutils/integration.go (1)

`30-35`: **`strings.SplitSeq` needs Go 1.24 and the loop variable is misused**

`strings.SplitSeq` only exists from Go 1.24 onward, so this fails to compile on earlier toolchains; note that where it is available it yields the split values (an `iter.Seq[string]`), not indices. A portable fix:

```diff
- items := strings.SplitSeq(string(data), "\n")
- for i := range items {
- 	if i != "" {
- 		parts = append(parts, i)
- 	}
- }
+ for _, item := range strings.Split(strings.TrimSpace(string(data)), "\n") {
+ 	if item != "" {
+ 		parts = append(parts, item)
+ 	}
+ }
```

v2/pkg/subscraping/sources/crtsh/crtsh.go (2)

`121-129`: **`SplitSeq` unavailable here and the loop misuses the index**

Need standard split and value iteration:

```diff
- for subdomain := range strings.SplitSeq(data, "\n") {
+ for _, subdomain := range strings.Split(data, "\n") {
```

`155-162`: **Repeat of the previous issue**

```diff
- for sub := range strings.SplitSeq(subdomain.NameValue, "\n") {
+ for _, sub := range strings.Split(subdomain.NameValue, "\n") {
```
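The loop fixes above share one portable idiom. The sketch below is not from the PR (the function and input are illustrative); it shows the pre-Go-1.22/1.24 forms the review recommends, with comments noting which newer constructs they replace:

```go
package main

import (
	"fmt"
	"strings"
)

// splitNonEmpty mirrors the suggested integration.go fix: strings.Split
// plus value (not index) iteration. strings.SplitSeq only exists from
// Go 1.24, so this form compiles on any supported toolchain.
func splitNonEmpty(data string) []string {
	var parts []string
	for _, item := range strings.Split(strings.TrimSpace(data), "\n") {
		if item != "" {
			parts = append(parts, item)
		}
	}
	return parts
}

func main() {
	// Classic index loop: `for range maxChecks` (range over an int)
	// needs Go 1.22+, while this form works everywhere.
	const maxChecks = 3
	checks := 0
	for i := 0; i < maxChecks; i++ {
		checks++
	}
	fmt.Println(checks)
	fmt.Println(splitNonEmpty("a.example.com\n\nb.example.com\n"))
}
```

Both forms behave identically to their newer counterparts; the only trade-off is slightly more verbose syntax.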
🧹 Nitpick comments (1)
v2/pkg/subscraping/sources/netlas/netlas.go (1)
`100-122`: **Avoid a superfluous string conversion when the payload is already `[]byte`.**

`jsonRequestBody` is a `[]byte`, yet we allocate a new string and wrap it with `strings.NewReader`. Use `bytes.NewReader(jsonRequestBody)` to save an allocation and an extra copy.

```diff
- resp, err = session.HTTPRequest(ctx, http.MethodPost, apiUrl, "", map[string]string{
- 	"accept":       "application/json",
- 	"X-API-Key":    randomApiKey,
- 	"Content-Type": "application/json"}, strings.NewReader(string(jsonRequestBody)), subscraping.BasicAuth{})
+ resp, err = session.HTTPRequest(ctx, http.MethodPost, apiUrl, "", map[string]string{
+ 	"accept":       "application/json",
+ 	"X-API-Key":    randomApiKey,
+ 	"Content-Type": "application/json"}, bytes.NewReader(jsonRequestBody), subscraping.BasicAuth{})
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`.github/workflows/build-test.yml` is excluded by `!**/*.yml`
📒 Files selected for processing (9)
- `v2/pkg/resolve/resolve.go` (2 hunks)
- `v2/pkg/subscraping/sources/commoncrawl/commoncrawl.go` (1 hunks)
- `v2/pkg/subscraping/sources/crtsh/crtsh.go` (2 hunks)
- `v2/pkg/subscraping/sources/fofa/fofa.go` (1 hunks)
- `v2/pkg/subscraping/sources/hunter/hunter.go` (1 hunks)
- `v2/pkg/subscraping/sources/netlas/netlas.go` (1 hunks)
- `v2/pkg/subscraping/sources/quake/quake.go` (1 hunks)
- `v2/pkg/subscraping/sources/securitytrails/securitytrails.go` (1 hunks)
- `v2/pkg/testutils/integration.go` (1 hunks)
🔇 Additional comments (4)
v2/pkg/subscraping/sources/quake/quake.go (1)
`61-64`:

```shell
#!/bin/bash
# Check the Go version specified in v2/go.mod
grep -n '^go ' v2/go.mod || echo "Go version not specified in v2/go.mod"
```

v2/pkg/subscraping/sources/securitytrails/securitytrails.go (1)

`62-66`: **LGTM – the switch to `fmt.Appendf` is correct and avoids the intermediate string allocation.**

v2/pkg/subscraping/sources/hunter/hunter.go (1)

`60-65`: **LGTM – more efficient query construction.**

v2/pkg/subscraping/sources/fofa/fofa.go (1)

`56-59`: **LGTM – matches the idiom used in other sources.**
Actionable comments posted: 1
🔭 Outside diff range comments (1)
v2/pkg/subscraping/sources/pugrecon/pugrecon.go (1)
`81-82`: **Align response cleanup with the new helper**

`defer resp.Body.Close()` bypasses the new error-aware helper and duplicates logic. Switching keeps the pattern uniform and catches close errors.

```diff
- defer resp.Body.Close()
+ defer session.DiscardHTTPResponse(resp)
```
♻️ Duplicate comments (2)
v2/pkg/runner/outputter.go (2)
`158-164`: **Same masking issue as above**

See the previous comment; the identical error–flush pattern appears here.

`233-238`: **Same masking issue as above**

See the first comment for the suggested fix/refactor.
🧹 Nitpick comments (3)
v2/pkg/subscraping/sources/pugrecon/pugrecon.go (1)
`90-91`: **Nit: direct string wrap with `fmt.Errorf` unnecessary**

`fmt.Errorf("%s", errorMsg)` is fine, but `errors.New(errorMsg)` is clearer and cheaper when no formatting is needed.

```diff
- results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: fmt.Errorf("%s", errorMsg)}
+ results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: errors.New(errorMsg)}
```

(Requires the `errors` import.)

v2/pkg/runner/runner.go (2)
`97-99`: **Defer would guarantee close on early returns**

If `EnumerateMultipleDomainsWithCtx` panics or returns before reaching these lines, the file handle stays open. Use `defer f.Close()` with error logging to cover all exits.

`144-146`: **Duplicate close-and-log blocks – consider a helper**

The two identical close/error-log snippets invite DRY violations. A small helper will reduce repetition and future misses. Example:

```go
func safeClose(path string, c io.Closer) {
	if err := c.Close(); err != nil {
		gologger.Error().Msgf("Error closing file %s: %s", path, err)
	}
}
```

Then `defer safeClose(outputFile, file)`.

Also applies to: 164-166
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`.github/workflows/build-test.yml` is excluded by `!**/*.yml`
📒 Files selected for processing (9)
- `v2/pkg/runner/config.go` (1 hunks)
- `v2/pkg/runner/outputter.go` (3 hunks)
- `v2/pkg/runner/runner.go` (3 hunks)
- `v2/pkg/subscraping/agent.go` (1 hunks)
- `v2/pkg/subscraping/sources/crtsh/crtsh.go` (4 hunks)
- `v2/pkg/subscraping/sources/hunter/hunter.go` (2 hunks)
- `v2/pkg/subscraping/sources/pugrecon/pugrecon.go` (1 hunks)
- `v2/pkg/subscraping/sources/virustotal/virustotal.go` (1 hunks)
- `v2/pkg/subscraping/sources/whoisxmlapi/whoisxmlapi.go` (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- v2/pkg/subscraping/sources/virustotal/virustotal.go
🚧 Files skipped from review as they are similar to previous changes (2)
- v2/pkg/subscraping/sources/hunter/hunter.go
- v2/pkg/subscraping/sources/crtsh/crtsh.go
🔇 Additional comments (3)
v2/pkg/subscraping/agent.go (1)
`117-124`:

```shell
#!/bin/bash
set -e
echo "Lines 110–140 of v2/pkg/subscraping/agent.go:"
sed -n '110,140p' v2/pkg/subscraping/agent.go
```

v2/pkg/subscraping/sources/whoisxmlapi/whoisxmlapi.go (1)

`70-75`: **LGTM – consistent cleanup via `DiscardHTTPResponse`**

Nice replacement of the raw `Close()` with `session.DiscardHTTPResponse`, keeping resource handling consistent with the other sources.

v2/pkg/runner/config.go (1)

`20-24`: **Good defensive close with logging**

Wrapping the close in a defer + error check is a solid improvement. Variable shadowing (`err :=`) is scoped to the `if` statement, so no side effects. 👍
---
we can ignore flaky integration tests for now. We're continuously monitoring them to improve or remove. |
Actionable comments posted: 11
🔭 Outside diff range comments (19)
v2/pkg/subscraping/sources/riddler/riddler.go (2)
`32-38`: **Guard against nil `resp` before discarding**

`session.SimpleGet` can return a non-nil error together with a nil response. Calling `session.DiscardHTTPResponse(resp)` unconditionally risks a panic if `resp` is nil.

```diff
 	s.errors++
-	session.DiscardHTTPResponse(resp)
+	if resp != nil { // avoid nil-pointer deref
+		session.DiscardHTTPResponse(resp)
+	}
```
`40-50`: **Check `scanner.Err()` to surface I/O errors**

The loop ignores any scanning error. If the body is truncated or the connection drops, valuable error information is lost and the source silently mis-reports an empty result set.

```diff
@@
 	for scanner.Scan() {
 		line := scanner.Text()
 		if line == "" {
 			continue
 		}
 		for _, subdomain := range session.Extractor.Extract(line) {
 			results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}
 			s.results++
 		}
 	}
+	if scanErr := scanner.Err(); scanErr != nil {
+		results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: scanErr}
+		s.errors++
+	}
```

v2/pkg/subscraping/sources/dnsrepo/dnsrepo.go (1)
`67-75`: **Avoid double-free: remove the second `DiscardHTTPResponse` call**

`session.DiscardHTTPResponse(resp)` is invoked immediately after the body is read (line 67). Invoking it again in the JSON-unmarshal error branch (line 73) is redundant and risks a double-close if the helper doesn't guard against it internally. A single `defer` registered right after the successful `Get` call keeps the intent clear and guarantees the response is discarded exactly once on all paths.

```diff
-		session.DiscardHTTPResponse(resp)
 		return
 	}
 }
```

Consider instead:

```go
resp, err := session.Get(...)
if err != nil {
	...
	return
}
defer session.DiscardHTTPResponse(resp)
```

v2/pkg/subscraping/sources/anubis/anubis.go (1)
`35-40`: **Possible nil-pointer dereference when the request fails**

`session.SimpleGet` can return `resp == nil` on network/URL errors. Unconditionally passing that value to `session.DiscardHTTPResponse` risks a panic if `DiscardHTTPResponse` dereferences the pointer.

```diff
-	session.DiscardHTTPResponse(resp)
+	if resp != nil {
+		session.DiscardHTTPResponse(resp)
+	}
```

v2/pkg/subscraping/sources/bufferover/bufferover.go (2)
`58-66`: **Potential nil-pointer dereference when the response is nil**

`session.DiscardHTTPResponse(resp)` is called even when `resp == nil` (the branch is guarded by `err != nil && resp == nil`). Unless `DiscardHTTPResponse` guards against a nil argument, this will panic.

```diff
 resp, err := session.Get(ctx, sourceURL, "", map[string]string{"x-api-key": apiKey})
-if err != nil && resp == nil {
-	results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-	s.errors++
-	session.DiscardHTTPResponse(resp) // resp is definitely nil here
-	return
-}
+if err != nil {
+	results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
+	s.errors++
+	if resp != nil { // guard against nil
+		session.DiscardHTTPResponse(resp)
+	}
+	return
+}
```

`68-74`: **Duplicate cleanup – prefer a single deferred discard**

The response body is discarded here on the decode-error path, but also again on the success path (line 76). Repeating the call is unnecessary and easy to miss in future edits. Initialise a `defer` immediately after `Get` succeeds instead.

```diff
 resp, err := session.Get(ctx, sourceURL, "", map[string]string{"x-api-key": apiKey})
+if resp != nil {
+	defer session.DiscardHTTPResponse(resp)
+}
```

This guarantees cleanup exactly once and removes the need for explicit calls later.
v2/pkg/subscraping/sources/alienvault/alienvault.go (1)
`41-46`: **Incorrect nil-check leads to leaked connections and a possible nil dereference**

`err != nil && resp == nil` ignores the case where `err != nil` and `resp` is non-nil (e.g. a redirect loop, where `http.Client` returns both the last response and an error). In that scenario the code:

- Treats the call as success and continues to decode the body, even though `err` is non-nil.
- Never closes the body because execution jumps out before the later `DiscardHTTPResponse` calls.
- Risks dereferencing `resp` inside `Decode` even though the request failed.

Refactor the guard to handle any error first and always discard the response if available:

```diff
 resp, err := session.SimpleGet(ctx, fmt.Sprintf("https://otx.alienvault.com/api/v1/indicators/domain/%s/passive_dns", domain))
-if err != nil && resp == nil {
-	results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-	s.errors++
-	session.DiscardHTTPResponse(resp)
-	return
-}
+if err != nil {
+	results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
+	s.errors++
+	if resp != nil {
+		session.DiscardHTTPResponse(resp)
+	}
+	return
+}
+defer session.DiscardHTTPResponse(resp) // one-liner guarantees cleanup on every path
```

v2/pkg/subscraping/sources/hackertarget/hackertarget.go (1)
`47-58`: **Missing `scanner.Err()` check may swallow upstream read errors**

After the scanning loop completes we never inspect `scanner.Err()`. If the HTTP body is truncated or the connection resets, those errors are silently ignored and the statistics under-report failures.

```diff
@@
 	}
 }
+
+// Surface any I/O errors that occurred during scanning.
+if err := scanner.Err(); err != nil {
+	results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
+	s.errors++
+}
```

Adding this tiny block keeps error reporting symmetric with the new `Close()` handling and improves observability.

v2/pkg/subscraping/sources/censys/censys.go (2)
`115-120`: **Guard against nil `resp` before discarding**

`session.HTTPRequest` can legitimately return a nil response on transport errors. Calling `session.DiscardHTTPResponse(resp)` unconditionally risks a nil-pointer panic unless that helper already checks for nil internally (not guaranteed from this context).

```diff
 if err != nil {
 	results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
 	s.errors++
-	session.DiscardHTTPResponse(resp)
+	if resp != nil {
+		session.DiscardHTTPResponse(resp)
+	}
 	return
 }
```

`122-129`: **Validate the HTTP status code before attempting JSON decode**

The code proceeds to decode the body without checking `resp.StatusCode`. A non-2xx status (e.g., 401/429) will produce a confusing JSON parse error and hide the real problem. Prefer an explicit check and early error return:

```diff
 var censysResponse response
+if resp.StatusCode != 200 {
+	session.DiscardHTTPResponse(resp)
+	results <- subscraping.Result{
+		Source: s.Name(), Type: subscraping.Error,
+		Error: fmt.Errorf("unexpected status code %d", resp.StatusCode),
+	}
+	s.errors++
+	return
+}
+
 err = jsoniter.NewDecoder(resp.Body).Decode(&censysResponse)
 if err != nil {
 	results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
 	s.errors++
 	session.DiscardHTTPResponse(resp)
 	return
 }
```

v2/pkg/subscraping/sources/threatbook/threatbook.go (1)
`55-60`: **Error-handling path discards a potentially nil response and skips the `err != nil && resp != nil` case**

The guard clause fires only when `err != nil && resp == nil`, yet you invoke `session.DiscardHTTPResponse(resp)` anyway.

- If `resp` is nil this call is pointless at best and unsafe at worst (depends on the implementation).
- If `err != nil` and `resp != nil` the current branch is bypassed, leaving the body undiscarded and leaking the connection.

A safer, simpler structure:

```diff
-	if err != nil && resp == nil {
-		results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-		s.errors++
-		session.DiscardHTTPResponse(resp)
-		return
-	}
+	if err != nil {
+		results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
+		s.errors++
+		if resp != nil {
+			session.DiscardHTTPResponse(resp)
+		}
+		return
+	}
```

This handles every error permutation and never passes a nil pointer into the discard helper.

v2/pkg/subscraping/sources/reconcloud/reconcloud.go (2)
`46-52`: **Wrong error predicate + possible nil pointer in `DiscardHTTPResponse`**

`if err != nil && resp == nil` only trips when both are true. In practice an HTTP client almost always returns a nil response together with the error, but you're still relying on that invariant. More importantly, when the predicate is true `resp` is guaranteed to be nil (second operand), yet the code immediately calls `session.DiscardHTTPResponse(resp)` → potential panic inside that helper.

Fix:

```diff
 resp, err := session.SimpleGet(ctx, fmt.Sprintf("https://recon.cloud/api/search?domain=%s", domain))
-if err != nil && resp == nil {
+if err != nil {
 	results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
 	s.errors++
-	session.DiscardHTTPResponse(resp)
+	if resp != nil { // defensive, in case http.Client changes behaviour
+		session.DiscardHTTPResponse(resp)
+	}
 	return
 }
+
+// ensure body is always cleaned up on every other exit path
+defer session.DiscardHTTPResponse(resp)
```

This:

- Handles any non-nil error.
- Protects against nil response values.
- Uses a single `defer` to guarantee the body is discarded exactly once.

`55-63`: **Now-redundant calls to `DiscardHTTPResponse`**

With the `defer` in place (see previous comment) the two explicit calls below are no longer necessary and will double-close the body.

```diff
-		session.DiscardHTTPResponse(resp)
-		return
+		return
 	}
-	session.DiscardHTTPResponse(resp)
```

Remove them to avoid redundant work and make the control flow clearer.
v2/pkg/subscraping/sources/waybackarchive/waybackarchive.go (1)
`49-65`: **Missing `scanner.Err()` check may silently swallow read errors**

After the `for scanner.Scan()` loop finishes, `scanner.Err()` should be inspected; otherwise network/IO errors are ignored and the source reports zero errors even though parsing failed part-way.

```diff
@@
 for scanner.Scan() {
@@
 	s.results++
 }
+if err := scanner.Err(); err != nil {
+	results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
+	s.errors++
+}
```

This preserves the accuracy of statistics and surfaces transient network issues to the caller.

v2/pkg/subscraping/sources/digitorus/digitorus.go (1)
v2/pkg/subscraping/sources/digitorus/digitorus.go (1)
`51-64`: **Handle scanner I/O errors to avoid silent data loss**

`bufio.Scanner` will drop any I/O or token-size error unless `scanner.Err()` is checked after the loop. Currently such errors are ignored, the scrape silently fails, and `s.errors` is not incremented.

```diff
 for scanner.Scan() { … }
-if scanner.Err() != nil {
-	results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: scanner.Err()}
-	s.errors++
-}
+if err := scanner.Err(); err != nil {
+	results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
+	s.errors++
+}
```
88-90: Possible nil-dereference whensession.Getfails
session.Getmay return(nil, err)on transport errors. Passing anilresponse intosession.DiscardHTTPResponsewithout a guard assumes that the helper toleratesnil; if it ever dereferencesresp, this will panic.- if err != nil { - session.DiscardHTTPResponse(resp) - return results, err + if err != nil { + if resp != nil { + session.DiscardHTTPResponse(resp) + } + return results, err }v2/pkg/subscraping/sources/redhuntlabs/redhuntlabs.go (1)
`78-82`: **Page 1 is fetched twice – wastes quota and yields duplicates**

The first request already retrieved `page=1`. The loop below starts again at `page := 1`, causing an identical second call and duplicate subdomain emission.

```diff
-	for page := 1; page <= totalPages; page++ {
+	// page 1 already processed above
+	for page := 2; page <= totalPages; page++ {
```

v2/pkg/subscraping/sources/chinaz/chinaz.go (1)
`41-47`: **`DiscardHTTPResponse` must tolerate a nil `resp`**

On the error path `resp` can legitimately be nil (e.g. network failure or DNS error). If `session.DiscardHTTPResponse` blindly dereferences `resp.Body`, this statement will panic and kill the worker goroutine, closing the results channel prematurely.

```diff
-	session.DiscardHTTPResponse(resp)
+	if resp != nil {
+		session.DiscardHTTPResponse(resp)
+	}
```

Alternatively, make `DiscardHTTPResponse` itself a no-op when `resp == nil`.

v2/pkg/subscraping/sources/gitlab/gitlab.go (1)
`110-123`: **404 path still leaks the response body**

`session.DiscardHTTPResponse(resp)` is now correctly used for the happy path (line 122), but if `err != nil && resp.StatusCode == http.StatusNotFound` (see lines 100-107) the function continues without closing `resp.Body`, leading to a connection leak.

An easy fix is to always discard/close before the early `continue`/`return`, even for the 404 case:

```diff
-	if resp == nil || (resp != nil && resp.StatusCode != http.StatusNotFound) {
-		session.DiscardHTTPResponse(resp)
-		…
-	}
+	if resp == nil {
+		…
+	}
+	// Always release the body, even on 404
+	session.DiscardHTTPResponse(resp)
+	if resp != nil && resp.StatusCode != http.StatusNotFound {
+		…
+	}
```

(or simply `defer session.DiscardHTTPResponse(resp)` immediately after the GET).
♻️ Duplicate comments (1)
v2/pkg/subscraping/sources/commoncrawl/commoncrawl.go (1)
`68-70`: **`range` over an integer – won't compile before Go 1.22**

`for i := range maxYearsBack { … }` iterates over an `int`, which is only valid from Go 1.22 onward. If the module targets an older toolchain, replace it with an index-based loop:

```diff
-	for i := range maxYearsBack {
+	for i := 0; i < maxYearsBack; i++ {
```
🧹 Nitpick comments (18)
v2/pkg/subscraping/sources/digitalyama/digitalyama.go (1)
`57-62`: **Consider reusing `session.DiscardHTTPResponse` for consistency.**

Most other sources updated in this PR switched to the helper `session.DiscardHTTPResponse(resp)`, which both drains and closes the body. Using that helper here would keep the cleanup logic uniform across sources and avoid a tiny bit of duplicated code.

```diff
-	defer func() {
-		if err := resp.Body.Close(); err != nil {
-			results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-			s.errors++
-		}
-	}()
+	defer session.DiscardHTTPResponse(resp)
```

Up to you – the current code is functionally fine; this is purely a consistency/refactor suggestion.
v2/pkg/subscraping/sources/riddler/riddler.go (1)
`40-52`: **Defer the discard to guarantee cleanup & avoid double-close**

Instead of invoking `session.DiscardHTTPResponse` at the very end of the loop, defer it immediately after the successful `SimpleGet`. This (a) guarantees cleanup even on early returns (e.g. context cancel) and (b) removes the need for the explicit call on line 51.

```diff
@@
 resp, err := session.SimpleGet(ctx, fmt.Sprintf("https://riddler.io/search?q=pld:%s&view_type=data_table", domain))
@@
 }
+// ensure the response body is always fully discarded/closed
+defer session.DiscardHTTPResponse(resp)
+
 scanner := bufio.NewScanner(resp.Body)
@@
 }
-session.DiscardHTTPResponse(resp)
```

v2/pkg/subscraping/sources/anubis/anubis.go (2)
`28-57`: **Duplicate cleanup calls – consider one `defer` instead**

`session.DiscardHTTPResponse(resp)` is invoked on every early-return path and once after decoding. You can simplify control flow and guarantee cleanup even on future branches by deferring it immediately after the successful `SimpleGet`.

```diff
 resp, err := session.SimpleGet(ctx, fmt.Sprintf("https://jonlu.ca/anubis/subdomains/%s", domain))
 if err != nil {
 	results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
 	s.errors++
-	if resp != nil {
-		session.DiscardHTTPResponse(resp)
-	}
 	return
 }
+// Always close/discard the response when we leave this function.
+defer session.DiscardHTTPResponse(resp)
```

This removes repetition and makes future maintenance easier. (Keep the nil-check inside the defer if `DiscardHTTPResponse` isn't nil-safe.)

`42-45`: **Lost error information for non-200 responses**

When the service replies with a non-OK status you silently return, giving callers no clue what happened. At minimum, emit an error result so users and metrics capture the failure.

```diff
-	if resp.StatusCode != http.StatusOK {
-		session.DiscardHTTPResponse(resp)
-		return
-	}
+	if resp.StatusCode != http.StatusOK {
+		results <- subscraping.Result{
+			Source: s.Name(),
+			Type:   subscraping.Error,
+			Error:  fmt.Errorf("unexpected status code %d", resp.StatusCode),
+		}
+		s.errors++
+		return
+	}
```

v2/pkg/subscraping/sources/bufferover/bufferover.go (1)
`76-77`: **Unneeded discard after adding a defer**

If you adopt the deferred approach suggested above, this explicit `session.DiscardHTTPResponse(resp)` becomes redundant and can be removed, simplifying the happy-path flow.

v2/pkg/subscraping/sources/hackertarget/hackertarget.go (1)
`40-45`: **Explicit close-error handling is 👍 — consider centralising with `session.DiscardHTTPResponse` for consistency**

The new deferred block correctly captures and reports `resp.Body.Close()` errors, avoiding silent leaks. Across the codebase, most sources now delegate response cleanup to `session.DiscardHTTPResponse(resp)`, which also drains the body for connection reuse. To keep behaviour uniform and prevent future drift, you could:

```diff
-	defer func() {
-		if err := resp.Body.Close(); err != nil {
-			results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-			s.errors++
-		}
-	}()
+	defer func() {
+		if err := session.DiscardHTTPResponse(resp); err != nil {
+			results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
+			s.errors++
+		}
+	}()
```

This removes duplication and leverages the helper's drain/reuse logic. If `DiscardHTTPResponse` does not return an error, you can still call it first and fall back to `resp.Body.Close()` for error capture.

v2/pkg/subscraping/sources/threatbook/threatbook.go (1)
`67-71`: **Consider deferring the discard for single-exit readability**

After successful decoding you immediately call `session.DiscardHTTPResponse(resp)`, which is correct. Using a `defer` right after the `SimpleGet` call would eliminate the need to remember manual invocations on every exit path and prevent future leaks when code is refactored.

```go
resp, err := session.SimpleGet(ctx, url)
if resp != nil {
	defer session.DiscardHTTPResponse(resp)
}
```

Not blocking, but it reduces cognitive load and guards against maintenance slip-ups.

v2/pkg/subscraping/sources/c99/c99.go (1)
`59-64`: **Close-error handling is welcome, but diverges from the new `session.DiscardHTTPResponse` helper**

Good call catching `resp.Body.Close()` errors and surfacing them through the `results` channel. That said, most updated sources in this PR switched to the centralised helper `session.DiscardHTTPResponse(resp)`, which:

- drains & closes the body (keeps the HTTP transport reusable)
- nil-checks `resp`
- encapsulates the same error-handling logic

Using the helper keeps behaviour uniform and avoids duplicating subtle cleanup logic:

```diff
-	defer func() {
-		if err := resp.Body.Close(); err != nil {
-			results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-			s.errors++
-		}
-	}()
+	defer session.DiscardHTTPResponse(resp)
```

If pushing the error into `results` is essential, consider extending `DiscardHTTPResponse` to optionally report through a callback/channel rather than diverging per source.

v2/pkg/subscraping/sources/intelx/intelx.go (1)
`126-134`: **Remove the redundant `io.ReadAll` before discarding the response**

The helper already consumes & closes the body; the explicit `io.ReadAll` followed by another call introduces an unnecessary second read. Streamline this block:

```diff
-_, err = io.ReadAll(resp.Body)
-if err != nil {
-	results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-	s.errors++
-	session.DiscardHTTPResponse(resp)
-	return
-}
-session.DiscardHTTPResponse(resp)
+session.DiscardHTTPResponse(resp)
```

This removes needless I/O while preserving error handling through the helper.

v2/pkg/subscraping/sources/digitorus/digitorus.go (1)
`44-49`: **Consider re-using the common response-cleanup helper for consistency**

Every other modernized source now relies on `session.DiscardHTTPResponse(resp)` so that response closing (and associated error logging) is handled in one place. Duplicating the logic here means:

- We lose the single-point behaviour (future fixes/improvements to `DiscardHTTPResponse` will not reach this file).
- The per-source `errors++` bookkeeping becomes inconsistent – other sources only log, this one also counts the error.

If you want to keep the local error counting, consider extending `DiscardHTTPResponse` to accept a callback so each caller can decide what to do with the close error, and then replace the block with a single defer.

v2/pkg/subscraping/sources/rapiddns/rapiddns.go (2)
`51-52`: **Guard against a nil response before discarding**

`session.DiscardHTTPResponse(resp)` is invoked here after `io.ReadAll` fails. While `resp` is expected to be non-nil in this branch, we already call the same helper in the earlier `err != nil` path (line 43) where `resp` can legitimately be nil. To be defensive and avoid accidental nil-pointer dereferences should `DiscardHTTPResponse` ever omit an internal nil check, wrap the call:

```diff
-session.DiscardHTTPResponse(resp)
+if resp != nil {
+	session.DiscardHTTPResponse(resp)
+}
```

`55-56`: **Consider draining once per iteration to avoid duplicate work**

You now discard the response body twice per successful iteration:

- After a read error (lines 51-52).
- After a successful read (line 55).

The first is needed only on the error path, while the second is always executed. A clearer pattern is:

```go
resp, err := session.SimpleGet(...)
if err != nil {
	// handle & return
}
defer session.DiscardHTTPResponse(resp) // runs on both success & error below

body, err := io.ReadAll(resp.Body)
if err != nil {
	// handle & return
}
```

Because the loop exits on each `return`, the deferred call fires immediately, keeping memory usage low and avoiding repeated identical statements. Up to you whether the small readability gain outweighs the extra `defer` allocation per request.

v2/pkg/subscraping/sources/threatcrowd/threatcrowd.go (2)
`54-59`: **Prefer using `session.DiscardHTTPResponse` to standardise cleanup across sources**

Almost every other modernised source now delegates connection reuse + close-error handling to `session.DiscardHTTPResponse`. Retaining a bespoke `resp.Body.Close()` block here diverges from that convention and misses the extra drain logic the helper provides.

A minimal, drop-in adjustment:

```diff
-	defer func() {
-		if err := resp.Body.Close(); err != nil {
-			results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-			s.errors++
-		}
-	}()
+	// Drain & close so the connection can be reused.
+	defer func() {
+		if cerr := session.DiscardHTTPResponse(resp); cerr != nil {
+			results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: cerr}
+			s.errors++
+		}
+	}()
```

Gives identical error reporting while aligning with the rest of the codebase and ensuring HTTP keep-alive.

`55-56`: **Minor lint: avoid shadowing `err` inside the deferred func**

`if err := …` introduces a new shadow variable that can trip static-analysis linters (`shadow`, `gosimple`). Using a distinct name keeps the outer `err` untouched and silences the warning:

```diff
-	if err := resp.Body.Close(); err != nil {
-		results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
+	if closeErr := resp.Body.Close(); closeErr != nil {
+		results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: closeErr}
```
60-67: Prefer a single deferred cleanup instead of multiple explicit calls
session.DiscardHTTPResponse(resp)is now invoked in every exit path (decode failure and success). Replacing these duplicates with onedeferimmediately after a non-nilrespis obtained shortens the function and eliminates the chance of forgetting a path during future edits.resp, err := session.Get(ctx, getUrl, "", headers) if err != nil { … } +defer session.DiscardHTTPResponse(resp) // remove explicit calls on lines 62 and 66This mirrors the standard
defer resp.Body.Close()idiom and keeps resource-management concerns in one place.v2/pkg/subscraping/sources/dnsdumpster/dnsdumpster.go (1)
`68-68`: **Remove the redundant discard to prevent a second close**

With the above deferred cleanup in place, this explicit call becomes redundant (and `resp` could be nil when the request itself failed). Deleting it simplifies the flow and avoids needless work.

```diff
-	session.DiscardHTTPResponse(resp)
```

v2/pkg/subscraping/sources/sitedossier/sitedossier.go (1)
66-70: Guard against nil response before discarding
`session.DiscardHTTPResponse(resp)` is invoked unconditionally.
If `resp` is ever `nil` (e.g., `SimpleGet` returned an error and `resp` was `nil`), this could panic unless `DiscardHTTPResponse` is explicitly nil-safe.

```diff
 if err != nil {
 	...
-	session.DiscardHTTPResponse(resp)
+	if resp != nil {
+		session.DiscardHTTPResponse(resp)
+	}
 }
```
DiscardHTTPResponsegracefully handlesnil; otherwise, add the check.v2/pkg/subscraping/sources/fullhunt/fullhunt.go (1)
62-63: Duplicate call—safe but redundant once `defer` is applied.

After moving cleanup into a `defer`, this line becomes unnecessary and can be removed to avoid confusion.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`.github/workflows/build-test.yml` is excluded by `!**/*.yml`
📒 Files selected for processing (42)
- v2/pkg/runner/outputter.go (3 hunks)
- v2/pkg/subscraping/sources/alienvault/alienvault.go (1 hunks)
- v2/pkg/subscraping/sources/anubis/anubis.go (2 hunks)
- v2/pkg/subscraping/sources/bevigil/bevigil.go (1 hunks)
- v2/pkg/subscraping/sources/bufferover/bufferover.go (1 hunks)
- v2/pkg/subscraping/sources/c99/c99.go (1 hunks)
- v2/pkg/subscraping/sources/censys/censys.go (1 hunks)
- v2/pkg/subscraping/sources/certspotter/certspotter.go (2 hunks)
- v2/pkg/subscraping/sources/chinaz/chinaz.go (1 hunks)
- v2/pkg/subscraping/sources/commoncrawl/commoncrawl.go (2 hunks)
- v2/pkg/subscraping/sources/crtsh/crtsh.go (4 hunks)
- v2/pkg/subscraping/sources/digitalyama/digitalyama.go (1 hunks)
- v2/pkg/subscraping/sources/digitorus/digitorus.go (1 hunks)
- v2/pkg/subscraping/sources/dnsdb/dnsdb.go (3 hunks)
- v2/pkg/subscraping/sources/dnsdumpster/dnsdumpster.go (1 hunks)
- v2/pkg/subscraping/sources/dnsrepo/dnsrepo.go (1 hunks)
- v2/pkg/subscraping/sources/facebook/ctlogs.go (2 hunks)
- v2/pkg/subscraping/sources/facebook/ctlogs_test.go (2 hunks)
- v2/pkg/subscraping/sources/fofa/fofa.go (2 hunks)
- v2/pkg/subscraping/sources/fullhunt/fullhunt.go (1 hunks)
- v2/pkg/subscraping/sources/github/github.go (3 hunks)
- v2/pkg/subscraping/sources/gitlab/gitlab.go (2 hunks)
- v2/pkg/subscraping/sources/hackertarget/hackertarget.go (1 hunks)
- v2/pkg/subscraping/sources/hudsonrock/hudsonrock.go (1 hunks)
- v2/pkg/subscraping/sources/intelx/intelx.go (2 hunks)
- v2/pkg/subscraping/sources/leakix/leakix.go (1 hunks)
- v2/pkg/subscraping/sources/netlas/netlas.go (3 hunks)
- v2/pkg/subscraping/sources/pugrecon/pugrecon.go (2 hunks)
- v2/pkg/subscraping/sources/quake/quake.go (2 hunks)
- v2/pkg/subscraping/sources/rapiddns/rapiddns.go (1 hunks)
- v2/pkg/subscraping/sources/reconcloud/reconcloud.go (1 hunks)
- v2/pkg/subscraping/sources/redhuntlabs/redhuntlabs.go (2 hunks)
- v2/pkg/subscraping/sources/riddler/riddler.go (1 hunks)
- v2/pkg/subscraping/sources/robtex/robtext.go (1 hunks)
- v2/pkg/subscraping/sources/securitytrails/securitytrails.go (2 hunks)
- v2/pkg/subscraping/sources/shodan/shodan.go (1 hunks)
- v2/pkg/subscraping/sources/sitedossier/sitedossier.go (1 hunks)
- v2/pkg/subscraping/sources/threatbook/threatbook.go (1 hunks)
- v2/pkg/subscraping/sources/threatcrowd/threatcrowd.go (1 hunks)
- v2/pkg/subscraping/sources/threatminer/threatminer.go (1 hunks)
- v2/pkg/subscraping/sources/virustotal/virustotal.go (2 hunks)
- v2/pkg/subscraping/sources/waybackarchive/waybackarchive.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
- v2/pkg/subscraping/sources/quake/quake.go
- v2/pkg/subscraping/sources/netlas/netlas.go
- v2/pkg/runner/outputter.go
- v2/pkg/subscraping/sources/certspotter/certspotter.go
- v2/pkg/subscraping/sources/securitytrails/securitytrails.go
- v2/pkg/subscraping/sources/virustotal/virustotal.go
- v2/pkg/subscraping/sources/fofa/fofa.go
- v2/pkg/subscraping/sources/pugrecon/pugrecon.go
- v2/pkg/subscraping/sources/crtsh/crtsh.go
🧰 Additional context used
🧬 Code Graph Analysis (6)
v2/pkg/subscraping/sources/hackertarget/hackertarget.go (2)
v2/pkg/subscraping/types.go (3)
Result (81-86), Source (37-62), Error (94-94)
v2/pkg/subscraping/sources/anubis/anubis.go (1)
Source(16-20)
v2/pkg/subscraping/sources/threatcrowd/threatcrowd.go (1)
v2/pkg/subscraping/types.go (2)
Result (81-86), Error (94-94)
v2/pkg/subscraping/sources/dnsdb/dnsdb.go (1)
v2/pkg/subscraping/types.go (3)
Result (81-86), Source (37-62), Error (94-94)
v2/pkg/subscraping/sources/threatminer/threatminer.go (2)
v2/pkg/subscraping/types.go (2)
Result (81-86), Error (94-94)
v2/pkg/subscraping/sources/anubis/anubis.go (1)
Source(16-20)
v2/pkg/subscraping/sources/shodan/shodan.go (2)
v2/pkg/subscraping/types.go (2)
Result (81-86), Error (94-94)
v2/pkg/subscraping/sources/rapiddns/rapiddns.go (1)
Source(18-22)
v2/pkg/subscraping/sources/digitorus/digitorus.go (2)
v2/pkg/subscraping/types.go (3)
Result (81-86), Source (37-62), Error (94-94)
v2/pkg/subscraping/sources/anubis/anubis.go (1)
Source(16-20)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Test Builds (macOS-13)
🔇 Additional comments (19)
v2/pkg/subscraping/sources/digitalyama/digitalyama.go (1)
57-62: Deferred-close error handling is correct, but double-check the defer order.

Great to see explicit handling of the `Close()` error here.
Because this defer is registered after the earlier `defer func(startTime…)` at line 39, it will execute before the results channel is closed, so the extra `results <- …` send is safe. Just keep this ordering rule in mind if either block is moved in future refactors.
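The ordering rule referenced here is that deferred calls run last-in, first-out: a later-registered defer fires before an earlier one. A minimal sketch (illustrative names only):

```go
package main

import "fmt"

var order []string

// work registers two defers; when it returns they execute in reverse
// registration order (LIFO), mirroring why a close-handler registered
// after a channel-closing defer runs before it.
func work() {
	defer func() { order = append(order, "first-registered (runs last)") }()
	defer func() { order = append(order, "second-registered (runs first)") }()
}

func main() {
	work()
	fmt.Println(order)
	// → [second-registered (runs first) first-registered (runs last)]
}
```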
Looks good as-is.

v2/pkg/subscraping/sources/censys/censys.go (1)
131-132: Good move to centralised cleanup

Replacing direct `resp.Body.Close()` with `session.DiscardHTTPResponse(resp)` aligns with the new session-level resource handling pattern, reducing duplication and ensuring consistent draining/closing behaviour.

v2/pkg/subscraping/sources/intelx/intelx.go (2)
88-93: Ensure `DiscardHTTPResponse` is nil-safe

`session.SimplePost` can return a non-nil error with a nil `resp` (e.g. dial failure).
`session.DiscardHTTPResponse(resp)` is now invoked unconditionally. Please double-check that `DiscardHTTPResponse` gracefully handles a nil argument to avoid a potential panic.
101-106: Change looks good – centralised response disposal

Replacing the manual `resp.Body.Close()` with `session.DiscardHTTPResponse(resp)` keeps the success path consistent with the new helper and avoids duplicated code elsewhere.

v2/pkg/subscraping/sources/digitorus/digitorus.go (1)
session.SimpleGetmay return anilresponse on network errors.
Althoughptr.Safe(resp).StatusCodeprevents the panic in the error-handling branch, execution can still reach the scanning code with anilrespiferr == nilbutrespunexpectedly becomesnil(e.g. custom transport). A trivial early check avoids the latent nil-pointer risk:resp, err := session.SimpleGet(ctx, fmt.Sprintf("https://certificatedetails.com/%s", domain)) -if err != nil && ptr.Safe(resp).StatusCode != http.StatusNotFound { +if err != nil && ptr.Safe(resp).StatusCode != http.StatusNotFound { … } +if resp == nil { + results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: fmt.Errorf("nil HTTP response")} + s.errors++ + return +}v2/pkg/subscraping/sources/dnsdb/dnsdb.go (2)
107-110: Prefer `session.DiscardHTTPResponse` over manual `resp.Body.Close` for consistency

Throughout the codebase the helper wraps the close and logs any failure; here you fall back to a direct `Close`, essentially duplicating logic that already exists in `DiscardHTTPResponse`. Consider replacing these four lines with the helper for uniform behaviour and to avoid having two slightly-different error-handling paths.
119-119: 👍 Consistent cleanup on unmarshal failure

Switching to `session.DiscardHTTPResponse(resp)` keeps the error-handling style aligned with the rest of the project and guarantees the body is closed even when `resp` is `nil`.

v2/pkg/subscraping/sources/redhuntlabs/redhuntlabs.go (3)
70-76: Centralised response disposal looks good

Replacing multiple scattered `resp.Body.Close()` calls with the session-level `DiscardHTTPResponse` increases consistency and allows unified metrics/tracing. Nice clean-up.
91-97: Same nil-safety consideration in pagination error branch

The error branch inside the paginated loop also calls `session.DiscardHTTPResponse(resp)` unconditionally.
Ensure the helper handles `nil` to avoid crashes when an HTTP request fails before the response object is created.
60-64: Possible nil-pointer panic when `resp` is nil

`session.Get` may return `nil, err` on network/transport failures.
`session.DiscardHTTPResponse(resp)` is invoked unconditionally even when `resp == nil`; if the helper does not guard against a nil response this will panic and kill the worker goroutine.

Run the quick scan below to confirm that `DiscardHTTPResponse` contains a nil-check:

```shell
#!/bin/bash
# Inspect helper for defensive nil handling
rg --context 2 'func .*DiscardHTTPResponse' | head -n 20
```

If the helper is not nil-safe, add a guard here or inside the helper:

```diff
- session.DiscardHTTPResponse(resp)
+ if resp != nil {
+ 	session.DiscardHTTPResponse(resp)
+ }
```

v2/pkg/subscraping/sources/bevigil/bevigil.go (1)
49-54:

```shell
#!/bin/bash
# Display DiscardHTTPResponse implementation to check nil handling
sed -n '110,150p' v2/pkg/subscraping/agent.go
```
v2/pkg/subscraping/sources/chinaz/chinaz.go (1)

`49-52`: I'd like to inspect how `SimpleGet`, error paths, and `DiscardHTTPResponse` are used around your snippet. Running:

```shell
#!/bin/bash
# Show the call to SimpleGet and its error handling
rg -n -C3 'SimpleGet' v2/pkg/subscraping/sources/chinaz/chinaz.go
# Show where io.ReadAll and DiscardHTTPResponse are invoked
rg -n -C3 'io.ReadAll|DiscardHTTPResponse' v2/pkg/subscraping/sources/chinaz/chinaz.go
```

v2/pkg/subscraping/sources/commoncrawl/commoncrawl.go (1)
62-66: Cleanup improvement acknowledged

Switching to `session.DiscardHTTPResponse(resp)` standardises response handling—looks good.

v2/pkg/subscraping/sources/facebook/ctlogs.go (2)
48-52: Nice touch on error-checked close

Wrapping `resp.Body.Close()` in a defer that logs any error improves observability.
120-121: Consistent response disposal

Using `session.DiscardHTTPResponse(resp)` right after body read keeps resource handling consistent.

v2/pkg/subscraping/sources/github/github.go (2)
115-120: Double-check header usage after body discard

`session.DiscardHTTPResponse(resp)` closes the body; headers remain accessible, but confirm the helper doesn't nil-out the response. Subsequent access to `resp.Header` (line 129) relies on that.
95-96: Standardised cleanup looks good

Replacing direct `Body.Close()` with `DiscardHTTPResponse` keeps cleanup uniform.

Also applies to: 103-104, 176-177
v2/pkg/subscraping/sources/facebook/ctlogs_test.go (1)
45-49: Good practice in tests

Adding error logging on body close mirrors production code quality—nice.
v2/pkg/subscraping/sources/fullhunt/fullhunt.go (1)
56-60:

```shell
#!/bin/bash
# Display DiscardHTTPResponse implementation to check for nil response handling
sed -n '110,160p' v2/pkg/subscraping/agent.go
```
closes #1600
Summary by CodeRabbit