
modernize subfinder#1601

Merged
ehsandeep merged 11 commits into dev from modernize on Jun 20, 2025

Conversation

@dogancanbakir (Member) commented Jun 17, 2025

closes #1600

Summary by CodeRabbit

  • Refactor
    • Improved internal loop constructs and string formatting for better efficiency and consistency.
    • Enhanced error handling and logging for file and network resource closures.
    • Standardized HTTP response cleanup across multiple data sources.
  • Style
    • Streamlined variable declarations and request body constructions without altering functionality.
  • Tests
    • Added "anubis" to ignored sources in tests due to known redirect issues.

@dogancanbakir dogancanbakir requested a review from ehsandeep June 17, 2025 12:11
@dogancanbakir dogancanbakir self-assigned this Jun 17, 2025

coderabbitai bot commented Jun 17, 2025

"""

Walkthrough

This change modernizes code across several files by updating loop constructs, string formatting, and map type usage to leverage more contemporary Go idioms. Adjustments include replacing traditional for-loops with range-based loops, using fmt.Appendf for direct byte-slice formatting, updating map type declarations, and improving error handling on resource closures. No public API changes are introduced. Note: the range-over-integer loops and strings.SplitSeq require Go 1.22 and Go 1.24 respectively; on older toolchains they would not compile.

Changes

Files/Paths — Change Summary

  • v2/pkg/resolve/resolve.go — Replaced index-based for-loops with range-over-integer loops (requires Go 1.22+).
  • v2/pkg/subscraping/sources/commoncrawl/commoncrawl.go — Changed the year-list generation loop to range over an integer (requires Go 1.22+).
  • v2/pkg/subscraping/sources/crtsh/crtsh.go, v2/pkg/testutils/integration.go — Switched split-string iteration to strings.SplitSeq iterators (requires Go 1.24+).
  • v2/pkg/subscraping/sources/fofa/fofa.go, v2/pkg/subscraping/sources/hunter/hunter.go, v2/pkg/subscraping/sources/quake/quake.go, v2/pkg/subscraping/sources/securitytrails/securitytrails.go — Replaced fmt.Sprintf plus []byte conversion with fmt.Appendf for direct byte-slice formatting.
  • v2/pkg/subscraping/sources/netlas/netlas.go — Changed map type from map[string]interface{} to map[string]any.
  • v2/pkg/subscraping/agent.go — Added error handling and logging when closing the HTTP response body in DiscardHTTPResponse.
  • v2/pkg/runner/config.go — Added error checking and logging for the deferred file close in createProviderConfigYAML.
  • v2/pkg/runner/outputter.go — Improved error handling on buffered-writer flush after write errors so flush errors propagate properly.
  • v2/pkg/runner/runner.go — Added error checking and logging when closing files after enumeration.
  • v2/pkg/subscraping/sources/pugrecon/pugrecon.go — Wrapped the error message string with fmt.Errorf explicitly; removed an unused conditional block.
  • v2/pkg/subscraping/sources/virustotal/virustotal.go — Simplified variable declarations by removing explicit type annotations.
  • Many subscraping source files (e.g., crtsh.go, fofa.go, hunter.go) — Replaced direct resp.Body.Close() calls with session.DiscardHTTPResponse(resp) for uniform cleanup.
  • Several subscraping sources (e.g., c99.go, threatcrowd.go, hackertarget.go) — Changed deferred resp.Body.Close() to a deferred function that handles and logs close errors.
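The fmt.Appendf change listed for the fofa/hunter/quake/securitytrails sources can be sketched in isolation. The JSON body shape and the buildBody helper below are illustrative, not the exact payload any of these sources sends:

```go
package main

import "fmt"

// buildBody formats straight into a byte slice with fmt.Appendf (Go 1.19+),
// skipping the intermediate string that []byte(fmt.Sprintf(...)) would allocate.
func buildBody(domain string, size int) []byte {
	return fmt.Appendf(nil, `{"query":"domain=\"%s\"","size":%d}`, domain, size)
}

func main() {
	fmt.Println(string(buildBody("example.com", 100)))
}
```

Passing nil as the first argument starts a fresh slice; an existing buffer can be passed instead to append in place.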

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Caller
    participant ModernizedFunction
    Caller->>ModernizedFunction: Call function (with modernized loop or formatting)
    ModernizedFunction->>ModernizedFunction: Use range-based loop or fmt.Appendf
    ModernizedFunction-->>Caller: Return result
```

Assessment against linked issues

Objective (Issue #) | Addressed | Explanation
Modernize code using more modern Go idioms (#1600) | ✅ | —

Assessment against linked issues: Out-of-scope changes

No out-of-scope changes found.

Suggested reviewers

  • ehsandeep

Poem

In fields of code where rabbits hop,
Loops and strings got a modern swap.
With Appendf swift and ranges neat,
The code now dances, light on its feet.
🐇✨ Modern Go, we cheer and clap,
For every byte and every map!
"""


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6520408 and d2faa76.

📒 Files selected for processing (1)
  • v2/pkg/passive/sources_wo_auth_test.go (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • v2/pkg/passive/sources_wo_auth_test.go
⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: Test Builds (macOS-13)
  • GitHub Check: Test Builds (ubuntu-latest)
  • GitHub Check: Test Builds (windows-latest)
  • GitHub Check: release-test



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🔭 Outside diff range comments (4)
v2/pkg/resolve/resolve.go (1)

76-88: Wildcard check loop requires Go 1.22+

Same situation with maxWildcardChecks (a constant int): `for range maxWildcardChecks` compiles only on Go 1.22 and later. If the module must support older toolchains, use:

- for range maxWildcardChecks {
+ for i := 0; i < maxWildcardChecks; i++ {
v2/pkg/testutils/integration.go (1)

30-35: strings.SplitSeq requires Go 1.24; the loop yields values, not indices

strings.SplitSeq (added in Go 1.24) returns an iterator, so `for i := range items` binds each split string to i — the `i != ""` check is therefore correct, though the name i is misleading for a string value. If the module must build on older toolchains, fall back to strings.Split:

- items := strings.SplitSeq(string(data), "\n")
- for i := range items {
-     if i != "" {
-         parts = append(parts, i)
-     }
- }
+ for _, item := range strings.Split(strings.TrimSpace(string(data)), "\n") {
+     if item != "" {
+         parts = append(parts, item)
+     }
+ }
v2/pkg/subscraping/sources/crtsh/crtsh.go (2)

121-129: SplitSeq requires Go 1.24; range yields the split values

On Go 1.24+, `for subdomain := range strings.SplitSeq(data, "\n")` iterates the values directly and is correct as written. For older toolchains, use:

- for subdomain := range strings.SplitSeq(data, "\n") {
+ for _, subdomain := range strings.Split(data, "\n") {

155-162: Same Go-version consideration as above

- for sub := range strings.SplitSeq(subdomain.NameValue, "\n") {
+ for _, sub := range strings.Split(subdomain.NameValue, "\n") {
🧹 Nitpick comments (1)
v2/pkg/subscraping/sources/netlas/netlas.go (1)

100-122: Avoid superfluous string conversion when the payload is already []byte.

jsonRequestBody is a []byte, yet we allocate a new string and wrap it with strings.NewReader.
Use bytes.NewReader(jsonRequestBody) to save an allocation and a copy.

-		resp, err = session.HTTPRequest(ctx, http.MethodPost, apiUrl, "", map[string]string{
-			"accept":       "application/json",
-			"X-API-Key":    randomApiKey,
-			"Content-Type": "application/json"}, strings.NewReader(string(jsonRequestBody)), subscraping.BasicAuth{})
+		resp, err = session.HTTPRequest(ctx, http.MethodPost, apiUrl, "", map[string]string{
+			"accept":       "application/json",
+			"X-API-Key":    randomApiKey,
+			"Content-Type": "application/json"}, bytes.NewReader(jsonRequestBody), subscraping.BasicAuth{})
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3acf6d5 and 00a3e6f.

⛔ Files ignored due to path filters (1)
  • .github/workflows/build-test.yml is excluded by !**/*.yml
📒 Files selected for processing (9)
  • v2/pkg/resolve/resolve.go (2 hunks)
  • v2/pkg/subscraping/sources/commoncrawl/commoncrawl.go (1 hunks)
  • v2/pkg/subscraping/sources/crtsh/crtsh.go (2 hunks)
  • v2/pkg/subscraping/sources/fofa/fofa.go (1 hunks)
  • v2/pkg/subscraping/sources/hunter/hunter.go (1 hunks)
  • v2/pkg/subscraping/sources/netlas/netlas.go (1 hunks)
  • v2/pkg/subscraping/sources/quake/quake.go (1 hunks)
  • v2/pkg/subscraping/sources/securitytrails/securitytrails.go (1 hunks)
  • v2/pkg/testutils/integration.go (1 hunks)
🔇 Additional comments (4)
v2/pkg/subscraping/sources/quake/quake.go (1)

61-64: Verification script – check the Go version in v2/go.mod

```shell
#!/bin/bash
# Check the Go version specified in v2/go.mod
grep -n '^go ' v2/go.mod || echo "Go version not specified in v2/go.mod"
```

v2/pkg/subscraping/sources/securitytrails/securitytrails.go (1)

62-66: LGTM – the switch to fmt.Appendf is correct and avoids the intermediate string allocation.

v2/pkg/subscraping/sources/hunter/hunter.go (1)

60-65: LGTM – more efficient query construction.

v2/pkg/subscraping/sources/fofa/fofa.go (1)

56-59: LGTM – matches the idiom used in other sources.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🔭 Outside diff range comments (1)
v2/pkg/subscraping/sources/pugrecon/pugrecon.go (1)

81-82: Align response cleanup with the new helper

defer resp.Body.Close() bypasses the new error-aware helper and duplicates logic.
Switching keeps the pattern uniform and catches close errors.

-       defer resp.Body.Close()
+       defer session.DiscardHTTPResponse(resp)
♻️ Duplicate comments (2)
v2/pkg/runner/outputter.go (2)

158-164: Same masking issue as above

See the previous comment; the identical error–flush pattern appears here.


233-238: Same masking issue as above

See the first comment for suggested fix/refactor.

🧹 Nitpick comments (3)
v2/pkg/subscraping/sources/pugrecon/pugrecon.go (1)

90-91: Nit: direct string wrap with fmt.Errorf unnecessary

fmt.Errorf("%s", errorMsg) is fine but errors.New(errorMsg) is clearer/cheaper when no formatting is needed.

-       results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: fmt.Errorf("%s", errorMsg)}
+       results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: errors.New(errorMsg)}

(Requires errors import.)

v2/pkg/runner/runner.go (2)

97-99: Defer would guarantee close on early returns

If EnumerateMultipleDomainsWithCtx panics or returns before reaching these lines, the file handle stays open.
Use defer f.Close() with error logging to cover all exits.


144-146: Duplicate close-and-log blocks – consider helper

The two identical close/error-log snippets invite DRY violation. A small helper will reduce repetition and future misses.

Example:

func safeClose(path string, c io.Closer) {
    if err := c.Close(); err != nil {
        gologger.Error().Msgf("Error closing file %s: %s", path, err)
    }
}

Then defer safeClose(outputFile, file).

Also applies to: 164-166

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 00a3e6f and a885a39.

⛔ Files ignored due to path filters (1)
  • .github/workflows/build-test.yml is excluded by !**/*.yml
📒 Files selected for processing (9)
  • v2/pkg/runner/config.go (1 hunks)
  • v2/pkg/runner/outputter.go (3 hunks)
  • v2/pkg/runner/runner.go (3 hunks)
  • v2/pkg/subscraping/agent.go (1 hunks)
  • v2/pkg/subscraping/sources/crtsh/crtsh.go (4 hunks)
  • v2/pkg/subscraping/sources/hunter/hunter.go (2 hunks)
  • v2/pkg/subscraping/sources/pugrecon/pugrecon.go (1 hunks)
  • v2/pkg/subscraping/sources/virustotal/virustotal.go (1 hunks)
  • v2/pkg/subscraping/sources/whoisxmlapi/whoisxmlapi.go (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • v2/pkg/subscraping/sources/virustotal/virustotal.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • v2/pkg/subscraping/sources/hunter/hunter.go
  • v2/pkg/subscraping/sources/crtsh/crtsh.go
🔇 Additional comments (3)
v2/pkg/subscraping/agent.go (1)

117-124: Verification script run during review

```shell
#!/bin/bash
set -e
echo "Lines 110–140 of v2/pkg/subscraping/agent.go:"
sed -n '110,140p' v2/pkg/subscraping/agent.go
```

v2/pkg/subscraping/sources/whoisxmlapi/whoisxmlapi.go (1)

70-75: LGTM – consistent cleanup via DiscardHTTPResponse

Nice replacement of the raw Close() with session.DiscardHTTPResponse, keeping resource-handling consistent with the other sources.

v2/pkg/runner/config.go (1)

20-24: Good defensive close with logging

Wrapping the close in a defer + error check is a solid improvement.
Variable shadowing (err :=) is scoped to the if statement, so no side-effects. 👍

@dogancanbakir (Member, Author) commented Jun 17, 2025

we can ignore flaky integration tests for now. We're continuously monitoring them to improve or remove.

@dogancanbakir dogancanbakir mentioned this pull request Jun 17, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 11

🔭 Outside diff range comments (19)
v2/pkg/subscraping/sources/riddler/riddler.go (2)

32-38: Guard against nil resp before discarding

session.SimpleGet can return a non-nil error together with a nil response.
Calling session.DiscardHTTPResponse(resp) unconditionally risks a panic if resp is nil.

-			s.errors++
-			session.DiscardHTTPResponse(resp)
+			s.errors++
+			if resp != nil { // avoid nil-pointer deref
+				session.DiscardHTTPResponse(resp)
+			}

40-50: Check scanner.Err() to surface I/O errors

The loop ignores any scanning error.
If the body is truncated or the connection drops, valuable error information is lost and the source silently mis-reports an empty result set.

@@
 		for scanner.Scan() {
 			line := scanner.Text()
 			if line == "" {
 				continue
 			}
 			for _, subdomain := range session.Extractor.Extract(line) {
 				results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}
 				s.results++
 			}
 		}
+		if scanErr := scanner.Err(); scanErr != nil {
+			results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: scanErr}
+			s.errors++
+		}
v2/pkg/subscraping/sources/dnsrepo/dnsrepo.go (1)

67-75: Avoid double-free: remove the second DiscardHTTPResponse call

session.DiscardHTTPResponse(resp) is invoked immediately after the body is read (line 67).
Invoking it again in the JSON-unmarshal error branch (line 73) is redundant and risks a double-close if the helper doesn’t guard against it internally. A single defer registered right after the successful Get call keeps the intent clear and guarantees the response is discarded exactly once on all paths.

-       session.DiscardHTTPResponse(resp)
        return
    }
}

Consider instead:

resp, err := session.Get(...)
if err != nil {
    ...
    return
}
defer session.DiscardHTTPResponse(resp)
v2/pkg/subscraping/sources/anubis/anubis.go (1)

35-40: Possible nil-pointer dereference when request fails

session.SimpleGet can return resp == nil on network/URL errors.
Unconditionally passing that value to session.DiscardHTTPResponse risks a panic if DiscardHTTPResponse dereferences the pointer.

-			session.DiscardHTTPResponse(resp)
+			if resp != nil {
+				session.DiscardHTTPResponse(resp)
+			}
v2/pkg/subscraping/sources/bufferover/bufferover.go (2)

58-66: Potential nil-pointer dereference when response is nil

session.DiscardHTTPResponse(resp) is called even when resp == nil (the branch is guarded by err != nil && resp == nil).
Unless DiscardHTTPResponse guards against a nil argument, this will panic.

 resp, err := session.Get(ctx, sourceURL, "", map[string]string{"x-api-key": apiKey})
-if err != nil && resp == nil {
-    results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-    s.errors++
-    session.DiscardHTTPResponse(resp) // resp is definitely nil here
-    return
-}
+if err != nil {
+    results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
+    s.errors++
+    if resp != nil {                      // guard against nil
+        session.DiscardHTTPResponse(resp)
+    }
+    return
+}

68-74: Duplicate cleanup ‑ prefer a single deferred discard

The response body is discarded here on the decode-error path, but also again on the success path (line 76). Repeating the call is unnecessary and easy to miss in future edits. Initialise a defer immediately after Get succeeds instead.

 resp, err := session.Get(ctx, sourceURL, "", map[string]string{"x-api-key": apiKey})
-if resp != nil {
-    defer session.DiscardHTTPResponse(resp)
-}

This guarantees cleanup exactly once and removes the need for explicit calls later.

v2/pkg/subscraping/sources/alienvault/alienvault.go (1)

41-46: Incorrect nil‐check leads to leaked connections and possible nil dereference

err != nil && resp == nil ignores the very common case where err != nil and resp is non-nil (e.g. non-2xx status, redirect loop, TLS handshake error returned after opening the connection).
In that scenario the code:

  1. Treats the call as success and continues to decode the body, even though err is non-nil.
  2. Never closes the body because execution jumps out before the later DiscardHTTPResponse calls.
  3. Risks dereferencing resp inside Decode even though the request failed.

Refactor the guard to handle any error first and always discard the response if available:

 resp, err := session.SimpleGet(ctx, fmt.Sprintf("https://otx.alienvault.com/api/v1/indicators/domain/%s/passive_dns", domain))
-if err != nil && resp == nil {
-    results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-    s.errors++
-    session.DiscardHTTPResponse(resp)
-    return
-}
+if err != nil {
+    results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
+    s.errors++
+    if resp != nil {
+        session.DiscardHTTPResponse(resp)
+    }
+    return
+}
+defer session.DiscardHTTPResponse(resp) // one‐liner guarantees cleanup on every path
v2/pkg/subscraping/sources/hackertarget/hackertarget.go (1)

47-58: Missing scanner.Err() check may swallow upstream read errors

After the scanning loop completes we never inspect scanner.Err(). If the HTTP body is truncated or the connection resets, those errors are silently ignored and the statistics under-report failures.

@@
         }
     }
+
+    // Surface any I/O errors that occurred during scanning.
+    if err := scanner.Err(); err != nil {
+        results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
+        s.errors++
+    }

Adding this tiny block keeps error reporting symmetric with the new Close() handling and improves observability.

v2/pkg/subscraping/sources/censys/censys.go (2)

115-120: Guard against nil resp before discarding

session.HTTPRequest can legitimately return a nil response on transport errors. Calling session.DiscardHTTPResponse(resp) unconditionally risks a nil-pointer panic unless that helper already checks for nil internally (not guaranteed from this context).

 if err != nil {
     results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
     s.errors++
-    session.DiscardHTTPResponse(resp)
+    if resp != nil {
+        session.DiscardHTTPResponse(resp)
+    }
     return
 }

122-129: Validate HTTP status code before attempting JSON decode

The code proceeds to decode the body without checking resp.StatusCode. A non-2xx status (e.g., 401/429) will produce a confusing JSON parse error and hide the real problem. Prefer an explicit check and early error return:

 var censysResponse response
-err = jsoniter.NewDecoder(resp.Body).Decode(&censysResponse)
+if resp.StatusCode != 200 {
+    session.DiscardHTTPResponse(resp)
+    results <- subscraping.Result{
+        Source: s.Name(), Type: subscraping.Error,
+        Error:  fmt.Errorf("unexpected status code %d", resp.StatusCode),
+    }
+    s.errors++
+    return
+}
+
+err = jsoniter.NewDecoder(resp.Body).Decode(&censysResponse)
 if err != nil {
     results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
     s.errors++
     session.DiscardHTTPResponse(resp)
     return
 }
v2/pkg/subscraping/sources/threatbook/threatbook.go (1)

55-60: Error-handling path discards a potentially nil response and skips the err != nil && resp != nil case

The guard clause fires only when err != nil && resp == nil, yet you invoke session.DiscardHTTPResponse(resp) anyway.

  1. If resp is nil this call is pointless at best and unsafe at worst (depends on the implementation).
  2. If err != nil and resp != nil the current branch is bypassed, leaving the body undiscarded and leaking the connection.

A safer, simpler structure:

- if err != nil && resp == nil {
-     results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-     s.errors++
-     session.DiscardHTTPResponse(resp)
-     return
- }
+ if err != nil {
+     results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
+     s.errors++
+     if resp != nil {
+         session.DiscardHTTPResponse(resp)
+     }
+     return
+ }

This handles every error permutation and never passes a nil pointer into the discard helper.

v2/pkg/subscraping/sources/reconcloud/reconcloud.go (2)

46-52: Wrong error predicate + possible nil-pointer in DiscardHTTPResponse

if err != nil && resp == nil only trips when both are true.
In practice an HTTP client almost always returns nil response together with the error, but you’re still relying on that invariant.
More importantly, when the predicate is true resp is guaranteed to be nil (second operand), yet the code immediately calls session.DiscardHTTPResponse(resp) → potential panic inside that helper.

Fix:

resp, err := session.SimpleGet(ctx, fmt.Sprintf("https://recon.cloud/api/search?domain=%s", domain))
-if err != nil && resp == nil {
+if err != nil {
     results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
     s.errors++
-    session.DiscardHTTPResponse(resp)
+    if resp != nil { // defensive, in case http.Client changes behaviour
+        session.DiscardHTTPResponse(resp)
+    }
     return
 }
+
+// ensure body is always cleaned up on every other exit path
+defer session.DiscardHTTPResponse(resp)

This:

  1. Handles any non-nil error.
  2. Protects against nil response values.
  3. Uses a single defer to guarantee the body is discarded exactly once.

55-63: Now redundant calls to DiscardHTTPResponse

With the defer in place (see previous comment) the two explicit calls below are no longer necessary and will double-close the body.

-    session.DiscardHTTPResponse(resp)
-    return
+    return
 }
- session.DiscardHTTPResponse(resp)

Remove them to avoid redundant work and make the control-flow clearer.

v2/pkg/subscraping/sources/waybackarchive/waybackarchive.go (1)

49-65: Missing scanner.Err() check may silently swallow read errors

After the for scanner.Scan() loop finishes, scanner.Err() should be inspected; otherwise network/IO errors are ignored and the source reports zero errors even though parsing failed part-way.

@@
        for scanner.Scan() {
@@
                s.results++
        }
+       if err := scanner.Err(); err != nil {
+           results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
+           s.errors++
+       }

This preserves accuracy of statistics and surfaces transient network issues to the caller.

v2/pkg/subscraping/sources/digitorus/digitorus.go (1)

51-64: Handle scanner I/O errors to avoid silent data loss

bufio.Scanner will drop any I/O or token-size error unless scanner.Err() is checked after the loop. Currently such errors are ignored, the scrape silently fails, and s.errors is not incremented.

 for scanner.Scan() {
     …
 }
-if scanner.Err() != nil {
-    results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: scanner.Err()}
-    s.errors++
-}
+if err := scanner.Err(); err != nil {
+    results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
+    s.errors++
+}
v2/pkg/subscraping/sources/robtex/robtext.go (1)

88-90: Possible nil-dereference when session.Get fails

session.Get may return (nil, err) on transport errors. Passing a nil response into session.DiscardHTTPResponse without a guard assumes that the helper tolerates nil; if it ever dereferences resp, this will panic.

-	if err != nil {
-		session.DiscardHTTPResponse(resp)
-		return results, err
+	if err != nil {
+		if resp != nil {
+			session.DiscardHTTPResponse(resp)
+		}
+		return results, err
 	}
v2/pkg/subscraping/sources/redhuntlabs/redhuntlabs.go (1)

78-82: Page-1 is fetched twice – wastes quota and yields duplicates

The first request already retrieved page=1.
The loop below starts again at page := 1, causing an identical second call and duplicate subdomain emission.

- for page := 1; page <= totalPages; page++ {
+ // page 1 already processed above
+ for page := 2; page <= totalPages; page++ {
v2/pkg/subscraping/sources/chinaz/chinaz.go (1)

41-47: DiscardHTTPResponse must tolerate a nil resp

On the error path resp can legitimately be nil (e.g. network failure or DNS error).
If session.DiscardHTTPResponse blindly dereferences resp.Body this statement will panic and kill the worker goroutine, closing the results channel prematurely.

-            session.DiscardHTTPResponse(resp)
+            if resp != nil {
+                session.DiscardHTTPResponse(resp)
+            }

Alternatively, make DiscardHTTPResponse itself a no-op when resp == nil.

v2/pkg/subscraping/sources/gitlab/gitlab.go (1)

110-123: 404 path still leaks the response body

session.DiscardHTTPResponse(resp) is now correctly used for the happy path (line 122), but if err != nil && resp.StatusCode == http.StatusNotFound (see lines 100-107) the function continues without closing resp.Body, leading to a connection leak.

An easy fix is to always discard/close before the early continue/return, even for the 404 case:

-				if resp == nil || (resp != nil && resp.StatusCode != http.StatusNotFound) {
-					session.DiscardHTTPResponse(resp)
-
-				}
+				if resp == nil {
+
+				}
+				// Always release the body, even on 404
+				session.DiscardHTTPResponse(resp)
+				if resp != nil && resp.StatusCode != http.StatusNotFound {
+
+				}

(or simply defer session.DiscardHTTPResponse(resp) immediately after the GET).

♻️ Duplicate comments (1)
v2/pkg/subscraping/sources/commoncrawl/commoncrawl.go (1)

68-70: Range over integer requires Go 1.22+

for i := range maxYearsBack { … } iterates 0 through maxYearsBack-1 on Go 1.22 and later; on older toolchains it does not compile. The index-based fallback:

- for i := range maxYearsBack {
+ for i := 0; i < maxYearsBack; i++ {
🧹 Nitpick comments (18)
v2/pkg/subscraping/sources/digitalyama/digitalyama.go (1)

57-62: Consider reusing session.DiscardHTTPResponse for consistency.

Most other sources updated in this PR switched to the helper session.DiscardHTTPResponse(resp), which both drains and closes the body. Using that helper here would keep the cleanup logic uniform across sources and avoid a tiny bit of duplicated code.

-       defer func() {
-               if err := resp.Body.Close(); err != nil {
-                       results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-                       s.errors++
-               }
-       }()
+       defer session.DiscardHTTPResponse(resp)

Up to you—current code is functionally fine, this is purely a consistency/refactor suggestion.

v2/pkg/subscraping/sources/riddler/riddler.go (1)

40-52: Defer the discard to guarantee cleanup & avoid double-close

Instead of invoking session.DiscardHTTPResponse at the very end of the loop, defer it immediately after the successful SimpleGet.
This (a) guarantees cleanup even on early returns (e.g. context cancel) and
(b) removes the need for the explicit call on line 51.

@@
 		resp, err := session.SimpleGet(ctx, fmt.Sprintf("https://riddler.io/search?q=pld:%s&view_type=data_table", domain))
@@
 		}
 
+		// ensure the response body is always fully discarded/closed
+		defer session.DiscardHTTPResponse(resp)
+
 		scanner := bufio.NewScanner(resp.Body)
@@
-		}
-		session.DiscardHTTPResponse(resp)
+		}
v2/pkg/subscraping/sources/anubis/anubis.go (2)

28-57: Duplicate cleanup calls – consider one defer instead

session.DiscardHTTPResponse(resp) is invoked on every early-return path and once after decoding.
You can simplify control-flow and guarantee cleanup even on future branches by deferring it immediately after the successful SimpleGet.

 		resp, err := session.SimpleGet(ctx, fmt.Sprintf("https://jonlu.ca/anubis/subdomains/%s", domain))
 		if err != nil {
 			results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
 			s.errors++
-			if resp != nil {
-				session.DiscardHTTPResponse(resp)
-			}
 			return
 		}
+		// Always close/discard the response when we leave this function.
+		defer session.DiscardHTTPResponse(resp)

This removes repetition and makes future maintenance easier.
(Keep the nil-check inside the defer if DiscardHTTPResponse isn’t nil-safe.)


42-45: Lost error information for non-200 responses

When the service replies with a non-OK status you silently return, giving callers no clue what happened.
At minimum, emit an error result so users and metrics capture the failure.

-		if resp.StatusCode != http.StatusOK {
-			session.DiscardHTTPResponse(resp)
-			return
-		}
+		if resp.StatusCode != http.StatusOK {
+			results <- subscraping.Result{
+				Source: s.Name(),
+				Type:   subscraping.Error,
+				Error:  fmt.Errorf("unexpected status code %d", resp.StatusCode),
+			}
+			s.errors++
+			return
+		}
v2/pkg/subscraping/sources/bufferover/bufferover.go (1)

76-77: Unneeded discard after adding a defer

If you adopt the deferred approach suggested above, this explicit session.DiscardHTTPResponse(resp) becomes redundant and can be removed, simplifying the happy-path flow.

v2/pkg/subscraping/sources/hackertarget/hackertarget.go (1)

40-45: Explicit close-error handling is 👍 — consider centralising with session.DiscardHTTPResponse for consistency

The new deferred block correctly captures and reports resp.Body.Close() errors, avoiding silent leaks.
Across the code-base, most sources now delegate response cleanup to session.DiscardHTTPResponse(resp), which also drains the body for connection reuse. To keep behaviour uniform and prevent future drift, you could:

- defer func() {
-     if err := resp.Body.Close(); err != nil {
-         results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-         s.errors++
-     }
- }()
+ defer func() {
+     if err := session.DiscardHTTPResponse(resp); err != nil {
+         results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
+         s.errors++
+     }
+ }()

This removes duplication and leverages the helper’s drain/reuse logic.
If DiscardHTTPResponse does not return an error, you can still call it first and fall back to resp.Body.Close() for error capture.

v2/pkg/subscraping/sources/threatbook/threatbook.go (1)

67-71: Consider deferring the discard for single-exit readability

After successful decoding you immediately call session.DiscardHTTPResponse(resp), which is correct.
Using a defer right after the SimpleGet call would eliminate the need to remember manual invocations on every exit path and prevent future leaks when code is refactored.

resp, err := session.SimpleGet(ctx, url)
if resp != nil {
    defer session.DiscardHTTPResponse(resp)
}

Not blocking, but it reduces cognitive load and guards against maintenance slip-ups.

v2/pkg/subscraping/sources/c99/c99.go (1)

59-64: Close-error handling is welcome, but diverges from the new session.DiscardHTTPResponse helper

Good call catching resp.Body.Close() errors and surfacing them through the results channel.
That said, most updated sources in this PR switched to the centralised helper session.DiscardHTTPResponse(resp), which:

• drains & closes the body (keeps the HTTP transport reusable)
• nil-checks resp
• encapsulates the same error handling logic

Using the helper keeps behaviour uniform and avoids duplicating subtle cleanup logic:

-	defer func() {
-		if err := resp.Body.Close(); err != nil {
-			results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-			s.errors++
-		}
-	}()
+	defer session.DiscardHTTPResponse(resp)

If pushing the error into results is essential, consider extending DiscardHTTPResponse to optionally report through a callback/channel rather than diverging per source.

v2/pkg/subscraping/sources/intelx/intelx.go (1)

126-134: Remove redundant io.ReadAll before discarding the response

The helper already consumes & closes the body; the explicit io.ReadAll followed by another call introduces an unnecessary second read. Streamline this block:

-           _, err = io.ReadAll(resp.Body)
-           if err != nil {
-               results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-               s.errors++
-               session.DiscardHTTPResponse(resp)
-               return
-           }
-           session.DiscardHTTPResponse(resp)
+           session.DiscardHTTPResponse(resp)

This removes needless I/O while preserving error handling through the helper.

v2/pkg/subscraping/sources/digitorus/digitorus.go (1)

44-49: Consider re-using the common response-cleanup helper for consistency

Every other modernized source now relies on session.DiscardHTTPResponse(resp) so that response closing (and associated error logging) is handled in one place.
Duplicating the logic here means:

  1. We lose the single-point behaviour (future fixes/improvements to DiscardHTTPResponse will not reach this file).
  2. The per-source errors++ bookkeeping becomes inconsistent – other sources only log, this one also counts the error.

If you want to keep the local error counting, consider extending DiscardHTTPResponse to accept a callback so each caller can decide what to do with the close error, and then replace the block with a single defer.

v2/pkg/subscraping/sources/rapiddns/rapiddns.go (2)

51-52: Guard against nil response before discarding

session.DiscardHTTPResponse(resp) is invoked here after io.ReadAll fails.
While resp is expected to be non-nil in this branch, we already call the same helper in the earlier err != nil path (line 43) where resp can legitimately be nil.
To be defensive and avoid accidental nil-pointer dereferences should DiscardHTTPResponse ever omit an internal nil-check, wrap the call:

-session.DiscardHTTPResponse(resp)
+if resp != nil {
+    session.DiscardHTTPResponse(resp)
+}

55-56: Consider draining once per iteration to avoid duplicate work

You now discard the response body twice per successful iteration:

  1. After a read error (lines 51-52).
  2. After a successful read (line 55).

The first is needed only on the error path, while the second is always executed.
A clearer pattern is:

resp, err := session.SimpleGet(...)
if err != nil {
    // handle & return
}
defer session.DiscardHTTPResponse(resp) // runs on both success & error below

body, err := io.ReadAll(resp.Body)
if err != nil {
    // handle & return
}

Because the loop exits on each return, the deferred call fires immediately, keeping memory usage low and avoiding repeated identical statements.
Up to you whether the small readability gain outweighs the extra defer allocation per request.

v2/pkg/subscraping/sources/threatcrowd/threatcrowd.go (2)

54-59: Prefer using session.DiscardHTTPResponse to standardise cleanup across sources

Almost every other modernised source now delegates connection reuse + close-error handling to session.DiscardHTTPResponse. Retaining a bespoke resp.Body.Close() block here diverges from that convention and misses the extra drain logic the helper provides.

A minimal, drop-in adjustment:

-       defer func() {
-           if err := resp.Body.Close(); err != nil {
-               results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-               s.errors++
-           }
-       }()
+       // Drain & close so the connection can be reused.
+       defer func() {
+           if cerr := session.DiscardHTTPResponse(resp); cerr != nil {
+               results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: cerr}
+               s.errors++
+           }
+       }()

Gives identical error reporting while aligning with the rest of the codebase and ensuring HTTP keep-alive.


55-56: Minor lint: avoid shadowing err inside the deferred func

if err := … introduces a new shadow variable that can trip static-analysis linters (shadow, gosimple). Using a distinct name keeps the outer err untouched and silences the warning:

-           if err := resp.Body.Close(); err != nil {
-               results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
+           if closeErr := resp.Body.Close(); closeErr != nil {
+               results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: closeErr}
v2/pkg/subscraping/sources/bevigil/bevigil.go (1)

60-67: Prefer a single deferred cleanup instead of multiple explicit calls

session.DiscardHTTPResponse(resp) is now invoked in every exit path (decode failure and success). Replacing these duplicates with one defer immediately after a non-nil resp is obtained shortens the function and eliminates the chance of forgetting a path during future edits.

resp, err := session.Get(ctx, getUrl, "", headers)
if err != nil { … }

+defer session.DiscardHTTPResponse(resp)
 
// remove explicit calls on lines 62 and 66

This mirrors the standard defer resp.Body.Close() idiom and keeps resource-management concerns in one place.

v2/pkg/subscraping/sources/dnsdumpster/dnsdumpster.go (1)

68-68: Remove redundant discard to prevent second close

With the above deferred cleanup in place, this explicit call becomes redundant (and could be nil when the request itself failed). Deleting it simplifies the flow and avoids needless work.

-           session.DiscardHTTPResponse(resp)
v2/pkg/subscraping/sources/sitedossier/sitedossier.go (1)

66-70: Guard against nil response before discarding

session.DiscardHTTPResponse(resp) is invoked unconditionally.
If resp is ever nil (e.g., SimpleGet returned an error and resp was nil), this could panic unless DiscardHTTPResponse is explicitly nil-safe.

-if err != nil {
-    ...
-    session.DiscardHTTPResponse(resp)
+if err != nil {
+    ...
+    if resp != nil {
+        session.DiscardHTTPResponse(resp)
+    }

Please verify that DiscardHTTPResponse gracefully handles nil; otherwise, add the check.

v2/pkg/subscraping/sources/fullhunt/fullhunt.go (1)

62-63: Duplicate call—safe but redundant once defer is applied.

After moving cleanup into a defer, this line becomes unnecessary and can be removed to avoid confusion.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 26f287b and 20bef2a.

⛔ Files ignored due to path filters (1)
  • .github/workflows/build-test.yml is excluded by !**/*.yml
📒 Files selected for processing (42)
  • v2/pkg/runner/outputter.go (3 hunks)
  • v2/pkg/subscraping/sources/alienvault/alienvault.go (1 hunks)
  • v2/pkg/subscraping/sources/anubis/anubis.go (2 hunks)
  • v2/pkg/subscraping/sources/bevigil/bevigil.go (1 hunks)
  • v2/pkg/subscraping/sources/bufferover/bufferover.go (1 hunks)
  • v2/pkg/subscraping/sources/c99/c99.go (1 hunks)
  • v2/pkg/subscraping/sources/censys/censys.go (1 hunks)
  • v2/pkg/subscraping/sources/certspotter/certspotter.go (2 hunks)
  • v2/pkg/subscraping/sources/chinaz/chinaz.go (1 hunks)
  • v2/pkg/subscraping/sources/commoncrawl/commoncrawl.go (2 hunks)
  • v2/pkg/subscraping/sources/crtsh/crtsh.go (4 hunks)
  • v2/pkg/subscraping/sources/digitalyama/digitalyama.go (1 hunks)
  • v2/pkg/subscraping/sources/digitorus/digitorus.go (1 hunks)
  • v2/pkg/subscraping/sources/dnsdb/dnsdb.go (3 hunks)
  • v2/pkg/subscraping/sources/dnsdumpster/dnsdumpster.go (1 hunks)
  • v2/pkg/subscraping/sources/dnsrepo/dnsrepo.go (1 hunks)
  • v2/pkg/subscraping/sources/facebook/ctlogs.go (2 hunks)
  • v2/pkg/subscraping/sources/facebook/ctlogs_test.go (2 hunks)
  • v2/pkg/subscraping/sources/fofa/fofa.go (2 hunks)
  • v2/pkg/subscraping/sources/fullhunt/fullhunt.go (1 hunks)
  • v2/pkg/subscraping/sources/github/github.go (3 hunks)
  • v2/pkg/subscraping/sources/gitlab/gitlab.go (2 hunks)
  • v2/pkg/subscraping/sources/hackertarget/hackertarget.go (1 hunks)
  • v2/pkg/subscraping/sources/hudsonrock/hudsonrock.go (1 hunks)
  • v2/pkg/subscraping/sources/intelx/intelx.go (2 hunks)
  • v2/pkg/subscraping/sources/leakix/leakix.go (1 hunks)
  • v2/pkg/subscraping/sources/netlas/netlas.go (3 hunks)
  • v2/pkg/subscraping/sources/pugrecon/pugrecon.go (2 hunks)
  • v2/pkg/subscraping/sources/quake/quake.go (2 hunks)
  • v2/pkg/subscraping/sources/rapiddns/rapiddns.go (1 hunks)
  • v2/pkg/subscraping/sources/reconcloud/reconcloud.go (1 hunks)
  • v2/pkg/subscraping/sources/redhuntlabs/redhuntlabs.go (2 hunks)
  • v2/pkg/subscraping/sources/riddler/riddler.go (1 hunks)
  • v2/pkg/subscraping/sources/robtex/robtext.go (1 hunks)
  • v2/pkg/subscraping/sources/securitytrails/securitytrails.go (2 hunks)
  • v2/pkg/subscraping/sources/shodan/shodan.go (1 hunks)
  • v2/pkg/subscraping/sources/sitedossier/sitedossier.go (1 hunks)
  • v2/pkg/subscraping/sources/threatbook/threatbook.go (1 hunks)
  • v2/pkg/subscraping/sources/threatcrowd/threatcrowd.go (1 hunks)
  • v2/pkg/subscraping/sources/threatminer/threatminer.go (1 hunks)
  • v2/pkg/subscraping/sources/virustotal/virustotal.go (2 hunks)
  • v2/pkg/subscraping/sources/waybackarchive/waybackarchive.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
  • v2/pkg/subscraping/sources/quake/quake.go
  • v2/pkg/subscraping/sources/netlas/netlas.go
  • v2/pkg/runner/outputter.go
  • v2/pkg/subscraping/sources/certspotter/certspotter.go
  • v2/pkg/subscraping/sources/securitytrails/securitytrails.go
  • v2/pkg/subscraping/sources/virustotal/virustotal.go
  • v2/pkg/subscraping/sources/fofa/fofa.go
  • v2/pkg/subscraping/sources/pugrecon/pugrecon.go
  • v2/pkg/subscraping/sources/crtsh/crtsh.go
🧰 Additional context used
🧬 Code Graph Analysis (6)
v2/pkg/subscraping/sources/hackertarget/hackertarget.go (2)
v2/pkg/subscraping/types.go (3)
  • Result (81-86)
  • Source (37-62)
  • Error (94-94)
v2/pkg/subscraping/sources/anubis/anubis.go (1)
  • Source (16-20)
v2/pkg/subscraping/sources/threatcrowd/threatcrowd.go (1)
v2/pkg/subscraping/types.go (2)
  • Result (81-86)
  • Error (94-94)
v2/pkg/subscraping/sources/dnsdb/dnsdb.go (1)
v2/pkg/subscraping/types.go (3)
  • Result (81-86)
  • Source (37-62)
  • Error (94-94)
v2/pkg/subscraping/sources/threatminer/threatminer.go (2)
v2/pkg/subscraping/types.go (2)
  • Result (81-86)
  • Error (94-94)
v2/pkg/subscraping/sources/anubis/anubis.go (1)
  • Source (16-20)
v2/pkg/subscraping/sources/shodan/shodan.go (2)
v2/pkg/subscraping/types.go (2)
  • Result (81-86)
  • Error (94-94)
v2/pkg/subscraping/sources/rapiddns/rapiddns.go (1)
  • Source (18-22)
v2/pkg/subscraping/sources/digitorus/digitorus.go (2)
v2/pkg/subscraping/types.go (3)
  • Result (81-86)
  • Source (37-62)
  • Error (94-94)
v2/pkg/subscraping/sources/anubis/anubis.go (1)
  • Source (16-20)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Test Builds (macOS-13)
🔇 Additional comments (19)
v2/pkg/subscraping/sources/digitalyama/digitalyama.go (1)

57-62: Deferred‐close error handling is correct, but double-check the defer order.

Great to see explicit handling of the Close() error here.
Because this defer is registered after the earlier defer func(startTime…) at line 39, it will execute before the results channel is closed, so the extra results <- … send is safe. Just keep this ordering rule in mind if either block is moved in future refactors.
Looks good as-is.
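The ordering rule relied on here is that deferred calls run last-in, first-out; a tiny sketch:

```go
package main

import "fmt"

// deferOrder records the LIFO execution order of deferred calls.
func deferOrder() []string {
	var order []string
	func() {
		defer func() { order = append(order, "registered first") }()
		defer func() { order = append(order, "registered second") }()
	}()
	return order
}

func main() {
	fmt.Println(deferOrder()) // [registered second registered first]
}
```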

v2/pkg/subscraping/sources/censys/censys.go (1)

131-132: Good move to centralised cleanup

Replacing direct resp.Body.Close() with session.DiscardHTTPResponse(resp) aligns with the new session-level resource handling pattern, reducing duplication and ensuring consistent draining/closing behaviour.

v2/pkg/subscraping/sources/intelx/intelx.go (2)

88-93: Ensure DiscardHTTPResponse is nil-safe

session.SimplePost can return a non-nil error with a nil resp (e.g. dial failure).
session.DiscardHTTPResponse(resp) is now invoked unconditionally. Please double-check that DiscardHTTPResponse gracefully handles a nil argument to avoid a potential panic.


101-106: Change looks good – centralised response disposal

Replacing the manual resp.Body.Close() with session.DiscardHTTPResponse(resp) keeps the success path consistent with the new helper and avoids duplicated code elsewhere.

v2/pkg/subscraping/sources/digitorus/digitorus.go (1)

35-43: Guard against a nil resp before dereferencing

session.SimpleGet may return a nil response on network errors.
Although ptr.Safe(resp).StatusCode prevents the panic in the error-handling branch, execution can still reach the scanning code with a nil resp if err == nil but resp unexpectedly becomes nil (e.g. custom transport). A trivial early check avoids the latent nil-pointer risk:

 resp, err := session.SimpleGet(ctx, fmt.Sprintf("https://certificatedetails.com/%s", domain))
 if err != nil && ptr.Safe(resp).StatusCode != http.StatusNotFound {
     …
 }
+if resp == nil {
+    results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: fmt.Errorf("nil HTTP response")}
+    s.errors++
+    return
+}
v2/pkg/subscraping/sources/dnsdb/dnsdb.go (2)

107-110: Prefer session.DiscardHTTPResponse over manual resp.Body.Close for consistency

Throughout the codebase the helper wraps the close and logs any failure; here you fall back to a direct Close, essentially duplicating logic that already exists in DiscardHTTPResponse. Consider replacing these four lines with the helper for uniform behaviour and to avoid having two slightly-different error-handling paths.


119-119: 👍 Consistent cleanup on unmarshal failure

Switching to session.DiscardHTTPResponse(resp) keeps the error-handling style aligned with the rest of the project and guarantees the body is closed even when resp is nil.

v2/pkg/subscraping/sources/redhuntlabs/redhuntlabs.go (3)

70-76: Centralised response disposal looks good

Replacing multiple scattered resp.Body.Close() calls with the session-level DiscardHTTPResponse increases consistency and allows unified metrics/tracing. Nice clean-up.


91-97: Same nil-safety consideration in pagination error branch

The error branch inside the paginated loop also calls session.DiscardHTTPResponse(resp) unconditionally.
Ensure the helper handles nil to avoid crashes when an HTTP request fails before the response object is created.


60-64: Possible nil‐pointer panic when resp is nil

session.Get may return nil, err on network/transport failures.
session.DiscardHTTPResponse(resp) is invoked unconditionally even when resp == nil; if the helper does not guard against a nil response this will panic and kill the worker goroutine.

Run the quick scan below to confirm that DiscardHTTPResponse contains a nil-check:

```shell
#!/bin/bash
# Inspect helper for defensive nil handling
rg --context 2 'func .*DiscardHTTPResponse' | head -n 20
```

If the helper is not nil-safe, add a guard here or inside the helper:

- session.DiscardHTTPResponse(resp)
+ if resp != nil {
+     session.DiscardHTTPResponse(resp)
+ }
v2/pkg/subscraping/sources/bevigil/bevigil.go (1)

49-54: Verification script to check whether DiscardHTTPResponse handles a nil response:

```shell
#!/bin/bash
# Display DiscardHTTPResponse implementation to check nil handling
sed -n '110,150p' v2/pkg/subscraping/agent.go
```

v2/pkg/subscraping/sources/chinaz/chinaz.go (1)

`49-52`: I’d like to inspect how `SimpleGet`, error paths, and `DiscardHTTPResponse` are used around your snippet. Running:


```shell
#!/bin/bash
# Show the call to SimpleGet and its error handling
rg -n -C3 'SimpleGet' v2/pkg/subscraping/sources/chinaz/chinaz.go

# Show where io.ReadAll and DiscardHTTPResponse are invoked
rg -n -C3 'io.ReadAll|DiscardHTTPResponse' v2/pkg/subscraping/sources/chinaz/chinaz.go
```
v2/pkg/subscraping/sources/commoncrawl/commoncrawl.go (1)

62-66: Cleanup improvement acknowledged

Switching to session.DiscardHTTPResponse(resp) standardises response handling—looks good.

v2/pkg/subscraping/sources/facebook/ctlogs.go (2)

48-52: Nice touch on error-checked close

Wrapping resp.Body.Close() in a defer that logs any error improves observability.


120-121: Consistent response disposal

Using session.DiscardHTTPResponse(resp) right after body read keeps resource handling consistent.

v2/pkg/subscraping/sources/github/github.go (2)

115-120: Double-check header usage after body discard

session.DiscardHTTPResponse(resp) closes the body; headers remain accessible, but confirm the helper doesn’t nil-out the response. Subsequent access to resp.Header (line 129) relies on that.


95-96: Standardised cleanup looks good

Replacing direct Body.Close() with DiscardHTTPResponse keeps cleanup uniform.

Also applies to: 103-104, 176-177

v2/pkg/subscraping/sources/facebook/ctlogs_test.go (1)

45-49: Good practice in tests

Adding error logging on body close mirrors production code quality—nice.

v2/pkg/subscraping/sources/fullhunt/fullhunt.go (1)

56-60: Verification script to check DiscardHTTPResponse for nil response handling:

```shell
#!/bin/bash
# Display DiscardHTTPResponse implementation to check for nil response handling
sed -n '110,160p' v2/pkg/subscraping/agent.go
```


@ehsandeep ehsandeep merged commit b669adb into dev Jun 20, 2025
10 checks passed
@ehsandeep ehsandeep deleted the modernize branch June 20, 2025 16:32
