
fix:context cancellation sources #1680

Merged
ehsandeep merged 3 commits into dev from fix/context-cancellation-sources on Dec 4, 2025

Conversation

@knakul853 knakul853 (Contributor) commented Dec 1, 2025

closes #1679

Summary by CodeRabbit

Release Notes

  • Refactor
    • Enhanced responsiveness to cancellation requests across all passive DNS enumeration sources for faster shutdown and termination.
    • Improved resource cleanup and eliminated blocking scenarios during shutdown, ensuring smoother operations when interrupts are triggered.
    • Strengthened control flow for more reliable handling of cancellation in all data retrieval paths.


@coderabbitai coderabbitai bot commented Dec 1, 2025

Walkthrough

This pull request adds context cancellation support across the enumeration framework. The core changes wrap result emissions in select statements that handle context cancellation, add ctx.Done() checks in pagination and processing loops, and ensure result counters are only incremented upon successful channel sends.

Changes

Cohort / File(s) / Summary
Core passive enumeration synchronization
pkg/passive/passive.go
Added defer wg.Done() for proper goroutine completion signaling and wrapped result emissions in select statements to guard against context cancellation.
Subscraping sources with simple result emission
pkg/subscraping/sources/alienvault/alienvault.go, anubis/anubis.go, bevigil/bevigil.go, bufferover/bufferover.go, builtwith/builtwith.go, c99/c99.go, digitalyama/digitalyama.go, digitorus/digitorus.go, dnsdumpster/dnsdumpster.go, dnsrepo/dnsrepo.go, domainsproject/domainsproject.go, driftnet/driftnet.go, fofa/fofa.go, fullhunt/fullhunt.go, hackertarget/hackertarget.go, hudsonrock/hudsonrock.go, leakix/leakix.go, netlas/netlas.go, pugrecon/pugrecon.go, reconcloud/reconcloud.go, riddler/riddler.go, robtex/robtext.go, sitedossier/sitedossier.go, threatbook/threatbook.go, threatcrowd/threatcrowd.go, threatminer/threatminer.go, whoisxmlapi/whoisxmlapi.go
Each wraps subdomain result sends in a select statement with ctx.Done() case to enable graceful early exit on cancellation, incrementing result counters only on successful sends.
Subscraping sources with pagination loops
pkg/subscraping/sources/certspotter/certspotter.go, dnsdb/dnsdb.go, onyphe/onyphe.go, rapiddns/rapiddns.go, rsecloud/rsecloud.go, securitytrails/securitytrails.go, shodan/shodan.go, virustotal/virustotal.go, windvane/windvane.go, zoomeyeapi/zoomeyeapi.go
Adds ctx.Done() checks before issuing paginated requests and wraps result emissions in select statements to enable early termination during pagination.
Subscraping sources with loop cancellation checks
pkg/subscraping/sources/chinaz/chinaz.go, intelx/intelx.go, quake/quake.go
Adds ctx.Done() checks at the start of processing loops and wraps result emissions in cancellation-aware select blocks.
Subscraping sources with multi-phase cancellation
pkg/subscraping/sources/censys/censys.go, commoncrawl/commoncrawl.go, crtsh/crtsh.go, github/github.go
Introduces comprehensive context cancellation checks across outer pagination loops, inner processing loops, and line-by-line iteration phases; wraps all result emissions in select with ctx.Done() and ensures HTTP responses are discarded on cancellation.
Subscraping sources with context integration
pkg/subscraping/sources/chaos/chaos.go
Updates Run method signature to explicitly use context parameter (previously unused); adds cancellation checks in subdomain iteration loop and error break behavior.
Subscraping sources with pre-request cancellation
pkg/subscraping/sources/facebook/ctlogs.go, gitlab/gitlab.go, redhuntlabs/redhuntlabs.go, waybackarchive/waybackarchive.go
Adds select-based cancellation checks before API requests and wraps result emission paths in context-aware sends.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant Source
    participant Context
    participant ResultChan

    rect rgb(200, 220, 240)
    note over Source: OLD: Unconditional result emission
    Source->>ResultChan: Send result
    Source->>Source: s.results++
    end

    rect rgb(220, 240, 200)
    note over Source: NEW: Context-aware result emission
    loop Each result
        Source->>Context: Check ctx.Done()?
        alt Context cancelled
            Context-->>Source: Done signal received
            Source->>Source: Return early
        else Context active
            Source->>ResultChan: Send result (non-blocking via select)
            alt Send succeeded
                ResultChan-->>Source: Result accepted
                Source->>Source: s.results++
            else Send blocked (unusual)
                Source->>Source: Wait on ctx or send
            end
        end
    end
    end

    Caller->>Context: Cancel context
    Context-->>Source: Propagate cancellation
    Source->>Source: Exit gracefully

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45-60 minutes

Areas requiring extra attention:

  • Pagination loop variations — Files like censys.go, github.go, and commoncrawl.go have multiple nested loops with context checks at different levels; verify that all loops properly handle cancellation and that HTTP responses are discarded on early exit.
  • Context parameter usage — chaos.go updates the method signature from unused _ to explicit ctx; verify the signature change is consistent with the interface and that the context is properly threaded through the goroutine.
  • Result counter placement — Across all 50+ files, result counters must be incremented only within the successful send branch of the select statement; spot-check several files to confirm correct placement.
  • Error handling paths — Some sources like rapiddns.go and redhuntlabs.go add response discard logic on error; verify error-path cancellation behavior is consistent with success paths.
  • Goroutine synchronization — Verify that passive.go's defer wg.Done() addition does not introduce double-signaling or race conditions with existing synchronization code.

Poem

🐰 A hop, a skip, through contexts we zip,
Cancellations now caught with a select-based flip,
No more endless loops when the caller says "stop,"
Each source now graceful—no leaks from the top! 🎯

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
  • Description Check — ✅ Passed — Check skipped: CodeRabbit’s high-level summary is enabled.
  • Title Check — ✅ Passed — The title 'Fix/context cancellation sources' clearly identifies the main change: adding context cancellation support to enumeration sources.
  • Linked Issues Check — ✅ Passed — The PR successfully implements all coding requirements from issue #1679: context checks in pagination loops, cancellation-aware channel sends via select statements, and updates to passive.go.
  • Out of Scope Changes Check — ✅ Passed — All changes are directly related to the objective of adding context cancellation support across enumeration sources. No out-of-scope modifications detected.
  • Docstring Coverage — ✅ Passed — No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

@knakul853 knakul853 changed the title from Fix/context cancellation sources to fix:context cancellation sources on Dec 1, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (24)
pkg/subscraping/sources/fullhunt/fullhunt.go (2)

47-51: Inconsistent context handling for error results.

Error results are sent directly to the channel without checking context cancellation, which can cause the send to block indefinitely if the context is cancelled and the receiver stops consuming. Additionally, s.errors++ is incremented before the send completes, which is inconsistent with the pattern used for subdomain results (line 68).

Apply this diff to make error handling consistent with context cancellation:

 	resp, err := session.Get(ctx, fmt.Sprintf("https://fullhunt.io/api/v1/domain/%s/subdomains", domain), "", map[string]string{"X-API-KEY": randomApiKey})
 	if err != nil {
-		results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-		s.errors++
+		select {
+		case <-ctx.Done():
+			return
+		case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+			s.errors++
+		}
 		session.DiscardHTTPResponse(resp)
 		return
 	}

56-61: Inconsistent context handling for error results.

Same issue as lines 47-51: error results are sent directly without checking context cancellation, and the counter is incremented before the send completes.

Apply this diff to make error handling consistent with context cancellation:

 	var response fullHuntResponse
 	err = jsoniter.NewDecoder(resp.Body).Decode(&response)
 	if err != nil {
-		results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-		s.errors++
+		select {
+		case <-ctx.Done():
+			return
+		case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+			s.errors++
+		}
 		session.DiscardHTTPResponse(resp)
 		return
 	}
pkg/subscraping/sources/threatminer/threatminer.go (1)

40-43: Guard error sends with ctx.Done() to avoid blocking after cancellation

Both error paths still use a plain send on results, which can block indefinitely if the consumer stops reading the channel when the context is cancelled (as per the PR objective to make callers able to abort enumeration). To fully honor cancellation and avoid leaking goroutines, consider wrapping error sends in the same select pattern:

-        if err != nil {
-            results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-            s.errors++
-            session.DiscardHTTPResponse(resp)
-            return
-        }
+        if err != nil {
+            select {
+            case <-ctx.Done():
+                session.DiscardHTTPResponse(resp)
+                return
+            case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+                s.errors++
+            }
+            session.DiscardHTTPResponse(resp)
+            return
+        }
@@
-        if err != nil {
-            results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-            s.errors++
-            return
-        }
+        if err != nil {
+            select {
+            case <-ctx.Done():
+                return
+            case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+                s.errors++
+            }
+            return
+        }

This keeps behavior identical under normal operation but ensures the goroutine can exit promptly when ctx is cancelled instead of blocking on an unreceived error.

Also applies to: 51-53

pkg/subscraping/sources/fofa/fofa.go (1)

98-99: Use context-aware select for result emission and conditional counter increment.

The current implementation sends results directly to the channel without context awareness and unconditionally increments the counter. This pattern can cause:

  1. Blocked sends if the receiver stops consuming after context cancellation
  2. Inflated result counts when sends don't complete

Per the AI summary, other sources "switch direct channel sends to context-aware selects to prevent blocking and to increment s.results only on successful sends."

Apply this diff to align with the cancellation-safe pattern:

-				results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}
-				s.results++
+				select {
+				case <-ctx.Done():
+					return
+				case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}:
+					s.results++
+				}
pkg/subscraping/sources/zoomeyeapi/zoomeyeapi.go (1)

74-81: Critical: Error result sends can block after context cancellation.

The direct channel send at line 76 (and similarly at line 87) does not check for context cancellation. Since results is an unbuffered channel, these sends will block indefinitely if the receiver has stopped consuming due to context cancellation, violating the requirement that enumeration must stop immediately.

Apply this diff to wrap error sends in context-aware select statements:

 			if err != nil {
 				if !isForbidden {
-					results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-					s.errors++
+					select {
+					case <-ctx.Done():
+						session.DiscardHTTPResponse(resp)
+						return
+					case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+						s.errors++
+					}
 					session.DiscardHTTPResponse(resp)
 				}
 				return
 			}

And similarly for the error case at lines 86-91:

 			if err != nil {
-				results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-				s.errors++
+				select {
+				case <-ctx.Done():
+					_ = resp.Body.Close()
+					return
+				case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+					s.errors++
+				}
 				_ = resp.Body.Close()
 				return
 			}
pkg/subscraping/sources/threatbook/threatbook.go (1)

56-56: Error sends must also check context cancellation to prevent goroutine leaks.

All error result sends use direct channel operations without checking ctx.Done(). Since the results channel is unbuffered (line 38), if the context is cancelled and the receiver stops reading, these sends will block indefinitely, causing the goroutine to leak.

Apply a similar select pattern to error sends. For example, for line 56:

-		results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-		s.errors++
+		select {
+		case <-ctx.Done():
+			return
+		case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+			s.errors++
+		}

Apply the same pattern to error sends at lines 65, 73-75, and 83 to ensure consistent context-aware behavior across all result emissions.

Also applies to: 65-65, 73-75, 83-83

pkg/subscraping/sources/onyphe/onyphe.go (1)

102-122: Wrap channel sends in select statements to prevent blocked sends after cancellation.

The channel sends to results are not wrapped in select statements with ctx.Done() checks. If the context is canceled while the goroutine attempts to send, and the receiver has stopped consuming from the channel, these sends will block indefinitely, causing a goroutine leak.

Additionally, result counters should only be incremented when the send succeeds (i.e., when the case results <- ... executes, not when ctx.Done() fires).

Apply this pattern to all result sends:

 				for _, subdomain := range record.Subdomains {
 					if subdomain != "" {
-						results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}
-						s.results++
+						select {
+						case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}:
+							s.results++
+						case <-ctx.Done():
+							return
+						}
 					}
 				}
 
 				if record.Hostname != "" {
-					results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: record.Hostname}
-					s.results++
+					select {
+					case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: record.Hostname}:
+						s.results++
+					case <-ctx.Done():
+						return
+					}
 				}
 
 				if record.Forward != "" {
-					results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: record.Forward}
-					s.results++
+					select {
+					case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: record.Forward}:
+						s.results++
+					case <-ctx.Done():
+						return
+					}
 				}
 
 				if record.Reverse != "" {
-					results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: record.Reverse}
-					s.results++
+					select {
+					case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: record.Reverse}:
+						s.results++
+					case <-ctx.Done():
+						return
+					}
 				}
pkg/subscraping/sources/builtwith/builtwith.go (1)

60-61: Wrap error sends in select statements to prevent blocking.

The error result sends at lines 60 and 69 use direct channel sends, which can block indefinitely if the context is cancelled and the receiver stops reading. This is inconsistent with the context-aware pattern applied to subdomain results at lines 77-82.

Apply this pattern to both error sends:

For lines 60-61:

-		results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-		s.errors++
+		select {
+		case <-ctx.Done():
+			session.DiscardHTTPResponse(resp)
+			return
+		case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+			s.errors++
+		}

For lines 69-70:

-		results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-		s.errors++
+		select {
+		case <-ctx.Done():
+			session.DiscardHTTPResponse(resp)
+			return
+		case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+			s.errors++
+		}

Also applies to: 69-70

pkg/subscraping/sources/facebook/ctlogs.go (1)

112-115: Wrap error result sends in select with ctx.Done() to prevent goroutine leaks.

The error result sends on lines 114, 121, and 128 are not wrapped in a select statement with ctx.Done(), unlike the regular result sends (lines 133-138). If the context is cancelled and the caller stops reading from the results channel, these sends will block indefinitely, causing goroutine leaks.

Apply this diff to add context handling for error sends:

 		resp, err := session.Get(ctx, domainsURL, "", nil)
 		if err != nil {
 			s.errors++
-			results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
+			select {
+			case <-ctx.Done():
+				return
+			case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+			}
 			return
 		}
 		bin, err := io.ReadAll(resp.Body)
 		if err != nil {
 			s.errors++
 			gologger.Verbose().Msgf("failed to read response body: %s\n", err)
-			results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
+			select {
+			case <-ctx.Done():
+				return
+			case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+			}
 			return
 		}
 		session.DiscardHTTPResponse(resp)
 		response := &response{}
 		if err := json.Unmarshal(bin, response); err != nil {
 			s.errors++
-			results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: errorutil.NewWithErr(err).Msgf("failed to unmarshal response: %s", string(bin))}
+			select {
+			case <-ctx.Done():
+				return
+			case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: errorutil.NewWithErr(err).Msgf("failed to unmarshal response: %s", string(bin))}:
+			}
 			return
 		}

Also applies to: 118-122, 126-129

pkg/subscraping/sources/shodan/shodan.go (1)

83-92: Critical: Unconditional channel send can block after context cancellation.

The select statement checks ctx.Done() but falls through to an unconditional channel send on line 89. If the context is cancelled between the select check and the send, the goroutine will block indefinitely, violating the PR's objective to stop enumeration immediately upon cancellation.

Apply this diff to wrap the send inside the select statement:

 			for _, data := range response.Subdomains {
 				select {
 				case <-ctx.Done():
 					return
-				default:
-				}
-				value := fmt.Sprintf("%s.%s", data, response.Domain)
-				results <- subscraping.Result{
-					Source: s.Name(), Type: subscraping.Subdomain, Value: value,
-				}
-				s.results++
+				case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: fmt.Sprintf("%s.%s", data, response.Domain)}:
+					s.results++
+				}
 			}
pkg/subscraping/sources/securitytrails/securitytrails.go (1)

106-119: Inconsistent cancellation handling: subdomain send is unguarded.

The Subdomains loop checks ctx.Done() at the start (lines 107-111), but the actual channel send at line 117 is not wrapped in a select. This differs from the Records loop (lines 98-103) which correctly guards the send. If the context is cancelled during the send, this could block.

Apply this diff for consistency:

 		for _, subdomain := range securityTrailsResponse.Subdomains {
-			select {
-			case <-ctx.Done():
-				return
-			default:
-			}
 			if strings.HasSuffix(subdomain, ".") {
 				subdomain += domain
 			} else {
 				subdomain = subdomain + "." + domain
 			}
-			results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}
-			s.results++
+			select {
+			case <-ctx.Done():
+				return
+			case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}:
+				s.results++
+			}
 		}
pkg/subscraping/sources/chaos/chaos.go (1)

44-57: Channel sends should be wrapped in select to prevent goroutine leaks.

The context check at the loop start is good, but the channel sends on lines 50 and 54-56 are not protected. If context is cancelled between the check and the send, and the receiver stops reading (because it also respects cancellation), this goroutine will block forever.

Per the PR objectives, use select with ctx.Done() when sending results to avoid blocked sends after cancellation:

 		for result := range chaosClient.GetSubdomains(&chaos.SubdomainsRequest{
 			Domain: domain,
 		}) {
 			select {
 			case <-ctx.Done():
 				return
 			default:
 			}
 			if result.Error != nil {
-				results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: result.Error}
-				s.errors++
+				select {
+				case <-ctx.Done():
+					return
+				case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: result.Error}:
+					s.errors++
+				}
 				break
 			}
-			results <- subscraping.Result{
-				Source: s.Name(), Type: subscraping.Subdomain, Value: fmt.Sprintf("%s.%s", result.Subdomain, domain),
+			select {
+			case <-ctx.Done():
+				return
+			case results <- subscraping.Result{
+				Source: s.Name(), Type: subscraping.Subdomain, Value: fmt.Sprintf("%s.%s", result.Subdomain, domain),
+			}:
+				s.results++
 			}
-			s.results++
 		}

With this pattern, the initial select at lines 44-48 becomes redundant and can be removed.

pkg/subscraping/sources/robtex/robtext.go (2)

59-61: Error sends must respect context cancellation to prevent goroutine blocking.

The error result sends at lines 59 and 73 use direct channel operations without checking ctx.Done(). If the context is cancelled and the receiver stops consuming from the channel, these sends will block indefinitely, preventing the goroutine from exiting cleanly and defeating the PR's objective of immediate cancellation.

Apply this pattern to both error sends:

-		results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-		s.errors++
+		select {
+		case <-ctx.Done():
+			return
+		case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+			s.errors++
+		}

Also applies to: 73-75


104-116: Add context cancellation check in the response scanning loop.

The enumerate function processes the response line-by-line without checking for context cancellation. For large responses, this means enumeration will continue processing all lines even after the context is cancelled, contradicting the PR objective of stopping "immediately (within milliseconds)."

Apply this diff to respect cancellation during response processing:

 	scanner := bufio.NewScanner(resp.Body)
 	for scanner.Scan() {
+		select {
+		case <-ctx.Done():
+			return results, ctx.Err()
+		default:
+		}
 		line := scanner.Text()
 		if line == "" {
 			continue
pkg/subscraping/sources/chinaz/chinaz.go (1)

41-47: Error-result sends are still not ctx-aware, and err is likely nil in the else branch.

Two related points in the error paths:

  1. Context handling on error sends

    At Lines 41-47 and 72-74, the error results are sent with a plain results <- .... If the consumer stops reading from results when the context is cancelled, these sends can still block the goroutine. For consistency with the new pattern in the loop and the PR objective, consider wrapping these sends as:

    - results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
    - s.errors++
    - session.DiscardHTTPResponse(resp)
    - return
    + select {
    + case <-ctx.Done():
    +     session.DiscardHTTPResponse(resp)
    +     return
    + case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
    +     s.errors++
    +     session.DiscardHTTPResponse(resp)
    +     return
    + }

    and similarly in the else branch.

  2. Nil error value in the else branch

    When SubdomainList.ToBool() is false (Lines 71-74), err will typically be nil (the HTTP request and io.ReadAll have already succeeded), so you’re emitting an error result with a nil Error field. It would be clearer to construct a descriptive error for this condition, e.g.:

    - results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
    + results <- subscraping.Result{
    +     Source: s.Name(),
    +     Type:   subscraping.Error,
    +     Error:  fmt.Errorf("chinaz: missing or empty ContributingSubdomainList in response"),
    + }

    (and then optionally wrap that send in the ctx-aware select as above).

These are small, self-contained changes but would make error handling and cancellation behavior more consistent with the rest of the PR.

Also applies to: 72-74

pkg/subscraping/sources/windvane/windvane.go (1)

78-83: Consider wrapping error emissions in select for complete context cancellation support.

The error result sends at lines 79 and 90 are not wrapped in select statements. If the context is cancelled while an error is being processed, these sends could block indefinitely if the consumer has stopped reading from the channel.

For consistency with the PR's objective of comprehensive context cancellation support, consider applying the same select pattern used for subdomain results:

 if err != nil {
-	results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-	s.errors++
+	select {
+	case <-ctx.Done():
+		return
+	case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+		s.errors++
+	}
 	session.DiscardHTTPResponse(resp)
 	return
 }

Apply the same pattern at line 90.

Also applies to: 89-94

pkg/subscraping/sources/redhuntlabs/redhuntlabs.go (1)

59-61: Error paths still perform unguarded sends and can block after cancellation

All success-path result sends are now guarded by select on ctx.Done(), but the error paths still send directly to results:

  • Line 61 (initial request error)
  • Line 69 (initial decode error)
  • Line 87 (paginated request error)
  • Line 95 (paginated decode error)

If the upstream consumer stops reading from this source’s channel when ctx is canceled, these unbuffered sends can still block indefinitely and keep the goroutine alive, which partially defeats the PR’s goal of fast cancellation across sources.

Consider applying the same select pattern to error emissions, while ensuring responses are always discarded before potentially returning on cancellation. For example, for the paginated decode error case:

-        err = jsoniter.NewDecoder(resp.Body).Decode(&response)
-        if err != nil {
-          results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-          session.DiscardHTTPResponse(resp)
-          s.errors++
-          continue
-        }
+        err = jsoniter.NewDecoder(resp.Body).Decode(&response)
+        if err != nil {
+          session.DiscardHTTPResponse(resp)
+          s.errors++
+          select {
+          case <-ctx.Done():
+            return
+          case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+          }
+          continue
+        }

A similar pattern can be applied to the other error branches so that all sends on results become cancellation-aware while still cleaning up HTTP responses.

Also applies to: 69-72, 86-90, 93-99

pkg/subscraping/sources/rsecloud/rsecloud.go (1)

50-64: Add ctx-aware select around error sends to avoid blocking after cancellation

The loop-level ctx.Done() check at Lines 53-57 is good and prevents new requests after cancellation. However, in the error paths you still send on results without a select:

  • Line 60: results <- subscraping.Result{... Error: err}
  • Line 70: results <- subscraping.Result{... Error: err}

If the upstream has stopped reading from results because it honored ctx.Done(), these sends can block indefinitely, leaking this goroutine and partially defeating the context-cancellation guarantees you’re adding. Also, s.errors is currently incremented even if the send would block.

Consider mirroring the pattern you use for successful results:

  • Wrap both error sends in select { case <-ctx.Done(): return; case results <- ... }.
  • Move s.errors++ into the results case, so it only increments on successful delivery.

For example:

-				resp, err := session.Get(ctx, fmt.Sprintf("https://api.rsecloud.com/api/v2/subdomains/%s/%s?page=%d", endpoint, domain, page), "", headers)
-				if err != nil {
-					results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-					s.errors++
-					session.DiscardHTTPResponse(resp)
-					return
-				}
+				resp, err := session.Get(ctx, fmt.Sprintf("https://api.rsecloud.com/api/v2/subdomains/%s/%s?page=%d", endpoint, domain, page), "", headers)
+				if err != nil {
+					session.DiscardHTTPResponse(resp)
+					select {
+					case <-ctx.Done():
+						return
+					case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+						s.errors++
+					}
+					return
+				}
@@
-				if err != nil {
-					results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-					s.errors++
-					return
-				}
+				if err != nil {
+					select {
+					case <-ctx.Done():
+						return
+					case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+						s.errors++
+					}
+					return
+				}

This aligns error handling with the PR’s goal of being fully cancellation-aware in all channel writes and counters.

pkg/subscraping/sources/intelx/intelx.go (1)

81-86: Guard error sends with ctx.Done() as well to avoid blocking on cancellation

All error paths still do a plain results <- subscraping.Result{...}. If the caller stops reading from results after canceling ctx, these sends can block indefinitely, leaking the goroutine and, in some branches, delaying session.DiscardHTTPResponse(resp).

To fully meet the “no blocked sends after cancellation” objective, consider guarding these error emissions with the same pattern you used for subdomain results, e.g.:

- body, err := json.Marshal(reqBody)
- if err != nil {
-     results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-     s.errors++
-     return
- }
+ body, err := json.Marshal(reqBody)
+ if err != nil {
+     select {
+     case <-ctx.Done():
+         return
+     case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+         s.errors++
+     }
+     return
+ }

Similarly, for the branches that have a resp to discard, you can discard first, then use the same select pattern so response cleanup is not skipped when the context is canceled.

Also applies to: 88-94, 96-103, 115-121, 123-129, 131-137

pkg/subscraping/sources/dnsdumpster/dnsdumpster.go (1)

49-65: Also guard error-path sends with ctx.Done() to avoid blocking after cancellation

In both error branches you still send directly to results:

if err != nil {
    results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
    s.errors++
    session.DiscardHTTPResponse(resp)
    return
}
...
if err != nil {
    results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
    s.errors++
    session.DiscardHTTPResponse(resp)
    return
}

If the caller cancels the context and stops reading from the aggregated results channel, these plain sends can still block indefinitely and leak this goroutine, which undercuts the PR objective of making all emissions cancellation-aware.

You can mirror the pattern used in the records loop and only increment s.errors on a successful send, while respecting ctx and still cleaning up the HTTP response:

-   if err != nil {
-       results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-       s.errors++
-       session.DiscardHTTPResponse(resp)
-       return
-   }
+   if err != nil {
+       select {
+       case <-ctx.Done():
+           session.DiscardHTTPResponse(resp)
+           return
+       case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+           s.errors++
+       }
+       session.DiscardHTTPResponse(resp)
+       return
+   }
@@
-   err = json.NewDecoder(resp.Body).Decode(&response)
-   if err != nil {
-       results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-       s.errors++
-       session.DiscardHTTPResponse(resp)
-       return
-   }
+   err = json.NewDecoder(resp.Body).Decode(&response)
+   if err != nil {
+       select {
+       case <-ctx.Done():
+           // `defer session.DiscardHTTPResponse(resp)` will still run
+           return
+       case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+           s.errors++
+       }
+       return
+   }

This way all result emissions from this source are ctx-aware, and you avoid potential goroutine leaks on error paths.

pkg/subscraping/sources/riddler/riddler.go (1)

32-37: Error path still does a blocking send and can ignore context cancellation

On HTTP error you still do an unconditional:

results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
s.errors++
session.DiscardHTTPResponse(resp)
return

If ctx has been cancelled and upstream has stopped reading from results, this send can block indefinitely, which is exactly what this PR is trying to prevent.

You should make this path context-aware as well, e.g.:

-    resp, err := session.SimpleGet(ctx, fmt.Sprintf("https://riddler.io/search?q=pld:%s&view_type=data_table", domain))
-    if err != nil {
-        results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-        s.errors++
-        session.DiscardHTTPResponse(resp)
-        return
-    }
+    resp, err := session.SimpleGet(ctx, fmt.Sprintf("https://riddler.io/search?q=pld:%s&view_type=data_table", domain))
+    if err != nil {
+        defer session.DiscardHTTPResponse(resp)
+        select {
+        case <-ctx.Done():
+            // Context cancelled; just exit without emitting an error result.
+            return
+        case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+            s.errors++
+        }
+        return
+    }

This keeps the error emission consistent with the new context-aware send logic and avoids a potential goroutine leak when cancellation races with an HTTP error.

pkg/subscraping/sources/gitlab/gitlab.go (2)

36-60: Fix race condition: early returns bypass wg.Wait(), causing send-on-closed-channel panic

The returns at lines 127 and 135 exit enumerate() immediately when ctx.Done() fires or on QueryUnescape error, but do so before reaching wg.Wait() at line 142. This leaves spawned goroutines (lines 91–121) still sending to results after Run() defers close(results) at line 44, causing a panic.

Change the early returns to break the pagination loop and defer the return until after wg.Wait():

 	linksHeader := linkheader.Parse(resp.Header.Get("Link"))
+	cancelled := false
 	for _, link := range linksHeader {
 		select {
 		case <-ctx.Done():
-			return
+			cancelled = true
 		default:
 		}
+		// A bare break inside a select case only exits the select,
+		// so the flag is checked after it to leave the loop.
+		if cancelled {
+			break
+		}
 		if link.Rel == "next" {
 			nextURL, err := url.QueryUnescape(link.URL)
 			if err != nil {
 				results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
 				s.errors++
-				return
+				cancelled = true
+				break
 			}
 
 			s.enumerate(ctx, nextURL, domainRegexp, headers, session, results)
 		}
 	}
 
 	wg.Wait()
+	if cancelled {
+		return
+	}

This ensures all goroutines complete before enumerate() returns, preventing sends to a closed channel.


87-121: Move defer wg.Done() to the top of the goroutine to ensure it runs on all exit paths

Inside the per-item goroutine, defer wg.Done() is placed at line 121, at the very end of the function body. The error handling path at line 102 returns early before reaching the defer, skipping wg.Done() and leaving the WaitGroup counter > 0. When wg.Wait() is called at line 142, it will block indefinitely.

Move defer wg.Done() to the top of the goroutine immediately after the function declaration:

 	for _, it := range items {
 		go func(item item) {
+			defer wg.Done()
+
 			// The original item.Path causes 404 error because the Gitlab API is expecting the url encoded path
 			fileUrl := fmt.Sprintf("https://gitlab.com/api/v4/projects/%d/repository/files/%s/raw?ref=%s", item.ProjectId, url.QueryEscape(item.Path), item.Ref)
 			resp, err := session.Get(ctx, fileUrl, "", headers)
 			if err != nil {
 				if resp == nil || (resp != nil && resp.StatusCode != http.StatusNotFound) {
 					session.DiscardHTTPResponse(resp)
 
 					results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
 					s.errors++
 					return
 				}
 			}
 
 			if resp.StatusCode == http.StatusOK {
 				scanner := bufio.NewScanner(resp.Body)
 				for scanner.Scan() {
 					line := scanner.Text()
 					if line == "" {
 						continue
 					}
 					for _, subdomain := range domainRegexp.FindAllString(line, -1) {
 						results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}
 						s.results++
 					}
 				}
 				session.DiscardHTTPResponse(resp)
 			}
-			defer wg.Done()
 		}(it)
 	}
pkg/subscraping/sources/threatcrowd/threatcrowd.go (1)

41-79: Guard all error-path sends with ctx to prevent goroutine leaks on context cancellation

The unconditional sends at lines 43, 50, 57–60, 63, 70, and 76 can block indefinitely if the caller cancels ctx and stops reading from the results channel. This leaves goroutines stuck on channel sends, preventing sync.WaitGroup.Wait() from completing in the manager and causing goroutine leaks.

The select pattern with ctx.Done() is already used in commoncrawl.go and should be applied consistently here to all six error-path sends:

@@
-        req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
-        if err != nil {
-            results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-            s.errors++
-            return
-        }
+        req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
+        if err != nil {
+            select {
+            case <-ctx.Done():
+                return
+            case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+                s.errors++
+            }
+            return
+        }
@@
-        resp, err := session.Client.Do(req)
-        if err != nil {
-            results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-            s.errors++
-            return
-        }
+        resp, err := session.Client.Do(req)
+        if err != nil {
+            select {
+            case <-ctx.Done():
+                return
+            case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+                s.errors++
+            }
+            return
+        }
@@
-        defer func() {
-            if err := resp.Body.Close(); err != nil {
-                results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-                s.errors++
-            }
-        }()
+        defer func() {
+            if err := resp.Body.Close(); err != nil {
+                select {
+                case <-ctx.Done():
+                    return
+                case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+                    s.errors++
+                }
+            }
+        }()
@@
-        if resp.StatusCode != http.StatusOK {
-            results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: fmt.Errorf("unexpected status code: %d", resp.StatusCode)}
-            s.errors++
-            return
-        }
+        if resp.StatusCode != http.StatusOK {
+            select {
+            case <-ctx.Done():
+                return
+            case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: fmt.Errorf("unexpected status code: %d", resp.StatusCode)}:
+                s.errors++
+            }
+            return
+        }
@@
-        body, err := io.ReadAll(resp.Body)
-        if err != nil {
-            results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-            s.errors++
-            return
-        }
+        body, err := io.ReadAll(resp.Body)
+        if err != nil {
+            select {
+            case <-ctx.Done():
+                return
+            case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+                s.errors++
+            }
+            return
+        }
@@
-        if err := json.Unmarshal(body, &tcResponse); err != nil {
-            results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-            s.errors++
-            return
-        }
+        if err := json.Unmarshal(body, &tcResponse); err != nil {
+            select {
+            case <-ctx.Done():
+                return
+            case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+                s.errors++
+            }
+            return
+        }

A test that cancels ctx during HTTP failures or unmarshalling would validate this behavior.

♻️ Duplicate comments (1)
pkg/subscraping/sources/quake/quake.go (1)

105-109: Context check added correctly.

The non-blocking context check before processing each item is appropriate and follows the PR objectives for adding ctx.Done() checks in processing loops.

Note: The need to wrap the channel send on line 114 with a select statement was already flagged in the previous comment.

🧹 Nitpick comments (15)
pkg/subscraping/sources/reconcloud/reconcloud.go (1)

64-71: Context-aware result emission is correct; consider applying the same pattern to error sends

The new select around per-asset result emission cleanly respects ctx.Done() and avoids blocked sends, while only incrementing s.results on successful delivery. This aligns well with the PR’s cancellation goals.

For full consistency with “don’t block on send after cancellation”, you might also want to wrap the two error sends (Lines 47–50 and 56–59) in a similar select { case <-ctx.Done(): ... case results <- ... } pattern. That would ensure this goroutine never blocks on an error send if the upstream has already stopped consuming due to context cancellation.

pkg/subscraping/sources/leakix/leakix.go (1)

44-65: Consider making error result sends context-aware as well

The success-path sends are now guarded by a select on ctx.Done(), but the error paths still do direct sends on results (Lines 44-47, 52-55, 60-64). If the upstream consumer ever stops reading results when the context is cancelled, these sends could block and keep the goroutine alive.

To make this source fully consistent with the new pattern and the PR objective, consider wrapping error sends in a similar select:

-		if err != nil {
-			results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-			s.errors++
-			return
-		}
+		if err != nil {
+			select {
+			case <-ctx.Done():
+				return
+			case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+				s.errors++
+			}
+			return
+		}

(and likewise for the non‑200 status and JSON decode error paths).

pkg/subscraping/sources/netlas/netlas.go (1)

160-166: Context-aware result emission looks correct; consider mirroring for error sends

The new select on ctx.Done() vs results <- cleanly handles cancellation: the goroutine exits promptly on ctx.Done(), defers still run (so results gets closed and timeTaken recorded), and s.results is only incremented on a successful send. This is aligned with the PR’s cancellation goals.

As a follow-up, you might also wrap the earlier error-result sends in this function in a similar select { case <-ctx.Done(): return; case results <- ... } pattern, so they don’t block if the caller has already canceled and stopped reading from the channel.

pkg/subscraping/sources/onyphe/onyphe.go (1)

78-82: Consider wrapping error sends in select statements for consistency.

For complete context cancellation support, the error result sends (lines 79 and 88) should also be wrapped in select statements with ctx.Done() checks. While these are in error paths that return immediately, they could still block if the context is canceled and the receiver has stopped consuming.

Apply this pattern:

 			if err != nil {
-				results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-				s.errors++
+				select {
+				case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+					s.errors++
+				case <-ctx.Done():
+				}
 				session.DiscardHTTPResponse(resp)
 				return
 			}

Also applies to: 86-91

pkg/subscraping/sources/digitorus/digitorus.go (1)

63-69: Result emission is now cancellation‑safe; consider doing the same for error sends

Wrapping subdomain emission in:

  • a select between ctx.Done() and results <- ..., and
  • incrementing s.results only in the successful send branch

is exactly what you want for cancellation‑safe, non‑blocking sends.

To fully match the PR goal (“use ctx.Done() when sending results into channels”), consider applying the same select { case <-ctx.Done(): return ... } pattern to the error sends at Line 38 and Line 46. That would make all emissions from this source safe if the consumer stops reading after cancellation, not just the hot subdomain loop.

pkg/subscraping/sources/waybackarchive/waybackarchive.go (1)

36-36: Consider wrapping error send in select for consistency.

While the caller is typically still listening at this early stage, wrapping this error send in a select with ctx.Done() would provide complete consistency with the pattern applied elsewhere in this PR.

Apply this diff for consistency:

-		results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-		s.errors++
+		select {
+		case <-ctx.Done():
+			return
+		case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+			s.errors++
+		}
pkg/subscraping/sources/robtex/robtext.go (1)

60-60: Consider aligning error counter semantics with results counter.

The error counter is incremented unconditionally, while the results counter (line 82) is only incremented upon successful channel send. For consistency, consider incrementing the error counter only after successfully sending the error result to the channel (which would be inside the select statement if the critical issue above is addressed).

Also applies to: 74-74

pkg/subscraping/sources/digitalyama/digitalyama.go (1)

53-53: Consider wrapping error sends in select statements for consistency.

While the result sends (lines 97-102) correctly use select with ctx.Done() to prevent blocking, the error sends throughout the file do not follow the same pattern. If the context is cancelled and the receiver stops reading from the channel, these error sends could block indefinitely, potentially causing goroutine leaks.

For consistency with the context-aware result sending pattern and to fully align with the PR objectives, consider wrapping error sends in select statements as well.

Example pattern for error sends:

-			results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-			s.errors++
+			select {
+			case <-ctx.Done():
+				return
+			case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+				s.errors++
+			}

This pattern should be applied to all error sends (lines 53, 59, 74, 80, 82, 91) to ensure uniform context-cancellation behavior across all channel operations.

Also applies to: 59-59, 74-74, 80-80, 82-82, 91-91

pkg/subscraping/sources/chinaz/chinaz.go (1)

58-69: Context-aware loop/send looks correct; pre-loop select can be simplified.

The new pattern in the loop correctly:

  • Exits promptly on ctx.Done().
  • Guards the send with a select, so the goroutine won’t stay blocked after cancellation.
  • Increments s.results only on successful sends.

The select at Lines 58-62 is functionally redundant with the later select at Lines 64-69 (which already handles ctx.Done() before/while sending). For readability and a tiny simplification, you could either:

  • Drop the first select entirely and rely on the send select, or
  • Replace it with a cheaper if ctx.Err() != nil { return } pre-check if you want an explicit early-exit before doing jsoniter.Get.

This is non-blocking for merge, just a minor cleanup.

pkg/subscraping/sources/pugrecon/pugrecon.go (1)

108-114: Context-aware result send looks good; consider aligning error sends with the same pattern

The new select around the subdomain emission correctly:

  • Stops the goroutine promptly on ctx.Done().
  • Avoids blocking on results <- ... when the consumer stops reading.
  • Increments s.results only on successful sends.

To fully align this source with the PR’s cancellation goal and avoid any chance of blocking on error paths if the consumer stops reading after cancellation, consider wrapping the earlier error sends (results <- subscraping.Result{Type: subscraping.Error, ...} at lines similar to 60, 76, 83, 95, 103) in the same select pattern and moving the s.errors++ into the successful send branch.

For example, the first error case could be updated like this:

-    if err != nil {
-        results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: fmt.Errorf("failed to marshal request body: %w", err)}
-        s.errors++
-        return
-    }
+    if err != nil {
+        select {
+        case <-ctx.Done():
+            return
+        case results <- subscraping.Result{
+            Source: s.Name(),
+            Type:   subscraping.Error,
+            Error:  fmt.Errorf("failed to marshal request body: %w", err),
+        }:
+            s.errors++
+        }
+        return
+    }

You could apply the same pattern to the other error-emission sites in this function for consistency and to guarantee non-blocking behavior under cancellation.

pkg/subscraping/sources/virustotal/virustotal.go (1)

54-58: Context check in pagination loop is correct; consider simpler idiomatic form

The select here does the right thing and stops pagination promptly when ctx is cancelled. If you want to shave some noise, this can be expressed a bit more idiomatically as a direct error check:

-        for {
-            select {
-            case <-ctx.Done():
-                return
-            default:
-            }
+        for {
+            if err := ctx.Err(); err != nil {
+                return
+            }

Behavior is the same, just a bit easier to scan.

pkg/subscraping/sources/c99/c99.go (1)

59-79: Consider making error-result sends context-aware as well

The success path now respects ctx.Done(), but the error paths still perform unconditional sends to results:

  • Deferred resp.Body.Close() error (lines 59-63).
  • JSON decode error (lines 68-71).
  • API-level error in response.Error (lines 74-79).

If the context is cancelled and upstream stops reading from results, these sends can still block and keep the goroutine alive, partially defeating the goal of prompt cancellation.

You can align these with the new pattern by guarding them with ctx.Done() as well. For example:

-		defer func() {
-			if err := resp.Body.Close(); err != nil {
-				results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-				s.errors++
-			}
-		}()
+		defer func() {
+			if err := resp.Body.Close(); err != nil {
+				select {
+				case <-ctx.Done():
+					return
+				case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+					s.errors++
+				}
+			}
+		}()

 		var response dnsdbLookupResponse
 		err = jsoniter.NewDecoder(resp.Body).Decode(&response)
 		if err != nil {
-			results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-			s.errors++
-			return
+			select {
+			case <-ctx.Done():
+				return
+			case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+				s.errors++
+			}
+			return
 		}

 		if response.Error != "" {
-			results <- subscraping.Result{
-				Source: s.Name(), Type: subscraping.Error, Error: fmt.Errorf("%v", response.Error),
-			}
-			s.errors++
-			return
+			select {
+			case <-ctx.Done():
+				return
+			case results <- subscraping.Result{
+				Source: s.Name(), Type: subscraping.Error, Error: fmt.Errorf("%v", response.Error),
+			}:
+				s.errors++
+			}
+			return
 		}

This keeps behavior the same under normal operation but prevents blocked sends after cancellation, bringing the error paths in line with the PR’s context-cancellation goal.

pkg/subscraping/sources/hackertarget/hackertarget.go (1)

49-53: Context check before processing each scanned line looks good

The non-blocking select on ctx.Done() before handling each line should let this goroutine exit quickly on cancellation while avoiding any extra blocking. That aligns well with the PR goal.

As a small optional improvement, you could also check scanner.Err() after the loop and emit an error result (guarded by ctx.Done() similarly), so read errors (including context-related ones) are surfaced instead of silently ignored.

pkg/subscraping/sources/riddler/riddler.go (1)

42-59: Context-aware scanning and result emission look correct; minor duplication only

The added select blocks in the scan loop and per-subdomain send correctly:

  • Stop processing promptly on ctx.Done().
  • Ensure session.DiscardHTTPResponse(resp) is called before returning.
  • Increment s.results only when the send on results actually succeeds.

This matches the PR goal for cooperative cancellation. The only minor nit is that the ctx.Done() + DiscardHTTPResponse + return pattern is duplicated twice; if this shows up in many sources, consider a small helper to reduce repetition.

pkg/subscraping/sources/gitlab/gitlab.go (1)

69-83: Align result sends with ctx cancellation to avoid blocked sends

Per the PR objective, sources should avoid blocking or continuing to emit results after ctx cancellation. In this file, all sends to results are direct:

results <- subscraping.Result{...}

If downstream stops reading from results when ctx is cancelled, these sends can still block goroutines even after pagination stops.

To align with the rest of the PR, consider wrapping sends in a select that respects ctx.Done(); for example:

-	results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
+	select {
+	case <-ctx.Done():
+		return
+	case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+	}

and similarly for the subdomain emission:

-					for _, subdomain := range domainRegexp.FindAllString(line, -1) {
-						results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}
-						s.results++
-					}
+					for _, subdomain := range domainRegexp.FindAllString(line, -1) {
+						select {
+						case <-ctx.Done():
+							return
+						case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}:
+						}
+						s.results++
+					}

This keeps goroutines from getting stuck on channel sends once the caller cancels the context, matching the intended behavior described in the linked issue. Based on PR objectives, this makes the GitLab source consistent with other updated sources.

Also applies to: 95-115

Comment on lines +86 to +90
select {
case <-ctx.Done():
	return
default:
}

⚠️ Potential issue | 🟠 Major

Context check is incomplete; result emission also needs protection.

While checking for cancellation before processing each subdomain is a good step, the subsequent result emission on line 98 is still a direct channel send that can block if the receiver has stopped due to cancellation.

Per the PR objectives and the pattern adopted by other sources, result emission should also be wrapped in a context-aware select.

🤖 Prompt for AI Agents
In pkg/subscraping/sources/fofa/fofa.go around lines 86 to 90, the code checks
ctx.Done() before processing each subdomain but still performs an unconditional
send to the result channel later (around line 98), which can block if the
receiver has stopped; change that send to be context-aware by replacing the
direct channel send with a select that either sends the result on the result
channel or returns/continues when ctx.Done() fires, ensuring you handle
cancellation without blocking (mirror the pattern used elsewhere in other source
implementations).

Comment on lines 192 to 200
 				for _, subdomain := range domainRegexp.FindAllString(normalizeContent(line), -1) {
-					results <- subscraping.Result{Source: name, Type: subscraping.Subdomain, Value: subdomain}
-					s.results++
+					select {
+					case <-ctx.Done():
+						session.DiscardHTTPResponse(resp)
+						return
+					case results <- subscraping.Result{Source: name, Type: subscraping.Subdomain, Value: subdomain}:
+						s.results++
+					}
 				}

⚠️ Potential issue | 🔴 Critical

Data race on s.results increment.

Multiple goroutines spawned at line 161 concurrently increment s.results without synchronization. This applies to both line 198 and line 216.

Options to fix:

  1. Use atomic.AddInt32 for s.results (change field type to int32)
  2. Use a mutex to protect the increment
  3. Send increments through a channel and aggregate in the parent
+import "sync/atomic"

 type Source struct {
 	apiKeys   []string
 	timeTaken time.Duration
 	errors    int
-	results   int
+	results   int32
 	skipped   bool
 }

Then update the increments:

 case results <- subscraping.Result{Source: name, Type: subscraping.Subdomain, Value: subdomain}:
-	s.results++
+	atomic.AddInt32(&s.results, 1)

And update Statistics() to use atomic.LoadInt32(&s.results).

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In pkg/subscraping/sources/github/github.go around lines 192–200 (and the other
increment at ~216), s.results is being incremented concurrently causing a data
race; change s.results type to int32, import sync/atomic, replace all direct
increments (s.results++ at lines ~198 and ~216) with atomic.AddInt32(&s.results,
1), and update Statistics() to return atomic.LoadInt32(&s.results) (or convert
the loaded int32 to the expected return type). Ensure the new import is added
and all other accesses to s.results use atomic operations.

Comment on lines +60 to +65
select {
case <-ctx.Done():
	return
case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}:
	s.results++
}

⚠️ Potential issue | 🟠 Major

Result send is now ctx-aware, but error/close sends can still block after cancellation

The select around the subdomain result send correctly prevents this goroutine from blocking indefinitely if ctx is canceled and the caller stops reading from results, and only increments s.results on a successful send. That part looks solid.

However, the error sends at Line 34 (SimpleGet error) and Line 42 (resp.Body.Close() error) still write to results unconditionally. If ctx is already canceled and upstream has stopped consuming from this channel, those sends can block and keep the goroutine alive, which undermines the “stop promptly on cancellation” objective.

Consider wrapping those error sends in the same select pattern you use here, for example:

select {
case <-ctx.Done():
    return
case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
    s.errors++
}

This would make all sends from this source respect context cancellation consistently, including error paths.

🤖 Prompt for AI Agents
In pkg/subscraping/sources/hackertarget/hackertarget.go around lines 60 to 65,
the result send uses a ctx-aware select but the earlier error sends (around
lines ~34 and ~42) still write to results unconditionally and can block after
cancellation; change those unconditional error sends to the same ctx-aware
select pattern (case <-ctx.Done(): return; case results <-
subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
s.errors++), and apply the same pattern for any resp.Body.Close() error send so
error sends respect context cancellation and only increment counters on a
successful non-blocking send.

Comment on lines +63 to +68
select {
case <-ctx.Done():
	return
case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}:
	s.results++
}

⚠️ Potential issue | 🟠 Major

Context‑aware send for subdomain results is good; align error paths with the same pattern

The new select around the subdomain result send correctly respects ctx.Done() and only increments s.results on successful sends. This matches the PR goal of making result emission cancellation‑aware. 👍
However, the two error paths above still do a bare send to results:

  • Line 45: results <- subscraping.Result{... Type: subscraping.Error ...}
  • Line 55: results <- subscraping.Result{... Type: subscraping.Error ...}

If the caller cancels the context and stops draining the results channel, these sends can still block indefinitely, leaking this goroutine and partially defeating the context‑cancellation objective.

I’d recommend using the same select pattern for the error cases and only incrementing s.errors when the send succeeds, e.g.:

@@
-    if err != nil {
-        results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-        s.errors++
-        session.DiscardHTTPResponse(resp)
-        return
-    }
+    if err != nil {
+        select {
+        case <-ctx.Done():
+            session.DiscardHTTPResponse(resp)
+            return
+        case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+            s.errors++
+        }
+        session.DiscardHTTPResponse(resp)
+        return
+    }
@@
-    if err != nil {
-        results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-        s.errors++
-        session.DiscardHTTPResponse(resp)
-        return
-    }
+    if err != nil {
+        select {
+        case <-ctx.Done():
+            session.DiscardHTTPResponse(resp)
+            return
+        case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+            s.errors++
+        }
+        session.DiscardHTTPResponse(resp)
+        return
+    }

If this PR introduces a shared helper for cancellation‑aware sends elsewhere, you could also delegate to that here to keep all sources consistent.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In pkg/subscraping/sources/hudsonrock/hudsonrock.go around lines 63-68, the
select used for sending subdomain results is cancellation-aware but the earlier
error sends at lines ~45 and ~55 still do bare sends that can block if the
caller cancels; change those two bare sends to use the same select pattern
(select on ctx.Done() and on sending the subscraping.Result to results) and only
increment s.errors when the send case succeeds; if you have a shared
cancellation-aware send helper, call that instead to keep behavior consistent
across sources.

Comment on lines +66 to +70
select {
case <-ctx.Done():
return
default:
}

⚠️ Potential issue | 🟠 Major

Good addition, but the implementation is incomplete.

The context check correctly prevents starting a new pagination iteration after cancellation. However, the PR objectives explicitly state: "Use select with ctx.Done() when sending results into channels to avoid blocked sends after cancellation."

All channel sends in this function (lines 76, 85, 93-95, 114) still use direct send operations, which can block indefinitely if the receiver stops consuming due to context cancellation.

Apply this pattern to all channel sends:

-			results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-			s.errors++
+			select {
+			case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+				s.errors++
+			case <-ctx.Done():
+				return
+			}

Similarly for lines 85, 93-95, and 114-115. For line 114-115, ensure the counter is only incremented when the send succeeds:

-			results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}
-			s.results++
+			select {
+			case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}:
+				s.results++
+			case <-ctx.Done():
+				return
+			}

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +73 to +78
select {
case <-ctx.Done():
return
case results <- subscraping.Result{Source: "sitedossier", Type: subscraping.Subdomain, Value: subdomain}:
s.results++
}

⚠️ Potential issue | 🟠 Major

Context-aware result send looks good, but error paths still ignore ctx cancellation

The new select around the subdomain send correctly avoids blocking when ctx is canceled and the consumer stops reading. However, the two error sends above (around Line 56 and Line 64) still use a plain results <- ..., so they can still block indefinitely after cancellation and leak this goroutine, which goes against the PR’s objective.

I’d mirror the same pattern for error emissions, e.g.:

@@
-	if err != nil && !isnotfound {
-		results <- subscraping.Result{Source: "sitedossier", Type: subscraping.Error, Error: err}
-		s.errors++
-		session.DiscardHTTPResponse(resp)
-		return
-	}
+	if err != nil && !isnotfound {
+		select {
+		case <-ctx.Done():
+			session.DiscardHTTPResponse(resp)
+			return
+		case results <- subscraping.Result{Source: "sitedossier", Type: subscraping.Error, Error: err}:
+			s.errors++
+		}
+		session.DiscardHTTPResponse(resp)
+		return
+	}
@@
-	body, err := io.ReadAll(resp.Body)
-	if err != nil {
-		results <- subscraping.Result{Source: "sitedossier", Type: subscraping.Error, Error: err}
-		s.errors++
-		session.DiscardHTTPResponse(resp)
-		return
-	}
+	body, err := io.ReadAll(resp.Body)
+	if err != nil {
+		select {
+		case <-ctx.Done():
+			session.DiscardHTTPResponse(resp)
+			return
+		case results <- subscraping.Result{Source: "sitedossier", Type: subscraping.Error, Error: err}:
+			s.errors++
+		}
+		session.DiscardHTTPResponse(resp)
+		return
+	}

(You could also factor this into a small helper to avoid repetition if this pattern appears in multiple sources.)

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In pkg/subscraping/sources/sitedossier/sitedossier.go around lines 56, 64 and
73-78, the error sends use a blocking results <- ... which can hang after ctx
cancellation; replace those plain sends with the same context-aware select used
for subdomain sends (select { case <-ctx.Done(): return; case results <-
subscraping.Result{...}: s.errors++ }) so errors won't block when the consumer
stops, and optionally factor the pattern into a small helper like
sendResult(ctx, results, result) to avoid repetition.

Comment on lines 85 to +91
for _, subdomain := range data.Data {
results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain.Id}
s.results++
select {
case <-ctx.Done():
return
case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain.Id}:
s.results++
}

⚠️ Potential issue | 🟠 Major

Make all sends to results context‑aware, not just subdomain emissions

The new select around subdomain emission is good: it avoids blocking on results when ctx is cancelled and only increments s.results on successful sends.

However, other sends in this function are still unconditional:

  • HTTP error path (Lines 63-68)
  • resp.Body.Close() error in the defer (Lines 70-75)
  • JSON decode error (Lines 77-82)

If the caller cancels ctx and stops draining results, these sends can still block indefinitely, defeating the goal of context cancellation and potentially leaking this goroutine.

You can apply the same pattern you used for subdomain results to the error paths, for example:

 resp, err := session.Get(ctx, url, "", map[string]string{"x-apikey": randomApiKey})
 if err != nil {
-    results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-    s.errors++
-    session.DiscardHTTPResponse(resp)
-    return
+    select {
+    case <-ctx.Done():
+        // context cancelled; just clean up and exit
+    case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+        s.errors++
+    }
+    session.DiscardHTTPResponse(resp)
+    return
 }
 defer func() {
     if err := resp.Body.Close(); err != nil {
-        results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-        s.errors++
+        select {
+        case <-ctx.Done():
+            return
+        case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+            s.errors++
+        }
     }
 }()
@@
 var data response
 err = jsoniter.NewDecoder(resp.Body).Decode(&data)
 if err != nil {
-    results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}
-    s.errors++
-    return
+    select {
+    case <-ctx.Done():
+        return
+    case results <- subscraping.Result{Source: s.Name(), Type: subscraping.Error, Error: err}:
+        s.errors++
+    }
+    return
 }

This keeps behavior the same when ctx is active while ensuring the source fully respects cancellation and never blocks on results after ctx.Done().

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In pkg/subscraping/sources/virustotal/virustotal.go around lines 63-91, several
sends to the results channel (HTTP error path lines 63-68, resp.Body.Close()
error in the defer lines 70-75, JSON decode error lines 77-82) are unconditional
and can block if the caller cancels ctx; make them context-aware like the
subdomain send at lines 85-91 by replacing each direct send with a select that
first checks <-ctx.Done() (and returns / skips sending) and otherwise sends to
results, and only increment the matching counter (s.errors for error sends,
s.results for subdomain sends) in the case branch where the send succeeds; apply
the same pattern inside the defer (use a select that avoids blocking if ctx is
done) so no send can block after cancellation.

Member

@tarunKoyalwar tarunKoyalwar left a comment

lgtm !

@ehsandeep ehsandeep merged commit deccd62 into dev Dec 4, 2025
10 checks passed
@ehsandeep ehsandeep deleted the fix/context-cancellation-sources branch December 4, 2025 17:34

Development

Successfully merging this pull request may close these issues.

Add context cancellation support to all enumeration sources

3 participants