fix(core): Add retries with exponential backoff for transient S3 read failures before any data is read.#2203
Conversation
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/core/src/clp_s/JsonParser.cpp`:
- Around line 649-650: The retry loop uses constexpr size_t cMaxRetries{3}
together with the condition attempt <= cMaxRetries, causing 4 attempts (attempts
0..3); either change the loop condition to stop at the intended total attempts
(e.g., use attempt < cMaxRetries for exactly 3 total attempts) or
rename/repurpose the constant to cMaxAttempts and document that it includes the
initial try (and keep the <= condition). Update all occurrences (including the
similar block referenced around lines 655–666) to use the chosen convention
consistently and adjust the constant name or loop condition accordingly so the
semantics match the identifier.
- Around line 668-672: try_create_reader(path, m_network_auth) can fail
transiently but the current code returns false immediately; change
JsonParser.cpp so that when try_create_reader returns nullptr you inspect the
underlying failure (use the same retry/ backoff policy used elsewhere in this
component) and retry reader creation a few times before giving up, only
returning false if the error is determined non-retryable; ensure you still call
m_archive_writer->close() on final failure and reference try_create_reader,
reader, path and m_network_auth when locating the logic to modify (or if you
intentionally decide not to retry, add a brief comment explaining why reader
creation is not retryable).
In `@components/core/src/clp_s/log_converter/log_converter.cpp`:
- Around line 23-24: Both JsonParser::ingest() and convert_files() duplicate
retry constants (cMaxRetries, cInitialBackoff) and loop structure; extract the
constants into a shared header (e.g., Utils.hpp) and factor the retry loop into
a reusable helper (e.g., retry_with_backoff or RetryPolicy) that takes a
callable, max retries and initial backoff so both JsonParser::ingest() and
convert_files() call the same helper; update references to use the shared
constants and helper to remove the duplicated loop/constant definitions and keep
behavior identical.
- Around line 62-68: The reader creation block using clp_s::try_create_reader
currently returns false immediately when reader == nullptr; implement retry
logic similar to JsonParser::ingest so transient network failures are retried:
call try_create_reader(path, command_line_arguments.get_network_auth()) in a
short-loop with a limited number of attempts and exponential/backoff delays,
logging each attempt (use SPDLOG_ERROR/INFO) and only return false after
exhausting retries; ensure the variable reader is rechecked after each attempt
and preserve existing error context in logs.
- Around line 80-91: The code currently calls
clp_s::NetworkUtils::check_and_log_curl_error twice when a retryable CURL error
occurs and attempt >= cMaxRetries; modify the control flow in the block around
clp_s::NetworkUtils::is_retryable_curl_error(reader.get()) so that
check_and_log_curl_error is only invoked once on the terminal failure path: if
is_retryable returns true and attempt < cMaxRetries, perform the retry
(continue) after optionally logging for retry; otherwise (final attempt or
non-retryable), call check_and_log_curl_error(path.path, reader.get()) exactly
once before emitting the SPDLOG_ERROR and returning false. Ensure you reference
the existing symbols attempt, cMaxRetries,
NetworkUtils::is_retryable_curl_error, and
NetworkUtils::check_and_log_curl_error when making the change.
In `@components/core/src/clp_s/Utils.cpp`:
- Around line 186-194: The current retry check treats CURLE_HTTP_RETURNED_ERROR
(from curl_error_info->code()) as always retryable but that includes 4xx client
errors; update the logic in Utils.cpp (the function that inspects
curl_error_info->code()) to distinguish HTTP 5xx from 4xx by calling
curl_easy_getinfo on the CURL handle to fetch CURLINFO_RESPONSE_CODE and only
treat CURLE_HTTP_RETURNED_ERROR as retryable when the response_code is in the
500–599 range; keep the existing checks for CURLE_SSL_CONNECT_ERROR,
CURLE_GOT_NOTHING, CURLE_RECV_ERROR, and CURLE_OPERATION_TIMEDOUT unchanged.
Ensure you access the CURL handle associated with the error (the same one used
to produce curl_error_info) to call curl_easy_getinfo and handle any failure to
get the response code conservatively (do not assume 5xx).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: cc080208-cb4a-4c30-b558-158d0a1fff4c
📥 Commits
Reviewing files that changed from the base of the PR and between 9ae078e and 42c09539afd8560d4d2f43e525e90cb5bb55f7e3.
📒 Files selected for processing (4)
components/core/src/clp_s/JsonParser.cppcomponents/core/src/clp_s/Utils.cppcomponents/core/src/clp_s/Utils.hppcomponents/core/src/clp_s/log_converter/log_converter.cpp
| constexpr size_t cMaxRetries{3}; | ||
| constexpr std::chrono::seconds cInitialBackoff{1}; |
There was a problem hiding this comment.
Retry loop executes 4 attempts, not 3 retries.
The loop condition attempt <= cMaxRetries with cMaxRetries{3} means the loop runs for attempt values 0, 1, 2, 3 — that's 4 total attempts (1 initial + 3 retries). If the intention is 3 retries after the initial attempt, this is correct. However, the constant name cMaxRetries suggests the maximum number of retries should be 3, making 4 total attempts.
If you want exactly 3 total attempts (1 initial + 2 retries), change to attempt < cMaxRetries. Otherwise, consider renaming to cMaxAttempts for clarity.
Also applies to: 655-666
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/core/src/clp_s/JsonParser.cpp` around lines 649 - 650, The retry
loop uses constexpr size_t cMaxRetries{3} together with the condition attempt <=
cMaxRetries, causing 4 attempts (attempts 0..3); either change the loop
condition to stop at the intended total attempts (e.g., use attempt <
cMaxRetries for exactly 3 total attempts) or rename/repurpose the constant to
cMaxAttempts and document that it includes the initial try (and keep the <=
condition). Update all occurrences (including the similar block referenced
around lines 655–666) to use the chosen convention consistently and adjust the
constant name or loop condition accordingly so the semantics match the
identifier.
| auto const code{curl_error_info->code()}; | ||
| // CURLE_HTTP_RETURNED_ERROR (22): HTTP 5xx errors (503, 500, etc.) | ||
| // CURLE_SSL_CONNECT_ERROR (35): SSL connection reset | ||
| // CURLE_GOT_NOTHING (52): Empty reply from server | ||
| // CURLE_RECV_ERROR (56): Failure in receiving network data | ||
| // CURLE_OPERATION_TIMEDOUT (28): Operation timed out | ||
| return CURLE_HTTP_RETURNED_ERROR == code || CURLE_SSL_CONNECT_ERROR == code | ||
| || CURLE_GOT_NOTHING == code || CURLE_RECV_ERROR == code | ||
| || CURLE_OPERATION_TIMEDOUT == code; |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Does CURLE_HTTP_RETURNED_ERROR trigger for HTTP 4xx errors or only 5xx?
💡 Result:
CURLE_HTTP_RETURNED_ERROR triggers for HTTP error codes >= 400 (both 4xx and 5xx), but only if CURLOPT_FAILONERROR is set to TRUE. By default, libcurl considers HTTP 4xx/5xx responses as successful transfers and does not return this error.
Citations:
- 1: https://curl.se/libcurl/c/libcurl-errors.html
- 2: http://curl.haxx.se/libcurl/c/libcurl-errors.html
- 3: https://manpages.ubuntu.com/manpages/jammy/man3/libcurl-errors.3.html
- 4: https://everything.curl.dev/libcurl-http/responses.html
🏁 Script executed:
# First, let's examine the Utils.cpp file around the function in question
cat -n components/core/src/clp_s/Utils.cpp | sed -n '170,210p'Repository: y-scope/clp
Length of output: 1713
🏁 Script executed:
# Search for CURLOPT_FAILONERROR usage in the codebase
rg "CURLOPT_FAILONERROR" --type cpp --type hpp -B 2 -A 2Repository: y-scope/clp
Length of output: 82
🏁 Script executed:
# Search for where is_retryable_curl_error is being used
rg "is_retryable_curl_error" --type cpp --type hpp -B 2 -A 2Repository: y-scope/clp
Length of output: 82
🏁 Script executed:
# Search for CURLOPT_FAILONERROR using correct flags
rg "CURLOPT_FAILONERROR" -t cpp -B 2 -A 2Repository: y-scope/clp
Length of output: 728
🏁 Script executed:
# Search for where is_retryable_curl_error is used
rg "is_retryable_curl_error" -t cpp -B 2 -A 2Repository: y-scope/clp
Length of output: 3067
🏁 Script executed:
# Find and examine NetworkReader implementation
fd -e cpp -e h | xargs grep -l "class NetworkReader" | head -5Repository: y-scope/clp
Length of output: 37
CURLE_HTTP_RETURNED_ERROR includes 4xx client errors, not just 5xx.
With CURLOPT_FAILONERROR enabled (which is explicitly set in CurlDownloadHandler.cpp), CURLE_HTTP_RETURNED_ERROR is triggered for any HTTP error code ≥ 400, including both client errors (4xx) and server errors (5xx). The comment incorrectly states "HTTP 5xx errors (503, 500, etc.)" but the code treats all HTTP errors ≥ 400 as retryable.
This is problematic because client errors like 401 (Unauthorized), 403 (Forbidden), and 404 (Not Found) are non-transient and will not succeed on retry. These indicate problems with the request or credentials, not temporary server issues.
To properly handle only server errors, check the actual HTTP response code using curl_easy_getinfo() with CURLINFO_RESPONSE_CODE and verify it falls in the 500–599 range before treating it as retryable.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/core/src/clp_s/Utils.cpp` around lines 186 - 194, The current
retry check treats CURLE_HTTP_RETURNED_ERROR (from curl_error_info->code()) as
always retryable but that includes 4xx client errors; update the logic in
Utils.cpp (the function that inspects curl_error_info->code()) to distinguish
HTTP 5xx from 4xx by calling curl_easy_getinfo on the CURL handle to fetch
CURLINFO_RESPONSE_CODE and only treat CURLE_HTTP_RETURNED_ERROR as retryable
when the response_code is in the 500–599 range; keep the existing checks for
CURLE_SSL_CONNECT_ERROR, CURLE_GOT_NOTHING, CURLE_RECV_ERROR, and
CURLE_OPERATION_TIMEDOUT unchanged. Ensure you access the CURL handle associated
with the error (the same one used to produce curl_error_info) to call
curl_easy_getinfo and handle any failure to get the response code conservatively
(do not assume 5xx).
gibber9809
left a comment
There was a problem hiding this comment.
I can see this PR is aiming to handle two different failure scenarios for both clp-s and log-converter:
- We hit a retryable network error before trying to ingest any data from some input
- We hit a retryable network error after ingesting some data from some input
I think the first scenario is handled properly here for both clp-s and log-converter, but the second scenario will still have some issues. Specifically, in the log-converter path I think we'll end up with some logs being ingested more than once, and the clp-s path probably won't work since most of the code in JsonParser currently assumes that any read failure leads to overall compression failure, and the error handling paths currently leave JsonParser in a state where it can't continue ingestion.
Since handling this second scenario properly will require some more extensive changes, I think it might make sense to reduce our scope to just handling the first scenario in this PR. For that, we could probably create a new input utility like try_create_reader_and_deduce_input_type_with_retries in InputConfig that does the whole create reader -> deduce type -> check for network failure, back off, and retry loop. Then we could use that utility in both log_converter.cpp and JsonParser.cpp to keep the changes in those files minimal.
42c0953 to
90f43fb
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/core/src/clp_s/InputConfig.cpp`:
- Around line 494-518: The retry helper currently skips logging curl errors when
file_type == FileType::Unknown on the final attempt or for non-retryable curl
errors; update the control flow around try_create_reader/try_deduce_reader_type
so that when result yields FileType::Unknown and
NetworkUtils::check_and_log_curl_error(reader.get()) indicates a curl error you
always call check_and_log_curl_error (so non-retryable and final-attempt errors
get logged), but only use attempt < max_retries to decide whether to continue
the loop (i.e., keep NetworkUtils::is_retryable_curl_error(reader.get()) &&
attempt < max_retries for the continue branch); locate this logic around the
try_create_reader/try_deduce_reader_type result handling and adjust the branches
that reference reader, result, nested_readers, file_type, attempt, and
max_retries accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 18904f94-1802-44f0-9b86-6ba794b93496
📥 Commits
Reviewing files that changed from the base of the PR and between 42c09539afd8560d4d2f43e525e90cb5bb55f7e3 and 90f43fb.
📒 Files selected for processing (6)
components/core/src/clp_s/InputConfig.cppcomponents/core/src/clp_s/InputConfig.hppcomponents/core/src/clp_s/JsonParser.cppcomponents/core/src/clp_s/Utils.cppcomponents/core/src/clp_s/Utils.hppcomponents/core/src/clp_s/log_converter/log_converter.cpp
| auto reader{try_create_reader(path, network_auth)}; | ||
| if (nullptr == reader) { | ||
| if (attempt < max_retries) { | ||
| SPDLOG_WARN("Failed to create reader for {}, will retry.", path.path); | ||
| continue; | ||
| } | ||
| SPDLOG_ERROR("Failed to open input {} for reading.", path.path); | ||
| return {{}, FileType::Unknown}; | ||
| } | ||
|
|
||
| auto result = try_deduce_reader_type(reader); | ||
| auto const& [nested_readers, file_type] = result; | ||
|
|
||
| if (FileType::Unknown == file_type | ||
| && NetworkUtils::is_retryable_curl_error(reader.get()) | ||
| && attempt < max_retries) | ||
| { | ||
| NetworkUtils::check_and_log_curl_error(path.path, reader.get()); | ||
| continue; | ||
| } | ||
|
|
||
| return result; | ||
| } | ||
|
|
||
| return {{}, FileType::Unknown}; |
There was a problem hiding this comment.
Retryable curl error on final attempt is silently dropped; non-retryable curl error on type deduction is never logged.
Two related observability gaps in this retry helper:
-
When
attempt == max_retriesand the error is a retryable curl error, theif (... && attempt < max_retries)guard skipscheck_and_log_curl_errorand falls through toreturn result;. Sinceresultis{{}, FileType::Unknown}, downstream callers inJsonParser::ingest(lines 681–683) andconvert_files(lines 59–64) gate their own logging onfalse == nested_readers.empty()and therefore also skip it. The last-attempt curl code/message is never logged, which defeats the diagnostic goal of this PR. -
For a non-retryable curl error on type deduction (e.g.,
CURLE_COULDNT_RESOLVE_HOST), the helper bypasses the retry branch entirely and returns without callingcheck_and_log_curl_error; again the callers cannot log becausenested_readersis empty.
Consider unconditionally logging the curl error (when present) on any path that ends in FileType::Unknown, and reserve attempt < max_retries for the continue decision only.
🛠️ Suggested fix
- if (FileType::Unknown == file_type
- && NetworkUtils::is_retryable_curl_error(reader.get())
- && attempt < max_retries)
- {
- NetworkUtils::check_and_log_curl_error(path.path, reader.get());
- continue;
- }
-
- return result;
+ if (FileType::Unknown == file_type) {
+ auto const retryable{NetworkUtils::is_retryable_curl_error(reader.get())};
+ if (retryable && attempt < max_retries) {
+ NetworkUtils::check_and_log_curl_error(path.path, reader.get());
+ continue;
+ }
+ // Final attempt or non-retryable failure: log any curl error before giving up
+ // so the caller (which sees an empty nested_readers vector) isn't left in the dark.
+ NetworkUtils::check_and_log_curl_error(path.path, reader.get());
+ }
+
+ return result;🧰 Tools
🪛 Cppcheck (2.20.0)
[style] 508-508: The function 'wildcard_query_matches_any_encoded_var' is never used.
(unusedFunction)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/core/src/clp_s/InputConfig.cpp` around lines 494 - 518, The retry
helper currently skips logging curl errors when file_type == FileType::Unknown
on the final attempt or for non-retryable curl errors; update the control flow
around try_create_reader/try_deduce_reader_type so that when result yields
FileType::Unknown and NetworkUtils::check_and_log_curl_error(reader.get())
indicates a curl error you always call check_and_log_curl_error (so
non-retryable and final-attempt errors get logged), but only use attempt <
max_retries to decide whether to continue the loop (i.e., keep
NetworkUtils::is_retryable_curl_error(reader.get()) && attempt < max_retries for
the continue branch); locate this logic around the
try_create_reader/try_deduce_reader_type result handling and adjust the branches
that reference reader, result, nested_readers, file_type, attempt, and
max_retries accordingly.
gibber9809
left a comment
There was a problem hiding this comment.
LGTM! Thanks for the changes. Directly edited the PR title as well. Should be good to merge once the CI finishes.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
components/core/src/clp_s/log_converter/log_converter.cpp (1)
44-67: 🧹 Nitpick | 🔵 TrivialLGTM — refactor consolidates retry/deduce and matches the JsonParser pattern.
Consolidation into
try_create_reader_and_deduce_type_with_retriesremoves the duplicated retry loop and the prior double-logging issue, and thefalse == nested_readers.empty()guard beforecheck_and_log_curl_error(..., nested_readers.front().get())correctly mirrors the pattern inJsonParser::ingest()(guarding against the empty-vector return on reader-creation failure from the helper).One small note: unlike
JsonParser::ingest(), this path does not callclose_nested_readers(nested_readers)on the error return at line 66 (nor on the successfulbreakpath beforeconvert_fileat line 70). Sinceconvert_filesreturns immediately tomainon error, it's effectively benign here, but relying on shared-pointer destruction for cleanup is inconsistent with the sibling callsite — worth considering for symmetry.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/core/src/clp_s/log_converter/log_converter.cpp` around lines 44 - 67, The switch handling the result of try_create_reader_and_deduce_type_with_retries doesn't call clp_s::close_nested_readers(nested_readers) for symmetry with JsonParser::ingest: ensure you call clp_s::close_nested_readers(nested_readers) in the error/default branch just before returning false, and also call clp_s::close_nested_readers(nested_readers) on the LogText/EmptyFile path after the switch (before invoking convert_file) so nested readers are explicitly closed in both success and failure paths.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@components/core/src/clp_s/log_converter/log_converter.cpp`:
- Around line 44-67: The switch handling the result of
try_create_reader_and_deduce_type_with_retries doesn't call
clp_s::close_nested_readers(nested_readers) for symmetry with
JsonParser::ingest: ensure you call clp_s::close_nested_readers(nested_readers)
in the error/default branch just before returning false, and also call
clp_s::close_nested_readers(nested_readers) on the LogText/EmptyFile path after
the switch (before invoking convert_file) so nested readers are explicitly
closed in both success and failure paths.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0cb0e2e9-68bd-4b90-abda-1fa443226e34
📒 Files selected for processing (1)
components/core/src/clp_s/log_converter/log_converter.cpp
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/core/src/clp_s/InputConfig.cpp`:
- Around line 494-502: The current loop unconditionally retries when
try_create_reader(path, network_auth) returns nullptr; change this so
non-transient failures fail fast: after try_create_reader(...) returns nullptr,
detect whether the path is a network resource (e.g. check the path/url kind or
use the NetworkReader construction error indicator) and only apply the
exponential backoff/retry when the failure is a network/transient curl-type
error (use is_retryable_curl_error or the NetworkReader-specific error info);
for filesystem/missing-permission or other non-retryable reasons, log an
SPDLOG_ERROR with the path and return {{}, FileType::Unknown} immediately
instead of continuing through attempt/max_retries.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7c9ec2e5-495a-4a66-b57b-61890a317ffe
📒 Files selected for processing (4)
components/core/src/clp_s/InputConfig.cppcomponents/core/src/clp_s/Utils.cppcomponents/core/src/clp_s/Utils.hppcomponents/core/src/clp_s/log_converter/log_converter.cpp
| auto reader{try_create_reader(path, network_auth)}; | ||
| if (nullptr == reader) { | ||
| if (attempt < max_retries) { | ||
| SPDLOG_WARN("Failed to create reader for {}, will retry.", path.path); | ||
| continue; | ||
| } | ||
| SPDLOG_ERROR("Failed to open input {} for reading.", path.path); | ||
| return {{}, FileType::Unknown}; | ||
| } |
There was a problem hiding this comment.
Unconditional retry of try_create_reader adds wasted backoff for non-transient failures.
try_create_reader returns nullptr for reasons that are almost always non-transient: missing files / permission denied on the filesystem path, invalid URLs or missing/invalid AWS credentials on the network path (the clp::NetworkReader constructor does not itself perform the curl transfer — transient S3 errors surface later during read). Retrying these paths with exponential backoff delays the real error by ~7s (1+2+4) with no hope of succeeding.
Consider scoping retries to the network path and/or only when is_retryable_curl_error applies, and failing fast for filesystem creation failures.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/core/src/clp_s/InputConfig.cpp` around lines 494 - 502, The
current loop unconditionally retries when try_create_reader(path, network_auth)
returns nullptr; change this so non-transient failures fail fast: after
try_create_reader(...) returns nullptr, detect whether the path is a network
resource (e.g. check the path/url kind or use the NetworkReader construction
error indicator) and only apply the exponential backoff/retry when the failure
is a network/transient curl-type error (use is_retryable_curl_error or the
NetworkReader-specific error info); for filesystem/missing-permission or other
non-retryable reasons, log an SPDLOG_ERROR with the path and return {{},
FileType::Unknown} immediately instead of continuing through
attempt/max_retries.
kirkrodrigues
left a comment
There was a problem hiding this comment.
Deferring to @gibber9809's review.
For the PR title, how about:
fix(core): Add retries with exponential backoff for transient S3 read failures before any data is read.
Can we also edit the PR description to more accurately reflect the changes in this PR?
… failures before any data is read. (y-scope#2203) Co-authored-by: Naman Goyal <namangg@amazon.com> Co-authored-by: Devin Gibson <gibber9809@users.noreply.github.com> Co-authored-by: dgibson <devinbook1@gmail.com>
When reading log files from S3, transient network errors (HTTP errors, SSL connection resets, empty replies, timeouts) during reader creation or file type deduction cause the compression task to fail permanently. Since there is no retry mechanism, the affected log files are never ingested.
Changes
NetworkUtils::is_retryable_curl_error()inUtils.hpp/cppto detect transient curl errors (CURLE_HTTP_RETURNED_ERROR,CURLE_SSL_CONNECT_ERROR,CURLE_GOT_NOTHING,CURLE_RECV_ERROR,CURLE_OPERATION_TIMEDOUT).try_create_reader_and_deduce_type_with_retries()inInputConfig.hpp/cpp— a shared utility that wraps reader creation and file type deduction with retry logic (3 retries, exponential backoff starting at 1s).JsonParser::ingest()andlog_converter::convert_files()to use the new utility, replacing inline retry loops.Scope
This PR only handles transient errors that occur before any data is ingested (i.e., during reader creation and input type deduction). Retrying after partial ingestion would require more extensive changes to the ingestion pipeline and is left for a follow-up.