Skip to content

feat: allow accepting timeouts#2063

Merged
thomas-zahner merged 28 commits into
lycheeverse:masterfrom
cristiklein:can-ignore-timeout
Mar 4, 2026
Merged

feat: allow accepting timeouts#2063
thomas-zahner merged 28 commits into
lycheeverse:masterfrom
cristiklein:can-ignore-timeout

Conversation

@cristiklein

@cristiklein cristiklein commented Feb 25, 2026

Copy link
Copy Markdown
Contributor
  • Add --accept-timeouts which means timeouts are accepted as valid links and lychee exits with 0.
  • This feature can also be used from lychee.toml by using accept_timeouts = true
  • The output is unchanged, i.e., the user can still see the number of timeouts.
  • Add test for this feature

Thanks to @yegor256, @mre and @thomas-zahner for the kind words and feedback.

@cristiklein

Copy link
Copy Markdown
Contributor Author

@yegor256 Wanna take this for a spin?

@yegor256

Copy link
Copy Markdown

@cristiklein would it be possible to do this in the lychee.toml?

accept = [200, 202, 301, 302, 307, 401, timeout]

@cristiklein

Copy link
Copy Markdown
Contributor Author

@cristiklein would it be possible to do this in the lychee.toml?

accept = [200, 202, 301, 302, 307, 401, timeout]

Just tested and it also works via lychee.toml:

$ cat blah.md 
<http://127.0.0.1:8080>
$ cat lychee.toml 
accept = [200, 202, 301, 302, 307, 401, "timeout"]
timeout = 1
$ ./target/debug/lychee --config lychee.toml blah.md
1/1 ━━━━━━━━━━━━━━━━━━━━ Finished extracting links                                                                                                       Issues found in 1 input. Find details below.

[blah.md]:
 [TIMEOUT] http://127.0.0.1:8080/ | Timeout

🔍 1 Total (in 11s) ✅ 0 OK 🚫 0 Errors ⏳ 1 Timeouts

$ echo $?
0

@thomas-zahner

thomas-zahner commented Feb 25, 2026

Copy link
Copy Markdown
Member

Thank you for the PR. As mentioned over on the issue we should not use such magic/sentinel values.

Reusing --accept might not be ideal in this case. Tough it would be possible, if we wrapped the accept type in a newtype or struct/enum.

It might be simpler and more straight forward to introduce --accept-timeouts both for users and for the implementation. I'm not quite sure what all the implications of this change are. But they might be bigger than the PR in the current state.

For example all output formats still show an erroneous run.
Issues found in 1 input. Find details below. for compact,
error_map for json
Errors in stdin for detailed
### Errors in stdin for markdown

So the timeouts should maybe be marked as success as early as possible, instead of adding is_success_ignoring_timeouts.

@mre

mre commented Feb 25, 2026

Copy link
Copy Markdown
Member

Thanks for the initiative @cristiklein. It's nice to see someone taking on that task. 😊

I agree with Thomas here. Those special values can cause hard-to-debug edge cases. I should know because we're still dealing with the aftermath of my ad-hoc decisions from the past. 😬

So, yeah, let's make it explicit.

The lychee config would then look like

accept = [200, 202, 301, 302, 307, 401]

# Accept transient timeouts
accept_timeouts = false

@cristiklein

Copy link
Copy Markdown
Contributor Author

Thanks @mre and @thomas-zahner for the kind words and feedback.

I just force-pushed a new PR, which addresses your feedback.

Below is an example invocation:

$ cat blah.md 
<http://127.0.0.1:8080>
$ cat lychee.toml 
accept = [200, 202, 301, 302, 307, 401]
accept_timeouts = true
timeout = 1
$ ./target/debug/lychee blah.md
1/1 ━━━━━━━━━━━━━━━━━━━━ Finished extracting links                          Issues found in 1 input. Find details below.

[blah.md]:
[TIMEOUT] http://127.0.0.1:8080/ (at 1:2) | Timeout

🔍 1 Total (in 11s) ✅ 0 OK 🚫 0 Errors ⏳ 1 Timeouts

$ echo $?
0

Let me know what you think.

@yegor256

Copy link
Copy Markdown

@cristiklein maybe you can update the title and description of the PR, for clarity?

@cristiklein cristiklein changed the title Add accept=timeout Add --accept-timeout Feb 26, 2026
@cristiklein cristiklein changed the title Add --accept-timeout feat: Add --accept-timeout Feb 26, 2026
@cristiklein cristiklein changed the title feat: Add --accept-timeout feat: Add --accept-timeouts Feb 26, 2026
@thomas-zahner

Copy link
Copy Markdown
Member

@cristiklein Cool, that looks better. Also great to see the new test 👍

Now we should think about how to display and output such accepted timeouts. Currently, they are still clearly marked as error. This might be unexpected, though maybe the current behaviour is also fine. But my thinking is that if timeouts are accepted timeouts should probably not be shown as errors, but as successes.

So for example the timeouts shouldn't show up in error_map for the json format. The same would be true for all other formats. Although, maybe this would make it harder to spot the timeouts? What are your thoughts @mre @katrinafyi?

@cristiklein cristiklein changed the title feat: Add --accept-timeouts feat: allow accepting timeouts Feb 26, 2026
@katrinafyi

katrinafyi commented Feb 26, 2026

Copy link
Copy Markdown
Member

I tried it out and I noticed that timeouts are still retried even if --accept-timeouts is given. This makes it take more time than necessary. Is this intentional?

Re output format, I agree that if timeouts are accepted, they shouldn't be printed amongst the errors.

Looking at the related enums, I also think the current enum Status is not the best, because it mixes up two things which I think should be separated: (1) the actual outcome of the request, and (2) how that outcome should be interpreted. This makes it hard to represent cases like a Timeout that should be accepted and we also have to pass the accepted set around a lot. To implement special logic like this PR, it ends up being a lot of post-processing to "reinterpret" the status. If the two concerns were separated, it would be much easier to do this PR and other requests such as rejecting redirects.

But anyway, all that won't be fixed in this PR. In this PR, I guess you would change the lines around where it adds to the error map:

https://github.com/cristiklein/lychee/blob/6a8007ba3d1fb5de95c78c4e60126e824748340d/lychee-bin/src/formatters/stats/response.rs#L93-L106

@cristiklein

Copy link
Copy Markdown
Contributor Author

Hi @katrinafyi . Thanks for looking into the PR.

I tried it out and I noticed that timeouts are still retried even if --accept-timeouts is given. This makes it take more time than necessary. Is this intentional?

Yes, my intention was for --accept-timeout to only change the exit code and nothing else. The user can run lychee --accept-timeouts --max-retries=0 --timeout=1 if they want to minimize execution time.

Re output format, I agree that if timeouts are accepted, they shouldn't be printed amongst the errors.

Just to make sure I understand correctly, do you argue for lychee to suppress outputting [TIMEOUT] and folding that into OKs?

My thinking is the following: This feature will most likely be used in a CI. In that case, the user might still want TIMEOUT in logs, but they want lychee to return 0, so their CI pipeline continues.

Looking at the related enums, I also think the current enum Status is not the best, because it mixes up two things which I think should be separated: (1) the actual outcome of the request, and (2) how that outcome should be interpreted. This makes it hard to represent cases like a Timeout that should be accepted and we also have to pass the accepted set around a lot. To implement special logic like this PR, it ends up being a lot of post-processing to "reinterpret" the status. If the two concerns were separated, it would be much easier to do this PR and other requests such as rejecting redirects.

Just to make sure I understand correctly, are you arguing for moving accept logic to ResponseStats?

But anyway, all that won't be fixed in this PR. In this PR, I guess you would change the lines around where it adds to the error map:

https://github.com/cristiklein/lychee/blob/6a8007ba3d1fb5de95c78c4e60126e824748340d/lychee-bin/src/formatters/stats/response.rs#L93-L106

Just to make sure, you propose timeouts to be added to success_map with --accept-timeouts and error_map without --accept-timeouts? Seems like timeouts are currently not added to any map.

As an alternative, would it make sense to add timeout_map? This would avoid having to pass accept to ResponseStats.add_response_status().

@katrinafyi

Copy link
Copy Markdown
Member

Just to make sure, you propose timeouts to be added to success_map with --accept-timeouts and error_map without --accept-timeouts? Seems like timeouts are currently not added to any map.

If you run

echo https://httpbin.org/delay/2 | cargo run --  - --timeout=1 --max-retries=0 --format json

the timeout are added to error_map (it's included in the is_error() case). I think it would be nice if error_map only contains the errors which would cause link checking to fail. Then, for example, is_success is simply error_map.is_empty() and we don't have to do the big sums.

However, at the moment, printing the timeouts (under the default verbosity) is tied to inclusion in the error_map. It would be nice if this were not the case and that could be done by adding a timeout_map and a separate output section like redirects. But I also defer to @ thomas-zahner and @ mre for whether that is a good idea.

Just to make sure I understand correctly, are you arguing for moving accept logic to ResponseStats?

Sorry, that paragraph was not necessarily directed to you. It's just an early idea and I'm not arguing for anything concrete.

@cristiklein

Copy link
Copy Markdown
Contributor Author

Thanks @katrinafyi for the feedback.

Just to make sure, you propose timeouts to be added to success_map with --accept-timeouts and error_map without --accept-timeouts? Seems like timeouts are currently not added to any map.

If you run

echo https://httpbin.org/delay/2 | cargo run --  - --timeout=1 --max-retries=0 --format json

the timeout are added to error_map (it's included in the is_error() case). I think it would be nice if error_map only contains the errors which would cause link checking to fail. Then, for example, is_success is simply error_map.is_empty() and we don't have to do the big sums.

However, at the moment, printing the timeouts (under the default verbosity) is tied to inclusion in the error_map. It would be nice if this were not the case and that could be done by adding a timeout_map and a separate output section like redirects. But I also defer to @ thomas-zahner and @ mre for whether that is a good idea.

To make the alternative easier to reason about, I went ahead and moved timeouts into timeout_map as a concrete proposal. See 3f9d495

I also simplified is_success and is_success_ignoring_timeouts. is_success now becomes a simple error_map.is_empty() && timeout_map.is_empty(). See 84de85e

Curious whether this direction makes sense conceptually, especially with regard to output verbosity and backward compatibility.

@thomas-zahner thomas-zahner left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cristiklein Thank you for updating & improving the tests. I really like the idea of a new timeout_map 👍 This makes the new flag more obvious. It controls how users want to interpret the timeout_map. That's something we could also mention in the flag description.

Comment thread lychee-bin/src/formatters/stats/response.rs Outdated
Comment thread lychee-bin/src/formatters/stats/mod.rs Outdated

@thomas-zahner thomas-zahner left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also suggest a few other improvements in 6dc06e9

Comment thread lychee-bin/src/formatters/stats/json.rs
Comment thread lychee-bin/src/formatters/stats/mod.rs Outdated
Comment thread lychee-bin/src/formatters/stats/response.rs Outdated
Comment thread lychee-bin/src/options.rs Outdated
Comment thread lychee.example.toml Outdated
cristiklein and others added 5 commits March 4, 2026 10:19
Co-authored-by: Thomas Zahner <thomas.zahner@protonmail.ch>
Co-authored-by: Thomas Zahner <thomas.zahner@protonmail.ch>
Co-authored-by: Thomas Zahner <thomas.zahner@protonmail.ch>
Co-authored-by: Thomas Zahner <thomas.zahner@protonmail.ch>
@cristiklein

Copy link
Copy Markdown
Contributor Author

Thanks @thomas-zahner . I accepted all your suggestions and made the tests pass again. PTAL.

@cristiklein

Copy link
Copy Markdown
Contributor Author

I also suggest a few other improvements in 6dc06e9

Sorry, I noticed this too late. I added those suggestions too and fixed the tests accordingly.

Comment thread lychee.example.toml Outdated
BOLD_PINK,
"Issues found in {} {input}. Find details below.\n\n",
stats.error_map.len()
"Issues found in {issues} {input}. Find details below.\n\n",

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that it might be better to make this a user hint in the long run. (not in this PR) We could then take the value of accept_timeouts into account.

@thomas-zahner thomas-zahner merged commit 892cb9b into lycheeverse:master Mar 4, 2026
7 checks passed
@thomas-zahner

Copy link
Copy Markdown
Member

@cristiklein Thank you for the work. I also really like the new timeout_map which emerged as part of this PR.

@mre mre mentioned this pull request Mar 4, 2026
@cristiklein

Copy link
Copy Markdown
Contributor Author

@cristiklein Thank you for the work. I also really like the new timeout_map which emerged as part of this PR.

And thank you for shepherding this PR. I really liked the infrastructure and tests in lychee, and the good discussions we had. It was fun!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants