Skip to content

Respect gitignore#1500

Merged
thomas-zahner merged 11 commits into
lycheeverse:masterfrom
thomas-zahner:respect-gitignore
Sep 22, 2024
Merged

Respect gitignore#1500
thomas-zahner merged 11 commits into
lycheeverse:masterfrom
thomas-zahner:respect-gitignore

Conversation

@thomas-zahner

@thomas-zahner thomas-zahner commented Sep 15, 2024

Copy link
Copy Markdown
Member

Closes #1331

With this PR lychee skips files that are ignored (by git or .ignore) and hidden by default. This behaviour can be disabled with the --no-ignore and --hidden flags respectively. This is done by replacing jwalk with ignore.

Files are considered ignored as defined by standard_filters with the exception of hidden which is handled with the separate flag or config option.

@thomas-zahner thomas-zahner requested a review from mre September 15, 2024 12:14
@thomas-zahner

thomas-zahner commented Sep 15, 2024

Copy link
Copy Markdown
Member Author

@mre I've got three questions:

  1. The test_readme_usage_up_to_date test was annoying me a bit sometimes because all my editors trim empty lines when saving README.md. I've updated the test so that lines containing only whitespace are trimmed, so that empty lines are expected. Is this okay?
  2. For now --gitignore is required so that ignored files are skipped. Would it be okay if we switched it around and update the flag to --no-gitignore? This would be a more sensible default in my opinion but also a breaking change.
  3. Currently there is no way to include hidden files. (hardcoded previously as skip_hidden(true) now hidden(true)) Should I add an option (lychee-lib) and flag (lychee-bin) to include hidden files?

@mre

mre commented Sep 15, 2024

Copy link
Copy Markdown
Member

Very nice!

  1. The test_readme_usage_up_to_date test was annoying me a bit sometimes because all my editors trim empty lines when saving README.md. I've updated the test so that lines containing only whitespace are trimmed, so that empty lines are expected. Is this okay?

Yes, same. Sounds very reasonable to me. Thanks!

  1. For now --gitignore is required so that ignored files are skipped. Would it be okay if we switched it around and update the flag to --no-gitignore? This would be a more sensible default in my opinion but also a breaking change.

Absolutely. I'm all for it. Please go forward and change it if you like. Maybe we could use the --include prefix to align it with the other options. Maybe --include-gitignore or --include-hidden?

  1. Currently there is no way to include hidden files. (hardcoded previously as skip_hidden(true) now hidden(true)) Should I add an option (lychee-lib) and flag (lychee-bin) to include hidden files?

Yes, that sounds sensible to me. 👍

@thomas-zahner

thomas-zahner commented Sep 20, 2024

Copy link
Copy Markdown
Member Author

@mre Thanks for the answers. The PR is now ready for review.

Just one last thing; clippy complains about too many booleans in the Collector struct. Should I extract the three skip booleans into a separate struct? How would you name that struct?

pub struct Collector {
    basic_auth_extractor: Option<BasicAuthExtractor>,
    skip_missing_inputs: bool,
    skip_ignored: bool,
    skip_hidden: bool,
    include_verbatim: bool,
    use_html5ever: bool,
    base: Option<Base>,
}

@mre

mre commented Sep 20, 2024

Copy link
Copy Markdown
Member

Awesome. I don't like that lint too much. Maybe we can ignore it for now.

@mre

mre commented Sep 20, 2024

Copy link
Copy Markdown
Member

Coincidentally, it's the most ignored lint: rust-lang/rust-clippy#5418
(Except for cognitive_complexity, which got removed.)

@thomas-zahner

Copy link
Copy Markdown
Member Author

Yeah I also felt like this is very picky and currently there are 292 results which is quite a lot in comparison to other lints. Began extracting the three bools but it didn't feel like a benefit, only like more code. So I also prefer disabling the lint here.

@mre

mre commented Sep 20, 2024

Copy link
Copy Markdown
Member

I don't know why the publish-check is failing. It seems like that's unrelated and not a blocker.

For consistency, I'd propose to rename the options:

  • --no-ignore--include-ignored or --include-gitignore
  • --hidden--include-hidden

Here's the other options so far:

❯❯❯ lychee --help | rg 'exclude|include'                                                                                                                                                         
      --include <INCLUDE>
          URLs to check (supports regex). Has preference over all excludes
      --exclude <EXCLUDE>
      --exclude-file <EXCLUDE_FILE>
          Deprecated; use `--exclude-path` instead
      --exclude-path <EXCLUDE_PATH>
  -E, --exclude-all-private
          Equivalent to `--exclude-private --exclude-link-local --exclude-loopback`
      --exclude-private
      --exclude-link-local
      --exclude-loopback
      --exclude-mail
          Exclude all mail addresses from checking (deprecated; excluded by default)
      --include-mail
      --include-fragments
      --include-verbatim

I'm assuming you probably modeled the existing flags after ripgrep, which is a fine choice. Looking through the list of existing options above, I realize that they all have to do with links, not inputs (with the exception of --exclude-path). So I have no strong opinion on this. We could merge as-is, but I'd like to document our thought process here, so happy for any feedback.

@thomas-zahner

Copy link
Copy Markdown
Member Author

@mre Yes exactly, I used the same names as ripgrep does as I find them very good.

--include-gitignore too me sounds like I'd include the .gitignore files themself for checking. But that's not the case. As you have mentioned all flags with include or exclude have to do with individual links not files (with one exception). So I prefer the shorter and clearer flags that align with ripgrep.

@thomas-zahner thomas-zahner merged commit 99148af into lycheeverse:master Sep 22, 2024
@mre

mre commented Sep 22, 2024

Copy link
Copy Markdown
Member

Ah, that makes sense. Thanks for the clarification and the awesome PR.

mre added a commit that referenced this pull request Nov 7, 2024
This adds support for overwriting extensions:

```
lychee . --extensions md,html,txt,json,yaml
```

The above would only check these extensions.

This was enabled by moving to `ignore` (#1500 by @thomas-zahner).

Fixes #410
mre added a commit that referenced this pull request Feb 24, 2025
This adds support for overwriting extensions:

```
lychee . --extensions md,html,txt,json,yaml
```

The above would only check these extensions.

This was enabled by moving to `ignore` (#1500 by @thomas-zahner).

Fixes #410
abey79 added a commit to emilk/egui_plot that referenced this pull request May 8, 2025
Title. Follows this PR:
- emilk/egui#7035

Also removed the ignore list since it's no longer needed:
- lycheeverse/lychee#1500
michalsustr pushed a commit to emilk/egui_plot that referenced this pull request Nov 26, 2025
Title. Follows this PR:
- emilk/egui#7035

Also removed the ignore list since it's no longer needed:
- lycheeverse/lychee#1500
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Read .gitignore

2 participants