Stop on first non-match after positive match #1790

@sveniu

Feature request summary

Stop search on first non-matching line, but only after having seen matching lines.

Motivation

On sorted or semi-sorted files, it is often useful to extract a run of consecutive lines that match a given pattern. With large files, time and I/O can be saved by stopping the search as soon as that run ends.

I'm fairly convinced this feature can save a lot of CPU time and I/O on large inputs, particularly where timestamp matching is involved.

Example

Input:

2021-01-29T07:08:09+0100 foo
2021-01-29T08:08:09+0100 bar
2021-01-29T09:08:09+0100 baz
2021-01-29T10:08:09+0100 boz

Behaviour when searching for pattern ^2021-01-29T08 with this option enabled:

  1. First line does not match; continue to next.
  2. Second line matches; print it and continue to next.
  3. Third line does not match and follows a match; stop and exit. Line 4 is never scanned.

Pseudo code

have_seen_matches = false
for each line:
  if line matches:
    print line
    have_seen_matches = true
  else if have_seen_matches:
    break
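
The pseudo code above can be sketched in Python roughly as follows. This is only an illustration of the proposed behaviour, not how the tool would implement it; the function name `stop_on_nonmatch` and the use of Python's `re` module are assumptions made for the example. The sample input is the one from the Example section.

```python
import re
from typing import Iterable, Iterator


def stop_on_nonmatch(lines: Iterable[str], pattern: str) -> Iterator[str]:
    """Yield matching lines; stop at the first non-match that follows a match.

    Hypothetical helper, written only to illustrate the proposal.
    """
    regex = re.compile(pattern)
    seen_match = False
    for line in lines:
        if regex.search(line):
            seen_match = True
            yield line
        elif seen_match:
            # The run of matching lines has ended, so stop here.
            # Lines after this one are never pulled from the input.
            return


sample = [
    "2021-01-29T07:08:09+0100 foo",
    "2021-01-29T08:08:09+0100 bar",
    "2021-01-29T09:08:09+0100 baz",
    "2021-01-29T10:08:09+0100 boz",
]
result = list(stop_on_nonmatch(sample, r"^2021-01-29T08"))
```

Because the function is a generator over any iterable, it can be fed an open file object, in which case the remainder of the file is simply left unread once the run ends. If no line ever matches, the whole input is still scanned.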

Suggested documentation

--stop-on-nonmatch
    Stop reading a file on the first non-matching line that follows a matching
    line. At least one line must have matched for this to take effect.
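
The wording "stop reading a file" suggests the state is tracked per file, so a search over several files would stop early in each one independently. A minimal sketch of that reading, assuming in-memory streams in place of real files (the function `search_streams` and the file names are made up for illustration):

```python
import io
import re
from typing import Dict, List, TextIO


def search_streams(streams: Dict[str, TextIO], pattern: str) -> List[str]:
    """Apply the proposed stop-on-nonmatch behaviour to each stream separately."""
    regex = re.compile(pattern)
    hits = []
    for name, stream in streams.items():
        seen_match = False  # assumption: the flag resets for every file
        for line in stream:
            if regex.search(line):
                seen_match = True
                hits.append(f"{name}:{line.rstrip()}")
            elif seen_match:
                break  # stop reading this file only; move on to the next one
    return hits


streams = {
    "a.log": io.StringIO("x 1\nhit 2\nx 3\nhit 4\n"),
    "b.log": io.StringIO("hit 5\nx 6\n"),
}
hits = search_streams(streams, r"^hit")
```

Note that `hit 4` in `a.log` is not reported: it comes after the first run of matches has ended, which is the trade-off this option makes on input that is not actually sorted.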

Labels: enhancement, help wanted, rollup
