Limit searching of `charset` attribute in `meta` tag for HTML to first 1024 characters in webcmdlets

Open SteveL-MSFT opened this issue 3 years ago • 2 comments

PR Summary

For HTML content, the cmdlet searches the body for `<meta charset=...>` to find the right encoding to use. It uses a regex to do this, but applies it to the entire body, which can be quite large. The `<meta>` tag belongs in the `<head>` tag, which is expected at the top of the document under `<html>`. The change here therefore limits the search to the first 1024 characters. Because HTML is lenient, a run of HTML comments could push the `<meta>` tag much lower than that, but this seems unlikely. The encoding defaults to UTF-8, which most websites use anyway.

Tested manually against the repro in the issue; the command now returns immediately after the payload is downloaded.
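The bounded search described in the summary can be sketched roughly as follows. This is an illustrative Python sketch, not the cmdlet's actual C# code: the names `CHARSET_SEARCH_LIMIT`, `META_CHARSET`, and `detect_charset` are hypothetical, and the regex is a simplified assumption rather than the pattern PowerShell uses. Only the 1024-character limit and the UTF-8 default come from the PR itself.

```python
import re

# Assumption: the limit matches the 1024 characters named in the PR title.
CHARSET_SEARCH_LIMIT = 1024

# Simplified, hypothetical pattern: finds charset=... inside a <meta> tag.
META_CHARSET = re.compile(
    r"<meta\s[^>]*charset\s*=\s*[\"']?\s*([A-Za-z0-9_.:-]+)",
    re.IGNORECASE,
)


def detect_charset(html: str, default: str = "utf-8") -> str:
    """Return the charset declared in a <meta> tag, or the default."""
    # Slice first so the regex never scans more than the leading characters,
    # no matter how large the downloaded body is.
    head = html[:CHARSET_SEARCH_LIMIT]
    match = META_CHARSET.search(head)
    return match.group(1).lower() if match else default
```

The trade-off is the one the summary names: a `<meta>` tag pushed past the limit (for example by a long run of comments) is not seen, and the default encoding is used instead.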

PR Context

Fix https://github.com/PowerShell/PowerShell/issues/17762

PR Checklist

SteveL-MSFT avatar Aug 01 '22 21:08 SteveL-MSFT

This pull request has been automatically marked as Review Needed because there has not been any activity for 7 days. Maintainer, please provide feedback and/or mark it as Waiting on Author.

msftbot[bot] avatar Aug 10 '22 02:08 msftbot[bot]

This PR has 2 quantified lines of changes. In general, a change size of up to 200 lines is ideal for the best PR experience!


Quantification details

Label      : Extra Small
Size       : +1 -1
Percentile : 0.8%

Total files changed: 1

Change summary by file extension:
.cs : +1 -1

Change counts above are quantified counts, based on the PullRequestQuantifier customizations.

Why proper sizing of changes matters

Optimal pull request sizes drive a more predictable PR flow, as they strike a balance between PR complexity and PR review overhead. PRs within the optimal size (typically small or medium-sized PRs) mean:

  • Fast and predictable releases to production:
    • Optimal size changes are more likely to be reviewed faster with fewer iterations.
    • Similarity in low PR complexity drives similar review times.
  • Review quality is likely higher as complexity is lower:
    • Bugs are more likely to be detected.
    • Code inconsistencies are more likely to be detected.
  • Knowledge sharing is improved within the participants:
    • Small portions can be assimilated better.
  • Better engineering practices are exercised:
    • Solving big problems by dividing them into well-contained, smaller problems.
    • Exercising separation of concerns within the code changes.

What can I do to optimize my changes

  • Use the PullRequestQuantifier to quantify your PR accurately
    • Create a context profile for your repo using the context generator
    • Exclude files that are not necessary to be reviewed or do not increase the review complexity. Example: Autogenerated code, docs, project IDE setting files, binaries, etc. Check out the Excluded section from your prquantifier.yaml context profile.
    • Understand your typical change complexity, drive towards the desired complexity by adjusting the label mapping in your prquantifier.yaml context profile.
    • Only use the labels that matter to you, see context specification to customize your prquantifier.yaml context profile.
  • Change your engineering behaviors
    • For PRs that fall outside of the desired spectrum, review the details and check if:
      • Your PR could be split into smaller, self-contained PRs instead
      • Your PR only solves one particular issue. (For example, don't refactor and code new features in the same PR).

How to interpret the change counts in git diff output

  • One line was added: +1 -0
  • One line was deleted: +0 -1
  • One line was modified: +1 -1 (git diff doesn't know about modified, it will interpret that line like one addition plus one deletion)
  • Change percentiles: Change characteristics (addition, deletion, modification) of this PR in relation to all other PRs within the repository.
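As a rough illustration of how the `+N -M` counts above relate to git diff output, the sketch below tallies added and deleted lines in a unified diff. It is a simplified assumption, not how PullRequestQuantifier computes its numbers: the function name `count_changes` and the sample file name are hypothetical, and the header check would miscount a content line whose own text begins with `++` or `--`.

```python
def count_changes(diff_text: str) -> tuple[int, int]:
    """Return (added, deleted) line counts for a unified diff."""
    added = deleted = 0
    for line in diff_text.splitlines():
        # Skip the '--- a/file' and '+++ b/file' headers.
        if line.startswith("+++") or line.startswith("---"):
            continue
        if line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            deleted += 1
    return added, deleted


# Hypothetical diff: one modified line shows up as one deletion plus one addition.
sample = """--- a/example.cs
+++ b/example.cs
@@ -1,3 +1,3 @@
 unchanged line
-old line
+new line
 unchanged line
"""
print(count_changes(sample))
```

A single modified line is reported as one addition and one deletion, which matches the `+1 -1` size shown for this PR.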


:tada:v7.4.0-preview.1 has been released which incorporates this pull request.:tada:

msftbot[bot] avatar Dec 20 '22 22:12 msftbot[bot]