Skip to content

[breaking] trust setting to indicate whether input text is trusted#1794

Merged
k4b7 merged 15 commits intoKaTeX:masterfrom
edemaine:trust
Jul 9, 2019
Merged

[breaking] trust setting to indicate whether input text is trusted#1794
k4b7 merged 15 commits intoKaTeX:masterfrom
edemaine:trust

Conversation

@edemaine
Copy link
Member

Here is a draft of a trust setting to indicate whether the input text is trusted (and thus fix #1771), also allowing a function (like we do with strict) to depend on the specific function that we're worried about (advanced usage).

I did not add support for an array of trusted things like I suggested in #1771, in order to keep this simple, but happy to add something like that if we want.

I also have not added any tests; wanted to see if people like this interface first.

@ylemkimon
Copy link
Member

I was thinking of a dictionary like:

{
    allowedProtocols: ["https", "http"],
    allowedClass: /katex-.+/,
    ....
}

but I think the function approach is better.

docs/options.md Outdated
(outside an array/tabular environment). In strict mode, no line break
results, as in LaTeX.
- `trust`: `boolean` or `function` (default: `false`). If `false` (do not trust input), prevent any commands like `\includegraphics` that could enable adverse behavior, rendering them instead in `errorColor`. If `true` (trust input), allow all such commands. Provide a custom function `handler(command, ...)` to customize behavior depending on the command and possibly its arguments (e.g. a URL). A list of such commands:
- `"\\includegraphics"`, with URL argument
Copy link
Member

@ylemkimon ylemkimon Nov 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think of deprecating allowedProtocols and allow setting allowed protocols for \url and \href here?

alt = alt.substring(0, alt.lastIndexOf('.'));
}

if (!parser.settings.isTrusted("\\includegraphics", src)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should provide context information as the second argument like:

{
    type: "url",
    url: "https://katex.org/image.png",
    protocol: "https",
    ...
}

or

{
    type: "html",
    class: "my-class",
}

to allow setting across commands and prevent mis-parsing the url.

@edemaine
Copy link
Member Author

edemaine commented Dec 28, 2018

@ylemkimon Sorry for the long delay on looking at this. I've just pushed a new proposal where each call to the trust function gets a TrustContext which includes the command but also e.g. url and protocol as you suggested. What do you think?

Here are some sample uses: (which maybe I should add to documentation)

  • Forbid specific command: trust: (context) => context.command !== '\includegraphics'
  • Allow specific command: trust: (context) => context.command === '\includegraphics'
  • Allow specific protocol: trust: (context) => context.protocol === 'http'
  • Forbid specific protocol: trust: (context) => context.protocol !== 'file'

I agree that this could replace allowedProtocols, if I test for trust in \url and \href as well. In that case, however, the default trust of false will forbid all \url and \href calls. Is that what we want? Maybe, actually.

@codecov-io
Copy link

codecov-io commented Dec 28, 2018

Codecov Report

Merging #1794 into master will decrease coverage by 0.05%.
The diff coverage is 87.5%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1794      +/-   ##
==========================================
- Coverage   93.39%   93.34%   -0.06%     
==========================================
  Files          79       79              
  Lines        4981     4988       +7     
  Branches      872      876       +4     
==========================================
+ Hits         4652     4656       +4     
- Misses        289      291       +2     
- Partials       40       41       +1
Flag Coverage Δ
#screenshotter 89.06% <28.57%> (-0.1%) ⬇️
#test 86.5% <87.5%> (-0.05%) ⬇️
Impacted Files Coverage Δ
src/parseNode.js 84.21% <ø> (ø) ⬆️
src/defineFunction.js 93.75% <ø> (ø) ⬆️
src/functions/includegraphics.js 0% <0%> (ø) ⬆️
src/utils.js 93.75% <100%> (+0.64%) ⬆️
src/Settings.js 78.57% <100%> (+2.25%) ⬆️
src/Parser.js 95.29% <100%> (-0.09%) ⬇️
src/functions/href.js 96% <100%> (-4%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fc79f79...c80f5be. Read the comment docs.

@ylemkimon
Copy link
Member

Is that what we want? Maybe, actually.

I think we should be as conservative as possible, not allowing anything by default.

@edemaine
Copy link
Member Author

edemaine commented Feb 9, 2019

@ylemkimon Agreed, safest does seem like a good default. This will break the existing release, though we can give a setting trust: (context) => ['\\url ', '\\href'].includes(context.command) && ['http', 'https', 'mailto', '_relative'].include s(context.protocol) (already almost an example in the documentation) that mimics the current trust behavior.

This change breaks a lot of tests, which I still need to fix. Also we should either remove or deprecate allowedProtocols.

I think we should try to settle something soon, because #1842 is a pretty critical bug, so once fixed, we're going to want to release -- but I also think we can't release with \includegraphics (already merged) before we have a trust setting.

@k4b7
Copy link
Member

k4b7 commented Mar 17, 2019

@edemaine this is a big change. Thanks for tackling this! ❤️

@edemaine edemaine mentioned this pull request Mar 18, 2019
docs/security.md Outdated
with untrusted inputs; refer to [Options](options.md) for more details.
* `maxSize` can prevent large width/height visual affronts.
* `maxExpand` can prevent infinite macro loop attacks.
* `allowedProtocols` can prevent certain protocols in URLs (e.g., with `\href`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think as we're in 0.x (major version zero) stage, we can remove allowedProtocols without deprecation.

src/utils.js Outdated
* Return the protocol of a URL, or "_relative" if the URL does not specify a
* protocol (and thus is relative).
*/
export const urlToProtocol = function(url: string): string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

protocolFromUrl or getProtocol(FromUrl)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to go with protocolFromUrl since we use fooFromBar in a number of places already.

Copy link
Member

@ylemkimon ylemkimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you update the PR?

@k4b7
Copy link
Member

k4b7 commented Jul 5, 2019

I resolved the merge conflicts which were due to this PR modifying the /includegraphics docs and the docs being deleted on master. The resolution was to delete them here as well. We can add the docs back in the future once we re-enable /includegraphics after this PR is merged.

@k4b7
Copy link
Member

k4b7 commented Jul 5, 2019

@ylemkimon I can make the changes you requested.

@k4b7
Copy link
Member

k4b7 commented Jul 5, 2019

Looks like there are some flow errors after the merge. I'll fix those up too.

@k4b7 k4b7 changed the title trust setting to indicate whether input text is trusted [breaking] trust setting to indicate whether input text is trusted Jul 5, 2019

- Forbid specific command: `trust: (context) => context.command !== '\\includegraphics'`
- Allow specific command: `trust: (context) => context.command === '\\url'`
- Allow multiple specific commands: `trust: (context) => ['\\url', '\\href'].includes(context.command)`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@edemaine I really like these examples.

with untrusted inputs; refer to [Options](options.md) for more details.
* `maxSize` can prevent large width/height visual affronts.
* `maxExpand` can prevent infinite macro loop attacks.
* `trust` can allow certain commands that are not always safe (e.g., `\includegraphics`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of changing to something else, but I think we can just re-enable \includegraphics once this PR is merged.

}
result = this.handleUnsupportedCmd();
result = this.formatUnsupportedCmd(text);
this.consume();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The consume was moved out of formatUnsupported to here.

protocol?: string,
|},
"\\includegraphics": {|
command: "\\includegraphics",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to leave this in since we want to re-enable \includegraphics once this lands.

"\\url": {|
command: "\\url",
url: string,
protocol?: string,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this could be restructured to dedupe the url and protocol we may have other commands that are insecure in other ways in the future.

args: AnyParseNode[],
optArgs: (?AnyParseNode)[],
) => ParseNode<NODETYPE>;
) => UnsupportedCmdParseNode | ParseNode<NODETYPE>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to flip the order to make flow happy. I'll add a link to the issue in the code.

src/utils.js Outdated
* Return the protocol of a URL, or "_relative" if the URL does not specify a
* protocol (and thus is relative).
*/
export const urlToProtocol = function(url: string): string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to go with protocolFromUrl since we use fooFromBar in a number of places already.

* should be an object with `command` field specifying the relevant LaTeX
* command (as a string starting with `\`), and any other arguments, etc.
* If `context` has a `url` field, a `protocol` field will automatically
* get added by this function (changing the specified object).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is great!

@k4b7
Copy link
Member

k4b7 commented Jul 5, 2019

I renamed this PR to having [breaking] in the title as a reminder to ourselves to bump the version when publishing. @ylemkimon I think I've addressed all of your concerns. Please have another look when you have some time.

@edemaine
Copy link
Member Author

edemaine commented Jul 6, 2019

Thanks @kevinbarabash for picking up my slack here. (And sorry I've been too busy lately to do it myself.)

Given that \includegraphics was blocking only on this, perhaps the resolution should have been to put it back in? Up to you though.

allowedProtocols: [],
}));
});

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to replace these with tests of the corresponding trust settings (as in the documentation examples). In general, it would be good to add trust tests. Because trust violations don't throw errors, you could probably just use snapshot tests... Or test for the appropriate parse node.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Will do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've re-added the tests but I had to refactor them a bit to be snapshot tests since \href with trust: false parses, but produces a different parse tree from when trust: true is set.

I've also set the default for the unit tests to trust: false. I can probably remove that setting though since false is the default.

@k4b7
Copy link
Member

k4b7 commented Jul 6, 2019

Given that \includegraphics was blocking only on this, perhaps the resolution should have been to put it back in? Up to you though.

I'd like to keep this PR focused on the new trust setting. Re-enabling \includegraphics will require adding screenshots and tests that are unrelated to rest of this PR. I can create a stacked PR so we can see what those changes look like before this is merged.

@k4b7 k4b7 self-assigned this Jul 6, 2019
@edemaine
Copy link
Member Author

edemaine commented Jul 6, 2019

Makes sense. We'll have the old commits anyway to guide the small changes needed for \includegraphics. I don't think it's necessary to stack PRs.

Copy link
Member

@ylemkimon ylemkimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good to go!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Security setting

4 participants