Skip to content

Support \includegraphics#1620

Merged
k4b7 merged 25 commits intoKaTeX:masterfrom
ronkok:includegraphics
Nov 1, 2018
Merged

Support \includegraphics#1620
k4b7 merged 25 commits intoKaTeX:masterfrom
ronkok:includegraphics

Conversation

@ronkok
Copy link
Collaborator

@ronkok ronkok commented Aug 15, 2018

This PR provides partial support for the \includegraphics function from the graphix package. Thus it addresses issue #898 and possibly #1596.

In the first argument, this PR supports the following key/value pairs:

  1. height=<size>
  2. width=<size>
  3. totalheight=<size>
  4. alt=<string>

Items 1 thru 3 in that list match some of the arguments in the graphix package. The provision for alt is non-standard. Other graphix options such as scale or angle are not supported. Perhaps that could be done for a future release.

Given this input:

\def\logo{\includegraphics[height=0.8em, totalheight=0.9em, width=0.9em, alt=KA logo]{https://cdn.kastatic.org/images/apple-touch-icon-57x57-precomposed.new.png}}
\def\logoB{\includegraphics[height=0.4em, totalheight=0.9em, width=0.9em, alt=KA logo]{https://cdn.kastatic.org/images/apple-touch-icon-57x57-precomposed.new.png}}
\begin{array}{l}
\underline{A\logo} + \sqrt{\logo} + \tfrac{A\logo}{\logo}\\[1em]
\underline{A\logoB} + \sqrt{x\logoB} + \tfrac{A\logoB}{\logoB}
\end{array}

We get the following rendering:

grahics

@ronkok
Copy link
Collaborator Author

ronkok commented Aug 15, 2018

This PR triggers many flow errors. I've done a good faith effort to fix them and succeeded at fixing a few. I don't intend to spend any more time on it. If someone wishes to collaborate with me on this PR and fix the flow errors, I would welcome that.

@k4b7
Copy link
Member

k4b7 commented Aug 17, 2018

@ronkok this is awesome! I'm glad to see it was too difficult. One thing that you could do to make it easier for people collaborate on this PR is push your branch to Khan/KaTeX and create a PR from that. This will allow other people with push access to pushing updates to this PR without having to create PRs against your fork.

@edemaine
Copy link
Member

edemaine commented Aug 17, 2018

@kevinbarabash We can push directly to ronkok/KaTeX as follows:

git remote add ronkok git@github.com:ronkok/KaTeX.git
git fetch ronkok
git checkout ronkok/includegraphics
...make changes, commit...
git push ronkok includegraphics  # I think this is right; I might be off

@edemaine
Copy link
Member

I feel like embedding arbitrary images is also a potential security hole, in that it could be used for tracking users. This is a third situation (I believe) where we might want some kind of security option. (The others are #1437 and #1596.)

In this case, we should probably also restrict the URLs to use protocols from the allowedProtocols setting. But I also fear that's not enough -- links are relatively safe, while embeds less so...

@k4b7
Copy link
Member

k4b7 commented Aug 17, 2018

@edemaine I didn't realize that. I would've assumed that @ronkok would've needed to explicitly grant push access to his repo, but maybe we have push access to that specific branch due b/c it's being used as a PR against this repo. I'll give that a try this evening.

@codecov
Copy link

codecov bot commented Aug 30, 2018

Codecov Report

Merging #1620 into master will decrease coverage by 0.21%.
The diff coverage is 83.15%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1620      +/-   ##
==========================================
- Coverage   93.89%   93.67%   -0.22%     
==========================================
  Files          79       80       +1     
  Lines        4571     4666      +95     
  Branches      796      813      +17     
==========================================
+ Hits         4292     4371      +79     
- Misses        247      261      +14     
- Partials       32       34       +2
Flag Coverage Δ
#screenshotter 88.62% <82.1%> (+0.06%) ⬆️
#test 85.04% <73.68%> (-0.24%) ⬇️
Impacted Files Coverage Δ
src/functions.js 100% <ø> (ø) ⬆️
src/mathMLTree.js 98.3% <ø> (ø) ⬆️
src/parseNode.js 84.21% <ø> (ø) ⬆️
src/Parser.js 97.23% <100%> (+0.03%) ⬆️
src/domTree.js 88.58% <59.09%> (-4.01%) ⬇️
src/functions/includegraphics.js 89.85% <89.85%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f628ca1...3df6573. Read the comment docs.

import type {CssStyle} from "../domTree";

// TODO: make it easier for commands to get the macro expanded text
// from inside an ordgroup argument.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@edemaine how difficult do you think this would be? The code below for stringFromParseGroup is incomplete since it doesn't handle nested groups. Also, it seems inefficient b/c we're traversing the tree when we should be able to slice part of the expanded token stream and more easily concatenate those tokens into a string.

import mathMLTree from "../mathMLTree";
import {assertNodeType, checkNodeType} from "../parseNode";

const stringFromParseGroup = function(textArray: [ParseNode<"textord">]): string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ronkok [T] means a 1-tuple of type T where as T[] is an array of T.

const style = {height: height + depth + "em"};
const style: CssStyle = {height: height + depth + "em"};
if (width > 0) {
style.width = width + "em";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flow errors here stemmed from the fact that when you create an object literal, flow considers that to be its type and complains about adding new properties after the fact. To get around this you have to specify the type explicitly.

const width = calculateSize(group.width, options);
node.setAttribute("width", width + "em");
}
node.setAttribute("src", group.path);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flow actually helped me find this error.

Copy link
Member

@k4b7 k4b7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ronkok sorry for the delay. I'm requesting the use of "url" (if possible) for one for one of the arg types and tests for the failure modes.

|\implies|$P\implies Q$|`P\implies Q`|
|\in|$\in$||
|\includegraphics|<span style="color:firebrick;">Not supported</span>|[Issue #898](https://github.com/Khan/KaTeX/issues/898)|
|\includegraphics|$\includegraphics[height=0.8em, totalheight=0.9em, width=0.9em, alt=KA logo]{https://cdn.kastatic.org/images/apple-touch-icon-57x57-precomposed.new.png}$|`\includegraphics[height=0.8em, totalheight=0.9em, width=0.9em, alt=KA logo]{https://cdn.kastatic.org/images/apple-touch-icon-57x57-precomposed.new.png}`|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this also works with relative URLs as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked this on https://deploy-preview-1620--katex.netlify.com/docs/next/support_table.html#i and it isn't rendering. I think updating this branch should fix it since #1631 was merged.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disregard my comment about this not working on netlify, I was looking in the wrong spot. 😅

props: {
numArgs: 1,
numOptionalArgs: 1,
argTypes: ["text", "text"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot that we have a url arg type that you could use.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I first wrote this, I tried to specify a url arg type but got an error. I'll try again and see if I can identify the problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding a url arg type. There really isn't any such thing in KaTeX. I've experimented a bit with href.js and replaced argTypes: ["url"] with argTypes: ["text"] and argTypes: ["chaos"]. They both run fine. It doesn't matter what arg type is specified in href.js because Parser.js ignores that specification.

What is really happening is that Lexer.js is preempting the usual logic for functions \href and \url. That urlFunctionRegexString captures both the function name and the first argument group. Parser.js then handles the result as an exception to the regular run of parsing code.

That RegEx, by the way, is not detailed enough to ensure that the argument is a valid url. It seems to be just enough to capture the function name and its first argument group.

So I question the usefulness of changing the code in this PR. The \href example lexes some arbitrary text then parses it as an exception to regular input. This PR does the same thing; it just does it in a different file. And, I would argue, by placing this exception code in the includegraphics.js file, I am grouping exeption code together in manner similar to the way we currently use defineFunction.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ronkok In Parser, there are escaping routines and allowed protocol checks:
https://github.com/Khan/KaTeX/blob/9b7e10e4f753786f125254a79d7cd88acc08afd4/src/Parser.js#L926-L938

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ylemkimon That's an excellent point. I'll move that code into a utility function so that it is available to includegraphics.

If someone wants to refactor Lexer and Parser in an way that creates an actual url arg type, that would be great, but that's not for me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

argTypes: ["chaos"]. They both run fine.

That feels like a bug. @ronkok could you file an issue for this?

if (keyVal.length === 2) {
const str = keyVal[1].trim();
switch (keyVal[0].trim()) {
case "alt":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this as an option.

if (group.totalheight.number > 0) {
depth = calculateSize(group.totalheight, options) - height;
depth = Number(depth.toFixed(2));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, totalheight can be used to vertically shift the image within the group.


const style: CssStyle = {height: height + depth + "em"};
if (width > 0) {
style.width = width + "em";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot what calculateSize did for a second and was confused why we threw away the supplied units, but then remember.

if (width > 0) {
style.width = width + "em";
// Over-rule any max-width: 100% that the img may inherit.
style.maxWidth = width + "em";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we do the same for maxHeight? It seems less likely to be an issue.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or we could add a few lines to the KaTeX CSS that insulate a .katex .img from the inherited .img CSS. Something like what PR #1083 did for SVGs. Then this line of code would be unnecessary.

return node;
},
mathmlBuilder: (group, options) => {
const node = new mathMLTree.MathNode("mglyph", []);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an odd choice since it's not really a glyph. I assume this what other people are doing when they add images to MathML.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an odd choice since it's not really a glyph.

Agreed. I used it because all the other MathML elements seemed even less suited than <mglyph>.

I assume this what other people are doing when they add images to MathML.

I haven't found any examples where people add general images to MathML. For glyphs specfically, then yes, the MathML spec does have a section. But general images? We may be breaking new ground.

it("should produce mords", function() {
expect(getBuilt(img)[0].classes).toContain("mord");
});
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some failing tests to check the exceptions thrown in certain errors cases. Maybe try https://jestjs.io/docs/ro/next/expect#tothrowerrormatchinginlinesnapshot apparently it will update the source code of the test file with a string version of the exception being thrown when you run yarn jest -u.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll work something up.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevinbarabash In order for jest to use inline snapshots, a project must have prettier installed. [ Reference]. I'll look for another way to get some confidence in the exceptions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know that. In that case disregard my suggestion.

@ronkok
Copy link
Collaborator Author

ronkok commented Sep 11, 2018

In the latest commit, I've changed the way that the function matches the url argument. Instead of specifying a text arg type, the PR now uses the same Lexer RegEx as used by the \href and \url functions. Now, URL strings can contain a % character, useful for URL escapes. As an added bonus, we re-use the earlier code that checks the URL protocol.

The PR still does not do anything to address @edemaine's security concerns but I think it is otherwise ready to go.

@ronkok ronkok added the blocked label Sep 13, 2018
@ylemkimon ylemkimon mentioned this pull request Sep 21, 2018
@ronkok ronkok removed the blocked label Oct 23, 2018
@ronkok
Copy link
Collaborator Author

ronkok commented Oct 23, 2018

I've updated this PR to coordinate with the recently added "raw" ArgType. (Thanks!) The graphics are running again and I've eliminated all but one of the flow errors, which is better than my past performance.

I still am unsure about how to address the security issue. I suspect that's a little above my pay grade.

@k4b7
Copy link
Member

k4b7 commented Oct 29, 2018

@ronkok I'll look into this next week.

@k4b7
Copy link
Member

k4b7 commented Oct 29, 2018

Looks like the netlify issue fixed itself. The only thing left an issue with the flow typing.

@ronkok
Copy link
Collaborator Author

ronkok commented Oct 29, 2018

No worries.

return {type: "raw", mode: "text",
string: this.parseStringGroup("raw", optional, true).text};
case "raw": {
const token = this.parseStringGroup("raw", optional, true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ronkok the issue was that the return type for parseStringGroup is ?Token so a null check was required.

string: token.text,
};
} else {
throw new ParseError("Expected raw group", this.nextToken);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ronkok is it possible to construct a test case where this would happen?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevinbarabash Nothing comes readily to mind. I'll give it a try tomorrow.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevinbarabash I can trigger this parse error by writing a Unicode combining character as the first character in the raw string. Do you want me to write that test into katex-spec.js?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Yes please.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test is added. I think I'm done here. Thanks again to @ylemkimon for adding the raw arg type.

@k4b7
Copy link
Member

k4b7 commented Nov 1, 2018

@ronkok thanks for adding this feature. I'm sure people will find some really great uses for it.

@k4b7 k4b7 merged commit f713f32 into KaTeX:master Nov 1, 2018
@edemaine
Copy link
Member

edemaine commented Nov 1, 2018

As I wrote above:

I feel like embedding arbitrary images is also a potential security hole, in that it could be used for tracking users. This is a third situation (I believe) where we might want some kind of security option. (The others are #1437 and #1596.)

I think we really need some kind of security option (like we did with strict) before this should be released. Links and embedded images are quite different, so our existing URL restrictions don't suffice.

@k4b7
Copy link
Member

k4b7 commented Nov 1, 2018

Agree. We should make sure we have a security option before the next release.

@edemaine edemaine mentioned this pull request Nov 1, 2018
@ronkok ronkok deleted the includegraphics branch November 7, 2018 19:18
ylemkimon added a commit that referenced this pull request Feb 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants