Generate reference images from LaTeX #268

Merged: k4b7 merged 1 commit into KaTeX:master from gagern:texcmp on Jul 7, 2015

@gagern
Collaborator

@gagern gagern commented Jun 25, 2015

In #156 (comment), @KevinB7 wrote that he'd like to see TeX renderings of our examples before proceeding. Here they are. The Docker image could probably still be improved, for instance in terms of size, but it works. I also added a small tool to compute visual diffs, but haven't committed its results. One thing I noticed is that in TeX, \llap and \rlap apparently take their argument in text mode.

@gagern gagern changed the title from "Texcmp" to "Generate reference images from LaTeX" Jun 25, 2015
@k4b7
Member

k4b7 commented Jun 25, 2015

This is awesome! I was working on the square root rendering again and started comparing against LaTeX manually, and noticed that the amount of space under the hrule differs quite a bit between KaTeX and LaTeX. I'll post some screenshots later, but I was wondering if you noticed any differences for the square root test case?

@gagern
Collaborator Author

gagern commented Jun 25, 2015

Which test case, Sqrt or SqrtRoot? I noticed considerable differences for both:

[Image: sqrt]

[Image: sqrtroot]

The horizontal spacing in the second one seems to be largely affected by the proportions of the digits in the root. I guess this is because LaTeX has a separate font for smaller sizes, while KaTeX simply scales the 10pt font down, right?

@gagern
Collaborator Author

gagern commented Jun 25, 2015

> I also added a small tool to compute visual diffs, but haven't committed the results of this.

I just committed and pushed these diffs to a separate branch. I'm not sure if you want them in the repo. I don't think I'd want them there, since they are easily reproduced as long as you have ImageMagick around. But it might be convenient to have a look now.

@sophiebits
Contributor

> I guess this is because LaTeX has a separate font for smaller sizes, while KaTeX simply scales the 10pt font down, right?

This certainly causes some deviations. We tend to just ignore it.

@k4b7
Member

k4b7 commented Jun 25, 2015

@gagern I was looking at Sqrt; thanks for posting SqrtRoot too, and for posting all of the screenshots. It's nice to know where we're at in terms of LaTeX compliance for current features. I agree we can just regenerate these as needed. When we improve any of the renderings we'll update the screenshotter files. I'm not sure we need to store the -pdflatex.png images either.

@gagern
Collaborator Author

gagern commented Jun 26, 2015

> I'm not sure we need to store the -pdflatex.png images either.

I'm not sure how different those are from the *-firefox.png files. Both can be recreated exactly from the sources with the help of Docker, which requires a working Docker setup and a considerable amount of disk space for the relevant images. Is there any reason to keep the KaTeX snapshots but not their reference images? Is it for the sake of documenting changes to the code base which affect screenshots?

@k4b7
Member

k4b7 commented Jun 26, 2015

The difference is that we use *-firefox.png as reference files to identify unintended changes in the rendering. We do this by running the snapshotter and seeing if there are any differences in the files according to git. This process should be automated so that we can fail the build if there are differences, but right now it's a manual process.

If there are, either the code should be fixed so the rendering matches or the reference file should be updated because we fixed/improved the rendering.

On the other hand, *-pdflatex.png images aren't being compared against old versions of themselves, so we don't need to have them in the repo.
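
A minimal sketch of how the manual check described above might be automated, assuming a Node script that runs after the snapshotter. The directory path and the function names (`changedFiles`, `firefoxScreenshots`, `checkScreenshots`) are hypothetical and not part of KaTeX; only `git status --porcelain` and Node's `child_process.execFileSync` are existing interfaces.

```javascript
const { execFileSync } = require("child_process");

// Hypothetical location of the reference screenshots (an assumption).
const SCREENSHOT_DIR = "test/screenshotter/images";

// Turn `git status --porcelain` output into a list of paths.
// Each line is two status characters, a space, then the path.
function changedFiles(porcelainOutput) {
    return porcelainOutput
        .split("\n")
        .filter((line) => line.length > 3)
        .map((line) => line.slice(3));
}

// Keep only the Firefox reference screenshots.
function firefoxScreenshots(paths) {
    return paths.filter((p) => p.endsWith("-firefox.png"));
}

// Ask git which reference screenshots differ from what is committed.
function checkScreenshots(dir = SCREENSHOT_DIR) {
    const out = execFileSync(
        "git", ["status", "--porcelain", "--", dir],
        { encoding: "utf8" });
    return firefoxScreenshots(changedFiles(out));
}
```

A build step could call `checkScreenshots()` and set `process.exitCode = 1` when the returned list is non-empty, so that an unintended rendering change fails the build.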

@k4b7
Member

k4b7 commented Jun 26, 2015

I was looking through some of the -pdflatex.png renderings and some of them are clipped. Do you know why this is happening?

@gagern
Collaborator Author

gagern commented Jun 26, 2015

> some of them are clipped.

That's because I'm aligning the images to maximize overlap, but using the canvas from the Firefox images for reference. So if after that overlap-maximizing alignment, the TeX version extends beyond the margin of the Firefox screenshot, it will get clipped. I fear that without that alignment step, many of the images would have some constant offset over large parts of the canvas, making relative changes harder to see.
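
To illustrate the overlap-maximizing alignment described above, here is a small sketch. It is not taken from texcmp.js: it uses a brute-force search over shifts (an FFT-based cross-correlation would compute the same scores faster), and the image representation (2D arrays of ink values between 0 and 1) and the names `overlapScore` and `bestOffset` are assumptions made for the example.

```javascript
// Images are row-major 2D arrays of ink values in [0, 1] (1 = fully inked).

// Sum of ink products when `b` is placed at offset (dx, dy) on top of `a`.
function overlapScore(a, b, dx, dy) {
    let score = 0;
    for (let y = 0; y < b.length; y++) {
        const ay = y + dy;
        if (ay < 0 || ay >= a.length) continue;
        for (let x = 0; x < b[y].length; x++) {
            const ax = x + dx;
            if (ax < 0 || ax >= a[ay].length) continue;
            score += a[ay][ax] * b[y][x];
        }
    }
    return score;
}

// Try every shift up to maxShift pixels in each direction and keep
// the first one with the highest overlap score.
function bestOffset(a, b, maxShift) {
    let best = { dx: 0, dy: 0, score: -Infinity };
    for (let dy = -maxShift; dy <= maxShift; dy++) {
        for (let dx = -maxShift; dx <= maxShift; dx++) {
            const score = overlapScore(a, b, dx, dy);
            if (score > best.score) {
                best = { dx, dy, score };
            }
        }
    }
    return best;
}

const reference = [
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
];
const candidate = [
    [1, 0],
    [1, 0],
];
console.log(bestOffset(reference, candidate, 3)); // { dx: 2, dy: 1, score: 2 }
```

In this toy case, any part of the candidate that lands outside the 4×4 reference canvas after shifting would be cut off, which is the clipping effect discussed here.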

@k4b7
Member

k4b7 commented Jun 28, 2015

@gagern I would like to merge this in. I'll try to do a better review next week. Could you add some comments in texcmp.js to describe what's going on? I assume that you're using an FFT to align the images.

I was looking at https://github.com/gagern/KaTeX/blob/diffs/test/screenshotter/diff/SizingBaseline.png and realized that the reason why it's so off is because LaTeX doesn't support sizing inside math. And with respect to https://github.com/gagern/KaTeX/blob/diffs/test/screenshotter/diff/NestedFractions.png I see that it's lining things up with the bigger fractions and b/c our spacing is wrong the LaTeX screenshot gets clipped.

Could you make it so that the bounding box of the diffs is the union of the bounding boxes of the images being compared?

Also, if there are things that we know are going to yield invalid comparisons, e.g. sizing commands inside math, can we skip those examples?
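
The union-of-bounding-boxes request above can be sketched as follows. This is only an illustration under assumed conventions: boxes are plain objects in pixel coordinates with exclusive right and bottom edges, the offset (dx, dy) is whatever the alignment step produced, and `boxOf`, `unionBox` and `paddingFor` are hypothetical helper names.

```javascript
// A box is { left, top, right, bottom }; right and bottom are exclusive.
function boxOf(width, height, dx = 0, dy = 0) {
    return { left: dx, top: dy, right: dx + width, bottom: dy + height };
}

// Smallest box that contains both p and q.
function unionBox(p, q) {
    return {
        left: Math.min(p.left, q.left),
        top: Math.min(p.top, q.top),
        right: Math.max(p.right, q.right),
        bottom: Math.max(p.bottom, q.bottom),
    };
}

// Padding (in pixels) to add on each side of `box` so it fills `union`.
function paddingFor(box, union) {
    return {
        left: box.left - union.left,
        top: box.top - union.top,
        right: union.right - box.right,
        bottom: union.bottom - box.bottom,
    };
}

// Example: a 100x50 KaTeX screenshot and a 120x40 LaTeX rendering
// that the alignment step placed at offset (-10, 20).
const katexBox = boxOf(100, 50);
const texBox = boxOf(120, 40, -10, 20);
const union = unionBox(katexBox, texBox);
console.log(union);                       // { left: -10, top: 0, right: 110, bottom: 60 }
console.log(paddingFor(katexBox, union)); // { left: 10, top: 0, right: 10, bottom: 10 }
```

Padding both images to the union before comparing them means neither rendering is clipped, at the cost of a canvas that differs from the original screenshot size.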

@k4b7
Member

k4b7 commented Jun 28, 2015

~~I think the reason why the exponents are a bit off is b/c LaTeX actually stretches them horizontally a bit (or uses different glyphs).~~ I really need to re-read the thread before posting.

@sophiebits
Contributor

It uses different glyphs. Same as #268 (comment).

@gagern
Collaborator Author

gagern commented Jun 28, 2015

> I would like to merge this in.

Without the actual images, right? Will rewrite when I find the time.

> Could you make it so that the bounding box of the diffs is the union of the bounding boxes of the images being compared?

The way I have implemented things now, I rasterize the PDF to a temporary file, then align that to produce the *-pdflatex.png. So at that point, the clipping has already occurred. The reason I do it this way is that I sometimes want to flip between two renderings and see what changes. If I move the alignment from the code running inside the Docker container to the diff step, that will no longer be as easy. What do you think?

> can we skip those examples?

Sure, we can add those to the blacklist.

@sophiebits
Contributor

Can we just add a little padding to the katex screenshots so that the pdflatex ones don't later get cut off?

@gagern
Collaborator Author

gagern commented Jun 28, 2015

Some of them are nearly exhausting their 1024×768 screenshot size, e.g. OpLimits and Spacing. If we add margin or padding to the test.html, these might start scrolling unless we increase the screenshot size. Is that desirable?

@k4b7
Member

k4b7 commented Jun 28, 2015

> Without the actual images, right? Will rewrite when I find the time.

That's right. Whenever you have time.

I think @spicyj meant expanding the bounds of the KaTeX images. After alignment we should know the difference in bounds and be able to pad either image accordingly. We probably want to keep the existing reference screenshots as is, so this would be another set of temporary images. These images could be used for flipping between renderings and for generating visual diffs without clipping any of them.

The clipping doesn't need to be perfect for the images to be useful as indicated by the number of new issues opened due to rendering differences. I think we should merge it and improve the clipping later.

@sophiebits
Contributor

Well, whichever works. We could probably reduce the font size and be okay if we're running out of room.

@gagern gagern mentioned this pull request Jun 29, 2015
gagern added a commit to gagern/KaTeX that referenced this pull request Jun 29, 2015
The same test cases we use for our screenshots from Firefox are now also
being rendered by pdflatex, so the resulting images can be used as reference
for how things are supposed to look (if we concentrate on compatibility with
LaTeX).  To make comparisons even easier, the differences between LaTeX and
Firefox snapshots are rendered in a visual way, using different colors.

Discussed in pull request KaTeX#268.
@gagern
Collaborator Author

gagern commented Jun 29, 2015

OK, rewrote things, incorporating your requests. The rasterized images are now not artificially aligned, but use the dimensions of the underlying document. The alignment is computed but used only for the diffs. So these are indeed rendered to the union of both pictures, then trimmed to the area that's actually being used. I added some comments about what's happening in texcmp.js. And I omitted the resulting images, although I added them on another branch. Hope this helps.

I also implemented some fake math sizing commands (although they still don't work with fractions), a \blue command and something to handle display mode, pre and post text chunks. This makes the images resemble the ones from Firefox much better. The blacklist now contains only three items, namely Colors, DeepFontSizing and KaTeX.
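
As a rough illustration of the "union, then trim" diff rendering described in this comment, the following sketch colors the differences between two images that have already been aligned and padded to the same size. The color assignment (KaTeX-only ink in cyan, LaTeX-only ink in red, shared ink in black), the array-based image representation, and the names `diffImage` and `trimToInk` are assumptions made for the example; they are not taken from texcmp.js.

```javascript
// Inputs: two equally sized 2D arrays of ink values in [0, 1].
// Output: 2D array of [r, g, b] pixels (0..255) on a white background.
function diffImage(katex, tex) {
    return katex.map((row, y) => row.map((k, x) => {
        const t = tex[y][x];
        const red = Math.round(255 * (1 - k));    // KaTeX ink removes red
        const other = Math.round(255 * (1 - t));  // LaTeX ink removes green/blue
        return [red, other, other];
    }));
}

// Crop the image to the smallest rectangle containing non-white pixels.
function trimToInk(pixels) {
    let top = Infinity, left = Infinity, bottom = -1, right = -1;
    pixels.forEach((row, y) => row.forEach((p, x) => {
        if (p.some((c) => c !== 255)) {
            top = Math.min(top, y);
            bottom = Math.max(bottom, y);
            left = Math.min(left, x);
            right = Math.max(right, x);
        }
    }));
    if (bottom < 0) return []; // nothing but background
    return pixels
        .slice(top, bottom + 1)
        .map((row) => row.slice(left, right + 1));
}

const katexInk = [[0, 0, 0], [0, 1, 0], [0, 0, 0]];
const texInk = [[0, 0, 0], [0, 1, 1], [0, 0, 0]];
console.log(JSON.stringify(trimToInk(diffImage(katexInk, texInk))));
// [[[0,0,0],[255,0,0]]]
```

With this scheme, pixels where both renderings agree come out black or white, so any remaining color marks a place where the two renderings disagree.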

@k4b7
Member

k4b7 commented Jun 29, 2015

@gagern Awesome, thanks! I'll try to review this evening.

Contributor

What does this do?

Collaborator Author

Disable normal updates, so we don't receive any bug fix releases or similar, just security-relevant fixes. The benefit is that we don't have to pin the versions of the packages we install (which are quite numerous, with all the indirect dependencies pulled in), and can still rely on the same versions getting installed in subsequent runs unless something drastic forces a security update to one of the packages. It's one more step towards better reproducibility.

Member

Can you add a comment here to that effect?

Member

These are files on the host system created by docker, correct?

Collaborator Author

Right. I wonder whether we should be streaming out those files via a TCP connection instead of a --volume mount. Or pre-create the files outside. Or anything else like this. But I see that as a separate pull request, once the core functionality is in place.

Member

Fair enough. Thanks for calling it out in the docs. Pre-creating the files seems like it would be pretty easy to implement.

@gagern
Collaborator Author

gagern commented Jul 6, 2015

If you are happy with all the new changes, I'd squash everything into a single commit so you can merge that.

@k4b7
Member

k4b7 commented Jul 6, 2015

@gagern the new changes look great! Squash it and I'll merge it.

k4b7 added a commit that referenced this pull request Jul 7, 2015
Generate reference images from LaTeX
@k4b7 k4b7 merged commit a06744e into KaTeX:master Jul 7, 2015
@k4b7
Member

k4b7 commented Jul 7, 2015

Thanks for the pull request.

@gagern
Collaborator Author

gagern commented Jul 7, 2015

Good to have this in. Thanks!
