Render Jupyter Notebooks#59685
Conversation
718c87f to
a4b445f
Compare
Introduces a new dependency: - github.com/bevzzz/nb And updates bazel buildfiles. Issue: sourcegraph#57797
Common bluemonday.Policy defined in a shared package, which presents a convenient API for working with the 3 main functions. Key policy rules tested in policy_test.go to prevent regressions. Switched from .Sanitize to .SanitizeReader in markdown.go: bluemonday always uses the latter under the hood, so we just cut to the chase and omit double conversion. sourcegraph#57797
a4b445f to
23ec3ba
Compare
Bump nb version to 0.2.1 (introduces Extension API) and use extensions: - render Markdown cells with goldmark.Markdown. internal/markdown now exposes a pre-configured renderer with markdown.Goldmark(). - add syntax highlighting to code cells with nb-synth and organize common chroma configurations in htmlutils/. - replace ANSI characters in stream output cells with ansihtml. Here I also updated the standard bluemonday.Policy to allow span classes prefixed with 'ansi-' - Added 2 lightweight extensions from goldmark-jupyter to goldmark and nb. This adds support for cell attachments in markdown cells (read more about it in https://github.com/bevzzz/nb/extension/extra/goldmark-jupyter README) - updated BAZEL build config sourcegraph#59685
global-styles/ansi.scss defines styling for classes that ansihtml uses in its colorized spans. For the moment, it does not differentiate between light/dark themes (more of a POC solution). These are only applicable to richHTML in "Markdown" module (haven't yet found a good way to separate rendered Jupyter notebooks into another component), we import them with the @use directiv in Markdown.module.scss. As far I understand, that should limit the scope to which the styles apply. See https://github.com/robert-nix/ansihtml/blob/master/html.go#L288 for reference. sourcegraph#59685
Extended the bluemonday.Policy to allow "jp-" classes in "div" elements. Most of the CSS in jupyter.scss is borrowed from the official Jupyter theme, with a few adaptations to accomodate external classes, e.g. "chroma". sourcegraph#59685
Deleted ipynb_test.go, only used it to debug some things locally.
When rendering Markdown (and Jupyter) files we can load MathJax scripts on the fly and have them render LaTeX/MathML. Spike because I'm not sure if loading external scripts is a good-enough solution. Works quite well though. Doing this server-side does not seem feasible, and I haven't found any reliable Go libraries to do it. MathJax is something of an industry standard and the go-to solution here. NOTE: Markdown.tsx is used in a lot of places, but I've added a parameter to only enable MathJax for "rendered file" usecase, when we are displaying it in code mirror.
|
I think this PR is now ready to be reviewed.
Some changes that I still have in the pipeline for
Those are important, but should not affect this PR much when the changes are released. |
camdencheek
left a comment
There was a problem hiding this comment.
Great work! Left a few comments on the backend side of things. @vovakulikov or @fkling, could you take a look at the client changes?
vovakulikov
left a comment
There was a problem hiding this comment.
Thanks for contribution, I left a few suggestions about MathJax loading, as soon as you can switch to the npm package instead of CDN you have greenlight from my side, thanks again.
vovakulikov
left a comment
There was a problem hiding this comment.
Sorry misclicked Approve and comment radio
misclicked Approve and comment radio
|
Thanks for your review, I'm going to implement these suggestions in the coming days 🙌 |
This more idiomatic way avoids declaring somewhat ambigiously named "once" variables in every package. htmlutil.Policy() is no longer exported and should be used via SanitizeBuffer, SanitizeString, or SanitiseBytes instead. GH sourcegraph#57797
Push "enabled" check to useMathJax and return early if the hook should not be applied. GH sourcegraph#57797
Instead of sourcing the scripts and fonts from the CDN, we bundle required assets and serve them as static files. mathjax-full is huge (approx. 30M), which is alright because it is not imported anywhere and will not be a part of the final bundle. Changes to BUILD files from running 'bazel configure'. Excluded 'pnpm-lock.yaml' from targets for 'check-yaml' pre-commit hook, because one of the added deps uses syntax which pre-commit-hooks falsely identifies as invalid YAML: pre-commit/pre-commit-hooks#984 GH sourcegraph#57797
This: - Strips out the mathjax scripts to avoid added build complexity - Cleans up the CSS so it passes the linters - Moves the styles to a the global styles so this works with both React and Svelte
c621821 to
0ba15e2
Compare
|
Hey @bevzzz! Really sorry about taking so long to get back to ya. I went ahead and just pushed a commit with some changes so we can get this merged. In short:
Will merge as soon as I can appease CI! |
|
Hey @camdencheek! Nvmnd, I am glad you had the time to look at the PR again. Thanks for handling the CI issues too -- some errors there seem to be slightly "above my paygrade" ( |
|
@bevzzz It's merged and live on sourcegraph.com! Sorry, we seem to be having issues merging PRs from forks, so I had to recreate this #62583. Thanks again for all your work on this -- I know at least a few people who will be very excited about the ability to view jupyter notebooks on Sourcegraph 🙂 |
|
That's so cool! I also know a couple of people who're gonna love it :) |
|
Thank you @bevzzz for your contribution! We gave you a shout-out: https://sourcegraph.com/blog/code-search-now-supports-jupyter-notebook-rendering Would love to send you some swag too: community@sourcegraph.com |
|
@jdorfman that's very kind of you! I've sent an email to community@sourcegraph.com. Glad I could contribute :) |
Hi everyone!
This PR uses
nb, a package for rendering Jupyter notebooks, to address #57797.I wrote
nbbecause I could not find an existing implementation in Go and it also seemed like a fun project to do.It's still lacking some important features, but I wanted to get your feedback early on.
Regarding styling and CSS: it is currently mirroring the standard Jupyter theme, and is more of a POC.
It needs some polish and should definitely be reconciled with the existing branding. I could use some help / advice here :)
This discussion has some reference links (I'll be sure to add these examples to the demo when the PR is ready).
@umpox @pdubroy (since you posted the above examples) would you give me some pointers?
It goes without saying that I would really appreciate any feedback regarding the package itself 🙌
Test plan
nbhas good unit test coverageinternal/htmlutils: unit tests inpolicy_test.goMarkdown.tsxTODO
nbtov0.2.0(once it's released) as it will add some rich formatting capabilitiesgoldmark.Markdownto render "markdown" cells in the notebooks