This repository was archived by the owner on Sep 30, 2024. It is now read-only.

Render Jupyter notebooks #62583

Merged
camdencheek merged 37 commits into main from cc/feat/jupyter-notebooks on May 10, 2024

@camdencheek
Member

This is a duplicate of https://github.com/sourcegraph/sourcegraph/pull/59685 because I couldn't get a PR from an external contributor to build in CI.

Test plan

Described in original PR.

bevzzz and others added 30 commits January 31, 2024 11:43

Introduces a new dependency:
- github.com/bevzzz/nb

And updates bazel buildfiles.

Issue: #57797
Common bluemonday.Policy defined in a shared package,
which presents a convenient API for working with the 3 main functions.
Key policy rules tested in policy_test.go to prevent regressions.

Switched from .Sanitize to .SanitizeReader in markdown.go:
bluemonday always uses the latter under the hood,
so we just cut to the chase and omit double conversion.

#57797
Bump nb version to 0.2.1 (introduces Extension API) and use extensions:
- render Markdown cells with goldmark.Markdown. internal/markdown
  now exposes a pre-configured renderer with markdown.Goldmark().
- add syntax highlighting to code cells with nb-synth and organize
  common chroma configurations in htmlutils/.
- replace ANSI escape sequences in stream output cells with ansihtml. Here
  I also updated the standard bluemonday.Policy to allow span classes
  prefixed with 'ansi-'.
- add 2 lightweight extensions from goldmark-jupyter to goldmark
  and nb. This adds support for cell attachments in Markdown cells
  (read more in the README at https://github.com/bevzzz/nb/extension/extra/goldmark-jupyter).
- update the Bazel build config

#59685
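
The 'ansi-' allowance described above boils down to allowlisting class names by prefix. Here is a minimal standalone sketch of that idea using only the standard library. It is not a description of how bluemonday applies its rules: the pattern, the `filterClasses` helper, and the sample class names are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// ansiClass matches a single class name that starts with "ansi-".
// The exact character set after the prefix is an assumption of this sketch.
var ansiClass = regexp.MustCompile(`^ansi-[a-zA-Z0-9_-]+$`)

// filterClasses (hypothetical helper) keeps only the class names in a
// space-separated class attribute that match the allowlist pattern.
func filterClasses(attr string) string {
	var kept []string
	for _, name := range strings.Fields(attr) {
		if ansiClass.MatchString(name) {
			kept = append(kept, name)
		}
	}
	return strings.Join(kept, " ")
}

func main() {
	// Sample class names are made up for the example.
	fmt.Println(filterClasses("ansi-red evil ansi-bold"))
}
```

Anchoring the pattern with `^` and `$` matters here: without the anchors, a class such as `x-ansi-red` would also pass.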
global-styles/ansi.scss defines styling for classes
that ansihtml uses in its colorized spans.
For the moment, it does not differentiate between
light/dark themes (more of a POC solution).

These apply only to richHTML in the "Markdown" module
(I haven't yet found a good way to separate rendered Jupyter notebooks
into another component); we import them with the @use directive
in Markdown.module.scss.
As far as I understand, that should limit the scope to which the styles apply.

See https://github.com/robert-nix/ansihtml/blob/master/html.go#L288 for reference.

#59685
Extended the bluemonday.Policy to allow "jp-" classes in "div" elements.

Most of the CSS in jupyter.scss is borrowed from the official Jupyter theme,
with a few adaptations to accommodate external classes, e.g. "chroma".

#59685
Deleted ipynb_test.go, only used it to debug some things locally.

When rendering Markdown (and Jupyter) files
we can load MathJax scripts on the fly and
have them render LaTeX/MathML.

Spike because I'm not sure if loading external scripts is
a good-enough solution. Works quite well though.
Doing this server-side does not seem feasible,
and I haven't found any reliable Go libraries to do it.
MathJax is something of an industry standard and the
go-to solution here.

NOTE: Markdown.tsx is used in a lot of places, but
I've added a parameter to only enable MathJax for the "rendered file"
use case, when we are displaying it in CodeMirror.

This more idiomatic way avoids declaring somewhat ambiguously named
"once" variables in every package.

htmlutil.Policy() is no longer exported and should be used via
SanitizeBuffer, SanitizeString, or SanitiseBytes instead.

GH #57797
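
One plausible reading of the change above is that the policy is built lazily, exactly once, behind an unexported accessor, and callers only see thin wrapper functions. The sketch below works under several assumptions. It needs Go 1.21+ for `sync.OnceValue`. The `sanitizer` type is a hypothetical stand-in for `*bluemonday.Policy` (it merely escapes HTML). The wrapper signatures are guesses, not the actual htmlutil API.

```go
package main

import (
	"bytes"
	"fmt"
	"html"
	"io"
	"strings"
	"sync"
)

// sanitizer is a hypothetical stand-in for *bluemonday.Policy.
type sanitizer struct{}

// SanitizeReader escapes all markup; a real policy would keep safe tags.
func (sanitizer) SanitizeReader(r io.Reader) *bytes.Buffer {
	raw, err := io.ReadAll(r)
	if err != nil {
		return &bytes.Buffer{}
	}
	return bytes.NewBufferString(html.EscapeString(string(raw)))
}

// policy builds the sanitizer on first use and caches it, so no package
// needs its own sync.Once variable. It is deliberately unexported.
var policy = sync.OnceValue(func() sanitizer {
	return sanitizer{}
})

// sanitizeReader is the single reader-based core the wrappers delegate to.
func sanitizeReader(r io.Reader) *bytes.Buffer {
	return policy().SanitizeReader(r)
}

// SanitizeString wraps the core for string input (assumed signature).
func SanitizeString(s string) string {
	return sanitizeReader(strings.NewReader(s)).String()
}

// SanitizeBytes wraps the core for byte-slice input (assumed signature).
func SanitizeBytes(b []byte) []byte {
	return sanitizeReader(bytes.NewReader(b)).Bytes()
}

func main() {
	fmt.Println(SanitizeString("<b>hi</b>"))
}
```

Keeping the accessor private means the sanitization rules can change in one place without touching call sites.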
Push "enabled" check to useMathJax and
return early if the hook should not be applied.

GH #57797
Instead of sourcing the scripts and fonts from the CDN,
we bundle required assets and serve them as static files.

mathjax-full is huge (approx. 30M), which is alright
because it is not imported anywhere and will not be
a part of the final bundle.

Changes to BUILD files from running 'bazel configure'.

Excluded 'pnpm-lock.yaml' from targets for 'check-yaml' pre-commit hook,
because one of the added deps uses syntax which pre-commit-hooks falsely
identifies as invalid YAML:

pre-commit/pre-commit-hooks#984

GH #57797
Previously chroma highlighting was only used to render
markdown files, so that is where the class names would
be patched. Now, Jupyter renderer relies on them as well,
and we need a single place from where to patch chroma.

Merged all tests in htmlutil to htmlutil_test.go.
Updated bazel BUILD files.

GH #57797
This version supports earlier Jupyter Notebook schemas (v4.0, v3.0 and below)
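
For context, a notebook file declares its schema version in the top-level `nbformat` and `nbformat_minor` fields of its JSON. Below is a rough sketch of reading them with the standard library. `detectVersion` and `notebookVersion` are hypothetical names, and this does not show how nb itself handles older schemas.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// notebookVersion holds the two version fields found at the top level of
// an .ipynb document. All other fields are ignored during decoding.
type notebookVersion struct {
	Major int `json:"nbformat"`
	Minor int `json:"nbformat_minor"`
}

// detectVersion (hypothetical helper) extracts the schema version from
// raw notebook JSON and rejects input without an nbformat field.
func detectVersion(data []byte) (notebookVersion, error) {
	var v notebookVersion
	if err := json.Unmarshal(data, &v); err != nil {
		return v, fmt.Errorf("decode notebook header: %w", err)
	}
	if v.Major == 0 {
		return v, errors.New("missing nbformat field")
	}
	return v, nil
}

func main() {
	// A stripped-down v3-style document, made up for the example.
	doc := []byte(`{"nbformat": 3, "nbformat_minor": 0, "worksheets": []}`)
	v, err := detectVersion(doc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("notebook schema v%d.%d\n", v.Major, v.Minor)
}
```

A renderer could branch on the major version before choosing how to decode the cells.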

This:
- Strips out the MathJax scripts to avoid added build complexity
- Cleans up the CSS so it passes the linters
- Moves the styles to the global styles so this works with both React
  and Svelte

cla-bot added the cla-signed label May 9, 2024
camdencheek marked this pull request as ready for review May 9, 2024 22:53
camdencheek requested a review from a team May 9, 2024 22:59

jasonhawkharris left a comment

Contributor

LGTM!

camdencheek merged commit a4a1111 into main May 10, 2024
camdencheek deleted the cc/feat/jupyter-notebooks branch May 10, 2024 16:21

3 participants