DFLib JJava Jupyter Kernel Documentation

JJava is a Java kernel for Jupyter maintained by the DFLib.org community. Internally, the kernel executes Java code via JShell. Some of its additional commands are supported via a syntax similar to IPython magics.
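
To give a feel for what "executes Java code via JShell" means, here is a minimal sketch that drives the JDK's own `jdk.jshell` API directly. It is not JJava's implementation: the class name `JShellSketch` and the helper `evalAll` are made up for illustration, and the "local" execution engine is chosen here only so that snippets run inside the current JVM.

```java
import jdk.jshell.JShell;
import jdk.jshell.Snippet;
import jdk.jshell.SnippetEvent;

class JShellSketch {

    // Evaluates the snippets in order inside one JShell session and returns
    // the string form of the last value produced (or null if there was none).
    static String evalAll(String... snippets) {
        String last = null;
        // "local" runs snippets in the current JVM instead of a forked one
        try (JShell shell = JShell.builder().executionEngine("local").build()) {
            for (String code : snippets) {
                for (SnippetEvent event : shell.eval(code)) {
                    if (event.status() == Snippet.Status.REJECTED) {
                        throw new IllegalArgumentException("Rejected snippet: " + code);
                    }
                    if (event.value() != null) {
                        last = event.value();
                    }
                }
            }
        }
        return last;
    }

    public static void main(String[] args) {
        // State declared by one snippet is visible to the next one
        System.out.println(evalAll("int x = 6;", "x * 7")); // prints 42
    }
}
```

Note that plain JShell reports values as strings. A kernel built on top of it has to do extra work to hand back real objects or rich output.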

Features

The kernel supports the following Jupyter features:

Code execution.
Autocompletion (TAB in Jupyter notebook).
Code inspection (Shift-TAB up to 4 times in Jupyter notebook).
Colored error message displays.
Add maven dependencies at runtime (See also magics).
Display rich output. E.g. here is a chart produced by DFLib and ECharts:

[Image: chart in Jupyter]

eval function. (See also kernel) Note: the signature is Object eval(String) throws Exception. This evaluates the expression (a cell) in the user scope and returns the actual evaluation result instead of a serialized one.
Configurable evaluation timeout

Installation

Prerequisites

Java 11 or newer
If you already have another version of the JJava kernel installed, remove it with the following command:

jupyter kernelspec remove java

Install Python and Jupyter

There are a few ways to install Python and Jupyter, depending on your OS and preferences. Below we provide a few specific recipes to help you get started (especially if you are a Java developer new to the Python environment). But generally, Python is available from its official site, and Jupyter has its own installation instructions.

MacOS

If you are on MacOS, you can install both Python and Jupyter ("lab" and "notebook") with a single Homebrew command:

brew install jupyter

Windows

If you are on Windows, you can install Python using the official installer, and use "pip" for Jupyter:

Go to https://www.python.org/downloads/windows/ and download the latest Python installer
Run the installer
Open Command Prompt (cmd), and run pip install jupyterlab (or pip install notebook if you prefer the "classic" notebook)

Install JJava

Download JJava: go to GitHub releases, pick the latest version (or a specific one that you need) and under the "Assets" section download a file called jjava-${version}-kernelspec.zip
Unzip the file into a temporary location
Run the following command from the parent directory that contains the unzipped kernel folder:
```
jupyter kernelspec install jjava-${version}-kernelspec --user --name=java
```
The above is the most common install recipe. To see all available options, run jupyter kernelspec install --help.

Check that the Java kernel is installed:

jupyter kernelspec list

Available kernels:
  python3    /path/to/python/kernel
  java       /path/to/java/kernel

Running Jupyter

Depending on which Jupyter environment you installed, there will be a specific command to run the notebook. E.g.:

jupyter notebook

jupyter lab

jupyter console --kernel=java

Configuring

JJava kernel behavior can be configured via environment variables. Here is an example on Windows using cmd:

set JJAVA_JVM_OPTS=-Xmx8192m
jupyter lab

And the same example on Linux or MacOS:

export JJAVA_JVM_OPTS=-Xmx8192m
jupyter lab

# alternatively, store it in ".bashrc" so that the
# variable is always set implicitly
# echo 'export JJAVA_JVM_OPTS=-Xmx8192m' >> ~/.bashrc

Sometimes you don’t fully control the startup environment. If that’s the case, you may store the same variables in the kernel.json file that is a part of the JJava installation. To locate this file, run the jupyter kernelspec list command. kernel.json is located in the directory listed for the "java" kernel. In the file, look for the "env" section, and set any number of variables. E.g.:

{
    "argv": [
        "java",
        "-jar",
        "{resource_dir}/jjava-launcher.jar",
        "{resource_dir}/jjava.jar",
        "{connection_file}"
    ],
    "display_name": "Java (jjava)",
    "language": "java",
    "interrupt_mode": "message",
    "env": {
      "JJAVA_JVM_OPTS" : "-Xmx8192m"
    },
    "metadata": {
    }
}

If you store your variables in kernel.json, be aware that they will be wiped out on subsequent kernel reinstalls. So you will have to do it again after every kernel upgrade.

Environment Variables

Environment variable	Default	Description
`JJAVA_COMPILER_OPTS`	`""`	A space delimited list of command line options that would be passed to the `javac` command when compiling a project. For example `-parameters` to enable retaining parameter names for reflection.
`JJAVA_TIMEOUT`	`"-1"`	A duration specifying a timeout (in milliseconds by default) for a single top level statement. If less than `1` then there is no timeout. If desired, a `java.util.concurrent.TimeUnit` may be given following the duration number (e.g., `"30 SECONDS"`).
`JJAVA_CLASSPATH`	`""`	A file path separator delimited list of classpath entries that should be available to the user code. No matter what OS you are on, this should use forward slash `/` as the file separator. Also, each path may actually be a simple glob.
`JJAVA_STARTUP_SCRIPTS_PATH`	`""`	A file path separator delimited list of `.jshell` scripts to run on startup. No matter what OS you are on, this should use forward slash `/` as the file separator. Also, each path may actually be a simple glob.
`JJAVA_STARTUP_SCRIPT`	`""`	A block of Java code to run when the kernel starts up. This may be something like `'import my.utils.*;'` to set up some default imports, or any other relevant code.
`JJAVA_JVM_OPTS`	`""`	A space-delimited list of command line options that would be passed to the `java` command running the kernel.
`JJAVA_LOAD_EXTENSIONS`	`"1"`	Option that controls autoloading Kernel extensions feature. If you do not want third-party libraries to load anything implicitly you could turn it off by `export JJAVA_LOAD_EXTENSIONS=0`
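
The `JJAVA_TIMEOUT` format described above (a number, optionally followed by a `TimeUnit` name) can be read as in the following sketch. This is only an illustration of the documented rules, not the kernel's actual parser. The class `TimeoutSketch` and the method `toMillis` are hypothetical names, and accepting lowercase unit names is an assumption.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

class TimeoutSketch {

    // Converts a value such as "-1", "500" or "30 SECONDS" to milliseconds.
    // Returns -1 when the duration is less than 1, meaning "no timeout".
    static long toMillis(String spec) {
        String[] parts = spec.trim().split("\\s+");
        long duration = Long.parseLong(parts[0]);
        if (duration < 1) {
            return -1L;
        }
        // Milliseconds are the default unit when none is given
        TimeUnit unit = parts.length > 1
                ? TimeUnit.valueOf(parts[1].toUpperCase(Locale.ROOT))
                : TimeUnit.MILLISECONDS;
        return unit.toMillis(duration);
    }

    public static void main(String[] args) {
        System.out.println(toMillis("30 SECONDS")); // prints 30000
        System.out.println(toMillis("-1"));         // prints -1
    }
}
```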

Glob Syntax

Variables that support this glob syntax may reference a set of files with a single path-like string. Basic glob queries are supported including:

* to match 0 or more characters up to the next path boundary /
? to match a single character
A path ending in / implicitly adds a * to match all files in the resolved directory

Any relative paths are resolved from the notebook server’s working directory. For example, the glob *.jar will match all jars in the directory that the jupyter notebook command was run from.

Users on any OS should use / as a path separator.
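
The matching rules above closely resemble the glob syntax built into `java.nio.file`, so they can be explored with a small sketch. This is an approximation for experimenting, not the kernel's own matcher: `GlobSketch` and `matches` are hypothetical names, and the NIO syntax supports extra constructs (such as `**` and `{a,b}`) that JJava may not.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

class GlobSketch {

    // Tests a path string against a glob without touching the file system.
    static boolean matches(String glob, String path) {
        // A glob ending in "/" implicitly matches every file in that directory
        String pattern = glob.endsWith("/") ? glob + "*" : glob;
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + pattern);
        return matcher.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        System.out.println(matches("*.jar", "a.jar"));         // prints true
        System.out.println(matches("*.jar", "lib/a.jar"));     // prints false: * stops at /
        System.out.println(matches("lib/", "lib/a.jar"));      // prints true
        System.out.println(matches("lib/?.jar", "lib/ab.jar")); // prints false: ? is one character
    }
}
```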

Notebook Functions

JJava injects a number of static helper methods to access the runtime environment. Here are some examples…

kernel()

The kernel() function returns the kernel object associated with the notebook. Use it when you need to access JJava runtime internals (such as JShell) for any reason:

JavaKernel kernel = kernel();

The kernel() function is a static import provided by default in each notebook.

display(..)

display(..) lets you print objects from anywhere in the cell (not just the last cell statement). It differs from System.out.println(..) in that it applies custom renderers to the displayed objects (e.g., a DFLib DataFrame will be displayed as a properly truncated table).

display(o);

You can also pass a MIME type to display(..), e.g., to render a piece of text as HTML or Markdown:

display("<b>bold</b>", "text/html");
display("_italic_", "text/markdown");

updateDisplay(..)

You can update existing display elements with updateDisplay(..). Here is an example that will show a refreshing countdown:

String id = display("Countdown: 3");
for (int i = 3; i >= 0; i--) {
    updateDisplay(id, "Countdown: " + i);
    Thread.sleep(1000L);
}

display("Liftoff!");

Magics

"Magics" is an IPython concept adopted by JJava kernel. There are "line" and "cell" magics. Both are syntactic sugar for invoking special named functions. Line magics are single-line function calls, with magic name prefixed with %, and arguments separated by spaces:

%mavenRepo snapshots https://s01.oss.sonatype.org/content/repositories/snapshots/
%maven org.dflib:dflib-jupyter:2.0.0-SNAPSHOT

Cell magics are also function calls, but prefixed with %% and followed by an optional list of arguments. The entire remaining cell body is treated as the last argument:

%%time

var v1 = doSomething();
var v2 = doSomethingElse();
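
The line and cell magic shapes described above can be pictured with a small parsing sketch. It only illustrates the syntax (name, space-separated arguments, optional body) and is not how JJava parses magics internally. `MagicSketch` and its fields are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MagicSketch {
    final String name;
    final List<String> args;
    final String body; // remaining cell text for cell magics, null for line magics

    MagicSketch(String name, List<String> args, String body) {
        this.name = name;
        this.args = args;
        this.body = body;
    }

    // Splits "%name arg1 arg2" or "%%name arg1\nbody..." into its parts.
    static MagicSketch parse(String cell) {
        boolean isCellMagic = cell.startsWith("%%");
        if (!isCellMagic && !cell.startsWith("%")) {
            throw new IllegalArgumentException("Not a magic: " + cell);
        }
        int eol = cell.indexOf('\n');
        String firstLine = eol < 0 ? cell : cell.substring(0, eol);
        String[] tokens = firstLine.substring(isCellMagic ? 2 : 1).trim().split("\\s+");
        List<String> args = new ArrayList<>(Arrays.asList(tokens).subList(1, tokens.length));
        String body = null;
        if (isCellMagic) {
            body = eol < 0 ? "" : cell.substring(eol + 1);
        }
        return new MagicSketch(tokens[0], args, body);
    }

    public static void main(String[] args) {
        MagicSketch line = parse("%maven org.dflib:dflib-jupyter:2.0.0-SNAPSHOT");
        System.out.println(line.name + " " + line.args); // prints maven [org.dflib:dflib-jupyter:2.0.0-SNAPSHOT]

        MagicSketch cell = parse("%%time\nvar v1 = doSomething();");
        System.out.println(cell.name + " -> " + cell.body); // prints time -> var v1 = doSomething();
    }
}
```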

JJava has a number of built-in magics that help with notebook dependency management, profiling cell code, etc. You can also register your own magics if needed. The following magics are available out of the box:

%classpath

A line magic that adds entries to the notebook classpath. Arguments are glob paths to entries on the local file system. Entries can be either directories or jars.

%load

A line magic that loads and executes another Java source or a notebook file within this notebook. This provides a simple code modularization and reuse mechanism within Jupyter. E.g., you can have a notebook with some common functions, "importing" its contents to other notebooks like this:

%load ../lib/common.ipynb

Again, this is to include Java sources. For binary dependencies use magics like %classpath and %maven.

%loadFromPOM, %%loadFromPOM

Loads any dependencies mentioned in the referenced POM. Can be specified as either a line or a cell magic. Ignores repositories added with %mavenRepo as the POM would likely have its own set of repos. May be helpful when you need to copy and paste maven POM fragments from third-party documentation.

Line magic arguments:

path to local POM file
list of scope types to filter the dependencies by. Defaults to compile, runtime, system, and import if not supplied.

Cell magic arguments:

varargs list of scope types to filter the dependencies by. Defaults to compile, runtime, system, and import.
body: A partial POM literal.

If the body is an XML <project> tag, then the body is used as a POM without modification. Otherwise, the magic attempts to build a POM based on the XML fragments it gets. <modelVersion>, <groupId>, <artifactId>, and <version> are given default values if not supplied. All children of <dependencies> and <repositories> are collected along with any loose <dependency> and <repository> tags.

E.g., to add a dependency not in Central, simply add a valid <repository> and <dependency> and the magic will take care of putting it together into a POM:

%%loadFromPOM
<repository>
  <id>snapshots</id>
  <url>https://s01.oss.sonatype.org/content/repositories/snapshots/</url>
</repository>

<dependency>
  <groupId>org.dflib</groupId>
  <artifactId>dflib-jupyter</artifactId>
  <version>2.0.0-SNAPSHOT</version>
</dependency>
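
The fragment-to-POM assembly described above could look roughly like the sketch below. It is a simplified illustration, not the magic's real implementation: it uses regular expressions rather than an XML parser, always fills in placeholder project coordinates (the `sketch.*` values are invented), and the names `PomFragmentSketch`, `collect`, and `toPom` are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PomFragmentSketch {

    // Gathers every <tag>...</tag> element, whether loose or nested in a wrapper.
    static String collect(String body, String tag) {
        Matcher m = Pattern.compile("<" + tag + ">.*?</" + tag + ">", Pattern.DOTALL).matcher(body);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            out.append("    ").append(m.group()).append('\n');
        }
        return out.toString();
    }

    // Returns a full <project> untouched; otherwise wraps the fragments in a POM.
    static String toPom(String body) {
        String trimmed = body.trim();
        if (trimmed.startsWith("<project")) {
            return trimmed;
        }
        return "<project>\n"
                + "  <modelVersion>4.0.0</modelVersion>\n"
                + "  <groupId>sketch.group</groupId>\n"          // placeholder default
                + "  <artifactId>sketch-artifact</artifactId>\n" // placeholder default
                + "  <version>1</version>\n"                     // placeholder default
                + "  <repositories>\n" + collect(trimmed, "repository") + "  </repositories>\n"
                + "  <dependencies>\n" + collect(trimmed, "dependency") + "  </dependencies>\n"
                + "</project>\n";
    }

    public static void main(String[] args) {
        System.out.println(toPom(
                "<dependency><groupId>org.dflib</groupId>"
                + "<artifactId>dflib-jupyter</artifactId>"
                + "<version>2.0.0-SNAPSHOT</version></dependency>"));
    }
}
```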

%maven

A line magic that adds binary dependencies to the notebook classpath from the Maven Central (or a user-defined) repository. Transitive dependencies are also added. Arguments should follow this format: groupId:artifactId:[packagingType:[classifier]]:version. E.g.:

%maven org.dflib:dflib-jupyter:2.0.0-M5
%maven net.snowflake:snowflake-jdbc:3.22.0
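
To make the coordinate format concrete, here is a minimal sketch that splits a `%maven` argument into its parts. It is an illustration only, not the kernel's resolver. `MavenCoordinateSketch` is a hypothetical name, and defaulting the packaging type to "jar" and the classifier to an empty string is an assumption borrowed from common Maven conventions.

```java
class MavenCoordinateSketch {
    final String groupId;
    final String artifactId;
    final String packaging;
    final String classifier;
    final String version;

    MavenCoordinateSketch(String groupId, String artifactId, String packaging,
                          String classifier, String version) {
        this.groupId = groupId;
        this.artifactId = artifactId;
        this.packaging = packaging;
        this.classifier = classifier;
        this.version = version;
    }

    // Accepts g:a:v, g:a:packaging:v and g:a:packaging:classifier:v
    static MavenCoordinateSketch parse(String spec) {
        String[] p = spec.split(":");
        if (p.length < 3 || p.length > 5) {
            throw new IllegalArgumentException("Bad coordinates: " + spec);
        }
        String packaging = p.length >= 4 ? p[2] : "jar";  // assumed default
        String classifier = p.length == 5 ? p[3] : "";    // assumed default
        // The version is always the last segment
        return new MavenCoordinateSketch(p[0], p[1], packaging, classifier, p[p.length - 1]);
    }

    public static void main(String[] args) {
        MavenCoordinateSketch c = parse("org.dflib:dflib-jupyter:2.0.0-M5");
        System.out.println(c.groupId + " / " + c.artifactId + " / " + c.version);
        // prints org.dflib / dflib-jupyter / 2.0.0-M5
    }
}
```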

%mavenRepo

A line magic that adds a Maven repository for the benefit of the %maven magic that will use it to search for dependencies. Takes two arguments: <repo_id> <repo_url>.

%mavenRepo snapshots https://s01.oss.sonatype.org/content/repositories/snapshots/

%time, %%time

Measures and prints code execution CPU and wall time. The measurements only cover code execution. Extra time it takes to "compile" each snippet within JShell is not included. The magic can be used with either line or cell syntax:

%time 10 * 10

CPU times: user 0.000 s, sys 0.000 s, total 0.000 s
Wall time: 0.000 s
Out[1]: 100

%%time
long val = 1;
for(long i = 0; i < 100_000_000; i++) {
    val += i * 10;
}
val

CPU times: user 0.078 s, sys 0.001 s, total 0.078 s
Wall time: 0.078 s
Out[2]: 49999999500000001
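
For readers curious about the difference between the CPU and wall figures, the following sketch measures both around a piece of code using standard JDK facilities. It is not the magic's implementation. `TimeSketch`, `Timed`, `time`, and `loop` are hypothetical names, and CPU time is reported as -1 where the JVM cannot measure it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.function.Supplier;

class TimeSketch {

    static final class Timed<T> {
        final T value;
        final long cpuNanos;  // CPU time used by the current thread, or -1 if unavailable
        final long wallNanos; // elapsed real time

        Timed(T value, long cpuNanos, long wallNanos) {
            this.value = value;
            this.cpuNanos = cpuNanos;
            this.wallNanos = wallNanos;
        }
    }

    // Runs the code once and records CPU and wall-clock time around it.
    static <T> Timed<T> time(Supplier<T> code) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = threads.isCurrentThreadCpuTimeSupported();
        long cpuStart = cpuSupported ? threads.getCurrentThreadCpuTime() : -1L;
        long wallStart = System.nanoTime();
        T value = code.get();
        long wall = System.nanoTime() - wallStart;
        long cpu = cpuSupported && cpuStart >= 0
                ? threads.getCurrentThreadCpuTime() - cpuStart
                : -1L;
        return new Timed<>(value, cpu, wall);
    }

    // Same loop as the %%time cell above
    static Long loop() {
        long val = 1;
        for (long i = 0; i < 100_000_000; i++) {
            val += i * 10;
        }
        return val;
    }

    public static void main(String[] args) {
        Timed<Long> t = time(TimeSketch::loop);
        System.out.printf("CPU time: %.3f s%n", t.cpuNanos / 1e9);
        System.out.printf("Wall time: %.3f s%n", t.wallNanos / 1e9);
        System.out.println(t.value);
    }
}
```

Wall time includes any period the thread spends waiting (e.g., in Thread.sleep), while CPU time does not, which is why the two numbers can diverge.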

Jupyter-Aware Libraries

If you are writing a Java library that is specifically intended to work inside Jupyter, JJava provides a way for such a library to execute custom code when the kernel loads it. E.g., for user convenience it might add some library-specific import statements to the environment.

There are two pieces that need to be present in the library .jar for the above to work. The first is a Java class implementing org.dflib.jjava.jupyter.Extension interface:

package my.lib;

import org.dflib.jjava.jupyter.Extension;
import org.dflib.jjava.jupyter.kernel.BaseKernel;

public class MyLibExtension implements Extension {

    @Override
    public void install(BaseKernel kernel) {
        try {
            // Adds common imports related to the library
            kernel.eval("import my.lib.*");
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}

The second piece is a file declaring this extension to the kernel. The file must have this exact location and name - META-INF/services/org.dflib.jjava.jupyter.Extension - placed on the classpath (e.g. under src/main/resources of a Maven project). It must contain a single line of text corresponding to the fully-qualified name of the Java class above:

my.lib.MyLibExtension

As long as those two files are in the library jar, adding the .jar as a notebook dependency (e.g. as %maven my.lib:my-lib:1.0.0) will cause an execution of the install(..) method.

JJava core itself provides an extension that implicitly loads common imports like java.io, java.time, etc. in every notebook.

Another, more ad-hoc mechanism to load custom code per Jupyter instance (instead of per-notebook and per-library) is setting the JJAVA_STARTUP_SCRIPTS_PATH environment variable to point to a custom JShell script, or setting JJAVA_STARTUP_SCRIPT to an inline block of Java code.

To disable all custom extensions for a given Jupyter process, including the JJava core extension, you can set the JJAVA_LOAD_EXTENSIONS variable to 0:

export JJAVA_LOAD_EXTENSIONS=0

Notebooks and Version Control

This is a common version control hint for Jupyter notebooks, not specific to Java.

Jupyter places data generated by the notebook in the notebook itself. Sometimes this is the desired behavior (e.g. when you want to share the notebook results with your audience via GitHub), but very often this is just a nuisance. Assuming you are using Git, you may automatically strip off the data before each commit by using Git hooks. Here is one possible recipe:

Add a script for stripping off outputs and execution counts somewhere in your repo. In our example it will be in bin/clean_ipynb.py:

#!/usr/bin/env python3
import sys
import json

nb = sys.stdin.read()
json_in = json.loads(nb)

def strip_output_from_cell(cell):
    if "outputs" in cell:
        cell["outputs"] = []
    if "execution_count" in cell:
        cell["execution_count"] = None

for cell in json_in["cells"]:
    strip_output_from_cell(cell)

json.dump(json_in, sys.stdout, sort_keys=True, indent=1, separators=(",",": "))

Create a .gitconfig file in the root of your repo:

[filter "clean_ipynb"]
	smudge = cat
	clean = bin/clean_ipynb.py

Create a .gitattributes file in the root of your repo, referencing the filter from .gitconfig:
```
*.ipynb  filter=clean_ipynb
```
All the files above should be version-controlled. Every time a user clones the repo, they will need to execute the following command manually to enable this configuration:
```
git config --local include.path ../.gitconfig
```
