Disable Jena IRI/literal validation by default#204

Merged
niegrzybkowski merged 5 commits into main from GH-196/reduce-validation
Aug 25, 2025

@Ostrzyciel Ostrzyciel commented Aug 25, 2025

Closes #196

Validating IRIs and literals can take up as much as ~30–40% of the runtime of rdf to-jelly. Most of the time we don't need it at all, because we only care about converting between formats, not about whether IRI schemes are valid or whether 30th February is a real date.

This disables validation by default in the to-jelly and from-jelly commands, with the largest benefit being in the first case (to-jelly). Jelly by itself already bypasses some of these validations.

Lazy mapping of literals into value space should be fine, but we can't easily enable that from outside of Jena, so I wrote a small routine that uses reflection to switch it on. It may fail if reflection is not supported, e.g., if the JVM doesn't support it or Graal is misconfigured. We detect that and ignore it; the program will still work. You can run jelly-cli version to see if reflection (and, thus, maximum speed) is available on your platform.
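
For readers unfamiliar with the pattern, here is a minimal sketch of a reflection-based toggle that degrades gracefully. It is not the code from this PR: `LibrarySettings` and its `eagerValidation` field are hypothetical stand-ins for a library-internal flag, not Jena's actual internals. The routine reports failure with a boolean so the caller can carry on without the optimization.

```java
import java.lang.reflect.Field;

final class ReflectionToggle {
    /**
     * Tries to set a private static boolean field via reflection.
     * Returns true on success, false if the field is missing or
     * reflective access is denied on this platform.
     */
    static boolean trySetStaticBoolean(Class<?> owner, String fieldName, boolean value) {
        try {
            Field field = owner.getDeclaredField(fieldName);
            field.setAccessible(true);     // may throw if access is not permitted
            field.setBoolean(null, value); // null receiver because the field is static
            return true;
        } catch (ReflectiveOperationException | RuntimeException e) {
            // Covers NoSuchFieldException, IllegalAccessException,
            // InaccessibleObjectException and SecurityException.
            return false;
        }
    }

    public static void main(String[] args) {
        boolean ok = trySetStaticBoolean(LibrarySettings.class, "eagerValidation", false);
        System.out.println("reflection available: " + ok);
        System.out.println("eager validation: " + LibrarySettings.isEagerValidation());
    }
}

// Hypothetical stand-in for a third-party class with an internal flag
// that has no public setter.
final class LibrarySettings {
    private static boolean eagerValidation = true;

    static boolean isEagerValidation() {
        return eagerValidation;
    }
}
```

Returning a boolean rather than throwing keeps the failure path quiet, and the same result can be surfaced in a diagnostic command so users can tell whether the fast path is active.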

Tested this with assist-iot-weather, 100K elements:

$ ls -lh
total 2,4G
-rw-rw-r-- 1 piotr piotr 217M Aug 25 14:59 assist-iot-weather-100k.jelly
-rw-rw-r-- 1 piotr piotr 2,1G Sep 11  2024 assist-iot-weather-100k.nt
-rwxrwxr-x 1 piotr piotr  63M Aug 25 14:52 jelly-cli-o3
-rwxrwxr-x 1 piotr piotr  63M Aug 25 14:54 jelly-cli-o3-novalidate

Baseline:

$ time ./jelly-cli-o3 rdf to-jelly assist-iot-weather-100k.nt > /dev/null

real    0m20,017s
user    0m19,655s
sys     0m0,360s

$ time ./jelly-cli-o3 rdf from-jelly assist-iot-weather-100k.jelly > /dev/null

real    0m8,177s
user    0m7,964s
sys     0m0,212s

With validation disabled:

$ time ./jelly-cli-o3-novalidate rdf to-jelly assist-iot-weather-100k.nt > /dev/null

real    0m12,358s
user    0m11,925s
sys     0m0,432s

$ time ./jelly-cli-o3-novalidate rdf from-jelly assist-iot-weather-100k.jelly > /dev/null

real    0m7,878s
user    0m7,725s
sys     0m0,152s

So, going by the real times, to-jelly is about 38% faster (20.0 s down to 12.4 s), and from-jelly about 4% faster (8.2 s down to 7.9 s).
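
The relative reductions can be recomputed from the `real` times in the runs above. This is a small sketch with a hypothetical helper, `reductionPercent`. Each command was timed once, so the figures are indicative rather than a rigorous benchmark.

```java
import java.util.Locale;

final class Speedup {
    /** Percentage by which the runtime dropped relative to the baseline. */
    static double reductionPercent(double baselineSeconds, double newSeconds) {
        return 100.0 * (baselineSeconds - newSeconds) / baselineSeconds;
    }

    public static void main(String[] args) {
        // Wall-clock ("real") times copied from the runs above, in seconds.
        double toJelly = reductionPercent(20.017, 12.358);
        double fromJelly = reductionPercent(8.177, 7.878);

        // Locale.ROOT keeps the decimal separator independent of the system locale.
        System.out.println(String.format(Locale.ROOT, "to-jelly:   %.1f%% less time", toJelly));
        System.out.println(String.format(Locale.ROOT, "from-jelly: %.1f%% less time", fromJelly));
    }
}
```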

@niegrzybkowski niegrzybkowski left a comment

Very speedy!

@niegrzybkowski niegrzybkowski merged commit 737c704 into main Aug 25, 2025
7 checks passed
@niegrzybkowski niegrzybkowski deleted the GH-196/reduce-validation branch August 25, 2025 14:03

Linked issue (closed by this pull request): Reduce validation in parsing by default