Skip to content

fix(turtle): delegate parsing to oxttl for full Turtle syntax support#57

Merged
cool-japan merged 1 commit intocool-japan:masterfrom
temporaryfix:fix/turtle-parser-oxttl-delegation
Feb 8, 2026
Merged

fix(turtle): delegate parsing to oxttl for full Turtle syntax support#57
cool-japan merged 1 commit intocool-japan:masterfrom
temporaryfix:fix/turtle-parser-oxttl-delegation

Conversation

@temporaryfix
Copy link
Copy Markdown
Contributor

Summary

  • Replace broken parse_str() implementation with oxttl delegation
  • The previous implementation used line-by-line processing with split_whitespace() which couldn't handle standard Turtle syntax (semicolons, commas, quoted strings, blank nodes, collections)
  • Add comprehensive test suite with 16 tests covering all Turtle syntax features

Test plan

  • All 16 new turtle parser tests pass
  • All oxirs-core tests pass (899 tests)
  • All oxirs-shacl tests pass (642 tests)
  • Clippy passes with no warnings

🤖 Generated with Claude Code

The TurtleParser's parse_str() method used broken line-by-line processing
with split_whitespace(), which couldn't handle standard Turtle syntax like
semicolons, commas, quoted strings with spaces, blank node property lists,
or RDF collections.

Changes:
- Replace parse_str() implementation with oxttl delegation
- Make convert_quad helper pub(crate) for reuse
- Remove broken private helper methods
- Add comprehensive test suite (16 tests)

All oxirs-core and oxirs-shacl tests pass.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request fixes a broken Turtle parser implementation by delegating parsing to the oxttl library instead of using a flawed custom line-by-line parser. The previous implementation couldn't handle standard Turtle syntax features like semicolons, commas, quoted strings with spaces, blank nodes, and collections because it used simple split_whitespace() processing.

Changes:

  • Replaced custom parse_str() implementation with delegation to oxttl's TurtleParser for full Turtle syntax support
  • Removed 240+ lines of broken helper methods (parse_prefix_directive, parse_base_directive, parse_triple_statement, parse_term, parse_predicate)
  • Added comprehensive test suite with 16 tests covering all Turtle syntax features
  • Changed visibility of convert_quad from pub(super) to pub(crate) to enable use from format::turtle module
  • Changed visibility of helpers module from private to pub(crate) to support the above

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
core/oxirs-core/tests/turtle_parser_tests.rs New comprehensive test file with 16 tests covering semicolons, commas, literals, blank nodes, collections, comments, and SHACL syntax
core/oxirs-core/src/format/turtle.rs Replaced broken custom parser with oxttl delegation; removed 240+ lines of flawed parsing logic; retained configuration API
core/oxirs-core/src/format/parser/mod.rs Changed helpers module visibility to pub(crate) to allow access from sibling modules
core/oxirs-core/src/format/parser/helpers.rs Changed convert_quad visibility to pub(crate) to enable use from format::turtle

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +91 to +92
// Parse and collect triples
let reader = Cursor::new(input.as_bytes());
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parser's configured prefixes (set via with_prefix()) are not being passed to the oxttl parser. While standard Turtle documents define prefixes inline with @Prefix directives (which oxttl handles correctly), the existing API allows pre-configuring prefixes that would not be recognized during parsing. Consider whether prefixes configured via with_prefix() should be passed to oxttl or whether this field should only be used for serialization. If prefixes should be supported during parsing, you would need to check if oxttl provides a way to pre-configure them.

Suggested change
// Parse and collect triples
let reader = Cursor::new(input.as_bytes());
// Build effective input: configured prefixes (if any) + original input.
// This ensures that prefixes set via `with_prefix` are visible to the oxttl parser,
// which already understands standard `@prefix` Turtle directives.
let mut content = String::new();
if !self.prefixes.is_empty() {
for (prefix, iri) in &self.prefixes {
if prefix.is_empty() {
// Default prefix
content.push_str("@prefix : <");
} else {
content.push_str("@prefix ");
content.push_str(prefix);
content.push_str(": <");
}
content.push_str(iri);
content.push_str("> .\n");
}
}
content.push_str(input);
// Parse and collect triples
let reader = Cursor::new(content.as_bytes());

Copilot uses AI. Check for mistakes.
triple.object,
oxrdf::GraphName::DefaultGraph,
);
let oxirs_quad = convert_quad(quad)?;
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In lenient mode, if convert_quad fails, the entire parsing operation will stop and return an error due to the ? operator. This differs from how oxttl parsing errors are handled (which are skipped in lenient mode). While this is consistent with the existing parser implementation in format/parser/turtle.rs, consider whether conversion errors should also be skipped in lenient mode for consistency. If conversion errors should stop parsing regardless of lenient mode (because they indicate internal conversion issues rather than malformed input), this is fine as-is.

Suggested change
let oxirs_quad = convert_quad(quad)?;
let oxirs_quad = match convert_quad(quad) {
Ok(q) => q,
Err(e) => {
if !self.lenient {
return Err(e);
}
// In lenient mode, skip conversion errors
continue;
}
};

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Owner

@cool-japan cool-japan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Excellent fix! Replaces 247 lines of broken Turtle parsing with 35 lines delegating to oxttl. All 1,523 tests pass (16 new turtle tests + 865 oxirs-core + 642 oxirs-shacl). Critical bug fix that enables parsing real-world Turtle/SHACL files with semicolons, blank nodes, collections, and complex syntax.

@cool-japan cool-japan merged commit 0a0c8f5 into cool-japan:master Feb 8, 2026
6 checks passed
@temporaryfix temporaryfix deleted the fix/turtle-parser-oxttl-delegation branch February 16, 2026 15:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants