Testing

Joshua Shinavier edited this page Mar 5, 2026 · 12 revisions

Testing in Hydra

Hydra maintains a comprehensive test suite to ensure correctness and parity across all language implementations. Testing is organized into two categories: the common test suite (shared across all implementations) and language-specific tests.

The common test suite (hydra.test.testSuite) is designed to run identically in each Hydra implementation. Passing all test cases in the common test suite is the criterion for a true Hydra implementation, ensuring that all implementations behave identically and can interoperate in heterogeneous environments.

Prerequisites

To run tests, you need:

  • Hydra built locally (see the main README)
  • For Haskell: Stack installed
  • For Java: JDK 17+ and Gradle
  • For Python: Python 3.12+ and pytest

This guide is for:

  • Contributors adding new tests to the common suite
  • Developers implementing new language backends
  • Anyone debugging test failures

Common Test Suite

The common test suite is part of Hydra's "test kernel" and is written in the Hydra DSL at Hydra/Sources/Test. Like the rest of Hydra's kernel, these test definitions are code-generated into Haskell, Java, and Python using Hydra's own coders.

Kernel Tests vs Generation Tests

The common test suite gives rise to two kinds of tests: kernel tests and generation tests. Both are derived from the same test cases (defined in Hydra/Sources/Test/), but they are instantiated differently in order to exercise different modes of operation:

  • Kernel tests validate that Hydra's runtime works correctly. The test cases are code-generated into each target language, and each implementation runs them against its own kernel (primitives, type checker, reducer, etc.). This answers: "Does the Hydra kernel behave correctly in this language?"

  • Generation tests validate that Hydra's code generators produce correct output. The same test cases are used to generate code in each target language, then verify that the generated code compiles and produces the expected results. This answers: "Does code generated by Hydra work correctly in the target language?"

In other words, kernel tests exercise Hydra-as-interpreter, while generation tests exercise Hydra-as-compiler.

| Aspect        | Kernel Tests                               | Generation Tests                                        |
|---------------|--------------------------------------------|---------------------------------------------------------|
| Purpose       | Test the Hydra runtime                     | Test code generation                                    |
| Source        | Common test suite DSL                      | Common test suite DSL                                   |
| Instantiation | Generated test harness runs against kernel | Generated target-language code is compiled and executed |
| Update script | ./bin/update-kernel-tests.sh               | ./bin/update-generation-tests.sh                        |

Both test types are generated to src/gen-test/ but serve complementary purposes in ensuring Hydra's correctness across both modes of operation.
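
The two modes can be illustrated with a toy model. The following Python sketch is not Hydra's API: the `Case` record, the `PRIMITIVES` and `TEMPLATES` tables, and both runner functions are hypothetical names introduced only for this example. It instantiates a single test case twice, once by interpreting it against an in-process table of primitives (the kernel-test mode) and once by emitting source code and executing it (the generation-test mode).

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical stand-in for a common-suite test case: apply a named
# primitive to some arguments and compare against an expected value.
@dataclass(frozen=True)
class Case:
    name: str
    primitive: str
    args: tuple
    expected: Any

# Toy "kernel": primitives implemented natively in the host language.
PRIMITIVES: dict[str, Callable[..., Any]] = {
    "lists.reverse": lambda xs: list(reversed(xs)),
    "lists.length": lambda xs: len(xs),
}

def run_as_kernel_test(case: Case) -> bool:
    """Interpreter mode: look the primitive up and evaluate it directly."""
    return PRIMITIVES[case.primitive](*case.args) == case.expected

# Toy "code generator": maps each primitive to a Python source template.
TEMPLATES = {
    "lists.reverse": "list(reversed({0}))",
    "lists.length": "len({0})",
}

def generate_source(case: Case) -> str:
    """Compiler mode, step 1: emit target-language source for the case."""
    call = TEMPLATES[case.primitive].format(*(repr(a) for a in case.args))
    return f"result = ({call}) == {case.expected!r}\n"

def run_as_generation_test(case: Case) -> bool:
    """Compiler mode, step 2: compile and execute the emitted source."""
    namespace: dict[str, Any] = {}
    exec(compile(generate_source(case), "<generated>", "exec"), namespace)
    return namespace["result"]

case = Case("basic list", "lists.reverse", ([1, 2, 3],), [3, 2, 1])
print(run_as_kernel_test(case), run_as_generation_test(case))
```

In this model a bug in `PRIMITIVES` would surface only in the first runner and a bug in `TEMPLATES` only in the second, which is why both kinds of tests are worth keeping even though they share their test cases.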

Test Categories

The common test suite currently includes:

1. Primitive Function Tests

Tests for Hydra's standard library functions in hydra.lib:

List primitives (Test/Lib/Lists.hs):

  • apply, at, bind, concat, cons, drop, elem
  • group, head, init, intercalate, intersperse, last
  • length, map, nub, null, pure, replicate, reverse
  • safeHead, singleton, sort, tail, take, transpose, zip

Example test cases:

  • lists.at: Access elements at specific indices
  • lists.map: Apply functions over lists
  • lists.concat: Concatenate nested lists
  • lists.reverse: Reverse list order

String primitives (Test/Lib/Strings.hs):

  • cat, cat2, charAt, fromList, intercalate
  • length, lines, null, splitOn, toList
  • toLower, toUpper, unlines

Example test cases:

  • Unicode handling: "ñ世🌍" (combining characters, multi-byte)
  • Empty string edge cases
  • Control characters and special characters

2. Formatting Tests

Case convention conversion tests (Test/Formatting.hs):

  • lower_snake_case, UPPER_SNAKE_CASE, camelCase, PascalCase
  • Handles numbers and edge cases: "a_hello_world_42_a42_42a_b"
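
To give a feel for what these cases exercise, here is a minimal Python sketch of conversion from lower_snake_case to the other three conventions. The function names are hypothetical, and the treatment of digit-leading segments reflects this sketch only; Hydra's own formatting functions may handle those edge cases differently, and the shared test cases are what fix the actual behaviour.

```python
def split_snake(s: str) -> list[str]:
    """Split a lower_snake_case identifier into its non-empty segments."""
    return [part for part in s.split("_") if part]

def _capitalize_first(part: str) -> str:
    # Upper-case only the first character; digits are left unchanged.
    return part[:1].upper() + part[1:]

def to_upper_snake(s: str) -> str:
    return "_".join(p.upper() for p in split_snake(s))

def to_pascal(s: str) -> str:
    return "".join(_capitalize_first(p) for p in split_snake(s))

def to_camel(s: str) -> str:
    pascal = to_pascal(s)
    return pascal[:1].lower() + pascal[1:]

sample = "a_hello_world_42_a42_42a_b"
for convert in (to_upper_snake, to_camel, to_pascal):
    print(convert.__name__, convert(sample))
```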

3. Type Inference Tests

Tests for Hydra's type inference system (Test/Inference), covering basic inference, algebraic types, nominal types, and expected inference failures.

Test Case Types

The common test suite supports multiple test case types:

  1. Evaluation tests: Verify that a term reduces to an expected result
  2. Case conversion tests: Verify string case convention transformations
  3. Inference tests: Verify that type inference produces the expected type
  4. Inference failure tests: Verify that type inference fails as expected for invalid terms
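
One way to picture these four kinds is as a tagged union with a single dispatch point in the runner. The sketch below is in Python and purely illustrative: the class names, their fields, and the `reduce`, `convert_case`, and `infer` callbacks are assumptions made for this example, not Hydra's generated types.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union

@dataclass(frozen=True)
class EvaluationCase:        # a term should reduce to an expected result
    input_term: str
    expected_term: str

@dataclass(frozen=True)
class CaseConversionCase:    # a string should convert between conventions
    source: str
    target_convention: str
    expected: str

@dataclass(frozen=True)
class InferenceCase:         # inference should produce the expected type
    term: str
    expected_type: str

@dataclass(frozen=True)
class InferenceFailureCase:  # inference should reject the term
    term: str

AnyCase = Union[EvaluationCase, CaseConversionCase,
                InferenceCase, InferenceFailureCase]

def check(case: AnyCase,
          reduce: Callable[[str], str],
          convert_case: Callable[[str, str], str],
          infer: Callable[[str], Optional[str]]) -> bool:
    """Dispatch on the case kind; `infer` is assumed to return None on failure."""
    if isinstance(case, EvaluationCase):
        return reduce(case.input_term) == case.expected_term
    if isinstance(case, CaseConversionCase):
        return convert_case(case.source, case.target_convention) == case.expected
    if isinstance(case, InferenceCase):
        return infer(case.term) == case.expected_type
    if isinstance(case, InferenceFailureCase):
        return infer(case.term) is None
    raise TypeError(f"unknown test case kind: {type(case).__name__}")
```

A runner built this way needs one new class and one new branch when a further test case type is added, and the final `raise` makes an unhandled kind fail loudly instead of passing silently.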

How It Works

1. Test Definition (DSL)

Tests are defined in the Hydra DSL in hydra-haskell/src/main/haskell/Hydra/Sources/Test/:

-- Example from Test/Lib/Lists.hs
listsReverse = TestGroup "reverse" Nothing [] [
    test "basic list" [1, 2, 3, 4, 5] [5, 4, 3, 2, 1],
    test "single element" [42] [42],
    test "empty list" [] []]
  where
    test name input output =
        primCase name _lists_reverse [intList input] (intList output)
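
To make the shape of such a definition concrete, the following Python sketch mirrors the `reverse` group above as plain data and walks it to produce qualified test names. It is a hypothetical model: `Group`, `Case`, and `flatten` are names invented here, the enclosing `lists` group is assumed for illustration, and the real `TestGroup` constructor takes arguments that are not modelled.

```python
from dataclasses import dataclass
from typing import Iterator

@dataclass(frozen=True)
class Case:
    name: str
    input: list
    expected: list

@dataclass(frozen=True)
class Group:
    name: str
    subgroups: tuple = ()
    cases: tuple = ()

def flatten(group: Group, prefix: str = "") -> Iterator[tuple[str, Case]]:
    """Yield (qualified name, case) pairs, depth first."""
    path = f"{prefix}{group.name}"
    for case in group.cases:
        yield f"{path} > {case.name}", case
    for sub in group.subgroups:
        yield from flatten(sub, prefix=f"{path} > ")

lists_reverse = Group("reverse", cases=(
    Case("basic list", [1, 2, 3, 4, 5], [5, 4, 3, 2, 1]),
    Case("single element", [42], [42]),
    Case("empty list", [], []),
))
lists = Group("lists", subgroups=(lists_reverse,))

# Run each case against a native implementation of list reversal.
for qualified_name, case in flatten(lists):
    ok = list(reversed(case.input)) == case.expected
    print(f"{'PASS' if ok else 'FAIL'}  {qualified_name}")
```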

2. Code Generation

Test definitions are generated into each target language using Hydra's coders:

Kernel test generation using convenience scripts:

# Haskell kernel tests
./bin/update-kernel-tests.sh

# Or manually in GHCi
import Hydra.Generation
writeHaskell "src/gen-test/haskell" testModules Nothing

For Java and Python, use the hydra-ext package:

import Hydra.Ext.Generation
writeJava "../hydra-java/src/gen-test/java" testModules Nothing
writePython "../hydra-python/src/gen-test/python" testModules Nothing

Generation test creation (language-specific):

# Haskell generation tests
./bin/update-generation-tests.sh

See Test Generation Architecture below for details on extending to other languages.

3. Test Runners

Each language has a test runner that executes the generated test suite:

Haskell (TestSuiteSpec.hs):

  • Uses the Hspec framework
  • Runs: stack test in hydra-haskell/
  • Interactive: start stack ghci hydra:lib hydra:hydra-test, then evaluate Test.Hspec.hspec Hydra.TestSuiteSpec.spec

Java (TestSuiteRunner.java):

  • Uses JUnit 5 with parameterized tests
  • Runs: ./gradlew test in hydra-java/, or ./gradlew :hydra-java:test from the repository root
  • Executes evaluation tests via term reduction

Python (test_suite_runner.py):

  • Uses the pytest framework
  • Runs: pytest src/test/python/test_suite_runner.py in hydra-python/
  • Dynamically generates pytest test functions from the test suite
  • Includes evaluation tests via term reduction
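
The dynamic generation mentioned for the Python runner can be done by installing one function per test case into the module namespace, since pytest by default collects module-level functions whose names start with `test`. The sketch below shows the general technique only; the `CASES` table and helper names are hypothetical, and the actual test_suite_runner.py may be organized differently (for instance, around `pytest.mark.parametrize`).

```python
import re

# Hypothetical stand-in for the generated suite:
# (name, function under test, argument, expected result).
CASES = [
    ("reverse basic list", lambda xs: list(reversed(xs)), [1, 2, 3], [3, 2, 1]),
    ("reverse empty list", lambda xs: list(reversed(xs)), [], []),
    ("length single element", len, [42], 1),
]

def _make_test(function, argument, expected):
    """Build a zero-argument test function bound to one case."""
    def test():
        assert function(argument) == expected
    return test

def install_tests(namespace: dict) -> list[str]:
    """Add one `test_*` function per case to `namespace`; return their names."""
    installed = []
    for name, function, argument, expected in CASES:
        identifier = "test_" + re.sub(r"\W+", "_", name).strip("_")
        namespace[identifier] = _make_test(function, argument, expected)
        installed.append(identifier)
    return installed

# Executed at import time, this makes the functions visible to pytest's collector.
GENERATED = install_tests(globals())
```

The factory function `_make_test` matters here: defining the inner function directly inside the loop would make every test close over the loop variables and therefore check only the last case.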

Guaranteeing Parity

The common test suite ensures that all Hydra implementations behave identically:

  1. Same test definitions: All implementations run exactly the same tests (generated from one source)
  2. Same primitive functions: Each language implements the same set of primitives with identical behavior
  3. Same reduction semantics: Term evaluation produces identical results across languages
  4. Continuous validation: Tests run on every build to catch regressions

This is critical for heterogeneous environments like Apache TinkerPop where the same logic must execute identically across multiple languages.
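
A parity check over per-language results can be sketched as follows. This is an illustration under stated assumptions, not a tool that ships with Hydra: it presumes each implementation's runner can report a mapping from test name to outcome, and the function name and the sample outcomes are made up.

```python
def parity_report(results: dict[str, dict[str, bool]]) -> dict[str, list[str]]:
    """Compare per-language outcomes and describe every discrepancy.

    `results` maps a language name to {test name: passed?}.
    The report maps each problematic test to a list of readable notes.
    """
    all_tests = set().union(*(r.keys() for r in results.values()))
    report: dict[str, list[str]] = {}
    for test in sorted(all_tests):
        notes = []
        for language in sorted(results):
            outcome = results[language].get(test)
            if outcome is None:
                notes.append(f"{language}: missing")
            elif not outcome:
                notes.append(f"{language}: failed")
        if notes:
            report[test] = notes
    return report

# Made-up outcomes, for demonstration only.
sample = {
    "haskell": {"example.alpha": True, "example.beta": True},
    "java":    {"example.alpha": True, "example.beta": False},
    "python":  {"example.alpha": True},
}
for test, notes in parity_report(sample).items():
    print(test, "->", ", ".join(notes))
```

Treating a missing test as a discrepancy, not only a failing one, reflects the first point in the list above: every implementation is expected to run the same set of tests.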

Test Generation Architecture

Hydra's test generation system uses a dependency inversion pattern to support multiple target languages. The framework is language-agnostic, with language-specific implementations plugging in via the TestGenerator abstraction.

TestGenerator Abstraction

The TestGenerator type (defined in Generation/Generate.hs) is parameterized over the target language's namespace type:

data TestGenerator a = TestGenerator {
  -- Build namespaces for a module, resolving imports and primitives
  testGenNamespacesForModule :: Module -> Graph -> Either String (Namespaces a),

  -- Create a test codec from resolved namespaces
  testGenCreateCodec :: Namespaces a -> TestCodec,

  -- Generate a complete test file for a module and test group
  testGenGenerateTestFile :: Module -> TestGroup -> Graph -> Either String (FilePath, String)
}

Language Implementations

Each language provides a TestGenerator implementation:

Haskell (HaskellCodec.hs):

haskellTestGenerator :: TestGenerator H.ModuleName
haskellTestGenerator = TestGenerator {
  testGenNamespacesForModule = namespacesForModule,
  testGenCreateCodec = haskellTestCodec,
  testGenGenerateTestFile = generateHaskellTestFile
}

Python and Java: Implementations follow the same pattern, providing their own namespace resolution, codec creation, and file generation.

Extending to New Languages

To add generation test support for a new language:

  1. Create a codec module (e.g., PythonCodec.hs):

    • Implement functions to encode Hydra terms/types to language-specific test code
    • Create language-specific namespace resolution
  2. Define a TestGenerator:

    pythonTestGenerator :: TestGenerator PythonModuleName
    pythonTestGenerator = TestGenerator {
      testGenNamespacesForModule = pythonNamespacesForModule,
      testGenCreateCodec = pythonTestCodec,
      testGenGenerateTestFile = generatePythonTestFile
    }
  3. Create an executable in src/exec/update-generation-tests-LANG/:

    • Import your test generator
    • Call generateGenerationTestSuite with it
    • Add to package.yaml executables section
  4. Add convenience script bin/update-generation-tests-LANG.sh

This architecture ensures the core framework remains language-agnostic while supporting arbitrary target languages.
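
The dependency inversion described in this section can be modelled in a few lines. The following is a Python analogy, not a translation of the Haskell code: the dataclass has three callable fields that correspond only loosely to the record fields shown earlier, and `generate_suite`, `_py_file`, and `python_generator` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class TestGenerator:
    """Language-specific hooks that the language-agnostic driver calls."""
    namespaces_for_module: Callable[[str], dict[str, str]]
    create_codec: Callable[[dict[str, str]], Callable[[object], str]]
    generate_test_file: Callable[[str, list[str]], tuple[str, str]]

def generate_suite(generator: TestGenerator,
                   modules: dict[str, list[object]]) -> dict[str, str]:
    """Driver: knows nothing about the target language beyond the hooks."""
    files = {}
    for module, values in modules.items():
        namespaces = generator.namespaces_for_module(module)
        encode = generator.create_codec(namespaces)
        path, text = generator.generate_test_file(
            module, [encode(v) for v in values])
        files[path] = text
    return files

# A toy generator whose target language is Python source.
def _py_file(module: str, encoded: list[str]) -> tuple[str, str]:
    path = module.replace(".", "/") + "_test.py"
    body = "".join(f"VALUE_{i} = {e}\n" for i, e in enumerate(encoded))
    return path, body

python_generator = TestGenerator(
    namespaces_for_module=lambda module: {module: module.replace(".", "_")},
    create_codec=lambda namespaces: repr,  # encode values as Python literals
    generate_test_file=_py_file,
)

print(generate_suite(python_generator, {"hydra.test.lists": [[1, 2], "a"]}))
```

Supporting another target in this model means supplying a different set of three hooks; `generate_suite` itself stays untouched, which is the property the architecture above is designed to preserve.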

Setting Up and Running Tests

This section provides quick-start instructions for running tests in each language. For complete setup instructions (installing dependencies, configuring your environment), see the language-specific READMEs linked below.

Haskell

See Hydra-Haskell README for full setup instructions.

Run all tests (kernel tests + generation tests + language-specific tests):

cd hydra-haskell
stack test

Interactive testing (useful for debugging):

stack ghci hydra:lib hydra:hydra-test
-- Run kernel tests
Test.Hspec.hspec Hydra.TestSuiteSpec.spec

-- Run generation tests
Test.Hspec.hspec Generation.Spec.spec

Regenerate tests after modifying test sources:

# Regenerate kernel tests (test data structures)
./bin/update-kernel-tests.sh

# Regenerate generation tests (executable hspec specs)
./bin/update-generation-tests.sh

# Run updated tests
stack test

Test directory structure:

hydra-haskell/
├── src/test/haskell/           # Hand-written test infrastructure
│   ├── Spec.hs                 # Entry point (hspec-discover)
│   └── Hydra/TestSuiteSpec.hs  # Kernel test runner
├── src/gen-test/haskell/       # Generated tests
│   ├── Hydra/Test/             # Kernel test data (TestGroup structures)
│   └── Generation/             # Generation tests (executable specs)

Java

See Hydra-Java README for full setup instructions.

Run all tests (kernel tests + generation tests + Java-specific tests):

cd hydra-java
./gradlew test

Or from the repository root:

./gradlew :hydra-java:test

Run a specific test class:

./gradlew test --tests "hydra.VisitorTest"

Regenerate tests (recommended: use the sync script from hydra-ext):

cd hydra-ext
./bin/sync-java.sh

This will regenerate all Java code (kernel modules, eval libs, and tests) and run the test suite.

For manual generation, see the Hydra-Java README.

Test directory structure:

hydra-java/
├── src/test/java/              # Hand-written test infrastructure
│   ├── hydra/TestSuiteRunner.java  # Kernel test runner (JUnit 5)
│   └── hydra/VisitorTest.java      # Java-specific tests
├── src/gen-test/java/          # Generated tests
│   ├── hydra/test/             # Kernel test data (TestGroup structures)
│   └── generation/             # Generation tests (executable JUnit tests)

Python

See Hydra-Python README for full setup instructions.

Run all tests (kernel tests + Python-specific tests):

cd hydra-python
pytest

Run only the common test suite (kernel tests):

pytest src/test/python/test_suite_runner.py

Run generation tests (which check that the generated code is correct):

pytest src/gen-test/python/generation/

Run a specific test file:

pytest src/test/python/test_grammar.py

Generate a categorized summary report:

python src/test/python/test_summary_report.py

Regenerate tests (recommended: use the sync script from hydra-ext):

cd hydra-ext
./bin/sync-python.sh

This will regenerate all Python code (kernel modules and tests) and run the test suite.

For manual regeneration using bootstrap-from-json:

cd hydra-ext
stack build hydra-ext:exe:bootstrap-from-json
stack exec bootstrap-from-json -- --target python --include-coders --include-tests --include-gentests +RTS -K256M -A32M -RTS

Known limitations: The Python coder does not yet support all literal types (e.g., float32), so some generation tests may fail to generate. Run the generation script to see the current status.

Test directory structure:

hydra-python/
├── src/test/python/              # Hand-written tests
│   ├── test_suite_runner.py      # Kernel test runner
│   ├── test_grammar.py           # Python-specific tests
│   └── ...
├── src/gen-test/python/          # Generated tests
│   ├── hydra/test/               # Kernel test data (TestGroup structures)
│   └── generation/               # Generation tests (pytest test files)
│       └── hydra/test/lib/       # e.g., test_chars.py, test_lists.py

Language-Specific Tests

In addition to the common test suite, each implementation has language-specific tests:

Haskell-Specific Tests

Located in hydra-haskell/src/test/haskell:

  • Haskell coder tests
  • DSL tests
  • Property-based tests (QuickCheck)
  • Integration tests

Generation tests (in src/gen-test/haskell/Generation/) verify Haskell code generation.

Java-Specific Tests

Located in hydra-java/src/test/java:

  • Java coder tests
  • Serialization tests
  • Integration tests
  • Performance tests

Python-Specific Tests

Located in hydra-python/src/test/python:

  • test_python.py - Python-specific functionality
  • test_json.py - JSON encoding/decoding
  • test_grammar.py - Grammar parsing
  • test_generated_code.py - Generated code validation

Run with: pytest in the hydra-python/ directory

Adding New Tests

To the Common Test Suite

  1. Create test definitions in hydra-haskell/src/main/haskell/Hydra/Sources/Test/

    • Add to existing test groups or create new ones
    • Use the testing DSL (TestGroup, primCase, etc.)
  2. Register tests in TestSuite.hs

    • Add test group binding
    • Include in appropriate parent group
  3. Regenerate kernel tests in all implementations:

    # Haskell
    ./bin/update-kernel-tests.sh
    
    # Or manually in GHCi
    import Hydra.Sources.All
    import Hydra.Generation
    let allModules = mainModules ++ testModules
    writeHaskell "src/gen-test/haskell" allModules baseTestModules
    
    # Java and Python (from hydra-ext)
    import Hydra.Ext.Generation
    import Hydra.Sources.All
    let allModules = mainModules ++ testModules
    writeJava "../hydra-java/src/gen-test/java" allModules baseTestModules
    writePython "../hydra-python/src/gen-test/python" allModules baseTestModules
  4. Update generation tests if needed:

    ./bin/update-generation-tests.sh
  5. Run tests in each language to verify

To Language-Specific Tests

Add tests directly to the language-specific test directory using that language's testing framework.

Current Coverage

The common test suite currently provides:

  • List primitives: ~30 functions with multiple test cases each
  • String primitives: ~13 functions including Unicode handling
  • Case conversion: All major case conventions
  • Type inference: Basic inference, algebraic types, nominal types, failures
  • 🚧 Future expansion: More primitive functions, coders, adapters, and kernel functionality

The test suite is continuously expanding to cover more of Hydra's functionality.

See Also