hprof-redact is a tool for processing Java heap dumps (HPROF format) to redact sensitive data while preserving
heap structure and size characteristics. This is useful for:
- Sharing heap dumps for analysis without exposing sensitive string data
- Testing and debugging production issues safely
- Compliance and privacy requirements when handling heap dumps
This is currently just an early prototype, a proof of concept. Feel free to test it and provide me with feedback.
The implementation is based on the HPROF format specified in the OpenJDK source code.
Features:
- Stream-based processing for large heap dumps
- Configurable transformers for redacting string contents and primitive values, including arrays
- Support for redacting field names, class names, method names, and other UTF-8 strings in the heap dump
- Tiny JAR (< 100KB) with femtocli, used for the command-line interface, as its only dependency
Non-Features:
- It doesn't parse every section of the heap dump; it only processes the records relevant for redacting string contents and primitive values.
- It is therefore not a general-purpose heap dump parser.
- It has no complex redaction logic like jfr-redact. It only supports simple transformations of string contents and primitive values, but it can be extended with custom transformers.
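To illustrate what stream-based processing of only the relevant records can look like, here is a minimal sketch that walks the top-level records of an HPROF file and skips every body without buffering it. It is not taken from hprof-redact; the class name `HprofRecordWalker` and its methods are hypothetical, and only the header and record framing (version string, identifier size, timestamp, then `tag`/`time`/`length` per record) follow the HPROF format.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of hprof-redact.
final class HprofRecordWalker {

    /** Counts top-level records per tag while skipping all record bodies. */
    static Map<Integer, Integer> countRecords(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        // Header: null-terminated version string, u4 identifier size, u8 timestamp.
        while (data.readByte() != 0) {
            // skip the version string, e.g. "JAVA PROFILE 1.0.2"
        }
        data.readInt();  // identifier size in bytes
        data.readLong(); // dump timestamp

        Map<Integer, Integer> counts = new TreeMap<>();
        while (true) {
            int tag = data.read(); // -1 means the stream ended cleanly
            if (tag < 0) {
                return counts;
            }
            data.readInt(); // time offset, unused here
            long length = Integer.toUnsignedLong(data.readInt()); // u4 body length
            skipFully(data, length); // a redactor would rewrite relevant bodies instead
            counts.merge(tag, 1, Integer::sum);
        }
    }

    /** Skips exactly n bytes or fails if the stream ends early. */
    private static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped <= 0) {
                if (in.read() < 0) {
                    throw new EOFException("truncated record body");
                }
                skipped = 1;
            }
            n -= skipped;
        }
    }
}
```

Because each body is skipped or rewritten as it streams past, memory use stays independent of the dump size.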
Download the latest release from GitHub Releases and run:
java -jar hprof-redact.jar input.hprof output.hprof
Or use with JBang:
jbang hprof-redact@parttimenerd/hprof-redact
Add to your pom.xml:
<dependency>
    <groupId>me.bechberger</groupId>
    <artifactId>hprof-redact</artifactId>
    <version>0.2.1</version>
</dependency>

Usage: hprof-redact [-hV] [--compress] [--transformer=<transformer>] [--verbose] <input> <output>
Stream and redact HPROF heap dumps.
      <input>     Input HPROF path.
      <output>    Output HPROF path or '-' for stdout.
      --compress  Enable compression format (omit array and string data,
                    store only sizes).
  -h, --help      Show this help message and exit.
  -t, --transformer=<transformer>
                  Transformer to apply (default: zero).
                  Options: zero (zero primitives + string contents),
                    zero-strings (zero string contents only),
                    drop-strings (empty string contents).
  -v, --verbose   Log changed field values (primitive fields only) to
                    stderr.
  -V, --version   Print version information and exit.

When using the --compress option, the output HPROF format is modified to save space by omitting array and string data:
UTF-8 Strings (HPROF_UTF8):
- Standard format: [record_tag][time][length][id][data...]
- Compress format: [record_tag][time][-1][actual_length][id] (no data)
Primitive Arrays (HPROF_GC_PRIM_ARRAY_DUMP):
- Standard format: [id][stackTrace][numElements][elementType][elements...]
- Compress format: [id][stackTrace][-1][actual_numElements][elementType] (no elements)
This format allows tools to:
- Reconstruct the original heap structure and data types
- Determine array/string sizes without parsing the content
- Significantly reduce file size by omitting bulk data
Use case: When you need to share heap structure information without exposing string or array contents, and downstream tools support the compressed format.
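The UTF-8 record layouts described above can be sketched as follows. This is an illustration under stated assumptions, not the tool's actual writer: the class `Utf8RecordLayout` is hypothetical, identifiers are assumed to be 8 bytes wide, `actual_length` is assumed to carry the value the standard `length` field would have held (identifier plus data), and plain UTF-8 stands in for the encoding the JVM uses, which is identical for ASCII text.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the two layouts; not hprof-redact code.
final class Utf8RecordLayout {
    static final int HPROF_UTF8 = 0x01; // record tag defined by the HPROF format

    /** Standard layout: [record_tag][time][length][id][data...] */
    static byte[] standard(long id, String value) throws IOException {
        byte[] data = value.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeByte(HPROF_UTF8);
        out.writeInt(0);                        // time offset
        out.writeInt(Long.BYTES + data.length); // length covers id + data
        out.writeLong(id);                      // assumption: 8-byte identifiers
        out.write(data);
        return bytes.toByteArray();
    }

    /** Compress layout: [record_tag][time][-1][actual_length][id], no data. */
    static byte[] compressed(long id, String value) throws IOException {
        byte[] data = value.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeByte(HPROF_UTF8);
        out.writeInt(0);                        // time offset
        out.writeInt(-1);                       // marker: body data omitted
        out.writeInt(Long.BYTES + data.length); // assumption: original length value
        out.writeLong(id);
        return bytes.toByteArray();
    }
}
```

Under these assumptions a compressed record always occupies 21 bytes, while a standard record grows with the string it carries, which is where the size reduction comes from.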
Note: Method names and method signatures are treated as generic UTF-8 strings because they cannot always be distinguished reliably in HPROF records. String transformers therefore apply to them as well.
zero (default):
Zeros out both primitive values and string contents while preserving structure.
- All numeric primitives become 0/0.0f/0.0d
- Booleans become false
- Strings become "0000..." (same length as original, preserving offsets)
Use case: Maximum data redaction while maintaining heap structure analysis.
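The zeroing rules above can be sketched in a few lines. This is a stand-alone illustration, not the library's transformer: the class `ZeroingSketch` and its method names are hypothetical, and it assumes that "same length" refers to the encoded byte length, since that is what keeps record sizes and offsets unchanged.

```java
import java.util.Arrays;

// Hypothetical illustration of length-preserving zeroing; not hprof-redact code.
final class ZeroingSketch {

    /** Returns a buffer of the same byte length filled with ASCII '0'. */
    static byte[] zeroUtf8(byte[] encoded) {
        byte[] redacted = new byte[encoded.length];
        Arrays.fill(redacted, (byte) '0');
        return redacted;
    }

    /** Numeric primitives collapse to zero. */
    static int zeroInt(int value) {
        return 0;
    }

    static double zeroDouble(double value) {
        return 0.0d;
    }

    /** Booleans collapse to false. */
    static boolean zeroBoolean(boolean value) {
        return false;
    }
}
```

Working on the encoded bytes rather than on characters means a multi-byte character is replaced by several '0' bytes, so the record length never changes.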
zero-strings:
Only zeros out string contents and leaves primitive values untouched.
- All strings become "0000..." (same length as original)
- Primitive values preserved as-is
- Field names, class names, method names are zeroed
Use case: When you need primitive values for analysis but want to hide string data.
drop-strings:
Removes string contents entirely and replaces them with empty strings.
- All strings become "" (empty)
- Primitive values preserved as-is
- Note: This changes heap layout as strings have different sizes
Use case: Maximum space savings with minimal data preservation.
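import java.io.IOException;
import java.nio.file.Path;
import me.bechberger.hprof.HprofRedact;
// ZeroPrimitiveTransformer must be imported as well (its package is not shown here).

void main() throws IOException {
    // Stream input.hprof to output.hprof, applying the given transformer.
    HprofRedact.process(
        Path.of("input.hprof"),
        Path.of("output.hprof"),
        new ZeroPrimitiveTransformer());
}

Implement HprofTransformer: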
import me.bechberger.hprofredact.transformer.HprofTransformer;

public class MyTransformer implements HprofTransformer {
    @Override
    public String transformUtf8String(String value) {
        return "REDACTED";
    }

    @Override
    public int transformInt(int value) {
        return -1;
    }
}

mvn clean package
This generates:
- target/hprof-redact.jar: Executable JAR
- target/hprof-redact: Native executable (if GraalVM available)
mvn test
The test suite includes:
- Unit tests for HPROF parsing and filtering
- Integration tests with real heap dumps
- Validation against hprof-slurp (downloaded automatically)
Use the provided capture_heap_dumps.py script to generate test heap dumps in the heap_dumps/ directory.
It compiles and runs Java test programs that create various heap scenarios, captures heap dumps using jmap, and extracts histograms for validation.
python3 capture_heap_dumps.py

./release.py [--major|--patch]
This:
- Updates version in pom.xml
- Updates CHANGELOG.md
- Runs tests and builds package
- Creates git tag and commits
- Pushes to remote
- Creates GitHub release with artifacts
- https://github.com/agourlay/hprof-slurp: a heap dump analyzer written in Rust
- https://github.com/eaftan/hprof-parser: written in Java
- OpenJDK heapDumper.cpp: the official writer that also includes the format
- https://bugs.openjdk.org/browse/JDK-8337517: Redacted Heap Dumps, an OpenJDK proposal that was never integrated
- https://eclipse.dev/mat/: Eclipse Memory Analyzer Tool, a powerful heap dump analysis tool
This project is open to feature requests/suggestions, bug reports etc. via GitHub issues. Contribution and feedback are encouraged and always welcome.
MIT, Copyright 2026 SAP SE or an SAP affiliate company, Johannes Bechberger and contributors