| 1 | +--- |
| 2 | +applyTo: "src/libraries/System.IO.Compression*/**" |
| 3 | +--- |
| 4 | + |
| 5 | +# System.IO.Compression — Folder-Specific Guidance |
| 6 | + |
| 7 | +## Format Specification Correctness (D12) |
| 8 | + |
| 9 | +- ZIP64 extensions must be used for files over 4GB — extra field sizes, offsets, and header values must use 64-bit fields when the 32-bit range is exceeded |
| 10 | +- Compression levels must align with native library semantics — verify enum-to-native mapping is correct |
| 11 | +- New compression format support (e.g., zstd in ZIP) must include a feature switch for trimming/AOT and an explicit opt-in mechanism |
| 12 | +- Decompression must handle concatenated payloads and partial reads — the decompressor must not assume a single contiguous compressed stream |
| 13 | +- Breaking changes to format handling must be documented and include migration guidance |
| 14 | + |
| 15 | +## Security |
| 16 | + |
| 17 | +- Maximum decompressed size limits must be configurable to prevent zip-bomb attacks, following the existing deflate size limit pattern |
| 18 | +- Archive extraction must validate entry paths to prevent path traversal attacks (for example, entries with `../` segments or absolute paths) |
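
The path validation rule above could be sketched roughly as follows. This is an illustration only, not the library's implementation: `ExtractionPathSketch` and `GetSafeDestinationPath` are hypothetical names, and the choice of `IOException` is an assumption.

```csharp
using System;
using System.IO;

static class ExtractionPathSketch
{
    // Hypothetical helper (not a System.IO.Compression API): returns the full
    // destination path for an entry, or throws if the entry would resolve to a
    // location outside destinationDirectory.
    public static string GetSafeDestinationPath(string destinationDirectory, string entryName)
    {
        // Normalize the root and make sure it ends with a separator, so that
        // a root of "/out" does not accidentally match "/outside".
        string root = Path.GetFullPath(destinationDirectory);
        if (!root.EndsWith(Path.DirectorySeparatorChar))
        {
            root += Path.DirectorySeparatorChar;
        }

        // GetFullPath collapses "." and ".." segments. If entryName is rooted,
        // Path.Combine discards the root, so absolute entry names fail the
        // prefix check below as well.
        string fullPath = Path.GetFullPath(Path.Combine(root, entryName));

        if (!fullPath.StartsWith(root, StringComparison.Ordinal))
        {
            throw new IOException(
                $"Entry '{entryName}' resolves outside the destination directory.");
        }

        return fullPath;
    }
}
```

With this sketch, an entry named `docs/readme.txt` resolves under the destination directory, while `../evil.txt` throws before any file is created.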
| 19 | + |
| 20 | +## Performance & Allocation (D5) |
| 21 | + |
| 22 | +- Use `ArrayPool<byte>` for variable-size compression/decompression buffers — return buffers in finally blocks |
| 23 | +- Avoid allocating excessively large fixed buffers per operation (100KB+ per compression operation is expensive) |
| 24 | +- Pin buffers for the duration of native I/O operations |
| 25 | +- Hot paths must avoid per-operation allocations — prefer pooled buffers and cached delegates |
| 26 | +- Closures that capture state on hot paths must be eliminated — use static lambdas with explicit state |
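
As a rough illustration of the pooled-buffer rule above, the sketch below rents a buffer from `ArrayPool<byte>.Shared` and returns it in a `finally` block. `PooledCopySketch`, `CopyWithPooledBuffer`, and the 81920-byte default are assumptions made for the example, not names or values taken from the library.

```csharp
using System.Buffers;
using System.IO;

static class PooledCopySketch
{
    // Hypothetical helper: copies source to destination through a pooled
    // buffer and returns the number of bytes copied.
    public static long CopyWithPooledBuffer(Stream source, Stream destination, int bufferSize = 81920)
    {
        // Rent may hand back an array larger than requested, so the loop
        // uses buffer.Length rather than bufferSize.
        byte[] buffer = ArrayPool<byte>.Shared.Rent(bufferSize);
        long total = 0;
        try
        {
            int read;
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                destination.Write(buffer, 0, read);
                total += read;
            }
        }
        finally
        {
            // Returning in finally puts the buffer back in the pool even when
            // one of the streams throws.
            ArrayPool<byte>.Shared.Return(buffer);
        }

        return total;
    }
}
```

Rented arrays are not cleared by default, so code that handles sensitive data may want to pass `clearArray: true` to `Return`.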
| 27 | + |
| 28 | +## Async Operations |
| 29 | + |
| 30 | +- Async compression/decompression must not perform the actual compression work synchronously before the first await |
| 31 | +- Sync and async code paths must share non-trivial logic through common helpers to prevent divergence |
| 32 | + |
| 33 | +## Cross-Platform Metadata (D19) |
| 34 | + |
| 35 | +- Archive extraction must preserve or correctly translate platform-specific metadata — Unix execute permissions, symlinks, and hidden file attributes |
| 36 | +- File path operations within archives must use forward slashes as the archive-internal separator per the ZIP specification |
| 37 | +- Tests must verify metadata round-trip on both Windows and Unix platforms |
| 38 | + |
| 39 | +## Native Interop |
| 40 | + |
| 41 | +- Native library updates (brotli, zlib, zstd) must be tracked and the managed wrapper updated accordingly |
| 42 | +- Use `LibraryImport` (source-generated) for new P/Invoke declarations |
| 43 | +- SafeHandle-derived types must be used for native compression handles — never store raw IntPtr |
| 44 | +- Native error codes must be mapped to appropriate .NET exceptions with the native error code preserved |
| 45 | + |
| 46 | +## Error Handling (D9) |
| 47 | + |
| 48 | +- Exceptions must be the most specific applicable type — `InvalidDataException` for corrupt archives, `IOException` for I/O failures, with actionable context (entry name, expected vs actual values) |
| 49 | +- Operations that depend on `Length`, `Position`, or `Seek` must check `CanSeek` first, because not every stream supports them |
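
One possible shape for the stream guard mentioned above is sketched here. `SeekGuardSketch` and `TryGetRemainingLength` are hypothetical names used only for illustration.

```csharp
using System.IO;

static class SeekGuardSketch
{
    // Hypothetical helper: returns the number of bytes remaining when the
    // stream can report it, or null for non-seekable streams.
    public static long? TryGetRemainingLength(Stream stream)
    {
        // Length and Position throw NotSupportedException on streams that do
        // not support seeking, so check CanSeek before touching them.
        if (!stream.CanSeek)
        {
            return null;
        }

        return stream.Length - stream.Position;
    }
}
```

Callers can then fall back to a streaming code path when the result is `null` instead of catching `NotSupportedException`.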
| 50 | + |
| 51 | +## Interoperability Testing (D10) |
| 52 | + |
| 53 | +- Tests must use archive files created by external tools — not just round-trip tests with the same .NET implementation |
| 54 | +- Test with archives from multiple platforms and compression libraries to verify cross-tool compatibility |
| 55 | +- Cover edge cases: empty archives, many small entries, entries at size boundaries (4 GB, `uint.MaxValue`) |
| 56 | +- Dispose behavior must be tested — verify resources are released and post-disposal operations throw |
| 57 | + |