Encoding speed lags behind libpng #224
Description
Describe the bug
I'm moving a project from zlib (mainline) and libpng to zlib-ng and spng. Overall the performance improvement is great (e.g. for decoding PNG), but I notice that spng actually lags behind libpng when encoding large-ish images.
For encoding an image with these properties:
17547840 bytes (2708 * 2160 * 3), color_type = SPNG_COLOR_TYPE_TRUECOLOR, bit_depth=8, SPNG_IMG_COMPRESSION_LEVEL=6
I'm seeing these times:
zlib + libpng: ~1600ms
zlib-ng + libpng: ~510ms
zlib-ng + spng: ~740ms (SPNG_SSE=1 or 4 seems about the same)
So zlib-ng accounts for most of the speedup, and spng is actually a bit of a regression compared to libpng.
Looking at some quick perf stats, the top functions (in the entire program, which is doing some other stuff besides png encoding) are:
| Function Name | Total CPU [unit, %] | Self CPU [unit, %] | Module |
|---|---|---|---|
| longest_match_avx2 | 1957 (11.14%) | 1952 (11.11%) | dolphin |
| lzma_decode | 1250 (7.12%) | 1219 (6.94%) | dolphin |
| encode_scanline | 4485 (25.54%) | 1159 (6.60%) | dolphin |
| paeth | 788 (4.49%) | 784 (4.46%) | dolphin |
(lzma_decode is not spng related; it is only there to show that the spng functions are near the top. longest_match_avx2 is part of zlib-ng, but it is reached through calls made by spng.)
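For context on why `paeth` shows up in the profile: the Paeth filter (PNG filter type 4) runs a small predictor for every byte of a scanline, so its cost scales with the full image size. The sketch below follows the predictor as defined in the PNG specification. It is not spng's actual implementation, and the names `paeth_predictor` and `paeth_filter_row` are chosen here for illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Paeth predictor from the PNG specification:
// a = left, b = above, c = upper-left neighbour of the current byte.
inline std::uint8_t paeth_predictor(std::uint8_t a, std::uint8_t b, std::uint8_t c)
{
  const int p = int(a) + int(b) - int(c);  // initial estimate
  const int pa = std::abs(p - int(a));
  const int pb = std::abs(p - int(b));
  const int pc = std::abs(p - int(c));
  if (pa <= pb && pa <= pc)
    return a;
  if (pb <= pc)
    return b;
  return c;
}

// Applies the Paeth filter to one scanline.
// `prev` is the previous scanline (all zeros for the first row) and must have
// the same length as `cur`; `bpp` is the number of bytes per pixel.
std::vector<std::uint8_t> paeth_filter_row(const std::vector<std::uint8_t>& cur,
                                           const std::vector<std::uint8_t>& prev,
                                           std::size_t bpp)
{
  std::vector<std::uint8_t> out(cur.size());
  for (std::size_t i = 0; i < cur.size(); i++)
  {
    const std::uint8_t a = i >= bpp ? cur[i - bpp] : 0;
    const std::uint8_t b = prev[i];
    const std::uint8_t c = i >= bpp ? prev[i - bpp] : 0;
    // Filtered byte is the difference to the prediction, modulo 256.
    out[i] = static_cast<std::uint8_t>(cur[i] - paeth_predictor(a, b, c));
  }
  return out;
}
```

For the 2708 * 2160 * 3 image above, this loop body runs about 17.5 million times per encode if the Paeth filter is evaluated for every row.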
The same data as a flame graph, which shows more clearly which parts are spng related: [flame graph image]

To Reproduce
The libpng version of the code is here and here (C part).
This is the function being timed in the spng case:
bool SavePNG(const std::string& path, const u8* input, ImageByteFormat format, u32 width,
             u32 height, u32 stride, int level)
{
  spng_color_type color_type;
  switch (format) {
  case ImageByteFormat::RGB:
    color_type = SPNG_COLOR_TYPE_TRUECOLOR;
    break;
  case ImageByteFormat::RGBA:
    color_type = SPNG_COLOR_TYPE_TRUECOLOR_ALPHA;
    break;
  default:
    return false;
  }
  auto ctx = make_spng_ctx(SPNG_CTX_ENCODER);
  if (!ctx)
    return false;
  auto outfile = File::IOFile(path, "wb");
  if (spng_set_png_file(ctx.get(), outfile.GetHandle()))
    return false;
  if (spng_set_option(ctx.get(), SPNG_IMG_COMPRESSION_LEVEL, level))
    return false;
  spng_ihdr ihdr{};
  ihdr.width = width;
  ihdr.height = height;
  ihdr.color_type = color_type;
  ihdr.bit_depth = 8;
  if (spng_set_ihdr(ctx.get(), &ihdr))
    return false;
  if (spng_encode_image(ctx.get(), nullptr, 0, SPNG_FMT_PNG, SPNG_ENCODE_PROGRESSIVE))
    return false;
  for (u32 row = 0; row < height; row++) {
    const int err = spng_encode_row(ctx.get(), &input[row * stride], stride);
    if (err == SPNG_EOI)
      break;
    if (err)
      return false;
  }
  return true;
}
Expected behavior
I expect spng to be faster than libpng :)
I'm mostly curious whether these results are expected, or whether I've made some simple mistake in my use of spng that, once fixed, would make encoding faster.
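In case it helps with reproducing the comparison outside the full program, a timing wrapper along these lines could be used to measure just the encode call. This is only a sketch, not the harness that produced the numbers above; `median_time_ms` is a hypothetical helper, and taking the median of a few runs is my own choice to reduce noise.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Runs `fn` `runs` times and returns the median wall-clock time in ms.
template <typename Fn>
double median_time_ms(Fn&& fn, std::size_t runs = 5)
{
  if (runs == 0)
    return 0.0;
  std::vector<double> samples;
  samples.reserve(runs);
  for (std::size_t i = 0; i < runs; i++)
  {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto end = std::chrono::steady_clock::now();
    samples.push_back(std::chrono::duration<double, std::milli>(end - start).count());
  }
  std::sort(samples.begin(), samples.end());
  return samples[samples.size() / 2];
}
```

Usage would look like `median_time_ms([&] { SavePNG(path, input, format, width, height, stride, 6); })`, with the same wrapper around the libpng path so that both are measured the same way.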
Platform (please complete the following information):
- Architecture: x86-64 (AMD Zen 2)
- OS: Windows 11
- Version: v0.7.2