Encoding speed lags behind libpng #224

@shuffle2

Describe the bug
I'm moving a project from zlib (mainline) and libpng to zlib-ng and spng. Overall the performance improvement is great (e.g. for decoding PNG), but I notice that spng actually lags behind libpng when encoding large-ish images.
For encoding an image with these properties:
17547840 bytes (2708 * 2160 * 3), color_type = SPNG_COLOR_TYPE_TRUECOLOR, bit_depth = 8, SPNG_IMG_COMPRESSION_LEVEL = 6

I'm seeing these times:
zlib + libpng: ~1600ms
zlib-ng + libpng: ~510ms
zlib-ng + spng: ~740ms (SPNG_SSE=1 or 4 seems about the same)

So zlib-ng accounts for most of the speedup, and spng is actually a regression relative to libpng (roughly 45% slower in this test).
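To put the timings above on a common scale, they can be converted to raw-pixel throughput. The following is only an illustrative sketch: the helper names `RawImageBytes` and `ThroughputMBps` are made up for this example and are not part of spng, libpng, or the project being timed.

```cpp
#include <cstdint>

// Hypothetical helper: raw size in bytes of an 8-bit-per-channel image.
constexpr std::uint64_t RawImageBytes(std::uint32_t width, std::uint32_t height,
                                      std::uint32_t channels)
{
  return std::uint64_t{width} * height * channels;
}

// Hypothetical helper: throughput in MB/s (1 MB = 10^6 bytes) for a given
// number of input bytes and a wall-clock time in milliseconds.
constexpr double ThroughputMBps(std::uint64_t bytes, double milliseconds)
{
  return (static_cast<double>(bytes) / 1e6) / (milliseconds / 1e3);
}
```

With the numbers reported here, `RawImageBytes(2708, 2160, 3)` is 17547840, and the three configurations work out to roughly 11 MB/s (zlib + libpng), 34 MB/s (zlib-ng + libpng), and 24 MB/s (zlib-ng + spng).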

Looking at some quick perf stats, the top functions (in the entire program, which is doing some other stuff besides PNG encoding) are:

Function Name        Total CPU [unit, %]   Self CPU [unit, %]   Module
longest_match_avx2   1957 (11.14%)         1952 (11.11%)        dolphin
lzma_decode          1250 (7.12%)          1219 (6.94%)         dolphin
encode_scanline      4485 (25.54%)         1159 (6.60%)         dolphin
paeth                788 (4.49%)           784 (4.46%)          dolphin

(lzma_decode is not spng-related; it is listed only to show that spng is up there near the top. longest_match_avx2 belongs to zlib-ng, but it is reached through spng's calls into it.)
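For context on the `paeth` entry in the profile: PNG's filter type 4 predicts each byte from its left, upper, and upper-left neighbours. The sketch below follows the definition in the PNG specification. It is not spng's actual implementation (which is not shown in this issue), and the names `PaethPredictor` and `PaethFilterRow` are invented for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Paeth predictor as defined by the PNG specification:
// a = left, b = above, c = upper-left.
inline std::uint8_t PaethPredictor(std::uint8_t a, std::uint8_t b, std::uint8_t c)
{
  const int p = int{a} + int{b} - int{c};  // initial estimate
  const int pa = std::abs(p - a);
  const int pb = std::abs(p - b);
  const int pc = std::abs(p - c);
  if (pa <= pb && pa <= pc)
    return a;
  if (pb <= pc)
    return b;
  return c;
}

// Applies the Paeth filter to one scanline. `prev` may be nullptr for the
// first row, in which case the row above is treated as all zeros.
// `bpp` is bytes per pixel (3 for 8-bit truecolor).
inline void PaethFilterRow(const std::uint8_t* cur, const std::uint8_t* prev,
                           std::uint8_t* out, std::size_t row_bytes, std::size_t bpp)
{
  for (std::size_t i = 0; i < row_bytes; i++)
  {
    const std::uint8_t a = i >= bpp ? cur[i - bpp] : 0;
    const std::uint8_t b = prev ? prev[i] : 0;
    const std::uint8_t c = (prev && i >= bpp) ? prev[i - bpp] : 0;
    out[i] = static_cast<std::uint8_t>(cur[i] - PaethPredictor(a, b, c));
  }
}
```

The predictor does three absolute differences and up to two comparisons per byte, so it is the most expensive of the five PNG filter types. If an encoder evaluates several filter types per scanline before choosing one (an assumption here, not something confirmed in this issue), that per-byte cost is paid on every row of the 17.5 MB input.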

Or, as a flame graph, which better shows which parts are spng-related:
[image: flame graph]

To Reproduce
The libpng version of the code is here and here (C part).

This is the function being timed, for the spng case:

bool SavePNG(const std::string& path, const u8* input, ImageByteFormat format, u32 width,
             u32 height, u32 stride, int level)
{
  spng_color_type color_type;
  switch (format)  {
  case ImageByteFormat::RGB:
    color_type = SPNG_COLOR_TYPE_TRUECOLOR;
    break;
  case ImageByteFormat::RGBA:
    color_type = SPNG_COLOR_TYPE_TRUECOLOR_ALPHA;
    break;
  default:
    return false;
  }

  auto ctx = make_spng_ctx(SPNG_CTX_ENCODER);
  if (!ctx)
    return false;

  auto outfile = File::IOFile(path, "wb");
  if (spng_set_png_file(ctx.get(), outfile.GetHandle()))
    return false;

  if (spng_set_option(ctx.get(), SPNG_IMG_COMPRESSION_LEVEL, level))
    return false;

  spng_ihdr ihdr{};
  ihdr.width = width;
  ihdr.height = height;
  ihdr.color_type = color_type;
  ihdr.bit_depth = 8;
  if (spng_set_ihdr(ctx.get(), &ihdr))
    return false;

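  // Start progressive encoding: no image buffer is passed here, rows are fed one at a time below.
  if (spng_encode_image(ctx.get(), nullptr, 0, SPNG_FMT_PNG, SPNG_ENCODE_PROGRESSIVE))
    return false;
  for (u32 row = 0; row < height; row++)
  {
    // Widen before multiplying so the byte offset cannot overflow 32 bits on very large images.
    const int err = spng_encode_row(ctx.get(), &input[static_cast<size_t>(row) * stride], stride);
    // SPNG_EOI is treated as end of image, not as an error.
    if (err == SPNG_EOI)
      break;
    if (err)
      return false;
  }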
  return true;
}

Expected behavior
I expect spng to be faster than libpng :)

I'm mostly curious if these results are expected, or if I've made some simple error in my use of spng which would enable faster processing.

Platform (please complete the following information):

  • Architecture: x86-64 (AMD Zen 2)
  • OS: Windows 11
  • Version: v0.7.2
