Skip to content

Perf imps#1778

Merged
ice0 merged 2 commits intosynfig:masterfrom
define-private-public:perf_imps
Oct 17, 2020
Merged

Perf imps#1778
ice0 merged 2 commits intosynfig:masterfrom
define-private-public:perf_imps

Conversation

@define-private-public
Copy link
Contributor

As mentioned in #1775, the perf improvement here is very miniscule, but I'd like to push in the two scripts I wrote that should help with measurement. I plan on adding some more PRs that should be more significant.

So show you how tiny the change has been, here is some output from the view_comparison_graph.py script:
Figure_1

@lgtm-com
Copy link

lgtm-com bot commented Oct 15, 2020

This pull request introduces 1 alert when merging cb4aca2 into 8b31696 - view on LGTM.com

new alerts:

  • 1 for Unused local variable

@ice0
Copy link
Collaborator

ice0 commented Oct 15, 2020

Hi!
Nice to see you start from ColorMatrix class. Fixing it should improve performance greatly.

In my opinion, the main slowdown factor is the lack of batch processing of pixel data. Since it uses pointers to color functions, this results in an expensive function call for every pixel processed.

I can illustrate this with a small example:
test.cpp

#include <random>
#include <ctime>

inline int val2(int r, int g, int b) {
  return r + g + b*2;
}


inline int val(int r, int g, int b) {
  return r*2 + g + b;
}

int main() {
  const int count = 1024*1024*100;
  std::srand(std::time(nullptr)); // use current time as seed for random generator

  std::uint32_t out = 0;
  int r = std::rand();
  int g = std::rand();
  int b = std::rand();

  typedef int (*funcptr)(int, int, int);  // Declare typedef
  funcptr func = nullptr;    // Define function-pointer variable, and initialize
  
  // without std::rand compiler can do optimization here 
  if (std::rand() > 100) {
    func = &val;
  } else {
    func = &val2;
  }

  for (int i = 0; i < count; i++) {
  #ifdef PTR
     out += func(r, g, b);
  #else
     out += val(r, g, b);
  #endif
  }
}

test results:
call a function for each pixel

$ g++ ./test.cpp -O3 -DPTR
$ time ./a.exe

real    0m0.134s
user    0m0.000s
sys     0m0.000s

function call now inlined

$ g++ ./test.cpp -O3
$ time ./a.exe

real    0m0.032s
user    0m0.015s
sys     0m0.015s

3x speedup!

P.S.

  1. Maybe we need requirements.txt here (for python scripts)
  2. Can you change spaces to tabs? (what's not critical, but whole project uses tabs)

@define-private-public
Copy link
Contributor Author

define-private-public commented Oct 15, 2020

So far I've only been making cursory glances on instructions/lines that could be moved around to take better advantage of of auto-vectorization. I plan on doing more substantial changes, but I want to learn a bit more about the project before going a little more into adjusting the architecture of things. I'm well aware of how pointer chasing can ruin performance.

  1. I'll get you a requirements.txt soon,
  2. I'll put the spaces into tabs for the C++ code. Do you want it for the Python scripts as well? Most people try to follow PEP8 and they recommend spaces (with an indent of 4) over tabs.

@ice0
Copy link
Collaborator

ice0 commented Oct 15, 2020

I'll put the spaces into tabs for the C++ code. Do you want it for the Python scripts as well? Most people try to follow PEP8 and they recommend spaces (with an indent of 4) over tabs.

Only for C++ code.

@define-private-public define-private-public force-pushed the perf_imps branch 2 times, most recently from 446cc31 to aa46274 Compare October 16, 2020 00:32
@define-private-public
Copy link
Contributor Author

Requirements file added and spaces->tabs fixed.

@lgtm-com
Copy link

lgtm-com bot commented Oct 16, 2020

This pull request introduces 1 alert when merging aa46274 into 0576fa8 - view on LGTM.com

new alerts:

  • 1 for Unused local variable

@ice0
Copy link
Collaborator

ice0 commented Oct 16, 2020

Please check LGTM warning about num_runs variable. Is it used somewhere?

First script does a render on all of the render tests.  Second script
use data from the first to show comparisons in runs.  Read the headers
of the scripts for details

Part of synfig#1775
the `m**` values were being referenced in a constant manor.  It's best
to avoid writing data to another variable (multiple times) if a `const`
can suffice.

closes synfig#1175
@define-private-public
Copy link
Contributor Author

whoops, that was a leftover form a previous iteration. It's gone now.

@ice0 ice0 merged commit e51bebe into synfig:master Oct 17, 2020
@ice0
Copy link
Collaborator

ice0 commented Oct 17, 2020

Merged. Thank you!

@define-private-public define-private-public deleted the perf_imps branch October 17, 2020 04:23
define-private-public added a commit to define-private-public/synfig that referenced this pull request Oct 19, 2020
I didn't see much of a noticable speedup except for about three of the
test .sif files.  This does make things cleaner though.  It was
mentioned that maybe these should be removed in synfig#1778
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants