Skip to content

Conversation

@malinah
Copy link
Contributor

@malinah malinah commented Oct 24, 2018

This is a working concept of profiling API related to issue #3864.

Please review this pull-request and comment especially on the overall integration into the Micropython.

So far the main concern of the profiler is runtime instruction tracing used to create detailed code coverage reports. To achieve this the analogue of sys.settrace was implemented and puts Micropython closer to feature a debugger #3009.

Building and Running

So far only the Unix port is concerned.

Enable profiling in ports/unix/mpconfigport.h like so:

#define MICROPY_PY_SYS_SETTRACE (1)

MICROPY_PERSISTENT_CODE_SAVE is required by the MICROPY_PY_SYS_TRACE.

Python API

sys.settrace(tracefunc)

As defined in original Python3 documentation sys.settrace(tracefunc). The Frame object and Code objects are only partially implemented.

@stinos
Copy link
Contributor

stinos commented Oct 24, 2018

This looks pretty neat (some commit message will need rewording and more explanation probably), and the example runs fine for the msvc port. I do get a maximum recursion depth exceeded in debug builds though but that's expected.

@pfalcon
Copy link
Contributor

pfalcon commented Oct 24, 2018

I think this should depend on #3412 as a vehicle to make this (and many more) feature available in a standard way.

@pfalcon
Copy link
Contributor

pfalcon commented Oct 24, 2018

Has unrelated commits:

moductypes: add layout parameter to uctypes.sizeof
modutimeq: add discard method

@stinos
Copy link
Contributor

stinos commented Oct 24, 2018

I think this should depend on #3412 as a vehicle to make this (and many more) feature available in a standard way.

You mean a standard way other than defining features in mpconfigport.h? Otherwise these are just orthogonal issues (or i'm missing some point here)

@pfalcon
Copy link
Contributor

pfalcon commented Oct 24, 2018

Let's take sys.atexit() as an example. Not related to profiling per se. Should be in a separate commit, with its own config option, then test - everything as usual. Commit message should explain why that name was chosen, given that it's not compatible with CPython.

And likely, submitted as a separate PR for quicker merging (and unloading this PR for noise-free review).

@pfalcon
Copy link
Contributor

pfalcon commented Oct 24, 2018

You mean a standard way other than defining features in mpconfigport.h? Otherwise these are just orthogonal issues (or i'm missing some point here)

I mean that one argument against adding more and more features was "this is not useful for intended [production] usage of MicroPython (fitting it where other stuff doesn't fit)" and "this can only lead to fragmentation".

So, IMHO, before going to stuff more and more features, answers to those concerns should be given, and #3412 tries to do just that.

@stinos
Copy link
Contributor

stinos commented Oct 24, 2018

Yeah I get that, it's the how I don't see. Like, practically, what does this entail? Just make special_dev_build which configures PP directives/extra source files or something completely different? But probably that's a discussion which should be held on that issue instead of here :P

@pfalcon
Copy link
Contributor

pfalcon commented Oct 24, 2018

Sure, nothing magic: pfalcon/pycopy@1752b8a

@pfalcon
Copy link
Contributor

pfalcon commented Oct 24, 2018

@malinah : Oh, and otherwise, this is a great feature of course, thanks for posting!

@pfalcon
Copy link
Contributor

pfalcon commented Nov 1, 2018

@dpgeorge , You may want to review, and if correct, merge commits "tools/mpy-tools.py: Fix opcode size enumeration and constant definition." and "py/bc.c: Fix opcode size enumeration." ahead of anything else. Then setup suitable tests to avoid that in the future.

@malinah
Copy link
Contributor Author

malinah commented Mar 28, 2019

@pfalcon I see we are working around the same area (bytecode, vm) - https://github.com/pfalcon/micropython/commits/pfalcon vs https://github.com/trezor/micropython/commits/malinah/profiling. We should sync to avoid redundant work. Where can I reach you? Cheers.

@pfalcon
Copy link
Contributor

pfalcon commented Mar 28, 2019

@pfalcon I see we are working around the same area (bytecode, vm).
We should sync to avoid redundant work.

Sure, that's why comments above and #4342.

Where can I reach you?

I'm here.

@malinah
Copy link
Contributor Author

malinah commented Jun 19, 2019

Hi, I reworked the whole implementation and it should make more sense now. However, I have a problem building the uPy with MICROPY_PERSISTENT_CODE_SAVE enabled. I'm getting some errors in py/persistentcode.c that I'm currently ugly-patching (not in this PR) and I don't know how to fix them properly. Can you give me a feedback please? Cheers and thanks.

Copy link
Contributor

@pfalcon pfalcon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From 1st look, the update spots about the same issues as the original patch had.

@malinah
Copy link
Contributor Author

malinah commented Jul 1, 2019

Hi, If you could give me more feedback please that would be great. @dpgeorge @pfalcon @stinos. Thank you and cheers!

@stinos
Copy link
Contributor

stinos commented Jul 1, 2019

From a quick look: code itself looks sort of fine and operational, but it's a bit hard to provide constructive feedback on it because the commits look like work in progress built on top of work in progress. I'd suggest to rework this so it's close to it's final state:

  • remove dead code like commented out code, sections guarded with if 0 without explanation why, bunch of TODO comments etc. Also e.g. in your tests: all your functions are called test_xxx so it is really unnecessary to have a comment # xxx on the line above them.
  • rebase the whole thing, squash and/or split fixup commits where they belong; you have commits like 'Fix off by one', I get that it's nice for you to remember but this cannot be merged as a separate commit. Likewise for commits which consist only of renaming of variables. And as mentioned earlier, introducing new features which aren't strictly related to or requires for settrace like atexit should be in separate commits.
  • on the other hand I'm not sure it's interesting to have separate commits for adding new files and actually using them (just my idea, others might disagree), it's pretty hard to tell now from which commit on there's actually a usable implementation, e.g. there's a separate commit with 'py: Stitch tracing into the vm.' but is there any use for the previous commits if this code isn't present?
  • add proper commit messages: most commits consists of only a subject line, general guideline for commits is that the commit message body explains why a change is needed and what it does. Not saying all commits need this but just as an example there's a commit 'unix: Trigger uatxit callback when actually exiting.' Looks interesting, but to understand why this is suddenly needed from the code alone it looks like I'd have to go through the code myself. (i.e. MICROPY_PY_SYS_UATEXIT is implemented already, does it not work as expected now?) If the commit would explain why this is needed and what the effect is everything is clear at first sight..

@prusnak
Copy link
Contributor

prusnak commented Jul 1, 2019

@stinos While I agree with the comments above, it is possible to view squashed changes as one commit here: https://github.com/micropython/micropython/pull/4265/files

@stinos
Copy link
Contributor

stinos commented Jul 1, 2019

@prusnak thanks but I'm aware of that already :)

@dpgeorge
Copy link
Member

dpgeorge commented Jul 9, 2019

@malinah thanks a lot for posting this PR, it's a welcome feature for sure, useful for (at least) profiling, debugging, coverage testing.

The commits here still need a fair bit of work to be in a state for merging. There are minor things like camel-case variable names and duplicated code (eg prof_extract_prelude functionality exists in persistentcode.c), and lots of seemingly unused code (eg prof_bytecode_linenox and PROF_PRINT_INSTR).

But more importantly, from a bigger picture point of view: ideally the commits would be made to add logically independent functionality. Eg first just add bare-bones sys.settrace() functionality that just calls the callback with None passed in. Then move on to implement frame objects, etc, gradually adding functionality. It doesn't have to be 1 commit for each set of working functionality, but it would be great if there were a handful of commits that added something self contained (even better with working tests), which could be more easily reviewed and merged.

Anyway, I'm not sure how you'd like to proceed because it will be a lot more work to make it a clean PR in this way. At the very least please remove unused code so there's less to review.


I did test this PR on unix and stm32 (with minor changes to get it working on stm32) and it does work very well, and has a very faithful API the same as CPython. So well done there!

On unix 64-bit the code size increase is around 11k. On stm32 it's around 8.5k. That's quite acceptable IMO.

As for performance, I tested both these ports before and after enabling the profiling feature. Note that with profiling enabled it doesn't actually set sys.settrace(), the following numbers are testing the VM speed with the additional hooks needed by the profiler (overhead of profiling when not used).

For unix 64-bit:

diff of scores (higher is better)
N=5000 M=1000             unix-base ->  unix-prof         diff      diff% (error%)
bm_chaos.py                14832.12 ->   11159.97 :   -3672.15 = -24.758% (+/-7.65%)
bm_fannkuch.py                 7.33 ->       5.49 :      -1.84 = -25.102% (+/-1.84%)
bm_float.py               301922.69 ->  220679.32 :  -81243.37 = -26.909% (+/-3.23%)
bm_hexiom.py                 673.90 ->     446.06 :    -227.84 = -33.809% (+/-2.47%)
bm_nqueens.py             300178.77 ->  222071.98 :  -78106.79 = -26.020% (+/-3.16%)
bm_pidigits.py              2055.16 ->    2028.26 :     -26.90 =  -1.309% (+/-2.46%)
misc_aes.py                17334.31 ->   10074.71 :   -7259.60 = -41.880% (+/-3.77%)
misc_mandel.py            128709.37 ->  104342.61 :  -24366.76 = -18.932% (+/-4.67%)
misc_pystone.py            91528.72 ->   62108.57 :  -29420.15 = -32.143% (+/-3.61%)
misc_raytrace.py           25008.10 ->   19785.31 :   -5222.79 = -20.884% (+/-4.55%)
viper_call0.py             24579.51 ->   23262.08 :   -1317.43 =  -5.360% (+/-11.54%)
viper_call1a.py            23672.42 ->   23459.22 :    -213.20 =  -0.901% (+/-13.28%)
viper_call1b.py            18712.93 ->   19400.68 :    +687.75 =  +3.675% (+/-13.61%)
viper_call1c.py            20015.56 ->   18945.96 :   -1069.60 =  -5.344% (+/-9.03%)
viper_call2a.py            22129.29 ->   23672.68 :   +1543.39 =  +6.974% (+/-13.03%)
viper_call2b.py            16035.62 ->   17589.49 :   +1553.87 =  +9.690% (+/-13.30%)

For PYBv1.0:

diff of scores (higher is better)
N=100 M=100               pyb-base  ->   pyb-prof         diff      diff% (error%)
bm_chaos.py                  294.77 ->     259.99 :     -34.78 = -11.799% (+/-0.01%)
bm_fannkuch.py                72.11 ->      62.56 :      -9.55 = -13.244% (+/-0.00%)
bm_float.py                 4779.11 ->    4204.22 :    -574.89 = -12.029% (+/-0.01%)
bm_hexiom.py                  34.46 ->      27.34 :      -7.12 = -20.662% (+/-0.01%)
bm_nqueens.py               4103.28 ->    3374.21 :    -729.07 = -17.768% (+/-0.01%)
bm_pidigits.py               624.21 ->     596.84 :     -27.37 =  -4.385% (+/-0.36%)
misc_aes.py                  342.47 ->     280.31 :     -62.16 = -18.150% (+/-0.00%)
misc_mandel.py              2980.08 ->    2672.15 :    -307.93 = -10.333% (+/-0.01%)
misc_pystone.py             1875.17 ->    1513.70 :    -361.47 = -19.277% (+/-0.00%)
misc_raytrace.py             297.19 ->     265.28 :     -31.91 = -10.737% (+/-0.01%)
viper_call0.py               576.83 ->     576.70 :      -0.13 =  -0.023% (+/-0.00%)
viper_call1a.py              550.37 ->     550.25 :      -0.12 =  -0.022% (+/-0.00%)
viper_call1b.py              434.90 ->     434.85 :      -0.05 =  -0.011% (+/-0.01%)
viper_call1c.py              439.47 ->     439.37 :      -0.10 =  -0.023% (+/-0.01%)
viper_call2a.py              536.28 ->     536.19 :      -0.09 =  -0.017% (+/-0.01%)
viper_call2b.py              377.28 ->     377.22 :      -0.06 =  -0.016% (+/-0.00%)

Unix sees a loss of 20-40% performance. PYBv1.0 sees a 10-20% loss. They both agree that misc_aes, bm_hexiom and misc_pystone are the tests with the biggest performance loss (they have lots of lines of code so makes sense that VM performance is down). And bm_pidigits sees very little loss because it's mainly stressing bignum arithmetic (makes sense).

py/profiling.c Outdated
}

// OPCODE
if (false) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A way to implement this would be to support setting frame.f_trace_opcodes=True, a feature added in CPython 3.7.

@malinah malinah force-pushed the malinah/profiling branch from 62a40d2 to b9d1104 Compare July 11, 2019 15:06
@dpgeorge
Copy link
Member

@malinah I see you force-pushed this PR. Please let me know when it's ready for another review.

@malinah
Copy link
Contributor Author

malinah commented Aug 14, 2019

@malinah I see you force-pushed this PR. Please let me know when it's ready for another review.

@dpgeorge I think I modified the code and commits according to resolved comments. Check it out. Cheers!

@malinah malinah changed the title WIP POC Profiling. settrace implementation. Profiling: sys.settrace implementation. Aug 14, 2019
@malinah malinah force-pushed the malinah/profiling branch from a858696 to 0db9b68 Compare August 15, 2019 02:11
@malinah malinah closed this Aug 15, 2019
@malinah malinah reopened this Aug 15, 2019
@malinah malinah force-pushed the malinah/profiling branch from e11cb83 to e210bdf Compare August 19, 2019 15:35
 - Introducing the feature into the vm.c and surrounding runtime.
 - Introducing the feature into the build system and configuration.
 - To enable the feature enable MICROPY_PY_SYS_SETTRACE in the
   mpconfigport.h in the unix port. Other ports are not supported
   right now.
 - tests/misc: Added initial tests for `sys.settrace` feature.
 - tests/run-tests: register misc/trace_profilin.py as special testcase.
 - tests/cmdline/cmd_showbc.py had to be updated because of the changes
   in the compile.c regarding the `sys.settrace` specific line emitters.
@malinah malinah force-pushed the malinah/profiling branch from e210bdf to a403df7 Compare August 19, 2019 15:45
@malinah
Copy link
Contributor Author

malinah commented Aug 19, 2019

@dpgeorge All comments worked in and rebased on the current master.

@dpgeorge
Copy link
Member

Superseded by #5026

@dpgeorge dpgeorge closed this Aug 30, 2019
kamtom480 pushed a commit to kamtom480/micropython that referenced this pull request Mar 3, 2021
Clear out PIOs and State Machines on RP2040 soft reset
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants