feat: prepare and pool processes by WinPlay02 · Pull Request #87 · Safe-DS/Runner

WinPlay02 · 2024-04-17T16:09:23Z

Closes #85

Summary of Changes

* Use a process pool to keep started processes waiting.
* The max. amount of pipeline processes is now set to 4.
* Reuse started processes. This should be correct, as the same pipeline process cannot be used by multiple pipelines at the same time. As the metapath is reset to remove the custom generated Safe-DS pipeline code, only global library imports (and settings) should remain. If this is a concern, `maxtasksperchild` can be set to 1, in which case pipeline processes are not reused.
* Reuse the shared memory location for saving placeholders, if the memoization infrastructure has added such a location to the object being saved.
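
The pooling and reuse described above can be sketched roughly as follows. This is a minimal illustration built on the standard library's `multiprocessing.Pool`, not the Runner's actual code: `run_pipeline`, `run_all`, `main`, and the two constants are hypothetical names chosen for the example.

```python
import multiprocessing
import os

MAX_PIPELINE_PROCESSES = 4  # cap named in the summary above
MAX_TASKS_PER_CHILD = None  # None: reuse workers; 1: replace a worker after one pipeline


def run_pipeline(pipeline_id: str) -> tuple[str, int]:
    """Hypothetical stand-in for one pipeline run; returns the id and the worker's PID."""
    return pipeline_id, os.getpid()


def run_all(pipeline_ids: list[str]) -> list[tuple[str, int]]:
    """Run every pipeline on a bounded pool of worker processes."""
    with multiprocessing.Pool(
        processes=MAX_PIPELINE_PROCESSES,
        maxtasksperchild=MAX_TASKS_PER_CHILD,
    ) as pool:
        return pool.map(run_pipeline, pipeline_ids)


def main() -> None:
    results = run_all([f"pipeline-{i}" for i in range(8)])
    distinct_workers = {pid for _, pid in results}
    # With reuse enabled, eight runs share at most four worker processes.
    print(f"{len(results)} runs on {len(distinct_workers)} worker process(es)")
```

Call `main()` from under an `if __name__ == "__main__":` guard, since platforms that spawn workers re-import the main module. Setting `MAX_TASKS_PER_CHILD = 1` trades the startup saving for a fresh interpreter per pipeline.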

github-actions · 2024-04-17T16:11:11Z

🦙 MegaLinter status: ✅ SUCCESS

| Descriptor | Linter | Files | Fixed | Errors | Elapsed time |
|---|---|---|---|---|---|
| ✅ PYTHON | black | 2 | 0 | 0 | 0.78s |
| ✅ PYTHON | mypy | 2 | | 0 | 2.24s |
| ✅ PYTHON | ruff | 2 | 0 | 0 | 0.03s |
| ✅ REPOSITORY | git_diff | yes | | no | 0.02s |

See detailed report in MegaLinter reports
Set VALIDATE_ALL_CODEBASE: true in mega-linter.yml to validate all sources, not only the diff

codecov · 2024-04-17T16:18:53Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (50d831f) to head (fa53a52).

Additional details and impacted files

@@            Coverage Diff            @@
##              main       #87   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           14        14           
  Lines          733       750   +17     
=========================================
+ Hits           733       750   +17

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

lars-reimann · 2024-04-17T17:30:36Z

> The max. amount of pipeline processes is now set to the amount of available CPU cores.

We probably don't need that many:

Most of the time, the processes will idle. As long as the runner is local, 2 workers should be enough. I don't think the user would fire off pipeline runs without waiting for completion of the previous one.

WinPlay02 · 2024-04-17T17:38:29Z

> Most of the time, the processes will idle. As long as the runner is local, 2 workers should be enough. I don't think the user would fire off pipeline runs without waiting for completion of the previous one.

I see, that's a lot of memory wasted. But two processes are easily saturated, if two tables are opened in the EDA view (waiting for stats). A normal execution would only be queued and needs to wait for any running pipelines to complete first.
The statistical analysis of the EDA view does another pipeline execution after the table has been fetched, according to my observations.
I'd say 4 would probably be a good middle ground to limit the processes to.

WinPlay02 · 2024-04-17T17:42:44Z

Also, coverage seems broken again. It doesn't seem to like multiprocessing pools very much.

lars-reimann · 2024-04-17T17:44:41Z

> I'd say 4 would probably be a good middle ground to limit the processes to.

Sure, seems good to me. Can easily be tweaked later.

> The statistical analysis of the EDA view does another pipeline execution after the table has been fetched, according to my observations.

But then the original worker is already free again, no?

lars-reimann · 2024-04-17T17:47:10Z

> Also, coverage seems broken again. It doesn't seem to like multiprocessing pools very much.

I would not consider this a blocker for this PR. I'll investigate whether that can be fixed later.

Edit: Maybe this already fixes the issue.

WinPlay02 · 2024-04-17T17:55:58Z

> But then the original worker is already free again, no?

When viewing one table, this is true. But everything would break, as soon as two tables are waiting for stats. This might be too cautious, but since gathering the stats for the EDA view takes a bit of time, the delay could be irritating.

lars-reimann · 2024-04-17T17:57:47Z

> But everything would break, as soon as two tables are waiting for stats.

Break in what way?

WinPlay02 · 2024-04-17T18:01:34Z

> > But everything would break, as soon as two tables are waiting for stats.
>
> Break in what way?

Maybe break is not the right description.
Manual executions wait for as long as stats and plots are being calculated. For large tables, this can take some time.
As a user, I'd think something broke, as nothing (obvious) is happening anymore.
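
The waiting described here can be made concrete with a small queueing simulation. This is only an illustrative sketch, not Runner code: `start_times` is a hypothetical helper, all jobs are assumed to be submitted at time 0 and served first-come-first-served, and the durations are made-up numbers.

```python
import heapq


def start_times(durations: list[float], workers: int) -> list[float]:
    """Return when each queued job starts if `workers` processes serve them in order."""
    if workers < 1:
        raise ValueError("at least one worker is required")
    # Min-heap of the times at which each worker becomes idle again.
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    starts = []
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest idle worker takes the job
        starts.append(start)
        heapq.heappush(free_at, start + duration)
    return starts


# Two slow stats jobs (10 s each, illustrative) followed by one manual run (1 s).
jobs = [10.0, 10.0, 1.0]
print(start_times(jobs, workers=2))  # [0.0, 0.0, 10.0]: the manual run waits
print(start_times(jobs, workers=4))  # [0.0, 0.0, 0.0]: a spare worker is free
```

With two workers the manual run cannot start until one stats job finishes, which is the stall a user would perceive as breakage; with four workers it starts immediately.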

lars-reimann · 2024-04-17T18:04:38Z

> > > But everything would break, as soon as two tables are waiting for stats.
>
> > Break in what way?
>
> Maybe break is not the right description. Manual executions wait for as long as stats and plots are being calculated. For large tables, this can take some time. As a user, I'd think something broke, as nothing (obvious) is happening anymore.

I see, that's an issue for another day (and someone else 😉).

WinPlay02 · 2024-04-17T18:36:33Z

This PR should lead to a modest improvement in startup time, and a small improvement (mostly depending on the content) in runtime.

I don't know what the target is, and how much potential for startup-optimization is left, but I was able to get small and medium tables after a few 100ms, after pressing the "Explore" lens.

lars-reimann · 2024-04-17T19:26:11Z

> This PR should lead to a modest improvement in startup time, and a small improvement (mostly depending on the content) in runtime.
>
> I don't know what the target is, and how much potential for startup-optimization is left, but I was able to get small and medium tables after a few 100ms, after pressing the "Explore" lens.

It already feels a lot better. The last idea I'd have is to add an initializer to the process pool doing something like

def _init():
    from safeds.data.tabular.containers import Table

    Table()

Then all processes can immediately display a Table.

lars-reimann · 2024-04-17T19:58:14Z

There seems to be an issue with the memoization of block & expression lambdas now:

package test

pipeline whoSurvived {
    val titanic = Table
        .fromCsvFile("titanic.csv")
        .removeColumns(["id", "ticket", "cabin", "port_embarked", "fare"]);

    val filtered = titanic.filterRows((row) ->
        row.getValue("age") as Float <= 100
    );
}

If you change the 100 and explore filtered a couple of times, you eventually stop getting updates.

WinPlay02 · 2024-04-17T21:31:20Z

> There seems to be an issue with the memoization of block & expression lambdas now:
>
> package test
>
> pipeline whoSurvived {
>     val titanic = Table
>         .fromCsvFile("titanic.csv")
>         .removeColumns(["id", "ticket", "cabin", "port_embarked", "fare"]);
>
>     val filtered = titanic.filterRows((row) ->
>         row.getValue("age") as Float <= 100
>     );
> }
>
> If you change the 100 and explore filtered a couple of times, you eventually stop getting updates.

I can't reproduce this right now (but I'm seeing another somewhat related error, Error during pipeline execution: Document not found, during exploration).
I suppose this is the problem from above, that exploring causes stats collection to fill up all available pipeline processes, which causes everything after that to stall.
Considering that, I don't know what about the lambdas would be special to break the memoization/runner.

lars-reimann · 2024-04-21T19:03:05Z

I've started looking into this a little: It seems like inspect.getsource itself returns outdated values:

2024-04-21 20:57:34.616 [debug] root:Received Message: {"type":"program","id":"166d653e-e280-4455-946c-374a60f4e2a1","data":{"code":{"demo":{"gen_titanic":"# Imports ----------------------------------------------------------------------\r\n\r\nimport safeds_runner\r\nfrom safeds.data.tabular.containers import Column\r\n\r\n# Pipelines --------------------------------------------------------------------\r\n\r\ndef example3():\r\n    column = safeds_runner.memoized_static_call(\"safeds.data.tabular.containers.Column\", lambda *_ : Column('test', data=[1, 2, 3]), ['test', [1, 2, 3]], [])\r\n    safeds_runner.save_placeholder('column', column)\r\n    def __gen_lambda_0(param1):\r\n        return (param1) < (2)\r\n    allMatch = safeds_runner.memoized_dynamic_call(\"all\", None, [column, __gen_lambda_0], [])\r\n    safeds_runner.save_placeholder('allMatch', allMatch)\r\n","gen_titanic_example3":"from .gen_titanic import example3\r\n\r\nif __name__ == '__main__':\r\n    example3()\r\n"}},"main":{"modulepath":"demo","module":"titanic","pipeline":"example3"},"cwd":"c:\\Users\\Lars\\OneDrive\\Desktop\\test"}}
2024-04-21 20:57:34.621 [debug] root:Looking up value for key ('safeds.data.tabular.containers._column.Column.all', (ExplicitIdentityWrapperLazy(_value=Column('test', [1, 2, 3]), memory=SharedMemory('wnsm_78221081', size=4096), id=UUID('ae7dbd79-497e-4ebb-ac2f-6aaf47184952'), hash=504026043636193121), '    def __gen_lambda_0(param1):\n        return (param1) < (4)\n'), ())

In the program message, the lambda code is

def __gen_lambda_0(param1):
    return (param1) < (2)

but in the key it's

def __gen_lambda_0(param1):
    return (param1) < (4)

I've also had it throw an exception after adding more lines to a pipeline that contain lambdas:

2024-04-21 21:00:18.067 [debug] [Runner] [9fd84cfe-e31c-4201-b80f-86cef175ae0a] lineno is out of bounds
	at E:\Repositories\safe-ds\Runner\src\safeds_runner\server\_pipeline_manager.py line 298
	at <frozen runpy> line 226
	at <frozen runpy> line 98
	at <frozen runpy> line 88
	at demo/gen_titanic_example3 line 4
	at demo/gen_titanic line 19 (mapped to 'titanic.sds' line 7)
	at E:\Repositories\safe-ds\Runner\src\safeds_runner\server\_pipeline_manager.py line 419
	at E:\Repositories\safe-ds\Runner\src\safeds_runner\server\_memoization_map.py line 146
	at E:\Repositories\safe-ds\Runner\src\safeds_runner\server\_memoization_utils.py line 396
	at E:\Repositories\safe-ds\Runner\src\safeds_runner\server\_memoization_utils.py line 343
	at E:\Repositories\safe-ds\Runner\src\safeds_runner\server\_memoization_utils.py line 343
	at E:\Repositories\safe-ds\Runner\src\safeds_runner\server\_memoization_utils.py line 345
	at C:\Users\Lars\AppData\Local\Programs\Python\Python312\Lib\inspect.py line 1282
	at C:\Users\Lars\AppData\Local\Programs\Python\Python312\Lib\inspect.py line 1264
	at C:\Users\Lars\AppData\Local\Programs\Python\Python312\Lib\inspect.py line 1128
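
The stale-source behavior can be reproduced outside the runner with a small sketch. It assumes that generated code is compiled under a virtual filename and that its text reaches `inspect.getsource` through `linecache`; the filename, the helper `load_generated`, and the manual cache registration are hypothetical choices for the demo, not the Runner's loader.

```python
import inspect
import linecache

# Hypothetical virtual filename; no file with this name is expected on disk.
FILENAME = "virtual_gen_pipeline_demo.py"

OLD_SOURCE = "def gen_lambda_0(param1):\n    return (param1) < (2)\n"
NEW_SOURCE = "def gen_lambda_0(param1):\n    return (param1) < (4)\n"


def load_generated(source: str, refresh_line_cache: bool):
    """Compile generated code under FILENAME and return its lambda function."""
    if refresh_line_cache:
        # Entry layout used by linecache: (size, mtime, lines, fullname).
        # mtime=None marks a non-file source, which checkcache() never invalidates.
        linecache.cache[FILENAME] = (
            len(source), None, source.splitlines(keepends=True), FILENAME,
        )
    namespace = {"__name__": "virtual_gen_pipeline_demo"}
    exec(compile(source, FILENAME, "exec"), namespace)
    return namespace["gen_lambda_0"]


load_generated(OLD_SOURCE, refresh_line_cache=True)

# Regenerate the code but leave the old cache entry in place.
stale_fn = load_generated(NEW_SOURCE, refresh_line_cache=False)
stale_source = inspect.getsource(stale_fn)  # text of the OLD definition

# Regenerate again and replace the cache entry first.
fresh_fn = load_generated(NEW_SOURCE, refresh_line_cache=True)
fresh_source = inspect.getsource(fresh_fn)  # text of the NEW definition

print(stale_fn(3), "(2)" in stale_source, "(4)" in fresh_source)
```

The function behaves like the new code while `inspect.getsource` still reports the old text, so a memoization key built from that text would not match the code that actually ran. If the cached text is shorter than the line number recorded in the new code object, `inspect` raises `lineno is out of bounds` instead, as in the trace above.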

lars-reimann · 2024-04-21T20:10:59Z

Looks like this Python bug is back. Got it working now 🎉.

lars-reimann

Execution time feels great now, awesome stuff!

WinPlay02 · 2024-04-21T21:22:42Z

Thanks for investigating further, and great that it is now working 🎉

## [0.12.0](v0.11.0...v0.12.0) (2024-04-22)

### Features

* handle list of filenames in `absolute_path` and `file_mtime` ([#89](#89)) ([50d831f](50d831f)), closes [#88](#88)
* prepare and pool processes ([#87](#87)) ([e5e7011](e5e7011)), closes [#85](#85)

lars-reimann · 2024-04-22T18:27:28Z

🎉 This PR is included in version 0.12.0 🎉

The release is available on:

v0.12.0
GitHub release

Your semantic-release bot 📦🚀

WinPlay02 added 3 commits April 17, 2024 17:29

feat: run pipeline process in a process pool

fb5f536

feat: allow reusing of existing pipeline processes

25ca16f

Merge branch 'main' into process-prime

a8931ec

WinPlay02 and others added 5 commits April 17, 2024 18:38

feat: don't serialize/deserialize when saving placeholder, if not needed

97efe05

docs: document _catch_subprocess_error

98d8711

style: use other multiprocessing.pool.Pool definition

5438660

style: do not use non-api function to check that process pool has sta…

92e453a

…rted

style: apply automated linter fixes

4c9507a

WinPlay02 and others added 3 commits April 17, 2024 19:47

test: remove skipping of certain tests, it doesn't seem needed anymore

e35d1f6

test: verify that memoized types are correctly sent misc: change max amount of pipeline processes to 4

Merge remote-tracking branch 'origin/process-prime' into process-prime

16020ee

style: apply automated linter fixes

1de867f

WinPlay02 and others added 4 commits April 17, 2024 20:11

test: cleanly shutdown process pool to fix collection of coverage

268c995

Merge branch 'process-prime' of https://github.com/Safe-DS/Runner int…

617e599

…o process-prime

test: use SafeDsEncoder as an object, that is not deterministically h…

07e542a

…ashable

style: apply automated linter fixes

46c02e2

WinPlay02 marked this pull request as ready for review April 17, 2024 18:36

WinPlay02 requested a review from a team as a code owner April 17, 2024 18:36

Merge branch 'main' into process-prime

bc4eeaa

lars-reimann force-pushed the process-prime branch from 16ecbf7 to 505b5fb Compare April 21, 2024 19:15

refactor: use a ProcessPoolExecutor, so workers are spawned on dema…

f5afd39

…nd up to a maximum

lars-reimann force-pushed the process-prime branch from 505b5fb to f5afd39 Compare April 21, 2024 19:28

fix: clear line cache so getSource does not return outdated values

0c3f285

lars-reimann force-pushed the process-prime branch from 634209d to 0c3f285 Compare April 21, 2024 20:15

style: apply automated linter fixes

fa53a52

lars-reimann approved these changes Apr 21, 2024

View reviewed changes

lars-reimann merged commit e5e7011 into main Apr 21, 2024

lars-reimann deleted the process-prime branch April 21, 2024 20:30

lars-reimann added the released Included in a release label Apr 22, 2024

WinPlay02 mentioned this pull request May 2, 2024

CUDA tensors not released in time #78

Closed
