Remove cython md5 hashing since it breaks the build process#3254
Remove cython md5 hashing since it breaks the build process#3254sciunto merged 3 commits intoscikit-image:masterfrom
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3254 +/- ##
==========================================
- Coverage 87.77% 87.01% -0.77%
==========================================
Files 323 340 +17
Lines 27138 27430 +292
==========================================
+ Hits 23820 23867 +47
- Misses 3318 3563 +245
Continue to review full report at Codecov.
|
5487ae2 to
e52dfe7
Compare
|
Other people thinking about parallel builds: matplotlib/matplotlib#1507 |
d3a4cb5 to
d17a82c
Compare
|
As a note, if you setup CCACHE correctly, then recompiling skimage takes almost no time. |
d17a82c to
c2e4b4a
Compare
skimage/_build.py
Outdated
| pyx_files[i] = pyxfile.replace('.pyx.in', '.pyx') | ||
|
|
||
| return md5_cached != md5_new.encode('utf-8') | ||
| cythonize(pyx_files, nthreads=8) |
There was a problem hiding this comment.
I'm not sure about hardcoding the number of threads. I mean this looks like a reasonable default, but I'm just not sure :).
There was a problem hiding this comment.
Let me experiment with this number tonight. Cython might use ncores*2 automatically.
There was a problem hiding this comment.
c2e4b4a to
5f11cb4
Compare
|
This might (partially) solve #3387 don't you think? |
|
@hmaarrfk Could you rebase please? Could you indicate your command line for ccache as well? |
stefanv
left a comment
There was a problem hiding this comment.
Is this a new feature in Cython? How does it know not to regenerate the C file?
skimage/_build.py
Outdated
| pyx_files[i] = pyxfile.replace('.pyx.in', '.pyx') | ||
|
|
||
| return md5_cached != md5_new.encode('utf-8') | ||
| # Cython doesn't automatically chose a number of threads > 1 |
ccd97f9 to
0544afd
Compare
|
@stefanv beats me, this might be kinda like "I'm so new to cython I do not really remember when it didn't do that" |
|
@sciunto I don't really know a reliable installation guide, can you try type these manually to test |
|
I have this in my |
|
|
How did python 3.7 finish so quickly? 15 mins vs 30+ mins. |
|
I gained 2 minutes (from 7 to 5) on my machine. I changed the path to reflect the difference of my distribution. (which g++ /usr/lib/ccache/bin/g++) The gain is not tremendous. |
|
ccache should make gcc commands instant... i7 laptop:
|
|
Can you give the output of: Mine is (once ccache is built of course) |
|
Thanks for sharing. Quite weird over here. My cache size was 2GB over 5GB. Cache hits were low and only preprocessed. |
|
it is not always perfect, and soemtimes finiky, but yeah, sometimes it works. |
|
Look at the time stamps of the generated files:
```
(py36) carbo:/tmp $ ls -al foo.c
-rw-r--r-- 1 stefan stefan 159139 Sep 12 10:53 foo.c
(py36) carbo:/tmp $ cython foo.pyx
(py36) carbo:/tmp $ ls -al foo.c
-rw-r--r-- 1 stefan stefan 159139 Sep 12 10:54 foo.c
```
|
|
@stefanv but we aren't calling cythonize directly, we are calling it through python. I think it behaves differently for me: Try this command |
|
@stefanv I think cython and cythonize are two different commands. Dam dam dam............. cythonizecython |
|
To test this, you need to have it inside of a setup.py file:
```
from distutils.core import setup, Extension
from Cython.Build import cythonize
module1 = cythonize('foo.pyx')[0]
setup (name = 'PackageName',
version = '1.0',
description = 'This is a demo package',
ext_modules = [module1])
```
But, you are right that, in combination with distutils, it remembers
what has been built before and doesn't rebuild it. I would be curious
to know where that information is stored!
So, yes, +1 from me on this change then.
|
|
For an automatic selection of the number of threads, use |
a566694 to
4dd5152
Compare
|
Thanks @emmanuelle tests are now failing because of #3432 |
4dd5152 to
60e04bf
Compare
|
@jni, let me rebase. |
60e04bf to
7a3c154
Compare
|
Thanks for waiting @jni, this build stuff tends to break if we wait too long between test + merge. |
|
I merge for @jni. |
|
💯 |
|
A comment regarding the discussion on CCACHE. I checked on a system with a i7 CPU, and it divided the duration by a factor 4. It might be that my old atom CPU is not good enough for such optimization. |
|
Sorry you have an Atom 😢 |
In developing #3253, I had to keep issuing commands:
this is because the hashing doesn't consider
pxdfiles.This leads to the cython compiler doing really weird things and crashing.
Finally, cython will check itself if the files have changed, so this isn't necessary.
As a bonus, if you clump together c and c++ files seperately, cython will cythonize them in parallel using our infrastructure. This speeds things up greatly on multicore machines.
You have to hard code the number of threads to use, or maybe somebody knows theThanks @emmanuellenproccommand in python, but 8 is a safe number for now :/Checklist
[It's fine to submit PRs which are a work in progress! But before they are merged, all PRs should provide:]
./doc/examples(new features only)./benchmarks, if your changes aren't covered by anexisting benchmark
[For detailed information on these and other aspects see scikit-image contribution guidelines]
References
[If this is a bug-fix or enhancement, it closes issue # ]
[If this is a new feature, it implements the following paper: ]
For reviewers
(Don't remove the checklist below.)
later.
__init__.py.doc/release/release_dev.rst.@meeseeksdev backport to v0.14.x