[tools] Parallelize tools/clang_format_new.py #34750

Closed
SplitInfinity wants to merge 1 commit into pytorch:master from SplitInfinity:parallelize-clang-format

Conversation

@SplitInfinity

Summary
This commit parallelizes the invocation of clang-format on all files
in tools/clang_format_new.py using asyncio.

Testing
Ran and timed the script.

Before

$ time ./tools/clang_format_new.py  --diff
...
real	0m7.615s
user	0m6.012s
sys	0m1.634s

After

$ time ./tools/clang_format_new.py  --diff
...
Some files not formatted correctly

real	0m2.156s
user	0m8.488s
sys	0m3.201s
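
The approach described in the summary can be sketched as follows. This is a minimal, hypothetical illustration, not the actual contents of tools/clang_format_new.py; the helper names are invented, and the Python interpreter stands in for the clang-format binary so the sketch runs anywhere.

```python
import asyncio
import sys

async def run_formatter_on_file(filename: str) -> bool:
    # In the real script this would invoke clang-format on `filename`;
    # here the Python interpreter is a harmless stand-in.
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", "pass",
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    # Wait for the child and collect its output so it is properly reaped.
    await proc.communicate()
    return proc.returncode == 0

async def format_all(files):
    # Launch one subprocess per file and wait for all of them concurrently.
    results = await asyncio.gather(*(run_formatter_on_file(f) for f in files))
    return all(results)

if __name__ == "__main__":
    ok = asyncio.run(format_all(["a.cpp", "b.cpp", "c.cpp"]))
    print("ok" if ok else "Some files not formatted correctly")
```

Because the subprocesses run concurrently, wall-clock time drops while total CPU time (user + sys) goes up, matching the before/after numbers above.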

@SplitInfinity SplitInfinity requested a review from suo March 13, 2020 23:48
@SplitInfinity
Author

Uhh, I must be doing something weird because I'm seeing this in the regular, non-diff mode:

/opt/anaconda3/lib/python3.7/asyncio/unix_events.py:867: RuntimeWarning: A loop is being detached from a child watcher with pending handlers
  RuntimeWarning)

@dr-ci

dr-ci Bot commented Mar 14, 2020

💊 CircleCI build failures summary and remediations

As of commit 5d7320c (more details on the Dr. CI page):


  • 2/3 failures introduced in this PR

  • 1/3 broken upstream at merge base 4425619 from Mar 12 until Mar 13 (61 commits; 962e362 - 1734bd6)

    Please rebase on the viable/strict branch (expand for instructions)

    Since your merge base is older than viable/strict, run these commands:

    git fetch https://github.com/pytorch/pytorch viable/strict
    git rebase FETCH_HEAD
    

    Check out the recency history of this "viable master" tracking branch.


🕵️ 2 new failures recognized by patterns

The following build failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_build (1/2)

Step: "Build" (full log | pattern match details) <confirmed not flaky by 2 failures>

Automatic merge failed; fix conflicts and then commit the result.
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/pytorch_build_data.py 
Auto-merging .circleci/cimodel/data/pytorch_build_data.py 
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/dimensions.py 
Auto-merging .circleci/cimodel/data/dimensions.py 
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/caffe2_build_data.py 
Auto-merging .circleci/cimodel/data/caffe2_build_data.py 
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/binary_build_definitions.py 
Auto-merging .circleci/cimodel/data/binary_build_definitions.py 
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/binary_build_data.py 
Auto-merging .circleci/cimodel/data/binary_build_data.py 
Automatic merge failed; fix conflicts and then commit the result. 

See CircleCI build pytorch_windows_vs2019_py36_cuda10.1_build (2/2)

Step: "Build" (full log | pattern match details) <confirmed not flaky by 2 failures>

C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.25.28610\include\algorithm(2749): error C2780: '_OutIt std::move(_InIt,_InIt,_OutIt)': expects 3 arguments - 1 provided
        with
        [
            T=int64_t,
            _Ty=c10::IValue,
            _RanIt=c10::impl::ListIterator<int64_t,std::_Vector_iterator<std::_Vector_val<std::_Simple_types<c10::IValue>>>>
        ]
..\caffe2\operators\experimental\c10\cpu\expand_dims_cpu.cc(17): note: while compiling class template member function 'void caffe2::`anonymous-namespace'::expand_dims_cpu<float>::operator ()(const at::Tensor &,const at::Tensor &,c10::List<int64_t>)'
C:\Users\circleci\project\aten\src\ATen/core/boxing/kernel_functor.h(276): note: see reference to function template instantiation 'void caffe2::`anonymous-namespace'::expand_dims_cpu<float>::operator ()(const at::Tensor &,const at::Tensor &,c10::List<int64_t>)' being compiled
..\caffe2\operators\experimental\c10\cpu\expand_dims_cpu.cc(60): note: see reference to class template instantiation 'caffe2::`anonymous-namespace'::expand_dims_cpu<float>' being compiled
C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.25.28610\include\algorithm(2749): error C2672: 'std::move': no matching overloaded function found
C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.25.28610\include\algorithm(2749): error C2780: '_OutIt std::move(_InIt,_InIt,_OutIt)': expects 3 arguments - 1 provided
C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.25.28610\include\xutility(3939): note: see declaration of 'std::move'
Microsoft (R) C/C++ Optimizing Compiler Version 19.25.28610.4 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

INITION /MD /O2 /Ob2 /DNDEBUG /w /EHa /bigobj -DNDEBUG   -DCUDA_HAS_FP16=1 -DUSE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD /Z7 /EHa /DNOMINMAX /wd4267 /wd4251 /wd4522 /wd4838 /wd4305 /wd4244 /wd4190 /wd4101 /wd4996 /wd4275 /bigobj -O2 -DCAFFE2_BUILD_MAIN_LIB -DONNX_BUILD_MAIN_LIB -std:c++14 /showIncludes /Focaffe2\CMakeFiles\torch_cpu.dir\operators\experimental\c10\cpu\flatten_cpu.cc.obj /Fdcaffe2\CMakeFiles\torch_cpu.dir\ /FS -c ..\caffe2\operators\experimental\c10\cpu\flatten_cpu.cc 
Microsoft (R) C/C++ Optimizing Compiler Version 19.25.28610.4 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

U_DEFINITION -DHAVE_AVX2_CPU_DEFINITION /MD /O2 /Ob2 /DNDEBUG /w /EHa /bigobj -DNDEBUG   -DCUDA_HAS_FP16=1 -DUSE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD /Z7 /EHa /DNOMINMAX /wd4267 /wd4251 /wd4522 /wd4838 /wd4305 /wd4244 /wd4190 /wd4101 /wd4996 /wd4275 /bigobj -O2 -DCAFFE2_BUILD_MAIN_LIB -DONNX_BUILD_MAIN_LIB -std:c++14 /showIncludes /Focaffe2\CMakeFiles\torch_cpu.dir\operators\zero_gradient_op.cc.obj /Fdcaffe2\CMakeFiles\torch_cpu.dir\ /FS -c ..\caffe2\operators\zero_gradient_op.cc 
Microsoft (R) C/C++ Optimizing Compiler Version 19.25.28610.4 for x64

🚧 1 upstream failure:

These were probably caused by upstream breakages:


This comment was automatically generated by Dr. CI (expand for details). Follow this link to opt out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker.

This comment has been revised 4 times.

@suo
Member

suo commented Mar 17, 2020

ready for review?

@SplitInfinity
Author

SplitInfinity commented Mar 17, 2020

Yes. There is that one warning I couldn't figure out how to address. I think it's asyncio.gather-related because it doesn't happen in --diff mode. I thought a second pair of eyes might notice the problem (and a pair that actually knows how to use asyncio lol).

@suo
Member

suo commented Mar 17, 2020

this is my output when I run: https://gist.github.com/suo/47015d5b20c35799d1571a91125faf45

@SplitInfinity
Author

What do you see when you run with --diff? Same problem?

For some reason I don't see that on my end but it looks like there are too many processes being created. I guess it makes sense to cap the maximum number of subprocesses?

@suo
Member

suo commented Mar 17, 2020

yeah, using a semaphore works.

Your other warning is because you don't call communicate() on the subprocess shell in run_clang_format_on_file, so the main program exits with subprocesses still running.

Member

@suo suo left a comment


going to request changes to clear this from my queue

@SplitInfinity SplitInfinity force-pushed the parallelize-clang-format branch from da6bea0 to 0a78470 Compare March 17, 2020 19:23
@SplitInfinity SplitInfinity requested a review from suo March 17, 2020 19:28
@SplitInfinity
Author

  • Added wait for subprocess in run_clang_format_on_file
  • Added asyncio.Semaphore to cap the maximum number of concurrent formatting processes
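
The two fixes above can be sketched like this. The signature of run_clang_format_on_file and the concurrency cap are hypothetical, and the Python interpreter again stands in for clang-format so the sketch is self-contained.

```python
import asyncio
import sys

MAX_CONCURRENT = 8  # illustrative cap on simultaneous formatter processes

async def run_clang_format_on_file(filename: str, semaphore: asyncio.Semaphore) -> int:
    # The semaphore admits at most MAX_CONCURRENT children at a time,
    # so a large file list no longer spawns hundreds of processes at once.
    async with semaphore:
        proc = await asyncio.create_subprocess_exec(
            sys.executable, "-c", "pass",
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        # Waiting on the child here is the fix for the RuntimeWarning:
        # the program no longer exits while subprocesses are still pending.
        await proc.communicate()
        return proc.returncode

async def main(files):
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    return await asyncio.gather(
        *(run_clang_format_on_file(f, sem) for f in files)
    )

if __name__ == "__main__":
    print(asyncio.run(main([f"file{i}.cpp" for i in range(20)])))
```

A single Semaphore instance must be shared by all the tasks; creating a fresh one per call would defeat the cap.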

Member

@suo suo left a comment


lgtm

Summary:
This commit parallelizes the invocation of clang-format on all files
in tools/clang_format_new.py using asyncio.

Testing:
Ran and timed the script.
@SplitInfinity SplitInfinity force-pushed the parallelize-clang-format branch from 0a78470 to 5d7320c Compare March 18, 2020 21:28
Contributor

@facebook-github-bot facebook-github-bot left a comment


@SplitInfinity has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@SplitInfinity merged this pull request in 2a1c838.

laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
Summary:
**Summary**
This commit parallelizes the invocation of `clang-format` on all files
in `tools/clang_format_new.py` using `asyncio`.

**Testing**
Ran and timed the script.

*Before*
```
$ time ./tools/clang_format_new.py  --diff
...
real	0m7.615s
user	0m6.012s
sys	0m1.634s
```

*After*
```
$ time ./tools/clang_format_new.py  --diff
...
Some files not formatted correctly

real	0m2.156s
user	0m8.488s
sys	0m3.201s
```
Pull Request resolved: pytorch#34750

Differential Revision: D20523133

Pulled By: SplitInfinity

fbshipit-source-id: 509741a0b4fcfcdcd7c5a45654e3453b4874d256
4 participants