Introduce torch.testing._internal.opinfos #59871

Closed
antocuni wants to merge 13 commits into pytorch:master from antocuni:antocuni/refactor-opdb

Conversation

@antocuni
Contributor

@antocuni antocuni commented Jun 11, 2021

@mruberry this PR is still not complete but I'd like to get some feedback before continuing, to avoid doing unnecessary work.
I started to experiment with what we discussed in #55159, and in particular:

  1. op_db is now an instance of OpInfoDB instead of being a simple list. This allows registering OpInfos from multiple places, instead of having to list them explicitly in one big long list
  2. I moved linalg.det and linalg.cond into their own file, to get a feel for how the new system looks. A good pro is that now we can implement the sample_inputs_* functions in the proximity of the OpInfo itself, instead of having them many lines above.
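The register() idea in point 1 could look roughly like the sketch below. This is a minimal illustration, not the actual API from the PR; OpInfo here is a stand-in stub.

```python
# Minimal sketch of the register() idea from point 1 above; the real
# OpInfoDB in this PR may differ, and OpInfo here is a stand-in stub.
class OpInfo:
    def __init__(self, name):
        self.name = name

class OpInfoDB:
    """A registry that OpInfos can be added to from multiple files."""
    def __init__(self):
        self._ops = []

    def register(self, opinfo):
        self._ops.append(opinfo)

    def __iter__(self):
        return iter(self._ops)

# e.g. inside a hypothetical opinfo_linalg.py:
op_db = OpInfoDB()
op_db.register(OpInfo("linalg.det"))
op_db.register(OpInfo("linalg.cond"))

print([op.name for op in op_db])  # ['linalg.det', 'linalg.cond']
```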

Looking forward to getting some feedback!

@antocuni antocuni added the module: testing Issues related to the torch.testing module (not tests) label Jun 11, 2021
@facebook-github-bot
Contributor

facebook-github-bot commented Jun 11, 2021

💊 CI failures summary and remediations

As of commit 07eabdc (more details on the Dr. CI page):



🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_cuda11_1_cudnn8_py3_gcc7_test2 (1/1)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Jun 30 19:10:04 RuntimeError: test_quantization failed!
Jun 30 19:10:02 Generated XML report: test-reports/dist-gloo/test_quantization/TEST-quantization.core.test_quantized_op.TestQuantizedOps-20210630190112.xml
Jun 30 19:10:02 Generated XML report: test-reports/dist-gloo/test_quantization/TEST-quantization.core.test_quantized_tensor.TestQuantizedTensor-20210630190112.xml
Jun 30 19:10:02 Generated XML report: test-reports/dist-gloo/test_quantization/TEST-quantization.core.test_workflow_module.TestRecordHistogramObserver-20210630190112.xml
Jun 30 19:10:02 Generated XML report: test-reports/dist-gloo/test_quantization/TEST-quantization.bc.test_backward_compatibility.TestSerialization-20210630190112.xml
Jun 30 19:10:02 Generated XML report: test-reports/dist-gloo/test_quantization/TEST-quantization.core.test_quantized_module.TestStaticQuantizedModule-20210630190112.xml
Jun 30 19:10:04 Traceback (most recent call last):
Jun 30 19:10:04   File "test/run_test.py", line 1308, in <module>
Jun 30 19:10:04     main()
Jun 30 19:10:04   File "test/run_test.py", line 1287, in main
Jun 30 19:10:04     raise RuntimeError(err_message)
Jun 30 19:10:04 RuntimeError: test_quantization failed!
Jun 30 19:10:04 + cleanup
Jun 30 19:10:04 + retcode=1
Jun 30 19:10:04 + set +x
Jun 30 19:10:04 =================== sccache compilation log ===================
Jun 30 19:10:04 =========== If your build fails, please take a look at the log above for possible reasons ===========
Jun 30 19:10:04 Compile requests                      0
Jun 30 19:10:04 Compile requests executed             0
Jun 30 19:10:04 Cache hits                            0
Jun 30 19:10:04 Cache misses                          0
Jun 30 19:10:04 Cache timeouts                        0

❄️ 1 failure tentatively classified as flaky

but reruns have not yet been triggered to confirm:

See CircleCI build pytorch_xla_linux_bionic_py3_6_clang9_build (1/1)

Step: "Build" (full log | diagnosis details | 🔁 rerun) ❄️

fatal: Could not read from remote repository.
remote: Enumerating objects: 216, done.
[git counting/compressing/receiving/delta-resolution progress omitted]
Receiving objects: 100% (124/124), 104.97 KiB | 3.75 MiB/s, done.
Resolving deltas: 100% (88/88), completed with 75 local objects.
From ssh://github.com/NVlabs/cub
 * branch            d106ddb991a56c3df1b6d51b2409e36ba8181ce4 -> FETCH_HEAD
remote: Total 124 (delta 88), reused 27 (delta 7), pack-reused 0
ssh: connect to host github.com port 22: Connection timed out

fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
Fetched in submodule path 'third_party/cudnn_frontend', but it did not contain 51e60d891b689d618e7a623509a779c422a420f7. Direct fetching of that commit failed.


Exited with code exit status 1


ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI.

Collaborator

@mruberry mruberry left a comment


What do you think, @kshitij12345, @pmeier?

common_methods_invocations.py is currently 7500 lines, and we like to keep files below 10k lines. So I think we have time before we have to pursue a split. If/when we do pursue a split I agree this seems like the best approach.

Should we do this now, @antocuni? What do you see as the pros/cons? My attempt at a list:

PROS

  • nicer file names (because common_methods_invocations.py is a terrible name)
  • can put sample inputs next to ops, at least for some operators
  • moves classes like UnaryUfuncInfo into their own files, so files have clearer purposes

CONS

  • can no longer search all opinfos in one file
  • sample inputs next to ops or an op after op pattern both seem valid, and some classes of operator (unary elementwise, binary elementwise, reductions) will use the same sample inputs for the entire class, so at least a few will use an op after op pattern; this makes PRO #2 above a little dubious

What other PROS/CONS should we consider, @antocuni?

@antocuni
Contributor Author

What do you think, @kshitij12345, @pmeier?

common_methods_invocations.py is currently 7500 lines, and we like to keep files below 10k lines. So I think we have time before we have to pursue a split. If/when we do pursue a split I agree this seems like the best approach.

Just my 2 cents, but generally speaking 10k lines seems like a LOT to me, especially considering that Python source code is typically denser than that of other languages. It makes it impossible to read the file from top to bottom, and the only reasonable way to navigate it is to search for a known string. Even 7.5k is already too much, at least for me.

Also, it seems that the OpInfo approach is really successful and we will continue to add new features and new operators, so eventually we will reach the 10k limit anyway. So, the sooner we do the split, the less work we will have to do later when it inevitably becomes necessary.

Should we do this now, @antocuni? What do you see as the pros/cons? My attempt at a list:

PROS

  • nicer file names (because common_methods_invocations.py is a terrible name)

+1 for this :)

  • can put sample inputs next to ops, at least for some operators
  • moves classes like UnaryUfuncInfo into their own files, so files have clearer purposes

CONS

  • can no longer search all opinfos in one file

this can be mitigated by using a sane naming convention for the files, so that one can easily grep them. In this PR I started using opinfo_*.py, but we could even use a separate package opinfo/*.py (e.g., opinfo/linalg.py).

  • sample inputs next to ops or an op after op pattern both seem valid, and some classes of operator (unary elementwise, binary elementwise, reductions) will use the same sample inputs for the entire class, so at least a few will use an op after op pattern; this makes PRO #2 above a little dubious

yes, but there are other cases in which there are ~6000 lines between the definition of the sample_input_* function and the corresponding OpInfo. Even if PRO #2 does not apply everywhere, it's still a big improvement when it does.

What other PROS/CONS should we consider, @antocuni?

Extra PROS

  • with this approach, we can easily combine multiple OpInfoDB into one. For example, right now we start from a single big op_db and from there we create smaller dbs by filtering out what we don't want, like unary_ufuncs, spectral_funcs, etc. We could do the opposite and have opinfo/unary.py to define unary_db, opinfo/spectral.py to define spectral_func_db and opinfo/all.py to be defined as unary_db + spectral_db + other_ops_db + ... (but we need to find a better name for other_ops_db)

  • it makes it possible to split the logic and the code of OpInfo, OpInfoDB, SampleInput etc. from the actual definition of the DB itself (which will be in the opinfo/*.py files)

  • it enables us to experiment with new/better syntax to declare OpInfos, either now or in the future. See e.g. the class-based syntax I proposed in #55159 (Better syntax for OpInfo)

Extra CONS

  • we need to repeat a lot of import boilerplate at the beginning of each file

@kshitij12345
Collaborator

nicer file names (because common_methods_invocations.py is a terrible name)

+1

sample inputs next to ops or an op after op pattern both seem valid, and some classes of operator (unary elementwise, binary elementwise, reductions) will use the same sample inputs for the entire class, so at least a few will use an op after op pattern; this makes PRO #2 above a little dubious

I agree with @antocuni that this structure gives us more flexibility going forward.

Also, from my personal experience: as a new contributor, just looking at test_torch.py was intimidating (back when it was over 15K lines). I really like that with this approach we can split the OpInfos into coherent files.

Thanks @antocuni!

@pmeier
Collaborator

pmeier commented Jun 15, 2021

Very much in favor of having multiple smaller files. What about a structure like this:

torch/
└── testing
    └── _internal
        └── op_info
            ├── binary.py
            ├── __init__.py
            ├── linalg.py
            ├── misc.py
            ├── reduction.py
            ├── unary.py
            └── utils.py

We could group every operator in the torch namespace into unary, binary, reduction, and misc (or split this further if we like). For operators that have their own namespace, for example linalg, I would just keep it. Stuff that is used by all files, for example the base OpInfo class, could be put in utils. As @antocuni proposed, every file could then define its own small database, which could be fused in __init__.

@antocuni
Contributor Author

Very much in favor of having multiple smaller files. What about a structure like this:

torch/
└── testing
    └── _internal
        └── op_info
            ├── binary.py
            ├── __init__.py
            ├── linalg.py
            ├── misc.py
            ├── reduction.py
            ├── unary.py
            └── utils.py

+1 for this.

We could group every operator in the torch namespace into unary, binary, reduction, and misc (or split this further if we like). For operators that have their own namespace, for example linalg, I would just keep it. Stuff that is used by all files, for example the base OpInfo class, could be put in utils. As @antocuni proposed, every file could then define its own small database, which could be fused in __init__.

Defining an OpInfoDB in each file and merging them inside __init__ has another advantage: in the current version of the branch, import torch.testing._internal.opinfo_linalg has the side effect of mutating _internal.op_db (by calling .register() on it) and thus has to be done at the end of the file.
By reversing the approach, we can do a much simpler and easier to understand composition. E.g. inside op_info/__init__.py we will have something like this:

from . import unary, binary, linalg, misc # ...
op_db = OpInfoDB.merge('all', unary.op_db, binary.op_db, linalg.op_db, misc.op_db, ...)

No side effects, no need to do imports at the end, everything straightforward.
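A minimal sketch of what that composition could look like in practice; OpInfoDB.merge is the hypothetical method from the snippet above and does not exist in PyTorch as-is.

```python
# Hypothetical sketch of the OpInfoDB.merge composition described above;
# neither this class nor merge() exists in PyTorch as-is.
class OpInfoDB:
    def __init__(self, name, ops=None):
        self.name = name
        self.ops = list(ops or [])

    @classmethod
    def merge(cls, name, *dbs):
        # Concatenate the OpInfos of several per-file databases into one.
        merged = cls(name)
        for db in dbs:
            merged.ops.extend(db.ops)
        return merged

# Strings stand in for OpInfo instances.
unary_db = OpInfoDB("unary", ["sin", "cos"])
linalg_db = OpInfoDB("linalg", ["linalg.det"])
op_db = OpInfoDB.merge("all", unary_db, linalg_db)
print(op_db.ops)  # ['sin', 'cos', 'linalg.det']
```

No import-time side effects are involved: each per-file db is built locally and the merge happens explicitly in one place.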

@mruberry what do you think?

@mruberry
Collaborator

mruberry commented Jun 20, 2021

This sounds good. What about keeping our lives simpler for now and still just using a list? Lists in Python can be appended with each new OpInfo instead of declaring them in it. The central repo can just import the lists and extend or add them or whatever to create its own list.

For files, we could start with a new folder in _internal, like proposed above, called "opinfos". The files in it can mirror different test files and classes. We could start with __init__.py, core.py or opinfo.py, unary_elementwise.py, and foreach.py for example. These could define the list "all_opinfos" (replacing the name "op_db"), define the OpInfo-related classes, a "unary_elementwise_opinfos" list, and the special "foreach_opinfos" list, which contains OpInfos that don't appear in "all_opinfos".

That seems simple and flexible while achieving our organizational goals? It also lets us create a more specialized data structure in the future if we want to without locking us into anything today. We could add the additional files like binary_elementwise.py and linalg.py etc. in the same PR or a follow-up.
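The plain-list approach described above could be as simple as the following sketch; the file and list names follow the comment, but the contents are illustrative strings rather than real OpInfos.

```python
# Sketch of the plain-list approach described above; list names follow
# the comment, contents are illustrative strings standing in for OpInfos.

# hypothetical opinfos/unary_elementwise.py
unary_elementwise_opinfos = ["abs", "sin"]

# hypothetical opinfos/foreach.py -- deliberately NOT part of all_opinfos
foreach_opinfos = ["_foreach_add"]

# hypothetical opinfos/__init__.py (the central module)
all_opinfos = []
all_opinfos.extend(unary_elementwise_opinfos)

print(all_opinfos)  # ['abs', 'sin']
```

Because everything is an ordinary list, consumers keep working unchanged, and a richer data structure can still replace the lists later without committing to one now.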

@antocuni antocuni force-pushed the antocuni/refactor-opdb branch from 9f96068 to 42d8d72 Compare June 30, 2021 10:11
@antocuni antocuni changed the title WIP: Introduce OpInfoDB Introduce torch.testing._internal.opinfos Jun 30, 2021
@antocuni
Contributor Author

This sounds good. What about keeping our lives simpler for now and still just using a list? Lists in Python can be appended with each new OpInfo instead of declaring them in it. The central repo can just import the lists and extend or add them or whatever to create its own list.

For files, we could start with a new folder in _internal, like proposed above, called "opinfos". The files in it can mirror different test files and classes.

@mruberry I changed the name of this PR to reflect this new plan and implemented it.
This PR:

  • introduces torch.testing._internal.opinfos
  • moves the majority of the old common_methods_invocations.py into multiple files in opinfos/
  • fixes the imports accordingly

This PR is going to be a conflict nightmare since common_methods_invocations.py is modified very often, so to mitigate the problem for now I tried to minimize the changes. In particular, I did not rename anything, and op_db is still defined as a big single list inside opinfos/db.py.

I propose to continue the refactoring in follow-up PRs but, assuming tests are green, this one should hopefully land as quickly as possible, because the longer we wait the more likely it is to become out of date.

@antocuni antocuni marked this pull request as ready for review June 30, 2021 14:35
@mruberry
Collaborator

mruberry commented Jul 9, 2021

This sounds good. What about keeping our lives simpler for now and still just using a list? Lists in Python can be appended with each new OpInfo instead of declaring them in it. The central repo can just import the lists and extend or add them or whatever to create its own list.
For files, we could start with a new folder in _internal, like proposed above, called "opinfos". The files in it can mirror different test files and classes.

@mruberry I changed the name of this PR to reflect this new plan and implemented it.
This PR:

  • introduces torch.testing._internal.opinfos
  • moves the majority of the old common_methods_invocations.py into multiple files in opinfos/
  • fixes the imports accordingly

This PR is going to be a conflict nightmare since common_methods_invocations.py is modified very often, so to mitigate the problem for now I tried to minimize the changes. In particular, I did not rename anything, and op_db is still defined as a big single list inside opinfos/db.py.

I propose to continue the refactoring in follow-up PRs but, assuming tests are green, this one should hopefully land as quickly as possible, because the longer we wait the more likely it is to become out of date.

Cool; let me think if we can sneak this in this weekend.

@mruberry
Collaborator

fyi @antocuni I haven't forgotten about this but need to find the right time to make the update, thanks for being patient

@antocuni
Contributor Author

fyi @antocuni I haven't forgotten about this but need to find the right time to make the update, thanks for being patient

no problem, thank you.
I don't think it's worth trying to fix the conflicts continuously, so please ping me when you are ready to do the merge, and I'll fix them.

@kshitij12345
Collaborator

Right now common_methods_invocations is around 15k lines and GitHub has issues rendering it.

Gentle ping @mruberry

@mruberry
Collaborator

Right now common_methods_invocations is around 15k lines and GitHub has issues rendering it.

Gentle ping @mruberry

Yeah it's become too big

@github-actions
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the Stale label Apr 13, 2022
@pmeier
Collaborator

pmeier commented Apr 13, 2022

We should not let this go to waste. Working on the file is a pain. @antocuni is no longer with Quansight, but I can take this up to at least fix the merge conflicts, or make additional changes if requested.


Labels

cla signed module: testing Issues related to the torch.testing module (not tests) open source Stale

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants