Replace all CHECK_ and DCHECK_ with TORCH_* macros #82032

Closed
wconstab wants to merge 23 commits from wconstab/more_checks

Conversation

@wconstab wconstab commented Jul 22, 2022

Avoid exposing defines that conflict with google logging, since this blocks external usage of libtorch in certain cases.

All the 'interesting' changes should be in these two files, and the rest should just be mechanical changes via sed.
c10/util/logging_is_not_google_glog.h
c10/util/logging_is_google_glog.h
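Roughly, the glog-backed header now maps the new TORCH_* names onto glog's macros instead of libtorch defining bare CHECK_* itself (a hedged sketch, not the verbatim header):

```cpp
// logging_is_google_glog.h (sketch): forward the new names to glog.
// The non-glog header defines equivalents directly, so libtorch never
// has to expose a bare CHECK_EQ that can collide with a user's glog.
#define TORCH_CHECK_EQ(a, b) CHECK_EQ(a, b)
#define TORCH_CHECK_NE(a, b) CHECK_NE(a, b)
#define TORCH_DCHECK_EQ(a, b) DCHECK_EQ(a, b)
```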

Fixes #81415

cc @miladm @malfet

facebook-github-bot commented Jul 22, 2022

❌ 1 New Failures

As of commit c3db99b (more details on the Dr. CI page):

  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages

See GitHub Actions build pull / linux-bionic-py3.7-clang9 / test (dynamo, 2, 2, linux.2xlarge) (1/1)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-07-25T23:12:04.7119121Z RuntimeError: test_nn failed!
2022-07-25T23:12:02.5654182Z Generated XML report: test-reports/python-unittest/test_nn/TEST-TestModuleGlobalHooks-20220725230408.xml
2022-07-25T23:12:02.7528338Z Generated XML report: test-reports/python-unittest/test_nn/TEST-TestNN-20220725230408.xml
2022-07-25T23:12:02.8274177Z Generated XML report: test-reports/python-unittest/test_nn/TEST-TestNNDeviceTypeCPU-20220725230408.xml
2022-07-25T23:12:02.8298278Z Generated XML report: test-reports/python-unittest/test_nn/TEST-TestNNInit-20220725230408.xml
2022-07-25T23:12:02.8304647Z Generated XML report: test-reports/python-unittest/test_nn/TEST-TestStateDictHooks-20220725230408.xml
2022-07-25T23:12:04.7095654Z Traceback (most recent call last):
2022-07-25T23:12:04.7096165Z   File "test/run_test.py", line 966, in <module>
2022-07-25T23:12:04.7115685Z     main()
2022-07-25T23:12:04.7115897Z   File "test/run_test.py", line 944, in main
2022-07-25T23:12:04.7118515Z     raise RuntimeError(err_message)
2022-07-25T23:12:04.7119121Z RuntimeError: test_nn failed!
2022-07-25T23:12:04.9900105Z 
2022-07-25T23:12:04.9900642Z real	44m58.453s
2022-07-25T23:12:04.9900868Z user	224m9.938s
2022-07-25T23:12:04.9901038Z sys	8m22.279s
2022-07-25T23:12:04.9933519Z ##[error]Process completed with exit code 1.
2022-07-25T23:12:04.9968026Z Prepare all required actions
2022-07-25T23:12:04.9968326Z Getting action download info
2022-07-25T23:12:05.1503415Z ##[group]Run ./.github/actions/get-workflow-job-id
2022-07-25T23:12:05.1503643Z with:
2022-07-25T23:12:05.1503987Z   github-token: ***

This comment was automatically generated by Dr. CI.

@facebook-github-bot added the oncall: jit and oncall: distributed labels Jul 22, 2022
@facebook-github-bot

@wconstab has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

2 similar comments
@wconstab

OSS CI failures seem transient/unrelated:

pull / linux-bionic-cuda11.6-py3.7-gcc7 / test (functorch, 1, 1, linux.4xlarge.nvidia.gpu)

  • this test doesn't seem to have run on most of the master commits between Friday and now, and the failure logs look unrelated

pull / linux-focal-rocm5.2-py3.7 / build

  • network issue

I might wait until I can run internal CI before landing, but otherwise this looks ready to land.

@miladm miladm self-requested a review July 25, 2022 18:18
@facebook-github-bot

@wconstab has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@miladm miladm left a comment

LGTM.
Thanks for providing this support @wconstab! My local build of PyTorch/XLA is unblocked by this change (i.e., it now runs into new errors unrelated to these macros - lol).

Reference PyTorch/XLA PR: pytorch/xla#3745

CC @JackCaoG

wconstab added 17 commits July 25, 2022 21:40

```
cd {pytorch root}

# DCHECK_EQ is renamed first without a guard; every later pattern keeps
# a leading space so it cannot match the same substring inside names
# that are already prefixed (e.g. the CHECK_EQ inside TORCH_DCHECK_EQ).
find aten c10 caffe2 modules test torch -type f -exec sed -i 's/DCHECK_EQ/TORCH_DCHECK_EQ/g' {} \;
find aten c10 caffe2 modules test torch -type f -exec sed -i 's/ CHECK_EQ/ TORCH_CHECK_EQ/g' {} \;
lintrunner -a
find aten c10 caffe2 modules test torch -type f -exec sed -i 's/ CHECK_NE/ TORCH_CHECK_NE/g' {} \;
lintrunner -a
find aten c10 caffe2 modules test torch -type f -exec sed -i 's/ CHECK_LE/ TORCH_CHECK_LE/g' {} \;
lintrunner -a
find aten c10 caffe2 modules test torch -type f -exec sed -i 's/ CHECK_LT/ TORCH_CHECK_LT/g' {} \;
lintrunner -a
find aten c10 caffe2 modules test torch -type f -exec sed -i 's/ CHECK_GE/ TORCH_CHECK_GE/g' {} \;
lintrunner -a
find aten c10 caffe2 modules test torch -type f -exec sed -i 's/ CHECK_GT/ TORCH_CHECK_GT/g' {} \;
lintrunner -a
find aten c10 caffe2 modules test torch -type f -exec sed -i 's/ DCHECK_NE/ TORCH_DCHECK_NE/g' {} \;
lintrunner -a
find aten c10 caffe2 modules test torch -type f -exec sed -i 's/ DCHECK_LE/ TORCH_DCHECK_LE/g' {} \;
lintrunner -a
find aten c10 caffe2 modules test torch -type f -exec sed -i 's/ DCHECK_LT/ TORCH_DCHECK_LT/g' {} \;
lintrunner -a
find aten c10 caffe2 modules test torch -type f -exec sed -i 's/ DCHECK_GE/ TORCH_DCHECK_GE/g' {} \;
lintrunner -a
find aten c10 caffe2 modules test torch -type f -exec sed -i 's/ DCHECK_GT/ TORCH_DCHECK_GT/g' {} \;
lintrunner -a
```
@github-actions
Copy link
Contributor

Hey @wconstab.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

facebook-github-bot pushed a commit that referenced this pull request Jul 27, 2022
Summary:
Avoid exposing defines that conflict with google logging, since this blocks external usage of libtorch in certain cases.

All the 'interesting' changes should be in these two files, and the rest should just be mechanical changes via sed.
c10/util/logging_is_not_google_glog.h
c10/util/logging_is_google_glog.h

Fixes #81415

cc miladm malfet

Pull Request resolved: #82032
Approved by: https://github.com/soumith, https://github.com/miladm

Test Plan:
contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/4f34cd6d1e91dcd82ee30c3ea39bdb8a0fa93e8b

Original Phabricator Test Plan:
Imported from GitHub, without a `Test Plan:` line.

Reviewed By: osalpekar

Differential Revision: D38180841

Pulled By: wconstab

fbshipit-source-id: b2f6c3fef609ec6ad616929a16d342e525ef0155
facebook-github-bot pushed a commit to pytorch/audio that referenced this pull request Jul 27, 2022
Summary:
Pull Request resolved: #2582

CHECK_* macros were deprecated upstream, so we should replace them here as well.

Similar to pytorch/vision#6322, relates to pytorch/pytorch#82032

Signed-off-by: Eli Uriegas <[email protected]>

Test Plan: Imported from OSS

Reviewed By: malfet, mthrok

Differential Revision: D38208356

Pulled By: seemethere

fbshipit-source-id: 6f42d517362f415e0775803514eee2628402918f

fmassa commented Aug 3, 2022

Hi @wconstab ,

I have the impression that the DCHECK_* and CHECK_* macros now have the same behavior, which means that CHECK_* macros are compiled away in release mode (which was not the behavior of GLOG). Was this intended?

cc @datumbox


datumbox commented Aug 3, 2022

+1

This caused issues in some TorchVision tests, which used the idiom:
CHECK_EQ(buffer.size(), fread(buffer.data(), 1, buffer.size(), f));

You can see additional info at pytorch/vision#6322 (review)
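To illustrate the hazard, here is a hedged sketch (assuming a glog-style CHECK_EQ that release builds compile away, as in the post-#82032 behavior; the function name and toy macro are made up for the example):

```cpp
#include <cstdio>
#include <vector>

// Toy stand-in for a check that release builds compile away.
#define CHECK_EQ(a, b) \
  do { /* release build: arguments never evaluated */ } while (0)

void read_exact(std::FILE* f) {
  std::vector<char> buffer(1024);

  // The fread() side effect disappears with the check: buffer is
  // never filled, and a short read is never detected.
  CHECK_EQ(buffer.size(), std::fread(buffer.data(), 1, buffer.size(), f));

  // Safer idiom: keep the side effect outside the check.
  std::size_t n = std::fread(buffer.data(), 1, buffer.size(), f);
  CHECK_EQ(buffer.size(), n);
}
```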

facebook-github-bot pushed a commit to pytorch/vision that referenced this pull request Aug 3, 2022
Summary:
Resolves https://www.internalfb.com/tasks/?t=128004042

Caused by TorchVision's PR at #6322 which was in response to a change on PyTorch Core at pytorch/pytorch#82032

Reviewed By: fmassa

Differential Revision: D38383266

fbshipit-source-id: 3f0ebd04a610031c4720123d1869a851f76455cd
@wconstab

> Hi @wconstab,
>
> I have the impression that the DCHECK_* and CHECK_* macros now have the same behavior, which means that CHECK_* macros are compiled away in release mode (which was not the behavior of GLOG). Was this intended?
>
> cc @datumbox

Hey @fmassa, sorry for the trouble.

A bit of background first, which someone may correct me on, as I am not familiar with the history or intention behind this: PyTorch is sometimes built with Google logging (glog) and sometimes without it.

logging_is_not_google_glog.h defines CHECK macros that are compiled away in release builds. My diff simply renames CHECK_ to TORCH_CHECK here.

logging_is_google_glog.h defines CHECK_* so that it always runs, in both debug and release builds. My diff changes the semantics here, making them match logging_is_not_google_glog.h, as well as providing the name wrappers that map the TORCH_CHECK_* macros to CHECK_*.
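A minimal sketch of the two flavors described here (illustrative only; the real headers route through c10 logging machinery):

```cpp
#include <cstdio>
#include <cstdlib>

// Always-on check: the comparison runs in every build.
#define TORCH_CHECK_EQ(a, b)                                    \
  do {                                                          \
    if ((a) != (b)) {                                           \
      std::fprintf(stderr, "Check failed: " #a " == " #b "\n"); \
      std::abort();                                             \
    }                                                           \
  } while (0)

// Debug-only check: when NDEBUG is defined the body never executes,
// so the arguments (and any side effects) are skipped entirely.
#ifdef NDEBUG
#define TORCH_DCHECK_EQ(a, b) \
  while (false) TORCH_CHECK_EQ(a, b)
#else
#define TORCH_DCHECK_EQ(a, b) TORCH_CHECK_EQ(a, b)
#endif
```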

So, the question I would have is: going forward, do we want different behavior for TORCH_CHECK depending on whether we use glog under the hood, or do we want consistent behavior? I'm missing the context of why the inconsistency was there in the first place; perhaps it was intentional, or perhaps it was an oversight. cc @ezyang

@wconstab

After chatting with @ezyang, I realized that this diff incorrectly set the CHECK_* macros to be optimized out in the 'is_glog' file, while in the 'is_not_glog' file only the DCHECK_* macros are optimized out.

I have opened #83216 to fix this.

Additionally, we were discussing the merits of maintaining the TORCH_CHECK_* and TORCH_DCHECK_* macros in the first place, which are implemented as aborts under the hood, in contrast with the TORCH_CHECK macro defined in c10/util/Exception.h, which throws an exception. It seems like it would be better to reimplement TORCH_CHECK_EQ and friends so they call into TORCH_CHECK and throw exceptions instead of aborting. This would subtly change behavior, though.

pytorchmergebot pushed a commit that referenced this pull request Aug 11, 2022
Makes TORCH_CHECK_* run unconditionally, leaving only TORCH_DCHECK_*
special-cased to be optimized out in release builds.

Fixes a bug in #82032, relating to this comment
#82032 (comment)

Pull Request resolved: #83216
Approved by: https://github.com/ezyang, https://github.com/datumbox
facebook-github-bot pushed a commit that referenced this pull request Aug 11, 2022
Summary:
Makes TORCH_CHECK_* run unconditionally, leaving only TORCH_DCHECK_*
special-cased to be optimized out in release builds.

Fixes a bug in #82032, relating to this comment
#82032 (comment)

Pull Request resolved: #83216
Approved by: https://github.com/ezyang, https://github.com/datumbox

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/abb2204f6a9c5af4e14e11cc69de8bcb5cceaea0

Reviewed By: seemethere

Differential Revision: D38624296

Pulled By: wconstab

fbshipit-source-id: 23f8db4b3cad7ba9d2105a33b0d40736326f7c4a
cchan added a commit to cchan/pytorch that referenced this pull request Sep 8, 2022
pytorchmergebot pushed a commit that referenced this pull request Sep 9, 2022
Building master fails with the following:

```
pytorch/caffe2/contrib/nccl/cuda_nccl_gpu.cc:180:51: error: 'CHECK_NOTNULL' was not declared in this scope; did you mean 'TORCH_CHECK_NOTNULL'?
  180 |     CUDA_ENFORCE(cudaStreamWaitEvent(CHECK_NOTNULL(ex.stream), event, 0));
```

Seems like #82032 just missed one find-replace. cc @wconstab

Not sure why this wouldn't have been caught elsewhere.
Pull Request resolved: #84720
Approved by: https://github.com/wconstab
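For context, CHECK_NOTNULL differs from the comparison checks in that it is used inline and must evaluate to its argument; a hedged sketch of the shape (names illustrative, not the real c10 implementation):

```cpp
#include <cstdio>
#include <cstdlib>

// Unlike CHECK_EQ, a NOTNULL check must be an expression returning the
// pointer, so it can wrap arguments in place as in the snippet above.
template <typename T>
T* torch_check_notnull_impl(T* ptr, const char* expr) {
  if (ptr == nullptr) {
    std::fprintf(stderr, "Check failed: '%s' must be non-NULL\n", expr);
    std::abort();
  }
  return ptr;
}
#define TORCH_CHECK_NOTNULL(p) torch_check_notnull_impl((p), #p)
```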
facebook-github-bot pushed a commit that referenced this pull request Sep 11, 2022
Summary:
Building master fails with the following:

```
pytorch/caffe2/contrib/nccl/cuda_nccl_gpu.cc:180:51: error: 'CHECK_NOTNULL' was not declared in this scope; did you mean 'TORCH_CHECK_NOTNULL'?
  180 |     CUDA_ENFORCE(cudaStreamWaitEvent(CHECK_NOTNULL(ex.stream), event, 0));
```

Seems like #82032 just missed one find-replace. cc wconstab

Not sure why this wouldn't have been caught elsewhere.

Pull Request resolved: #84720
Approved by: https://github.com/wconstab

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/0fd8f6b93cb3d1342a10ef71d4b27356f0dfc9b1

Reviewed By: izaitsevfb

Differential Revision: D39407045

fbshipit-source-id: 5e14964d966ddf819459d01e3b357abd6632d72d
@github-actions github-actions bot deleted the wconstab/more_checks branch February 18, 2024 01:55
Labels: cla signed, Merged, oncall: distributed, oncall: jit

Successfully merging this pull request may close these issues.

CHECK_* macro signatures break downstream pytorch/xla build
7 participants