-
Notifications
You must be signed in to change notification settings - Fork 3.5k
Report leaking environment variables in tests #5872
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #5872 +/- ##
======================================
Coverage 93% 93%
======================================
Files 179 179
Lines 15325 15329 +4
======================================
+ Hits 14218 14222 +4
Misses 1107 1107 |
d01d690
to
8a3052a
Compare
@awaelchli how is it going here, still WIP or ready to go? 🐰 |
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need further help see our docs: https://pytorch-lightning.readthedocs.io/en/latest/generated/CONTRIBUTING.html#pull-request or ask the assistance of a core contributor here or on Slack. Thank you for your contributions. |
seem this quite old PR with several handers of commits behind the master, consider finishing it or closing as most likely the conflicts will make the PR challenging to finish... 🐰 |
Why couldn't we finish this? Is it just a matter of addressing each leaking test? Or is there a change in Lightning required? |
@carmocca yes it would require a clean up for certain environment variables that get set. I don't know whether it is worth doing that in Lightning. Right now, many tests would error with this fixture, for example because variables like However, on master we currently have a fixture for resetting the Examples: ERROR tests/trainer/test_data_loading.py::test_replace_distributed_sampler[1] - AssertionError: test is leaking environment variable(s): MASTER_ADDR, MASTER_PORT, NODE_RANK, LOCAL_RANK, WORLD_SIZE
ERROR tests/trainer/test_data_loading.py::test_replace_distributed_sampler[2] - AssertionError: test is leaking environment variable(s): MASTER_ADDR, MASTER_PORT, NODE_RANK, LOCAL_RANK
ERROR tests/trainer/test_data_loading.py::test_replace_distributed_sampler[3] - AssertionError: test is leaking environment variable(s): MASTER_ADDR, MASTER_PORT, NODE_RANK, LOCAL_RANK
ERROR tests/trainer/test_dataloaders.py::test_auto_add_worker_init_fn - AssertionError: test is leaking environment variable(s): PL_GLOBAL_SEED, PL_SEED_WORKERS
ERROR tests/trainer/test_dataloaders.py::test_auto_add_worker_init_fn_distributed - AssertionError: test is leaking environment variable(s): PL_GLOBAL_SEED, PL_SEED_WORKERS, CUDA_DEVICE_ORDER, MASTER_PORT
ERROR tests/trainer/test_dataloaders.py::test_dataloader_distributed_sampler - AssertionError: test is leaking environment variable(s): PL_GLOBAL_SEED, PL_SEED_WORKERS, CUDA_DEVICE_ORDER, MASTER_PORT
ERROR tests/trainer/test_dataloaders.py::test_dataloader_distributed_sampler_already_attached - AssertionError: test is leaking environment variable(s): PL_GLOBAL_SEED, PL_SEED_WORKERS, CUDA_DEVICE_ORDER, MASTER_PORT
ERROR tests/trainer/test_dataloaders.py::test_batch_size_smaller_than_num_gpus - AssertionError: test is leaking environment variable(s): CUDA_DEVICE_ORDER, MASTER_PORT
ERROR tests/trainer/test_trainer.py::test_loading_meta_tags - AssertionError: test is leaking environment variable(s): PL_GLOBAL_SEED, PL_SEED_WORKERS
ERROR tests/trainer/test_trainer.py::test_loading_yaml - AssertionError: test is leaking environment variable(s): PL_GLOBAL_SEED, PL_SEED_WORKERS
ERROR tests/trainer/test_trainer.py::test_gradient_clipping_by_norm[32] - AssertionError: test is leaking environment variable(s): PL_GLOBAL_SEED, PL_SEED_WORKERS, CUDA_DEVICE_ORDER
ERROR tests/trainer/test_trainer.py::test_gradient_clipping_by_norm[16] - AssertionError: test is leaking environment variable(s): PL_GLOBAL_SEED, PL_SEED_WORKERS, CUDA_DEVICE_ORDER
ERROR tests/trainer/test_trainer.py::test_gradient_clipping_by_value[32] - AssertionError: test is leaking environment variable(s): PL_GLOBAL_SEED, PL_SEED_WORKERS, CUDA_DEVICE_ORDER
ERROR tests/trainer/test_trainer.py::test_gradient_clipping_by_value[16] - AssertionError: test is leaking environment variable(s): PL_GLOBAL_SEED, PL_SEED_WORKERS, CUDA_DEVICE_ORDER
ERROR tests/trainer/test_trainer.py::test_trainer_predict_ddp_cpu - AssertionError: test is leaking environment variable(s): MASTER_PORT
ERROR tests/trainer/test_trainer.py::test_predict_return_predictions_cpu[32-None] - AssertionError: test is leaking environment variable(s): PL_GLOBAL_SEED, PL_SEED_WORKERS
ERROR tests/trainer/test_trainer.py::test_predict_return_predictions_cpu[32-False] - AssertionError: test is leaking environment variable(s): PL_GLOBAL_SEED, PL_SEED_WORKERS
ERROR tests/trainer/test_trainer.py::test_predict_return_predictions_cpu[32-True] - AssertionError: test is leaking environment variable(s): PL_GLOBAL_SEED, PL_SEED_WORKERS
ERROR tests/trainer/test_trainer.py::test_predict_return_predictions_cpu[64-None] - AssertionError: test is leaking environment variable(s): PL_GLOBAL_SEED, PL_SEED_WORKERS
ERROR tests/trainer/test_trainer.py::test_predict_return_predictions_cpu[64-False] - AssertionError: test is leaking environment variable(s): PL_GLOBAL_SEED, PL_SEED_WORKERS
ERROR tests/trainer/test_trainer.py::test_predict_return_predictions_cpu[64-True] - AssertionError: test is leaking environment variable(s): PL_GLOBAL_SEED, PL_SEED_WORKERS
ERROR tests/trainer/test_trainer.py::test_setup_hook_move_to_device_correctly - AssertionError: test is leaking environment variable(s): CUDA_DEVICE_ORDER
ERROR tests/trainer/test_trainer.py::test_model_in_correct_mode_during_stages[ddp_cpu-2] - AssertionError: test is leaking environment variable(s): MASTER_PORT
ERROR tests/trainer/test_trainer.py::test_fit_test_synchronization - AssertionError: test is leaking environment variable(s): PL_GLOBAL_SEED, PL_SEED_WORKERS, MASTER_PORT
ERROR tests/trainer/test_trainer.py::test_multiple_trainer_constant_memory_allocated - AssertionError: test is leaking environment variable(s): MASTER_ADDR, MASTER_PORT, NODE_RANK, LOCAL_RANK, CUDA_DEVICE_ORDER
ERROR tests/trainer/test_trainer.py::test_error_handling_all_stages[ddp_cpu-2] - AssertionError: test is leaking environment variable(s): MASTER_PORT
ERROR tests/trainer/logging_/test_distributed_logging.py::test_all_rank_logging_ddp_cpu - AssertionError: test is leaking environment variable(s): MASTER_PORT
ERROR tests/trainer/logging_/test_distributed_logging.py::test_all_rank_logging_ddp_spawn - AssertionError: test is leaking environment variable(s): CUDA_DEVICE_ORDER, MASTER_PORT
ERROR tests/trainer/logging_/test_logger_connector.py::test_epoch_results_cache_dp - AssertionError: test is leaking environment variable(s): CUDA_DEVICE_ORDER
ERROR tests/trainer/logging_/test_train_loop_logging.py::test_logging_sync_dist_true[1] - AssertionError: test is leaking environment variable(s): CUDA_DEVICE_ORDER
ERROR tests/trainer/logging_/test_train_loop_logging.py::test_logging_sync_dist_true[2] - AssertionError: test is leaking environment variable(s): CUDA_DEVICE_ORDER, MASTER_PORT
ERROR tests/trainer/logging_/test_train_loop_logging.py::test_metric_are_properly_reduced - AssertionError: test is leaking environment variable(s): CUDA_DEVICE_ORDER
ERROR tests/trainer/logging_/test_train_loop_logging.py::test_log_gpu_memory_without_logging_on_step[all] - AssertionError: test is leaking environment variable(s): CUDA_DEVICE_ORDER
ERROR tests/trainer/logging_/test_train_loop_logging.py::test_log_gpu_memory_without_logging_on_step[min_max] - AssertionError: test is leaking environment variable(s): CUDA_DEVICE_ORDER
ERROR tests/trainer/logging_/test_train_loop_logging.py::test_move_metrics_to_cpu - AssertionError: test is leaking environment variable(s): CUDA_DEVICE_ORDER
ERROR tests/trainer/optimization/test_manual_optimization.py::test_multiple_optimizers_manual_no_return[kwargs1] - AssertionError: test is leaking environment variable(s): CUDA_DEVICE_ORDER
ERROR tests/trainer/optimization/test_manual_optimization.py::test_multiple_optimizers_manual_native_amp - AssertionError: test is leaking environment variable(s): CUDA_DEVICE_ORDER
ERROR tests/trainer/optimization/test_manual_optimization.py::test_manual_optimization_and_return_tensor - AssertionError: test is leaking environment variable(s): CUDA_DEVICE_ORDER, MASTER_PORT
ERROR tests/trainer/optimization/test_manual_optimization.py::test_manual_optimization_and_accumulated_gradient - AssertionError: test is leaking environment variable(s): PL_GLOBAL_SEED, PL_SEED_WORKERS, CUDA_DEVICE_ORDER
ERROR tests/trainer/optimization/test_manual_optimization.py::test_multiple_optimizers_step - AssertionError: test is leaking environment variable(s): CUDA_DEVICE_ORDER
ERROR tests/trainer/optimization/test_manual_optimization.py::test_step_with_optimizer_closure_with_different_frequencies_ddp_spawn - AssertionError: test is leaking environment variable(s): PL_GLOBAL_SEED, PL_SEED_WORKERS, CUDA_DEVICE_ORDER, MASTER_PORT
ERROR tests/trainer/optimization/test_manual_optimization.py::test_multiple_optimizers_logging[16] - AssertionError: test is leaking environment variable(s): CUDA_DEVICE_ORDER
ERROR tests/trainer/optimization/test_manual_optimization.py::test_multiple_optimizers_logging[32] - AssertionError: test is leaking environment variable(s): CUDA_DEVICE_ORDER
ERROR tests/trainer/properties/test_get_model.py::test_get_model_ddp_cpu - AssertionError: test is leaking environment variable(s): MASTER_PORT
ERROR tests/trainer/properties/test_get_model.py::test_get_model_gpu - AssertionError: test is leaking environment variable(s): CUDA_DEVICE_ORDER |
We could whitelist these for now: CUDA_DEVICE_ORDER
LOCAL_RANK
MASTER_ADDR
MASTER_PORT
NODE_RANK
PL_GLOBAL_SEED
PL_SEED_WORKERS
WORLD_SIZE |
Co-authored-by: Adrian Wälchli <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM !
What does this PR do?
Currently Lightning "pollutes" the environment with variables. To some degree, this is necessary for example for multiprocessing etc. but unfortunately the variables don't get cleaned up properly, and it may not be so obvious where and what to clean up.
We currently have this fixture which will restore the environment after each test:
This PR changes this into an assertion and each test needs to leave the environment unchanged.For now, this fixture serves the purpose to identify which tests are leaking. After we fix all the issues, we can include this fixture for every test.
In order to solve this, we need proper teardown logic for Trainer, Accelerator, Plugins, Logger etc.
Fixes #5757
This PR reports any new leaking environment variables.
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:
Did you have fun?
Make sure you had fun coding 🙃