Add stereo preset transforms #6549
Hi @TeodorPoncu, thanks for the PR. I have several comments; please let me know what you think.
Hi @TeodorPoncu, I think we should add the EvalPresets in https://github.com/pytorch/vision/blob/main/torchvision/transforms/_presets.py, since they will be needed by the model weights; see https://github.com/pytorch/vision/blob/main/torchvision/models/optical_flow/raft.py#L538
Hi @TeodorPoncu, I have a few comments.
references/depth/stereo/presets.py
Outdated
class StereoMatchingEvalPreset(torch.nn.Module):
    def __init__(
        self, mean: float = 0.5, std: float = 0.5, resize_size=None, interpolation_type: str = "bilinear"
Please add a type hint for resize_size (I think it should be Optional[Tuple[int, int]] in this case).
Thanks for the update, @TeodorPoncu! Overall it already looks good; there is just one last bit about the MakeValidDisparityMask.
Hi @TeodorPoncu, the code looks good. I just have a few things I want to discuss or confirm with you about MakeValidDisparityMask and _resize_sparse_flow.
references/depth/stereo/presets.py
Outdated
        ]

        if use_grayscale:
            transforms.append(T.ConverToGrayscale())
NIT: let's make it Convert instead of Conver.
ConverToGrayscale -> ConvertToGrayscale
…dorPoncu/vision into add-stereo-preset-transforms
    valid_flow_new *= scale_x

    flow_new[:, ii_valid_new, jj_valid_new] = valid_flow_new
    valid_new[ii_valid_new, jj_valid_new] = 1
From our offline discussion, let's fix this to:
valid_new[ii_valid_new, jj_valid_new] = mask[ii_valid, jj_valid]
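One way to read the suggestion above is that the resized mask should carry over the value stored at each source pixel rather than a hard-coded 1. The following is a minimal sketch of a sparse, nearest-pixel resize under that reading. The function name `resize_sparse_disparity`, its signature, and the idea that the mask may hold more than one non-zero label are assumptions for illustration, not the PR's actual code.

```python
import torch


def resize_sparse_disparity(disparity, mask, scale_x=1.0, scale_y=1.0):
    """Hypothetical helper: nearest-pixel resize of a sparse disparity map.

    disparity: float tensor of shape (1, H, W)
    mask: integer tensor of shape (H, W); non-zero marks a valid pixel
    """
    h, w = disparity.shape[-2:]
    h_new, w_new = int(round(h * scale_y)), int(round(w * scale_x))

    # Row and column indices of the valid source pixels.
    ii_valid, jj_valid = torch.nonzero(mask, as_tuple=True)

    # Where each valid pixel lands in the resized grid.
    ii_new = torch.round(ii_valid.to(torch.float64) * scale_y).to(torch.long)
    jj_new = torch.round(jj_valid.to(torch.float64) * scale_x).to(torch.long)

    # Drop pixels that rounding pushed outside the new grid.
    keep = (ii_new < h_new) & (jj_new < w_new)
    ii_valid, jj_valid = ii_valid[keep], jj_valid[keep]
    ii_new, jj_new = ii_new[keep], jj_new[keep]

    disparity_new = torch.zeros(1, h_new, w_new, dtype=disparity.dtype)
    mask_new = torch.zeros(h_new, w_new, dtype=mask.dtype)

    # Disparity is a horizontal offset, so only the X scale is applied.
    disparity_new[:, ii_new, jj_new] = disparity[:, ii_valid, jj_valid] * scale_x
    # Copy the source mask value instead of writing a constant 1.
    mask_new[ii_new, jj_new] = mask[ii_valid, jj_valid]
    return disparity_new, mask_new
```

With a binary mask the two variants behave the same; the copy only matters when the mask distinguishes several kinds of valid pixels.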
Hey @TeodorPoncu, sorry, I was wrong about transforms/_presets.py before. It should be inside the prototype area, so the right file to put it in is torchvision/prototype/transforms/_presets.py.
Other than this, it looks good. I just have a few questions and NITs.
Hi @TeodorPoncu, I think this will be the last review iteration for me. Just a few nits on the typing.
I will approve first so that it is not blocking. Thanks again for the PR!
references/depth/stereo/presets.py
Outdated
        self,
        mean: float = 0.5,
        std: float = 0.5,
        resize_size: Optional[Tuple[int, int]] = None,
NIT: let's use Tuple[int, ...] like the others that have changed.
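For readers less familiar with the two spellings: `Tuple[int, int]` describes a tuple of exactly two ints, while `Tuple[int, ...]` describes a tuple of ints of any length. A small sketch using only the standard `typing` module; the function `describe_resize` is a hypothetical illustration and not part of the PR.

```python
from typing import Optional, Tuple, get_args

# Fixed-length form: exactly two ints, e.g. (height, width).
FixedSize = Tuple[int, int]
# Variadic form: any number of ints.
VariadicSize = Tuple[int, ...]


def describe_resize(resize_size: Optional[VariadicSize] = None) -> str:
    """Hypothetical helper that accepts the variadic hint."""
    if resize_size is None:
        return "no resize"
    return "resize to " + "x".join(str(s) for s in resize_size)


# The hints differ only in their type arguments.
fixed_args = get_args(FixedSize)        # (int, int)
variadic_args = get_args(VariadicSize)  # (int, Ellipsis)
```

Neither hint is enforced at runtime; a static checker such as mypy is what flags a three-element tuple passed where `Tuple[int, int]` is expected.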
class ValidateModelInput(torch.nn.Module):
    # Pass-through transform that checks the shape and dtypes to make sure the model gets what it expects
    def forward(self, images, disparities, masks):
NIT, add type hint
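As an illustration of what a type-hinted pass-through validator could look like, here is a hedged sketch. The class name `ValidateModelInputSketch`, the pair aliases, and the specific checks (float32 images, matching shapes) are assumptions; the PR's actual checks are not shown in this thread.

```python
from typing import Optional, Tuple

import torch
from torch import Tensor

# Hypothetical aliases for a left/right stereo pair.
ImagePair = Tuple[Tensor, Tensor]
OptionalPair = Tuple[Optional[Tensor], Optional[Tensor]]


class ValidateModelInputSketch(torch.nn.Module):
    """Pass-through transform: raises on unexpected input, otherwise returns it unchanged."""

    def forward(
        self, images: ImagePair, disparities: OptionalPair, masks: OptionalPair
    ) -> Tuple[ImagePair, OptionalPair, OptionalPair]:
        # Assumed requirement: both views are float32 tensors.
        if any(img.dtype != torch.float32 for img in images):
            raise TypeError("images must be float32 tensors")
        # Assumed requirement: left and right views share a shape.
        if images[0].shape != images[1].shape:
            raise ValueError("left and right images must have the same shape")
        # When both are present, a disparity map and its mask must align spatially.
        for disparity, mask in zip(disparities, masks):
            if disparity is not None and mask is not None:
                if disparity.shape[-2:] != mask.shape[-2:]:
                    raise ValueError("disparity and mask must share height and width")
        return images, disparities, masks
```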
class ConvertImageDtype(torch.nn.Module):
    def __init__(self, dtype):
NIT, add type hint
class AsymmetricColorJitter(T.ColorJitter):
    # p determines the probability of doing asymmetric vs symmetric color jittering
    def __init__(self, brightness=0, contrast=0, saturation=0, hue=0, p=0.2):
NIT, add type hint
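The comment on `p` in the snippet above suggests the following control flow: with probability `p` each view gets its own jitter parameters, otherwise both views share one set. Below is a framework-free sketch of that branching with type hints. The function `jitter_stereo_pair` and the `sample_transform` factory are hypothetical stand-ins for the real `ColorJitter` machinery.

```python
import random
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def jitter_stereo_pair(
    left: T,
    right: T,
    sample_transform: Callable[[], Callable[[T], T]],
    p: float = 0.2,
) -> Tuple[T, T]:
    """Apply a randomly sampled transform to a stereo pair.

    sample_transform: hypothetical factory that draws fresh jitter
    parameters and returns a callable applying them.
    """
    if random.random() < p:
        # Asymmetric: draw parameters independently for each view.
        return sample_transform()(left), sample_transform()(right)
    # Symmetric: draw once and reuse the same parameters for both views.
    transform = sample_transform()
    return transform(left), transform(right)
```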
    # these can be viewed as occlusions present in both camera views.
    # Similarly to Optical Flow occlusion prediction tasks, we mask these pixels in the disparity map
    def __init__(
        self, p: float = 0.5, erase_px_range: Tuple[int, int] = (50, 100), value=0, inplace=False, max_erase=2
NIT, add type hint for value, inplace, max_erase
    # This adds an occlusion in the right image
    # the occluded patch works as a patch erase where the erase value is the mean
    # of the pixels from the selected zone
    def __init__(self, p=0.5, occlusion_px_range: Tuple[int, int] = (50, 100), inplace=False):
NIT, add type hint for p, inplace
    # https://github.com/pytorch/vision/pull/5026/files#r762932579
    def __init__(
        self,
        crop_size,
NIT, add type hint for crop_size, rescale_prob, scaling_type, interpolation_type
class Compose(torch.nn.Module):
    def __init__(self, transforms):
NIT, add type hint
        self.transforms = transforms

    @torch.inference_mode()
    def forward(self, images, disparities, masks):
NIT, add type hint
Summary:
* Added transforms for Stereo Matching
* changed implicit Y scaling to 0.
* Adressed some comments
* addressed type hint
* Added interpolation random interpolation strategy
* Aligned crop get params
* fixed bug in RandomErase
* Adressed scaling and typos
* Adressed occlusion typo
* Changed parameter order in F.erase
* fixed random erase
* Added inference preset transform for stereo matching
* added contiguous reshape to output tensors
* Adressed comments
* Modified the transform preset to use Tuple[int, int]
* adressed NITs
* added grayscale transform, align resize -> mask
* changed max disparity default behaviour
* added fixed resize, changed masking in sparse flow masking
* update to align with argparse
* changed default mask in asymetric pairs
* moved grayscale order
* changed grayscale api to accept to tensor variant
* mypy fix
* changed resize specs
* adressed nits
* added type hints
* mypy fix
* mypy fix
* mypy fix
Reviewed By: NicolasHug
Differential Revision: D39765306
fbshipit-source-id: f649e4ab6ce9ce23fd465a98732e01a1854c148f
Co-authored-by: Joao Gomes <[email protected]>
These are the transforms used for Stereo Matching. The resize and rescale logic is mostly copied from the Optical Flow transforms, except that it scales only the X flow channel.
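To make the X-only scaling concrete: a disparity value is a horizontal pixel offset, so when a map is resized its values need to be multiplied by the width ratio, and the height ratio plays no role. A minimal dense sketch using `torch.nn.functional.interpolate`; the function name `resize_dense_disparity` and the choice of bilinear interpolation are assumptions for illustration, not the PR's code.

```python
from typing import Tuple

import torch
import torch.nn.functional as F
from torch import Tensor


def resize_dense_disparity(disparity: Tensor, size: Tuple[int, int]) -> Tensor:
    """Hypothetical helper: resize a dense (1, H, W) disparity map to `size` = (new_H, new_W)."""
    w = disparity.shape[-1]
    scale_x = size[1] / w

    # interpolate expects a batch dimension, so add one and strip it afterwards.
    resized = F.interpolate(disparity[None], size=size, mode="bilinear", align_corners=False)[0]

    # Rescale values by the horizontal ratio only; vertical resizing leaves them unchanged.
    return resized * scale_x
```

For example, doubling both height and width of a map filled with 10.0 gives a map filled with 20.0, while doubling only the height leaves the values at 10.0.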