Update RAFT Stereo to be more sync with CREStereo implementation #6575
LGTM, just one question so that I know how to sync CREStereo with these changes.
LGTM, I think the changes make sense and make the code much easier to read. Coming with basic knowledge of Depth Perception, the new naming allows me to understand what is happening and why.
if flow_init is not None:
    coords1 = coords1 + flow_init
This is the only change for which I don't have context, but I understand it's a new feature.
To run inference at a resolution significantly larger than the one the model was trained at, you can perform cascaded inference.
Cascaded inference first computes the flow for a downsampled version of the image, and uses that flow as a prior for the full resolution image.
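To make the idea concrete, here is a minimal, dependency-free sketch of cascaded inference. It is not the code from the linked evaluation script: nested lists stand in for tensors, `toy_model` is a hypothetical stand-in for a stereo network that accepts a `flow_init` argument, and the image size is assumed to be divisible by `factor`. A real model may also expect `flow_init` at its internal feature resolution, which this sketch ignores.

```python
def downsample(image, factor):
    """Keep every `factor`-th pixel along both axes (nearest-neighbour)."""
    return [row[::factor] for row in image[::factor]]


def upsample_flow(flow, factor):
    """Repeat each flow value `factor` times along both axes and rescale it.

    A shift of d pixels at low resolution corresponds to a shift of
    d * factor pixels at full resolution, hence the multiplication.
    """
    out = []
    for row in flow:
        wide = [value * factor for value in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def cascaded_inference(model, left_image, right_image, factor=2):
    """Run the model on a downsampled pair, then reuse that flow as a prior."""
    coarse_flow = model(
        downsample(left_image, factor),
        downsample(right_image, factor),
        flow_init=None,
    )
    prior = upsample_flow(coarse_flow, factor)
    return model(left_image, right_image, flow_init=prior)


def toy_model(left_image, right_image, flow_init=None):
    """Hypothetical stand-in for a stereo network (assumption, not a real API).

    Without a prior it guesses a constant shift of -1 pixel; with a prior it
    returns the prior unchanged.
    """
    if flow_init is not None:
        return [list(row) for row in flow_init]
    return [[-1.0 for _ in row] for row in left_image]


left = [[float(col) for col in range(4)] for _ in range(4)]
right = [[float(col) for col in range(4)] for _ in range(4)]
flow = cascaded_inference(toy_model, left, right, factor=2)
```

The coarse pass sees a 2x2 image and predicts a shift of -1 pixel; after upsampling, the full-resolution pass starts from a 4x4 prior of -2 pixels instead of from zero.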
Just to give more context:
Here is the code for the cascaded inference: https://github.com/pytorch/vision/blob/test-crestereo-training/references/stereo_matching/evaluation.py#L57
And here is the corresponding part of the original RAFT-Stereo implementation: https://github.com/princeton-vl/RAFT-Stereo/blob/main/core/raft_stereo.py#L104
My question is whether stereo models depend on the order of the left and right input images. Can we not reverse them?
@oke-aditya no, we can't reverse the inputs, since what the model does is estimate the "displacement" of each pixel from left_image to right_image, and this usually means all pixels are shifted to the right (not the other way around). We do, however, have an augmentation that does a horizontal flip, and in that case we also swap the order of the right and left images.
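A small sketch of why the flip augmentation has to swap the images as well. This is only an illustration in plain Python, not the augmentation code from the references: the helper names are hypothetical, images are single-row nested lists, and the sign convention for the shift is an assumption of this example.

```python
def hflip(image):
    """Mirror an image left-to-right."""
    return [row[::-1] for row in image]


def hflip_stereo_pair(left_image, right_image):
    """Mirror both images and swap their roles.

    The mirrored right image becomes the new left image, so the direction
    of the pixel shift between the two views is preserved.
    """
    return hflip(right_image), hflip(left_image)


def feature_shift(left_image, right_image, value=9):
    """Column offset of a marker pixel from the left to the right image."""
    return right_image[0].index(value) - left_image[0].index(value)


# One-row toy scene: the marker (9) sits at column 3 on the left
# and at column 1 on the right.
left = [[0, 0, 0, 9, 0, 0]]
right = [[0, 9, 0, 0, 0, 0]]

original_shift = feature_shift(left, right)
flip_only_shift = feature_shift(hflip(left), hflip(right))
new_left, new_right = hflip_stereo_pair(left, right)
flip_and_swap_shift = feature_shift(new_left, new_right)
```

Flipping alone reverses the sign of the shift, which is the situation the model is not trained for; flipping and swapping gives back the original sign.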
Perhaps we can document this.
… instead of actual depth
…n into prototype/update-raft
@oke-aditya do you have any recommendation on where we should document it? I think this behavior will be common to all stereo depth models (we currently also plan to add CREStereo). Maybe we can add it to the README when we add the references?
Probably one root page for stereo models? Also cc @datumbox for better ideas 💡
The test error is not related to the changes in this PR.
…tion (#6575)
Summary:
* Update raft_stereo to sync forward param with CREStereo and have output_channel
* Add flow_init param to docstring
* Use output_channels instead of output_channel
* Replace depth with disparity since what we predict actually disparity instead of actual depth
Reviewed By: jdsgomes
Differential Revision: D39543287
fbshipit-source-id: af1ba127fb25eedd3618d6f7b57565daf04ad986
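For context on the depth versus disparity rename: for a rectified stereo pair, disparity (a pixel offset) only becomes metric depth once the camera calibration is known. The sketch below uses the standard pinhole relation depth = focal length × baseline / disparity. The function name and parameters are hypothetical and not part of this PR.

```python
def disparity_to_depth(disparity_px, focal_length_px, baseline_m):
    """Convert a disparity in pixels to a depth in metres.

    Assumes a rectified pair, a focal length expressed in pixels and a
    baseline in metres. Zero disparity maps to infinite depth.
    """
    if disparity_px < 0:
        raise ValueError("disparity must be non-negative")
    if disparity_px == 0:
        return float("inf")
    return focal_length_px * baseline_m / disparity_px


# Example values chosen for illustration only.
depth_m = disparity_to_depth(disparity_px=50, focal_length_px=1000, baseline_m=0.5)
```

Without the focal length and baseline, the model output can only be interpreted as disparity, which is why the naming change matters.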
After discussing the CREStereo implementation with @TeodorPoncu, we plan to make the implementations more consistent. Here are the changes in RAFT Stereo:
* Rename image1 and image2 to left_image and right_image
* Add flow_init as input
* Add self.output_channel as variable in RAFTStereo class
Here is the link to the CREStereo implementation PR: #6310