segmentation fault for pcl icp implementation in pytorch cpp extension #20

Closed
onlytailei opened this issue Oct 5, 2018 · 9 comments

Comments

@onlytailei

onlytailei commented Oct 5, 2018

  • OS: ubuntu 16.04
  • PyTorch version: 0.4.0
  • How you installed PyTorch (conda, pip, source): conda
  • Python version: anaconda python 3.6.0
  • CUDA/cuDNN version: 9.0
  • GPU models and configuration: GeForce GTX 1080
  • GCC version (if compiling from source): 5.4.0

I’m trying to build a C++ extension for iterative closest point (ICP) registration of point clouds, using the ICP implementation in pcl-1.7: http://pointclouds.org/documentation/tutorials/iterative_closest_point.php

Converting the data from at::Tensor to pcl::PointCloud works fine. However, as soon as I declare a new icp object, a segmentation fault occurs.

I also tried adding more arguments to CppExtension, following https://github.com/strawlab/python-pcl/blob/master/setup.py, but it doesn’t help.

To reproduce the bug, clone the related files from https://github.com/onlytailei/icp_extension.
PCL and Eigen must be installed on the system:

sudo apt-get install libpcl-all
sudo apt-get install libeigen3-dev

Then build the extension through:

python setup.py install

Comment/Uncomment this line in icp_op.cpp.

pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ> icp;

Then rebuild the extension and run the test to see the difference:

python icp_test.py
@goldsborough
Contributor

I'll check out the code, but I don't think this is a bug with C++ extensions per se. Did you test your C++ code before binding it into Python, and make sure it is generally correct and does not segfault?

@onlytailei
Author

onlytailei commented Oct 6, 2018

Yes. I tried the pure C++ example. There is no segfault.
You can download this C++ example from here.

mkdir build
cd build
cmake ../
make 
./iterative_closest_point

@goldsborough
Contributor

goldsborough commented Oct 6, 2018

https://github.com/onlytailei/icp_extension/blob/master/icp_op.cpp#L88 looks wrong to me. tensorFromBlob does not copy data. It only references the blob you give it. You have to call .clone() to deep copy the data.

Change

at::Tensor output = torch::CPU(at::kFloat).tensorFromBlob(output_array, {batch_size,p_cloud_size, 3});

to

at::Tensor output = torch::CPU(at::kFloat).tensorFromBlob(output_array, {batch_size,p_cloud_size, 3}).clone();

or even better, to

at::Tensor output = torch::from_blob(output_array, {batch_size, p_cloud_size, 3}).clone();
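
The point above, that wrapping a buffer does not copy it, can be illustrated without PyTorch. The sketch below is plain standard C++ and not the ATen API: `FloatView`, `view_of`, and `clone_view` are hypothetical names standing in for a blob-backed tensor and its deep copy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor created from a blob: it stores only a
// pointer and a length, and never owns or copies the underlying memory.
struct FloatView {
    const float* data;
    std::size_t size;
};

// Wrap an existing buffer without copying (analogous in spirit to from_blob).
FloatView view_of(const float* buffer, std::size_t size) {
    assert(buffer != nullptr || size == 0);
    return FloatView{buffer, size};
}

// Deep-copy the viewed elements into storage the caller owns (analogous in
// spirit to calling .clone() on the wrapped tensor).
std::vector<float> clone_view(const FloatView& view) {
    return std::vector<float>(view.data, view.data + view.size);
}

// Returns true when a later write to the source buffer is visible through
// the view but not through the clone.
bool view_aliases_but_clone_does_not() {
    float output_array[3] = {1.0f, 2.0f, 3.0f};
    FloatView view = view_of(output_array, 3);
    std::vector<float> copy = clone_view(view);

    output_array[0] = 42.0f;  // mutate the original buffer after wrapping it

    return view.data[0] == 42.0f && copy[0] == 1.0f;
}
```

Because the view only borrows the buffer, it becomes invalid as soon as the buffer goes out of scope or is freed, whereas the clone stays valid. That is the reason for returning the cloned tensor from the extension function.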

@onlytailei
Author

Thank you!
And any idea about this line?

pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ> icp;

As soon as this line is uncommented, the segfault happens.
In the cpp_example, it works fine.

@onlytailei
Author

From the debug info, it seems that some members of icp cannot be released successfully through the Boost smart pointer. However, I have no idea why nothing goes wrong in the pure C++ example. Maybe there is some conflict between Boost and torch.

#0 0x00007fffba96fd12 in boost::detail::atomic_exchange_and_add (dv=-1, pw=0x656572546453) at /usr/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:50
#1 boost::detail::sp_counted_base::release (this=0x65657254644b) at /usr/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:144
#2 boost::detail::shared_count::~shared_count (this=0x13d4660, __in_chrg=) at /usr/include/boost/smart_ptr/detail/shared_count.hpp:443
#3 boost::shared_ptr<pcl::KdTreeFLANN<pcl::PointXYZ, flann::L2_Simple > >::~shared_ptr (this=0x13d4658, __in_chrg=)
at /usr/include/boost/smart_ptr/shared_ptr.hpp:323
#4 pcl::search::KdTree<pcl::PointXYZ, pcl::KdTreeFLANN<pcl::PointXYZ, flann::L2_Simple > >::~KdTree (this=0x13d4620, __in_chrg=)
at /usr/local/include/pcl-1.8/pcl/search/kdtree.h:99
#5 pcl::search::KdTree<pcl::PointXYZ, pcl::KdTreeFLANN<pcl::PointXYZ, flann::L2_Simple > >::~KdTree (this=0x13d4620, __in_chrg=)
at /usr/local/include/pcl-1.8/pcl/search/kdtree.h:99
#6 boost::checked_delete<pcl::search::KdTree<pcl::PointXYZ, pcl::KdTreeFLANN<pcl::PointXYZ, flann::L2_Simple > > > (x=0x13d4620)
at /usr/include/boost/core/checked_delete.hpp:34
#7 boost::detail::sp_counted_impl_p<pcl::search::KdTree<pcl::PointXYZ, pcl::KdTreeFLANN<pcl::PointXYZ, flann::L2_Simple > > >::dispose (this=)
at /usr/include/boost/smart_ptr/detail/sp_counted_impl.hpp:78
#8 0x00007fffba96b8fa in boost::detail::sp_counted_base::release (this=0x13d4570) at /usr/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:146
#9 0x00007fffba976c5d in boost::detail::sp_counted_base::release (this=) at /usr/local/include/pcl-1.8/pcl/registration/correspondence_estimation.h:109
#10 boost::detail::shared_count::~shared_count (this=0x13b3950, __in_chrg=) at /usr/include/boost/smart_ptr/detail/shared_count.hpp:443
#11 boost::shared_ptr<pcl::search::KdTree<pcl::PointXYZ, pcl::KdTreeFLANN<pcl::PointXYZ, flann::L2_Simple > > >::~shared_ptr (this=0x13b3948,
__in_chrg=) at /usr/include/boost/smart_ptr/shared_ptr.hpp:323
#12 pcl::registration::CorrespondenceEstimationBase<pcl::PointXYZ, pcl::PointXYZ, float>::~CorrespondenceEstimationBase (this=this@entry=0x13b3900,
__in_chrg=) at /usr/local/include/pcl-1.8/pcl/registration/correspondence_estimation.h:109
#13 0x00007fffba976d10 in pcl::registration::CorrespondenceEstimation<pcl::PointXYZ, pcl::PointXYZ, float>::~CorrespondenceEstimation (this=0x13b3900,
__in_chrg=) at /usr/local/include/pcl-1.8/pcl/registration/correspondence_estimation.h:419
#14 pcl::registration::CorrespondenceEstimation<pcl::PointXYZ, pcl::PointXYZ, float>::~CorrespondenceEstimation (this=0x13b3900, __in_chrg=)
at /usr/local/include/pcl-1.8/pcl/registration/correspondence_estimation.h:419
#15 boost::checked_delete<pcl::registration::CorrespondenceEstimation<pcl::PointXYZ, pcl::PointXYZ, float> > (x=0x13b3900)
at /usr/include/boost/core/checked_delete.hpp:34
#16 boost::detail::sp_counted_impl_p<pcl::registration::CorrespondenceEstimation<pcl::PointXYZ, pcl::PointXYZ, float> >::dispose (this=)
at /usr/include/boost/smart_ptr/detail/sp_counted_impl.hpp:78
#17 0x00007fffba96b8fa in boost::detail::sp_counted_base::release (this=0x13d4590) at /usr/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:146
#18 0x00007fffba973ed5 in boost::detail::sp_counted_base::release (this=) at /usr/include/boost/function/function_template.hpp:510
#19 boost::detail::shared_count::~shared_count (this=0x13bfa50, __in_chrg=) at /usr/include/boost/smart_ptr/detail/shared_count.hpp:443
#20 boost::shared_ptr<pcl::registration::CorrespondenceEstimationBase<pcl::PointXYZ, pcl::PointXYZ, float> >::~shared_ptr (this=0x13bfa48, __in_chrg=)
at /usr/include/boost/smart_ptr/shared_ptr.hpp:323
#21 pcl::Registration<pcl::PointXYZ, pcl::PointXYZ, float>::~Registration (this=this@entry=0x13bf8c0, __in_chrg=)
at /usr/local/include/pcl-1.8/pcl/registration/registration.h:132
#22 0x00007fffba973fc5 in pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ, float>::~IterativeClosestPoint (this=0x13bf8c0, __in_chrg=)
at /usr/local/include/pcl-1.8/pcl/registration/icp.h:155
#23 pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ, float>::~IterativeClosestPoint (this=0x13bf8c0, __in_chrg=)
at /usr/local/include/pcl-1.8/pcl/registration/icp.h:155
#24 0x00007fffba96d062 in boost::movelib::default_delete<pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ, float> >::operator()<pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ, float> > (this=, ptr=) at /usr/include/boost/move/default_delete.hpp:181
#25 boost::movelib::unique_ptr<pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ, float>, boost::movelib::default_delete<pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ, float> > >::~unique_ptr (this=, __in_chrg=) at /usr/include/boost/move/unique_ptr.hpp:559
#26 icp_forward (p_cloud=..., q_cloud=...) at icp_op.cpp:77
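
One detail in the backtrace above can be checked by hand: the `this` pointer in frame #1, 0x65657254644b, is not a plausible heap address. Read as little-endian bytes, it spells the ASCII string "KdTree", which suggests the shared_ptr control-block pointer was read from memory that actually holds string data. That pattern is consistent with an object-layout mismatch between the extension and the PCL library it links against, although the thread does not confirm the cause. A minimal sketch of the decoding follows, where `decode_le_ascii` is a hypothetical helper and not part of PCL, Boost, or PyTorch:

```cpp
#include <cctype>
#include <cstdint>
#include <string>

// Interpret a 64-bit value as little-endian bytes and return the leading
// printable ASCII characters, stopping at the first non-printable byte.
std::string decode_le_ascii(std::uint64_t value) {
    std::string text;
    for (int i = 0; i < 8; ++i) {
        const unsigned char byte =
            static_cast<unsigned char>((value >> (8 * i)) & 0xFF);
        if (!std::isprint(byte)) {
            break;
        }
        text.push_back(static_cast<char>(byte));
    }
    return text;
}
```

Calling `decode_le_ascii(0x65657254644bULL)` yields "KdTree". The `pw` value in frame #0 is the same address plus 8, which is why its first byte differs.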

Another question: you mentioned torch::from_blob. Which extra header file do I need to include to use it? With torch/torch.h, the compiler cannot find this function.

@goldsborough
Contributor

Regarding from_blob: it was introduced in a later version of PyTorch and wasn't available in 0.4.0. My bad.

I spent some time today trying to reproduce your bug in a Docker container, but I could not. I used this Dockerfile:

FROM ubuntu:xenial

RUN apt-get update  -y \
  && apt-get install -y git cmake vim make wget gnupg build-essential software-properties-common gdb

RUN apt-get install -y libpcl-dev

RUN wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh \
  && chmod +x miniconda.sh \
  && ./miniconda.sh -b -p ~/local/miniconda

RUN . ~/local/miniconda/bin/activate && conda install -c pytorch pytorch==0.4.0

WORKDIR /home

and then everything seems to work pretty well:

(base) root@7354d80b0db7:/home# cat /home/
.git/        .gitignore   Dockerfile   README.md    __pycache__/ icp.py       icp_op.cpp   icp_test.py  setup.py
(base) root@7354d80b0db7:/home# cat /home/^C
(base) root@7354d80b0db7:/home# python -c 'from distutils.sysconfig import get_python_lib; print(get_python_lib())'
/root/local/miniconda/lib/python3.6/site-packages
(base) root@7354d80b0db7:/home# find /root/local/miniconda/lib/python3.6/site-packages -name cpp_extension.pty
(base) root@7354d80b0db7:/home# find /root/local/miniconda/lib/python3.6/site-packages -name cpp_extension.py
/root/local/miniconda/lib/python3.6/site-packages/torch/utils/cpp_extension.py
(base) root@7354d80b0db7:/home# ^Cnd /root/local/miniconda/lib/python3.6/site-packages -name cpp_extension.py
(base) root@7354d80b0db7:/home# less /root/local/miniconda/lib/python3.6/site-packages/torch/utils/cpp_extension.py
(base) root@7354d80b0db7:/home# ls
Dockerfile  README.md  __pycache__  icp.py  icp_op.cpp  icp_test.py  setup.py
(base) root@7354d80b0db7:/home# python setup.py build develop
running build
running build_ext
building 'icp_cpp' extension
creating build
creating build/temp.linux-x86_64-3.6
gcc -pthread -B /root/local/miniconda/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DEIGEN_YES_I_KNOW_SPARSE_MODULE_IS_NOT_STABLE_YET=1 -I/root/local/miniconda/lib/python3.6/site-packages/numpy/core/include -I/usr/include/pcl-1.7 -I/usr/include/ni -I/usr/include/eigen3 -I/usr/include/ni -I/root/local/miniconda/lib/python3.6/site-packages/torch/lib/include -I/root/local/miniconda/lib/python3.6/site-packages/torch/lib/include/TH -I/root/local/miniconda/lib/python3.6/site-packages/torch/lib/include/THC -I/root/local/miniconda/include/python3.6m -c icp_op.cpp -o build/temp.linux-x86_64-3.6/icp_op.o -DTORCH_EXTENSION_NAME=icp_cpp -std=c++11
cc1plus: warning: command line option '-Wstrict-prototypes' is valid for C/ObjC but not for C++
creating build/lib.linux-x86_64-3.6
g++ -pthread -shared -B /root/local/miniconda/compiler_compat -L/root/local/miniconda/lib -Wl,-rpath=/root/local/miniconda/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.6/icp_op.o -lpcl_registration -lpcl_segmentation -lpcl_features -lpcl_surface -lpcl_tracking -lpcl_filters -lpcl_sample_consensus -lpcl_visualization -lpcl_io -lOpenNI -lpcl_search -lpcl_kdtree -lflann_cpp -lpcl_octree -lpcl_common -o build/lib.linux-x86_64-3.6/icp_cpp.cpython-36m-x86_64-linux-gnu.so -lboost_system
running develop
running egg_info
creating icp_cpp.egg-info
writing icp_cpp.egg-info/PKG-INFO
writing dependency_links to icp_cpp.egg-info/dependency_links.txt
writing top-level names to icp_cpp.egg-info/top_level.txt
writing manifest file 'icp_cpp.egg-info/SOURCES.txt'
reading manifest file 'icp_cpp.egg-info/SOURCES.txt'
writing manifest file 'icp_cpp.egg-info/SOURCES.txt'
running build_ext
copying build/lib.linux-x86_64-3.6/icp_cpp.cpython-36m-x86_64-linux-gnu.so ->
Creating /root/local/miniconda/lib/python3.6/site-packages/icp-cpp.egg-link (link to .)
Adding icp-cpp 1.0 to easy-install.pth file

Installed /home
Processing dependencies for icp-cpp==1.0
Finished processing dependencies for icp-cpp==1.0
(base) root@7354d80b0db7:/home# python
.git/                                    README.md                                icp.py                                   icp_op.cpp
.gitignore                               __pycache__/                             icp_cpp.cpython-36m-x86_64-linux-gnu.so  icp_test.py
Dockerfile                               build/                                   icp_cpp.egg-info/                        setup.py
(base) root@7354d80b0db7:/home# python icp_test.py
tensor([[[-1.2634e+00, -2.3912e-01,  3.1981e-01],
         [-7.7116e-01,  3.9494e-02, -3.2341e-01],
         [-2.0449e+00,  6.7875e-01, -9.5829e-01],
         ...,
         [-1.1826e-01, -1.0028e+00, -8.6894e-02],
         [ 2.6089e-01, -8.6151e-02,  3.6891e-01],
         [ 1.4749e-01,  9.5050e-01, -4.9166e-01]]]) tensor([[[-1.6041, -0.4577,  0.8348],
         [ 1.0041, -1.2082, -0.4258],
         [ 0.1405, -1.9008, -1.2343],
         ...,
         [-0.1316, -0.4407, -0.4610],
         [ 1.6476, -1.3544,  0.5584],
         [-0.1154,  0.6452, -0.3808]]])

Did you make any progress on your end?

@onlytailei
Author

Thank you @goldsborough !
I tried your Docker setup. It really works!
I will check my environment and close the issue.
Many thanks!
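
One environment difference is visible in this thread: the Docker build log compiles with -I/usr/include/pcl-1.7, while the backtrace posted earlier references headers under /usr/local/include/pcl-1.8. Whether that difference caused the crash is not established here. As a small sketch for spotting such a mismatch when comparing build logs and backtraces, the hypothetical helpers below (`pcl_version_in_path`, `same_pcl_version`) extract the version from a "pcl-X.Y" path component using only the standard library:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Return the version that follows "pcl-" in a path, e.g. "1.7" for
// "/usr/include/pcl-1.7", or an empty string when no such component exists.
std::string pcl_version_in_path(const std::string& path) {
    const std::string marker = "pcl-";
    const std::size_t start = path.find(marker);
    if (start == std::string::npos) {
        return "";
    }
    const std::size_t begin = start + marker.size();
    std::size_t end = begin;
    // Consume digits and dots only, so "pcl-1.8/pcl/..." stops at the slash.
    while (end < path.size() &&
           (std::isdigit(static_cast<unsigned char>(path[end])) ||
            path[end] == '.')) {
        ++end;
    }
    return path.substr(begin, end - begin);
}

// Compare the PCL versions named in two paths; both must contain one.
bool same_pcl_version(const std::string& a, const std::string& b) {
    const std::string va = pcl_version_in_path(a);
    const std::string vb = pcl_version_in_path(b);
    assert(!va.empty() && !vb.empty());
    return va == vb;
}
```

Applied to the two paths quoted above, the helper reports "1.7" for the include flag and "1.8" for the backtrace path, so `same_pcl_version` returns false.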

@joshi-bharat

@onlytailei I am having the same issue. Were you able to resolve this problem? I am getting a segmentation fault at IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ> icp;
I am using stable PyTorch 1.1 and cuda-toolkit 10.0.

@healthysong

I am having the same issue. Were you able to resolve this problem? I am getting a segmentation fault at kdtree.setInputCloud(clouds);
