Skip to content

AVX2 makes PCL crash #5806

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Vakarian15 opened this issue Sep 11, 2023 · 20 comments
Closed

AVX2 makes PCL crash #5806

Vakarian15 opened this issue Sep 11, 2023 · 20 comments
Labels

Comments

@Vakarian15
Copy link

Vakarian15 commented Sep 11, 2023

Describe the bug

I am trying to downsample a pcl::PointXYZRGBNormal PointCloud using a VoxelGrid filter.
AVX2 makes my program crash, I tried it with 1.12.0 and 1.13.1, both of them have the same problem.

To Reproduce

int main()
{
	pcl::PointCloud<pcl::PointXYZRGBNormal>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZRGBNormal>());
	pcl::PointCloud<pcl::PointXYZRGBNormal>::Ptr cloud_filtered(new pcl::PointCloud<pcl::PointXYZRGBNormal>());

	// Fill in the cloud data
	pcl::PCDReader reader;
	reader.read("test.pcd", *cloud); 

	// Create the filtering object
	pcl::VoxelGrid<pcl::PointXYZRGBNormal> sor;
	sor.setInputCloud(cloud);
	sor.setLeafSize(0.01f, 0.01f, 0.01f);
	sor.filter(*cloud_filtered);
	return (0);
}

If I disable AVX2, it works. If I replace pcl::PointCloud<pcl::PointXYZRGBNormal> with pcl::PCLPointCloud2, it works. So I'm not sure it is a problem of pcl::PointXYZRGBNormal or pcl::VoxelGrid.
Here is the sample project including the dataset: PCLTest.zip

Screenshots/Code snippets
Stack track
image

image

Your Environment (please complete the following information):

  • OS: Win11
  • Compiler: CMake, MSVC 14.27
  • PCL Version: 1.12.0 and 1.13.1 download from vcpkg

Additional context

In 1.13.1, both pcl::PointXYZRGBNormal and pcl::VoxelGrid have PCL_MAKE_ALIGNED_OPERATOR_NEW macro. So I have no idea what the problem is. Thanks in advance.

@Vakarian15 Vakarian15 added kind: bug Type of issue status: triage Labels incomplete labels Sep 11, 2023
@larshg
Copy link
Contributor

larshg commented Sep 12, 2023

@Vakarian15 can you try find the configure log file from VCPKG build of PCL - maybe we can see something in the log, whether PCL was build with AVX support or not.
If its not and you enable it, in your project, it can fail with these errors.

@Vakarian15
Copy link
Author

Vakarian15 commented Sep 12, 2023

@Vakarian15 can you try find the configure log file from VCPKG build of PCL - maybe we can see something in the log, whether PCL was build with AVX support or not. If its not and you enable it, in your project, it can fail with these errors.

@larshg Thank you for your swift response. Here is the configure log
config-x64-windows-CMakeCache.txt.log
It has PCL_ENABLE_AVX:BOOL=ON. Not sure whether this means it has AVX series support or only support AVX. But I still get the same error when I enable AVX instead of AVX2 in my project

@larshg
Copy link
Contributor

larshg commented Sep 12, 2023

That was the cmake cache and yes, the variable you refer to, is that it should test for AVX(2), but I can't find any "HAVE_AVX" or the like.

I was requesting the configuration output when running cmake.
It should be like:
C:\vcpkg\buildtrees\pcl\install-x64-windows-rel-out.log
Or
C:\vcpkg\buildtrees\pcl\configure-x64-windows-rel-out.log

Its example from the azure pipline, but hopefully you can find similiar :-)

@larshg
Copy link
Contributor

larshg commented Sep 12, 2023

I can find
HAVE_SSE4_2_EXTENSIONS:INTERNAL=1
Which indicates you have at least sse 4.2.

@Vakarian15
Copy link
Author

Vakarian15 commented Sep 12, 2023

Here are the files
config-x64-windows-out.log
install-x64-windows-rel-out.log
And you are right, I cannot find AVX flags.
I tired to create a overlay port, by adding -DPCL_ENABLE_AVX=ON to profile.cmake. Delete buildtrees/pcl folder and install pcl with the overlay port
image
And here is the new config log
config-x64-windows-out.log
-DPCL_ENABLE_AVX=ON was passed to cmake successfully.
But the problem is not fixed. Could you please have a look? @larshg Is there something I did wrong?

@larshg
Copy link
Contributor

larshg commented Sep 13, 2023

Ahh, vcpkg have a patch for SSE https://github.com/microsoft/vcpkg/blob/master/ports/pcl/fix-check-sse.patch

But not for AVX and they don't parse default CXX settings, so it will never check for AVX instructions.
You can try make a patch similiar to the SSE one and add this to the vcpkg portfile.

@Vakarian15
Copy link
Author

Ahh, vcpkg have a patch for SSE https://github.com/microsoft/vcpkg/blob/master/ports/pcl/fix-check-sse.patch

But not for AVX and they don't parse default CXX settings, so it will never check for AVX instructions. You can try make a patch similiar to the SSE one and add this to the vcpkg portfile.

I see, thanks for your time!

@Vakarian15
Copy link
Author

Vakarian15 commented Sep 13, 2023

I made a patch for AVX
Screenshot 2023-09-13 111549
Then add the patch to protfile.cmake
Screenshot 2023-09-13 111637
Screenshot 2023-09-13 111659
Then remove pcl from vcpkg and reinstall with overlay ports
And here is the new log file
config-x64-windows-out.log
It says "Performing Test HAVE_AVX2 - Success" in the log.
But the problem is not fixed.

@larshg
Copy link
Contributor

larshg commented Sep 13, 2023

And you have enabled AVX in your project as well as copied new dlls etc?

@Vakarian15
Copy link
Author

Yes, I even copied the whole vcpkg\installed\x64-windows\bin folder.
I tried allinone installer, it doesn't work.
I also tired to build from source with -DPCL_ENABLE_AVX=ON, it doesn't work either.

@Aposhian
Copy link

Aposhian commented Sep 28, 2023

I was able to get AVX2 to work by using only custom point types with EIGEN_ALIGN32 (and then making sure to define PCL_NO_PRECOMPILE). It would be nice to have preprocessor directives for PCL that made it easier to change max alignment, similar to how Eigen does it upstream.

@Neumann-A
Copy link

I find it funny that there are two connected issues so close together. I think #5805 is the problem. PCL is using EIGEN_ALIGN16 which probably does not work with /arch:AVX2. I would say: Let the compiler decide the alignment or make it dependent on the target architecture.

@mvieth
Copy link
Member

mvieth commented Feb 9, 2024

Hi, thank you for your suggestions, but I tested and I see no reason to assume that EIGEN_ALIGN16 causes the problem:

  • first of all, I could reproduce the issue with the original code (pcl::PointXYZRGBNormal, no PCL_NO_PRECOMPILE, PCL installed via vcpkg)
  • if I simply add #define PCL_NO_PRECOMPILE at the top, it works
  • if I add a custom point type that is basically a clone of pcl::PointXYZRGBNormal (including EIGEN_ALIGN16), and add define PCL_NO_PRECOMPILE (necessary for custom point type), it works

Same as Lars, I think the problem is likely that vcpkg built PCL without AVX/AVX2, so the precompiled classes use a "vanilla" malloc instead of Eigen's custom aligned malloc. When the user code is compiled with AVX/AVX2, Eigen's aligned_free is used (not a "vanilla" free). The mismatch between the "vanilla" malloc and Eigen's aligned_free causes the problem. When using PCL_NO_PRECOMPILE, the PCL code is compiled together with the user code, and all flags and architecture options are consistent.
Adding a patch for AVX (like the patch that vcpkg has for SSE) is a good first step, but I believe it did not take effect yet because there is a second check for "${CMAKE_CXX_FLAGS}" STREQUAL "${CMAKE_CXX_FLAGS_DEFAULT}" : https://github.com/PointCloudLibrary/pcl/blob/master/CMakeLists.txt#L150 So AVX and AVX2 are found, but the flags are not passed to CMAKE_CXX_FLAGS.
I will make some more tests, and then create a PR for PCL, and perhaps also submit a patch to vcpkg.

@Aposhian
Copy link

As another data point, I encountered this bug on Linux with installing PCL binaries from apt, so it isn't just vcpkg. I haven't tried using the built-in EIGEN_ALIGN16 while simply specifying PCL_NO_PRECOMPILE.

@s-trinh
Copy link

s-trinh commented May 20, 2025

The current tutorial (https://pcl.readthedocs.io/projects/tutorials/en/master/using_pcl_pcl_config.html / pcl-1.15.0-rc1) advices to use:

add_definitions(${PCL_DEFINITIONS})

In my tests on Ubuntu, PCL_DEFINITIONS is empty and I think PCL_COMPILE_OPTIONS should be used instead?

Moreover, AVX flags are only added to PCL_COMPILE_OPTIONS and not to PCL_DEFINITIONS :

pcl/PCLConfig.cmake.in

Lines 449 to 454 in 46f15a6

#set SSE flags used compiling PCL
list(APPEND PCL_DEFINITIONS @PCLCONFIG_SSE_DEFINITIONS@)
list(APPEND PCL_COMPILE_OPTIONS @PCLCONFIG_SSE_COMPILE_OPTIONS@)
#set AVX flags used compiling PCL
list(APPEND PCL_COMPILE_OPTIONS @PCLCONFIG_AVX_COMPILE_OPTIONS@)

It looks like only with MSVC we have SSE flags added to SSE_DEFINITIONS (this would explain why PCL_DEFINITIONS is empty with me):

elseif(MSVC)
if(SSE_LEVEL GREATER_EQUAL 4.2)
string(APPEND SSE_DEFINITIONS " -D__SSE4_2__")
endif()
if(SSE_LEVEL GREATER_EQUAL 4.1)
string(APPEND SSE_DEFINITIONS " -D__SSE4_1__")
endif()
if(SSE_LEVEL GREATER_EQUAL 3.1)
string(APPEND SSE_DEFINITIONS " -D__SSSE3__")
endif()
if(SSE_LEVEL GREATER_EQUAL 3.0)
string(APPEND SSE_DEFINITIONS " -D__SSE3__")
endif()
if(SSE_LEVEL GREATER_EQUAL 2.0)
string(APPEND SSE_DEFINITIONS " -D__SSE2__")
endif()
if(SSE_LEVEL GREATER_EQUAL 1.0)
string(APPEND SSE_DEFINITIONS " -D__SSE__")
endif()
endif()

And thus PCLCONFIG_SSE_DEFINITIONS is empty on Ubuntu:

set(PCLCONFIG_SSE_DEFINITIONS "${SSE_DEFINITIONS}")
set(PCLCONFIG_SSE_COMPILE_OPTIONS ${SSE_FLAGS})
set(PCLCONFIG_AVX_COMPILE_OPTIONS ${AVX_FLAGS})


Regarding the possible Eigen alignment issue:

#if !defined(PCL_SILENCE_MALLOC_WARNING) && !defined(__NVCC__)
#if PCL_USES_EIGEN_HANDMADE_ALIGNED_MALLOC
// EIGEN_DEFAULT_ALIGN_BYTES and EIGEN_MALLOC_ALREADY_ALIGNED will be set after including Eigen/Core
// this condition is the same as in the function aligned_malloc in Memory.h in the Eigen code
#if (defined(EIGEN_DEFAULT_ALIGN_BYTES) && EIGEN_DEFAULT_ALIGN_BYTES==0) || (defined(EIGEN_MALLOC_ALREADY_ALIGNED) && EIGEN_MALLOC_ALREADY_ALIGNED)
#if defined(_MSC_VER)
#error "Potential runtime error due to aligned malloc mismatch! You likely have to compile your code with AVX enabled or define EIGEN_MAX_ALIGN_BYTES=32 (to silence this message at your own risk, define PCL_SILENCE_MALLOC_WARNING=1)"
#else // defined(_MSC_VER)
#warning "Potential runtime error due to aligned malloc mismatch! You likely have to compile your code with AVX enabled or define EIGEN_MAX_ALIGN_BYTES=32 (to silence this message at your own risk, define PCL_SILENCE_MALLOC_WARNING=1)"
#endif // defined(_MSC_VER)
#endif
#else // PCL_USES_EIGEN_HANDMADE_ALIGNED_MALLOC
#if (defined(EIGEN_DEFAULT_ALIGN_BYTES) && EIGEN_DEFAULT_ALIGN_BYTES!=0) && (defined(EIGEN_MALLOC_ALREADY_ALIGNED) && !EIGEN_MALLOC_ALREADY_ALIGNED)
#if defined(_MSC_VER)
#error "Potential runtime error due to aligned malloc mismatch! PCL was likely compiled without AVX support but you enabled AVX for your code (to silence this message at your own risk, define PCL_SILENCE_MALLOC_WARNING=1)"
#else // defined(_MSC_VER)
#warning "Potential runtime error due to aligned malloc mismatch! PCL was likely compiled without AVX support but you enabled AVX for your code (to silence this message at your own risk, define PCL_SILENCE_MALLOC_WARNING=1)"
#endif // defined(_MSC_VER)
#endif
#endif // PCL_USES_EIGEN_HANDMADE_ALIGNED_MALLOC
#endif // !defined(PCL_SILENCE_MALLOC_WARNING)

I think -march=native is enabled by default:

if(PCL_ENABLE_MARCHNATIVE AND (NOT CMAKE_CROSSCOMPILING))
if(HAVE_MARCH)
string(APPEND SSE_FLAGS " -march=native")
else()
string(APPEND SSE_FLAGS " -mtune=native")
endif()
message(STATUS "Using CPU native flags for SSE optimization: ${SSE_FLAGS}")
endif()

This means that on recent x86 architectures, AVX512 instruction sets will be enabled and EIGEN_MAX_ALIGN_BYTES=64 should actually be used instead of 32?
Anyway, this approach looks too complicated for a regular user.

@larshg
Copy link
Contributor

larshg commented May 21, 2025

@s-trinh in newer versions, the PCL_DEFINITIONS and PCL_COMPILE_OPTIONS is actually not really required anymore.

If you print the variables for a given cmake target, ie. pcl_common you get this list:

1> [CMake] pcl_common BINARY_DIR = C:/dev/pcl issues/iss_keypoints_extract/out/build/x64-Debug
1> [CMake] pcl_common CXX_MODULE_SETS = 
1> [CMake] pcl_common HEADER_SETS = 
1> [CMake] pcl_common IMPORTED = TRUE
1> [CMake] pcl_common IMPORTED_GLOBAL = FALSE
1> [CMake] pcl_common IMPORTED_IMPLIB = C:/dev/pcl/out/install/x64-Release/lib/pcl_common.lib
1> [CMake] pcl_common INTERFACE_COMPILE_DEFINITIONS = __SSE4_2__;__SSE4_1__;__SSSE3__;__SSE3__;__SSE2__;__SSE__;BOOST_ALL_NO_LIB
1> [CMake] pcl_common INTERFACE_COMPILE_FEATURES = cxx_std_17
1> [CMake] pcl_common INTERFACE_COMPILE_OPTIONS = $<$<COMPILE_LANGUAGE:CXX>:/arch:AVX2>
1> [CMake] pcl_common INTERFACE_CXX_MODULE_SETS = 
1> [CMake] pcl_common INTERFACE_HEADER_SETS = 
1> [CMake] pcl_common INTERFACE_INCLUDE_DIRECTORIES = C:/dev/vcpkg/installed/x64-windows/include/eigen3;C:/dev/pcl/out/install/x64-Release/include/pcl-1.15;C:/dev/pcl/out/install/x64-Release/include/pcl-1.15
1> [CMake] pcl_common INTERFACE_LINK_LIBRARIES = Eigen3::Eigen;Boost::system;Boost::iostreams;Boost::filesystem;Boost::serialization
1> [CMake] pcl_common NAME = pcl_common
1> [CMake] pcl_common POSITION_INDEPENDENT_CODE = True
1> [CMake] pcl_common SOURCE_DIR = C:/dev/pcl issues/iss_keypoints_extract
1> [CMake] pcl_common SYSTEM = ON
1> [CMake] pcl_common TYPE = SHARED_LIBRARY

Hence all the required flags are added, if you just use:
target_link_libraries(your_app PRIVATE pcl_common)

Cmake code for printing target properties can be found here.

The above is on windows.

This means that on recent x86 architectures, AVX512 instruction sets will be enabled and EIGEN_MAX_ALIGN_BYTES=64 should actually be used instead of 32?

I don't think this is set anywhere explicitly in PCL. In what you refer to, its only mentioned in comments.

@s-trinh
Copy link

s-trinh commented May 21, 2025

@larshg

Thanks for your answer.
Using target_link_libraries(your_app PRIVATE pcl_common) is better and the tutorials could be updated accordingly.


This is what I have on Ubuntu/Linux:

  • output of cmake ../pcl -DBUILD_tools=OFF -DBUILD_global_tests=OFF -DPCL_DISABLE_GPU_TESTS=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=$(pwd)/install:
-- DOXYGEN_FOUND 
-- HTML_HELP_COMPILER 
-- Found CPack generators: DEB
-- PCL build with following flags:
--  -Wabi=18 -Wall -Wextra -fno-strict-aliasing -msse4.2 -mfpmath=sse -march=native -mavx2 -fopenmp
-- The following subsystems will be built:
--   common
  • printing pcl_common using:
## https://stackoverflow.com/questions/32183975/how-to-print-all-the-properties-of-a-target-in-cmake/56738858#56738858
## https://stackoverflow.com/a/56738858/3743145

## Get all properties that cmake supports
execute_process(COMMAND cmake --help-property-list OUTPUT_VARIABLE CMAKE_PROPERTY_LIST)
## Convert command output into a CMake list
STRING(REGEX REPLACE ";" "\\\\;" CMAKE_PROPERTY_LIST "${CMAKE_PROPERTY_LIST}")
STRING(REGEX REPLACE "\n" ";" CMAKE_PROPERTY_LIST "${CMAKE_PROPERTY_LIST}")

list(REMOVE_DUPLICATES CMAKE_PROPERTY_LIST)

function(print_target_properties tgt)
    if(NOT TARGET ${tgt})
      message("There is no target named '${tgt}'")
      return()
    endif()

    foreach (prop ${CMAKE_PROPERTY_LIST})
        string(REPLACE "<CONFIG>" "${CMAKE_BUILD_TYPE}" prop ${prop})
        get_target_property(propval ${tgt} ${prop})
        if (propval)
            message ("${tgt} ${prop} = ${propval}")
        endif()
    endforeach(prop)
endfunction(print_target_properties)

print_target_properties(pcl_common)

gives:

 pcl_common BINARY_DIR = [...]/pcl-example-TMP/build
 pcl_common IMPORTED = TRUE
 pcl_common IMPORTED_CONFIGURATIONS = RELEASE;DEBUG
 pcl_common INTERFACE_COMPILE_FEATURES = cxx_std_17
 pcl_common INTERFACE_COMPILE_OPTIONS = $<$<COMPILE_LANGUAGE:CXX>:-msse4.2;-mfpmath=sse;-march=native;-mavx2>
 pcl_common INTERFACE_INCLUDE_DIRECTORIES =
 /usr/include/eigen3;[...]/pcl-build-TMP/install/include/pcl-1.15;[...]/pcl-build-TMP/install/include/pcl-1.15
 pcl_common INTERFACE_LINK_LIBRARIES = Eigen3::Eigen;Boost::system;Boost::iostreams;Boost::filesystem;Boost::serialization
 pcl_common LOCATION = [...]/pcl-build-TMP/install/lib/libpcl_common.so
 pcl_common LOCATION_Release = [...]/pcl-build-TMP/install/lib/libpcl_common.so
 pcl_common MACOSX_PACKAGE_LOCATION = [...]/pcl-build-TMP/install/lib/libpcl_common.so
 pcl_common NAME = pcl_common
 pcl_common POSITION_INDEPENDENT_CODE = True
 pcl_common SOURCE_DIR = [...]/pcl-example-TMP
 pcl_common SYSTEM = ON
 pcl_common TYPE = SHARED_LIBRARY
 pcl_common VS_DEPLOYMENT_LOCATION = [...]/pcl-build-TMP/install/lib/libpcl_common.so

-march=native will enable the following flags on a recent CPU and thus AVX512 instructions set:

gcc -march=native -E -v - </dev/null 2>&1 | grep cc1

/usr/libexec/gcc/x86_64-linux-gnu/13/cc1 -E -quiet -v -imultiarch x86_64-linux-gnu -mmmx -mpopcnt -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -mavx -mavx2 -mno-sse4a -mno-fma4 -mno-xop -mfma -mavx512f -mbmi -mbmi2 -maes -mpclmul -mavx512vl -mavx512bw
[...]

Ultimately, using target_link_libraries(your_app PRIVATE pcl_common) should be the way to go (using the same optimizations than PCL) and not trying to follow the message in You likely have to compile your code with AVX enabled or define EIGEN_MAX_ALIGN_BYTES=32 in my opinion (recent CPUs have AVX512 and PCL is built with -march=native by default on Linux).

@mvieth
Copy link
Member

mvieth commented May 22, 2025

@s-trinh Linking to ${PCL_LIBRARIES} in CMake is the the recommended (and modern) way of doing it. It will automatically take care of architecture options and optimizations. ${PCL_LIBRARIES} contains pcl_common as well as pcl_io, pcl_filters, ... (all the PCL modules that have been requested in the find_package call, or all modules if no specific modules have been requested). I agree that the tutorials should be updated and not mention PCL_DEFINITIONS or any of the PCL_*_DIRS variables any more. I can do that when I have the time.
Regarding the message You likely have to compile your code with AVX enabled or define EIGEN_MAX_ALIGN_BYTES=32: This check was primarily added for people who do not use CMake (and perhaps for people who use CMake but not in the recommended way). The main point is that Eigen has a special way of creating aligned memory, which is inherited by PCL and activated when compiling with AVX (or higher, e.g. AVX512). Even if the CPU supports AVX512, enabling just AVX or defining EIGEN_MAX_ALIGN_BYTES=32 should be enough to avoid alignment issues and segmentation faults. So people who use the recommended CMake way will never see this warning. People who do not use CMake and see this warning, can solve it by enabling AVX or by enabling AVX512 (if the CPU supports it) or by defining EIGEN_MAX_ALIGN_BYTES=32 or EIGEN_MAX_ALIGN_BYTES=64.
For further information, please see #6184 .
By the way, whether PCL is built with -march=native also depends on the installation method. Yes, it is enabled by default, but some package managers (for example vcpkg) disable it.

@mvieth mvieth added module: cmake and removed status: triage Labels incomplete labels May 22, 2025
@s-trinh
Copy link

s-trinh commented May 22, 2025

@mvieth
Thanks for the answer.

Even if the CPU supports AVX512, enabling just AVX or defining EIGEN_MAX_ALIGN_BYTES=32 should be enough to avoid alignment issues and segmentation faults.

Regardless of CMake, to be sure, if PCL is built with optimizations for AVX512 for a CPU with AVX512, it is thus necessary in the user code to have 64 bytes alignment for Eigen objects. The quoted phrase is ambiguous to me.

fspindle added a commit to fspindle/visp that referenced this issue May 22, 2025
- Instead of using PCL_COMPILE_OPTIONS cmake var, we are using INTERFACE_COMPILE_OPTIONS property
  over pcl components to build an internal var named PCL_DEPS_COMPILE_OPTIONS
- This new strategy is the one recommended in PointCloudLibrary/pcl#5806
- We recall that propagating pcl compile option is requested to avoid
  Potential runtime error due to aligned malloc mismatch!
  See lagadic#1681
@larshg
Copy link
Contributor

larshg commented Jun 2, 2025

Closing as its not a bug in PCL per se. But rather a configuration issue thats not aligned between PCL build libraries and user code. There have been added various checks and documentation on how to correctly handle this.

@larshg larshg closed this as completed Jun 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants