Skip to content

[ML] Native compilation and unit test for Linux on aarch64 #1132

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 14, 2020

Conversation

droberts195
Copy link
Contributor

@droberts195 droberts195 commented Apr 8, 2020

This PR adds the ability to compile and unit test the code on
the aarch64 architecture.

It is NOT workable aarch64 support. That would additionally
require the ability to cross compile for aarch64 from the
x86_64 release manager and CI VMs, and also changes to the
Gradle scripts that assemble the final artifacts.

However, the relative ease of getting the code to compile and
run on aarch64 suggests that supporting this platform is a
realistic goal.

The unit test changes highlight that aarch64 floating point
calculations produce slightly different results to the
equivalent x86_64 calculations.

This PR adds the ability to compile and unit test the code on
the aarch64 architecture.

It is NOT workable aarch64 support.  That would additionally
require the ability to cross compile for aarch64 from the
x86_64 release manager and CI VMs, and also changes to the
Gradle scripts that assemble the final artifacts.

However, the relative ease of getting the code to compile and
run on aarch64 suggests that supporting this platform is a
realistic goal.

The unit test changes highlight that aarch64 floating point
calculations produce slightly different results to the
equivalent x86_64 calculations.  We also need to decide which,
if any, SIMD flags to use on aarch64.  As it stands this PR is
using none.
Copy link
Contributor

@tveasey tveasey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good. Just some minor comments on build flags.

mk/linux.mk Outdated
RAPIDJSONCPPFLAGS=-DRAPIDJSON_HAS_STDSTRING -DRAPIDJSON_SSE42
else
RAPIDJSONCPPFLAGS=-DRAPIDJSON_HAS_STDSTRING
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use NEON optimisations?

Suggested change
RAPIDJSONCPPFLAGS=-DRAPIDJSON_HAS_STDSTRING
RAPIDJSONCPPFLAGS=-DRAPIDJSON_HAS_STDSTRING -RAPIDJSON_NEON

@@ -23,9 +24,15 @@ COVERAGE=--coverage
endif
endif

ifeq ($(HARDWARE_ARCH),aarch64)
ARCHCFLAGS=-march=armv8-a+crc+crypto
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out that all aarch64 CPUs have SIMD instructions, so there's no need to tell the compiler it can use them - it just will. The Amazon Graviton processors used in EC2 aarch64 instances also support the CRC and Crypto instructions, so I told the compiler it can use them. These were added to the ARMv8 architecture fairly early on, so should not restrict usage too much. I suspect any hardware that didn't support these instructions would be unable to run ML well anyway, e.g. a 5 year old Raspberry Pi.

Copy link
Contributor

@tveasey tveasey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@droberts195 droberts195 merged commit 58ee917 into elastic:master Apr 14, 2020
@droberts195 droberts195 deleted the native_arm branch April 14, 2020 08:44
droberts195 added a commit to droberts195/ml-cpp that referenced this pull request Apr 14, 2020
This PR adds the ability to compile and unit test the code on
the aarch64 architecture.

It is NOT workable aarch64 support.  That would additionally
require the ability to cross compile for aarch64 from the
x86_64 release manager and CI VMs, and also changes to the
Gradle scripts that assemble the final artifacts.

However, the relative ease of getting the code to compile and
run on aarch64 suggests that supporting this platform is a
realistic goal.

The unit test changes highlight that aarch64 floating point
calculations produce slightly different results to the
equivalent x86_64 calculations.

Backport of elastic#1132
droberts195 added a commit to droberts195/ml-cpp that referenced this pull request Apr 14, 2020
Following on from elastic#1132, this change adds:

1. The ability to cross compile a Linux build for aarch64 on
   Linux on another hardware architecture
2. A Docker container for x86_64 that is set up for cross
   compiling for aarch64
3. A Docker container for aarch64 that can both build and run
   unit tests
4. Changes to CI scripts to utilise these Docker containers
   where appropriate
droberts195 added a commit that referenced this pull request Apr 14, 2020
This PR adds the ability to compile and unit test the code on
the aarch64 architecture.

It is NOT workable aarch64 support.  That would additionally
require the ability to cross compile for aarch64 from the
x86_64 release manager and CI VMs, and also changes to the
Gradle scripts that assemble the final artifacts.

However, the relative ease of getting the code to compile and
run on aarch64 suggests that supporting this platform is a
realistic goal.

The unit test changes highlight that aarch64 floating point
calculations produce slightly different results to the
equivalent x86_64 calculations.

Backport of #1132
droberts195 added a commit that referenced this pull request Apr 15, 2020
…1135)

Following on from #1132, this change adds:

1. The ability to cross compile a Linux build for aarch64 on
   Linux on another hardware architecture
2. A Docker container for x86_64 that is set up for cross
   compiling for aarch64
3. A Docker container for aarch64 that can both build and run
   unit tests
4. Changes to CI scripts to utilise these Docker containers
   where appropriate
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants