[ML] Add cross compilation support, Docker images and CI for aarch64 #1135
@@ -75,8 +75,25 @@ case `uname` in
 STL_LOCATION=
 ZLIB_LOCATION=
 else
-echo "Cannot cross compile to $CPP_CROSS_COMPILE"
-exit 3
+SYSROOT=/usr/local/sysroot-$CPP_CROSS_COMPILE-linux-gnu
+BOOST_LOCATION=$SYSROOT/usr/local/gcc75/lib
+BOOST_COMPILER=gcc
+if [ "$CPP_CROSS_COMPILE" = aarch64 ] ; then
+BOOST_ARCH=a64
+else
+echo "Cannot cross compile to $CPP_CROSS_COMPILE"
+exit 3
+fi
+BOOST_EXTENSION=mt-${BOOST_ARCH}-1_71.so.1.71.0
+BOOST_LIBRARIES='atomic chrono date_time filesystem iostreams log log_setup program_options regex system thread'
+XML_LOCATION=$SYSROOT/usr/local/gcc75/lib
+XML_EXTENSION=.so.2
+GCC_RT_LOCATION=$SYSROOT/usr/local/gcc75/lib64
+GCC_RT_EXTENSION=.so.1
+STL_LOCATION=$SYSROOT/usr/local/gcc75/lib64
+STL_PREFIX=libstdc++
+STL_EXTENSION=.so.6
+ZLIB_LOCATION=
 fi
 ;;

@@ -183,7 +200,7 @@ fi
 case `uname` in

 Linux)
-if [ -n "$INSTALL_DIR" -a -z "$CPP_CROSS_COMPILE" ] ; then
+if [ -n "$INSTALL_DIR" -a "$CPP_CROSS_COMPILE" != macosx ] ; then
 cd "$INSTALL_DIR"
 for FILE in `find . -type f | egrep -v '^core|-debug$|libMl'`
 do
@@ -192,13 +209,7 @@ case `uname` in
 if [ $? -eq 0 ] ; then
 echo "Set RPATH in $FILE"
 else
-# Set RPATH for 3rd party libraries that reference other libraries we ship
-ldd $FILE | grep /usr/local/lib >/dev/null 2>&1 && patchelf --set-rpath '$ORIGIN/.' $FILE
-if [ $? -eq 0 ] ; then
-echo "Set RPATH in $FILE"
-else
-echo "Did not set RPATH in $FILE"
-fi
+echo "Did not set RPATH in $FILE"
 fi
 done
 fi

Review comment on the removed `ldd ... patchelf` lines: I removed this bit because it hasn't been doing anything since we switched from storing the 3rd party libraries we build from
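The effect of the changed `if` condition can be read as a small predicate. The sketch below is illustrative only: `should_set_rpath` is a hypothetical helper, not a function in the script, and it simply mirrors the new test (`INSTALL_DIR` non-empty and `CPP_CROSS_COMPILE` not equal to `macosx`).

```shell
#!/bin/bash
# Hypothetical helper mirroring the new condition in the diff.
# Returns 0 (true) when the RPATH loop would run, 1 otherwise.
should_set_rpath() {
    local install_dir="$1"
    local cross_compile="$2"
    [ -n "$install_dir" ] && [ "$cross_compile" != macosx ]
}

# Native build (empty), aarch64 cross build, macOS cross build
for target in "" aarch64 macosx ; do
    if should_set_rpath /tmp/install "$target" ; then
        echo "CPP_CROSS_COMPILE='$target': set RPATH"
    else
        echo "CPP_CROSS_COMPILE='$target': skip RPATH"
    fi
done
```

Under the old condition (`-z "$CPP_CROSS_COMPILE"`) the aarch64 case would have been skipped as well, which is why the test had to change once a Linux-to-Linux cross compile became possible.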
@@ -0,0 +1,181 @@
# Machine Learning Build Machine Setup for Linux aarch64 cross compiled

You will need the following environment variables to be defined:

- `JAVA_HOME` - Should point to the JDK you want to use to run Gradle.
- `CPP_CROSS_COMPILE` - Should be set to "aarch64".
- `CPP_SRC_HOME` - Only required if building the C++ code directly using `make`, as Gradle sets it automatically.
- `PATH` - Must have `/usr/local/gcc75/bin` before `/usr/bin` and `/bin`.
- `LD_LIBRARY_PATH` - Must have `/usr/local/gcc75/lib64` and `/usr/local/gcc75/lib` before `/usr/lib` and `/lib`.

For example, you might create a `.bashrc` file in your home directory containing this:

```
umask 0002
export JAVA_HOME=/usr/local/jdk1.8.0_121
export LD_LIBRARY_PATH=/usr/local/gcc75/lib64:/usr/local/gcc75/lib:/usr/lib:/lib
export PATH=$JAVA_HOME/bin:/usr/local/gcc75/bin:/usr/bin:/bin:/usr/sbin:/sbin
# Only required if building the C++ code directly using make - adjust depending on the location of your Git clone
export CPP_SRC_HOME=$HOME/ml-cpp
export CPP_CROSS_COMPILE=aarch64
```

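The ordering requirements on `PATH` and `LD_LIBRARY_PATH` are easy to get wrong, so a quick check can help. This is a minimal sketch, assuming a hypothetical helper named `path_precedes` that is not part of the repository; it reports whether one directory appears before another in a colon-separated list.

```shell
#!/bin/bash
# Hypothetical helper: succeed if directory $1 appears before directory $2
# in the colon-separated list $3 (defaults to the current PATH).
path_precedes() {
    local first="$1" second="$2" list="${3:-$PATH}"
    local dir
    local IFS=:
    for dir in $list ; do
        [ "$dir" = "$first" ] && return 0
        [ "$dir" = "$second" ] && return 1
    done
    # Neither directory was found
    return 1
}

if path_precedes /usr/local/gcc75/bin /usr/bin ; then
    echo "PATH order looks right"
else
    echo "/usr/local/gcc75/bin is missing from PATH or comes after /usr/bin"
fi
```

The same function can be pointed at `LD_LIBRARY_PATH` by passing it as the third argument, for example `path_precedes /usr/local/gcc75/lib64 /usr/lib "$LD_LIBRARY_PATH"`.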
### Initial Preparation

Start by configuring a native Linux aarch64 build server as described in [linux.md](linux.md).

The remainder of these instructions assume that the native aarch64 build server you have configured runs CentOS 7, which is what builds for distribution are currently built on.

On the fully configured native aarch64 build server, run the following commands:

```
cd /usr
tar jcvf ~/usr-aarch64-linux-gnu.tar.bz2 include lib lib64 local
```

These instructions assume the host platform is also CentOS 7, but x86_64 instead of aarch64. It makes life much easier if the host distribution is the same as the target distribution.

Transfer the archive created in your home directory on the native aarch64 build server, `usr-aarch64-linux-gnu.tar.bz2`, to your home directory on the x86_64 host build server.

### OS Packages

You need the C++ compiler, the headers for the `zlib` library that comes with the OS, and `texinfo`. You also need the archive utilities `unzip` and `bzip2`. On RHEL/CentOS these can be installed using:

```
sudo yum install bzip2 gcc-c++ texinfo unzip zlib-devel
```

### Transferred Build Dependencies

Add the dependencies that you copied from the fully configured native aarch64 build server in the "Initial Preparation" step.

```
sudo mkdir -p /usr/local/sysroot-aarch64-linux-gnu/usr
cd /usr/local/sysroot-aarch64-linux-gnu/usr
sudo tar jxvf ~/usr-aarch64-linux-gnu.tar.bz2
cd ..
sudo ln -s usr/lib lib
sudo ln -s usr/lib64 lib64
```

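Before moving on it may be worth confirming that the sysroot has the expected shape. The following is only a sketch: `check_sysroot_layout` is a hypothetical helper, and it checks nothing beyond the `usr` directory and the two symlinks created by the commands in this section.

```shell
#!/bin/bash
# Hypothetical check: confirm a sysroot has the layout created by the
# commands in this section. $1 is the sysroot directory.
check_sysroot_layout() {
    local sysroot="$1" name
    [ -d "$sysroot/usr" ] || { echo "missing $sysroot/usr" ; return 1 ; }
    for name in lib lib64 ; do
        # Each top-level entry should be a relative symlink into usr
        if [ ! -L "$sysroot/$name" ] || [ "$(readlink "$sysroot/$name")" != "usr/$name" ] ; then
            echo "$sysroot/$name is not a symlink to usr/$name"
            return 1
        fi
    done
    echo "sysroot layout OK"
}
```

It would be called as `check_sysroot_layout /usr/local/sysroot-aarch64-linux-gnu`.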
### General settings for building the tools

Most of the tools are built via a GNU "configure" script. There are some environment variables that affect the behaviour of this. Therefore, when building ANY tool on Linux, set the following environment variables:

```
export CFLAGS='-g -O3 -fstack-protector -D_FORTIFY_SOURCE=2'
export CXXFLAGS='-g -O3 -fstack-protector -D_FORTIFY_SOURCE=2'
export LDFLAGS='-Wl,-z,relro -Wl,-z,now'
export LDFLAGS_FOR_TARGET='-Wl,-z,relro -Wl,-z,now'
unset LIBRARY_PATH
```

These environment variables only need to be set when building tools on Linux. They should NOT be set when compiling the Machine Learning source code (as this should pick up all settings from our Makefiles).

### binutils (bootstrap version)

Since we build with a more recent gcc than the one that comes with the host system, we must build gcc from source. Building a cross compiler requires cross build tools, so we first need to build a version of binutils that is compatible with the system compiler that we'll use to build the more recent gcc.

Download `binutils-2.25.tar.bz2` from <http://ftpmirror.gnu.org/binutils/binutils-2.25.tar.bz2>.

Uncompress and untar the resulting file. Then run:

```
unset LD_LIBRARY_PATH
export PATH=/usr/bin:/bin:/usr/sbin:/sbin
./configure --with-sysroot=/usr/local/sysroot-aarch64-linux-gnu --target=aarch64-linux-gnu --with-system-zlib --disable-multilib --disable-libstdcxx
```

This should build an appropriate Makefile. Assuming it does, type:

```
make
sudo make install
```

to install.

### gcc

We have to build on old Linux versions to enable our software to run on the older versions of Linux that users have. However, this means the default compiler on our Linux build servers is also very old. To enable use of more modern C++ features, we use the default compiler to build a newer version of gcc and then use that to build all our other dependencies.

Download `gcc-7.5.0.tar.gz` from <http://ftpmirror.gnu.org/gcc/gcc-7.5.0/gcc-7.5.0.tar.gz>.

Unlike most automake-based tools, gcc must be built in a directory adjacent to the directory containing its source code, so build and install it like this:

```
tar zxvf gcc-7.5.0.tar.gz
cd gcc-7.5.0
contrib/download_prerequisites
sed -i -e 's/$(SHLIB_LDFLAGS)/-Wl,-z,relro -Wl,-z,now $(SHLIB_LDFLAGS)/' libgcc/config/t-slibgcc
cd ..
mkdir gcc-7.5.0-build
cd gcc-7.5.0-build
unset LD_LIBRARY_PATH
export PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin
../gcc-7.5.0/configure --prefix=/usr/local/gcc75 --with-sysroot=/usr/local/sysroot-aarch64-linux-gnu --target=aarch64-linux-gnu --enable-languages=c,c++ --enable-vtable-verify --with-system-zlib --disable-multilib
make -j 6
sudo env PATH="$PATH" make install
```

(Note the `env PATH="$PATH"` bit in the install command - it passes the current `PATH` through `sudo`, because the cross tools we put in `/usr/local/bin` are needed during the install.)

To confirm that everything works correctly run:

```
aarch64-linux-gnu-g++ --version
```

It should print:

```
aarch64-linux-gnu-g++ (GCC) 7.5.0
```

in the first line of the output. If it doesn't then double check that `/usr/local/gcc75/bin` is near the beginning of your `PATH`.

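The version check can also be scripted. This is a sketch under stated assumptions: `is_expected_cross_gxx` is a hypothetical helper, and it only compares the start of the banner line against the string shown in this section.

```shell
#!/bin/bash
# Hypothetical helper: check that a compiler banner line starts with the
# cross compiler name and version this document builds.
is_expected_cross_gxx() {
    case "$1" in
        "aarch64-linux-gnu-g++ (GCC) 7.5.0"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only query the compiler if it is actually on the PATH
if command -v aarch64-linux-gnu-g++ >/dev/null 2>&1 ; then
    banner=$(aarch64-linux-gnu-g++ --version | head -n 1)
    if is_expected_cross_gxx "$banner" ; then
        echo "Cross compiler OK: $banner"
    else
        echo "Unexpected cross compiler: $banner"
    fi
else
    echo "aarch64-linux-gnu-g++ not found on PATH"
fi
```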
### binutils (final version)

Because we build on old Linux versions yet want to use modern libraries, we also have to install an up-to-date version of binutils. It is used in preference to the bootstrap version as long as `/usr/local/gcc75/bin` is at the beginning of `PATH`.

Download `binutils-2.34.tar.bz2` from <http://ftpmirror.gnu.org/binutils/binutils-2.34.tar.bz2>.

Uncompress and untar the resulting file. Then run:

```
./configure --prefix=/usr/local/gcc75 --with-sysroot=/usr/local/sysroot-aarch64-linux-gnu --target=aarch64-linux-gnu --enable-vtable-verify --with-system-zlib --disable-multilib --disable-libstdcxx --with-gcc-major-version-only
```

This should build an appropriate Makefile. Assuming it does, type:

```
make
sudo make install
```

to install.

### patchelf

Obtain patchelf from <http://nixos.org/releases/patchelf/patchelf-0.9/> - the download file will be `patchelf-0.9.tar.bz2`.

Extract it to a temporary directory using:

```
bzip2 -cd patchelf-0.9.tar.bz2 | tar xvf -
```

In the resulting `patchelf-0.9` directory, run the configure script:

```
./configure --prefix=/usr/local/gcc75
```

This should build an appropriate Makefile. Assuming it does, run:

```
make
sudo make install
```

to complete the build.
@@ -0,0 +1,32 @@
#!/bin/bash
#
# Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
# or more contributor license agreements. Licensed under the Elastic License;
# you may not use this file except in compliance with the Elastic License.
#

# Builds the Docker image that can be used to compile the machine learning
# C++ code for Linux.
#
# This script is not intended to be run regularly. When changing the tools
# or 3rd party components required to build the machine learning C++ code
# increment the version, change the Dockerfile and build a new image to be
# used for subsequent builds on this branch. Then update the version to be
# used for builds in docker/linux_builder/Dockerfile.

HOST=push.docker.elastic.co
ACCOUNT=ml-dev
REPOSITORY=ml-linux-aarch64-cross-build
VERSION=1

set -e

cd `dirname $0`

docker build --no-cache -t $HOST/$ACCOUNT/$REPOSITORY:$VERSION linux_aarch64_cross_image
# Get a username and password for this by visiting
# https://docker.elastic.co:7000 and allowing it to authenticate against your
# GitHub account
docker login $HOST
docker push $HOST/$ACCOUNT/$REPOSITORY:$VERSION

@@ -0,0 +1,38 @@
#!/bin/bash
#
# Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
# or more contributor license agreements. Licensed under the Elastic License;
# you may not use this file except in compliance with the Elastic License.
#

# Builds the Docker image that can be used to compile the machine learning
# C++ code for Linux.
#
# This script is not intended to be run regularly. When changing the tools
# or 3rd party components required to build the machine learning C++ code
# increment the version, change the Dockerfile and build a new image to be
# used for subsequent builds on this branch. Then update the version to be
# used for builds in docker/linux_builder/Dockerfile.

if [ `uname -m` != aarch64 ] ; then
echo "Native build images must be built on the correct hardware architecture"
echo "Required: aarch64, Current:" `uname -m`
exit 1
fi

HOST=push.docker.elastic.co
ACCOUNT=ml-dev
REPOSITORY=ml-linux-aarch64-native-build
VERSION=1

set -e

cd `dirname $0`

docker build --no-cache -t $HOST/$ACCOUNT/$REPOSITORY:$VERSION linux_aarch64_native_image
# Get a username and password for this by visiting
# https://docker.elastic.co:7000 and allowing it to authenticate against your
# GitHub account
docker login $HOST
docker push $HOST/$ACCOUNT/$REPOSITORY:$VERSION

@@ -0,0 +1,27 @@
#
# Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
# or more contributor license agreements. Licensed under the Elastic License;
# you may not use this file except in compliance with the Elastic License.
#

# Increment the version here when a new tools/3rd party components image is built
FROM docker.elastic.co/ml-dev/ml-linux-aarch64-cross-build:1

MAINTAINER David Roberts <[email protected]>

# Copy the current Git repository into the container
COPY . /ml-cpp/

# Tell the build we want to cross compile
ENV CPP_CROSS_COMPILE aarch64

# Pass through any version qualifier (default none)
ARG VERSION_QUALIFIER=

# Pass through whether this is a snapshot build (default yes if not specified)
ARG SNAPSHOT=yes

# Run the build
RUN \
/ml-cpp/dev-tools/docker/docker_entrypoint.sh

Is there any reason to set `SYSROOT`, etc. if this `if` condition fails? It seems more natural to just add this as a new `elif [ "$CPP_CROSS_COMPILE" = aarch64 ] ; then` and keep the failure case at the end. Then if we added another target we could just add a new `elif`.

The reason I did it like this is that during the porting exercise I realised that a port to Linux on any other architecture could reuse a lot of the same code. Everything between lines 78 and 96 would be the same except the Boost abbreviation for the hardware architecture.

(This is also why I named the new make rules file `linux_cross_compile_linux.mk`. I started out with `linux_cross_compile_aarch64.mk` but realised before I opened the PR that almost the entire file would be the same for a cross compile to Linux on any other hardware architecture. Maybe in 5 years time we'll be cross compiling x86_64 from aarch64.)
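
To illustrate the point about shared settings, here is a rough sketch of how a second Linux target might slot in. It is not code from the PR: `boost_arch_for` is a hypothetical helper, and the `x86_64` entry (with the Boost tag `x64`) is an assumed future target used only as an example. Only `aarch64` is supported by the change itself.

```shell
#!/bin/bash
# Sketch: settings shared by every Linux cross compile target, with only
# the Boost architecture tag varying per target.
# boost_arch_for is a hypothetical helper; the PR inlines this logic.
boost_arch_for() {
    case "$1" in
        aarch64) echo a64 ;;
        x86_64)  echo x64 ;;   # hypothetical future target, not in the PR
        *)
            echo "Cannot cross compile to $1" >&2
            return 3
            ;;
    esac
}

CPP_CROSS_COMPILE=${CPP_CROSS_COMPILE:-aarch64}
if BOOST_ARCH=$(boost_arch_for "$CPP_CROSS_COMPILE") ; then
    # Everything below is identical for every supported target
    SYSROOT=/usr/local/sysroot-$CPP_CROSS_COMPILE-linux-gnu
    BOOST_EXTENSION=mt-${BOOST_ARCH}-1_71.so.1.71.0
    echo "$SYSROOT $BOOST_EXTENSION"
fi
```

With this shape, adding a target means adding one `case` branch, while the `SYSROOT`-derived locations stay in one place. The reviewer's `elif` layout would reach the same result but would repeat those shared assignments in each branch.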