Latest arm/v7 docker image includes wrong arch plugin #239

Closed
groggemans opened this issue Oct 31, 2021 · 4 comments · Fixed by #240
@groggemans
Contributor

groggemans commented Oct 31, 2021

What happened:
Starting the latest container image on a Raspberry Pi 4 fails with an exec format error:

standard_init_linux.go:228: exec user process caused: exec format error

What you expected to happen:
The image to start successfully.

How to reproduce it:
Start the image on an arm v7 target (RPI 4)

sudo docker run -it mcr.microsoft.com/k8s/csi/nfs-csi:latest
Unable to find image 'mcr.microsoft.com/k8s/csi/nfs-csi:latest' locally
latest: Pulling from k8s/csi/nfs-csi
1d5132fc992f: Pull complete 
ce2f5493356f: Pull complete 
674f4f63de92: Pull complete 
Digest: sha256:b56fea83ef6b6896378e283a38ea8846d932499e723bbe3c354fa425fb361d85
Status: Downloaded newer image for mcr.microsoft.com/k8s/csi/nfs-csi:latest
standard_init_linux.go:228: exec user process caused: exec format error

Anything else we need to know?:
The downloaded image reports the correct architecture:

        "Architecture": "arm",                                                                                                                                                
        "Variant": "v7",                                                                                                                                                      
        "Os": "linux",    

Starting the image with /bin/sh as the entrypoint works, so only the nfsplugin binary seems to have been compiled for another architecture.
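One way to check that suspicion is to read the machine field of the binary's ELF header. The following is a minimal sketch, not part of the project: the helper name elf_machine is hypothetical, and it assumes a little-endian ELF file, which holds for the arm, arm64 and amd64 Linux targets.

```shell
# Hypothetical helper: print the target architecture recorded in an ELF header.
# Assumes a little-endian ELF file, so only the low byte of e_machine is read.
elf_machine() {
    # The first four bytes of every ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not-elf"
        return 1
    fi
    # e_machine is a 2-byte field at offset 18 of the ELF header.
    code=$(od -An -tu1 -j18 -N1 "$1" | tr -d ' \n')
    case "$code" in
        40)  echo "arm" ;;      # EM_ARM (32-bit ARM)
        62)  echo "x86_64" ;;   # EM_X86_64
        183) echo "aarch64" ;;  # EM_AARCH64
        *)   echo "unknown($code)" ;;
    esac
}
```

After copying /nfsplugin out of the container (for example with docker cp), elf_machine ./nfsplugin should print arm for a binary that can run on an arm/v7 target.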

Environment:

  • CSI Driver version: latest
  • Kubernetes version (use kubectl version): /
  • OS (e.g. from /etc/os-release): Raspbian GNU/Linux 10 (buster)
  • Kernel (e.g. uname -a): Linux rpi4 5.10.49-v7l+ #1436 SMP Wed Jul 14 14:18:38 BST 2021 armv7l GNU/Linux
  • Install tools: /
  • Others: /
@zhengtianbao

Maybe the GOARM flag is missing from the go build command. When cross-compiling, it is recommended to always set an appropriate GOARM value along with GOARCH.

How about changing this:

CGO_ENABLED=0 GOOS=linux GOARCH=arm go build -a -ldflags "${LDFLAGS} ${EXT_LDFLAGS}" -mod vendor -o bin/arm/v7/nfsplugin ./cmd/nfsplugin

to:

CGO_ENABLED=0 GOOS=linux GOARCH=arm GOARM=7 go build -a -ldflags "${LDFLAGS} ${EXT_LDFLAGS}" -mod vendor -o bin/arm/v7/nfsplugin ./cmd/nfsplugin

If that's right, I'd like to open a PR to fix it.

Reference: https://github.com/golang/go/wiki/GoArm
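As a rough illustration of pairing GOARM with GOARCH, the sketch below derives both settings from an architecture string such as arm/v7, matching the directory naming in the output path above. The helper name go_env_for_arch is hypothetical and is not part of the project's Makefile.

```shell
# Hypothetical helper: turn an architecture string into Go cross-compile settings.
# "arm/v7" yields GOARCH=arm plus GOARM=7; other values are passed through as GOARCH.
go_env_for_arch() {
    case "$1" in
        arm/v*)
            # Strip the "arm/v" prefix to get the ARM version number.
            echo "GOOS=linux GOARCH=arm GOARM=${1#arm/v}"
            ;;
        *)
            echo "GOOS=linux GOARCH=$1"
            ;;
    esac
}
```

The output could then prefix a build command, for example env CGO_ENABLED=0 $(go_env_for_arch arm/v7) go build ./cmd/nfsplugin.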

@andyzhangx
Member

I rebuilt the latest tag with the following diff. Does it work now?

# git diff
diff --git a/Makefile b/Makefile
index 2b85660..6653aa8 100644
--- a/Makefile
+++ b/Makefile
@@ -100,7 +100,7 @@ nfs:

 .PHONY: nfs-armv7
 nfs-armv7:
-       CGO_ENABLED=0 GOOS=linux GOARCH=arm go build -a -ldflags "${LDFLAGS} ${EXT_LDFLAGS}" -mod vendor -o bin/arm/v7/nfsplugin ./cmd/nfsplugin
+       CGO_ENABLED=0 GOOS=linux GOARCH=arm GOARM=7 go build -a -ldflags "${LDFLAGS} ${EXT_LDFLAGS}" -mod vendor -o bin/arm/v7/nfsplugin ./cmd/nfsplugin

 .PHONY: container-build
 container-build:

groggemans added a commit to groggemans/csi-driver-nfs that referenced this issue Nov 1, 2021
Moves the ARCH build argument after the first FROM so it has the proper
value.
(https://docs.docker.com/engine/reference/builder/#understand-how-arg-and-from-interact)

Adds the GOARM=7 option to the armv7 GO build.

Fixes kubernetes-csi#239
@groggemans
Contributor Author

Thank you for the fast reply!

I pulled the latest tag again but didn't see a change in the layer hash. Might I be pulling from the wrong mirror?

I also had some time to dig a bit deeper. I first compiled the binary and tried it directly on the RPi. Building with or without the GOARM flag made no difference, and both binaries worked, although I agree that it's a good idea to include it.

Then I tried to cross-compile the Docker image, and that's where I ran into trouble.
The build couldn't copy the binary because the source path was wrong:

make container-linux-armv7
docker buildx build --pull --output=type=docker --platform="linux/arm/v7" \
        -t andyzhangx/nfsplugin:v3.0.0-linux-arm-v7 --build-arg ARCH=arm/v7 .
[+] Building 0.4s (6/7)                                                                                                                                                       
 => [internal] load build definition from Dockerfile                                                                                                                     0.0s
 => => transferring dockerfile: 1.08kB                                                                                                                                   0.0s
 => [internal] load .dockerignore                                                                                                                                        0.0s
 => => transferring context: 2B                                                                                                                                          0.0s
 => [internal] load metadata for k8s.gcr.io/build-image/debian-base:buster-v1.6.0                                                                                        0.3s
 => [internal] load build context                                                                                                                                        0.0s
 => => transferring context: 25B                                                                                                                                         0.0s
 => [1/3] FROM k8s.gcr.io/build-image/debian-base:buster-v1.6.0@sha256:1c81250e8dc13a7b1c9fbd7a7f47843e1bc7c66cbf308f315772b88673a82bf2                                  0.0s
 => ERROR [2/3] COPY bin//nfsplugin /nfsplugin                                                                                                                           0.0s
------
 > [2/3] COPY bin//nfsplugin /nfsplugin:
------
error: failed to solve: failed to compute cache key: "/bin/nfsplugin" not found: not found
make: *** [Makefile:112: container-linux-armv7] Error 1

When building for all architectures, this step does succeed:

=> [2/3] COPY bin//nfsplugin /nfsplugin
=> [3/3] RUN ...

ARCH is not substituted into the bin path, so when all architectures are built the wrong binary is copied.
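The collapsed path can be reproduced outside Docker. This sketch assumes the Dockerfile's COPY source is written as bin/${ARCH}/nfsplugin, which is inferred from the build log rather than quoted from the file. The function name copy_source is hypothetical.

```shell
# Hypothetical stand-in for the COPY source path, assuming it is
# written as bin/${ARCH}/nfsplugin in the Dockerfile.
copy_source() {
    ARCH="$1"
    # An empty ARCH leaves two adjacent slashes, as seen in the build log.
    echo "bin/${ARCH}/nfsplugin"
}

copy_source ""        # empty ARCH, as when the build argument is not visible
copy_source "arm/v7"  # ARCH visible in the build stage
```

With an empty value the path degenerates to bin//nfsplugin, which would explain why the step only succeeds when a binary happens to exist directly under bin/.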

The cause seems to be that the ARCH build ARG is defined before the first FROM, which makes it inaccessible to the later build steps: https://docs.docker.com/engine/reference/builder/#understand-how-arg-and-from-interact

Placing the ARG after the FROM in the Dockerfile fixes the issue:

=> [2/3] COPY bin/arm/v7/nfsplugin /nfsplugin

I'll create a PR that combines the GOARM and ARG fixes.
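A quick way to spot this pattern in a Dockerfile is to check whether an ARG is declared ahead of the first FROM. The sketch below is only an illustration: the function name arg_declared_before_from is hypothetical, and it does not check whether the ARG is redeclared inside the build stage, which would also make it visible there.

```shell
# Hypothetical lint: succeed (exit 0) if ARG <name> appears before the first FROM.
# $1 = path to a Dockerfile, $2 = ARG name to look for.
arg_declared_before_from() {
    awk -v name="$2" '
        # Stop scanning at the first FROM; anything later is inside a build stage.
        toupper($1) == "FROM" { exit }
        toupper($1) == "ARG" {
            # Accept both "ARG NAME" and "ARG NAME=default".
            split($2, kv, "=")
            if (kv[1] == name) { found = 1; exit }
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}
```

Running arg_declared_before_from Dockerfile ARCH && echo "ARCH is declared outside the build stage" would flag a layout like the one described above.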

@groggemans
Contributor Author

I initially just moved the ARG ARCH=amd64 underneath the FROM in the Dockerfile, but that caused some issues with the code in release-tools/build.make. I then tried to fix that file as well, but one of the tests doesn't allow that. The final solution applied in the PR is to let ARG ARCH have no default: the behavior is the same as before when ARCH is not provided, and the ARCH argument is used properly when it is provided.

k8s-ci-robot added a commit that referenced this issue Nov 2, 2021