"java.net.SocketException: Socket closed" when in a cluster mode + Docker + acquireHostList enabled #384
The solution would be to simply check if the connection instance is still alive before returning it from the `VstCommunication.connect()` method.
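Something along these lines, as a minimal standalone sketch (the pool wrapper and all names here are hypothetical, not the driver's actual internals):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical pool wrapper illustrating the proposed guard: verify the
// cached socket is still open before handing it out, and reconnect if the
// host-list refresh has closed it in the meantime.
public class LiveConnectionPool {
    private final InetSocketAddress address;
    private Socket cached;

    public LiveConnectionPool(final String host, final int port) {
        this.address = new InetSocketAddress(host, port);
    }

    public synchronized Socket get() throws IOException {
        // isClosed() becomes true once close() has been called on the socket,
        // e.g. by the routine that discards stale hosts.
        if (cached == null || cached.isClosed()) {
            cached = new Socket();
            cached.connect(address, 5_000); // illustrative 5 s connect timeout
        }
        return cached;
    }
}
```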
Hi @wajda ,
This happens on multiple environments. We first ran into this issue on Kubernetes on AWS; then my colleague reproduced it locally using Docker 20, while it worked for me on Fedora's Docker 19 (moby-engine). After installing Docker CE 20 the issue occurred for me as well. On my localhost I use the following setup to reproduce it:
```sh
docker run -it --rm \
--name=adb \
-p 8528:8528 \
-v /var/run/docker.sock:/var/run/docker.sock \
arangodb/arangodb-starter \
--starter.local \
--starter.address=172.17.0.1 \
--docker.container=adb
```

(I'm not sure if I used the `--docker.container=adb` flag exactly.) For the driver config, enable `acquireHostList`. Then I access it via VST on `172.17.0.1`.
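For reference, the driver setup looks roughly like this (a sketch against the 6.x Java driver builder API; the host and port `172.17.0.1:8529`, the usual coordinator port, are my assumptions):

```java
import com.arangodb.ArangoDB;
import com.arangodb.Protocol;

public class ReproduceConnect {
    public static void main(final String[] args) {
        // Connect over VST with host-list acquisition enabled, as described above.
        final ArangoDB arango = new ArangoDB.Builder()
                .host("172.17.0.1", 8529)
                .useProtocol(Protocol.VST)
                .acquireHostList(true)
                .build();
        // The first request triggers the host-list refresh and, before a fix,
        // may fail with "java.net.SocketException: Socket closed".
        System.out.println(arango.getVersion().getVersion());
        arango.shutdown();
    }
}
```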
Hi @wajda , can you please try connecting the driver directly to the endpoints the server advertises, instead of the external alias?
On a local Docker, yes, that would work (as I mentioned in AbsaOSS/spline#869 (comment)). The problem is that it's not always possible in real production environments with more complicated networking, where, for instance, IPs are auto-generated and not stable enough to be put into a config file. So the first client connection needs to be made via an alias, for example.
I have an ArangoDB cluster on Kubernetes. The connection string to ArangoDB is arangodb-cluster.arango-namespace.svc.cluster.local:8529. Kubernetes doesn't have stable IPs and operates via DNS names. Arango pod names:

arangodb-cluster-crdn-c4eas9vw-044b59 (20.0.62.185)
arangodb-cluster-prmr-9njqj9u2-044b59 (20.0.63.161)

If I delete the pod arangodb-cluster-prmr-9njqj9u2-044b59 (20.0.63.161), Kubernetes will create a new pod with a new IP address.
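In other words, the driver has to be pointed at the stable service name, roughly like this (a sketch; the port and omitted credentials are assumptions):

```java
import com.arangodb.ArangoDB;

public class K8sConnect {
    public static void main(final String[] args) {
        // The service DNS name is stable even though the pod IPs behind it are not.
        final ArangoDB arango = new ArangoDB.Builder()
                .host("arangodb-cluster.arango-namespace.svc.cluster.local", 8529)
                .acquireHostList(true) // the refresh then swaps in the unstable pod IPs
                .build();
        System.out.println(arango.getVersion().getVersion());
        arango.shutdown();
    }
}
```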
Thanks for clarifying, it makes sense to me now!
The issue was first discovered in AbsaOSS/spline#869.

The error occurs in a combination of circumstances: cluster mode + Docker + `acquireHostList=true`.
My understanding of what is happening is the following.

When the VST connection is established, the respective `HostHandler` asks the `VstCommunication` class to refresh the host list from the server. When the new hosts are added to the set, the old ones (unless they point to exactly the same `ip:port`) are immediately discarded, along with all associated connection pools and sockets.

The problem is that the connection instance that has just been created, and that triggered the host list refresh in the first place, i.e. the one being returned from the `VstCommunication.connect()` method, holds a pointer to a host that might have just been discarded (and its socket closed) during this refresh routine. As a result, in these circumstances the `VstCommunication.connect()` method returns a connection that is already dead at the moment of its creation, with all the consequences.

This is exactly what happens when ArangoDB runs in a virtualized environment (Docker in our case) where the networking is organized in such a way that the client process addresses the server via a different IP (or host name) than the one the server sees from inside its own network.
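A simplified, self-contained illustration of that sequence (not the driver's real code; all names and addresses are made up):

```java
import java.util.ArrayList;
import java.util.List;

// Models the race: the host-list refresh discards the configured host and
// closes its sockets while a connection to that very host is being returned.
public class HostListRaceDemo {
    static class Host {
        final String address;       // "ip:port" as the client sees it
        boolean socketOpen = true;
        Host(final String address) { this.address = address; }
    }

    public static void main(final String[] args) {
        final List<Host> hostList = new ArrayList<>();
        final Host configured = new Host("172.17.0.1:8529"); // from the client config
        hostList.add(configured);

        // connect() creates a connection bound to the configured host...
        final Host connectionHost = configured;

        // ...and triggers acquireHostList. The server reports its *internal*
        // addresses, which do not match the address the client connected to.
        final List<String> acquired = List.of("10.0.0.5:8529", "10.0.0.6:8529");
        hostList.removeIf(h -> {
            if (!acquired.contains(h.address)) {
                h.socketOpen = false; // discarded host: its sockets are closed
                return true;
            }
            return false;
        });
        acquired.forEach(a -> hostList.add(new Host(a)));

        // connect() now hands back a connection whose socket is already closed.
        System.out.println("returned connection alive: " + connectionHost.socketOpen);
    }
}
```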
The issue is reproducible by spinning up a DB cluster via `arangodb-starter` in Docker and running the `ArangoDBTest.execute_acquireHostList_enabled()` test method against it.
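For reference, a hedged sketch of what such a reproduction test might look like (the actual `ArangoDBTest.execute_acquireHostList_enabled()` in the repo may differ; host and port are assumptions):

```java
import com.arangodb.ArangoDB;
import com.arangodb.Protocol;
import org.junit.Test;

public class AcquireHostListReproTest {
    @Test
    public void execute_acquireHostList_enabled() {
        final ArangoDB arango = new ArangoDB.Builder()
                .host("172.17.0.1", 8529) // cluster started via arangodb-starter
                .useProtocol(Protocol.VST)
                .acquireHostList(true)
                .build();
        try {
            // Before a fix, this first request could fail with
            // "java.net.SocketException: Socket closed".
            arango.getVersion();
        } finally {
            arango.shutdown();
        }
    }
}
```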