Skip to content
This repository was archived by the owner on Feb 12, 2024. It is now read-only.

In-browser incorrect ipfs.swarm.connect on refresh #3400

Closed
hunterInt opened this issue Nov 15, 2020 · 16 comments
Closed

In-browser incorrect ipfs.swarm.connect on refresh #3400

hunterInt opened this issue Nov 15, 2020 · 16 comments
Labels
kind/enhancement A net-new feature or improvement to an existing feature status/ready Ready to be worked

Comments

@hunterInt
Copy link

Version:
0.50.2

Platform:
Linux ubuntu 5.4.0-52-generic #57-Ubuntu SMP Thu Oct 15 10:57:00 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Severity:
Medium - A non-essential functionality does not work, performance issues, etc.

Description:

When I first open my webpage and attempt to connect to another IPFS Node that I own I can successfully connect:

1st

However, when I refresh the same page I get this error:

2nd

You'll notice that both the successful and unsuccessful connections are attempting to connect to the same peerID (QmeYrvjcWzJTGRv8aJVfManvQfsBaN5fPwhXmMCzaLL) but with the incorrect multiAddr the second time (/ip4/127.0.0.1/tcp/4002/p2p/QmeYrvjcWzJTGRv8aJVfManvQfsBaN5fPwhXmMCzaLL).

I'm assuming this is because JS-IPFS is attempting to connect to the external node via its peerID through the JS-IPFS node within the browser. Which obviously won't work since the browser is not listening to 127.0.0.1.

How can I prevent JS-IPFS from attempting to connect to a local (127.0.0.1) when I already gave it a domain and multiAddr to connect to (/dns4/example.com/tcp/4005/wss/p2p/QmeYrvjcWzJTGRv8aJVfManvQfsBaN5fPwhXmMCzaLL)?

Steps to reproduce the error:

ipfsNode = await IPFS.create();

const serverMultiAddr = `/dns4/example.com/tcp/4005/wss/p2p/QmeYrvjcWzJTGRv8aJVfManvQfsBaN5fPwhXmMCzaLL`;
console.log('serverMultiAddr', serverMultiAddr);

const connected = await ipfsNode.swarm.connect(serverMultiAddr);
console.log('connected', connected);
@hunterInt hunterInt added the need/triage Needs initial labeling and prioritization label Nov 15, 2020
@achingbrain
Copy link
Member

libp2p stores known connection details in it's peer store and will try to reconnect when the node starts up (on page load after refreshing the page in this case).

The remote node is probably advertising WSS and TCP addresses. Since browser nodes cannot connect to TCP addresses it's not clear why libp2p would attempt to do so as it can only fail.

@vasco-santos can these addresses be filtered out or otherwise not connected to?

@achingbrain achingbrain added kind/enhancement A net-new feature or improvement to an existing feature status/ready Ready to be worked and removed need/triage Needs initial labeling and prioritization labels Nov 16, 2020
@vasco-santos
Copy link
Member

vasco-santos commented Nov 16, 2020

Something looks strange in this case. Could you please log the AddressBook content after the swarm.connect and let me know the output? ipfs.libp2p.peerStore.addressBook.get(peerId: PeerId)

As far as I understand, the dial is failing, and that is the strange part. The flow behind is the following:

First start:

  • Node starts
  • Node discover peers over time (adds their addresses to the AddressBook) and eventually connect to them (a manual dial will include the address being discovered and added to the Book)
  • Once a connection is established, the identify protocol kicks in and peers exchange their announce addresses and protocols
  • Node stops, all the information is persisted

Next starts

  • Node starts
  • If autoDial is enabled (true by default), Node checks its persisted peerStore
  • For each entry in the AddressBook (within a configurable threshold), it will create a DialTarget with all the known addresses of the peer
  • The dialer will try to connect to the target once it has availability (and multiple addresses might be tried in parallel)
  • For each address, the dialer will check if there is a transport for it (https://github.com/libp2p/js-libp2p-interfaces/tree/master/src/transport#filtering-addresses) and if there is, its dial function will be called, otherwise this address fails (no action performed)
  • It will stop once a connection using one of the addresses is accomplished (Promise resolved), or if all the known addresses fail (Promise rejected)

Considering the above, it is expected that the tcp based addresses will simply throw as there will be no transport for it. If all the addresses fail, the Aggregate error is thrown, which seems to be the case.

The problematic point here is why all the addresses fail, taking into account that the dns websockets multiaddr should work.

@hunterInt
Copy link
Author

@vasco-santos thank you for looking into this and your in-depth explanation! I added the following code:

const PeerId = require('peer-id');

const peerIdBuffer = Buffer.from(serverIPFSPeerID, 'utf8');
const peerID = new PeerId(peerIdBuffer);

const addressBook = await ipfsNode.libp2p.peerStore.addressBook.get(peerID);
console.log('addressBook', addressBook);

And addressBook returns undefined.

If it helps I tested this on both Firefox and Brave.

@vasco-santos
Copy link
Member

You are doing that exactly after the connect right?

Perhaps the peerId is not well created. You can probably use const peerId = PeerId.createFromCID('QmeYrvjcWzJTGRv8aJVfManvQfsBaN5fPwhXmMCzaLL'). Or even better, you can use connection.remotePeer as it is the PeerId of remote.

// Connect code above

const addressesRecord = ipfsNode.libp2p.peerStore.addressBook.get(connection.remotePeer)

// You can also get all the peers and iterate them
const peers = ipfsNode.libp2p.peerStore.peers

// Listen for changes triggered from identify/...
ipfsNode.libp2p.peerStore.on('change:multiaddrs', ({ peerId, multiaddrs}) => {
  if (peerId.equals(connection.remotePeer) {
    console.log(multiaddrs)
  }
})

Let me know the results in both first run and second run. It might be good to check the PeerStore before the dial, in order to make sure you are using a new ipfs repo.

btw, if you need the docs are here:

@hunterInt
Copy link
Author

@vasco-santos yep exactly after the connection. Creating the peerId from PeerId.createFromCID('QmeYrvjcWzJTGRv8aJVfManvQfsBaN5fPwhXmMCzaLL') seemed to work, however.

Here is what I am doing:

const serverMultiAddr = '/dns4/' + serverDomain + '/tcp/4005/wss/p2p/' + serverIPFSPeerID;
console.log('serverMultiAddr', serverMultiAddr);
const connection = await ipfsNode.swarm.connect(serverMultiAddr).catch(console.error);
console.log('connection', connection);

const addressesRecord = await ipfsNode.libp2p.peerStore.addressBook.get(connection.remotePeer);

// You can also get all the peers and iterate them
const peers = ipfsNode.libp2p.peerStore.peers;
console.log('peers', peers);

ipfsNode.libp2p.peerStore.on('change:multiaddrs', ({ peerId, multiaddrs }) => {
	if (peerId.equals(connection.remotePeer)) {
		console.log(multiaddrs);
		console.log(multiaddrs.length);

		for (let i = 0; i < multiaddrs.length; i++) {
			console.log(multiaddrs[i]);
			console.log(new TextDecoder('utf-8').decode(multiaddrs[i].bytes));
			console.log('\n');
		}

	}
});

const PeerId = require('peer-id');
const peerID = PeerId.createFromCID(serverIPFSPeerID);
const addressBook = await ipfsNode.libp2p.peerStore.addressBook.get(peerID);

console.log('addressBook', addressBook);
for (let i = 0; i < addressBook.length; i++) {
	console.log('addressBook[i].multiaddr', addressBook[i].multiaddr)
}

And here is the output:

image

Once I reloaded the page this is the errors that I got:

er

Since connection was undefined once the page was reloaded ipfsNode.libp2p.peerStore.addressBook.get doesn't seem to return anything. I have noticed however that it isn't every always the 2nd page reload that causes these errors, sometimes it is on the 2nd or 3rd reload.

@vasco-santos
Copy link
Member

Thanks for the logs. For what I can see, after a dial happens your browser node will know 6 multiaddrs for the other peer.
Once it will start again, it will try to connect to it, but it is only trying out 5 multiaddrs, which all fail. I believe that the 5 multiaddrs in the AggregateError do not include the one you are using for the manual dia, right?

const serverMultiAddr = '/dns4/' + serverDomain + '/tcp/4005/wss/p2p/' + serverIPFSPeerID;
console.log('serverMultiAddr', serverMultiAddr);

const PeerId = require('peer-id');
const serverPeerID = PeerId.createFromCID(serverIPFSPeerID);
const addressBookRecordBeforeDial = ipfsNode.libp2p.peerStore.addressBook.get(serverPeerID);

// This should be empty in the first load, and contain known multiaddrs in the next load
console.log('addressBookRecordBeforeDial', addressBookRecordBeforeDial.length)
for (let i = 0; i < addressBookRecordBeforeDial.length; i++) {
	console.log('addressBookRecordBeforeDial[i].multiaddr', addressBookRecordBeforeDial[i].multiaddr.toString())
}

// This will log the multiaddrs discovered over time for the peer
ipfsNode.libp2p.peerStore.on('change:multiaddrs', ({ peerId, multiaddrs }) => {
	if (peerId.equals(serverPeerID)) {
                  console.log('new multiaddrs discovered')
		console.log(multiaddrs.map((m) => m.toString());
		console.log(multiaddrs.length);
	}
})

const connection = await ipfsNode.swarm.connect(serverMultiAddr).catch(console.error);
console.log('connection', connection);

const addressBookRecordAfterDial = ipfsNode.libp2p.peerStore.addressBook.get(serverPeerID);

// This should have all the peers that will appear in the second dial
console.log('addressBookRecordBeforeDial', addressBookRecordAfterDial.length)
for (let i = 0; i < addressBookRecordAfterDial.length; i++) {
	console.log('addressBookRecordAfterDial[i].multiaddr', addressBookRecordAfterDial[i].multiaddr.toString())
}

It would be good to run this and ack my assumptions in the comments. I guess they might be wrong causing the issue.

So, I already have a suspicion for this. I believe that the address that you are using is being lost. My theory is:

First load

  • No peers known, need to wait for peers to be discovered
  • A manual dial happens, via a multiaddr. This multiaddr will be stored in the AddressBook
  • Connection is established and identify protocol kicks in
  • During identify, the peer multiaddrs are exchanged and stored in the AddressBook
    • We see the identify protocol as the source of truth, so we replace everything we knew by the multiaddrs we received
    • ⚠️ We did not receive the multiaddr used

Second load

  • We know peers, let's connect to them
  • A manual dial kicks in, but libp2p understand that this peer is already being dialed (it was persisted)
  • The manual dial will sit and wait for the Promise of the previous dial to be fulfilled, but it will be rejected as the 5 addresses fail.
    • The address is not used

What can be done to mitigate this?

  • A hack to test this out would be to overwrite the peer addressBook after the dial
    • Add in the end: ipfsNode.libp2p.peerStore.addressBook.set(serverPeerID, [multiaddr(serverMultiAddr)])
    • This should make the next dial on load just use this multiaddr instead of the ones that were discovered
  • Are you responsible for the serverPeer?

From libp2p stand point, we should look at improving the dial flow for these cases. We should probably inspect the content of an on going dial to see if we have new addresses for the peer in the second dial. I will open an issue to track that once we confirm this is the issue

@hunterInt
Copy link
Author

hunterInt commented Nov 19, 2020

I believe that the 5 multiaddrs in the AggregateError do not include the one you are using for the manual dia, right?

@vasco-santos one of the 5 multiaddrs does have the correct IP address, but it is incorrectly over TCP and with the incorrect port:

four

Thank you for sending over that code snippet. Here are the results on the 1st page load:

first load no errors

After reloading the page for about an hour I still haven't been able to reproduce the error... Not sure why.

Are you responsible for the serverPeer?

Yes.

This will make your peer announce its public address in the identify protocol, instead of the local addresses that are not reachable.

Not quite sure that I follow. Is this for the swarm addresses that my server node will listen on? This is what I currently have:

    "Swarm": [
      "/ip4/0.0.0.0/tcp/4002",
      "/ip4/127.0.0.1/tcp/4003/ws"
    ],

And I have an NginX server listening on port 4005 for secure websocket connections and then forwarding those connections to my insecure IPFS sever listening on port 4003. Since the only connections my node should have is from the browser via a websocket should I remove the "/ip4/0.0.0.0/tcp/4002", from the swam list? When the server's node starts up this is what it prints out:

Swarm listening on /ip4/127.0.0.1/tcp/4002/p2p/QmeYrvjcWzJTGRv8aJVfManvQfsBaN5fPwhXmMCzaLL
Swarm listening on /ip4/server.ip.address/tcp/4002/p2p/QmeYrvjcWzJTGRv8aJVfManvQfsBaN5fPwhXmMCzaLL
Swarm listening on /ip4/10.10.0.7/tcp/4002/p2p/QmeYrvjcWzJTGRv8aJVfManvQfsBaN5fPwhXmMCzaLL
Swarm listening on /ip4/10.136.15.235/tcp/4002/p2p/QmeYrvjcWzJTGRv8aJVfManvQfsBaN5fPwhXmMCzaLL
Swarm listening on /ip4/127.0.0.1/tcp/4003/ws/p2p/QmeYrvjcWzJTGRv8aJVfManvQfsBaN5fPwhXmMCzaLL

@hunterInt
Copy link
Author

hunterInt commented Nov 19, 2020

I removed the "/ip4/0.0.0.0/tcp/4002", from my server node's swarm list. Now when I start up the server I only get this:

Swarm listening on /ip4/127.0.0.1/tcp/4003/ws/p2p/QmeYrvjcWzJTGRv8aJVfManvQfsBaN5fPwhXmMCzaLL

I also have added in your recommended addressBook.set( method so that my code now looks like this:

const serverMultiAddr = '/dns4/' + serverDomain + '/tcp/4005/wss/p2p/' + serverIPFSPeerID;
console.log('serverMultiAddr', serverMultiAddr);

const PeerId = require('peer-id');
const serverPeerID = PeerId.createFromCID(serverIPFSPeerID);

const connection = await ipfsNode.swarm.connect(serverMultiAddr).catch(console.error);
console.log('connection', connection);

ipfsNode.libp2p.peerStore.addressBook.set(serverPeerID, [multiaddr(serverMultiAddr)]);// <-------- added this

So far I have been able to successfully connect to my server's node through ipfsNode.swarm.connect(serverMultiAddr) on every page reload without any issues.

UPDATE:
This appears to not have fixed the issue:

fff

@vasco-santos
Copy link
Member

vasco-santos commented Nov 20, 2020

@vasco-santos one of the 5 multiaddrs does have the correct IP address, but it is incorrectly over TCP and with the incorrect port:

The TCP address makes sense, since you were also listening on TCP.

Not quite sure that I follow. Is this for the swarm addresses that my server node will listen on? This is what I currently have:

Those swarm addresses will map into listen addresses of libp2p. However, it is recommended that your libp2p node announces the addresses that it can be dialed to. Perhaps we should add it to the IPFS config to ease this (I will think about it better).

So, my recommendation was to do:

const ipfs = await IPFS.create({
    libp2p: {
      addresses: {
        announce: ['/dns4/ipfs.io'] // Your DNS
      }
    }
  })

WIth this, your server node will announce its dns address on the identify exchange and everything should simply work. However, it seems that you are using a IPFS daemon directly, without using the programatically API. I will see if we can add this to the configuration to ease this process.

And I have an NginX server listening on port 4005 for secure websocket connections and then forwarding those connections to my insecure IPFS sever listening on port 4003. Since the only connections my node should have is from the browser via a websocket should I remove the "/ip4/0.0.0.0/tcp/4002", from the swam list? When the server's node starts up this is what it prints out:

If you do not have any use case where peers outside of the browser context will connect to your node, then I can recommend you to remove it as you don't need to be listening for TCP connections. By the way, if 0.0.0.0 is exposing your server ip, did you try using /ip4/0.0.0.0/tcp/4003/ws. I am not sure on how our listening here works, but if it prints out the swarm address for the public ip with websocket star, this could also work.

This appears to not have fixed the issue:

Ok, even hackier approach for now to have things running. You can do the addressBook set just before the dial and it will need to simply work.

Meanwhile, we need to get this done properly, let me know how it worked with the 0.0.0.0 while I will create an issue for having the config with announce addresses

EDIT: Added PR with the ability to add the announce addresses #3409

achingbrain pushed a commit that referenced this issue Nov 20, 2020
This PR adds the ability to add announces addresses via the config file and create options.

This is really useful for browser contexts, so that other peers can announce their public addresses, like dns addresses.

Ref: #3400
@hunterInt
Copy link
Author

So, my recommendation was to do:

const ipfs = await IPFS.create({
    libp2p: {
      addresses: {
        announce: ['/dns4/ipfs.io'] // Your DNS
      }
    }
  })

However, it seems that you are using a IPFS daemon directly, without using the programatically API.

This works. I'm using JS-IPFS on NodeJs for my server so this will work.

By the way, if 0.0.0.0 is exposing your server ip, did you try using /ip4/0.0.0.0/tcp/4003/ws

I tried this, but the server started to listen on /ip4/MY IP/tcp/4003/ws which I do not believe will work since JS-IPFS cannot directly listen for secure websocket connections. Currently my server listens for secure websocket connections on port 4005 using NginX (since NginX allows me to use my server's certificate to secure websocket connections). Then the, now insecure, websocket connection is forwarded to my JS-IPFS server listening on port 4003. So if I listen directly on /ip4/0.0.0.0/tcp/4003/ws the browser will not allow for the connection since it is insecure.

Ok, even hackier approach for now to have things running. You can do the addressBook set just before the dial and it will need to simply work.

I now have this set-up (just to be safe I added the addressBook.set before and after haha):

await ipfsNode.libp2p.peerStore.addressBook.set(serverPeerID, [multiaddr(serverMultiAddr)]);
const connection = await ipfsNode.swarm.connect(serverMultiAddr).catch(console.error);
console.log('connection', connection);

await ipfsNode.libp2p.peerStore.addressBook.set(serverPeerID, [multiaddr(serverMultiAddr)]);

And so far this is working.

@hunterInt
Copy link
Author

Also, maybe it is already being considered, but it would be nice for JS-IPFS to be able to listen on secure websocket connections directly without an NginX reverse proxy server. 🙂

@vasco-santos
Copy link
Member

This works. I'm using JS-IPFS on NodeJs for my server so this will work.

Nice, so if you do the announce with your dns multiaddr, you should not need to do the addressBook hack anymore.

Also, maybe it is already being considered, but it would be nice for JS-IPFS to be able to listen on secure websocket connections directly without an NginX reverse proxy server. 🙂

We are talking about some ideas to improve that experience, also taking into account go-ipfs. We should get a nicer solution in the next year

@hunterInt
Copy link
Author

Nice, so if you do the announce with your dns multiaddr, you should not need to do the addressBook hack anymore.

Ok that makes sense, I've added that in.

Well now after a reload or two I am still getting an error, but this one seems slightly different:
error occurred during XX handshake: Peer ID doesn't match libp2p public key:
now

const serverMultiAddr = '/dns4/' + serverDomain + '/tcp/4005/wss/p2p/' + serverIPFSPeerID;
const serverPeerID = PeerId.createFromCID(serverIPFSPeerID);
const addressBookRecordBeforeDial = ipfsNode.libp2p.peerStore.addressBook.get(serverPeerID);

await ipfsNode.libp2p.peerStore.addressBook.set(serverPeerID, [multiaddr(serverMultiAddr)]);
const connection = await ipfsNode.swarm.connect(serverMultiAddr).catch(console.error);
console.log('connection', connection);

await ipfsNode.libp2p.peerStore.addressBook.set(serverPeerID, [multiaddr(serverMultiAddr)]);

Maybe this is because of the DNS announce?

@vasco-santos
Copy link
Member

Ok, did you include the peer Id in the dns address? The announce address should basically be your server multiaddr including the peerId part and the tcp port

@hunterInt
Copy link
Author

Ah, didn't realize that. This seemed to fix that issue. Thank you for all your help.

@vasco-santos
Copy link
Member

Glad that it is sorted out.
Closing the issue. Feel free to reopen if further issues happen

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
kind/enhancement A net-new feature or improvement to an existing feature status/ready Ready to be worked
Projects
None yet
Development

No branches or pull requests

3 participants