Skip to content

fix: client not approved timeout exception #2466

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
2 changes: 2 additions & 0 deletions com.unity.netcode.gameobjects/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,8 @@ Additional documentation and release notes are available at [Multiplayer Documen

### Fixed

- Fixed issue with approval time out where upon a client not being approved the server-side time out coroutine, when it ended, could throw an exception on third party transports.(#2466)
- Fixed issue where the approval time out coroutine was not being stopped once a client was approved or not approved. (#2466)
- Fixed an issue where Named Message Handlers could remove themselves causing an exception when the metrics tried to access the name of the message.(#2426)
- Fixed registry of public `NetworkVariable`s in derived `NetworkBehaviour`s (#2423)
- Fixed issue where runtime association of `Animator` properties to `AnimationCurve`s would cause `NetworkAnimator` to attempt to update those changes. (#2416)
Expand Down
61 changes: 57 additions & 4 deletions com.unity.netcode.gameobjects/Runtime/Core/NetworkManager.cs
Original file line number Diff line number Diff line change
Expand Up @@ -1615,7 +1615,12 @@ private IEnumerator ApprovalTimeout(ulong clientId)

if (IsServer)
{
DisconnectClient(clientId);
// Only disconnect when the connection times out.
// If the connection is not approved it will automatically disconnect the client
if (timedOut)
{
DisconnectClient(clientId);
}
Comment on lines -1618 to +1623
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, I'd like to quote your (@NoelStephensUnity) comment inline here (for the discussion):

With other transports the removal of client id mapping can happen immediately (server side) when the client is not approved yet the coroutine is left running. Upon timing out (a non-approved server-side client time out coroutine), the server then attempts to disconnect the client which ends up throwing an exception. This doesn't happen with UnityTransport (currently) so this PR is to just to track the coroutines and end them when the client or the server side have received some form of approval response.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also quoting an exception from the linked issue (#2389):

System.Collections.Generic.Dictionary`2[TKey,TValue].get_Item (TKey key) (at <d6232873609549b8a045fa15811a5bd3>:0)
Unity.Netcode.NetworkManager.ClientIdToTransportId (System.UInt64 clientId) (at Library/PackageCache/[email protected]/Runtime/Core/NetworkManager.cs:1781)
Unity.Netcode.NetworkManager.DisconnectRemoteClient (System.UInt64 clientId) (at Library/PackageCache/[email protected]/Runtime/Core/NetworkManager.cs:1361)
Unity.Netcode.NetworkManager.DisconnectClient (System.UInt64 clientId, System.String reason) (at Library/PackageCache/[email protected]/Runtime/Core/NetworkManager.cs:2063)
Unity.Netcode.NetworkManager.DisconnectClient (System.UInt64 clientId) (at Library/PackageCache/[email protected]/Runtime/Core/NetworkManager.cs:2038)
Unity.Netcode.NetworkManager+<ApprovalTimeout>d__180.MoveNext () (at Library/PackageCache/[email protected]/Runtime/Core/NetworkManager.cs:1765)
UnityEngine.SetupCoroutine.InvokeMoveNext (System.Collections.IEnumerator enumerator, System.IntPtr returnValueAddress) (at <bae255e3e08e46f7bc2fbd23dde96338>:0)```

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should not change the behavior here — the minimum fix is to simply handle NM.DisconnectClient() call gracefully and leave other issues/exceptions related to transport, including 3rd-party ones, up their own implementation.

As a rule of thumb, a transport should be able to handle a disconnect or a send message call to an unknown or "just-removed" connection gracefully.

Perhaps this would give the general idea:

internal ulong TransportIdToClientId(ulong transportId)
{
-    return transportId == m_ServerTransportId ? ServerClientId : m_TransportIdToClientIdMap[transportId];
+    if (transportId == m_ServerTransportId)
+    {
+        return ServerClientId;
+    }
+
+    if (m_TransportIdToClientIdMap.TryGetValue(transportId, out var clientId))
+    {
+        return clientId;
+    }
+
+    return default;
}

internal ulong ClientIdToTransportId(ulong clientId)
{
-    return clientId == ServerClientId ? m_ServerTransportId : m_ClientIdToTransportIdMap[clientId];
+    if (clientId == ServerClientId)
+    {
+        return m_ServerTransportId;
+    }
+
+    if (m_ClientIdToTransportIdMap.TryGetValue(clientId, out var transportId))
+    {
+        return transportId;
+    }
+
+    return default;
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, not sure why I approached it this way.
Fixing the transport id mapping is indeed a less intrusive fix for the issue.

}
else
{
Expand All @@ -1634,6 +1639,52 @@ internal ulong ClientIdToTransportId(ulong clientId)
return clientId == ServerClientId ? m_ServerTransportId : m_ClientIdToTransportIdMap[clientId];
}

internal Dictionary<ulong, Coroutine> ApprovalTimeouts = new Dictionary<ulong, Coroutine>();

/// <summary>
/// Starts a timeout coroutine during the initial connection sequence.
/// </summary>
/// <remarks>
/// The clientId is relative to the local instance.
/// Server: Uses the client identifier to associate started coroutines.
/// Client: Uses the server identifier to associate the started coroutine.
/// </remarks>
/// <param name="clientId"></param>
private void StartClientTimeout(ulong clientId)
{
// Don't need to start a timeout coroutine locally for the host
if (IsServer && clientId == ServerClientId)
{
return;
}
if (!ApprovalTimeouts.ContainsKey(clientId))
{
ApprovalTimeouts.Add(clientId, StartCoroutine(ApprovalTimeout(clientId)));
}
else
{
// Theoretically this should never happen, but in case it does log a warning
if (LogLevel <= LogLevel.Normal)
{
NetworkLog.LogWarning($"Attempting to start a client approval timeout while one already exists!");
}
ApprovalTimeouts[clientId] = StartCoroutine(ApprovalTimeout(clientId));
}
}

/// <summary>
/// Stops the time out coroutine
/// </summary>
internal void StopClientTimeout(ulong clientId)
{
if (!ApprovalTimeouts.ContainsKey(clientId))
{
return;
}
StopCoroutine(ApprovalTimeouts[clientId]);
ApprovalTimeouts.Remove(clientId);
}

private void HandleRawTransportPoll(NetworkEvent networkEvent, ulong clientId, ArraySegment<byte> payload, float receiveTime)
{
var transportId = clientId;
Expand Down Expand Up @@ -1675,7 +1726,7 @@ private void HandleRawTransportPoll(NetworkEvent networkEvent, ulong clientId, A
ConnectionState = PendingClient.State.PendingConnection
});

StartCoroutine(ApprovalTimeout(clientId));
StartClientTimeout(clientId);
}
else
{
Expand All @@ -1685,7 +1736,8 @@ private void HandleRawTransportPoll(NetworkEvent networkEvent, ulong clientId, A
}

SendConnectionRequest();
StartCoroutine(ApprovalTimeout(clientId));

StartClientTimeout(clientId);
}

#if DEVELOPMENT_BUILD || UNITY_EDITOR
Expand Down Expand Up @@ -2046,6 +2098,8 @@ private void SyncTime()
/// <param name="response">The response to allow the player in or not, with its parameters</param>
internal void HandleConnectionApproval(ulong ownerClientId, ConnectionApprovalResponse response)
{
// Stop the server-side coroutine whether approved or not
StopClientTimeout(ownerClientId);
if (response.Approved)
{
// Inform new client it got approved
Expand Down Expand Up @@ -2164,7 +2218,6 @@ internal void HandleConnectionApproval(ulong ownerClientId, ConnectionApprovalRe

MessagingSystem.ProcessSendQueues();
}

PendingClients.Remove(ownerClientId);
DisconnectRemoteClient(ownerClientId);
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -120,6 +120,10 @@ public void Handle(ref NetworkContext context)
networkManager.LocalClient = new NetworkClient() { ClientId = networkManager.LocalClientId };
networkManager.IsApproved = true;

// Upon receiving the approved message, stop the time out coroutine.
// (Client's associate their time out coroutine with the server identifier.)
networkManager.StopClientTimeout(NetworkManager.ServerClientId);

// Only if scene management is disabled do we handle NetworkObject synchronization at this point
if (!networkManager.NetworkConfig.EnableSceneManagement)
{
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -9,14 +9,16 @@ namespace Unity.Netcode.RuntimeTests
{
[TestFixture(ApprovalTimedOutTypes.ServerDoesNotRespond)]
[TestFixture(ApprovalTimedOutTypes.ClientDoesNotRequest)]
[TestFixture(ApprovalTimedOutTypes.ClientIsNotApproved)]
public class ConnectionApprovalTimeoutTests : NetcodeIntegrationTest
{
protected override int NumberOfClients => 1;

public enum ApprovalTimedOutTypes
{
ClientDoesNotRequest,
ServerDoesNotRespond
ServerDoesNotRespond,
ClientIsNotApproved,
}

private ApprovalTimedOutTypes m_ApprovalFailureType;
Expand Down Expand Up @@ -48,13 +50,35 @@ protected override IEnumerator OnTearDown()

protected override void OnServerAndClientsCreated()
{
m_ServerNetworkManager.NetworkConfig.ClientConnectionBufferTimeout = k_TestTimeoutPeriod;
m_ServerNetworkManager.LogLevel = LogLevel.Developer;
m_ClientNetworkManagers[0].NetworkConfig.ClientConnectionBufferTimeout = k_TestTimeoutPeriod;
m_ServerNetworkManager.NetworkConfig.ConnectionApproval = m_ApprovalFailureType == ApprovalTimedOutTypes.ClientIsNotApproved;
if (m_ApprovalFailureType == ApprovalTimedOutTypes.ClientIsNotApproved)
{
m_ServerNetworkManager.ConnectionApprovalCallback = NetworkManagerObject_ConnectionApprovalCallback;
}
else
{
m_ServerNetworkManager.NetworkConfig.ClientConnectionBufferTimeout = k_TestTimeoutPeriod;
m_ClientNetworkManagers[0].NetworkConfig.ClientConnectionBufferTimeout = k_TestTimeoutPeriod;
}
m_ClientNetworkManagers[0].NetworkConfig.ConnectionApproval = m_ApprovalFailureType == ApprovalTimedOutTypes.ClientIsNotApproved;
m_ClientNetworkManagers[0].LogLevel = LogLevel.Developer;
base.OnServerAndClientsCreated();
}

private bool m_ClientWasNotApproved;
private void NetworkManagerObject_ConnectionApprovalCallback(NetworkManager.ConnectionApprovalRequest request, NetworkManager.ConnectionApprovalResponse response)
{
response.Approved = request.ClientNetworkId == NetworkManager.ServerClientId;
m_ClientWasNotApproved = !response.Approved;
if (!response.Approved)
{
Assert.IsTrue(m_ServerNetworkManager.ApprovalTimeouts.Count > 0, $"There are ({m_ServerNetworkManager.ApprovalTimeouts.Count}) approval timeout coroutines when there should at least be 1!");
}
response.Reason = "Testing approval failure and TimedOut coroutine.";
VerboseDebug($"Client {request.ClientNetworkId} was approved {response.Approved}");
}

protected override IEnumerator OnStartedServerAndClients()
{
if (m_ApprovalFailureType == ApprovalTimedOutTypes.ServerDoesNotRespond)
Expand All @@ -64,7 +88,7 @@ protected override IEnumerator OnStartedServerAndClients()
m_ExpectedLogMessage = new Regex("Timed out waiting for the server to approve the connection request.");
m_LogType = LogType.Log;
}
else
else if (m_ApprovalFailureType == ApprovalTimedOutTypes.ClientDoesNotRequest)
{
// We catch (don't process) the incoming connection request message to simulate a transport connection but the client never
// sends (or takes too long to send) the connection request.
Expand All @@ -74,23 +98,37 @@ protected override IEnumerator OnStartedServerAndClients()
m_ExpectedLogMessage = new Regex("Server detected a transport connection from Client-1, but timed out waiting for the connection request message.");
m_LogType = LogType.Warning;
}
// Otherwise the not approved test doesn't intercept any messages
yield return null;
}

[UnityTest]
public IEnumerator ValidateApprovalTimeout()
{
// Delay for half of the wait period
yield return new WaitForSeconds(k_TestTimeoutPeriod * 0.5f);
if (m_ApprovalFailureType != ApprovalTimedOutTypes.ClientIsNotApproved)
{
// Delay for half of the wait period
yield return new WaitForSeconds(k_TestTimeoutPeriod * 0.5f);

// Verify we haven't received the time out message yet
NetcodeLogAssert.LogWasNotReceived(LogType.Log, m_ExpectedLogMessage);

// Verify we haven't received the time out message yet
NetcodeLogAssert.LogWasNotReceived(LogType.Log, m_ExpectedLogMessage);
// Wait for 3/4s of the time out period to pass (totaling 1.25x the wait period)
yield return new WaitForSeconds(k_TestTimeoutPeriod * 0.75f);

// Wait for 3/4s of the time out period to pass (totaling 1.25x the wait period)
yield return new WaitForSeconds(k_TestTimeoutPeriod * 0.75f);
// We should have the test relative log message by this time.
NetcodeLogAssert.LogWasReceived(m_LogType, m_ExpectedLogMessage);
}
else
{
// Make sure a client was not approved
yield return WaitForConditionOrTimeOut(() => m_ClientWasNotApproved);
AssertOnTimeout($"Did not detect that the client was not approved while waiting!");

// We should have the test relative log message by this time.
NetcodeLogAssert.LogWasReceived(m_LogType, m_ExpectedLogMessage);
// Make sure that there are no timed out coroutines left running on the server after a client is not approved
yield return WaitForConditionOrTimeOut(() => m_ServerNetworkManager.ApprovalTimeouts.Count == 0);
AssertOnTimeout($"Waited for {nameof(NetworkManager.ApprovalTimeouts)} to reach a count of 0 but is still at {m_ServerNetworkManager.ApprovalTimeouts.Count}!");
}

// It should only have the host client connected
Assert.AreEqual(1, m_ServerNetworkManager.ConnectedClients.Count, $"Expected only one client when there were {m_ServerNetworkManager.ConnectedClients.Count} clients connected!");
Expand Down