Commit 5bd06a8

DRIVERS-1954: SDAM should give priority to electionId over setVersion when updating topology (#1122)
1 parent 6464c06 commit 5bd06a8

17 files changed: +877 -119 lines changed

source/server-discovery-and-monitoring/server-discovery-and-monitoring.rst

+60-46
@@ -293,10 +293,11 @@ Fields:
 
 * type: a `TopologyType`_ enum value. See `initial TopologyType`_.
 * setName: the replica set name. Default null.
-* maxSetVersion: an integer or null. The largest setVersion ever reported by
-  a primary. Default null.
 * maxElectionId: an ObjectId or null. The largest electionId ever reported by
-  a primary. Default null.
+  a primary. Default null. Part of the (``electionId``, ``setVersion``) tuple.
+* maxSetVersion: an integer or null. The largest setVersion ever reported by
+  a primary. It may not monotonically increase, as electionId takes precedence in ordering.
+  Default null. Part of the (``electionId``, ``setVersion``) tuple.
 * servers: a set of ServerDescription instances.
   Default contains one server: "localhost:27017", ServerType Unknown.
 * stale: a boolean for single-threaded clients, whether the topology must
@@ -351,10 +352,10 @@ Fields:
   The client `monitors all three types of servers`_ in a replica set.
 * (=) tags: map from string to string. Default empty.
 * (=) setName: string or null. Default null.
-* (=) setVersion: integer or null. Default null.
 * (=) electionId: an ObjectId, if this is a MongoDB 2.6+ replica set member that
-  believes it is primary. See `using setVersion and electionId to detect stale primaries`_.
+  believes it is primary. See `using electionId and setVersion to detect stale primaries`_.
   Default null.
+* (=) setVersion: integer or null. Default null.
 * (=) primary: an address. This server's opinion of who the primary is.
   Default null.
 * lastUpdateTime: when this server was last checked. Default "infinity ago".
@@ -1094,57 +1095,49 @@ updateRSWithPrimaryFromMember
 updateRSFromPrimary
   This subroutine is executed with a ServerDescription of type RSPrimary::
 
-    if description.address not in topologyDescription.servers:
+    if serverDescription.address not in topologyDescription.servers:
         return
 
     if topologyDescription.setName is null:
-        topologyDescription.setName = description.setName
+        topologyDescription.setName = serverDescription.setName
 
-    else if topologyDescription.setName != description.setName:
+    else if topologyDescription.setName != serverDescription.setName:
         # We found a primary but it doesn't have the setName
         # provided by the user or previously discovered.
         remove this server from topologyDescription and stop monitoring it
         checkIfHasPrimary()
         return
 
-    if description.setVersion is not null and description.electionId is not null:
-        # Election ids are ObjectIds, see
-        # "using setVersion and electionId to detect stale primaries"
-        # for comparison rules.
-        if (topologyDescription.maxSetVersion is not null and
-            topologyDescription.maxElectionId is not null and (
-                topologyDescription.maxSetVersion > description.setVersion or (
-                    topologyDescription.maxSetVersion == description.setVersion and
-                    topologyDescription.maxElectionId > description.electionId
-                )
-            ):
-
-            # Stale primary.
-            replace description with a default ServerDescription of type "Unknown"
-            checkIfHasPrimary()
-            return
-
-        topologyDescription.maxElectionId = description.electionId
-
-    if (description.setVersion is not null and
-        (topologyDescription.maxSetVersion is null or
-            description.setVersion > topologyDescription.maxSetVersion)):
-
-        topologyDescription.maxSetVersion = description.setVersion
+    # Election ids are ObjectIds, see
+    # "using electionId and setVersion to detect stale primaries"
+    # for comparison rules.
+
+    # Null values for both electionId and setVersion are always considered less than any non-null value.
+    if serverDescription.electionId > topologyDescription.maxElectionId or (
+        serverDescription.electionId == topologyDescription.maxElectionId
+        and serverDescription.setVersion >= topologyDescription.maxSetVersion
+    ):
+        topologyDescription.maxElectionId = serverDescription.electionId
+        topologyDescription.maxSetVersion = serverDescription.setVersion
+    else:
+        # Stale primary.
+        replace serverDescription with a default ServerDescription of type "Unknown"
+        checkIfHasPrimary()
+        return
 
     for each server in topologyDescription.servers:
-        if server.address != description.address:
+        if server.address != serverDescription.address:
             if server.type is RSPrimary:
                 # See note below about invalidating an old primary.
                 replace the server with a default ServerDescription of type "Unknown"
 
-    for each address in description's "hosts", "passives", and "arbiters":
+    for each address in serverDescription's "hosts", "passives", and "arbiters":
         if address is not in topologyDescription.servers:
             add new default ServerDescription of type "Unknown"
             begin monitoring the new server
 
     for each server in topologyDescription.servers:
-        if server.address not in description's "hosts", "passives", or "arbiters":
+        if server.address not in serverDescription's "hosts", "passives", or "arbiters":
             remove the server and stop monitoring it
 
     checkIfHasPrimary()
@@ -1165,10 +1158,10 @@ updateRSFromPrimary
 
   See `replica set monitoring with and without a primary`_.
 
-  If the server is primary with an obsolete setVersion or electionId, it is
+  If the server is primary with an obsolete electionId or setVersion, it is
   likely a stale primary that is going to step down. Mark it Unknown and let periodic
   monitoring detect when it becomes secondary. See
-  `using setVersion and electionId to detect stale primaries`_.
+  `using electionId and setVersion to detect stale primaries`_.
 
   A note on checking "me": Unlike `updateRSWithPrimaryFromMember`, there is no need to remove the server if the address is not equal to
   "me": since the server address will not be a member of either "hosts", "passives", or "arbiters", the server will already have been
@@ -1966,7 +1959,7 @@ list are removed.
 
 .. _stale primaries:
 
-Using setVersion and electionId to detect stale primaries
+Using electionId and setVersion to detect stale primaries
 '''''''''''''''''''''''''''''''''''''''''''''''''''''''''
 
 Replica set members running MongoDB 2.6.10+ or 3.0+ include an integer called
@@ -1977,13 +1970,17 @@ protocol versions; electionIds from one protocol version must not be compared
 to electionIds from a different protocol.
 
 Because protocol version changes require replica set reconfiguration,
-clients use the tuple (setVersion, electionId) to detect stale primaries.
+clients use the tuple (electionId, setVersion) to detect stale primaries.
+The tuple order comparison MUST be checked in the order of electionId followed
+by setVersion since that order of comparison guarantees monotonicity.
 
-The client remembers the greatest setVersion and electionId reported by a primary,
+The client remembers the greatest electionId and setVersion reported by a primary,
 and distrusts primaries from older setVersions or from the same setVersion
 but with lesser electionIds.
-It compares setVersions as integer values.
-It compares electionIds as 12-byte big-endian integers.
+
+- It compares electionIds as 12-byte sequences, i.e. by memory comparison.
+- It compares setVersions as integer values.
+
 This prevents the client from oscillating
 between the old and new primary during a split-brain period,
 and helps provide read-your-writes consistency with write concern "majority"
@@ -2024,15 +2021,18 @@ reads with WriteConcern Majority and ReadPreference Primary."
 Detecting a stale primary
 `````````````````````````
 
-To prevent this scenario, the client uses setVersion and electionId to
+To prevent this scenario, the client uses electionId and setVersion to
 determine which primary was elected last. In this case, it would not consider
-A primary, nor read from it, after receiving B's hello or legacy hello response with the
-same setVersion and a greater electionId.
+"A" a primary, nor read from it, because server B will have a greater electionId
+but the same setVersion.
 
 Monotonicity
 ````````````
 
-The electionId is an ObjectId compared bytewise in big-endian order.
+The electionId is an ObjectId compared bytewise in order.
+
+(i.e. 000000000000000000000001 > 000000000000000000000000, FF0000000000000000000000 > FE0000000000000000000000, etc.)
+
 In some server versions, it is monotonic with respect
 to a particular server's system clock, but is not globally monotonic across
 a deployment. However, if inter-server clock skews are small, it can be
@@ -2426,6 +2426,18 @@ Why is auto-discovery the preferred default?
 
 Auto-discovery is most resilient and is therefore preferred.
 
+Why is it possible for maxSetVersion to go down?
+''''''''''''''''''''''''''''''''''''''''''''''''
+
+``maxElectionId`` and ``maxSetVersion`` are actually considered a pair of values.
+Drivers MAY consider implementing comparison in code as a tuple of the two to ensure they're always updated together:
+
+.. code:: typescript
+
+  // New tuple                       old tuple
+  { electionId: 2, setVersion: 1 } > { electionId: 1, setVersion: 50 }
+
+In this scenario, the maxSetVersion goes from 50 to 1, but the maxElectionId is raised to 2.
 
 Acknowledgments
 ---------------
@@ -2440,6 +2452,8 @@ Bernie Hackett gently oversaw the specification process.
 Changes
 -------
 
+2021-01-17: Require clients to compare (electionId, setVersion) tuples.
+
 2015-12-17: Require clients to compare (setVersion, electionId) tuples.
 
 2015-10-09: Specify electionID comparison method.
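
The (electionId, setVersion) ordering that this change introduces can be illustrated with a small Python sketch. It is not taken from any driver: the helper names ``null_low`` and ``primary_key`` are hypothetical, ObjectIds are assumed to be held as 12-byte ``bytes`` values, and a null field is modelled as ``None`` that sorts below every non-null value.

```python
from typing import Optional, Tuple


def null_low(value):
    """Wrap a value so that None orders below every non-null value."""
    # (False, ...) sorts before (True, ...), so the payloads are only
    # compared with each other when both sides are non-null.
    return (value is not None, value if value is not None else 0)


def primary_key(election_id: Optional[bytes], set_version: Optional[int]) -> Tuple:
    """Ordering key for the (electionId, setVersion) tuple, electionId first."""
    return (null_low(election_id), null_low(set_version))


# electionId outranks setVersion, so maxSetVersion may drop from 50 to 1.
new = primary_key(bytes.fromhex("000000000000000000000002"), 1)
old = primary_key(bytes.fromhex("000000000000000000000001"), 50)
print(new > old)

# Python compares bytes objects bytewise from the first byte, like memcmp.
print(bytes.fromhex("FF0000000000000000000000") > bytes.fromhex("FE0000000000000000000000"))

# A primary that reports no electionId ranks below one that does.
print(primary_key(None, 7) < primary_key(bytes(12), 1))
```

Building one key from both fields keeps the two maxima from being updated separately, which is the motivation the FAQ entry gives for comparing them as a tuple.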
@@ -0,0 +1,92 @@
+{
+  "description": "ElectionId is considered higher precedence than setVersion",
+  "uri": "mongodb://a/?replicaSet=rs",
+  "phases": [
+    {
+      "responses": [
+        [
+          "a:27017",
+          {
+            "ok": 1,
+            "helloOk": true,
+            "isWritablePrimary": true,
+            "hosts": [
+              "a:27017",
+              "b:27017"
+            ],
+            "setName": "rs",
+            "setVersion": 1,
+            "electionId": {
+              "$oid": "000000000000000000000001"
+            },
+            "minWireVersion": 0,
+            "maxWireVersion": 6
+          }
+        ],
+        [
+          "b:27017",
+          {
+            "ok": 1,
+            "helloOk": true,
+            "isWritablePrimary": true,
+            "hosts": [
+              "a:27017",
+              "b:27017"
+            ],
+            "setName": "rs",
+            "setVersion": 2,
+            "electionId": {
+              "$oid": "000000000000000000000001"
+            },
+            "minWireVersion": 0,
+            "maxWireVersion": 6
+          }
+        ],
+        [
+          "a:27017",
+          {
+            "ok": 1,
+            "helloOk": true,
+            "isWritablePrimary": true,
+            "hosts": [
+              "a:27017",
+              "b:27017"
+            ],
+            "setName": "rs",
+            "setVersion": 1,
+            "electionId": {
+              "$oid": "000000000000000000000002"
+            },
+            "minWireVersion": 0,
+            "maxWireVersion": 6
+          }
+        ]
+      ],
+      "outcome": {
+        "servers": {
+          "a:27017": {
+            "type": "RSPrimary",
+            "setName": "rs",
+            "setVersion": 1,
+            "electionId": {
+              "$oid": "000000000000000000000002"
+            }
+          },
+          "b:27017": {
+            "type": "Unknown",
+            "setName": null,
+            "setVersion": null,
+            "electionId": null
+          }
+        },
+        "topologyType": "ReplicaSetWithPrimary",
+        "logicalSessionTimeoutMinutes": null,
+        "setName": "rs",
+        "maxSetVersion": 1,
+        "maxElectionId": {
+          "$oid": "000000000000000000000002"
+        }
+      }
+    }
+  ]
+}
@@ -0,0 +1,62 @@
+description: ElectionId is considered higher precedence than setVersion
+uri: "mongodb://a/?replicaSet=rs"
+phases:
+  - responses:
+      - - "a:27017"
+        - ok: 1
+          helloOk: true
+          isWritablePrimary: true
+          hosts:
+            - "a:27017"
+            - "b:27017"
+          setName: rs
+          setVersion: 1
+          electionId:
+            $oid: "000000000000000000000001"
+          minWireVersion: 0
+          maxWireVersion: 6
+      - - "b:27017"
+        - ok: 1
+          helloOk: true
+          isWritablePrimary: true
+          hosts:
+            - "a:27017"
+            - "b:27017"
+          setName: rs
+          setVersion: 2 # Even though "B" reports the newer setVersion, "A" will report the newer electionId which should allow it to remain the primary
+          electionId:
+            $oid: "000000000000000000000001"
+          minWireVersion: 0
+          maxWireVersion: 6
+      - - "a:27017"
+        - ok: 1
+          helloOk: true
+          isWritablePrimary: true
+          hosts:
+            - "a:27017"
+            - "b:27017"
+          setName: rs
+          setVersion: 1
+          electionId:
+            $oid: "000000000000000000000002"
+          minWireVersion: 0
+          maxWireVersion: 6
+    outcome:
+      servers:
+        "a:27017":
+          type: RSPrimary
+          setName: rs
+          setVersion: 1
+          electionId:
+            $oid: "000000000000000000000002"
+        "b:27017":
+          type: Unknown
+          setName: null
+          setVersion: null
+          electionId: null
+      topologyType: ReplicaSetWithPrimary
+      logicalSessionTimeoutMinutes: null
+      setName: rs
+      maxSetVersion: 1
+      maxElectionId:
+        $oid: "000000000000000000000002"
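
To see why the test above expects ``maxSetVersion: 1``, the three responses can be replayed through a reduced model of the new rule. This is only a sketch under stated assumptions: ``replay`` and ``RESPONSES`` are hypothetical names, both servers are pre-seeded rather than discovered from ``hosts``, setName checks and monitoring are omitted, and only the (electionId, setVersion) comparison and the invalidation of the previous primary are modelled.

```python
def replay(responses):
    """Apply primary hello responses in order.

    Returns (server types, maxElectionId, maxSetVersion).
    """
    servers = {"a:27017": "Unknown", "b:27017": "Unknown"}
    max_eid, max_sv = None, None

    def key(eid, sv):
        # None sorts below any non-null value in each slot of the tuple.
        return ((eid is not None, eid or b""), (sv is not None, sv or 0))

    for address, election_id, set_version in responses:
        # Lexicographic >= on the key matches: electionId greater, or
        # electionId equal and setVersion greater than or equal.
        if key(election_id, set_version) >= key(max_eid, max_sv):
            max_eid, max_sv = election_id, set_version
            # Any other server still marked primary is invalidated.
            servers = {addr: "Unknown" for addr in servers}
            servers[address] = "RSPrimary"
        else:
            servers[address] = "Unknown"  # stale primary
    return servers, max_eid, max_sv


# (address, electionId, setVersion) taken from the three responses above.
RESPONSES = [
    ("a:27017", bytes.fromhex("000000000000000000000001"), 1),
    ("b:27017", bytes.fromhex("000000000000000000000001"), 2),
    ("a:27017", bytes.fromhex("000000000000000000000002"), 1),
]

servers, max_eid, max_sv = replay(RESPONSES)
print(servers, max_eid.hex(), max_sv)
```

In this model the second response raises the stored setVersion to 2 and the third lowers it back to 1 because its electionId is greater. Under the previous (setVersion, electionId) ordering, the third response would have been treated as stale.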
