Description
Versions/Environment
- What version of Rust are you using?
rustc 1.84.1 (e71f9a9a9 2025-01-27)
- What operating system are you using?
macOS Sequoia 15.3.1
- What versions of the driver and its dependencies are you using? (Run cargo pkgid mongodb & cargo pkgid bson)
registry+https://github.com/rust-lang/crates.io-index#[email protected]
registry+https://github.com/rust-lang/crates.io-index#[email protected]
- What version of MongoDB are you using? (Check with the MongoDB shell using db.version())
"6.0.20"
- What is your MongoDB topology (standalone, replica set, sharded cluster, serverless)?
Replica Set
Describe the bug
When connected to a hidden secondary replica with directConnection=true, the MongoDB Rust driver fails to retrieve additional documents beyond the first 16MB (the default limit) because server selection fails for the follow-up request.
BE SPECIFIC:
- What is the expected behavior and what is actually happening?
Actual:
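use futures::TryStreamExt; // provides try_next() on the cursor
let mut batch = Vec::new();
let mut cursor = collection.find(doc! {}).batch_size(10000).await?;
loop {
    match cursor.try_next().await {
        Ok(Some(doc)) => {
            batch.push(doc);
            println!("Batch length: {}", batch.len());
        }
        Ok(None) => break, // No more documents
        Err(e) => {
            // This branch is hit once the first batch is exhausted
            println!("Error fetching document: {}", e);
            continue;
        }
    }
}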
MongoDB transfers at most 16MB of data in a single batch, which in my case is around 3k documents. The cursor then attempts to fetch more documents. However, the server selection logic determines that no data-bearing server is available, because RsOther is not included in the is_data_bearing function. As a result, the cursor remains stuck and cannot fetch additional documents, failing with:
Kind: Server selection timeout: None of the available servers suitable for criteria Predicate. Topology: { Type: Single, Servers: [ { Address: mongo3:27019, Type: RsOther, Average RTT: 5.854334ms, Last Update Time: 2025-03-14 13:39:21.591 +00:00:00, Max Wire Version: 17, Min Wire Version: 0, Replica Set Name: rs0, Replica Set Version: 1 } ] }, labels: {}
Expected:
The MongoDB cursor should continuously poll for additional documents until all available data has been retrieved.
- Do you have any ideas on why this may be happening that could give us a
clue in the right direction?
A hidden MongoDB replica is classified as RsOther. When the driver selects a server for a get_more request, RsOther instances cannot be selected. Additionally, when directConnection=true is used, the driver does not allow any other server to be chosen, so no server is ever eligible and the cursor cannot fetch additional documents.
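The interaction described above can be sketched as a small stand-alone model. This is not the driver's actual code: the ServerType enum, its list of variants, and select_for_get_more are simplified, hypothetical stand-ins, and the only behavior taken from this report is that RsOther is not treated as data-bearing and that a direct connection leaves a single candidate server.

```rust
/// Simplified stand-in for the driver's server types (hypothetical model,
/// not the driver's own enum).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum ServerType {
    Standalone,
    Mongos,
    RsPrimary,
    RsSecondary,
    RsOther, // how the hidden member is reported in this issue
}

impl ServerType {
    /// Models the behavior described in this report: RsOther is not
    /// counted as data-bearing.
    fn is_data_bearing(self) -> bool {
        matches!(
            self,
            ServerType::Standalone
                | ServerType::Mongos
                | ServerType::RsPrimary
                | ServerType::RsSecondary
        )
    }
}

/// With directConnection=true the topology contains exactly one server,
/// so selection either returns that server or finds nothing.
fn select_for_get_more(only_server: ServerType) -> Option<ServerType> {
    Some(only_server).filter(|s| s.is_data_bearing())
}

fn main() {
    // A regular secondary is selectable; a hidden member (RsOther) is not.
    println!("{:?}", select_for_get_more(ServerType::RsSecondary));
    println!("{:?}", select_for_get_more(ServerType::RsOther));
}
```

In this model the RsOther case yields no candidate at all, which corresponds to the server selection timeout shown in the error output.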
- Are there multiple ways of triggering this bug (perhaps more than one function produces a crash)?
This issue occurs with any cursor whose batch_size is large enough to retrieve more than 16MB of data from MongoDB. Once the initial batch limit is reached, the cursor fails to fetch additional documents due to the server selection constraints.
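As a rough way to estimate whether a given query will hit this condition, the sketch below computes how many documents fit in the first reply. It assumes a 16 MiB cap per reply and a uniform average document size; the helper names and the 5,500-byte average are hypothetical values chosen only to land near the "around 3k documents" observed here.

```rust
/// Assumed per-reply cap of 16 MiB, matching the limit described in this issue.
const MAX_REPLY_BYTES: u64 = 16 * 1024 * 1024;

/// Estimated number of documents in the first reply (hypothetical helper):
/// the smaller of batch_size and what fits under the size cap.
fn docs_in_first_batch(batch_size: u64, avg_doc_bytes: u64) -> u64 {
    let fit_by_size = MAX_REPLY_BYTES / avg_doc_bytes.max(1);
    batch_size.min(fit_by_size)
}

/// True when the result set does not fit in the first reply, meaning the
/// cursor has to issue at least one follow-up get_more request.
fn needs_get_more(total_docs: u64, batch_size: u64, avg_doc_bytes: u64) -> bool {
    total_docs > docs_in_first_batch(batch_size, avg_doc_bytes)
}

fn main() {
    // batch_size = 10000 as in the snippet above; 5,500 bytes is an assumed average.
    let first = docs_in_first_batch(10_000, 5_500);
    println!("first batch: {first} documents");
    println!("get_more needed: {}", needs_get_more(10_000, 10_000, 5_500));
}
```

Any query for which needs_get_more is true under these assumptions would be expected to reach the failing server selection step.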
To Reproduce
- Set up a MongoDB replica set.
- Configure one of the replicas as hidden.
- Connect to the hidden replica using directConnection=true.
- Create a cursor with a batch_size large enough to exceed 16MB of data (or your system’s default limit).
- Start polling the cursor.
- Observe that the cursor gets stuck and cannot fetch additional documents.