feat(NODE-6507): generate encryption configuration on mongoose connect #15320

Merged
merged 16 commits into from
Apr 15, 2025

Conversation

baileympearson
Contributor

@baileympearson baileympearson commented Mar 18, 2025

Summary

This PR contains the meat and potatoes of the automatic CSFLE/QE integration into mongoose. With the changes included, users can now create encrypted schemas, instantiate models with them, insert and query using their model and have the resultant documents be encrypted and decrypted automatically.

This PR contains a lot of test changes. I tried to be as exhaustive as possible to maximize coverage, and because the schema format is unlikely to change (changing the format of encryptedFieldsMap or schemaMap would break a lot of drivers and applications), these tests should ideally not be a large maintenance burden.

Examples

See the integration tests in encryption.test.ts for an example of how the API works.

@baileympearson baileympearson marked this pull request as ready for review March 18, 2025 16:50
Collaborator

@vkarpov15 vkarpov15 left a comment

I'd also like to see tests covering:

  1. Subdocuments
  2. Document arrays
  3. Maps

If you can do CSFLE with nested paths, you should be able to also do CSFLE with subdocuments, which are slightly different.

@baileympearson
Contributor Author

I'd also like to see tests covering:

  • Subdocuments
  • Document arrays
  • Maps

If you can do CSFLE with nested paths, you should be able to also do CSFLE with subdocuments, which are slightly different.

I believe I now have tests for all of these. Let me know if I'm still missing anything.

const isNonRootDiscriminator = schema.discriminatorMapping && !schema.discriminatorMapping.isRoot;
if (isNonRootDiscriminator) {
  const rootSchema = schema._baseSchema;
  schema.eachPath((pathname) => {
Collaborator

Potential edge case here: discriminator base schema defines nested path, discriminator child schema defines subdocument with same path but different options.

const schema = new Schema({
  name: {
    first: { type: String, encrypt: { keyId: [keyId], algorithm } }
  }
});

const discriminatorSchema = new Schema({
  name: new Schema({ first: Number }) // Different type, no encryption, stored as same field in MDB
});

schema.eachPath() doesn't account for subdocuments because subdocuments have a distinct schema. It does account for nested paths, though, because nested paths do not have their own schema.

Contributor Author

Good catch - I wrote a test for this scenario and you were right.

I've been trying out some different approaches and I can't find a better solution than something like this:

  function* allPaths(schema, prefix) {
    for (const path of Object.keys(schema.paths)) {
      const fullPath = prefix != null ? `${prefix}.${path}` : path;
      if (schema.path(path).instance === 'Embedded') {
        yield* allPaths(schema.path(path).schema, fullPath);
      } else {
        yield fullPath;
      }
    }
  }

  const paths = new Set(allPaths(schema));

  for (const path of allPaths(model.schema)) {
    if (paths.has(path) && (model.schema._hasEncryptedField(path) || schema._hasEncryptedField(path))) {
      throw new Error(`cannot declare an encrypted field on child schema overriding base schema. key=${path}`);
    }
  }

This generates all possible paths in the schema (the above doesn't handle arrays, but encryption on fields in arrays isn't supported, so that's out of scope here). It's not my first choice, because it feels brittle. I'll keep looking at it, but:

  1. Can you think of a better solution here than recursively iterating over all paths + nested schemas?
  2. If not, does a utility that I could use like allPaths exist somewhere?

Collaborator

I don't think recursively checking all paths is necessary here because you just want to find conflicts in the top-level paths. So if discriminator schema has a path pathname with an encrypted field, and root schema has a nested path with rootSchema.nested[pathname.split('.')[0]], you can already call that a conflict and throw an error. Similarly, if discriminator schema has a nested path pathname but in root schema you have rootSchema.paths[pathname] then you can also throw an error.

Contributor Author

@baileympearson baileympearson Apr 3, 2025

Not necessarily, right? That would throw an error in scenarios where the child schema provides a subdocument with the same root path but doesn't modify the encrypted path. For example:

const schema = new Schema({
  name: {
    first: { type: String, encrypt: { keyId: [keyId], algorithm } }
  }
});

const discriminatorSchema = new Schema({
  name: new Schema({ age: Number }) // Different path, no encryption
});

I'd expect this to be fine, because there isn't a conflicting path for name.first.

Collaborator

In the case you described, name: new Schema({ age: Number }) would actually override name: { first: { type: String } }, so the discriminator schema would not have a name.first property at all. Discriminators merge nested paths from root schema, but subdocuments override because subdocuments can have middleware.
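
The merge-versus-override behavior described above can be illustrated with a toy model. This is only a sketch under stated assumptions: plain objects stand in for nested paths, a hypothetical `Sub` class stands in for a subdocument, and `mergeForDiscriminator` is an invented helper, not Mongoose's actual merge code.

```javascript
// Toy model only, NOT Mongoose's merge code. Plain objects stand for nested
// paths; `Sub` instances stand for subdocuments (which have their own schema).
class Sub {
  constructor(fields) {
    this.fields = fields;
  }
}

function isNestedPath(value) {
  return typeof value === 'object' && value !== null && !(value instanceof Sub);
}

// Nested paths from base and child are merged key by key; anything else
// declared on the child (including a subdocument) replaces the base value.
function mergeForDiscriminator(base, child) {
  const merged = { ...base };
  for (const [key, childValue] of Object.entries(child)) {
    const baseValue = merged[key];
    merged[key] = isNestedPath(baseValue) && isNestedPath(childValue)
      ? mergeForDiscriminator(baseValue, childValue)
      : childValue;
  }
  return merged;
}

const base = { name: { first: 'String (encrypted)' } };

// Child declares `name` as a nested path: `name.first` survives the merge.
const withNested = mergeForDiscriminator(base, { name: { age: 'Number' } });

// Child declares `name` as a subdocument: `name.first` is gone.
const withSubdocument = mergeForDiscriminator(base, { name: new Sub({ age: 'Number' }) });
```

In this model, `withNested.name` holds both `first` and `age`, while `withSubdocument.name` holds only `age`, which matches the point that the discriminator schema would not have a name.first property at all.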

Contributor Author

I tried the approach of determining conflicting base paths, as you suggested, but I ran into two complications:

  1. It isn't enough to simply detect base path collisions; you also need to check that something in the schema has an encrypted field that uses the base path. This is feasible but requires some logic to answer "is there an encrypted field at some subpath of this path?"
  2. It felt like there were more corner cases than not: schemas with only one level of nesting, one level of subdocuments, subdocuments on the parent but not the child and vice versa, multiple layers of nested subdocuments, etc.

I decided that the two points above complicated the approach too much. The changes in this PR were much simpler for me to work with. This approach works by:

  1. finding the intersection of all keypaths for all fields (including subdocuments) between the parent and child
  2. for each keypath in the intersection, validating that the keypath isn't encrypted on the parent and on the child

Let me know what you think - I can rework it if you would prefer a different approach.
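
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `makeSchema`, `sharedPaths`, and `validateDiscriminatorEncryption` are hypothetical names, schemas are modeled as plain sets of keypaths, and the check assumes that "encrypted on the parent and on the child" means encrypted on both.

```javascript
// Hypothetical stand-in for a schema: every keypath (including keypaths
// inside subdocuments) plus the subset of keypaths that are encrypted.
function makeSchema(allPaths, encryptedPaths) {
  return { allPaths: new Set(allPaths), encrypted: new Set(encryptedPaths) };
}

// Step 1: keypaths declared on both the parent and the child.
function sharedPaths(parent, child) {
  return [...parent.allPaths].filter((path) => child.allPaths.has(path));
}

// Step 2: reject a shared keypath that is encrypted on both schemas
// (assumed reading of the rule; the real check may differ).
function validateDiscriminatorEncryption(parent, child) {
  for (const path of sharedPaths(parent, child)) {
    if (parent.encrypted.has(path) && child.encrypted.has(path)) {
      throw new Error(`encrypted field declared on both parent and child. path=${path}`);
    }
  }
}

const parent = makeSchema(['name.first', 'age'], ['name.first']);

// Shares `name.first` with the parent but does not encrypt it: accepted.
const plainChild = makeSchema(['name.first', 'email'], []);
validateDiscriminatorEncryption(parent, plainChild);

// Encrypts `name.first` again on the child: rejected.
const encryptedChild = makeSchema(['name.first'], ['name.first']);
let rejected = false;
try {
  validateDiscriminatorEncryption(parent, encryptedChild);
} catch (err) {
  rejected = true;
}
```

Working on flat keypath sets keeps the check independent of how deeply the subdocuments are nested, which is the simplification the comment describes.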

Collaborator

I agree, this is a reasonable approach. It is more restrictive, but simpler to implement. If it proves to be too restrictive, we can come up with a more flexible approach.

}
}

function* allNestedPaths(schema) {
Collaborator

This is a neat use case for generators, but I think doing [...Object.keys(schema.paths), ...Object.keys(schema.singleNestedPaths)] would be more concise and more consistent with the rest of the codebase.
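
For comparison, here is a small sketch of the two styles side by side. The body of the PR's generator is not shown in this thread, so `pathsViaGenerator` is an assumed equivalent, and `schemaLike` is a made-up plain object carrying only the two maps named in the comment.

```javascript
// Hypothetical stand-in for a schema; the keys are invented for illustration.
const schemaLike = {
  paths: { name: {}, age: {} },
  singleNestedPaths: { 'name.first': {}, 'name.last': {} }
};

// Generator style (assumed shape): yields keys lazily, one map after the other.
function* pathsViaGenerator(schema) {
  yield* Object.keys(schema.paths);
  yield* Object.keys(schema.singleNestedPaths);
}

// Spread style from the review comment: builds the full array eagerly.
function pathsViaSpread(schema) {
  return [...Object.keys(schema.paths), ...Object.keys(schema.singleNestedPaths)];
}

const fromGenerator = [...pathsViaGenerator(schemaLike)];
const fromSpread = pathsViaSpread(schemaLike);
```

Both produce the same keys in the same order here; the spread version simply skips the generator machinery when the whole list is needed anyway.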

Collaborator

@vkarpov15 vkarpov15 left a comment

Will merge into csfle branch

@vkarpov15 vkarpov15 merged commit 2fc58da into Automattic:csfle Apr 15, 2025
40 checks passed

2 participants