Skip to content

refactor: Store data in unhydrated nodes directly instead of wrapping MapTree nodes #24739

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 30 commits into
base: main
Choose a base branch
from

Conversation

CraigMacomber
Copy link
Contributor

@CraigMacomber CraigMacomber commented May 30, 2025

Description

Implement map tree APIs with UnhydratedFlexTreeNode directly instead of a wrapping a MapTree.

Store contextual default providers directly in a strongly typed way in the tree. Implicitly generate them using a global context if requested while unhydrated instead of special casing identifiers in the read code paths in multiple places.

Provide contextual defaults as part of prepare for hydration pass instead of its own pass.

Overall, this refactor should make the handling un unhydrated data simpler and more robust:

  • It is no longer possible for the unhydrated tree and map tree and/or the maps associating them to be out of sync.
  • The special case handling of identifiers has been implemented in a general way that will handle any contextual defaults, and no longer needs special logic in any code path reading from the tree.
  • There is no longer questionable ownership of ExclusiveMapTree nodes due to cache lookup.
  • Memory use should be slightly reduced by avoiding having the separate map tree objects and the caches to associate them with the unhydrated nodes.

Breaking Changes

Defaulted Identifiers show up when enumerating fields on an unhydrated node.

This seems like an improvement/simplification (makes unhydrated and hydrated nodes more similar, and identifiers less special), but could theoretically break something.

Reviewer Guidance

The review process is outlined on this wiki page.

@github-actions github-actions bot added base: main PRs targeted against main branch area: dds Issues related to distributed data structures area: dds: tree labels May 30, 2025
CraigMacomber added a commit that referenced this pull request May 31, 2025
@CraigMacomber CraigMacomber changed the title Unhydrated refactor refactor: Store data in unhydrated nodes directly instead of wrapping MapTree nodes Jun 2, 2025
@@ -10,30 +10,24 @@ import { isFluidHandle } from "@fluidframework/runtime-utils/internal";
import {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file and its functions have not been renamed yet to help simplify the diff.
A follow-up change will rename them.

@CraigMacomber CraigMacomber marked this pull request as ready for review June 2, 2025 17:08
@CraigMacomber CraigMacomber requested review from a team as code owners June 2, 2025 17:08
@@ -1112,8 +1114,6 @@ export class SchemaFactory<
*
* - A compressed form of the identifier can be accessed at runtime via the {@link TreeNodeApi.shortId|Tree.shortId()} API.
*
* - It will not be present in the object's iterable properties until explicitly read or until having been inserted into a tree.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This removal is the potentially breaking change noted in the changeset.

/**
* Cache storing the TreeNode for this inner node.
*/
public treeNode: TreeNode | undefined = undefined;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the write access to this be restricted? If not, what are the rules / expectations for external setting of this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there ever a time where we expect the value to be set back to undefined (externally) after being populated? If not, would a setter that doesn't accept undefined be appropriate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have clarified the comment slightly, but otherwise left this as is.

How flex trees get associated with TreeNodes is really an implementation detail of the code which makes TreeNodes, (currently TreeNodeKernel), and this code shouldn't care. It's providing this field for use by TreeNodeKernel, which, as an implementation detail, currently will never set this back to undefined.

I don't think making this a private property then adding getters and setters with strictly enforces rules really would simplify things over the existing code: it would just move the policy from tree node kernel to living in both places.

These classes are mostly used as implementation of FlexTreeNode, and properties like this one which are only exposed in the implementation and not via the interface are only used by code which is implementing unhydrated node specific logic, which there shouldn't be much of (especially after this change). This isn't a type that's at all wildly used and the total set of places with access to this property is very small.

Users who are not part of the implementation logic of unhydrated nodes will just use getOrCreateNodeFromInnerNode. We can't encapsulate that get or create logic inside of the getter here as that would be a cyclic dep and break our layering. Adding a whole getter+setter wrapper to place a limit of the setter while still getting the getter return undefined when not generated yet just doesn't seem like a real safety win worth the runtime access overhead and extra logical coupling.


public constructor(
public readonly simpleContext: Context,
public readonly context: FlexTreeContext,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is part of an interface implementation and thus implicitly gets inherited documentation.

unhydratedContext,
mapTreeFromNodeData(input as object, this as unknown as ImplicitAllowedTypes),
);
return mapTreeFromNodeData(input as object, this as unknown as ImplicitAllowedTypes);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: a quick comment to explain the cast through unknown would probably be good

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tweaked to avoid casting through unknown

@noencke noencke requested a review from daesunp June 2, 2025 22:48
0xa27 /* Expected hydrated identifier */,
);
const identifierValue = identifier.value as string;
const identifierField = flexNode.tryGetField(key);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@daesunp FYI the logic has changed here - you may want to review

/**
* Returns true if and only if the given iterable has at least one element.
*/
export function iterableHasSome<T>(iterable: Iterable<T>): boolean {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be slick if this were just another overload of hasSome (put it below the array: readonly T[] variant so it has lowest precedence). This implementation should work for all of the hasSome overloads. I know you are wary of overloads 😅, but in this case it seems like it'd be a net good.

): void {
if (hydratedData !== undefined) {
// Prepare content before validating side this populated defaults using the provided context rather than the global context.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having trouble parsing this paragraph (but it's definitely better than no paragraph - thanks). It's trying to communicate two things?

  1. That we apply the defaults (via prepareContentForHydration) before we validate against the schema, and
  2. That we have more context available to us here than we do inside of schema validation, so it's important to apply the defaults with that context while we can

?

onEdit?: () => void,
/** Creates a field with the given attributes */
export function createField(
...args: [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about

...args: ConstructorParameters<typeof UnhydratedFlexTreeField>

To implicitly explain why the arguments are like this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats way better. Done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: dds: tree area: dds Issues related to distributed data structures base: main PRs targeted against main branch changeset-present
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants