feat!(messagev2): tweak dag-cbor message schema #354
For:
1. Efficiency: compacting the noisy structures into tuple representations and making top-level components of a message optional.
2. Migrations: providing a secondary mechanism to lean on for versioning if we want a gentler upgrade path than libp2p protocol versioning.
Closes: #351
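A minimal sketch of how an in-message version tag could be used, assuming the top-level key of the message (the "gs2" key in the dag-json example later in this thread) is what identifies the version. The `envelope` type and `messageVersion` function are hypothetical, and `encoding/json` stands in for a dag-cbor codec so the sketch needs nothing outside the standard library.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// envelope is a hypothetical wrapper: each known message version gets its
// own optional top-level key. Only "gs2" is known in this sketch.
type envelope struct {
	GS2 *json.RawMessage `json:"gs2,omitempty"`
}

// messageVersion reports which versioned payload the envelope carries,
// or an error if none of the known version keys is present.
func messageVersion(data []byte) (string, error) {
	var env envelope
	if err := json.Unmarshal(data, &env); err != nil {
		return "", err
	}
	if env.GS2 != nil {
		return "gs2", nil
	}
	return "", errors.New("no supported message version found")
}

func main() {
	v, err := messageVersion([]byte(`{"gs2":{"req":[]}}`))
	fmt.Println(v, err)
}
```

A later format could add a second optional key alongside "gs2", letting a receiver pick whichever version it understands without a new libp2p protocol ID.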
So I have a few comments:
- I am pretty anxious about tuple representation for requests and responses. The versioning helps, but I'd much prefer being able to move fields around without breaking versions. Also, it seems like the message size is already reduced, so... maybe that's fine?
- I feel less anxious about blocks being tuples. Blocks are blocks. They have a well-defined standard and aren't likely to change.
- Why do we need versioning in the message itself? Isn't the libp2p protocol version sufficient to give us this information?
Adjusted schema as per feedback above and some discussion today via Zoom. The requests and responses are back to maps with keys because we don't expect them to repeat much and it's nice to have them descriptive. But I've also optionalised more of the fields in there so they could be left out entirely without any pain.
{
"gs2": {
"blk": [
[
{ "/": { "bytes": "AVUSIA" } },
{ "/": { "bytes": "QgTLmh40xfCOmyCqdgkOcAILtWwMo9OvcpbNEFilESiQ/tIYSI8ITY355INftUrQRf/ZNuO/cmGwQmxRNSoJeBbtdEgruQhLSn7YrcUX8zceDgQ0tRFiXNGkF5IkPczc/ogJSw" } }
],
[
{ "/": { "bytes": "AVUSIA" } },
{ "/": { "bytes": "xfPTKlWZ2kO4US4NsU11HtHWQJ2dy3rtBdbMbAmHxjTufEI28FF8INzGqZ6G5LrcAl5fXoG0WWEoUimfNvEQ6Qa6IW4ByDcDUq0ziUrrH+WNzNCC9FaSKBovz42VowZ4zaRVhw" } }
]
],
"req": [
{
"ext": {
"AppleSauce/McGee": "yee haw"
},
"id": { "/": { "bytes": "k3Nu8gWHTfq53ypWnVFtwg" } },
"pri": 101,
"root": { "/": "bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi" },
"sel": {"R":{":>":{"|":[{".":{}},{"a":{">":{"@":{}}}}]},"l":{"none":{}}}},
"type": "n"
},
{
"id": { "/": { "bytes": "mdBpy6/wQyqT1LiaihfQWQ" } },
"pri": 202,
"root": { "/": "bafyreibdoxfay27gf4ye3t5a7aa5h4z2azw7hhhz36qrbf5qleldj76qfy" },
"sel": {"R":{":>":{"a":{">":{"@":{}}}},"l":{"none":{}}}},
"type": "n"
}
],
"rsp": [
{
"meta": [
[
{ "/": "bafyreibdoxfay27gf4ye3t5a7aa5h4z2azw7hhhz36qrbf5qleldj76qfy" },
"m"
]
],
"reqid": { "/": { "bytes": "k3Nu8gWHTfq53ypWnVFtwg" } },
"stat": 34
},
{
"ext": {
"Hippity+Hoppity": { "/": { "bytes": "9V/48SUItj7yv+ynVXrpDfYxGl7BYxtKH6hDMQvZw6cQ6qzlob3XKtC/4El3HBHnVjOL2Thl5kXxreybnJnvQH+9T8aFnnkExa19yb0QpcwWlz1bKOwabdQ9n4L58Yw9A0GONQ" } }
},
"reqid": { "/": { "bytes": "mdBpy6/wQyqT1LiaihfQWQ" } },
"stat": 14
}
]
}
}
In terms of bytes sent, the difference is negligible when averaged out. We're sending a few more bytes for the map keys, but there aren't enough of them to matter, and the blocks are still tuples. We're also saving on optional fields by skipping them entirely. Unfortunately there are more pointers in here than I'd like, getting bindnode working with |
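As an illustration of the optional-field saving and of why optional fields tend to bring pointers with them, here is a small sketch. The `request` type is a hypothetical, cut-down shape rather than the real go-graphsync type, and `encoding/json` with `omitempty` stands in for bindnode and dag-cbor.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// request is a hypothetical, cut-down request shape. The pointer and map
// fields are optional: when nil they are left out of the encoding entirely.
type request struct {
	ID       string            `json:"id"`
	Type     string            `json:"type"`
	Priority *int              `json:"pri,omitempty"`
	Ext      map[string]string `json:"ext,omitempty"`
}

// encode serialises a request and panics on failure, which cannot happen
// for this simple type but keeps the sketch short.
func encode(r request) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	pri := 101
	full := request{
		ID:       "a",
		Type:     "n",
		Priority: &pri,
		Ext:      map[string]string{"AppleSauce/McGee": "yee haw"},
	}
	bare := request{ID: "b", Type: "n"}
	fmt.Println(encode(full))
	fmt.Println(encode(bare))
}
```

The pointer on `Priority` is what lets an absent value be distinguished from a zero value. That is the usual cost of optional scalar fields in Go.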
This schema is looking good @rvagg
…ng (#332)
* feat(net): initial dag-cbor protocol support; also added first roundtrip benchmark
* feat(requestid): use uuids for requestids. Ref: #278 Closes: #279 Closes: #281
* fix(requestmanager): make collect test requests with uuids sortable
* fix(requestid): print requestids as string uuids in logs
* fix(requestid): use string as base type for RequestId
* chore(requestid): wrap requestid string in a struct
* feat(libp2p): add v1.0.0 network compatibility
* chore(net): resolve most cbor + uuid merge problems
* feat(net): to/from ipld bindnode types, more cbor protoc improvements
* feat(net): introduce 2.0.0 protocol for dag-cbor
* fix(net): more bindnode dag-cbor protocol fixes. Not quite working yet, still need some upstream fixes and no extensions work has been attempted yet.
* chore(metadata): convert metadata to bindnode
* chore(net,extensions): wire up IPLD extensions, expose as Node instead of []byte
* Extensions now working with new dag-cbor network protocol
* dag-cbor network protocol still not default, most tests are still exercising the existing v1 protocol
* Metadata now using bindnode instead of cbor-gen
* []byte for deferred extensions decoding is now replaced with datamodel.Node everywhere. Internal extensions now using some form of go-ipld-prime decode to convert them to local types (metadata using bindnode, others using direct inspection).
* V1 protocol also using dag-cbor decode of extensions data and exporting the bytes - this may be a breaking change for existing extensions - need to check whether this should be done differently. Maybe a try-decode and if it fails export a wrapped Bytes Node?
* fix(src): fix imports
* fix(mod): clean up go.mod
* fix(net): refactor message version format code to separate packages
* feat(net): activate v2 network as default
* fix(src): build error
* chore: remove GraphSyncMessage#Loggable. Ref: #332 (comment)
* chore: remove intermediate v1.1 pb protocol message type. v1.1.0 was introduced to start the transition to UUID RequestIDs. That change has since been combined with the switch to DAG-CBOR messaging format for a v2.0.0 protocol. Thus, this interim v1.1.0 format is no longer needed and has not been used at all in a released version of go-graphsync. Fixes: filecoin-project/lightning-planning#14
* fix: clarify comments re dag-cbor extension data. As per discussion in #338, we are going to be erroring on extension data that is not properly dag-cbor encoded from now on
* feat: new LinkMetadata iface, integrate metadata into Response type (#342)
* feat(metadata): new LinkMetadata iface, integrate metadata into Response type
* LinkMetadata wrapper around existing metadata type to allow for easier backward-compat upgrade path
* integrate metadata directly into GraphSyncResponse type, moving it from an optional extension
* still deal with metadata as an extension for now; further work for v2 protocol will move it into the core message schema. Ref: #335
* feat(metadata): move metadata to core protocol, only use extension in v1 proto
* fix(metadata): bindnode expects Go enum strings to be at the type level
* fix(metadata): minor fixes, tidy up naming
* fix(metadata): make gofmt and staticcheck happy
* fix(metadata): docs and minor tweaks after review. Co-authored-by: Daniel Martí
* fix: avoid double-encode for extension size estimation. Closes: filecoin-project/lightning-planning#15
* feat(requesttype): introduce RequestType enum to replace cancel&update bools (#352). Closes: #345
* fix(metadata): extend round-trip tests to byte representation (#350)
* feat!(messagev2): tweak dag-cbor message schema (#354)
* feat!(messagev2): tweak dag-cbor message schema. For: 1. Efficiency: compacting the noisy structures into tuple representations and making top-level components of a message optional. 2. Migrations: providing a secondary mechanism to lean on for versioning if we want a gentler upgrade path than libp2p protocol versioning. Closes: #351
* fix(messagev2): adjust schema per feedback
* feat(graphsync): unify req & resp Pause, Unpause & Cancel by RequestID (#355)
* feat(graphsync): unify req & resp Pause, Unpause & Cancel by RequestID. Closes: #349
* fixup! feat(graphsync): unify req & resp Pause, Unpause & Cancel by RequestID
* fixup! feat(graphsync): unify req & resp Pause, Unpause & Cancel by RequestID. when using error type T, use *T with As, rather than **T
* fixup! feat(graphsync): unify req & resp Pause, Unpause & Cancel by RequestID
* fixup! feat(graphsync): unify req & resp Pause, Unpause & Cancel by RequestID. Co-authored-by: Daniel Martí
* feat: SendUpdates() API to send only extension data via existing request
* fix(responsemanager): send update while completing. If request has finished selector traversal but is still sending blocks, I think it should be possible to send updates. As a side effect, this fixes our race. Logically, this makes sense, cause our external indicator that we're done (completed response listener) has not been called.
* fix(requestmanager): revert change to pointer type
* Refactor async loading for simplicity and correctness (#356)
* feat(reconciledloader): first working version of reconciled loader
* feat(traversalrecorder): add better recorder for traversals
* feat(reconciledloader): pipe reconciled loader through code
* style(lint): fix static checks
* Update requestmanager/reconciledloader/injest.go. Co-authored-by: Rod Vagg
* feat(reconciledloader): respond to PR comments. Co-authored-by: Rod Vagg
* fix(requestmanager): update test for rebase. Co-authored-by: Daniel Martí. Co-authored-by: hannahhoward
avoid panic when a decoder is not present for a voucher type
In terms of what this looks like, an example of an original message is in #351 and with this change the same data would look like (as dag-json):
In terms of bytes saved, using the randomish data in TestGraphsyncRoundTrip (i.e. it has some variability in output length and these are not using exactly the same data so consider it approximate):
@mvdan @warpfork care to critique my schema?
Aside: it's interesting that we're even saving bytes written by server from protobuf to dag-cbor even without these changes. The majority of the data sent should be blocks and their CIDs and all these maps with string keys should drown out any saving CBOR gets from its compact int & length representation. Plus we have v1.0 doing int request IDs and v2 doing UUID (as bytes) request IDs so they're longer. Nothing in the protobuf spec is optional, although I believe repeated allows for zero occurrences so in effect the top-level items are optional at least. Perhaps it's all to do with moving the metadata into the core message rather than as an extension (so we save writing the string "graphsync/response-metadata" on each response). An interesting mystery that might be worth investigating at some point.
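For a rough sense of the magnitudes involved, the sketch below works out a few of the per-item sizes mentioned in the aside. It is back-of-the-envelope arithmetic under stated assumptions, not a measurement of go-graphsync: the helper names are hypothetical, the protobuf field is assumed to have a field number below 16 (so a one-byte tag), and the request ID value 1000 is an arbitrary example.

```go
package main

import "fmt"

// cborHeaderLen returns how many bytes CBOR uses for the major-type and
// length prefix of a string or byte string of n bytes.
func cborHeaderLen(n uint64) int {
	switch {
	case n < 24:
		return 1
	case n < 1<<8:
		return 2
	case n < 1<<16:
		return 3
	case n < 1<<32:
		return 5
	default:
		return 9
	}
}

// varintLen returns how many bytes protobuf needs to encode v as a varint.
func varintLen(v uint64) int {
	n := 1
	for v >= 0x80 {
		v >>= 7
		n++
	}
	return n
}

// lenDelimited returns the protobuf size of a length-delimited field of n
// payload bytes, assuming a one-byte tag (field number below 16).
func lenDelimited(n int) int {
	return 1 + varintLen(uint64(n)) + n
}

func main() {
	const metadataKey = "graphsync/response-metadata"
	fmt.Println("extension key as a protobuf string field:", lenDelimited(len(metadataKey)), "bytes")
	fmt.Println("extension key as a CBOR string:", cborHeaderLen(uint64(len(metadataKey)))+len(metadataKey), "bytes")
	fmt.Println("UUID request ID as CBOR bytes:", cborHeaderLen(16)+16, "bytes")
	fmt.Println("request ID 1000 as a protobuf varint:", varintLen(1000), "bytes")
}
```

Under these assumptions, dropping the extension key from each response saves more than the longer UUID request ID costs, which is at least consistent with the guess above.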