|
| 1 | +# MSC3911: Linking media to events |
| 2 | + |
| 3 | +(An alternative to [MSC3910](https://github.com/matrix-org/matrix-spec-proposals/pull/3910).) |
| 4 | + |
| 5 | +Currently, access to media in Matrix has the following problems: |
| 6 | + |
| 7 | +* The only protection for media is the obscurity of the URL, and URLs are |
| 8 | + easily leaked (eg accidental sharing, access |
| 9 | + logs). [synapse#2150](https://github.com/matrix-org/synapse/issues/2150) |
| 10 | +* Anybody (including non-matrix users) can cause a homeserver to copy media |
| 11 | + into its local |
| 12 | + store. [synapse#2133](https://github.com/matrix-org/synapse/issues/2133) |
| 13 | +* When a media event is redacted, the media it used remains visible to all. |
| 14 | + [synapse#1263](https://github.com/matrix-org/synapse/issues/1263) |
| 15 | +* There is currently no way to delete |
| 16 | + media. [matrix-spec#226](https://github.com/matrix-org/matrix-spec/issues/226) |
| 17 | +* If a user requests GDPR erasure, their media remains visible to all. |
| 18 | +* When all users leave a room, their media is not deleted from the server. |
| 19 | + |
| 20 | +## Proposal |
| 21 | + |
| 22 | +### Overview |
| 23 | + |
| 24 | +After an item of media is uploaded, it must be linked to an event (via |
| 25 | +parameters to the `/send` api). A given piece of media is only visible |
| 26 | +to a user if the user can see the corresponding event. |
| 27 | + |
| 28 | +### Detailed spec changes |
| 29 | + |
| 30 | +1. A new "media upload" endpoint is defined, `POST |
| 31 | + /_matrix/client/v1/media/upload`. It is based on the existing |
| 32 | + [`/_matrix/media/v3/upload`](https://spec.matrix.org/v1.4/client-server-api/#post_matrixmediav3upload) |
| 33 | + endpoint, but media uploaded this way is not initially viewable (except to |
| 34 | + the user that uploaded it). This is referred to as a "restricted" media item. |
| 35 | + |
| 36 | + The existing endpoint is deprecated. Media uploaded via the existing endpoint |
| 37 | + is "unrestricted". |
| 38 | + |
| 39 | +2. Attaching media |
| 40 | + |
| 41 | + * The methods for sending events |
| 42 | + ([`PUT /_matrix/client/v3/rooms/{roomId}/state/{eventType}/{stateKey}`](https://spec.matrix.org/v1.4/client-server-api/#put_matrixclientv3roomsroomidstateeventtypestatekey) |
| 43 | + and [`PUT /_matrix/client/v3/rooms/{roomId}/send/{eventType}/{txnId}`](https://spec.matrix.org/v1.4/client-server-api/#put_matrixclientv3roomsroomidsendeventtypetxnid)) |
| 44 | + are extended to take a query parameter `attach_media`, whose value must be a complete `mxc:` URI. |
| 45 | + |
| 46 | + The `attach_media` parameter may be used several times to attach several |
| 47 | + pieces of media to the same event. The maximum number of pieces of media |
| 48 | + that can be attached to a single event is implementation-defined by servers. |
| 49 | + |
| 50 | + If any of the `attach_media` parameters do not correspond to known |
| 51 | + restricted media items, or they refer to restricted media items that have |
| 52 | + already been attached, the server responds with a 400 error with |
| 53 | + `M_INVALID_PARAM`. |
| 54 | + |
| 55 | + Sending an event in this manner associates the media with the sent |
| 56 | + event. From then on, the media can be seen by any user who can see the event |
| 57 | + itself. |
| 58 | + |
| 59 | + Servers should ensure that sending an event remains an idempotent operation: in |
| 60 | + particular, if a client sends an event with a media attachment, and then |
| 61 | + repeats the operation with identical parameters, a 200 response must be returned |
| 62 | + (with the original event ID) even though the media has already been attached. |
| 63 | + |
| 64 | + * Alternatively, if a restricted media item is referenced in a call to |
| 65 | + [`PUT /_matrix/client/v3/profile/{userId}/avatar_url`](https://spec.matrix.org/v1.4/client-server-api/#put_matrixclientv3profileuseridavatar_url), |
| 66 | + it is instead attached to the user's profile. |
| 67 | + |
| 68 | + Again, if the media is already attached, the server responds with a 400 error with |
| 69 | + `M_INVALID_PARAM`. |
| 70 | + |
| 71 | + If the media is not attached to either an event or a profile within a reasonable period |
| 72 | + (say, ten minutes), then the server is free to assume that the user has changed their |
| 73 | + mind (or the client has gone offline), and may clean up the uploaded media. |
| 74 | + |
| 75 | +3. New download endpoints |
| 76 | + |
| 77 | + The existing download endpoints are to be deprecated, and replaced with new |
| 78 | + endpoints specific to client-server or federation requests: |
| 79 | + |
| 80 | + | Old | Client-Server | Federation | |
| 81 | + | ---------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------- | ------------------------------------------------------------------- | |
| 82 | + | [`GET /_matrix/media/v3/download/{serverName}/{mediaId}`](https://spec.matrix.org/v1.4/client-server-api/#get_matrixmediav3downloadservernamemediaid) | `GET /_matrix/client/v1/media/download/{serverName}/{mediaId}` | `GET /_matrix/federation/v1/media/download/{serverName}/{mediaId}` | |
| 83 | + | [`GET /_matrix/media/v3/download/{serverName}/{mediaId}/{fileName}`](https://spec.matrix.org/v1.4/client-server-api/#get_matrixmediav3downloadservernamemediaidfilename) | `GET /_matrix/client/v1/media/download/{serverName}/{mediaId}/{fileName}` | N/A | |
| 84 | + | [`GET /_matrix/media/v3/thumbnail/{serverName}/{mediaId}`](https://spec.matrix.org/v1.4/client-server-api/#get_matrixmediav3thumbnailservernamemediaid) | `GET /_matrix/client/v1/media/thumbnail/{serverName}/{mediaId}` | `GET /_matrix/federation/v1/media/thumbnail/{serverName}/{mediaId}` | |
| 85 | + |
| 86 | + [Question: should we move `/config`, and `/preview_url` while |
| 87 | + we're at it, per [MSC1902](https://github.com/matrix-org/matrix-spec-proposals/pull/1902)?] |
| 88 | + |
| 89 | + None of the new endpoints take an `allow_remote` query parameter. (For |
| 90 | + `/_matrix/client`, servers should always attempt to request remote |
| 91 | + media. For `/_matrix/federation`, servers should never attempt to request |
| 92 | + remote media they do not already have cached.) |
| 93 | + |
| 94 | + All of the new endpoints require an `Authorization` header, which must be |
| 95 | + set in the same way as for any other CSAPI or federation request (ie, |
| 96 | + `Bearer {accessToken}` for `/_matrix/client`, or the signature for |
| 97 | + `/_matrix/federation`). |
| 98 | + |
| 99 | + When handling a request to the new endpoints, the server must check if the |
| 100 | + requesting user or server has permission to see the corresponding event. |
| 101 | + If not, the server responds with a 403 error and `M_UNAUTHORIZED`. |
| 102 | + |
| 103 | + * For the new `/_matrix/client` endpoints, the response format is the same as |
| 104 | + the corresponding original endpoints. |
| 105 | + |
| 106 | + * For the new `/_matrix/federation` endpoints, the response is |
| 107 | + [`multipart/mixed`](https://www.w3.org/Protocols/rfc1341/7_2_Multipart.html) |
| 108 | + content with two parts: the first must be a JSON object |
| 109 | + (and should have a `Content-type: application/json` header), and the second |
| 110 | + is the media item as per the original endpoints. The JSON object may have |
| 111 | + a property `restrictions`. |
| 112 | + |
| 113 | + If there is no `restrictions` property, the media is a legacy "unrestricted" |
| 114 | + media. Otherwise, `restrictions` should be a JSON object with one |
| 115 | + of the following properties: |
| 116 | + |
| 117 | + * `event_id`: the event id of the event to which the media is attached. |
| 118 | + * `profile_user_id`: the user id of the user to whose profile the media is attached. |
| 119 | + |
| 120 | + It is invalid for both `event_id` and `profile_user_id` to be set. |
| 121 | + |
| 122 | + The requesting server must check the restrictions list, and only return |
| 123 | + the requested media to users who have permission to view the relevant |
| 124 | + event or profile. If the requesting server caches the media, it must also |
| 125 | + cache the restrictions list. |
| 126 | + |
| 127 | + If neither `event_id` nor `profile_user_id` is present, the requesting |
| 128 | + server should assume that an unknown restriction is present, and not allow access |
| 129 | + to any user. |
| 130 | + |
| 131 | + An example response: |
| 132 | + |
| 133 | + ``` |
| 134 | + Content-Type: multipart/mixed; boundary=gc0p4Jq0M2Yt08jU534c0p |
| 135 | +
|
| 136 | + --gc0p4Jq0M2Yt08jU534c0p |
| 137 | + Content-Type: application/json |
| 138 | +
|
| 139 | + { "restrictions": { |
| 140 | + "event_id": "$Rqnc-F-dvnEYJTyHq_iKxU2bZ1CI92-kuZq3a5lr5Zg" |
| 141 | + }} |
| 142 | +
|
| 143 | + --gc0p4Jq0M2Yt08jU534c0p |
| 144 | + Content-Type: text/plain |
| 145 | +
|
| 146 | + This media is plain text. Maybe somebody used it as a paste bin. |
| 147 | +
|
| 148 | + --gc0p4Jq0M2Yt08jU534c0p-- |
| 149 | + ``` |
| 150 | +
|
| 151 | +4. New "media copy" API |
| 152 | +
|
| 153 | + A new endpoint is defined: `POST |
| 154 | + /_matrix/client/v1/media/copy/{serverName}/{mediaId}`. The body of the |
| 155 | + request must be a JSON object, but there are no required parameters. |
| 156 | +
|
| 157 | + Conceptually, the API makes a new copy of a media item. (In practice, the |
| 158 | + server will probably make a new reference to an existing media item, but |
| 159 | + that is an implementation detail). |
| 160 | +
|
| 161 | + The response is a JSON object with a required `content_uri` property, |
| 162 | + giving a new MXC URI referring to the media. |
| 163 | +
|
| 164 | + The new media item can be attached to a new event, and generally functions |
| 165 | + in every way the same as uploading a brand new media item. |
| 166 | +
|
| 167 | + This "copy" api is to be used by clients when forwarding events with media |
| 168 | + attachments. |
| 169 | +
|
| 170 | +5. Autogenerated `m.room.member` events |
| 171 | +
|
| 172 | + Servers will generate `m.room.member` events with an `avatar_url` whenever |
| 173 | + one of their users joins a room, or changes their profile picture. |
| 174 | +
|
| 175 | + Such events must each use a different copy of the media item, in the same |
| 176 | + way as the "media copy" API described above. |
| 177 | +
|
| 178 | +6. Backwards compatibility mechanisms |
| 179 | +
|
| 180 | + a. Backwards compatibility with older servers: if a client or requesting |
| 181 | + server receives a 404 error with a non-JSON response, or a 400 error with |
| 182 | + `{"errcode": "M_UNRECOGNIZED"}`, in response to a request to one of the new |
| 183 | + endpoints, they may retry the request using the original endpoint. |
| 184 | + |
| 185 | + b. Backwards compatibility with older clients and federating servers: |
| 186 | + servers may for a short time choose to allow unauthenticated access via the |
| 187 | + deprecated endpoints, even for restricted media. |
| 188 | +
|
| 189 | +7. URL preview |
| 190 | +
|
| 191 | + The |
| 192 | + [`/preview_url`](https://spec.matrix.org/v1.4/client-server-api/#get_matrixmediav3preview_url) |
| 193 | + endpoint returns an object that references an image for the previewed |
| 194 | + site. |
| 195 | +
|
| 196 | + It is expected that servers will continue to treat such media as unrestricted |
| 197 | + (at least for local users), but it would be legitimate for them to, for example, |
| 198 | + return a different `mxc:` URI for each requesting user, and only allow each user |
| 199 | + access to the corresponding `mxc:` URI. |
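
As an illustration of the `attach_media` parameter described under "Attaching media" above, the following is a minimal Python sketch of how a client might build the `/send` URL for an event with several attachments. The function name `build_send_url` and the `base_url` argument are hypothetical and not part of this proposal; only the endpoint path and the `attach_media` parameter name come from the text.

```python
from urllib.parse import quote, urlencode


def build_send_url(base_url, room_id, event_type, txn_id, media_uris):
    """Build a /send URL carrying one attach_media parameter per media item.

    Hypothetical helper: only the path and parameter name come from the MSC.
    """
    for uri in media_uris:
        # The proposal requires each value to be a complete mxc: URI.
        if not uri.startswith("mxc://"):
            raise ValueError(f"not a complete mxc: URI: {uri!r}")
    path = "/_matrix/client/v3/rooms/{}/send/{}/{}".format(
        quote(room_id, safe=""),
        quote(event_type, safe=""),
        quote(txn_id, safe=""),
    )
    # Repeating the key is how several items are attached to one event.
    query = urlencode([("attach_media", uri) for uri in media_uris])
    return f"{base_url}{path}?{query}" if query else f"{base_url}{path}"
```

Reusing the same `txn_id` with the same `attach_media` values on a retry is what lets the server treat the request as idempotent, as required above.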
| 200 | +
|
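
The federation download response described under "New download endpoints" above could be handled by a requesting server roughly as follows. This is a sketch only, using Python's standard `email` parser to split the `multipart/mixed` body; the function names are hypothetical, and denying access when both `event_id` and `profile_user_id` are set is an assumption (the proposal only says that combination is invalid). A real server would more likely use a streaming parser for large or binary media.

```python
import json
from email import message_from_bytes
from email.policy import HTTP


def parse_federation_media(content_type, body):
    """Split a multipart/mixed federation response into (metadata, media)."""
    # The email parser expects headers first, so prepend the HTTP Content-Type.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    msg = message_from_bytes(raw, policy=HTTP)
    parts = msg.get_payload() if msg.is_multipart() else []
    if len(parts) != 2:
        raise ValueError("expected exactly two parts: JSON metadata and media")
    metadata = json.loads(parts[0].get_payload(decode=True))
    media = parts[1].get_payload(decode=True)
    return metadata, media


def restriction_target(metadata):
    """Classify the metadata object as unrestricted, event, profile or deny."""
    if "restrictions" not in metadata:
        # No `restrictions` property: legacy "unrestricted" media.
        return ("unrestricted", None)
    restrictions = metadata["restrictions"]
    is_obj = isinstance(restrictions, dict)
    has_event = is_obj and "event_id" in restrictions
    has_profile = is_obj and "profile_user_id" in restrictions
    if has_event and not has_profile:
        return ("event", restrictions["event_id"])
    if has_profile and not has_event:
        return ("profile", restrictions["profile_user_id"])
    # Neither set means an unknown restriction, so no user gets access.
    # Both set is invalid; denying in that case is this sketch's assumption.
    return ("deny", None)
```

Whatever `restriction_target` returns would need to be cached alongside the media, so that later local requests can be checked against the same event or profile.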
| 201 | +### Applications |
| 202 | +
|
| 203 | +The following discusses the impact of this proposal on various parts of |
| 204 | +the ecosystem: we consider the changes that will be required for existing |
| 205 | +implementations, and how the proposal will facilitate future extensions. |
| 206 | +
|
| 207 | +#### IRC/XMPP bridges |
| 208 | +
|
| 209 | +Possibly the largest impact will be on IRC and XMPP bridges. Since IRC and |
| 210 | +XMPP have no media repository of their own, these bridges currently transform |
| 211 | +`mxc:` URIs into `https://<server>/_matrix/media/v3/download/` URIs and forward |
| 212 | +those links to the remote platform. This will no longer be a viable option. |
| 213 | +
|
| 214 | +This is largely a problem to be solved by the bridge implementations, but one |
| 215 | +potential solution is for the bridges to provide a proxy. |
| 216 | +
|
| 217 | +In this scenario, the bridge would have a secret HMAC key. When it |
| 218 | +receives a matrix event referencing a piece of media, it should create a new URI |
| 219 | +referencing the media, including an HMAC to prevent tampering. For example: |
| 220 | +
|
| 221 | +``` |
| 222 | +https://<bridge_server>/media/{originServerName}/{mediaId}?mac={hmac} |
| 223 | +``` |
| 224 | +
|
| 225 | +When the bridge later receives a request to that URI, it checks the hmac, |
| 226 | +and proxies the request to the homeserver, using its AS access |
| 227 | +token in the `Authorization` header. |
| 228 | +
|
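
A minimal sketch of the signing and checking steps in such a bridge proxy, in Python. The choice of HMAC-SHA256, the hex encoding, the `{originServerName}/{mediaId}` message layout, and all function names are assumptions made for illustration; the proposal only requires that the URI carries an HMAC computed with a key known to the bridge.

```python
import hashlib
import hmac
from urllib.parse import quote


def sign_media_path(key, server_name, media_id):
    """Return a hex HMAC-SHA256 over "<server_name>/<media_id>" (assumed layout)."""
    message = f"{server_name}/{media_id}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def build_proxy_url(bridge_base, key, server_name, media_id):
    """Build the link that the bridge forwards to the remote platform."""
    mac = sign_media_path(key, server_name, media_id)
    return "{}/media/{}/{}?mac={}".format(
        bridge_base, quote(server_name, safe=""), quote(media_id, safe=""), mac
    )


def verify_proxy_request(key, server_name, media_id, mac):
    """Check the HMAC on an incoming request before proxying to the homeserver."""
    expected = sign_media_path(key, server_name, media_id)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("ascii"), mac.encode("utf-8"))
```

Only once `verify_proxy_request` succeeds would the bridge fetch the media from the homeserver with its AS access token. Extra fields such as a room ID or an expiry time could be appended to the signed message to support the access controls discussed in this section.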
| 229 | +This mechanism also works for a secondary use of the content repository in |
| 230 | +bridges: as a paste-bin. In this case, the bridge simply generates a link |
| 231 | +referencing its own media. |
| 232 | +
|
| 233 | +The bridge might also choose to embed information such as the room that |
| 234 | +referenced the media, and the time that the link was generated, in the URL. |
| 235 | +This could be used to help control access to the media. |
| 236 | +
|
| 237 | +Such mechanisms would allow the bridge to impose controls such as: |
| 238 | +
|
| 239 | +* Limiting the time a media link is valid for. Doing so would help prevent |
| 240 | + visibility to users who weren't participating in the chat. |
| 241 | +
|
| 242 | +* Rate-limiting the amount of media being shared in a particular room (in other |
| 243 | + words, avoiding the use of Matrix as a Warez distribution system). |
| 244 | +
|
| 245 | +#### Redacting events |
| 246 | +
|
| 247 | +Under this proposal, servers can determine which media is referenced by an |
| 248 | +event that is redacted, and add that media to a list to be cleaned up. |
| 249 | +
|
| 250 | +This would also apply if all users in a room are deactivated (either via a GDPR |
| 251 | +Article 17 request or by a self-requested "Deactivate account" request). In |
| 252 | +this case, all events in the room, and all media referenced by them, should be |
| 253 | +removed. Currently, Synapse does not support removing the events (see also |
| 254 | +[synapse#4720](https://github.com/matrix-org/synapse/issues/4720)); but if at |
| 255 | +some point in the future this is added, then this proposal makes it easy to |
| 256 | +extend to deleting the media. |
| 257 | +
|
| 258 | +Fixes [synapse#1263](https://github.com/matrix-org/synapse/issues/1263). |
| 259 | +
|
| 260 | +#### Icons for "social login" flows |
| 261 | +
|
| 262 | +When a server supports multiple login providers, it provides the client with |
| 263 | +icons for the login providers as `mxc:` media URIs. These must be accessible |
| 264 | +without authentication (because the client has no access token at the time the |
| 265 | +icons are displayed). |
| 266 | +
|
| 267 | +This remains a somewhat unsolved problem. Possibly the clients can continue |
| 268 | +to call the legacy `/_matrix/media/v3/download` URI for now: ultimately this |
| 269 | +problem will be solved by the transition to OIDC. Alternatively, we may need |
| 270 | +to provide an alternative `/_matrix/client/v3/login/sso/icon/{idpId}` API |
| 271 | +specifically for access to these icons. |
| 272 | +
|
| 273 | +We also need to ensure that the icons are not deleted from the content |
| 274 | +repository even though they have not been attached to any event or profile. It |
| 275 | +would be wise for servers to provide administrators with a mechanism to upload |
| 276 | +media without attaching it to anything. |
| 277 | +
|
| 278 | +(This was previously discussed in |
| 279 | +[MSC2858](https://github.com/matrix-org/matrix-spec-proposals/pull/2858#discussion_r543513811).) |
| 280 | +
|
| 281 | +## Potential issues |
| 282 | +
|
| 283 | +* Setting the `Authorization` header is going to be annoying for web clients. Service workers |
| 284 | + might be needed. |
| 285 | +
|
| 286 | +* Users will be unable to copy links to media from web clients to share out of |
| 287 | + band. This is considered a feature, not a bug. |
| 288 | +
|
| 289 | +* Since each `m.room.member` references the avatar separately, changing your |
| 290 | + avatar will cause an even bigger traffic storm if you're in a lot of rooms. |
| 291 | +
|
| 292 | +## Alternatives |
| 293 | +
|
| 294 | +* Use some sort of "content token" for each piece of media, and require clients to |
| 295 | + provide it, per [MSC3910](https://github.com/matrix-org/matrix-spec-proposals/pull/3910). |
| 296 | +
|
| 297 | +* Allow clients to upload media which does not require authentication (for |
| 298 | + example via a `public=true` query parameter). This might be particularly |
| 299 | + useful for IRC/XMPP bridges, which could upload any media they encounter to |
| 300 | + the homeserver's repository. |
| 301 | +
|
| 302 | + The danger with this is that there's little stopping clients |
| 303 | + continuing to upload media as "public", negating all of the benefits in this |
| 304 | + MSC. It might be ok if such uploads were restricted to certain privileged |
| 305 | + users. |
| 306 | +
|
| 307 | +* Have the "upload" endpoint return a nonce, which can then be used in the |
| 308 | + "send" endpoint in place of the `mxc:` URI. It's hard to see what advantage |
| 309 | + this gives, beyond the fact that a nonce could be smaller, so there would be |
| 310 | + marginally fewer bytes to send. |
| 311 | +
|
| 312 | +## Security considerations |
| 313 | +
|
| 314 | +* Letting servers track the relationship between events and media is a metadata |
| 315 | + leak, especially for e2ee rooms. |
| 316 | +
|
| 317 | +## Unstable prefix |
| 318 | +
|
| 319 | +TODO |
| 320 | +
|
| 321 | +## Dependencies |
| 322 | +
|
| 323 | +None at present. |
| 324 | +
|
| 325 | +## Prior art |
| 326 | +
|
| 327 | +* Credit: this is based on ideas from @jcgruenhage and @anoadragon453 at |
| 328 | + https://cryptpad.fr/code/#/2/code/view/oWjZciD9N1aWTr1IL6GRZ0k1i+dm7wJQ7juLf4tJRoo/ |
| 329 | +
|
| 330 | +* [MSC~~701~~3796](https://github.com/matrix-org/matrix-spec-proposals/issues/3796): |
| 331 | + a predecessor of this proposal |
| 332 | +
|
| 333 | +* [MSC2461](https://github.com/matrix-org/matrix-spec-proposals/pull/2461): |
| 334 | + adds per-user authentication but does not attempt to restrict access to |
| 335 | + individual items of media. |
| 336 | +
|
| 337 | +* [MSC2278](https://github.com/matrix-org/matrix-spec-proposals/pull/2278): |
| 338 | + Deleting attachments for expired and redacted messages |
| 339 | +
|
| 340 | +* [MSC1902](https://github.com/matrix-org/matrix-spec-proposals/pull/1902): |
| 341 | + Split the media repo into s2s and c2s parts |
| 342 | +
|
| 343 | +* [MSC2846](https://github.com/matrix-org/matrix-spec-proposals/pull/2846): |
| 344 | + Decentralizing media through CIDs |
| 345 | + |
| 346 | + |