Commit b1da78c

MSC3911: Linking media to events
1 parent 1676be3 commit b1da78c

1 file changed

+346
-0
lines changed

@@ -0,0 +1,346 @@
1+
# MSC3911: Linking media to events
2+
3+
(An alternative to [MSC3910](https://github.com/matrix-org/matrix-spec-proposals/pull/3910).)
4+
5+
Currently, access to media in Matrix has the following problems:
6+
7+
* The only protection for media is the obscurity of the URL, and URLs are
8+
easily leaked (eg accidental sharing, access
9+
logs). [synapse#2150](https://github.com/matrix-org/synapse/issues/2150)
10+
* Anybody (including non-matrix users) can cause a homeserver to copy media
11+
into its local
12+
store. [synapse#2133](https://github.com/matrix-org/synapse/issues/2133)
13+
* When a media event is redacted, the media it used remains visible to all.
14+
[synapse#1263](https://github.com/matrix-org/synapse/issues/1263)
15+
* There is currently no way to delete
16+
media. [matrix-spec#226](https://github.com/matrix-org/matrix-spec/issues/226)
17+
* If a user requests GDPR erasure, their media remains visible to all.
18+
* When all users leave a room, their media is not deleted from the server.
19+
20+
## Proposal
21+
22+
### Overview
23+
24+
After an item of media is uploaded, it must be linked to an event (via
25+
parameters to the `/send` api). A given piece of media is only visible
26+
to a user if the user can see the corresponding event.
27+
28+
### Detailed spec changes
29+
30+
1. A new "media upload" endpoint is defined, `POST
31+
/_matrix/client/v1/media/upload`. It is based on the existing
32+
[`/_matrix/media/v3/upload`](https://spec.matrix.org/v1.4/client-server-api/#post_matrixmediav3upload)
33+
endpoint, but media uploaded this way is not initially viewable (except to
34+
the user that uploaded it). This is referred to as a "restricted" media item.
35+
36+
The existing endpoint is deprecated. Media uploaded via the existing endpoint
37+
is "unrestricted".
38+
39+
2. Attaching media
40+
41+
* The methods for sending events
42+
([`PUT /_matrix/client/v3/rooms/{roomId}/state/{eventType}/{stateKey}`](https://spec.matrix.org/v1.4/client-server-api/#put_matrixclientv3roomsroomidstateeventtypestatekey)
43+
and [`PUT /_matrix/client/v3/rooms/{roomId}/send/{eventType}/{txnId}`](https://spec.matrix.org/v1.4/client-server-api/#put_matrixclientv3roomsroomidsendeventtypetxnid))
44+
are extended to take a query parameter `attach_media`, whose value must be a complete `mxc:` URI.
45+
46+
The `attach_media` parameter may be used several times to attach several
47+
pieces of media to the same event. The maximum number of pieces of media
48+
that can be attached to a single event is implementation-defined by servers.
49+
50+
If any of the `attach_media` parameters do not correspond to known
51+
restricted media items, or they refer to restricted media items that have
52+
already been attached, the server responds with a 400 error with
53+
`M_INVALID_PARAM`.
54+
55+
Sending an event in this manner associates the media with the sent
56+
event. From then on, the media can be seen by any user who can see the event
57+
itself.
58+
59+
Servers should ensure that sending an event remains an idempotent operation: in
60+
particular, if a client sends an event with a media attachment, and then
61+
repeats the operation with identical parameters, a 200 response must be returned
62+
(with the original event ID) even though the media has already been attached.
63+
64+
* Alternatively, if a restricted media item is referenced in a call to
65+
[`PUT /_matrix/client/v3/profile/{userId}/avatar_url`](https://spec.matrix.org/v1.4/client-server-api/#put_matrixclientv3profileuseridavatar_url),
66+
it is instead attached to the user's profile.
67+
68+
Again, if the media is already attached, the server responds with a 400 error with
69+
`M_INVALID_PARAM`.
70+
71+
If the media is not attached to either an event or a profile within a reasonable period
72+
(say, ten minutes), then the server is free to assume that the user has changed their
73+
mind (or the client has gone offline), and may clean up the uploaded media.
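
As a minimal sketch of how a client might build the extended `/send` request described above, the following Python helper repeats the `attach_media` query parameter once per `mxc:` URI. The function name `build_send_url` and the example homeserver URL are illustrative assumptions, not part of the proposal.

```python
from urllib.parse import quote, urlencode


def build_send_url(base_url, room_id, event_type, txn_id, mxc_uris):
    """Build a /send URL carrying one attach_media parameter per mxc: URI.

    Hypothetical helper: only the endpoint path and the `attach_media`
    parameter name come from the proposal text.
    """
    path = "/_matrix/client/v3/rooms/{}/send/{}/{}".format(
        quote(room_id, safe=""),
        quote(event_type, safe=""),
        quote(txn_id, safe=""),
    )
    # A list of pairs lets the same key appear several times in the query.
    query = urlencode([("attach_media", uri) for uri in mxc_uris])
    return base_url + path + ("?" + query if query else "")


url = build_send_url(
    "https://hs.example.org",
    "!r:example.org",
    "m.room.message",
    "txn1",
    ["mxc://example.org/a", "mxc://example.org/b"],
)
print(url)
```

Repeating the call with the same transaction ID and the same parameters yields the same URL, which is what lets the server treat the retry as idempotent.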
74+
75+
3. New download endpoints
76+
77+
The existing download endpoints are to be deprecated, and replaced with new
78+
endpoints specific to client-server or federation requests:
79+
80+
| Old | Client-Server | Federation |
81+
| ---------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------- | ------------------------------------------------------------------- |
82+
| [`GET /_matrix/media/v3/download/{serverName}/{mediaId}`](https://spec.matrix.org/v1.4/client-server-api/#get_matrixmediav3downloadservernamemediaid) | `GET /_matrix/client/v1/media/download/{serverName}/{mediaId}` | `GET /_matrix/federation/v1/media/download/{serverName}/{mediaId}` |
83+
| [`GET /_matrix/media/v3/download/{serverName}/{mediaId}/{fileName}`](https://spec.matrix.org/v1.4/client-server-api/#get_matrixmediav3downloadservernamemediaid) | `GET /_matrix/client/v1/media/download/{serverName}/{mediaId}/{fileName}` | N/A |
84+
| [`GET /_matrix/media/v3/thumbnail/{serverName}/{mediaId}`](https://spec.matrix.org/v1.4/client-server-api/#get_matrixmediav3thumbnailservernamemediaid) | `GET /_matrix/client/v1/media/thumbnail/{serverName}/{mediaId}` | `GET /_matrix/federation/v1/media/thumbnail/{serverName}/{mediaId}` |
85+
86+
[Question: should we move `/config`, and `/preview_url` while
87+
we're at it, per [MSC1902](https://github.com/matrix-org/matrix-spec-proposals/pull/1902)?]
88+
89+
None of the new endpoints take an `allow_remote` query parameter. (For
90+
`/_matrix/client`, servers should always attempt to request remote
91+
media. For `/_matrix/federation`, servers should never attempt to request
92+
remote media they do not already have cached.)
93+
94+
All of the new endpoints require an `Authorization` header, which must be
95+
set in the same way as for any other CSAPI or federation request (ie,
96+
`Bearer {accessToken}` for `/_matrix/client`, or the signature for
97+
`/_matrix/federation`).
98+
99+
When handling a request to the new endpoints, the server must check if the
100+
requesting user or server has permission to see the corresponding event.
101+
If not, the server responds with a 403 error and `M_UNAUTHORIZED`.
102+
103+
* For the new `/_matrix/client` endpoints, the response format is the same as
104+
the corresponding original endpoints.
105+
106+
* For the new `/_matrix/federation` endpoints, the response is
107+
[`multipart/mixed`](https://www.w3.org/Protocols/rfc1341/7_2_Multipart.html)
108+
content with two parts: the first must be a JSON object
109+
(and should have a `Content-type: application/json` header), and the second
110+
is the media item as per the original endpoints. The JSON object may have
111+
a property `restrictions`.
112+
113+
If there is no `restrictions` property, the media is a legacy "unrestricted"
114+
media. Otherwise, `restrictions` should be a JSON object with one
115+
of the following properties:
116+
117+
* `event_id`: the event id of the event to which the media is attached.
118+
* `profile_user_id`: the user id of the user to whose profile the media is attached.
119+
120+
It is invalid for both `event_id` and `profile_user_id` to be set.
121+
122+
The requesting server must check the restrictions list, and only return
123+
the requested media to users who have permission to view the relevant
124+
event or profile. If the requesting server caches the media, it must also
125+
cache the restrictions list.
126+
127+
If neither `event_id` nor `profile_user_id` are present, the requesting
128+
server should assume that an unknown restriction is present, and not allow access
129+
to any user.
130+
131+
An example response:
132+
133+
```
134+
Content-Type: multipart/mixed; boundary=gc0p4Jq0M2Yt08jU534c0p
135+
136+
--gc0p4Jq0M2Yt08jU534c0p
137+
Content-Type: application/json
138+
139+
{ "restrictions": {
140+
"event_id": "$Rqnc-F-dvnEYJTyHq_iKxU2bZ1CI92-kuZq3a5lr5Zg"
141+
}}
142+
143+
--gc0p4Jq0M2Yt08jU534c0p
144+
Content-Type: text/plain
145+
146+
This media is plain text. Maybe somebody used it as a paste bin.
147+
148+
--gc0p4Jq0M2Yt08jU534c0p--
149+
```
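
The handling of the federation response described above could be sketched roughly as follows, using only the Python standard library. The function names, the tuple return values, and the choice to deny access when both `event_id` and `profile_user_id` are set are assumptions made for illustration; the proposal only says that combination is invalid. A real server would probably use a streaming multipart parser rather than buffering the whole body.

```python
import json
from email import message_from_bytes


def parse_federation_media(content_type, body):
    """Split a multipart/mixed body into (metadata, media_type, media_bytes)."""
    # Prepend the Content-Type header so the stdlib parser can find the boundary.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    msg = message_from_bytes(raw)
    parts = msg.get_payload() if msg.is_multipart() else []
    if len(parts) != 2:
        raise ValueError("expected exactly two parts: JSON metadata and media")
    metadata = json.loads(parts[0].get_payload(decode=True))
    return metadata, parts[1].get_content_type(), parts[1].get_payload(decode=True)


def classify_restrictions(metadata):
    """Return (kind, target) for the access check the requesting server must do."""
    if "restrictions" not in metadata:
        return ("unrestricted", None)  # legacy media
    restrictions = metadata["restrictions"]
    if not isinstance(restrictions, dict):
        return ("deny", None)
    has_event = "event_id" in restrictions
    has_profile = "profile_user_id" in restrictions
    if has_event and has_profile:
        return ("deny", None)  # invalid combination; denying is an assumption
    if has_event:
        return ("event", restrictions["event_id"])
    if has_profile:
        return ("profile", restrictions["profile_user_id"])
    return ("deny", None)  # unknown restriction: allow no one


body = (
    b"--B\r\nContent-Type: application/json\r\n\r\n"
    b'{"restrictions": {"event_id": "$abc"}}\r\n'
    b"--B\r\nContent-Type: text/plain\r\n\r\nhello\r\n--B--\r\n"
)
meta, media_type, media = parse_federation_media("multipart/mixed; boundary=B", body)
print(classify_restrictions(meta), media_type)
```

The classification result would be cached alongside the media, so that later local requests repeat the same visibility check.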
150+
151+
4. New "media copy" API
152+
153+
A new endpoint is defined: `POST
154+
/_matrix/client/v1/media/copy/{serverName}/{mediaId}`. The body of the
155+
request must be a JSON object, but there are no required parameters.
156+
157+
Conceptually, the API makes a new copy of a media item. (In practice, the
158+
server will probably make a new reference to an existing media item, but
159+
that is an implementation detail).
160+
161+
The response is a JSON object with a required `content_uri` property,
162+
giving a new MXC URI referring to the media.
163+
164+
The new media item can be attached to a new event, and generally functions
165+
in every way the same as uploading a brand new media item.
166+
167+
This "copy" api is to be used by clients when forwarding events with media
168+
attachments.
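
For illustration, a client forwarding an event might prepare the copy request along these lines. This is a sketch under stated assumptions: `build_copy_request` is a hypothetical helper, the homeserver URL and token are placeholders, and the request is only constructed here, not sent.

```python
import json
from urllib.parse import quote, urlsplit
from urllib.request import Request


def build_copy_request(homeserver_base, access_token, mxc_uri):
    """Build (but do not send) a POST to the proposed media copy endpoint."""
    parts = urlsplit(mxc_uri)
    media_id = parts.path.lstrip("/")
    if parts.scheme != "mxc" or not parts.netloc or not media_id:
        raise ValueError("not a complete mxc: URI: %r" % (mxc_uri,))
    url = "{}/_matrix/client/v1/media/copy/{}/{}".format(
        homeserver_base,
        quote(parts.netloc, safe=""),
        quote(media_id, safe=""),
    )
    # The body must be a JSON object; no parameters are required.
    return Request(
        url,
        data=json.dumps({}).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json",
        },
    )


req = build_copy_request("https://hs.example.org", "tok", "mxc://example.org/abc")
print(req.get_method(), req.full_url)
```

The `content_uri` in the response would then be passed as an `attach_media` value when sending the forwarded event.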
169+
170+
5. Autogenerated `m.room.member` events
171+
172+
Servers will generate `m.room.member` events with an `avatar_url` whenever
173+
one of their users joins a room, or changes their profile picture.
174+
175+
Such events must each use a different copy of the media item, in the same
176+
way as the "media copy" API described above.
177+
178+
6. Backwards compatibility mechanisms
179+
180+
a. Backwards compatibility with older servers: if a client or requesting
181+
server receives a 404 error with a non-JSON response, or a 400 error with
182+
`{"errcode": "M_UNRECOGNIZED"}`, in response to a request to one of the new
183+
endpoints, they may retry the request using the original endpoint.
184+
185+
b. Backwards compatibility with older clients and federating servers:
186+
servers may for a short time choose to allow unauthenticated access via the
187+
deprecated endpoints, even for restricted media.
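
The fallback rule in (a) can be expressed as a small predicate. The sketch below is one possible reading: the function name is hypothetical, and treating any body that does not decode to a JSON object as "non-JSON" is an assumption.

```python
import json


def should_retry_on_legacy_endpoint(status, body):
    """Decide whether a failed request to a new endpoint may be retried
    against the original endpoint, per rule (a) above."""
    try:
        parsed = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        parsed = None
    if status == 404:
        # Older servers answer unknown paths with a non-JSON 404.
        return not isinstance(parsed, dict)
    if status == 400:
        return isinstance(parsed, dict) and parsed.get("errcode") == "M_UNRECOGNIZED"
    return False


print(should_retry_on_legacy_endpoint(404, b"<html>Not Found</html>"))
print(should_retry_on_legacy_endpoint(400, b'{"errcode": "M_UNRECOGNIZED"}'))
print(should_retry_on_legacy_endpoint(403, b'{"errcode": "M_UNAUTHORIZED"}'))
```

Note that a 403 is never retried: it means the new endpoint exists and access was refused.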
188+
189+
7. URL preview
190+
191+
The
192+
[`/preview_url`](https://spec.matrix.org/v1.4/client-server-api/#get_matrixmediav3preview_url)
193+
endpoint returns an object that references an image for the previewed
194+
site.
195+
196+
It is expected that servers will continue to treat such media as unrestricted
197+
(at least for local users), but it would be legitimate for them to, for example,
198+
return a different `mxc:` URI for each requesting user, and only allow each user
199+
access to the corresponding `mxc:` URI.
200+
201+
### Applications
202+
203+
The following discusses the impact of this proposal on various parts of
204+
the ecosystem: we consider the changes that will be required for existing
205+
implementations, and how the proposal will facilitate future extensions.
206+
207+
#### IRC/XMPP bridges
208+
209+
Possibly the largest impact will be on IRC and XMPP bridges. Since IRC and
210+
XMPP have no media repository of their own, these bridges currently transform
211+
`mxc:` URIs into `https://<server>/_matrix/media/v3/download/` URIs and forward
212+
those links to the remote platform. This will no longer be a viable option.
213+
214+
This is largely a problem to be solved by the bridge implementations, but one
215+
potential solution is for the bridges to provide a proxy.
216+
217+
In this scenario, the bridge would have a secret HMAC key. When it
218+
receives a matrix event referencing a piece of media, it should create a new URI
219+
referencing the media, including an HMAC to prevent tampering. For example:
220+
221+
```
222+
https://<bridge_server>/media/{originServerName}/{mediaId}?mac={hmac}
223+
```
224+
225+
When the bridge later receives a request to that URI, it checks the hmac,
226+
and proxies the request to the homeserver, using its AS access
227+
token in the `Authorization` header.
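
A bridge could generate and check such links roughly as follows. This is a minimal sketch, not a prescribed scheme: HMAC-SHA256, the hex encoding, the exact message format that is signed, and the function names are all assumptions.

```python
import hashlib
import hmac
from urllib.parse import quote


def sign_media(key, server_name, media_id):
    """Compute a hex HMAC-SHA256 over the origin server name and media ID."""
    message = "{}/{}".format(server_name, media_id).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def build_proxy_url(bridge_base, key, server_name, media_id):
    """Build the link that the bridge forwards to the remote platform."""
    return "{}/media/{}/{}?mac={}".format(
        bridge_base,
        quote(server_name, safe=""),
        quote(media_id, safe=""),
        sign_media(key, server_name, media_id),
    )


def verify_mac(key, server_name, media_id, mac):
    """Check a received MAC in constant time before proxying the request."""
    expected = sign_media(key, server_name, media_id)
    return hmac.compare_digest(expected.encode("utf-8"), mac.encode("utf-8"))


key = b"bridge-secret-key"  # placeholder; a real key would be random and private
mac = sign_media(key, "example.org", "abc")
print(build_proxy_url("https://bridge.example.org", key, "example.org", "abc"))
print(verify_mac(key, "example.org", "abc", mac))
print(verify_mac(key, "example.org", "other", mac))
```

Only after `verify_mac` succeeds would the bridge fetch the media from the homeserver with its AS access token.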
228+
229+
This mechanism also works for a secondary use of the content repository in
230+
bridges: as a paste-bin. In this case, the bridge simply generates a link
231+
referencing its own media.
232+
233+
The bridge might also choose to embed information such as the room that
234+
referenced the media, and the time that the link was generated, in the URL.
235+
This could be used to help control access to the media.
236+
237+
Such mechanisms would allow the bridge to impose controls such as:
238+
239+
* Limiting the time a media link is valid for. Doing so would help prevent
240+
visibility to users who weren't participating in the chat.
241+
242+
* Rate-limiting the amount of media being shared in a particular room (in other
243+
words, avoiding the use of Matrix as a Warez distribution system).
244+
245+
#### Redacting events
246+
247+
Under this proposal, servers can determine which media is referenced by an
248+
event that is redacted, and add that media to a list to be cleaned up.
249+
250+
This would also apply if all users in a room are deactivated (either via a GDPR
251+
section 17 request or by a self-requested "Deactivate account" request). In
252+
this case, all events in the room, and all media referenced by them, should be
253+
removed. Currently, Synapse does not support removing the events (see also
254+
[synapse#4720](https://github.com/matrix-org/synapse/issues/4720)); but if at
255+
some point in the future this is added, then this proposal makes it easy to
256+
extend to deleting the media.
257+
258+
Fixes [synapse#1263](https://github.com/matrix-org/synapse/issues/1263).
259+
260+
#### Icons for "social login" flows
261+
262+
When a server supports multiple login providers, it provides the client with
263+
icons for the login providers as `mxc:` media URIs. These must be accessible
264+
without authentication (because the client has no access token at the time the
265+
icons are displayed).
266+
267+
This remains a somewhat unsolved problem. Possibly the clients can continue
268+
to call the legacy `/_matrix/media/v3/download` URI for now: ultimately this
269+
problem will be solved by the transition to OIDC. Alternatively, we may need
270+
to provide an alternative `/_matrix/client/v3/login/sso/icon/{idpId}` API
271+
specifically for access to these icons.
272+
273+
We also need to ensure that the icons are not deleted from the content
274+
repository even though they have not been attached to any event or profile. It
275+
would be wise for servers to provide administrators with a mechanism to upload
276+
media without attaching it to anything.
277+
278+
(This was previously discussed in
279+
[MSC2858](https://github.com/matrix-org/matrix-spec-proposals/pull/2858#discussion_r543513811).)
280+
281+
## Potential issues
282+
283+
* Setting the `Authorization` header is going to be annoying for web clients. Service workers
284+
might be needed.
285+
286+
* Users will be unable to copy links to media from web clients to share out of
287+
band. This is considered a feature, not a bug.
288+
289+
* Since each `m.room.member` references the avatar separately, changing your
290+
avatar will cause an even bigger traffic storm if you're in a lot of rooms.
291+
292+
## Alternatives
293+
294+
* Use some sort of "content token" for each piece of media, and require clients to
295+
provide it, per [MSC3910](https://github.com/matrix-org/matrix-spec-proposals/pull/3910).
296+
297+
* Allow clients to upload media which does not require authentication (for
298+
example via a `public=true` query parameter). This might be particularly
299+
useful for IRC/XMPP bridges, which could upload any media they encounter to
300+
the homeserver's repository.
301+
302+
The danger with this is that there's little stopping clients
303+
continuing to upload media as "public", negating all of the benefits in this
304+
MSC. It might be ok if such media upload was restricted to certain privileged
305+
users.
306+
307+
* Have the "upload" endpoint return a nonce, which can then be used in the
308+
"send" endpoint in place of the `mxc` uri. It's hard to see what advantage
309+
this gives, beyond the fact that a nonce could be smaller, so marginally fewer
310+
bytes to send.
311+
312+
## Security considerations
313+
314+
* Letting servers track the relationship between events and media is a metadata
315+
leak, especially for e2ee rooms.
316+
317+
## Unstable prefix
318+
319+
TODO
320+
321+
## Dependencies
322+
323+
None at present.
324+
325+
## Prior art
326+
327+
* Credit: this is based on ideas from @jcgruenhage and @anoadragon453 at
328+
https://cryptpad.fr/code/#/2/code/view/oWjZciD9N1aWTr1IL6GRZ0k1i+dm7wJQ7juLf4tJRoo/
329+
330+
* [MSC~~701~~3796](https://github.com/matrix-org/matrix-spec-proposals/issues/3796):
331+
a predecessor of this proposal
332+
333+
* [MSC2461](https://github.com/matrix-org/matrix-spec-proposals/pull/2461):
334+
adds per-user authentication but does not attempt to restrict access to
335+
individual items of media.
336+
337+
* [MSC2278](https://github.com/matrix-org/matrix-spec-proposals/pull/2278):
338+
Deleting attachments for expired and redacted messages
339+
340+
* [MSC1902](https://github.com/matrix-org/matrix-spec-proposals/pull/1902):
341+
Split the media repo into s2s and c2s parts
342+
343+
* [MSC2846](https://github.com/matrix-org/matrix-spec-proposals/pull/2846):
344+
Decentralizing media through CIDs
345+
346+
