Commit ef5baec

tulir authored and turt2live committed
MSC2530: Body field as media caption (#2530)
* Proposal to use body field as media caption
* Add paragraph about relation-based captions being difficult for bridges
* Clarify how to treat body when filename is not present
* Refactor proposal text
* Fix heading size
* Add problem statement
* Add links to and quotes from current spec
* Adjust wording and quote m.audio body spec
* Clarify that m.location and m.sticker are out of scope for this proposal
* Add examples and summary of changes
* Fix JSON syntax in example
1 parent 9eaf81f commit ef5baec

1 file changed: +144 -0 lines changed

proposals/2530-body-as-caption.md
@@ -0,0 +1,144 @@
# Body field as media caption

When sending images or other attachments, users often want to include text to
convey additional information. Most chat platforms offer media captions as a
first-class feature, allowing users to choose the attachment and write text,
then send both together in one message.

Matrix currently does not enable this on the protocol level: at best, clients
can emulate the behavior by sending two messages quickly; at worst, the user
has to do that manually. Sending separate messages means it's possible for
the second message to be delayed or lost if something goes wrong.

## Proposal

This proposal allows the `filename` field from [`m.file`], and the `format` and
`formatted_body` fields from [`m.text`], to be used in all media msgtypes
(`m.image`, `m.audio`, `m.video`, `m.file`). This proposal does not affect the
`m.location` msgtype, nor the separate `m.sticker` event type: stickers already
use `body` as a description, and locations don't have file names.

If the `filename` field is present in a media message, clients should treat
`body` as a caption instead of a file name. If the `format`/`formatted_body`
fields are present in addition to `filename` and `body`, then they should take
priority as the caption text. Formatted text in media captions is rendered the
same way as formatted text in `m.text` messages.

The current spec is somewhat ambiguous as to how `body` should be handled and
the definition varies across different message types. The current spec for
[`m.image`] describes `body` as

> A textual representation of the image. This could be the alt text of the
> image, the filename of the image, or some kind of content description for
> accessibility e.g. ‘image attachment’.

while [`m.audio`] describes it as

> A description of the audio e.g. ‘Bee Gees - Stayin’ Alive’, or some kind of
> content description for accessibility e.g. ‘audio attachment’.

In practice, clients (or at least Element) use it as the file name. As a part
of adding captions, the `body` field for all media message types is explicitly
defined to be used as the file name when the `filename` field is not present.

For `m.file` messages, the [current (v1.9) spec][`m.file`] confusingly defines
`filename` as "The original filename of the uploaded file" and simultaneously
recommends that `body` is "the filename of the original upload", effectively
saying both fields should have the file name. In order to avoid (old) messages
with both fields being misinterpreted as having captions, the `body` field
should not be used as a caption when it's equal to `filename`.
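
The rule above for telling captions apart from legacy file names could be sketched roughly as follows. This is only an illustration, not part of the proposal: the function name `has_caption` and the constant `MEDIA_MSGTYPES` are hypothetical, and treating an empty `filename` string the same as a missing one is an assumption the proposal does not spell out.

```python
# Media msgtypes covered by the proposal (m.location and m.sticker are out of scope).
MEDIA_MSGTYPES = {"m.image", "m.audio", "m.video", "m.file"}


def has_caption(content: dict) -> bool:
    """Return True if `body` should be treated as a caption (hypothetical helper)."""
    if content.get("msgtype") not in MEDIA_MSGTYPES:
        return False
    filename = content.get("filename")
    body = content.get("body")
    # Without `filename`, `body` is the file name, not a caption.
    # Assumption: an empty `filename` is treated like a missing one.
    if not filename or not isinstance(body, str):
        return False
    # Old m.file events may carry the file name in both fields.
    return body != filename
```

With this sketch, an old `m.file` event whose `body` and `filename` are both `hello.txt` is reported as having no caption, while the same event with a different `body` is reported as captioned.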

[`m.file`]: https://spec.matrix.org/v1.9/client-server-api/#mfile
[`m.text`]: https://spec.matrix.org/v1.9/client-server-api/#mtext
[`m.image`]: https://spec.matrix.org/v1.9/client-server-api/#mimage
[`m.audio`]: https://spec.matrix.org/v1.9/client-server-api/#maudio

### Examples
<details>
<summary>Image with caption</summary>

```json
{
  "msgtype": "m.image",
  "url": "mxc://maunium.net/HaIrXlnKfEEHvMNKzuExiYlv",
  "filename": "cat.jpeg",
  "body": "this is a cat picture :3",
  "info": {
    "w": 479,
    "h": 640,
    "mimetype": "image/jpeg",
    "size": 27253
  },
  "m.mentions": {}
}
```

</details>
<details>
<summary>File with formatted caption</summary>

```json
{
  "msgtype": "m.file",
  "url": "mxc://maunium.net/TizWsLhHfDCETKRXdDwHoAGn",
  "filename": "hello.txt",
  "body": "this caption is longer than the file itself 🤔",
  "format": "org.matrix.custom.html",
  "formatted_body": "this <strong>caption</strong> is longer than the file itself 🤔",
  "info": {
    "mimetype": "text/plain",
    "size": 14
  },
  "m.mentions": {}
}
```

</details>
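
On the sending side, event content like the examples above might be assembled along these lines. This is a hedged sketch rather than a prescribed implementation: `build_media_content` and its parameters are hypothetical names, and falling back to `body == filename` when there is no caption is an assumption based on how `m.file` events already behave.

```python
from typing import Optional


def build_media_content(
    msgtype: str,
    url: str,
    filename: str,
    caption: Optional[str] = None,
    formatted_caption: Optional[str] = None,
    info: Optional[dict] = None,
) -> dict:
    """Assemble message content for a media event (hypothetical helper)."""
    content = {"msgtype": msgtype, "url": url, "filename": filename}
    if caption is None:
        # Assumption: with no caption, repeat the file name in `body`,
        # which receivers interpret as "no caption".
        content["body"] = filename
    else:
        content["body"] = caption
        if formatted_caption is not None:
            content["format"] = "org.matrix.custom.html"
            content["formatted_body"] = formatted_caption
    if info is not None:
        content["info"] = info
    return content
```

One consequence of this shape is that a caption identical to the file name cannot be expressed, since receivers would read it as an uncaptioned file.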

### Summary
* `filename` is defined for all media msgtypes.
* `body` is defined to be a caption when `filename` is present and not equal to `body`.
* `format` and `formatted_body` are allowed as well for formatted captions.
* `body` is defined to be the file name when `filename` is not present.
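
Taken together, the four rules above suggest a receiving client could derive the file name and caption roughly as in the following sketch. The names `MediaText` and `interpret_media` are hypothetical, an empty `filename` is assumed to count as absent, and the HTML caption is returned unsanitized, so a real client would still need to sanitize it the same way it does for `m.text`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MediaText:
    """File name and optional caption extracted from a media event (hypothetical type)."""
    filename: str
    caption: Optional[str] = None
    caption_html: Optional[str] = None


def interpret_media(content: dict) -> MediaText:
    body = content.get("body", "")
    filename = content.get("filename")
    if not filename:
        # No `filename`: `body` is the file name.
        return MediaText(filename=body)
    if body == filename:
        # Both fields hold the file name: no caption.
        return MediaText(filename=filename)
    html = None
    if content.get("format") == "org.matrix.custom.html":
        # The formatted caption takes priority for rendering when present.
        # Not sanitized here; that step is left to the caller.
        html = content.get("formatted_body")
    return MediaText(filename=filename, caption=body, caption_html=html)
```

A renderer would then show `caption_html` when it is set, fall back to `caption` otherwise, and use `filename` for downloads in every case.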

## Potential issues

In clients that don't show the file name anywhere, the caption would not be
visible at all. However, extensible events would run into the same issue.
Clients that already implement captions may even find the eventual move to
extensible events easier.

Old clients may default to using the caption as the file name when the user
wants to download a file, which makes for a somewhat awkward user experience.

## Alternatives

### [MSC2529](https://github.com/matrix-org/matrix-spec-proposals/pull/2529)

MSC2529 would allow existing clients to render captions without any changes,
but the use of relations makes implementation more difficult, especially for
bridges. It would require either waiting a predefined amount of time for the
caption to come through, or editing the message on the target platform (if
edits are supported).

The format proposed by MSC2529 would also make it technically possible to use
other message types as captions without changing the format of the events,
which is not possible with this proposal.

### Extensible events

Like MSC2529, this proposal would be obsoleted by [extensible events](https://github.com/matrix-org/matrix-spec-proposals/pull/3552).
However, fully switching to extensible events requires significantly more
implementation work, and it may take years before the necessary time is
allocated for that.

## Security considerations

This proposal doesn't involve any security-sensitive components.

## Unstable prefix

The fields being added already exist in other msgtypes, so unstable prefixes
don't seem necessary. Additionally, using `body` as a caption could already be
considered spec-compliant due to the ambiguous definition of the field, and
only adding unstable prefixes for the other fields would be silly.
