diff --git a/proposals/4139-bot-buttons.md b/proposals/4139-bot-buttons.md
new file mode 100644
index 00000000000..5ee6e758517
--- /dev/null
+++ b/proposals/4139-bot-buttons.md
@@ -0,0 +1,201 @@
+# MSC4139: Bot buttons & conversations
+
+Nearly all bots and bridges in the Matrix ecosystem use a text-based interface to support their
+operations. These interfaces are typically highly structured commands and require the user to know
+the entire incantation for the action they want to invoke, making them feel like "power user"
+features.
+
+Further, interacting with bots today is extremely transactional: the user sends a command and the
+bot performs the action as-is or spews errors back at the user due to a typo. If an error was
+returned, the entire command needs to be re-run.
+
+A more user-friendly approach is to have the user provide the bot with information as needed,
+without having to guess at the bot's current state. This proposal calls such an approach a
+"conversation" with the bot - the user does something to "start" the conversation, and the bot
+provides a limited set of prompts to continue the conversation. This repeats until the conversation
+ends (usually by the bot saying so explicitly). Users may hold multiple concurrent conversations
+with bots. Conversation starters are deliberately left as a bot implementation detail in this
+proposal to allow the ecosystem to explore this new interaction technique. Examples may include the
+user opening a DM with the bot, sending a `!command` message, or, in future, sending a slash command
+like `/start`.
+
+This conversation approach is heavily inspired by platforms like Telegram.
+
+## Proposal
+
+A new `m.prompts` [mixin](https://github.com/matrix-org/matrix-spec-proposals/blob/main/proposals/1767-extensible-events.md#mixins-specifically-allowed)
+is specified which describes actions another user in the room can take to further the conversation.
+
+The `m.prompts` mixin contains some scoping parameters, rendering hints, and the actual prompts
+themselves. For example, when applied to an `m.message` event, the `m.prompts` may look like the
+following:
+
+*Note*: The JSON comments are normative, and irrelevant fields are not shown.
+
+```jsonc
+{
+  "type": "m.message",
+  "sender": "@bot:example.org",
+  "content": {
+    "m.text": [
+      {"body": "Hello! Say !roll [dice] to roll some dice.", "mimetype": "text/html"},
+      {"body": "Hello! Say `!roll [dice]` to get started."}
+    ],
+    "m.prompts": {
+      // Clients which recognize `m.prompts` would use `intro` to render the event instead. This
+      // allows the remainder of the event to be a fallback for unsupported clients.
+      "intro": {
+        "type": "m.message",
+        "content": {
+          "m.text": [
+            {"body": "Hello! What would you like to roll today?"}
+          ]
+        }
+      },
+      // These are the users who should see the `prompts`. Other users may see something like "you
+      // do not have permission to reply to this message" instead of prompts. `scope` is optional:
+      // when not supplied, all users who can see the message can respond. When an empty array, no
+      // one can respond. Clients SHOULD NOT show prompts to users who are descoped.
+      "scope": [
+        "@alice:example.org",
+        "@bob:example.org"
+      ],
+      // These are the options a user has. Note the 2 distinct types and 3 label approaches.
+      "prompts": [
+        {
+          // `type` is the prompt type: "preset" (show a button) or "input" (shown below)
+          "type": "preset",
+          // `id` is used by the bot to figure out what prompt the user picked. It is an opaque ID.
+          "id": "1d6",
+          // `label` is an extensible event with deliberately no `type`.
+          "label": {
+            "m.text": [{"body": "1 six sided die"}]
+          }
+        }, {
+          "type": "preset",
+          "id": "surprise",
+          "label": {
+            // This should render as an image event, hopefully
+            // Requires https://github.com/matrix-org/matrix-spec-proposals/pull/3552
+            "m.text": [{"body": "🎲❓"}], // fallback
+            "m.file": {
+              "url": "mxc://example.org/abc123"
+            },
+            "m.image_details": {
+              // Clients should impose maximums and minimums here.
+              "width": 16,
+              "height": 16
+            },
+            "m.alt_text": {
+              "m.text": [{"body": "An image of a 6 sided die with a red question mark over it"}]
+            }
+          }
+        }, {
+          "type": "input",
+          "id": "custom",
+          // Regex the client can use to test input locally. Optional - if not provided the client
+          // should accept *any* input, including an empty string.
+          "validator": "[0-9]+d[0-9]+", // `2d20`, etc
+          "label": {
+            "m.text": [{"body": "Other"}]
+          }
+        }
+      ]
+    }
+  }
+}
+```
+
+In this example, clients which don't support the mixin will see the old-style `!roll [dice]` help text,
+allowing the user to continue interacting if needed. Over time, bots may wish to drop this fallback
+style and instead use a message like `Hello! Your client doesn't support talking to me :(`.
+
+Clients which do support `m.prompts` will instead render the `intro` object as the event. It's not
+required that the `intro.type` matches the top level event `type`, though it is considered good
+practice to do so. The `intro` block is primarily intended to allow senders to tailor their message
+for supported clients, as the intent for this proposal is to discourage commands like `!roll` where
+possible.
+
+Prompts SHOULD be rendered in order of the array, and appear below the `intro` rendering. Buttons
+SHOULD be used for `preset` prompts, using the provided `label`, and text inputs with `label` as a
+prefix or placeholder, and validation per `validator`, SHOULD be used for `input` prompts.
+For example:
+
+![](./images/4139-01-dice-bot-welcome.png)
+
+[Codepen](https://codepen.io/turt2live/pen/gOyVvaY) (note: doesn't do validation)
+
+The user is then able to click on one of the buttons or submit text through the `input` option. That
+reply looks as follows:
+
+```jsonc
+{
+  "type": "m.conversation.reply",
+  "sender": "@alice:example.org",
+  "content": {
+    "m.in_reply_to": { // TODO: Change to match Extensible Event replies
+      "event_id": "$previousMessage",
+      "rel_type": "m.thread" // yes, we use threads!
+    },
+    // Whichever option the user clicked is described here in a new content block.
+    "m.used_prompt": {
+      "id": "surprise"
+    },
+    // We then add all the fallback representations. For `preset` prompts, this is typically just
+    // the `label` verbatim. `input` prompts may require some creative editing, like "Other: 2d20".
+    "m.text": [{"body": "🎲❓"}], // fallback for the image
+    "m.file": {
+      "url": "mxc://example.org/abc123"
+    },
+    "m.image_details": {
+      "width": 16,
+      "height": 16
+    },
+    "m.alt_text": {
+      "m.text": [{"body": "An image of a 6 sided die with a red question mark over it"}]
+    }
+  }
+}
+```
+
+The bot can then process this and continue the conversation as needed, using more `m.prompts` mixins
+to get the information it needs from the user. If the bot considers the conversation/thread to be
+complete, it sends an event with no `m.prompts` mixin to the thread. In our example of a dice bot,
+this could be the result of the roll.
+
+Once a user has picked (and sent) a prompt, the client SHOULD disable the user's ability to send
+another. This could be done by hiding all options, or using the HTML `disabled` attribute.
+
+The example dice bot would then start a new conversation by sending a new welcome message, likely
+with different text to feel less mechanical. For example: "What are we rolling next? [1d6] [...]".
+
+It is left as a bot implementation detail to handle multiple responses, responses from descoped
+users, and invalid input.
+Typically this would be handled by the bot using a threaded reply to the sender saying "sorry, you
+don't have permission to interact here" or "sorry, I didn't catch that. [same prompts as original message]".
+
+## Potential issues
+
+TODO
+
+## Alternatives
+
+[MSC3006](https://github.com/matrix-org/matrix-spec-proposals/pull/3006) is very similar to this
+proposal. Instead of starting per-message threads, it defines interactions via a state event. This
+makes MSC3006 more akin to a "conversation starter" replacement, to use this MSC's terminology.
+
+## Security considerations
+
+TODO
+
+## Unstable prefix
+
+While this proposal is not considered stable, clients should use `org.matrix.msc4139.` in place of
+`m.` in all identifiers.
+
+TODO: Language to support usage in room versions without Extensible Events support, similar to
+[MSC3381: Polls](https://github.com/matrix-org/matrix-spec-proposals/blob/main/proposals/3381-polls.md).
+
+## Dependencies
+
+This MSC has no direct dependencies.
diff --git a/proposals/images/4139-01-dice-bot-welcome.png b/proposals/images/4139-01-dice-bot-welcome.png
new file mode 100644
index 00000000000..54a29679de7
Binary files /dev/null and b/proposals/images/4139-01-dice-bot-welcome.png differ