Skip to content

MSC2545: Image Packs (Emoticons & Stickers) #2545

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 55 commits into
base: old_master
Choose a base branch
from

Conversation

Sorunome
Copy link
Contributor

@Sorunome Sorunome commented May 15, 2020

Rendered

Signed-off-by: Sorunome [email protected]

Author: @Sorunome
Shepherd: @anoadragon453

Future MSCs

Current implementations

Emote rendering (rendering of the <img> tag)

  • element-web
  • revolution (even handles data-mx-emoticon)
  • nheko
  • fluffychat (even handles data-mx-emoticon)
  • neochat
  • cinny

Sending, using the mentioned events here

  • revolution (emoticons)
  • fluffychat (emoticons, stickers)
  • neochat (emoticons, at least partial)
  • nheko (emoticons, stickers)
  • cinny (emoticons, stickers)
  • kazv (stickers)

Bridges

  • mx-puppet-discord
  • matrix-appservice-discord (partially, sending data-mx-emoticon only)

Implementation PRs

FluffyChat

Data model: https://gitlab.com/famedly/company/frontend/libraries/matrix_api_lite/-/merge_requests/26
SDK: https://gitlab.com/famedly/company/frontend/famedlysdk/-/merge_requests/726
Emoticons: https://gitlab.com/famedly/fluffychat/-/merge_requests/433
Stickers: https://gitlab.com/famedly/fluffychat/-/merge_requests/452

Nheko

Stickers: Nheko-Reborn/nheko#648
Sticker editor: Nheko-Reborn/nheko#669
Choosing emoticons: Nheko-Reborn/nheko@ea6b19b

Cinny

Emoticons and Stickers: cinnyapp/cinny#686

kazv

Creating and sending stickers: https://lily-is.land/kazv/kazv/-/merge_requests/71


SCT:

FCP tickyboxes

MSC Checklist

@anoadragon453
Copy link
Member

People wanting to try out an implementation of custom emotes on the web can use their existing homeserver account through this riot-web instance.

@turt2live turt2live added kind:feature MSC for not-core and not-maintenance stuff proposal A matrix spec change proposal labels May 15, 2020
@turt2live turt2live self-requested a review May 15, 2020 14:32
@turt2live turt2live marked this pull request as draft May 15, 2020 14:32
@MTRNord
Copy link
Contributor

MTRNord commented May 15, 2020

People wanting to try out an implementation of custom emotes on the web can use their existing homeserver account through this riot-web instance.

Fluffychat Android works too :)

Copy link
Member

@turt2live turt2live left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally seems to be in a sensible direction, though the duplicated effort for custom emoji is a bit bothersome :(

@MatMaul
Copy link

MatMaul commented May 20, 2020

What about adding a mandatory level of pack indirection ?

im.ponies.room_emotes would only contains something like:

{
  "packs": ["im.ponies.room_emotes.mypack1", "im.ponies.room_emotes.mypack2"]
}

and then use what is proposed here inside im.ponies.room_emotes.mypackX state events.

This allows bundling and updating packs independently, and overcome the 65k limit by quite a margin.

@Sorunome
Copy link
Contributor Author

What about adding a mandatory level of pack indirection ?

im.ponies.room_emotes would only contains something like:

{
  "packs": ["im.ponies.room_emotes.mypack1", "im.ponies.room_emotes.mypack2"]
}

and then use what is proposed here inside im.ponies.room_emotes.mypackX state events.

This allows bundling and updating packs independently, and overcome the 65k limit by quite a margin.

Not quite sure what you mean, here. That, for room emotes, you can specify other state keys that just extend that one pack?

While not explicitly stated in this MSC (yet?), if multiple packs have the same pack.short they are expected to be merged together - that way you can also overcome the 65k limit, however, when adding it to your account data you will have to add multiple entries for just a single pack. Maybe it would be better to explicitly add related state keys?

@MatMaul
Copy link

MatMaul commented May 20, 2020

Ok nevermind, I misread. I thought im.ponies.room_emotes was the state key, not the type.

@theotheroracle
Copy link

how hard would it be to add versioning and uuids to emoji packs
like for example if i'm in a room with a sticker pack that matches the UUID and version of the pack i'm using, then it won't add those emotes to my emote list

@Sorunome
Copy link
Contributor Author

Sorunome commented Jun 7, 2020

how hard would it be to add versioning and uuids to emoji packs
like for example if i'm in a room with a sticker pack that matches the UUID and version of the pack i'm using, then it won't add those emotes to my emote list

You already have versioning with state events - or do you mean something like v1.0? It would be trivial to add more information like that, or a uuid, to the pack object.

That being said, what if multiple packs have the same uuid but different emotes? Merge them? What if you spread one pack over multiple state events to overcome the 65k chars limit and then give them all the same uuid? Should they appear as one pack? How should that work with emote_rooms?

@theotheroracle
Copy link

theotheroracle commented Jun 7, 2020

how hard would it be to add versioning and uuids to emoji packs
like for example if i'm in a room with a sticker pack that matches the UUID and version of the pack i'm using, then it won't add those emotes to my emote list

You already have versioning with state events - or do you mean something like v1.0? It would be trivial to add more information like that, or a uuid, to the pack object.

That being said, what if multiple packs have the same uuid but different emotes? Merge them? What if you spread one pack over multiple state events to overcome the 65k chars limit and then give them all the same uuid? Should they appear as one pack? How should that work with emote_rooms?

if emotes are added to a pack, then the version number should be changed, and yeah i meant like a 1.0 thing, if multiple packs have the same uuid and the same version number, but different emotes, then the conflict should be reported to the user, after thinking about it might make more sense to have a last modified date instead of a version number, and it just uses the newest one, because if you want to remove an emote from a pack, then it shouldn't add it back from an older version of the pack, and i don't see how any of this is affected by it being in a room or a user account

@Sorunome
Copy link
Contributor Author

Sorunome commented Jun 7, 2020

and i don't see how any of this is affected by it being in a room or a user account

emote_rooms is where you say that you want emotes in a room (by room id and by state key) visible everywhere for yourself. How to handle now, if you spread one pack over multiple state events by putting the same uuid? Do you have to enable all the state keys? Should it autodetermine? If the later, wouldn't that be complex for client development?

and yeah i meant like a 1.0 thing, if multiple packs have the same uuid and the same version number, but different emotes, then the conflict should be reported to the user

Would a user care about the version of an emote pack? How to determine which version is newer? Grammer for version numbers would need to be defined, etc.

@Sorunome
Copy link
Contributor Author

Sorunome commented Jun 7, 2020

Like, maybe the uuid and versioning stuff could be added in a separate msc building onto this - the goal of this was to keep things as simple as possible

@theotheroracle
Copy link

emote_rooms is where you say that you want emotes in a room (by room id and by state key) visible everywhere for yourself. How to handle now, if you spread one pack over multiple state events by putting the same uuid? Do you have to enable all the state keys? Should it autodetermine? If the later, wouldn't that be complex for client development?

ok if one pack was spread over multiple events, i would have them share the uuid and also have part numbers, like "part":1 "part":2 etc for A, mainting the order of emotes, and B, to prevent them from overlapping eachother.

Would a user care about the version of an emote pack? How to determine which version is newer? Grammer for version numbers would need to be defined, etc.

state events have timestamps right, so you could just use the newest state event instead of bothering with version numbers

@Sorunome
Copy link
Contributor Author

Sorunome commented Jun 7, 2020

state events have timestamps right, so you could just use the newest state event instead of bothering with version numbers

origin_server_ts can't be trusted and can easily be forged

ok if one pack was spread over multiple events, i would have them share the uuid and also have part numbers, like "part":1 "part":2 etc for A, mainting the order of emotes, and B, to prevent them from overlapping eachother.

What would it need part for? It shouldn't matter if they are ordered correctly or not, emotes are an unordered set

@theotheroracle
Copy link

state events have timestamps right, so you could just use the newest state event instead of bothering with version numbers

oh you could just put a modified tag in then with like unixtime or something

What would it need part for? It shouldn't matter if they are ordered correctly or not, emotes are an unordered set

the most important reason for the part value is to distinguish continuations of the pack from updates and being able to order the emotes could matter to some pack maintainers, i would want to be able to do that for example

@Sorunome
Copy link
Contributor Author

Sorunome commented Jun 7, 2020

oh you could just put a modified tag in then with like unixtime or something

In a federated system there is no one true timesource. That is a mathematically unsolvable problem.

the most important reason for the part value is to distinguish continuations of the pack from updates and being able to order the emotes could matter to some pack maintainers, i would want to be able to do that for example

Updates would replace the previous state event. You don't need any "part" attribute for that. As for ordering, you could say they are ordered alphanumerically by the state key.

Copy link
Member

@anoadragon453 anoadragon453 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello! I have heavily refactored the MSC in 5663906 to make it clearer.

The one functional change buried in that commit was clarifying that the alt attribute of a custom emote should be a description of the image, whereas the title should be the shortcode (what one would expect to appear in a tooltip). This felt more in line with what clients do today, and with the original intention for those attributes in the HTML spec.

Below are some outstanding questions I had after closely reading through the MSC again. Please weigh in to help this MSC move towards FCP 🚂

Comment on lines +203 to +211
#### `m.image_pack.rooms` account data event

The `m.image_pack.rooms` account data event consists of the following:

- `rooms` **Required, Map[String, Map[String, Object]]**. A map of room ID to
another map: from `state_key` to an empty object.

This data structure allows specifying a single room image pack given the pack's
room ID and `state_key`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we combine the m.image_pack and m.image_pack.rooms account data events into a single account data event? Such that m.image_pack would have a rooms field with the contents of the existing m.image_pack.rooms account data event?

My initial assumption for why this was originally done was account data event size limits. But actually, I don't think individual account data events have event size limits? Instead, it would make more sense for a homeserver to limit the total amount of account data a user could store.

So, what purpose is there in having two separate account data event types defined?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have question(s) about this:

  1. Users can define global packs for their account to be used in rooms or users could specify set of emojis only to be used in specific rooms?

  2. Would admin/moderator allow and upload what they think its best suiting for room if its community driven

  3. Can we limit that specific rooms dont allow emojis or just allow emojis defined by room from question 2

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder, is there precedence for subtyping events like this? I wonder if it is wise to do. That would be an argument in favor of combining.

I would assume this was originally done so m.image_pack is the same as state and account data event; easier to write parsers and spec. This would be an argument against combining.

Given how widely adopted this MSC is, I wonder how fundamentally we should still change it, or at least consider the impact of these changes to the migrations necessary to implement. I this case the migration should be fairly easy (if .rooms is found in account data, move and delete it).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Destinyg133

Users can define global packs for their account to be used in rooms or users could specify set of emojis only to be used in specific rooms?

The former.

Would admin/moderator allow and upload what they think its best suiting for room if its community driven

I'm not sure I understand the question.

Can we limit that specific rooms dont allow emojis or just allow emojis defined by room from question 2

If you're asking whether a room admin can prevent custom emojis from packs that aren't defined in that room, then no. Just as a room admin can't prevent only certain images from being sent in a room today.

They could do it via a bot that analyses messages, but not through the auth rules defined by the spec (at least in today's room versions).

@HarHarLinks

I wonder, is there precedence for subtyping events like this? I wonder if it is wise to do. That would be an argument in favor of combining.

Good question. Looks like there is some precedence similar to the naming convention:

  • SSSS: m.secret_storage.key.<key_id>, m.cross_signing.master, m.cross_signing.user_signing
  • MSC3890: org.matrix.msc3890.local_notification_settings.<device_id>

Though I think rather than trying to split up a large event into smaller chunks, this naming is simply a matter of namespacing (i.e. everything really to cross signing should be called m.cross_signing.*.

So perhaps that's what this proposal was originally aiming to do.

I would assume this was originally done so m.image_pack is the same as state and account data event; easier to write parsers and spec. This would be an argument against combining.

Personally I would advise against this, for if m.image_pack the state event or account data event ever diverge in the spec, such combined implementation will have to be separated again.


I actually don't really see a reason to combine these objects. Plus splitting them up does mean that a client can query them separately if desired. Given it would be more churn for existing implementations of this MSC to switch, I'm minded to keep things how they are.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there is some precedence similar to the naming convention:

* SSSS: `m.secret_storage.key.<key_id>`, `m.cross_signing.master`, `m.cross_signing.user_signing`

* [MSC3890](https://github.com/matrix-org/matrix-doc/pull/3890): `org.matrix.msc3890.local_notification_settings.<device_id>`

These look like different cases? Here in MSC2545 we propose to have both m.image_pack and m.image_pack.rooms. Does a m.cross_signing (not m.cross_signing.*!) event exist that my Ctrl+F is missing?

Besides that, yes I am sure this proposal was aiming to sort its events into a common namespace, which is more than reasonable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These look like different cases? Here in MSC2545 we propose to have both m.image_pack and m.image_pack.rooms. Does a m.cross_signing (not m.cross_signing.*!) event exist that my Ctrl+F is missing?

I suppose they are slightly different.

We could aim for m.image_pack.user instead of m.image_pack to be more consistent. It wouldn't break backwards-compatibility either as clients today are using im.ponies.user_emotes. I'm not too fussed either way though - curious what others think.

Copy link
Contributor

@HarHarLinks HarHarLinks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 Overall, I can't wait for this to be officially approved. After all, it's shadow-spec for a couple years at this point ;) I do however think that improvements that enable better sharing features in this or a timely followup MSC would be important to enable shipping a complete feature.

Comment on lines +203 to +211
#### `m.image_pack.rooms` account data event

The `m.image_pack.rooms` account data event consists of the following:

- `rooms` **Required, Map[String, Map[String, Object]]**. A map of room ID to
another map: from `state_key` to an empty object.

This data structure allows specifying a single room image pack given the pack's
room ID and `state_key`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder, is there precedence for subtyping events like this? I wonder if it is wise to do. That would be an argument in favor of combining.

I would assume this was originally done so m.image_pack is the same as state and account data event; easier to write parsers and spec. This would be an argument against combining.

Given how widely adopted this MSC is, I wonder how fundamentally we should still change it, or at least consider the impact of these changes to the migrations necessary to implement. I this case the migration should be fairly easy (if .rooms is found in account data, move and delete it).

Copy link
Member

@anoadragon453 anoadragon453 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the feedback, all! I've responded to or incorporated your comments below.

Comment on lines +203 to +211
#### `m.image_pack.rooms` account data event

The `m.image_pack.rooms` account data event consists of the following:

- `rooms` **Required, Map[String, Map[String, Object]]**. A map of room ID to
another map: from `state_key` to an empty object.

This data structure allows specifying a single room image pack given the pack's
room ID and `state_key`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Destinyg133

Users can define global packs for their account to be used in rooms or users could specify set of emojis only to be used in specific rooms?

The former.

Would admin/moderator allow and upload what they think its best suiting for room if its community driven

I'm not sure I understand the question.

Can we limit that specific rooms dont allow emojis or just allow emojis defined by room from question 2

If you're asking whether a room admin can prevent custom emojis from packs that aren't defined in that room, then no. Just as a room admin can't prevent only certain images from being sent in a room today.

They could do it via a bot that analyses messages, but not through the auth rules defined by the spec (at least in today's room versions).

@HarHarLinks

I wonder, is there precedence for subtyping events like this? I wonder if it is wise to do. That would be an argument in favor of combining.

Good question. Looks like there is some precedence similar to the naming convention:

  • SSSS: m.secret_storage.key.<key_id>, m.cross_signing.master, m.cross_signing.user_signing
  • MSC3890: org.matrix.msc3890.local_notification_settings.<device_id>

Though I think rather than trying to split up a large event into smaller chunks, this naming is simply a matter of namespacing (i.e. everything really to cross signing should be called m.cross_signing.*.

So perhaps that's what this proposal was originally aiming to do.

I would assume this was originally done so m.image_pack is the same as state and account data event; easier to write parsers and spec. This would be an argument against combining.

Personally I would advise against this, for if m.image_pack the state event or account data event ever diverge in the spec, such combined implementation will have to be separated again.


I actually don't really see a reason to combine these objects. Plus splitting them up does mean that a client can query them separately if desired. Given it would be more churn for existing implementations of this MSC to switch, I'm minded to keep things how they are.

@shootie22

This comment was marked as off-topic.

@anoadragon453
Copy link
Member

All threads that don't require input from a wider audience have been resolved. As such, I propose this MSC be considered by the rest of the SCT:

@mscbot fcp merge

@mscbot
Copy link
Collaborator

mscbot commented May 20, 2025

Team member @mscbot has proposed to merge this. The next step is review by the rest of the tagged people:

Concerns:

  • checklist incomplete

Once at least 75% of reviewers approve (and there are no outstanding concerns), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for information about what commands tagged team members can give me.

@mscbot mscbot added proposed-final-comment-period Currently awaiting signoff of a majority of team members in order to enter the final comment period. disposition-merge labels May 20, 2025
@anoadragon453
Copy link
Member

MSCs proposed for Final Comment Period (FCP) should meet the requirements outlined in the checklist prior to being accepted into the spec. This checklist is a bit long, but aims to reduce the number of follow-on MSCs after a feature lands.

SCT members: please check off things you check for, and raise a concern against FCP if the checklist is incomplete. If an item doesn't apply, prefer to check it rather than remove it. Unchecking items is encouraged where applicable.

Checklist:

  • Are appropriate implementation(s)
    specified in the MSC’s PR description?
  • Are all MSCs that this MSC depends on already accepted?
  • For each new endpoint that is introduced:
    • Have authentication requirements been specified?
    • Have rate-limiting requirements been specified?
    • Have guest access requirements been specified?
    • Are error responses specified?
      • Does each error case have a specified errcode (e.g. M_FORBIDDEN) and HTTP status code?
        • If a new errcode is introduced, is it clear that it is new?
  • Will the MSC require a new room version, and if so, has that been made clear?
    • Is the reason for a new room version clearly stated? For example,
      modifying the set of redacted fields changes how event IDs are calculated,
      thus requiring a new room version.
  • Are backwards-compatibility concerns appropriately addressed?
  • Are the endpoint conventions honoured?
    • Do HTTP endpoints use_underscores_like_this?
    • Will the endpoint return unbounded data? If so, has pagination been considered?
    • If the endpoint utilises pagination, is it consistent with
      the appendices?
  • An introduction exists and clearly outlines the problem being solved.
    Ideally, the first paragraph should be understandable by a non-technical audience.
  • All outstanding threads are resolved
    • All feedback is incorporated into the proposal text itself, either as a fix or noted as an alternative
  • While the exact sections do not need to be present,
    the details implied by the proposal template are covered. Namely:
    • Introduction
    • Proposal text
    • Potential issues
    • Alternatives
    • Dependencies
  • Stable identifiers are used throughout the proposal, except for the unstable prefix section
    • Unstable prefixes consider the awkward accepted-but-not-merged state
    • Chosen unstable prefixes do not pollute any global namespace (use “org.matrix.mscXXXX”, not “org.matrix”).
  • Changes have applicable Sign Off from all authors/editors/contributors
  • There is a dedicated "Security Considerations" section which detail
    any possible attacks/vulnerabilities this proposal may introduce, even if this is "None.".
    See RFC3552 for things to think about,
    but in particular pay attention to the OWASP Top Ten.

@mscbot concern checklist incomplete

@mscbot mscbot added the unresolved-concerns This proposal has at least one outstanding concern label May 20, 2025
@turt2live turt2live removed the blocked Something needs to be done before action can be taken on this PR/issue. label May 20, 2025
@github-project-automation github-project-automation bot moved this to Needs idea feedback / initial review in Spec Core Team Backlog May 20, 2025
@turt2live turt2live moved this from Needs idea feedback / initial review to Ready for FCP ticks in Spec Core Team Backlog May 20, 2025
Copy link
Member

@turt2live turt2live left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for working on this. I haven't read the MSC in detail, but the parts I did read look more than sufficient for starting FCP in my opinion.

@turt2live turt2live assigned anoadragon453 and unassigned turt2live May 20, 2025
Comment on lines 310 to 317
1. User image pack (defined in the user's account data)
2. Image pack rooms (defined in the `m.image_pack.rooms` user account data
object)
3. Space image packs (defined in the hierarchy of canonical spaces for the
current room)
4. Room image packs (defined in the currently open room's state)
5. Referenced room image packs (defined in the `m.image_pack.rooms` room
state event)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect the room to take priority over the space since it is more "local" to the current context.


### Shortcode grammar

A shortcode's length MUST not exceed 100 bytes. This restriction MUST be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it need to be valid Unicode?

Copy link
Member

@anoadragon453 anoadragon453 May 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I've done some research to try and figure this out. To clarify, there's a difference between "valid UTF-8 byte sequences" and "valid Unicode scalar values".

Not every byte sequence can be decoded as valid UTF-8. Therefore, if we allowed any sequence of bytes (as the proposal currently does), text decoders may panic upon attempting to blindly decode them. For this reason, we absolutely should restrict the grammar to only "byte sequences that can be decoded as UTF-8".

Separately, the set of "valid Unicode scalar values" is a range that has been stabilised since 2003 (RFC 3629) and is unlikely to change. We could restrict the grammar to this range to avoid causing any Unicode parsers to fail. This seems reasonable to me.

The Unicode consortium assign new codepoints every year in each Unicode update. It would be foolish to restrict to only the currently valid set of assigned codepoints, as that would break forwards-compatibility.

Given the above, the grammar rules could be:

  1. The value must be a well-formed UTF-8 byte sequence;
  2. After decoding, every code point must be a Unicode scalar (i.e. not a surrogate or noncharacter);
  3. Length ≤ 100 bytes;
  4. Must not contain : (U+003A) or (U+0020).

What do people think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Valid UTF-8" is somewhat implied by the fact that it has to be transferred over a JSON REST API as part of an event (though it's certainly the case that the spec could do better on this: it's one of the many things discussed in matrix-org/matrix-spec#365). Anyway, I wouldn't bother calling it out for this instance: clients can and do reasonably expect that they can utf-decode all event content.

Ditto the surrogates, and (fwiw) characters above U+10FFFF.

You mention "noncharacters": are you talking about things like U+FDD0? We don't special-case those elsewhere afaik, so I wouldn't do so here.

More to the point, though, I question why we are specifying this in bytes rather than unicode codepoints. All the clients and servers I know of work in unicode codepoints at this level; to determine if a shortcode is longer than 100 bytes they would have to re-encode those strings back into utf-8 and see what the result is. (Admittedly I haven't actually looked at server impls other than Synapse, so maybe this is bogus logic, but remember that to even get as far as inspecting a shortcode you've had to deserialize the JSON into some sort of higher-level object).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we don't even need a length limit? suppose somebody makes a shortcode that is 62000 bytes long. What have they achieved, other than filling their entire image pack with a single shortcode?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually -- are we limited by what's allowed in an event/message already here? Otherwise you might have a mostly correct message with an invalid short-code?

Comment on lines +438 to +442
If there are multiple emotes with the same shortcode available, the client could
for example slugify the containing pack's display name and prepend it to each
emote's shortcode with a separator character (i.e. `~`). For instance, the user
could enter `:my-pack~emote:` to select an image with shortcode `emote` in the
pack `my pack`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could still have conflicts, maybe using : as a separator makes more sense? :my-pack:emote:?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it will conflict, good point! And : is much easier to type.

It's important to note that this is purely a suggestion for clients to add a feature to auto-complete emotes when typing. It does not need to be a unique representation of the emote. For instance, two image packs could have the same exact name as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to image pack names not being globally unique, I think this suggestion of the proposal doesn't work:

if the user enters :emote: in a message, the client could replace that text with the corresponding emote based on the shortcode.

If there are two image packs with the same name, then the parser won't be able to deterministically figure out which one to pick. The only solution to that is to include the room ID and event ID - but users aren't going to use that :)

I'm going to strip that suggestion as a result and replace it with a warning. But the search UI is still a nice idea (bonus points for fuzzy search!).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon giving this some more thought, using : again as the separator does limit client functionality.

If a user has typed That was funny :smile: into a message window, upon hitting the second :, a client's emote UI component would not easily know whether the user:

  1. Wants to start filtering for results from the "smile" image pack, or
  2. Close the search UI (and potentially pick the last highlighted emote).

Whereas typing a distinct character, like ~ would signal 1, and typing : would signal 2.

So, to give clients' more flexibility in designing their UI/UX, I think keeping a distinct character would make sense. However, that does mean that we'll want to exclude that character from the shortcode grammar. Otherwise, as @clokep says, we'll get conflicts if that character is part of the shortcode itself.

Looking at other third-party messaging apps;

  • Discord uses <:name:ID> (< and > are boundaries, : is the separator)
  • Slack uses :pack/shortcode:

I like / as a separator, more than ~. ~ is harder to type for more people, and / already universally indicates "hierarchy" thanks to filesystems and web addresses.

Alternatively, we could use (space) which is already barred by the grammar (to make shortcodes easier to read). But I feel that's a fairly confusing UX for a separator, versus /.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer putting the emoji name before the pack name. This increases UI consistency if user forgets whether that emoji name is unique (still starts out with typing :towel), and if user wants to type emoji name or pack name then click/touch the correct emoji, typing emoji name will likely return fewer choices than pack name.

I think @ would be the best candidate, so you get (for example) :towel@h2g2: (towel emoji @ h2g2 emoji pack).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the user need to ever see the full short code? Does it make sense to use :!roomid:.../shortcode: so that it is easy to look up for clients?

Shortcodes are only for users doing tab-completions and such, machines don't need them for anything. This whole thread is about a client UI suggestion

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I realized that after commenting, but didn't want to edit it for a fifth time. I think realistically you'll probably have an image in the UI anyway so the specific short code disambiguation doesn't matter?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, this whole discussion feels like a thing that could reasonably left up to clients, rather than worrying about it in the spec.

Copy link
Member

@anoadragon453 anoadragon453 Jun 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@richvdh the spec angle concerns whether we should bar a specific character from the shortcode grammar for clients to use in order to implement such a feature.

I agree that we should stop arguing about the order of pack name vs. emote name here though. The only relevant discussion to this forum is which character to bar from the grammar. The current contenders are:

  • /
  • @
  • _
  • ~

The current, organically evolved convention seems (?) to be :neocat_floof:, :neofox_floof:, etc. I.e. effectively : + $packname + _ + $emote_name_snake_case + :.

This only works if $packname doesn't contain any _, which feels counter-intuitive if emote names are snake_case.

I still find ~ difficult to type.

I would be in favour of reserving both / and @ (as anecdotally from my own experience, they are uncommon in emote names) and letting clients make use of them however they like. For this feature, it's okay if client A starts using / and client B starts using @.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that _ is probably not the best choice. Am personally more a fan of @ or ~, but really just anything other than _.
Thought I will say that an @ feels like it makes more sense if the pack name came after the emote name.

This avoids a split-brain in the room. Servers MAY opt to locally redact events
having too many bytes in the shortcode field.

The `:` character MUST NOT be included in the emote shortcode. The `:` character
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also formally exclude U+0000 (NUL) characters from shortcodes, as some parsers can treat that character as "end of string".

Copy link

@Gremious Gremious Jun 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it also exclude EOL characters in that vein? I kind of doubt anyone will actually use those since multiline short-codes would be strange, but I don't know if we prefer to be explicit

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just NUL seems to match https://spec.matrix.org/latest/appendices/#room-ids. I fear we might find a lot of characters in unicode that might warrant some kind of exception 🤔

Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

excites to see progress on this!


`m.image_pack` state events contain the following keys within their `content`:

* `images`: **Required, Map[String, Object]**. A map from a shortcode to an Image Object.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

link to the notes on shortcode grammar here?

Comment on lines +339 to +343
A shortcode's length MUST not exceed 100 bytes. This restriction MUST be
enforced by servers when sending reactions, but servers MUST NOT reject events
coming across federation due to having too many bytes in the shortcode field.
This avoids a split-brain in the room. Servers MAY opt to locally redact events
having too many bytes in the shortcode field.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little surprised that we're requiring servers to enforce this grammar rather than clients. Typically, we've steered clear of requiring servers to do much in the way of validation on event bodies.

We're probably going to have to answer the question "what should clients do when they see a shortcode that doesn't match the grammar" anyway (since homeservers may be old, and anyway they "MAY opt to locally redact events" rather than MUST).

Was any thought given to putting the requirement for enforcing grammar on the client rather than the server?

Comment on lines +358 to +359
Specifically, the `U+0020` character (normal space) is disallowed. There are
multiple other byte sequences associated with spaces in Unicode.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are multiple other byte sequences associated with spaces in Unicode.

... which are allowed, or not? This is unclear.


##### `alt` attribute

The `alt` attribute MUST be present. Its value SHOULD be the `body` of the emote, or if unavailable, its shortcode.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be clear, if any of these MUST requirements on the attribute are not adhered to, a receiving client should refuse to render them?

Comment on lines +438 to +442
If there are multiple emotes with the same shortcode available, the client could
for example slugify the containing pack's display name and prepend it to each
emote's shortcode with a separator character (i.e. `~`). For instance, the user
could enter `:my-pack~emote:` to select an image with shortcode `emote` in the
pack `my pack`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, this whole discussion feels like a thing that could reasonably left up to clients, rather than worrying about it in the spec.

Comment on lines +517 to +519
| `m.image_pack` state event type | `im.ponies.room_emotes` |
| `m.image_pack` account data event type | `im.ponies.user_emotes` |
| `m.image_pack.rooms` account data event type | `im.ponies.emote_rooms` |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

m.image_pack.rooms state event?

Co-authored-by: Richard van der Hoff <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge kind:feature MSC for not-core and not-maintenance stuff proposal A matrix spec change proposal proposed-final-comment-period Currently awaiting signoff of a majority of team members in order to enter the final comment period. unresolved-concerns This proposal has at least one outstanding concern
Projects
Status: Ready for FCP ticks
Status: Scheduled - v1.10
Development

Successfully merging this pull request may close these issues.