irc: unescape hexadecimal sequences in ISUPPORT parameter values #2429
|
The specification also says that:
which I believe is not currently checked by Sopel. Not sure if we want to worry about that, but if so, it should probably be part of this changeset. |
|
Refreshing to have such a small patch to review after you know what 😅 (and thank you @SnoopJ for helping with that one).
I wouldn't worry about servers violating the MUST; we can "be liberal in what you accept" and decode the escapes anyway, assuming UTF-8 in the absence of a specified CHARSET. You could add a test case where something that looks like an escape but isn't valid (what you called "pathological counter-examples" on IRC) gets handled. More importantly, the following behavior should be validated in tests as well:
If we're gonna do this, it had better not mojibake the encoded values 😉 |
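To make the mojibake concern concrete, here is a minimal sketch (not Sopel's actual code; both function names are hypothetical). It contrasts turning each `\xHH` escape into a code point one at a time with collecting the escaped octets as bytes and decoding them together. It assumes UTF-8 when no charset is given, as suggested above, and leaves anything that only looks like an escape untouched.

```python
import re

# Matches a backslash, a literal "x", and exactly two hex digits.
_ESCAPE_STR = re.compile(r'\\x([0-9A-Fa-f]{2})')
_ESCAPE_BYTES = re.compile(rb'\\x([0-9A-Fa-f]{2})')


def unescape_naive(value: str) -> str:
    """Hypothetical helper: map each escape straight to a code point.

    Multi-byte UTF-8 sequences come out as mojibake, because each
    octet is treated as its own character.
    """
    return _ESCAPE_STR.sub(lambda m: chr(int(m.group(1), 16)), value)


def unescape_bytes(value: str, charset: str = 'utf-8') -> str:
    """Hypothetical helper: substitute octets first, then decode once.

    The charset default is an assumption for this sketch.
    """
    raw = value.encode(charset)
    raw = _ESCAPE_BYTES.sub(lambda m: bytes([int(m.group(1), 16)]), raw)
    return raw.decode(charset, errors='replace')


print(unescape_naive(r'caf\xC3\xA9'))   # mojibake
print(unescape_bytes(r'caf\xC3\xA9'))   # decoded as one UTF-8 character
print(unescape_bytes(r'not\xZZhex'))    # invalid escape is left alone
```

A test along these lines would cover both the encoded-value case and the "looks like an escape but isn't" case.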
This 'should' confuses me a bit. I think what it's saying is that if the server advertises a To give a concrete example, let's say we receive the bytes Since this part of Sopel has already decoded received bytes to a string, doing this as described is tricky. I'll mull it over and maybe do a little sopelunking to see how to accommodate this part of the specification. |
|
After discussion with @Exirel, it's come to light that the Modern IRC specification and the IETF specification have slightly different opinions about escape sequences, with Modern IRC being more strict, limiting escapes to only I've updated this PR to follow the Modern IRC perspective, which is not only simpler to implement, but avoids tangling with the complication of the antiquated/removed |
|
Note from IRC: we have a different understanding of the spec, and we agree that it is a bit confusing. Following these two things led me to believe that we don't need to un-escape everything and only a subset of characters:
As a result, we discussed it with @SnoopJ and came to the conclusion that we are going to un-escape only the few bytes that need it. Edit: @SnoopJ beat me to it! |
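For readers following along, the restricted approach could look roughly like the sketch below. This is an illustration, not the code in this PR: the function name is hypothetical, and the set of escapable octets (space, backslash, equals sign) is an assumption based on one reading of the Modern IRC text, since the exact list is not reproduced in this thread.

```python
import re

# ASSUMPTION: only these three escapes are recognised. The thread does not
# list them; this set is one reading of the Modern IRC specification.
_ALLOWED = {
    '20': ' ',    # space
    '5C': '\\',   # backslash
    '3D': '=',    # equals sign
}

# Hex digits are matched in either case; the "x" must be lowercase.
_PATTERN = re.compile(r'\\x(20|5[Cc]|3[Dd])')


def unescape_isupport_value(value: str) -> str:
    """Hypothetical helper: replace only the allowed escapes.

    The substitution is a single left-to-right pass, so a backslash
    produced by an escaped backslash never starts a new escape.
    """
    return _PATTERN.sub(lambda m: _ALLOWED[m.group(1).upper()], value)


print(unescape_isupport_value(r'Example\x20Network'))
print(unescape_isupport_value(r'key\x3Dvalue'))
print(unescape_isupport_value(r'\x41 stays escaped'))
```

Doing the replacement in one pass matters: chaining three `str.replace` calls could turn an escaped backslash followed by `x20` into a space by accident.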
754471d to fa45f08
dgw
left a comment
Everyone else already did the good nitpicks, so this was all that was left for me to find. 🤣
RPL_ISUPPORT may include escape sequences for octets confusible with the message grammar. This changeset adds support for this protocol feature by adding logic to unescape these sequences.

Co-authored-by: dgw <[email protected]>
Co-authored-by: Florian Strzelecki <[email protected]>
Co-authored-by: Rusty Bower <[email protected]>
10854c2 to 77132fc
|
Squashed and rebased with a summary in the commit message. Another |
dgw
left a comment
🚀
Description
This changeset handles hexadecimal escape sequences in ISUPPORT parameter values, implementing #2195.
Checklist
make qa (runs make quality and make test)