Description
The `allowReserved` keyword in the Parameter Object is specified as follows (from 3.1.0):
> Determines whether the parameter value SHOULD allow reserved characters, as defined by [RFC3986] `:/?#[]@!$&'()*+,;=` to be included without percent-encoding. This property only applies to parameters with an `in` value of `query`. The default value is `false`.
This is strange in several ways. Most of those characters are not forbidden in query strings. Here is the relevant ABNF:
pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
query = *( pchar / "/" / "?" )
fragment = *( pchar / "/" / "?" )
pct-encoded = "%" HEXDIG HEXDIG
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
reserved = gen-delims / sub-delims
gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
           / "*" / "+" / "," / ";" / "="
What this means is that the actual set of characters not allowed in the query string by RFC 3986 is just `#`, `[`, and `]`. Everything else is allowed by either the `pchar` production or the `query` production.
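As a quick check of that claim, here is a minimal Python sketch that transcribes the character classes from the ABNF quoted above and subtracts what the `query` production admits from the full reserved set. The variable names are my own and are only for illustration.

```python
# Character classes transcribed from the RFC 3986 ABNF quoted above.
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")
RESERVED = GEN_DELIMS | SUB_DELIMS

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
# Only the reserved characters that pchar admits literally matter here.
PCHAR_RESERVED = SUB_DELIMS | {":", "@"}

# query = *( pchar / "/" / "?" )
QUERY_RESERVED = PCHAR_RESERVED | {"/", "?"}

# Reserved characters that the query production does not admit.
not_allowed_in_query = sorted(RESERVED - QUERY_RESERVED)
print(not_allowed_in_query)  # ['#', '[', ']']
```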
It does not make sense to allow `#` in a query string value because it will cause everything after it to parse as a fragment. The OAS cannot change that behavior, and implying that we can is problematic at best.
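To illustrate the fragment problem, here is a small sketch using Python's standard `urllib.parse.urlsplit`. The URL and the parameter names `color` and `size` are made up for the example; other parsers may differ in details, but splitting at the first `#` is what RFC 3986 prescribes.

```python
from urllib.parse import urlsplit

# Hypothetical URL: the intended parameter value is "red#blue", sent unencoded.
raw = urlsplit("https://example.com/items?color=red#blue&size=10")
print(raw.query)     # color=red
print(raw.fragment)  # blue&size=10

# Percent-encoding the "#" keeps the whole value inside the query component.
encoded = urlsplit("https://example.com/items?color=red%23blue&size=10")
print(encoded.query)     # color=red%23blue&size=10
print(encoded.fragment)  # (empty string)
```

In the unencoded case the `size` parameter never reaches the query component at all, which no `allowReserved` setting can fix.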
While the failure mode of including `[` or `]` is less clear (they're only used for IPv6 or IPvFuture literals in the ABNF), it's still strange to encourage something that correctly implemented URI parsers will reject (admittedly, the percentage of URI parsers that are implemented correctly is depressingly low).
So...
- What is the use case for allowing unencoded `[` or `]` in a query parameter value? How are correct URI parsers expected to handle this? Are we requiring tools to parse URIs incorrectly? We might need some guidance here: I would have no idea what to do with this, or whether to validate it in OASComply.
- I don't think we should include `#` here at all, as it's not going to work even in poorly-implemented URI parsers. Since we can't technically remove the feature, we should put a strongly worded SHOULD NOT around it.
- Is there a reason we need to mention the other characters at all? They don't need to be encoded anyway (although sometimes people do). Are we just using this as a "no, really, please don't encode these, you don't need to anyway"?
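To make the question about the other reserved characters concrete, here is a rough sketch of the two encodings a serializer might produce for the same value. The helper `serialize_value` and its `allow_reserved` flag are hypothetical, not taken from any OAS tooling; the sketch only assumes that a serializer encodes every reserved character by default and leaves them alone when the flag is set.

```python
from urllib.parse import quote

# The reserved set as listed in the allowReserved description.
RESERVED = ":/?#[]@!$&'()*+,;="


def serialize_value(value: str, allow_reserved: bool = False) -> str:
    """Hypothetical helper: percent-encode a query parameter value."""
    safe = RESERVED if allow_reserved else ""
    return quote(value, safe=safe)


value = "a:b@c/d?e"
print(serialize_value(value))                       # a%3Ab%40c%2Fd%3Fe
print(serialize_value(value, allow_reserved=True))  # a:b@c/d?e
```

Both outputs are valid inside an RFC 3986 query component, since `:`, `@`, `/`, and `?` are all admitted by the `query` production; the flag only changes which of the two equivalent spellings is emitted.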