Closed
Description
Hi,
not a huge problem but I just stumbled over this while bumping "regex = 0.1" to "0.2:
I was using a lot of this (IRC parsing code):
let caps = re_priv.captures(line).unwrap();
let sender = caps.at(1).unwrap()
let msg = caps.at(2).unwrap_or("");
Now it looks like this (thanks #rust for the help)
let caps = re_priv.captures(line).unwrap();
let sender = caps.get(1).unwrap().as_str();
let msg = caps.get(2).as_ref().map(regex::Match::as_str).unwrap_or("");
// alternative:
let msg = caps.get(2).map_or("", |m| m.as_str());
So my reasoning is now this:
- In this use case I don't care about much error handling
- this is inherently text, so no match/empty string can make sense
- both of the two 0.2 versions are either verbose or not easily guessable (at least to a beginner)
So I don't really want to complain, but I think the syntax (for this use case) got less usable and more verbose, but I don't know if it's worth fixing or if .at() had any other problems.
Activity
BurntSushi commentedon Feb 12, 2017
I'm not sure what to do about this. What do you think the alternative is? Adding more methods doesn't seem like a good idea, since that increases the number of choices a newcomer is faced with, which can be daunting.
Did you look at the documentation while writing code? Could you say something about why that didn't help you? In particular, take a look at the examples for
captures
, which shows yet another alternative:The standard answer here is "this is what unwrap is for." But in this case, you can access captures by index. e.g.,
let caps = re.captures(line).unwrap(); println!("{}", caps[2])
.But not always. The API must provide a way to differentiate between "no match" and "empty string match."
The key change was the addition of the
Match
data type, which encapsulates both the positions and the text of any particular match. While it introduced a new data type (an increase in complexity) it also decreased the number of different accessor methods (a decrease in complexity).BurntSushi commentedon Feb 12, 2017
OK, after reading IRC and reflecting more, I see that you considered the
&caps[2]
syntax, but didn't want it because it panics if there's no match. In that case, you need to handle the "no match" case explicitly, so doing the explicit unwrap seems fine to me. I personally think either of these are pretty palatable:Doing the eta reduction in this case tends to make it a bit more verbose because of the
as_ref
call.winks commentedon Feb 12, 2017
Thanks for the quick answer.
I hadn't thought about "no match" vs "empty string match", correct.
Maybe I'm biased as I hadn't known
map_or
and didn't think of usingmap
. I'm just saying I think theunwrap_or("")
as a proxy of "I don't care much about (exact) error handling" - i.e. I don't want unwrap to panic, but I know what I'm doing and "empty string" is fine - but I'm probably oversimplifying and this is a special use case that just sounds normal to me :)I think adding this
map_or
example to the docs as a "hey, I just need the match or an empty string and don't want to futz around with Option(Match) in this case, just like the old.at(x).unwrap_or("")
would be 100% fine to me.TLDR: the use in 0.1 seemed very convenient while I think in 0.2 this is less nice of a shorthand. Nothing more.
BurntSushi commentedon Feb 12, 2017
Improving the docs is something I can get on board with.
In general, I think I'm hesitant to add new methods that conveniences over a simple combinator or two, and would prefer to solve those issues for newcomers with better docs. However, I'd be willing to bend on that if there was compelling evidence that it was a big stumbling block for folks.
BurntSushi commentedon Feb 18, 2017
I have a doc fix incoming for this.
@winks I think it's worth pointing out that in some number of cases, using indexing, e.g.,
&caps[3]
, might very well be sufficient. In particular, if you have aCaptures
, then you know you have a match, and if you know capture group3
had to have matched if the entire regex matched, then&caps[3]
will never panic. The only reason why you'd need to drop down tocaps.get(3)
is if capture group3
was in an optional path of the regex and therefore may not have necessarily participated in the match.Add Captures::get example.
winks commentedon Feb 18, 2017
Those comments look awesome, I can't wish for more :) 👍
For completeness, this was my main example (sending
!quit
or!quit some_text
to an irc bot):it has two possible matches in the end, but just the non-captured part is fine on its own. Seeing it again I should've probably used '(\s+(.+))?' - but I guess with "optional capture group at the end" it's always the same edge case that
&caps[x]
is not the best choice.Anyway, this can be closed from my point of view - and thanks for your time.
Auto merge of #343 - BurntSushi:fixes, r=BurntSushi