Description
Is your feature request related to a problem?
This is more of a misleading flag than a bug. When mapping a dictionary onto a pandas Series with `na_action='ignore'`, I would expect values that are not in the dictionary to be ignored. Currently they are replaced with NaN. For example:
pd.Series(['calf', 'foal', 'bunny']).map({'calf':'cow','bunny':'rabbit'}, na_action='ignore')
returns:
0 cow
1 NaN
2 rabbit
Describe the solution you'd like
I would expect it to ignore items that aren't in the dictionary instead of replacing them. I would expect this to be returned:
0 cow
1 foal
2 rabbit
The current behavior is problematic for large dataframes with many different elements in a series where perhaps you only want to rename some of them. In this case you would have to either do a replace for each one... or make a mapping for every one and hope you didn't miss one.
Describe alternatives you've considered
Trying to differentiate the current action of `ignore` and the action of ignoring-but-not-replacing will be difficult. Perhaps a new value for `na_action` of `dont_replace`, so that it is very apparent what you are asking to happen with NaNs while keeping the current (albeit confusing) behavior of `ignore`.
Additional context
Similar-ish to #14210
Activity
rhshadrach commented on Jun 7, 2022
With `ser` being the pandas Series: … will map only values that exist in mapper, leaving other values untouched.
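The snippet from this comment did not survive in the thread as captured here. As a minimal sketch of one way to get that behavior (an assumption, not necessarily what was originally posted), a callable can look each value up in the dict and fall back to the value itself. The names `ser` and `mapper` follow the comment; the data is taken from the issue description.

```python
import pandas as pd

ser = pd.Series(["calf", "foal", "bunny"])
mapper = {"calf": "cow", "bunny": "rabbit"}

# Look each value up in the dict; when the key is missing,
# dict.get returns the original value instead of producing NaN.
result = ser.map(lambda x: mapper.get(x, x))

print(result.tolist())  # ['cow', 'foal', 'rabbit']
```

Passing a callable sidesteps the dict lookup that `Series.map` performs itself, which is what produces NaN for missing keys.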
I could see that as a reasonable interpretation when viewing the name alone, but do you have that same expectation from the documentation?
Title changed from "Series Mapping `na_action=ignore` result is misleading. ENH:" to "ENH: Series Mapping `na_action=ignore` result is misleading."
chapmanderek commented on Jun 9, 2022
@rhshadrach
No, I think the documentation spells it out. It specifically says:
Also, in one of the examples ("rabbit" is missing) it fills in a NaN value. I just think that the actual flag is misleading. The label isn't the contents of the can.
The ignore flag only really makes sense in terms of the last example where the value is getting passed onto another function or thing. When you are trying to replace items (possibly the more common scenario) it doesn't work as I would expect given the name.
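To make the distinction concrete, here is a small sketch of what `na_action='ignore'` affects, modeled on the example in the `Series.map` documentation. It assumes an object-dtype Series (set explicitly here): the flag controls whether existing NaN values are passed to the function, and it has no effect on keys missing from a dict.

```python
import numpy as np
import pandas as pd

s = pd.Series(["cat", "dog", np.nan, "rabbit"], dtype=object)

# Without na_action, NaN is passed to the function like any other value.
with_na = s.map("I am a {}".format)

# With na_action='ignore', NaN is propagated and the function is not called on it.
ignored = s.map("I am a {}".format, na_action="ignore")

# Keys missing from a dict still become NaN, with or without the flag.
animals = pd.Series(["calf", "foal", "bunny"], dtype=object)
mapped = animals.map({"calf": "cow", "bunny": "rabbit"}, na_action="ignore")

print(with_na[2])          # I am a nan
print(pd.isna(ignored[2]))  # True
print(pd.isna(mapped[1]))   # True
```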
topper-123 commented on Mar 25, 2023
For what it's worth, I think keeping the existing value for unmapped entries would be more logical than replacing it with NaN. But I agree the current behavior is as intended.
topper-123 commented on Mar 25, 2023
Thinking a bit further, I think it's probably always better to use the `Series.replace` method for this.
IMO we should add `replace` under the See also doc section under `Series.map`.
Thinking even further, should we not just deprecate allowing dict-likes to `Series.map` and direct users to use `Series.replace` for dict-likes instead? That seems very logical to me to do, especially if we see `Series.map` as an element-level version of `Series.apply` after #52140. @rhshadrach
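The code that accompanied the `Series.replace` suggestion is not preserved here. A minimal sketch of the idea, using the data from the issue description (an assumed reconstruction, not the original snippet), might look like this:

```python
import pandas as pd

ser = pd.Series(["calf", "foal", "bunny"])

# Series.replace with a dict swaps only the values that appear as keys;
# everything else ("foal" here) is left as it was.
replaced = ser.replace({"calf": "cow", "bunny": "rabbit"})

print(replaced.tolist())  # ['cow', 'foal', 'rabbit']
```

This gives the output requested in the issue without touching `na_action`.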
rhshadrach commented on Mar 29, 2023
It seems natural for `Series.map` to take a mapping. While there is an overlap in functionality here, I lean toward not deprecating unless there is an issue with `map` taking a mapping.