ENH: Series Mapping `na_action=ignore` result is misleading. #47262

chapmanderek · 2022-06-07T00:41:03Z

Is your feature request related to a problem?

More of a misleading flag... When mapping a dictionary onto a pandas Series with na_action='ignore', I would expect it to ignore unknown values. Currently it replaces them with NaN. For example:

pd.Series(['calf', 'foal', 'bunny']).map({'calf':'cow','bunny':'rabbit'}, na_action='ignore')

returns:

0       cow
1       NaN
2    rabbit

Describe the solution you'd like

I would expect it to ignore items that aren't in the dictionary instead of replacing them. I would expect this to be returned:

0       cow
1      foal
2    rabbit

The current behavior is problematic for large dataframes with many different elements in a series where perhaps you only want to rename some of them. In this case you would have to either do a replace for each one... or make a mapping for every one and hope you didn't miss one.

Describe alternatives you've considered

Trying to differentiate the current action of ignore from the action of ignoring-but-not-replacing will be difficult. Perhaps a new na_action value, such as dont_replace, would make it very apparent what you are asking to happen with unmapped values while keeping the current (albeit confusing) behavior of ignore.
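Another alternative that needs no new na_action value: the Series.map documentation notes that a dict subclass defining __missing__ has its default used instead of NaN. A minimal sketch under that assumption follows; the class name KeepUnmapped is hypothetical and only for illustration.

```python
import pandas as pd


class KeepUnmapped(dict):
    """Hypothetical helper: a dict that returns the key itself when it is absent."""

    def __missing__(self, key):
        # Called by dict lookup when `key` is not present; pass the value through.
        return key


ser = pd.Series(["calf", "foal", "bunny"])
mapper = KeepUnmapped({"calf": "cow", "bunny": "rabbit"})

# Values without a key in the mapper fall back to __missing__ instead of NaN.
result = ser.map(mapper)
print(result.tolist())
```

This keeps the renaming in a single map call, at the cost of a small custom mapping type.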

Additional context

Similar-ish to #14210


rhshadrach · 2022-06-07T02:46:53Z

With ser being the pandas Series:

ser = ser.map(mapper).fillna(ser)

will map only values that exist in mapper, leaving other values untouched.
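As a self-contained sketch of the one-liner above, using the data from the original report (the variable names are only illustrative):

```python
import pandas as pd

ser = pd.Series(["calf", "foal", "bunny"])
mapper = {"calf": "cow", "bunny": "rabbit"}

# map() yields NaN wherever a value has no key in the mapper;
# fillna(ser) then restores the original value at those positions.
result = ser.map(mapper).fillna(ser)
print(result.tolist())
```

One caveat with this pattern: if the mapper deliberately maps a value to NaN, fillna will undo that and put the original value back.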

i would expect it to ignore unknown values

I could see that as a reasonable interpretation when viewing the name alone, but do you have that same expectation from the documentation?

chapmanderek · 2022-06-09T23:17:59Z

@rhshadrach
No, I think the documentation spells it out. It specifically says:

If ‘ignore’, propagate NaN values

Also, in one of the examples ("rabbit" is missing from the mapping), it fills in a NaN value. I just think that the flag's name is misleading. The label isn't the contents of the can.

The ignore flag only really makes sense in terms of the last example where the value is getting passed onto another function or thing. When you are trying to replace items (possibly the more common scenario) it doesn't work as I would expect given the name.
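To illustrate the function case, here is a sketch along the lines of the documentation example: with a callable, na_action='ignore' skips NaN inputs rather than passing them to the function. The explicit dtype=object is an assumption made here so the behavior does not depend on string dtype inference.

```python
import numpy as np
import pandas as pd

ser = pd.Series(["cat", "dog", np.nan, "rabbit"], dtype=object)

# Without na_action, NaN is handed to the function like any other value.
default = ser.map("I am a {}".format)

# With na_action='ignore', NaN is propagated and the function is not called on it.
skipped = ser.map("I am a {}".format, na_action="ignore")

print(default.tolist())
print(skipped.tolist())
```

In both calls, na_action only concerns NaN values already present in the Series; it says nothing about values that are missing from a dictionary.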

topper-123 · 2023-03-25T11:18:02Z

For what it's worth, I think a mapping that keeps the existing value would be more logical than replacing it with NaN. But I agree the current behavior is as intended.

topper-123 · 2023-03-25T12:10:05Z

Thinking a bit further, I think it's probably always better to use the Series.replace method for this:

>>> import pandas as pd
>>> ser = pd.Series(['calf', 'foal', 'bunny'])
>>> ser.map({'calf':'cow','bunny':'rabbit'}, na_action='ignore')
0       cow
1       NaN
2    rabbit
dtype: object
>>> ser.replace({'calf':'cow','bunny':'rabbit'})
0       cow
1      foal
2    rabbit
dtype: object

IMO we should add replace under the See also doc section under Series.map.

Thinking even further, should we not just deprecate passing dict-likes to Series.map and direct users to Series.replace for dict-likes instead? That seems very logical to me, especially if we see Series.map as an element-level version of Series.apply after #52140.

@rhshadrach

rhshadrach · 2023-03-29T00:37:17Z

It seems natural for Series.map to take a mapping. While there is an overlap in functionality here, I lean toward not deprecating unless there is an issue with map taking a mapping.

chapmanderek added the Enhancement and Needs Triage labels · Jun 7, 2022

rhshadrach added the Missing-data and Series labels · Jun 7, 2022

simonjayhawkins changed the title from "Series Mapping na_action=ignore result is misleading. ENH:" to "ENH: Series Mapping na_action=ignore result is misleading." · Jun 9, 2022

mroeschke added the Needs Discussion label and removed the Needs Triage label · Jul 16, 2024
