Plotly Express Key Error With Filtered Categorical Variables in Dataframe #4433

Closed
ZakZakZakityZak opened this issue Nov 17, 2023 · 2 comments
Labels
bug something broken sev-3 annoyance with workaround

Comments

@ZakZakZakityZak

ZakZakZakityZak commented Nov 17, 2023

I believe I have found a bug in plotly: Plotly Express raises a KeyError when I build a scatter or line plot from a dataframe whose categorical column has been filtered to a subset of its values. I am using plotly version 5.13.0.

Minimal commented code that reproduces the issue in a Jupyter notebook environment:

import plotly.express as px
import pandas as pd 

# Make a dataframe
df = {
    'foo':[1,2,3,4,5],
    'bar':[12,5,17,8,9],
    'baz':[3.0,3.0,4.0,4.0,5.0]
}
df = pd.DataFrame(df)
# Make one of the columns a category type
df['baz'] = df['baz'].astype('category')

# Filter to some specific category values
qdf = df[
    df['baz'].isin((3.0,4.0))
]
# Make and show a figure
fig = px.scatter(
    qdf,
    x = 'foo',
    y = "bar",
    color = "baz"
)

fig.show()

Expected output is a figure with points colored based on the categorical variable.

Actual output is: KeyError: 5.0. I'm including the traceback below. What I think is happening is that plotly asks the dataframe which values are in the column via some process that returns all the values that COULD be in the column, including ones the filter removed. Passing one of those values to the "get_group" function of the grouper then results in a key error, because no rows belong to that group.


KeyError Traceback (most recent call last)
/tmp/ipykernel_12/3786163996.py in <cell line: 17>()
15 ]
16
---> 17 fig = px.scatter(
18 qdf,
19 x = 'foo',

~/.cache/pypoetry/virtualenvs/python-kernel-OtKFaj5M-py3.10/lib/python3.10/site-packages/plotly/express/_chart_types.py in scatter(data_frame, x, y, color, symbol, size, hover_name, hover_data, custom_data, text, facet_row, facet_col, facet_col_wrap, facet_row_spacing, facet_col_spacing, error_x, error_x_minus, error_y, error_y_minus, animation_frame, animation_group, category_orders, labels, orientation, color_discrete_sequence, color_discrete_map, color_continuous_scale, range_color, color_continuous_midpoint, symbol_sequence, symbol_map, opacity, size_max, marginal_x, marginal_y, trendline, trendline_options, trendline_color_override, trendline_scope, log_x, log_y, range_x, range_y, render_mode, title, template, width, height)
64 mark in 2D space.
65 """
---> 66 return make_figure(args=locals(), constructor=go.Scatter)
67
68

~/.cache/pypoetry/virtualenvs/python-kernel-OtKFaj5M-py3.10/lib/python3.10/site-packages/plotly/express/_core.py in make_figure(args, constructor, trace_patch, layout_patch)
2002 )
2003 grouper = [x.grouper or one_group for x in grouped_mappings] or [one_group]
-> 2004 groups, orders = get_groups_and_orders(args, grouper)
2005
2006 col_labels = []

~/.cache/pypoetry/virtualenvs/python-kernel-OtKFaj5M-py3.10/lib/python3.10/site-packages/plotly/express/_core.py in get_groups_and_orders(args, grouper)
1977 full_sorted_group_names = [tuple(g) for g in full_sorted_group_names]
1978
-> 1979 groups = {
1980 sf: grouped.get_group(s if len(s) > 1 else s[0])
1981 for sf, s in zip(full_sorted_group_names, sorted_group_names)

~/.cache/pypoetry/virtualenvs/python-kernel-OtKFaj5M-py3.10/lib/python3.10/site-packages/plotly/express/_core.py in <dictcomp>(.0)
1978
1979 groups = {
-> 1980 sf: grouped.get_group(s if len(s) > 1 else s[0])
1981 for sf, s in zip(full_sorted_group_names, sorted_group_names)
1982 }

~/.cache/pypoetry/virtualenvs/python-kernel-OtKFaj5M-py3.10/lib/python3.10/site-packages/pandas/core/groupby/groupby.py in get_group(self, name, obj)
815 inds = self._get_index(name)
816 if not len(inds):
--> 817 raise KeyError(name)
818
819 return obj._take_with_is_copy(inds, axis=self.axis)

KeyError: 5.0
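
The explanation above can be checked with pandas alone, without plotly. The following is a minimal sketch, assuming the cause is what the report suspects: it only shows that a filtered categorical column still declares the removed category, and that asking a pandas grouper for that empty group raises the same KeyError. It does not reproduce plotly's internal code path, and the variable names are illustrative.

```python
import pandas as pd

# Same data as in the report above
df = pd.DataFrame({
    "foo": [1, 2, 3, 4, 5],
    "bar": [12, 5, 17, 8, 9],
    "baz": [3.0, 3.0, 4.0, 4.0, 5.0],
})
df["baz"] = df["baz"].astype("category")
qdf = df[df["baz"].isin((3.0, 4.0))]

# Filtering removes the rows, but the category dtype still declares 5.0
declared = list(qdf["baz"].cat.categories)
present = list(qdf["baz"].unique())

# Grouping with observed=False keeps the empty 5.0 group in the result
grouped = qdf.groupby("baz", observed=False)
sizes = grouped.size()
size_by_category = dict(zip(sizes.index.tolist(), sizes.tolist()))

# Requesting a group that has no rows raises KeyError
try:
    grouped.get_group(5.0)
    raised = False
except KeyError:
    raised = True

print(declared, present, size_by_category, raised)
```

If the declared categories are indeed where plotly gets its group names, this would account for the KeyError: 5.0 in the traceback.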

@Coding-with-Adam
Contributor

Hi @ZakZakZakityZak,
I was able to reproduce the bug you described using your code. Thanks for opening this issue.

I get this error message:

File "C:\Users\adams[...]\site-packages\pandas\core\groupby\groupby.py", line 1059, in get_group
raise KeyError(name)
KeyError: 5.0

Coding-with-Adam added the bug something broken and sev-3 annoyance with workaround labels on Nov 21, 2023
@arcanaxion
Contributor

I believe this is a duplicate of #4274
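
The issue is labeled as having a workaround, but the thread does not spell one out. One possible approach, offered here as an assumption rather than a confirmed fix, is to drop the unused categories from the filtered dataframe before plotting. The sketch below uses the pandas method cat.remove_unused_categories() and only checks the pandas side; the name "cleaned" is illustrative.

```python
import pandas as pd

# Same data and filter as in the original report
df = pd.DataFrame({
    "foo": [1, 2, 3, 4, 5],
    "bar": [12, 5, 17, 8, 9],
    "baz": [3.0, 3.0, 4.0, 4.0, 5.0],
})
df["baz"] = df["baz"].astype("category")
qdf = df[df["baz"].isin((3.0, 4.0))]

# Drop categories that no longer have any rows (assumed workaround)
cleaned = qdf.assign(baz=qdf["baz"].cat.remove_unused_categories())
remaining = list(cleaned["baz"].cat.categories)

# Every declared category now has rows, so get_group succeeds for each one
grouped = cleaned.groupby("baz", observed=False)
group_sizes = {name: len(grouped.get_group(name)) for name in remaining}

print(remaining, group_sizes)
```

Passing "cleaned" instead of "qdf" to px.scatter would then be expected to avoid the KeyError, though that has not been verified here against plotly 5.13.0.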
