ENH: column label filtering via regexes to work for numeric names #10384

Closed
wants to merge 18 commits into from

Conversation

cyrusmaher
Contributor

Simple fix to allow regex filtering to work for numeric column labels, e.g. df.filter(regex="[12][34]")

closes #10506

@jreback
Contributor

jreback commented Jun 18, 2015

can you add some tests?

jreback added the Indexing label (Related to indexing on series/frames, not to indexes themselves) on Jun 18, 2015
jreback changed the title from "Update generic.py" to "ENH: column label filtering via regexes to work for numeric names" on Jun 18, 2015
@cyrusmaher
Contributor Author

For search(x) -> search(str(x))?
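
As a rough illustration of the change being discussed (converting each label to a string before matching), a minimal sketch in plain Python might look like the following. The helper name `filter_labels` is hypothetical and is not the code in `generic.py`; it only mirrors the `search(x)` to `search(str(x))` idea.

```python
import re


def filter_labels(labels, regex):
    """Return the labels whose string form matches `regex`.

    Hypothetical helper for illustration; not the pandas implementation.
    """
    matcher = re.compile(regex)
    # str(label) lets integer labels such as 13 be searched like "13".
    return [label for label in labels if matcher.search(str(label))]


labels = [13, 24, "A1", 5]
selected = filter_labels(labels, "[12][34]")
print(selected)
```

The original labels are returned unchanged; only the value handed to the regex is stringified.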

@cyrusmaher
Contributor Author

Any advice on what to add or where? I don't see any existing tests for this function...

@jreback
Contributor

jreback commented Jul 3, 2015

look in pandas/tests/test_frame for test_filter

jreback added this to the 0.17.0 milestone on Jul 3, 2015
@cyrusmaher
Contributor Author

Thanks Jeff! Added the test. Let me know what you think...



# regex with ints in column names
df = DataFrame(0., index=[0, 1, 2], columns=[0, 1, 'A1', 'B'])
Contributor

add the issue number as a comment (this PR number since no associated issue)
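
For readers who want to try the behaviour the test is meant to cover, here is a small sketch built on the frame from the diff line above. It assumes a pandas version that includes this fix (0.17.0 or later); the regex `'^[0-9]+$'` is an illustrative choice and is not taken from this thread.

```python
import pandas as pd

# Same shape as the frame in the diff: two integer labels, two string labels.
df = pd.DataFrame(0., index=[0, 1, 2], columns=[0, 1, 'A1', 'B'])

# Keep only columns whose label, read as a string, consists of digits.
filtered = df.filter(regex='^[0-9]+$')
print(list(filtered.columns))
```

The selected columns keep their original integer labels; the string conversion is applied only while matching.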

@jreback
Contributor

jreback commented Jul 3, 2015

add a note in whatsnew/0.17.0. Put it in the Other Enhancements section.

What would this do in 0.16.2 if you passed the regex: not filter anything, or raise?

@cyrusmaher
Contributor Author

Done! In 0.16.2 re.search will raise if a column name is numeric...
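
The underlying failure can be reproduced without pandas. The sketch below assumes Python 3, where the `re` module rejects a non-string argument with `TypeError`; it does not show which exception type pandas 0.16.2 itself surfaced to the caller.

```python
import re

matcher = re.compile("[12][34]")

try:
    # An integer column label passed straight to the regex, with no str().
    matcher.search(13)
    outcome = "no error"
except TypeError:
    outcome = "TypeError"

print(outcome)

# Converting the label first is what makes the search succeed.
found = matcher.search(str(13)) is not None
print(found)
```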

@@ -26,7 +26,8 @@ New features

Other enhancements
^^^^^^^^^^^^^^^^^^

- `regex` argument to DataFrame.filter now handles numeric column names instead of raising an exception.
Contributor

use double backticks here (and around DataFrame.filter)

Contributor

add the issue number (this PR number) onto the end (see how the other issues are done)

Contributor

say "instead of raising ValueError"

@jreback
Contributor

jreback commented Jul 3, 2015

when you are all done, please rebase/squash; see the contributing docs here

cyrusmaher and others added 4 commits July 3, 2015 15:10
Simple fix to allow regex filtering to work for numeric column labels, e.g. df.filter(regex="[12][34]")

Add test for regex filter on numeric column names

Add release note

Add second regex test
@cyrusmaher
Contributor Author

I'm having trouble with squashing the commits. I don't have a ton of experience with git, so I'm not sure what to do next. Below is the message. Seems to have to do with a merge conflict in test_frame? Any advice?

error: could not apply ac90352... Add test for regex filter on numeric column names

When you have resolved this problem, run "git rebase --continue".
If you prefer to skip this patch, run "git rebase --skip" instead.
To check out the original branch and stop rebasing, run "git rebase --abort".

@jreback
Contributor

jreback commented Jul 3, 2015

contributing docs are here: http://pandas.pydata.org/pandas-docs/stable/contributing.html

you have a conflict and need to fix it

cyrusmaher and others added 4 commits July 3, 2015 16:05
# The first commit's message is:

Fix regex filter for numeric columns

Simple fix to allow regex filtering to work for numeric column labels, e.g. df.filter(regex="[12][34]")

Add test for regex filter on numeric column names

Add release note

Add second regex test

# This is the 2nd commit message:

Update generic.py

Simple fix to allow regex filtering to work for numeric column labels, e.g. df.filter(regex="[12][34]")
@cyrusmaher
Contributor Author

Hmm, when I rebase it detects conflicts, then I resolve them using git mergetool, and commit. Doesn't seem to change anything. When I run git merge master I get that everything is up-to-date. I'm probably missing something simple?

@jreback
Contributor

jreback commented Jul 5, 2015

FYI, you don't normally need to add an issue if you just create a PR (like you did), but no biggie.

@jreback
Contributor

jreback commented Jul 5, 2015

I rebased you: https://travis-ci.org/jreback/pandas/builds/69631109

FYI don't use merge master. This is not pandas standard practice. This makes rebasing much more difficult.

@jreback
Contributor

jreback commented Jul 6, 2015

merged via bfe5a7f

thanks!

Labels
API Design Indexing Related to indexing on series/frames, not to indexes themselves
Development

Successfully merging this pull request may close these issues.

regex option for DataFrame.filter raises error on numeric column names
2 participants