ENH: added nunique function to Index #6734
I like all the tests! I would rather see this added in core/base.py (and then remove the Series one). Did you cover datetime/period index for tests? |
Sure. I'll try it. Actually tests are derived from |
# NAs in object arrays #714
i = np.array(['foo'] * 100)
i[::2] = np.nan
idx = Index(i, dtype='O')
Don't specify dtype when constructing an Index; let it infer it properly (unless the test is actually trying to test that).
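To illustrate the point about inference, here is a minimal sketch (not taken from the PR) that builds the same values with and without an explicit dtype. The variable names are hypothetical, and the inferred dtype depends on the pandas version, so the sketch only relies on the missing values being detected either way.

```python
import numpy as np
import pandas as pd

# Hypothetical test data: alternating strings and missing values.
values = ['foo', np.nan] * 50

# Let pandas infer the dtype (the reviewer's preference).
inferred = pd.Index(values)

# Force object dtype explicitly (only needed if the test targets that dtype).
forced = pd.Index(values, dtype='O')

# Both constructions keep the NaN entries as missing values.
print(inferred.dtype, int(inferred.isna().sum()))
print(forced.dtype, int(forced.isna().sum()))
```

Forcing the dtype is reasonable only when the dtype itself is what the test checks.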
I've moved |
FYI, I find it helpful to pull master and rebase when I push a PR |
also for some reason travis is not picking this up...can you rebase and force push? |
Rebased, and I'll take care. Test has passed. |
can you move the imports to the top of the test script? was a little leery of a can you add a timezone-aware series & index in the test_base/Ops/objs.py (all should work, but not sure that case is there). |
This is great. I use such functionality all the time and indeed wrap like +1 for |
@nehalecky thanks for the feedback! see #6382 for methods/properties that I think should be moved/changed to the OpsMixin (some more tricky than others). Anything else pls comment. |
@jreback When I include tz series and index, I get |
can you post the comparison? (e.g. you need to have the test use a Timestamp with a tz, otherwise you SHOULD get that error). |
Sorry, I'm referring to existing Is it OK for me to do:
|
These should all work (which is what the structure tests) I think you need to use
|
Thanks. I feel I could understand the problem now, the error isn't caused from
Changing the test logic seems to be a workaround, but |
you can't really compare These SHOULD compare equal though (but because of the tz are not).
for now, just test those cases explicitly |
Thanks! I've added error handling, and rebased onto latest master. |
expected = np.array(pd.to_datetime(['2010-01-01 00:00:00',
                                    '2009-01-01 00:00:00',
                                    '2008-09-09 00:00:00']))
self.assertEqual(result.index.dtype, 'datetime64[ns]')
How many of these tests did you change? I don't like ANY tests where you have to use equalContents/assertEqual.
I get that for a numpy array output that is ok (e.g. unique), but if the output is a Series (e.g. with value_counts), I don't think that should be used at all. (I know this is what the original test did, but value_counts here is a Series.) Can you go through the tests and change as many as possible to assert_series_equal?
In the cases where a numpy array is returned, use self.assert_numpy_array_equal. The problem with checking contents and equality is that it can mask issues with the ordering and/or return types.
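A small sketch of why content-only checks can hide ordering problems. This is not code from the PR: the data and variable names are made up, and it uses `pandas.testing.assert_series_equal`, the public import path in recent pandas (an assumption about the installed version). `check_names=False` is passed because the name attached to a `value_counts` result differs between pandas versions.

```python
import pandas as pd
from pandas.testing import assert_series_equal

s = pd.Series(['a', 'b', 'b', 'c', 'c', 'c'])
result = s.value_counts()  # sorted by count, descending

expected = pd.Series([3, 2, 1], index=['c', 'b', 'a'])
wrong_order = pd.Series([1, 2, 3], index=['a', 'b', 'c'])

# A contents-only check passes even though the ordering is wrong.
contents_match = set(result.index) == set(wrong_order.index)

# The strict comparison accepts the correctly ordered Series...
assert_series_equal(result, expected, check_names=False)

# ...and rejects the wrongly ordered one.
try:
    assert_series_equal(result, wrong_order, check_names=False)
    order_mismatch_caught = False
except AssertionError:
    order_mismatch_caught = True

print(contents_match, order_mismatch_caught)
```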
Understood. I've modified tests to use And is the current |
You mean like this? yes NaT should be included
|
self.assertEqual(unique.dtype, 'datetime64[ns]')
# numpy_array_equal cannot compare pd.NaT
self.assert_numpy_array_equal(unique[:3], expected)
self.assertTrue(unique[3] is pd.NaT or
                unique[3].astype('int64') == pd.tslib.iNaT)
This is fine, though array_equivalent will match this (as it uses pd.isnull() under the hood, which handles NaT properly). Even np.array_equal works here. The actual pd.NaT values are translated to pd.tslib.iNaT, which is actually an integer; this type of testing is even easier than with floats, FYI.
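The integer representation of NaT mentioned above can be seen with public API only. The following is a rough sketch, not the PR's test code: it assumes a recent pandas where `DatetimeIndex.asi8` and `pd.isna` are available, and it compares against the minimum int64 value rather than importing the internal `iNaT` constant.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing timestamp and one duplicate.
idx = pd.to_datetime(['2010-01-01', '2009-01-01', None, '2010-01-01'])
unique = idx.unique()

# The underlying integer view; NaT is stored as the minimum int64 value.
int_view = unique.asi8
nat_sentinel = np.iinfo(np.int64).min

# pd.isna handles NaT, so a mask-based comparison is straightforward.
nat_mask = pd.isna(unique)

print(int_view[-1] == nat_sentinel)  # the last unique value is NaT
print(nat_mask.tolist())
```

Comparing the integer views sidesteps the question of how NaT compares to itself, which is the practical reason the integer form is convenient in tests.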
ENH: added nunique function to Index
thanks @sinhrks lots of nice PR's from you! keep em coming FYI my comments about the NaT testing were really just for your information...this PR is good! |
Thank you for your review & info. I remember. I owe much to pandas and hope to assist a little! |
no thank you for the contributions! keep em coming |
Added nunique function to Index, same as Series.
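A short usage sketch of the behaviour this PR describes: `nunique` on an Index giving the same answer as on a Series. The values are made up for illustration, and the `dropna` keyword is assumed from recent pandas releases rather than confirmed by this conversation.

```python
import numpy as np
import pandas as pd

values = ['a', 'b', 'b', np.nan]
idx = pd.Index(values)
ser = pd.Series(values)

# Missing values are excluded by default on both objects.
idx_count = idx.nunique()
ser_count = ser.nunique()

# With dropna=False the missing value counts as one more distinct entry.
idx_count_with_na = idx.nunique(dropna=False)

print(idx_count, ser_count, idx_count_with_na)
```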