PERF: Index.take to check for full range indices #56806

Merged: 4 commits, Jan 10, 2024

lukemanley (Member)

import pandas as pd
import numpy as np

N = 1_000_000
idx = pd.Index(np.arange(N))
indices = np.arange(N)

%timeit idx.take(indices)

# 11.1 ms ± 271 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)   -> main
# 1.33 ms ± 279 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  -> PR

Motivating use-case:

idx1 = pd.Index(np.tile(np.arange(1000), 1000))
idx2 = pd.Index(np.arange(100))

%timeit idx1.join(idx2, how="left")

# 132 ms ± 1.43 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)  -> main
# 110 ms ± 587 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)   -> PR
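
The title and commit messages describe the idea as checking whether the indices passed to `take` are a full range (`0, 1, ..., N - 1`), in which case the gather is an identity operation and can be skipped. The following is a minimal pure-Python sketch of that idea only, not the code from this PR: `is_full_range` and `take` are hypothetical names, and plain sequences stand in for `Index` objects and NumPy arrays.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def is_full_range(indices: Sequence[int], n: int) -> bool:
    """Return True when indices is exactly 0, 1, ..., n - 1 (hypothetical helper)."""
    if len(indices) != n:
        return False
    return all(value == position for position, value in enumerate(indices))


def take(values: Sequence[T], indices: Sequence[int]) -> List[T]:
    """Gather values[i] for each i in indices, with a fast path for full-range indices."""
    if is_full_range(indices, len(values)):
        # Identity take: a plain copy gives the same result without a per-element gather.
        return list(values)
    # General path: gather element by element (negative indices count from the end).
    return [values[i] for i in indices]
```

The check costs one linear pass over the indices and exits early at the first mismatch, so in this sketch it adds little overhead when the indices are not a full range.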

@lukemanley added the Performance (memory or execution speed performance) and Index (related to the Index class or subclasses) labels on Jan 10, 2024
@lukemanley added this to the 2.3 milestone on Jan 10, 2024
@mroeschke merged commit 17cdcd9 into pandas-dev:main on Jan 10, 2024
mroeschke (Member)

Thanks @lukemanley

@lithomas1 changed the milestone from 2.3 to 3.0 on Jan 11, 2024
pmhatre1 pushed a commit to pmhatre1/pandas-pmhatre1 that referenced this pull request on May 7, 2024:
* Index.take to check is_range_indexer

* whatsnew

* MultiIndex.take to check is_range_indexer

* ensure 1-dim