Alternate way of implementing PeriodIndex #2949

Closed
abielr opened this issue Feb 28, 2013 · 2 comments
Labels
Enhancement Frequency DateOffsets Period Period data type

Comments

@abielr

abielr commented Feb 28, 2013

I've created a slightly modified version of PeriodIndex for some projects I'm working on that you may find useful for pandas. I rewrote the date-math code from scratch in Cython; it supports a wide range of frequencies, is very easy to extend to new ones, and is quite compact.

Code is at https://github.com/abielr/pandasreg. There isn't much documentation yet, but all the key date-math code is in pandasreg/src/rfreq.pyx. Most of the rest of the code is a straight copy of the pandas code for Period and PeriodIndex and some other utility functions to work with the custom index.

Basic approach for handling frequencies is as follows:

  • There is a base class RFrequency that defines an interface to convert datetime-like objects to an ordinal period at a given frequency (just like PeriodIndex), and to do the reverse.
  • Periods at one frequency can be converted to periods at another frequency.
  • Classes that inherit from RFrequency define the date-math logic that is particular to a given frequency. A single child class can also define multiple frequencies. For example, the monthly class defines the monthly, bimonthly, quarterly, semiannual, and annual frequencies without custom code for each variant.
  • You can force a frequency that is a multiple of a base frequency to pass through a particular period. For example, the semiannual frequency is created from the monthly frequency with a 6-month stride and the requirement that it pass through Jun 1970, which ensures that 1H is always Jan-Jun and 2H is always Jul-Dec (see this stackoverflow post for more background).
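
To make the stride-plus-anchor idea concrete, here is a minimal pure-Python sketch. It is not the code in pandasreg/src/rfreq.pyx: the class name `MonthStrideFreq`, its methods, and the `convert` helper are hypothetical, and it assumes that "pass through Jun 1970" means some period ends in that month. Converting between frequencies via the period's start date is likewise just one possible convention.

```python
from dataclasses import dataclass
from datetime import date

EPOCH_YEAR = 1970  # month ordinal 0 is Jan 1970


def month_ordinal(year: int, month: int) -> int:
    """Months elapsed since Jan 1970 (negative before the epoch)."""
    return (year - EPOCH_YEAR) * 12 + (month - 1)


@dataclass(frozen=True)
class MonthStrideFreq:
    """Hypothetical monthly-family frequency: `stride` months per period,
    aligned so that some period ends in the month `anchor` (a month ordinal)."""
    stride: int
    anchor: int

    @property
    def first_month(self) -> int:
        # First month (as a month ordinal) of the period that ends at `anchor`.
        return self.anchor - self.stride + 1

    def to_ordinal(self, d: date) -> int:
        """Ordinal of the period containing `d`; period 0 ends at the anchor."""
        m = month_ordinal(d.year, d.month)
        return (m - self.first_month) // self.stride  # floor division handles m < 0

    def to_start_date(self, ordinal: int) -> date:
        """First day of the period with the given ordinal."""
        m = self.first_month + ordinal * self.stride
        return date(EPOCH_YEAR + m // 12, m % 12 + 1, 1)


def convert(ordinal: int, src: MonthStrideFreq, dst: MonthStrideFreq) -> int:
    """Map a period in `src` to the `dst` period containing its start date."""
    return dst.to_ordinal(src.to_start_date(ordinal))


# One class, several frequencies: only the stride and anchor differ.
MONTHLY = MonthStrideFreq(stride=1, anchor=month_ordinal(1970, 1))
QUARTERLY = MonthStrideFreq(stride=3, anchor=month_ordinal(1970, 3))
SEMIANNUAL = MonthStrideFreq(stride=6, anchor=month_ordinal(1970, 6))
ANNUAL = MonthStrideFreq(stride=12, anchor=month_ordinal(1970, 12))

# Jan-Jun 1970 share one semiannual period; Jul 1970 starts the next one.
print(SEMIANNUAL.to_ordinal(date(1970, 1, 15)))  # 0
print(SEMIANNUAL.to_ordinal(date(1970, 7, 1)))   # 1
```

With the Jun 1970 anchor, the half-year boundaries fall on Jan 1 and Jul 1 in every year. Choosing a different anchor (say Mar 1970) with the same 6-month stride would instead produce Oct-Mar and Apr-Sep periods.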

Looking at the pandas code, it appears that much of the current PeriodIndex functionality is based on the old scikits.timeseries project. I haven't done a detailed performance comparison, but compared with the scikits.timeseries-derived code, mine could benefit pandas: there is no pure-C code to maintain, there is much less date-math code, and it's easier to extend to new frequencies.

@nehalecky
Contributor

Hey @abielr,

I just came across this issue as I am also looking to expand the functionality of PeriodIndex to support multiples of DateOffsets, but it seems you've already tackled this in an elegant way.

Before I dive into cloning your repo, I wanted to ask about the status of this project. Do you still use it with the current pandas build, 0.12? Did this ever gain any traction with the pandas devs? It seems like a natural evolution of the functionality for handling regularly spaced time series (especially those whose frequencies are multiples of date offsets).

Thank you. :)

@wesm
Member

wesm commented Sep 29, 2016

Closing, but would you like to get involved with the pandas 2.0 effort? Come on over to https://github.com/pydata/pandas-design

@wesm wesm closed this as completed Sep 29, 2016
@jorisvandenbossche jorisvandenbossche modified the milestones: Someday, No action Sep 29, 2016
4 participants