Skip to content

ENH: Support for "52–53-week fiscal year" / "4–4–5 calendar" and LastWeekOfMonth DateOffset #5004

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 22, 2013

Conversation

cancan101
Copy link
Contributor

Closes #4511 and #4637

@cancan101
Copy link
Contributor Author

Right now I am using the variable startingMonth to mean the month in which the fiscal year ends. This seems silly but IS consistent with BQuarterEnd, for example.

@cpcloud
Copy link
Member

cpcloud commented Sep 27, 2013

@cancan101 Looks like u need a rebase. Can you also squash/reword most of these commits? Maybe leave 1 or 2 with a moderate amount of detail in the commit message. Thanks

@cancan101
Copy link
Contributor Author

@cpcloud Rebased and squashed.

@jreback
Copy link
Contributor

jreback commented Sep 27, 2013

@cancan101 can this be done as a parameter instead of creating new classes?

@cancan101
Copy link
Contributor Author

Do you mean instead of creating five new public classes and two private classes, just create 3 public classes? If so, probably, there is minimal code in two of the classes. That being said, downstream use of parameters seem clunky. See for example #5015 and some of our discussions here: #4511 (comment)

Or do you mean using parameters instead of creating any new classes?

@jreback
Copy link
Contributor

jreback commented Sep 28, 2013

@cancan101 I like your `LastWeekInMonthoffset, but I think that they other 4 offsets are quite specialized. Would it not be better to haveFYear``(fiscal year), and just parameterize, with a``typ`` parameter? that would control the prefix, qtr_week_in and so on ?

@jtratner
Copy link
Contributor

You could also try an enum-like class (or toplevel defitinition) that has members that define each frequency in translation, so you'd do this:

class Offsets(object):
    LastWeekOfMonth = 'LWM'
    Day = 'D'
    Week = 'W'
    BusinessDay = 'B'

and then put the class in the toplevel namespace of pandas. Not sure if it would be better, but it might help resolve this. I too don't like putting so many new classes in.

@cancan101
Copy link
Contributor Author

@jreback @jtratner Reducing the number of classes should not be too hard. That being said I still think it makes sense to add three classes: LastWeekOfMonth, FYear and FQuarter. I feel that there is a significant difference in functionality and meaning between fiscal quarters and fiscal years. Thoughts?

Also to be clear, these new Offsets are not the standard fiscal year/ quarter but rather the "retail fiscal year" aka 52-53 week years.

@cancan101
Copy link
Contributor Author

bump

@cpcloud
Copy link
Member

cpcloud commented Oct 2, 2013

needs a rebase .. some changes in offsets.py and the usual release notes gnat

@cancan101
Copy link
Contributor Author

There should be release notes. Are more needed?
What changes to offsets.py I can reduce the number of classes, but I still think it makes sense to add three classes: LastWeekOfMonth, FYear and FQuarter.

@jreback
Copy link
Contributor

jreback commented Oct 2, 2013

I ok with those classes.

@cpcloud
Copy link
Member

cpcloud commented Oct 2, 2013

@cancan101 I meant that there were recently some changes in offsets.py....(now we have nanosecond date_ranges! yay!)

@cancan101
Copy link
Contributor Author

@jreback I assume by those you mean the three I listed?

GIven that I am going to reduce the number of classes, what should the rule_code look like? It already looks ugly as is. I posted: #5015

@jreback
Copy link
Contributor

jreback commented Oct 2, 2013

can you put a couple of possibilites up?

@jtratner
Copy link
Contributor

jtratner commented Oct 2, 2013

@cancan101 @jreback did you mean reduce the number of classes altogether or just reduce the number of public classes?

If you combine classes, use a classmethod that takes a rulecode and spits out the correct offset.

@cancan101
Copy link
Contributor Author

@jreback

For the yearly offset:
Currently:
For last X day of month: RL-%{month}%{day} for example RL-DECSAT
For nearest X to last day of month: RN-%{month}%{day} for example RN-DECSAT

For the quarterly-offset:
For last X day of month: RLQ-%{qtr}%{month}%{day} for example RLQ-2DECSAT
For nearest X to last day of month: RNQ-%{qtr}%{month}%{day} for example RNQ-1DECSAT
where qtr is the quarter with the leap week.

These are hard to read since after the initial dash, there is no other delimitation. Further because of the regex I referred to in #5015, the number must be the first character after the dash.

Question: Will anything break if one DateOffset class produces rule codes with different prefixes?

@jtratner What do you mean by creating a "classmethod that takes a rulecode and spits out the correct offset."? Are your referring to the idea I suggested in #5028?

@cancan101
Copy link
Contributor Author

Given that I am going to add just FYear and FQuarter, how should the end be specified? One options would be a boolean (for example use_closest). Another would be an enum (which currently only takes on two values).

@jtratner
Copy link
Contributor

jtratner commented Oct 2, 2013

No, that's not what I mean. What I mean is that you might want to let people use a class without having to pass all the arguments to the constructor. For example, here's what we did in TimeOp so that __init__ could take all different arguments and then return different types or None or whatever: https://github.com/pydata/pandas/blob/master/pandas/core/ops.py#L382

@cancan101
Copy link
Contributor Author

@jtratner I don't really follow what would be the point of that method here.

@jtratner
Copy link
Contributor

jtratner commented Oct 2, 2013

Didn't realize you only had two at this point.

@cancan101
Copy link
Contributor Author

@jtratner Theoretically these could be merged into BYear and BQuarter. I am hesitant to do so because there are more parameters for these new classes than apply to BYear and BQuarter.

You could then merge BYear and BQuarter into Year and Quarter and just use an enum like:

{LastDay, LastBDay, ClosestToLastX, LastX} but that seems like overkill.

@cancan101
Copy link
Contributor Author

Otherwise as far as my reading of Wikipedia and subsequent research has gone, there are only two variations of the 52-53 week calendar.

@jtratner
Copy link
Contributor

jtratner commented Oct 2, 2013

Sounds like you've spent a lot of time thinking about this!

@jreback
Copy link
Contributor

jreback commented Oct 4, 2013

@cancan101 can you rebase and update?

@cancan101
Copy link
Contributor Author

@jreback Can I get some feedback on the rule code?

@@ -367,9 +369,27 @@ def get_period_alias(offset_str):

for _i, _weekday in enumerate(['MON', 'TUE', 'WED', 'THU', 'FRI']):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just define a no argument function (add add global declarations where necessary) and call it. then you don't need to have all these underscored variables (obviously keep the ones that were there before)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cpcloud What do you think about moving this logic to a classmethod on the respective offsets?
Also I am not quite sure what you mean by the function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

classmethod probably ok....

by function i just meant something like

def initializer():
    # your init code here
initializer()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea with classmethods is to put the logic for generating all instances of a given offset with the offset itself rather than having the code in separate files

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

initializer for now

@jreback
Copy link
Contributor

jreback commented Oct 21, 2013

dt is a datetime, not a Timestamp..., you prov want to convert it at the top of year_has_extra_week

-> week_in_year = (next_year_end - prev_year_end).days/7
(Pdb) l
1242                next_year_end = dt
1243            else:    
1244                next_year_end = dt + self._offset
1245                prev_year_end = dt - self._offset
1246            
1247 ->         week_in_year = (next_year_end - prev_year_end).days/7
1248 
1249            return week_in_year == 53     
1250        
1251        def onOffset(self, dt):
1252            if self._offset.onOffset(dt):
(Pdb) l
1253                return True
1254            
1255            next_year_end = dt - self._offset
1256 
1257            qtr_lens = self.get_weeks(dt)
1258                
1259            current = next_year_end
1260            for qtr_len in qtr_lens[0:4]:
1261                current += relativedelta(weeks=qtr_len)
1262                if dt == current:
1263                    return True
(Pdb) (Timestamp(next_year_end)-Timestamp(prev_year_end)).days/7
52

@cancan101
Copy link
Contributor Author

Any idea what changed?

@cancan101
Copy link
Contributor Author

Also, what is wrong with doing the math on a datetime?

(datetime.datetime(2013,10,24)-datetime.datetime(2013,10,21)).days

@jreback
Copy link
Contributor

jreback commented Oct 21, 2013

this is your code, you probably have some relativedelata stuff in there which is not returning the correct type

@cancan101
Copy link
Contributor Author

Okay, I can take a closer look. I assume relativedelata returns a datetime?
Also this worked until I rebased onto master. I had rebased relatively recently too. I wonder what changed?

@jreback
Copy link
Contributor

jreback commented Oct 21, 2013

not much has been merged recently....can you pinpoint what commit it last worked?

@cancan101
Copy link
Contributor Author

I'll see what I can come up with

@jtratner
Copy link
Contributor

Use git reflog to grab your pre-rebase branch. Can you try doing arithmetic
with datetimes and Timestamp and see if you can reproduce it? If you can
get that specific error, then there is probably a bug with Timestamp.

That is a very weird error to get - suggests there is (potentially) some
part of that code that flips the order of args to__sub__

@wuan
Copy link
Contributor

wuan commented Oct 21, 2013

The error occurs when subtracting a Timestamp from a datetime type.

git bisect found that the problem was introduced to master with commit 02158e6 on September 30th.

I will just have a close look on that one.

@cancan101
Copy link
Contributor Author

That weird. I'm pretty sure I had rebased more recently to get @jtratner's change for Offset construction.

@wuan
Copy link
Contributor

wuan commented Oct 21, 2013

in tslib.pyx class _Timestamp(datetime), method sub

    datetime.__sub__(self, other)

was replaced by

    super(_Timestamp, self).__sub__(other)

As _Timestamp is a direct subtype of datetime.datetime, the result should be the same. But the latter case leads to the error

TypeError: super(type, obj): obj must be an instance or subtype of type

I have no clue why self in _Timestamp is not an instance or subtype ot _Timestamp.

@jreback
Copy link
Contributor

jreback commented Oct 21, 2013

@cancan101 just revert that one line

datetime.__sub__(other) and should be ok

they don't act the same because _Timestamp is a shadow class (of a c-lass) AFAIK..

@cancan101
Copy link
Contributor Author

@cpcloud Thoughts?

@wuan
Copy link
Contributor

wuan commented Oct 21, 2013

@cancan101 Could you add a test for subtracting a datetime from a timestamp to test_tslib.py as well?

@cancan101
Copy link
Contributor Author

Sure.

@jtratner
Copy link
Contributor

Hurray for git bisect - which is why you want every commit to at least
build, if not pass the test suite. Btw, I think @jreback's replacement
should be datetime.__sub__(self, other)

@cancan101
Copy link
Contributor Author

Issues should be fixed.
On Oct 21, 2013 5:09 PM, "Jeff Tratner" [email protected] wrote:

Hurray for git bisect - which is why you want every commit to at least
build, if not pass the test suite. Btw, I think @jreback's replacement
should be datetime.__sub__(self, other)


Reply to this email directly or view it on GitHubhttps://github.com//pull/5004#issuecomment-26756646
.

@jtratner
Copy link
Contributor

Okay, please rebase on top of latest master if you haven't already
On Oct 21, 2013 9:30 PM, "Alex Rothberg" [email protected] wrote:

Issues should be fixed.
On Oct 21, 2013 5:09 PM, "Jeff Tratner" [email protected] wrote:

Hurray for git bisect - which is why you want every commit to at least
build, if not pass the test suite. Btw, I think @jreback's replacement
should be datetime.__sub__(self, other)


Reply to this email directly or view it on GitHub<
https://github.com/pydata/pandas/pull/5004#issuecomment-26756646>
.


Reply to this email directly or view it on GitHubhttps://github.com//pull/5004#issuecomment-26771149
.

@cancan101
Copy link
Contributor Author

Is it not merge able? If not I'll rebate.
On Oct 21, 2013 9:38 PM, "Jeff Tratner" [email protected] wrote:

Okay, please rebase on top of latest master if you haven't already
On Oct 21, 2013 9:30 PM, "Alex Rothberg" [email protected]
wrote:

Issues should be fixed.
On Oct 21, 2013 5:09 PM, "Jeff Tratner" [email protected]
wrote:

Hurray for git bisect - which is why you want every commit to at least
build, if not pass the test suite. Btw, I think @jreback's replacement
should be datetime.__sub__(self, other)


Reply to this email directly or view it on GitHub<
https://github.com/pydata/pandas/pull/5004#issuecomment-26756646>
.


Reply to this email directly or view it on GitHub<
https://github.com/pydata/pandas/pull/5004#issuecomment-26771149>
.


Reply to this email directly or view it on GitHubhttps://github.com//pull/5004#issuecomment-26771408
.

@jtratner
Copy link
Contributor

You need to rebase because this previously failed after rebasing (wasn't
this PR's fault) and I want to make sure I'm not merging something that's
going to break the builds/tests.

@cancan101
Copy link
Contributor Author

Np. Rebased and pushed. Easy enough. Should be running tests now.

@cancan101
Copy link
Contributor Author

passed

LastWeekOfMonth DateOffset.
- Added ``LastWeekOfMonth`` DateOffset
- Added ``FY5253 and ``FY5253Quarter`` DateOffsets
@cancan101
Copy link
Contributor Author

@jtratner What can I do to get this merged in? I rebased earlier today

jtratner added a commit that referenced this pull request Oct 22, 2013
ENH: Support for "52–53-week fiscal year" / "4–4–5 calendar" and LastWeekOfMonth DateOffset
@jtratner jtratner merged commit b55c790 into pandas-dev:master Oct 22, 2013
@jtratner
Copy link
Contributor

there you go :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: Support for "52–53-week fiscal year" / "4–4–5 calendar"
7 participants