Skip to content

Fix time utilities #6692

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
May 20, 2020
Merged

Fix time utilities #6692

merged 7 commits into from
May 20, 2020

Conversation

ofek
Copy link
Contributor

@ofek ofek commented May 19, 2020

What does this PR do?

  1. Fixes delta computation error Add utilities for working with time #6663 (comment)
  2. Simplified get_aware_datetime to get_current_datetime
  3. Removed use of pytz as per https://discuss.python.org/t/get-local-time-zone/4169/7
  4. Added full test coverage

Notes

Additional Notes

@pganssle Anything look amiss to you in general? https://github.com/DataDog/integrations-core/blob/ofek/time/datadog_checks_base/datadog_checks/base/utils/time.py

pganssle
pganssle previously approved these changes May 19, 2020
Copy link

@pganssle pganssle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very nice. I've left some comments and nitpicks, but I would not be terribly concerned about merging this.

return datetime.now(tz)
else:
return normalize_datetime(dt)
# NOTE: Although the epoch is platform dependent, it appears to be the same

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean "epoch is platform dependent"? You mean that some platforms are configured to ignore leap seconds or something?

I would expect time.time() to always return the unix epoch time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going by CPython's definition https://docs.python.org/3/library/time.html#epoch

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the docs inaccurate?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very interesting, I don't think I knew that. I would be very shocked if a system having a different epoch didn't break things horribly (though I guess some systems are configured to do the sane thing and ignore leap seconds — maybe that shows up differently in time.time()).

I'm curious to know what datetime.timestamp() would do on such a system. I guess it's also unclear what you would want to do on such a system, since I always think of a "timestamp" as defined relative to the Unix epoch, but possibly it could be defined relative to any epoch.

If you want to always use the system epoch, you could presumably change your definition of EPOCH to be EPOCH = datetime(*time.gmtime(0)[0:6], tzinfo=UTC).

That said, that documentation seems weird. From what I can tell, the implementation of time_gettime on Windows uses a function that computes time from a totally different epoch (1601-01-01), and then adjusts it to 1970-01-01. On non-Windows systems, it uses clock_gettime and gettimeofday, both of which are standardized in POSIX.1 as using the Unix epoch.

I'm wondering if we should change the time.time documentation to explicitly define it by the Unix epoch, rather than by the "system epoch".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining. I do think that the time.time documentation should be updated then, especially since it makes extra efforts to achieve POSIX compliance on Windows. I'll remove this comment.

# - Alpine
return epoch_offset()

return (normalize_datetime(dt) - EPOCH).total_seconds()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using .timestamp() is almost certainly better on Python 3, but I suppose I can see why you'd want to unify the implementation if you're still supporting Python 2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I agree the official method is preferred 🙂 We'll definitely switch when we drop Python 2

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a comment to identify this as a possible optimization for when we drop Python 2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably nitpicking, but alternatively, to benefit from the standard behavior on Python 3, use the old strategy on Python 2, and make it clear that we can drop the Py2 branch in due time without a TODO that might go out of date, we could use something like this in a compat.py module:

# datadog_checks_base/datadog_checks/base/utils/compat.py

if PY2:

    def to_timestamp(dt):
        return (normalize_datetime(dt) - EPOCH).total_seconds()

else:

    def to_timestamp(dt):
        return normalize_datetime(dt).timestamp()

(We might benefit from introducing such a compat.py like this anyway, eg for imports in contextlib vs contextlib2, etc. But I agree that'd be a precedent to make, so not too fussed about the current impl.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had it like that #6663 (comment)

@AlexandreYang ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our case, dt.timestamp() in py3 is just a shortcut for (self - _EPOCH).total_seconds().

https://github.com/python/cpython/blob/3.8/Lib/datetime.py#L1794

I guess both are fine for me, up to you @ofek :)

Copy link
Contributor Author

@ofek ofek May 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's merge. When the time comes there will be a massive py(thon)? ?[23] search anyway 😉

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the time comes

Hopefully not too late ;)

nyc = tz.gettz('America/New_York')
now = datetime.now(nyc)

assert normalize_datetime(now).tzinfo is nyc

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you also want to assert normalize_datetime(now) is now? Creating a new datetime can be annoyingly expensive considering these things are immutable 😛.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks!



class TestCurrentDatetime:
def test_default_utc(self):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not crazy about tests that rely on now(). It might be better to mock out datetime.now() and assert that get_current_datetime() called with no arguments calls datetime.now with UTC as the argument?

Alternatively, there's freezegun, which works from a practical perspective, but I've become mildly concerned with keeping it as a dependency.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, not fan either relying on current time. +1 for mocking datetime.now.

The assert get_timestamp(now) - get_timestamp(dt) < 0.01 might be flaky is the system running the test is slowing down for unrelated reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed!

def test_utc(self):
from datetime import timezone

assert UTC.utcoffset(None) == timezone.utc.utcoffset(None)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this test?

Is it because you're exposing a public UTC object and it's an implementation detail that you're exposing it directly from dateutil?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to assert that the constant we expose is a tzinfo instance and represents UTC. How would you do that?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine I think. I was just surprised because UTC is already tested upstream, so it's already part of the dateutil contract that it's UTC. I think it's fair to write your own test for it, though, considering you see this as part of the public interface (and the fact that it's supplied directly from dateutil.tz is an implementation detail.

That said, if in the future you drop Python 2 and change it from UTC = dateutil.tz.UTC to UTC = datetime.timezone.utc, would you still want this test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, that's an interesting question. Yes when that happens I will indeed remove this test. Therefore, I will remove it now since the contract is the same (tested upstream). Good call.

mgarabed
mgarabed previously approved these changes May 19, 2020
Copy link
Member

@mgarabed mgarabed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!



class TestCurrentDatetime:
def test_default_utc(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, not fan either relying on current time. +1 for mocking datetime.now.

The assert get_timestamp(now) - get_timestamp(dt) < 0.01 might be flaky is the system running the test is slowing down for unrelated reason.

Copy link
Member

@AlexandreYang AlexandreYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great.

Just a feedback about the naming of normalize_datetime and possibly adding ensure_utc. See comment below.

@@ -71,3 +40,6 @@ def normalize_datetime(dt, default_tz=UTC):
dt = dt.replace(tzinfo=default_tz)
Copy link
Member

@AlexandreYang AlexandreYang May 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking that we might also want to provide now or later a way to ensure that the dt returned is UTC.

Hence, normalize_datetime might not be the best name (it's also not clear since "normalize" might mean many things).

WDYT about renaming normalize_datetime to ensure_aware. And possibly adding ensure_utc ? Example:

def ensure_aware(dt, default_tz=UTC):
    """
    Ensures that the returned datetime object is an aware datetime.
    """
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=default_tz)
    return dt.astimezone(UTC)

def ensure_utc(dt):
    """
    Ensures that the datetime is aware and UTC.
    """
    return normalize_datetime(dt, UTC).astimezone(UTC)

My comment is mainly about making normalize_datetime naming more clear.
ensure_utc might be implemented later.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, this is called default_tzinfo in dateutil: https://dateutil.readthedocs.io/en/stable/utils.html#dateutil.utils.default_tzinfo (though also I guess you could just use that function).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, quite similar to https://github.com/dateutil/dateutil/blob/master/dateutil/utils.py#L58-L61

The only difference is that the second argument defaults to UTC (which is convenient I think).

Copy link
Contributor

@florimondmanca florimondmanca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one about using dateutil.tz, learnt something today from Paul's article. 👍

# - Alpine
return epoch_offset()

return (normalize_datetime(dt) - EPOCH).total_seconds()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably nitpicking, but alternatively, to benefit from the standard behavior on Python 3, use the old strategy on Python 2, and make it clear that we can drop the Py2 branch in due time without a TODO that might go out of date, we could use something like this in a compat.py module:

# datadog_checks_base/datadog_checks/base/utils/compat.py

if PY2:

    def to_timestamp(dt):
        return (normalize_datetime(dt) - EPOCH).total_seconds()

else:

    def to_timestamp(dt):
        return normalize_datetime(dt).timestamp()

(We might benefit from introducing such a compat.py like this anyway, eg for imports in contextlib vs contextlib2, etc. But I agree that'd be a precedent to make, so not too fussed about the current impl.)

@ofek
Copy link
Contributor Author

ofek commented May 20, 2020

Thanks again @pganssle!

@ofek ofek merged commit 4c83920 into master May 20, 2020
@ofek ofek deleted the ofek/time branch May 20, 2020 17:26
ChristineTChen pushed a commit that referenced this pull request May 21, 2020
* Fix time utilities

* address feedback - assert returned dt is same object

* address feedback - remove upstream test

* add comment

* remove comment

* address feedback

* address
@yzhan289 yzhan289 mentioned this pull request Feb 24, 2023
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants