Last Friday I spent two hours fighting with the datetime class from Python's datetime module. I originally thought that a datetime object represented a single point in time, which is not always the case. Here is a summary, in a few bullet points, of what I learned about this module:
- In its simplest form, a datetime object is just a date and a time as you would write it down on a piece of paper. It's a year, a month, a day, and, optionally, an hour, a minute, a second and some microseconds (the 'time' part of 'datetime').
- So, in this simplest form, a datetime object does not represent a given point in time, because that depends on the timezone in which you consider your date and time. It's up to interpretation. Such datetime objects are called naive. You can make a datetime object aware by giving it the timezone that should be used to interpret this date and time. An aware datetime object does represent a given point in time.
- The optional timezone carried by a datetime object is given by the tzinfo attribute, whose value is of type tzinfo. The tzinfo class is an abstract base class, meaning it isn't meant to be instantiated directly. You must instantiate a subclass of tzinfo, such as timezone. But to construct a timezone object, you must give it a timedelta object. The timedelta class represents a difference between two datetimes. When used to construct a timezone, it indicates the offset in relation to UTC. For example, timezone(timedelta(hours=2)) is a valid tzinfo object which represents the timezone UTC+2. timezone.utc is a shortcut for timezone(timedelta(0)).
- Most methods which return a datetime object return a naive object. The method now returns the local and naive date and time; the method utcnow returns the UTC naive date and time. Since Python 3.2, the strptime method can produce an aware datetime object from a string. The string must contain the timezone in the +HHMM or -HHMM format (for example +0200 for UTC+2), and the placeholder to use is %z to capture the timezone part.
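The points above can be sketched in a few lines. This is only an illustration: the date and the variable names are arbitrary choices, not anything imposed by the module.

```python
from datetime import datetime, timedelta, timezone

# A naive datetime: just a date and a time, with no timezone attached.
naive = datetime(2015, 11, 23, 20, 6, 13)
print(naive.tzinfo)  # None

# The same wall-clock reading made aware by attaching UTC+2.
utc_plus_2 = timezone(timedelta(hours=2))
aware = naive.replace(tzinfo=utc_plus_2)
print(aware.isoformat())  # 2015-11-23T20:06:13+02:00

# timezone.utc is equivalent to a zero-offset timezone.
print(timezone.utc == timezone(timedelta(0)))  # True

# strptime yields an aware object when %z matches an offset in the string.
parsed = datetime.strptime('2015-11-23 20:06:13 +0200', '%Y-%m-%d %H:%M:%S %z')
print(parsed == aware)  # True
```

Note that replace(tzinfo=...) only attaches a timezone to the existing reading; it does not shift the date or the time.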
Practical example
Let's say you've acquired a string representing a date and a time, such as Mon Nov 23 20:06:13 CET 2015. It's actually the default output format of the date command in bash. Let's say you want to write a Python script that tells you the exact time difference between the moment the script is executed and the date and time represented by this string.
We're out of luck here because Python won't be able to tell us which timezone CET corresponds to. Remember, it can only parse timezones if they're in the form +HHMM or -HHMM. So we need to do some basic research. It turns out that CET stands for Central European Time and is UTC+1.
Now that we know this, a solution is to replace CET with +0100 in the string, then let Python produce an aware datetime object:
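from datetime import datetime, timedelta, timezone

DATETIME_STRING = 'Mon Nov 23 20:06:13 CET 2015'
# strptime can't resolve 'CET', so swap it for its numeric UTC offset
string = DATETIME_STRING.replace('CET', '+0100')
# %z parses the +0100 offset, which makes the result an aware datetime
dt = datetime.strptime(string, '%a %b %d %H:%M:%S %z %Y')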
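If several abbreviations can show up, the same substitution trick can be driven by a small lookup table. The following is only a sketch: the ABBREVIATION_OFFSETS table and the parse_date_output helper are hypothetical names made up for this example, and the table should only contain abbreviations whose offset you have checked yourself, since many abbreviations are ambiguous.

```python
from datetime import datetime

# Hypothetical lookup table: fill in only abbreviations you are sure about,
# since many of them are ambiguous across regions.
ABBREVIATION_OFFSETS = {
    'UTC': '+0000',
    'CET': '+0100',
    'CEST': '+0200',
}

def parse_date_output(text):
    """Parse a string like 'Mon Nov 23 20:06:13 CET 2015' into an aware datetime."""
    # Swap the abbreviation for its numeric offset, token by token; an
    # unknown abbreviation is left as-is and makes strptime raise ValueError.
    tokens = [ABBREVIATION_OFFSETS.get(token, token) for token in text.split()]
    return datetime.strptime(' '.join(tokens), '%a %b %d %H:%M:%S %z %Y')

dt = parse_date_output('Mon Nov 23 20:06:13 CET 2015')
print(dt.isoformat())  # 2015-11-23T20:06:13+01:00
```

Working token by token avoids accidentally rewriting an abbreviation that happens to appear inside another word.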
Now, we need the current time. There are two functions to get the current time: now and utcnow. The problem with now is that it returns the current local time, so we would need to figure out what the local timezone is, which isn't straightforward (more on that later). utcnow returns the UTC time, so we know the timezone by definition. Note that even though we know the timezone, utcnow still returns a naive object, so we'll have to manually set the timezone to UTC:
now = datetime.utcnow()
now = now.replace(tzinfo=timezone.utc) # Getting an aware datetime object
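As an aside, now also accepts an optional tz argument, and when one is given it returns an aware object directly. A minimal sketch of that alternative, which should be equivalent to the two lines above:

```python
from datetime import datetime, timedelta, timezone

# Passing a tzinfo to now() returns an aware datetime in that timezone.
now = datetime.now(timezone.utc)
print(now.tzinfo is timezone.utc)  # True
print(now.utcoffset())             # 0:00:00
```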
We can now subtract the two datetime objects to obtain a timedelta object, which has the handy total_seconds method:
delta = now - dt
print(delta.total_seconds()) # Prints the timespan in seconds between dt and now
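One thing to keep in mind about timedelta: its seconds attribute only holds the part of the duration left over once whole days have been taken out, while the total_seconds method returns the entire span. A small illustration with an arbitrary duration:

```python
from datetime import timedelta

span = timedelta(days=2, hours=3)
print(span.days)             # 2
print(span.seconds)          # 10800 (only the 3 hours)
print(span.total_seconds())  # 183600.0 (2 days + 3 hours)
```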
Getting the local timezone
In the last part, another strategy would have been to use now to obtain the local time and then attach the local timezone to the resulting datetime object. But getting the local timezone isn't easy. I haven't found any function in the documentation that does it, and posts on the subject on StackOverflow advise using the tzlocal module, which isn't part of the standard library.
It's still possible with the vanilla datetime module.
First, a solution that would work sometimes:
diff = datetime.now() - datetime.utcnow()
# total_seconds() keeps the sign, so timezones west of UTC come out right too
minutes = round(diff.total_seconds() / 60)
local_timezone = timezone(timedelta(minutes=minutes))
Here we get the timezone as the difference between now and utcnow. But since the two functions are not executed at the exact same time, the difference doesn't produce a timedelta object with a whole number of minutes, which is the condition for a timedelta to be used to construct a timezone (Python 3.7 and later relax this requirement). We get a whole number of minutes by rounding the number of seconds divided by 60. Then we construct a timezone from a new timedelta.
Now imagine that this code runs on a very slow computer and more than half a minute elapses between the two calls: the rounding lands on the wrong minute and you've got a corrupted timezone. So it doesn't really work.
To make it work we can use a timestamp. Python can convert POSIX timestamps into datetime objects. We can use fromtimestamp to get the local datetime from a timestamp and utcfromtimestamp to get the UTC datetime from the same timestamp, then subtract the two, which, this time, represent the exact same instant:
TIMESTAMP = 42
dt_utc = datetime.utcfromtimestamp(TIMESTAMP)
dt_local = datetime.fromtimestamp(TIMESTAMP)
diff = dt_local - dt_utc
local_timezone = timezone(diff)
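The offset obtained this way is the one in effect at the chosen timestamp, so where daylight saving time applies, a timestamp from January 1970 may not give today's offset. A possible variation is to wrap the idea in a function and feed it the current timestamp. This is only a sketch: local_timezone_at is a hypothetical helper name, and the naive UTC reading is obtained through fromtimestamp with timezone.utc, which should give the same value as utcfromtimestamp.

```python
import time
from datetime import datetime, timezone

def local_timezone_at(timestamp):
    """Return the fixed-offset timezone in effect locally at a POSIX timestamp."""
    # Naive UTC reading of the instant (same value utcfromtimestamp would give).
    dt_utc = datetime.fromtimestamp(timestamp, timezone.utc).replace(tzinfo=None)
    # Naive local reading of the very same instant.
    dt_local = datetime.fromtimestamp(timestamp)
    return timezone(dt_local - dt_utc)

# An integer timestamp sidesteps any sub-second rounding concerns.
current_timezone = local_timezone_at(int(time.time()))
print(current_timezone)
```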
Well, to be perfectly honest, I'm still not absolutely convinced that this would work 100% of the time. Depending on the implementation of the fromtimestamp and utcfromtimestamp methods, there may be edge cases that sometimes cause trouble. I don't know. At the end of the day, the true way to get the timezone is to look at the operating system's configuration file containing this information, which is set by the user when they install the operating system. This is what the tzlocal module does, by the way. Python probably does it too whenever it converts between local time and UTC.
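For what it's worth, the standard library seems to offer a shorter route in more recent versions: as far as I can tell from the documentation, since Python 3.3 the astimezone method can be called without an argument, in which case it converts an aware datetime to the system's local timezone. A minimal sketch, assuming such a Python version:

```python
from datetime import datetime, timezone

# astimezone() with no argument converts to the system local timezone.
local_now = datetime.now(timezone.utc).astimezone()
local_timezone = local_now.tzinfo
print(local_timezone, local_now.utcoffset())
```

The tzinfo obtained this way is a fixed-offset timezone reflecting the offset in effect right now, so it won't follow later daylight saving changes.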