Algorithm to exploit 32 bit time functions to do time zone calculations

Hi,

I'm writing to you in reference to your libtai library whose TODO file states that "support time zones" is still todo. I had originally considered using libtai in Perl to avoid the Unix 2038 bug, but Perl requires time zone support. Instead, I am rewriting the time.h library functions to be 2038-clean. The effort is located here.
http://code.google.com/p/y2038/

The piece which is of interest to libtai is this:
http://code.google.com/p/y2038/wiki/HowItWorks

I have figured out a way to make use of 32 bit system functions to do 64 bit time zone and daylight savings calculations. I thought you might be able to apply this to libtai.

Thanks,
Schwern

PS Any potential license issues I'm happy to work out.

--
Stabbing you in the face so you don't have to.
|
|
|
|
|
Re: Algorithm to exploit 32 bit time functions to do time zone calculations

Michael G Schwern dixit:

>You: "Your shit is boring."

More like "has been done already". Sorry.

>If I understand correctly, that's an entire operating system.

Yes, but the gist is that there is code which uses a certain 64-bit type, called time_t, but you could of course just use int64_t instead, which does the job quite fine. (Not even a binary change in the time zone data format.)

>Also you might want to have a look at the tests in y2038.

This is actually a good idea. Compiling (I think it was) CVS with the changes was a good test too, as the configure script “checks whether mktime() works”. To get that right (over all of the 64 bit) was hard. You have to think about very many border cases… in the end, I just ensured the round-trip via tai64_t was right, not necessarily the tai64_t representation itself.

bye,
//mirabilos
--
Sometimes they [people] care too much: pretty printers [and syntax highlighting, d.A.] mechanically produce pretty output that accentuates irrelevant detail in the program, which is as sensible as putting all the prepositions in English text in bold font.
	-- Rob Pike in "Notes on Programming in C"
|
|
Re: Algorithm to exploit 32 bit time functions to do time zone calculations

Thorsten Glaser wrote:
> Michael G Schwern dixit:
>
>> You: "Your shit is boring."
>
> More like "has been done already". Sorry.

Apology accepted. Sorry things got off to a bad start.

>> If I understand correctly, that's an entire operating system.
>
> Yes, but the gist is, that there is code which uses a certain 64-bit type,
> called time_t, but you could of course just use int64_t instead, which does
> the job quite fine. (Not even a binary change in the time zone data format.)

Is this approach portable outside BSD?

My main targets are Perl, Python and Ruby, which have to be absurdly portable. As I understand it, almost nothing about the time zone database is portable. For example, I could not expect tzload() to work on all operating systems. I would have to ship my own, which I do not want to do, or write time zone code for each operating system, which I also do not want to do.

I could be wrong, I'm really a Perl programmer who plays a C programmer on TV.

That said, I do recognize that a lot of my work will boil down to just doing a search and replace for "time_t" with "Time64_T" and "int" with "Int64_T". For example, asctime(). It's really the trick to get localtime() working that's important. I intend to loot BSD code for everything I can; right now I'm mostly using it as a reference. In fact, I'm considering changing from MIT to BSD license to make that even easier.

Speaking of asctime(), I think this is a y2**31 bug in asctime3:

    char year[INT_STRLEN_MAXIMUM(int) + 2];

Since 64 bit time can go well above the year 2 billion, years must be stored as 64 bit ints. If int is 32 bits, the code above is only allocating enough room for 2**31 years. I've been testing a lot of 64 bit systems' time handling lately and that's a common mistake, 32 bit years. Though not as bad as HP/UX's Y10k bug.

Oh, don't forget to make tm.tm_year 64 bit!

>> Also you might want to have a look at the tests in y2038.
>
> This is actually a good idea. Compiling (I think it was) CVS with the
> changes was a good test too, as the configure script “checks whether
> mktime() works”. To get that right (over all of the 64 bit) was hard.
> You have to think about very many border cases… in the end, I just
> ensured the round-trip via tai64_t was right, not necessarily the
> tai64_t representation itself.

To test timegm() I just round tripped it through gmtime() at various interesting times.

    time = 60*60*16;
    gmtime64_r(&time, &date);
    is_Int64( timegm64(&date), time, "timegm64(60*60*16)" );

An mktime() test should work the same way, round trip through localtime(), no?

You can see the test here (and I really should put all that repeated code into its own test function).
http://code.google.com/p/y2038/source/browse/trunk/t/timegm.c

Anyhow, if you're interested in the Test Anything Protocol, a really simple implementation is here:
http://code.google.com/p/y2038/source/browse/trunk/tap.c

And the code to run them is here in the "tap_tests" target.
http://code.google.com/p/y2038/source/browse/trunk/Makefile

More info about TAP can be found here:
http://testanything.org/wiki/index.php/Main_Page

The MyTAP library used by MySQL might be relevant (and cleaner than mine)
http://www.kindahl.net/mytap/doc/

And MySQL's documentation on that.
http://dev.mysql.com/doc/mysqltest/en/unit-test.html

--
The mind is a terrible thing, and it must be stopped.
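
The round-trip idea in this message can be tried against a system's own functions. What follows is only a minimal sketch, not the y2038 test suite: it assumes the platform provides the non-standard timegm() (glibc and the BSDs do) and POSIX gmtime_r(), and it stays within the range of a 32 bit time_t. The helper names roundtrips_via_gmtime() and all_roundtrip() are made up for illustration.

```c
#define _DEFAULT_SOURCE   /* asks glibc to expose timegm(); harmless elsewhere */
#include <stddef.h>
#include <time.h>

/* Returns 1 if t survives gmtime_r() -> timegm() unchanged, 0 otherwise. */
static int roundtrips_via_gmtime(time_t t)
{
    struct tm date;

    if (gmtime_r(&t, &date) == NULL)
        return 0;                       /* conversion failed: no round trip */
    return timegm(&date) == t;
}

/* Hypothetical driver: check a handful of "interesting" times. */
static int all_roundtrip(void)
{
    const time_t samples[] = {
        0,                  /* the epoch */
        60 * 60 * 16,       /* the value used in the message above */
        86400,              /* exactly one day */
        951782400,          /* 2000-02-29 00:00:00 UTC, a leap day */
        2147483647          /* last second of a signed 32 bit time_t */
    };
    size_t i;

    for (i = 0; i < sizeof samples / sizeof samples[0]; i++)
        if (!roundtrips_via_gmtime(samples[i]))
            return 0;
    return 1;
}
```

An mktime() check would look the same with localtime_r() and mktime() swapped in, though DST transitions then need extra care around ambiguous local times.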
|
|
Re: Algorithm to exploit 32 bit time functions to do time zone calculations

Michael G Schwern dixit:

>>> If I understand correctly, that's an entire operating system.
>>
>> Yes, but the gist is, that there is code which uses a certain 64-bit type,
>> called time_t, but you could of course just use int64_t instead, which does
>> the job quite fine. (Not even a binary change in the time zone data format.)
>
>Is this approach portable outside BSD?
>
>My main targets are Perl, Python and Ruby which have to be absurdly portable.
>As I understand it, almost nothing about the time zone database is portable.

Oh, okay. This is a step further away from Unix.

>For example, I could not expect tzload() to work on all operating systems. I
>would have to ship my own, which I do not want to do, or write time zone code
>for each operating system, which I also do not want to do.

You probably could just take the entire Olson time library in its original, portable state, and change that. (This would have the beneficial side effect of replacing vendors’ probably buggier time libraries and time zone databases with something known to work.)

But I like your “wrapping” approach.

>In fact, I'm considering changing from
>MIT to BSD license to make that even easier.

There is not a single BSD licence. MIT is just fine… OpenBSD uses ISC for new code, and, being European, MirBSD has to have its own only slightly different one too ;)

>Speaking of asctime(), I think this is a y2**31 bug in asctime3:
>
>    char year[INT_STRLEN_MAXIMUM(int) + 2];

Yup, probably.

>Oh, don't forget to make tm.tm_year 64 bit!

Yeah, that bites us big time. For example, look here:
http://cvs.mirbsd.de/ports/lang/python/2.5/patches/patch-Modules_datetimemodule_c

When you have only one of the two chunks, it dumps core. Takes a while to find it, since -Wformat did not, obviously, catch this case. It usually spots all occurrences though.

>Anyhow, if you're interested in the Test Anything Protocol, a really simple

I implemented it in Python for the day-job project I'm currently working on, since my colleague is a Perl fan… but thanks ;)

bye,
//mirabilos
|
|
Re: Algorithm to exploit 32 bit time functions to do time zone calculations

Michael G Schwern dixit:

>Speaking of asctime(), I think this is a y2**31 bug in asctime3:

Two of them even, please look at asctime.c in cvsweb, I fixed it ☺
(The allbsd.org mirror will not update before 04:10 UTC though you can use http://www.mirbsd.org/cvs.cgi/)

//mirabilos
|
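
For the year buffer discussed above, here is a minimal sketch of sizing it for a 64 bit year. It is not the fix committed to MirBSD's asctime.c, which is not shown in this thread; the names YEAR64_BUFSIZE and format_year64() are invented for illustration.

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * The longest int64_t in decimal is "-9223372036854775808":
 * 1 sign + 19 digits = 20 characters, plus 1 for the terminating NUL.
 */
#define YEAR64_BUFSIZE 21

/*
 * Hypothetical helper: writes year in decimal into buf and returns the
 * number of characters written (excluding the NUL), as snprintf() does.
 */
static int format_year64(int64_t year, char buf[YEAR64_BUFSIZE])
{
    return snprintf(buf, YEAR64_BUFSIZE, "%" PRId64, year);
}
```

The point is only that the buffer size has to follow the type that actually holds the year, not plain int.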
|
Re: Algorithm to exploit 32 bit time functions to do time zone calculations

Michael G Schwern dixit:

>That said, I do recognize that a lot of my work will boil down to just doing a
>search and replace for "time_t" with "Time64_T" and "int" with "Int64_T".

Ah. The trick with my implementation is to change “int” and “long” to “time_t” *ONLY* where it is necessary, and leave the remaining narrow integer types alone. In some places, I use int64_t or uint64_t for casts, for either clarity, simplicity or portability, but mostly, I stuck with time_t, as it’s 32-bit on the sparc platform, 64-bit on the i386 platform with MirBSD.

I wanted to avoid switching EVERY integer type to 64 bit even where not needed, as that can be much slower and is much bigger.

bye,
//mirabilos
|
|
Re: Algorithm to exploit 32 bit time functions to do time zone calculations

Thorsten Glaser wrote:
>> My main targets are Perl, Python and Ruby which have to be absurdly portable.
>> As I understand it, almost nothing about the time zone database is portable.
>
> Oh, okay. This is a step further away from Unix.

It's about nine. :) If you're morbidly curious, take a look at the supported platform list for Perl.
http://perldoc.perl.org/perlport.html#Supported-Platforms

>> For example, I could not expect tzload() to work on all operating systems. I
>> would have to ship my own, which I do not want to do, or write time zone code
>> for each operating system, which I also do not want to do.
>
> You probably could just take the entire Olson time library in its original,
> portable state, and change that. (This would have the beneficial side effect
> to replace vendors’ probably buggier time libraries and time zone databases
> with something known to work.)

I've considered that, yes system libraries are often suspect, but I don't want to now have each application with its own independent time zone library that has to be updated by the user independent of the system's own. Odds are, it won't get updated.

> But I like your “wrapping” approach.

Thanks!

Maybe you can shed some light on this problem, what to do about year 0? Right now the only limit I have on dates is the limit of what Time64_T can store. Does it make sense to stop at 0? I've seen a number of implementations that do.

My thinking for gmtime() is that a negative year is just BC, so let it go!

localtime()... well, localtime() gets absurd real fast. Gregorian/Julian calendar shifts. The time zone simply not having existed in the past. It's really hard to say what the locals would have thought the datetime was X seconds ago.

Any insights?

--
You are wicked and wrong to have broken inside and peeked at the implementation and then relied upon it.
	-- tchrist in <31832.969261130@chthon>
|
|
Re: Algorithm to exploit 32 bit time functions to do time zone calculations

Michael G Schwern dixit:

>Odds are, it
>won't get updated.

True…

>Maybe you can shed some light on this problem, what to do about year 0?

I actually haven’t thought about year 0, since localtime/mktime do that, and mjd2tm and tm2mjd from DJB code. I think it’s illegal, isn’t it?

>Right now the only limit I have on dates is the limit of what Time64_T
>can store. Does it make sense to stop at 0?

No, it makes sense to not stop. The thing is, mktime() and gmtime() *must* have full round-trip capabilities (GNU autoconf checks for that before it uses it), which is why, for example, my tai64_t data type does not exactly store what DJB calls a TAI timestamp. There is a small wraparound at about 0x8000000000000000 (time_t) / 0xC000000000000000 (tai64_t) of 10 (I think, due to the leap seconds) seconds. So, for whatever value you have in whatever representation (time_t, tai64_t, struct tm), all of these must have full round-trip capabilities (with the possible exception that struct tm with a 64-bit year can go beyond the 64-bit time_t value scale, but then, a 32-bit year does the same for a 32-bit time_t, so this is no change).

>localtime()... well, localtime() gets absurd real fast. Gregorian/Julian
>calendar shifts.

Mh. Maybe a struct tm.tm_year is always Gregorian? Have a look, while at it, at the "%J" strftime modifier (and, especially, my implementation of it, using the tm2mjd function). This will get you Julian days, but nothing in Unix has them split off into a calendar time kind of structure.

So I think you don’t need to worry about THAT. That’s for the application layer to do, similar to Hebrew, Muslim, Asian etc. calendars. And at that, it REALLY gets absurd (cf. http://blogs.msdn.com/michkap/default.aspx), but that’s not the (our) OS’ job to worry about.

However, for “absurd” years, the OS’ own functions might go crazy. Too bad we can’t access the OS’ own time zone table (I need to get the info for leap seconds out of it, for example).

bye,
//mirabilos
|
|
Re: Algorithm to exploit 32 bit time functions to do time zone calculations

It's very nice to have someone else to bounce this all off of.

Thorsten Glaser wrote:
>> Maybe you can shed some light on this problem, what to do about year 0?
>
> I actually haven’t thought about year 0, since localtime/mktime do that,
> and mjd2tm and tm2mjd from DJB code. I think it’s illegal, isn’t it?

Illegal according to who?

C99 and POSIX 1003.1 define tm.tm_year as a signed int with a range of "years since 1900" which makes a negative year perfectly "legal".

>> Right now the only limit I have on dates is the limit of what Time64_T
>> can store. Does it make sense to stop at 0?
>
> No, it makes sense to not stop. The thing is, mktime() and gmtime()
> *must* have full round-trip capabilities (GNU autoconf checks for that
> before it uses it), which is why, for example, my tai64_t data type does
> not exactly store what DJB calls a TAI timestamp. There is a small
> wraparound at about 0x8000000000000000 (time_t) / 0xC000000000000000 (tai64_t)
> of 10 (I think, due to the leap seconds) seconds. So, for whatever value
> you have in whatever representation (time_t, tai64_t, struct tm), all of
> these must have full round-trip capabilities (with the possible exception
> that struct tm with a 64-bit year can go beyond the 64-bit time_t value
> scale, but then, a 32-bit year does the same for a 32-bit time_t, so this
> is no change).

I'm a little lost. Could you give an example?

>> localtime()... well, localtime() gets absurd real fast. Gregorian/Julian
>> calendar shifts.
>
> Mh. Maybe a struct tm.tm_year is always Gregorian? Have a look, while at
> it, at the "%J" strftime modifier (and, especially, my implementation of
> it, using the tm2mjd function). This will get you Julian days, but nothing
> in Unix has them split off into a calendar time kind of structure.

C99 says

    "Many functions [in time.h] deal with a calendar time that represents the current date (according to the Gregorian calendar) and time."

But then goes on to say:

    "Some functions deal with local time, which is the calendar time expressed for some specific time zone, and with Daylight Saving Time, which is a temporary change in the algorithm for determining local time. The local time zone and Daylight Saving Time are implementation-defined."

POSIX 1003.1 does not appear to discuss the matter.

It seems clear to me that gmt is always Gregorian but not so clear what happens with localtime().

I wonder how a Chinese locale deals with this. They switched from Julian to Gregorian after 1901, so it should show up in any Chinese localtime() implementation. Also Russia and much of Eastern Europe.

> So I think you don’t need to worry about THAT. That’s application layer
> to do, similar to hebrew, muslim, asian etc. calendars. And at that, it
> REALLY gets absurd (cf. http://blogs.msdn.com/michkap/default.aspx), but
> that’s not the (our) OS’ job to worry about.
>
> However, for “absurd” years, the OS’ own functions might go crazy. Too
> bad we can’t access the OS’ own time zone table (I need to get the info
> for leap seconds out of it, for example).

I'm just glad that ctime() isn't locale sensitive. Oi, what a mess that would be.

PS I just found this gem in the mktime() standard:

    the original values [in the tm struct] of the other components are not restricted to the ranges described in <time.h>.

which the BSD time.h man page expands out to:

    The original values of the tm_wday and tm_yday components of the structure are ignored, and the original values of the other components are not restricted to their normal ranges, and will be normalized if needed. For example, October 40 is changed into November 9, a tm_hour of -1 means 1 hour before midnight, tm_mday of 0 means the day preceding the current month, and tm_mon of -2 means 2 months before January of tm_year. (A positive or zero value for tm_isdst causes mktime() to presume initially that summer time (for example, Daylight Saving Time) is or is not in effect for the specified time, respectively. A negative value for tm_isdst causes the mktime() function to attempt to divine whether summer time is in effect for the specified time. The tm_isdst and tm_gmtoff members are forced to zero by timegm().)

    On successful completion, the values of the tm_wday and tm_yday components of the structure are set appropriately, and the other components are set to represent the specified calendar time, but with their values forced to their normal ranges; the final value of tm_mday is not set until tm_mon and tm_year are determined. The mktime() function returns the specified calendar time; if the calendar time cannot be represented, it returns -1;

This is something I haven't tested for yet.

PPS How does one get the damn qsecretary program to stop making you confirm every email to this list? I'm already subscribed.

--
The interface should be as clean as newly fallen snow and its behavior as explicit as Japanese eel porn.
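
The normalization rule quoted from the man page ("October 40 is changed into November 9") is easy to check against the system's own mktime(). This is a minimal sketch with a made-up helper name, normalize_date(); it uses noon so that no DST transition can interfere, and leaves tm_isdst at -1 so mktime() works out DST on its own.

```c
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: fills *out with the given (possibly out-of-range)
 * calendar date at local noon and lets mktime() normalize it in place.
 * year is the full year (e.g. 2008), mon is 0-based as in struct tm.
 * Returns whatever mktime() returns, i.e. (time_t)-1 on failure.
 */
static time_t normalize_date(int year, int mon, int mday, struct tm *out)
{
    memset(out, 0, sizeof *out);
    out->tm_year  = year - 1900;
    out->tm_mon   = mon;
    out->tm_mday  = mday;
    out->tm_hour  = 12;     /* noon: away from DST switch-over times */
    out->tm_isdst = -1;     /* let mktime() decide whether DST applies */
    return mktime(out);
}
```

Calling normalize_date(2008, 9, 40, &date), that is "October 40, 2008", should leave date.tm_mon at 10 and date.tm_mday at 9, with tm_wday and tm_yday filled in to match.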
|
|
Re: Algorithm to exploit 32 bit time functions to do time zone calculations

Michael G Schwern dixit:

>>> Maybe you can shed some light on this problem, what to do about year 0?
>>
>> I actually haven’t thought about year 0, since localtime/mktime do that,
>> and mjd2tm and tm2mjd from DJB code. I think it’s illegal, isn’t it?
>
>Illegal according to who?
>
>C99 and POSIX 1003.1 define tm.tm_year as a signed int with a range of "years
>since 1900" which makes a negative year perfectly "legal".

0 is not negative, and there has been no year 0, only 1 ante christo (-1) followed directly by 1 post christo. (Ironically, he was probably not born by then.)

>I'm a little lost. Could you give an example?

Yup. Convert -2⁶³ from time_t to TAI (while honouring leap seconds), and it will wrap, because the result would be -2⁶³-10 (plus the BIAS), which is actually positive. But that doesn’t matter, as it wraps back on the way back. So don’t introduce any arbitrary limits.

>I wonder how a Chinese locale deals with this. They switched from Julian to
>Gregorian after 1901, so it should show up in any Chinese localtime()
>implementation. Also Russia and much of Eastern Europe.

I think a “struct tm” is just always Gregorian, since other calendars do not necessarily have the same day/month/year concept (example: Japanese).

>PPS How does one get the damn qsecretary program to stop making you confirm
>every email to this list? I'm already subscribed.

Same problem here. I think it uses the envelope address, not the header address, to check, which is, IMO, a bug in DJB’s mailing list software ☺

bye,
//mirabilos
|
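
The wrap-and-wrap-back behaviour described above falls out of unsigned (modulo 2^64) arithmetic. The following is a minimal sketch and not the MirBSD tai64_t code: the bias of 2^62 (as in DJB's TAI64 labels), the fixed 10 second offset, its sign, and the function names are all assumptions made for illustration. A real conversion would consult a leap second table.

```c
#include <stdint.h>

#define TAI_BIAS   UINT64_C(0x4000000000000000) /* assumed bias: 2^62 */
#define TAI_OFFSET UINT64_C(10)                 /* assumed constant offset, seconds */

/* time_t-like value -> biased TAI-like value; may wrap modulo 2^64. */
static uint64_t unix_to_tai(int64_t t)
{
    return (uint64_t)t + TAI_BIAS - TAI_OFFSET;
}

/* Inverse of unix_to_tai(); any wrap on the way out is undone here. */
static int64_t tai_to_unix(uint64_t tai)
{
    uint64_t u = tai - TAI_BIAS + TAI_OFFSET;

    /* Convert back to signed without relying on implementation-defined casts. */
    if (u <= (uint64_t)INT64_MAX)
        return (int64_t)u;
    return -(int64_t)(~u) - 1;
}
```

With these assumptions, -2^63 minus 10 wraps to a value that would read as positive once the bias is removed, yet tai_to_unix(unix_to_tai(INT64_MIN)) still returns INT64_MIN, so no artificial limit is needed for the round trip to hold.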
|
Re: Algorithm to exploit 32 bit time functions to do time zone calculations

Thorsten Glaser wrote:
> Michael G Schwern dixit:
>
>>>> Maybe you can shed some light on this problem, what to do about year 0?
>>> I actually haven’t thought about year 0, since localtime/mktime do that,
>>> and mjd2tm and tm2mjd from DJB code. I think it’s illegal, isn’t it?
>> Illegal according to who?
>>
>> C99 and POSIX 1003.1 define tm.tm_year as a signed int with a range of "years
>> since 1900" which makes a negative year perfectly "legal".
>
> 0 is not negative, and there has been no year 0, only 1 ante christo (-1)
> followed directly by 1 post christo. (Ironically, he was probably not born
> by then.)

Oh, wasn't thinking about that. I was thinking about negative years. But you're right, year 0 is sticky. Let's look at what ISO 8601 does...

Wikipedia claims ISO 8601-2004 treats year 0 as 1 BC, but I can't find an explicit reference. They also claim that ISO 8601-2004 uses "Astronomical year numbering" but again, I can't find that in the standard. It just says a "calendar year" is "in the Gregorian calendar" (2.2.13). However, it seems perfectly sensible and I think I'll go with that.

Here's the language in 3.2.1 "The Gregorian calendar" which uses their usual "mutual agreement of the partners in information interchange" cop out:

    The use of this calendar for dates preceding the introduction of the Gregorian calendar [1582] (also called the proleptic Gregorian calendar) should only be by agreement of the partners in information interchange.

And in 4.1.2.1...

    calendar year is, unless specified otherwise, represented by four digits. Calendar years are numbered in ascending order according to the Gregorian calendar by values in the range [0000] to [9999]. Values in the range [0000] through [1582] shall only be used by mutual agreement of the partners in information interchange.

They also define an expanded year format which, "by mutual agreement" allows for negative years. (4.4.3.3)

>> I wonder how a Chinese locale deals with this. They switched from Julian to
>> Gregorian after 1901, so it should show up in any Chinese localtime()
>> implementation. Also Russia and much of Eastern Europe.
>
> I think a “struct tm” is just always gregorian, since other calendars
> do not neccessarily have the same day/month/year concept (example:
> Japanese).

I've got a call out for someone to test what a properly localized Unix dist does, just to get a data point.

--
"I went to college, which is a lot like being in the Army, except when stupid people yell at me for stupid things, I can hit them."
	-- Jonathan Schwarz
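
Under astronomical year numbering, the convention settled on above, year 0 is 1 BC, year -1 is 2 BC, and so on. Here is a minimal sketch of turning such a 64 bit year into an era label. The function name format_era_year() and the "AD"/"BC" suffixes are choices made for illustration, not anything from y2038 or ISO 8601.

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats an astronomical year as "<n> AD" or "<n> BC".
 * Astronomical year y <= 0 corresponds to (1 - y) BC, so 0 -> "1 BC".
 * Returns the snprintf() result.
 */
static int format_era_year(int64_t year, char *buf, size_t len)
{
    uint64_t bc;

    if (year > 0)
        return snprintf(buf, len, "%" PRId64 " AD", year);

    /* Compute 1 - year in unsigned arithmetic so INT64_MIN cannot overflow. */
    bc = (uint64_t)1 - (uint64_t)year;
    return snprintf(buf, len, "%" PRIu64 " BC", bc);
}
```

A 32 byte buffer is enough for any int64_t input, since the longest result is 19 digits plus the suffix.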