Bug#549227: UDD: please collect and expose the load time for update scripts

Package: qa.debian.org
Severity: wishlist
User: qa.debian.org@...
Usertags: udd

Hi!
It's common in a data warehouse system (which UDD can be considered to be) to keep track of update job times: start, end, duration, records processed, and so on.

This makes it possible to query that information and generate reports on job executions: durations (mean, stddev, etc.), growth, performance, possible tuning due to interaction with other scripts, and so on.

Such information is usually stored in a different (internal) schema than the main one, but I think we can just add a table in 'udd' (maybe prefixed with 'udd_' to clarify that it's a UDD-internal information table).

Thanks,
Sandro

-- System Information:
Debian Release: squeeze/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.30-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

--
To UNSUBSCRIBE, email to debian-bugs-dist-REQUEST@... with a subject of "unsubscribe". Trouble? Contact listmaster@...
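The kind of report asked for here (mean and standard deviation of job durations per source) could look roughly like the sketch below. It is only an illustration, not UDD code: the `JobRun` record, the `duration_report` helper, and the sample runs are hypothetical names and data chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from statistics import mean, stdev


@dataclass
class JobRun:
    """One execution of an update job (hypothetical record layout)."""
    source: str
    start: datetime
    end: datetime


def duration_report(runs):
    """Return {source: (run_count, mean_seconds, stddev_seconds)}."""
    by_source = defaultdict(list)
    for run in runs:
        by_source[run.source].append((run.end - run.start).total_seconds())

    report = {}
    for source, durations in by_source.items():
        # stdev() needs at least two samples; report 0.0 for a single run.
        spread = stdev(durations) if len(durations) > 1 else 0.0
        report[source] = (len(durations), mean(durations), spread)
    return report


# Made-up sample data, for illustration only.
runs = [
    JobRun("bugs", datetime(2009, 10, 1, 20, 0, 0), datetime(2009, 10, 1, 20, 1, 0)),
    JobRun("bugs", datetime(2009, 10, 2, 20, 0, 0), datetime(2009, 10, 2, 20, 2, 0)),
    JobRun("popcon", datetime(2009, 10, 1, 6, 0, 0), datetime(2009, 10, 1, 6, 0, 30)),
]
report = duration_report(runs)
for source, (count, avg, spread) in sorted(report.items()):
    print(f"{source}: runs={count} mean={avg:.1f}s stddev={spread:.1f}s")
```

Keeping one row per run, rather than only the latest timestamp, is what makes this sort of aggregate possible.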
Bug#549227: UDD: please collect and expose the load time for update scripts

On 01/10/09 at 20:08 +0200, Sandro Tosi wrote:
> Package: qa.debian.org
> Severity: wishlist
> User: qa.debian.org@...
> Usertags: udd
>
> Hi!
> It's common in a data warehouse system (which UDD can be considered to be) to keep track
> of update job times: start, end, duration, records processed, and so on.
>
> This makes it possible to query that information and generate reports on job
> executions: durations (mean, stddev, etc.), growth, performance, possible
> tuning due to interaction with other scripts, and so on.
>
> Such information is usually stored in a different (internal) schema than the
> main one, but I think we can just add a table in 'udd' (maybe prefixed with
> 'udd_' to clarify that it's a UDD-internal information table).

Hi Sandro,

Timestamps are now exported to http://udd.debian.org/timing.txt, but this doesn't keep historical information. A patch adding the table you describe would be appreciated (the code would have to be in Python).

--
| Lucas Nussbaum
| lucas@... http://www.lucas-nussbaum.net/ |
| jabber: lucas@... GPG: 1024D/023B3F4F |
Bug#549227: UDD: please collect and expose the load time for update scripts

Hi guys,

On Wed, Oct 07, 2009 at 09:47:11PM +0200, Lucas Nussbaum wrote [edited]:
> On 01/10/09 at 20:08 +0200, Sandro Tosi wrote:
> > It's common in a datawarehouse system (like UDD can be considered) to keep track
> > of the update jobs times: start, end, duration, records elaborated and so on.
[..]
> A patch adding the table you describe would be appreciated (the code
> would have to be python)

Patch attached. I didn't add a duration column as it's trivially calculated on the fly. I'm open to suggestions about getting record counts before and after updates in a generic way.

Cheers,
Serafeim

ps. hacking UDD would be more fun without mixed indentation ;)

--
debtags-organised WNPP bugs: http://members.hellug.gr/serzan/wnpp

Index: udd.py
===================================================================
--- udd.py (revision 1612)
+++ udd.py (working copy)
@@ -8,7 +8,7 @@
 import string
 import sys
 from os import system
-from time import asctime
+import time
 import udd.aux
 import os.path
 
@@ -20,6 +20,23 @@
   for cmd in available_commands:
     print ' %s' % cmd
 
+def insert_timestamps(config, source, command, start_time, end_time):
+    connection = udd.aux.open_connection(config)
+    cur = connection.cursor()
+    values = { 'source' : source,
+               'command' : command,
+               'start_time' : start_time,
+               'end_time' : end_time }
+    cur.execute("""INSERT INTO udd_timestamps
+                    (source, command, start_time, end_time)
+                    VALUES (%(source)s, %(command)s, %(start_time)s,
+                    %(end_time)s)
+                """, values)
+    connection.commit()
+
+def get_timestamp():
+    return time.strftime('%Y-%m-%d %H:%M:%S')
+
 if __name__ == '__main__':
   if len(sys.argv) < 4:
     print_help()
@@ -46,25 +63,13 @@
     # can just use the gatherer's methods
     if command == 'update':
       if "update-command" in src_config:
-        if 'timestamp-dir' in config['general']:
-          f = open(os.path.join(config['general']['timestamp-dir'],
-                                src+".update-start"), "w")
-          f.write(asctime())
-          f.close()
+        start_time = get_timestamp()
         result = system(src_config['update-command'])
         if result != 0:
           sys.exit(result)
-        if 'timestamp-dir' in config['general']:
-          f = open(os.path.join(config['general']['timestamp-dir'],
-                                src+".update-end"), "w")
-          f.write(asctime())
-          f.close()
+        end_time = get_timestamp()
     else:
-      if 'timestamp-dir' in config['general']:
-        f = open(os.path.join(config['general']['timestamp-dir'],
-                              src+".insert-start"), "w")
-        f.write(asctime())
-        f.close()
+      start_time = get_timestamp()
       (src_command,rest) = types[type].split(None, 1)
       if src_command == "exec":
         system(rest + " " + sys.argv[1] + " " + sys.argv[2] + " " + src)
@@ -83,11 +88,8 @@
         else:
           exec "gatherer.%s()" % command
         connection.commit()
-      if 'timestamp-dir' in config['general']:
-        f = open(os.path.join(config['general']['timestamp-dir'],
-                              src+".insert-end"), "w")
-        f.write(asctime())
-        f.close()
+      end_time = get_timestamp()
+    insert_timestamps(config, src, command, start_time, end_time)
   except:
     udd.aux.unlock(config, src)
     raise
Index: sql/setup.sql
===================================================================
--- sql/setup.sql (revision 1612)
+++ sql/setup.sql (working copy)
@@ -535,6 +535,16 @@
 );
 GRANT SELECT ON wannabuild TO public;
 
+-- timings of data operations
+CREATE TABLE udd_timestamps (
+  id serial,
+  source text,
+  command text,
+  start_time timestamp,
+  end_time timestamp,
+  PRIMARY KEY (id)
+);
+GRANT SELECT ON udd_timestamps TO public;
 
 -- views
 -- bugs_count
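To illustrate the point that a duration column is unnecessary, here is a small sketch that derives the duration in the query itself. It is written under stated assumptions: an in-memory SQLite database stands in for UDD's PostgreSQL instance, the timestamps are stored as text in the same `'%Y-%m-%d %H:%M:%S'` format that `get_timestamp()` produces, and the sample rows are made up. On PostgreSQL, subtracting two `timestamp` columns (`end_time - start_time`) yields an interval directly.

```python
import sqlite3

# In-memory SQLite stand-in for the udd_timestamps table from the patch
# (assumption: the real table lives in PostgreSQL and uses timestamp columns).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE udd_timestamps (
        id INTEGER PRIMARY KEY,
        source TEXT,
        command TEXT,
        start_time TEXT,
        end_time TEXT
    )
""")

# Made-up sample rows, for illustration only.
rows = [
    ("bugs", "update", "2009-11-02 22:00:00", "2009-11-02 22:01:30"),
    ("bugs", "run", "2009-11-02 22:01:30", "2009-11-02 22:10:00"),
]
conn.executemany(
    "INSERT INTO udd_timestamps (source, command, start_time, end_time) "
    "VALUES (?, ?, ?, ?)",
    rows,
)

# The duration is computed at query time; SQLite's strftime('%s', ...)
# converts a timestamp string to seconds since the Unix epoch.
durations = conn.execute("""
    SELECT source, command,
           CAST(strftime('%s', end_time) AS INTEGER)
           - CAST(strftime('%s', start_time) AS INTEGER) AS seconds
    FROM udd_timestamps
    ORDER BY id
""").fetchall()

for source, command, seconds in durations:
    print(f"{source} {command}: {seconds}s")
```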
Bug#549227: marked as done (UDD: please collect and expose the load time for update scripts)

Your message dated Tue, 3 Nov 2009 15:36:44 +0100 with message-id <20091103143644.GC3852@...> and subject line Re: Bug#549227: UDD: please collect and expose the load time for update scripts has caused the Debian Bug report #549227, regarding UDD: please collect and expose the load time for update scripts to be marked as done.

This means that you claim that the problem has been dealt with. If this is not the case it is now your responsibility to reopen the Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this message is talking about, this may indicate a serious mail system misconfiguration somewhere. Please contact owner@... immediately.)

--
549227: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=549227
Debian Bug Tracking System
Contact owner@... with problems

On 02/11/09 at 22:59 +0100, Serafeim Zanikolas wrote:
> Hi guys,
>
> On Wed, Oct 07, 2009 at 09:47:11PM +0200, Lucas Nussbaum wrote [edited]:
> > On 01/10/09 at 20:08 +0200, Sandro Tosi wrote:
> > > It's common in a datawarehouse system (like UDD can be considered) to keep track
> > > of the update jobs times: start, end, duration, records elaborated and so on.
> [..]
> > A patch adding the table you describe would be appreciated (the code
> > would have to be python)
>
> Patch attached. I didn't add a duration column as it's trivially calculated on
> the fly. I'm open to suggestions about getting record counts before and after
> updates in a generic way.

[...] udd_timestamps) and adapted the check_timestamp script that tells me when data sources have not been updated for a long time.

> ps. hacking UDD would be more fun without mixed indentation ;)

I thought I had fixed all the files, but I missed udd.py. Fixed now.

--
| Lucas Nussbaum
| lucas@... http://www.lucas-nussbaum.net/ |
| jabber: lucas@... GPG: 1024D/023B3F4F |
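A staleness check of the kind mentioned above (flagging data sources that have not been updated for a long time) might be sketched as follows. This is not the actual check_timestamp script, whose contents are not shown in this thread: the `stale_sources` helper, the two-day threshold, and the sample timestamps are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def stale_sources(last_end_times, now, max_age):
    """Return the sorted names of sources whose last run ended more than
    max_age before now.

    last_end_times maps a source name to the end time of its most recent
    run, e.g. the latest end_time per source in a timings table.
    """
    return sorted(
        source
        for source, ended in last_end_times.items()
        if now - ended > max_age
    )


# Made-up sample data, for illustration only.
now = datetime(2009, 11, 3, 15, 0, 0)
last_end_times = {
    "bugs": datetime(2009, 11, 3, 14, 0, 0),
    "popcon": datetime(2009, 10, 30, 6, 0, 0),
}
stale = stale_sources(last_end_times, now, timedelta(days=2))
print("stale sources:", ", ".join(stale) or "none")
```

Passing `now` in explicitly, instead of calling `datetime.now()` inside the helper, keeps the check deterministic and easy to test.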