|
View:
New views
12 Messages
—
Rating Filter:
Alert me
|
|
|
Scheduling: Comparing every appointment to every other appointment time efficientlyI'm back with more on my scheduling question. Now I'm sorting out how
to exhaustively compare all appointments in a selection for overlaps. If I have ten appointments, I need to check each one against each other one in the selection. The problem here is that a naive implemention requires events.count^2 comparisons. If you've got 10 records, you've got 100 comparison pairs and if you've got 100 records you got 10,000 comparison pairs. Ouch. Now, given pretty well any exhaustive comparison situation you can normally cut down the comparison pairs by half or slightly better 'for free': * A data point doesn't ever need to be compared to itself. ("Bob's Meeting" doesn't need to be compared to "Bob's Meeting".) * A pair doesn't have to be compared redundantly ("Bob's Meeting" is compared with "Mary's Meeting" so "Mary's Meeting" doesn't then need to be compared to "Bob's Meeting"). This reduces the number of comparisons by a bit better than half: (Events - 1) * (Events / 2) So, if you have 10 events, you need 45 comparison pairs and if you have 100 events, you need 4,950 comparison pairs. That's a good deal better than the naive exponential exploration - but it's still sloping up way too fast. To make the scope of the problem clearer, below are a few numbers for the comparison pairs (searches in this case) for a staring number of events. Events Naive Half Off 10 100 45 100 10,000 4,950 1,000 1,000,000 499,500 So, "half-off" is better but it's still not good enough past a very small starting set. Does anyone have a solution that is much better? I'm really hoping for something that is roughly (Events * 2) comparisons or not so far off of that. At lest something that doesn't have a rocket-tail shaped curve up. Such a sharp curve is frightening as the behavior of the program may change dramatically given only slightly different initial conditions. For example, using the "half off" approach, a starting selection of 20 records takes 190 comparisons while a starting selection of 60 records takes 1,770 comparisons - more than 9 times as many for 3 times the starting selection. That's a risky sort of solution. By the way, for anyone that has such a situation and wants the code for the "half-off" solution, the loops below are just a shell of this. All they do is accumulate some text showing the pairs compared. The clipboard runs out of space if you set the number of items in the test array very far up. In any case, even the starting case of five elements used below is enough to see the code at work. C_LONGINT($events_count) $events_count:=5 ARRAY LONGINT($eventIDs_al;$events_count) C_LONGINT($event_index) For ($event_index;1;$events_count) $eventIDs_al{$event_index}:=$event_index End for C_TEXT($results_text) $results_text:="" C_LONGINT($outer_index) For ($outer_index;1;$events_count) C_LONGINT($baseID_l) $baseID_l:=$eventIDs_al{$outer_index} C_LONGINT($startInnerLoop_l) $startInnerLoop_l:=$outer_index+1 C_LONGINT($inner_index) For ($inner_index;$startInnerLoop_l;$events_count) C_LONGINT($comparisonID_l) $comparisonID_l:=$eventIDs_al{$inner_index} $results_text:=$results_text+String($baseID_l) $results_text:=$results_text+" : " $results_text:=$results_text+String($comparisonID_l) $results_text:=$results_text+Char(Carriage return ) End for End for SET TEXT TO CLIPBOARD($results_text) Below is the output, showing that every element is compared to every other element (but not itself) once and only once: 1 : 2 1 : 3 1 : 4 1 : 5 2 : 3 2 : 4 2 : 5 3 : 4 3 : 5 4 : 5 ********************************************************************** Get up to $600 to spend on Amazon.com this holiday season - http://www.4d.com/serverpromo.html 4D Internet Users Group (4D iNUG) FAQ: http://lists.4d.com/faqnug.html Archive: http://lists.4D.com/archives.html Options: https://lists.4d.com/mailman/options/4d_tech Unsub: mailto:4D_Tech-Unsubscribe@... ********************************************************************** |
|
|
Re: Scheduling: Comparing every appointment to every other appointment time efficientlyDavid,
I think the best you'll be able to do is N*(log(N)). In some cases this will be N*2, but not in the general case. General Idea: sort the list by start date/time first For each record: use an intelligent search (log N) to find the first record after your selected record with a start time before your selected record end time. if no record is found, you are done. If a record is found you have your first conflict and the subsequent conflicts will be sequentially after this one until the start time of the next record is after your selected end time. Hope this makes sense. Also, it's been a while since I was in my CS books, so remember what you paid for my advice :-) >I'm back with more on my scheduling question. Now I'm sorting out how >to exhaustively compare all appointments in a selection for overlaps. >If I have ten appointments, I need to check each one against each >other one in the selection. The problem here is that a naive >implemention requires events.count^2 comparisons. If you've got 10 >records, you've got 100 comparison pairs and if you've got 100 records >you got 10,000 comparison pairs. Ouch. > >Now, given pretty well any exhaustive comparison situation you can >normally cut down the comparison pairs by half or slightly better 'for >free': > >* A data point doesn't ever need to be compared to itself. ("Bob's >Meeting" doesn't need to be compared to "Bob's Meeting".) > >* A pair doesn't have to be compared redundantly ("Bob's Meeting" is >compared with "Mary's Meeting" so "Mary's Meeting" doesn't then need >to be compared to "Bob's Meeting"). > >This reduces the number of comparisons by a bit better than half: > > (Events - 1) * (Events / 2) > >So, if you have 10 events, you need 45 comparison pairs and if you >have 100 events, you need 4,950 comparison pairs. That's a good deal >better than the naive exponential exploration - but it's still sloping >up way too fast. To make the scope of the problem clearer, below are a >few numbers for the comparison pairs (searches in this case) for a >staring number of events. > >Events Naive Half Off > 10 100 45 > 100 10,000 4,950 >1,000 1,000,000 499,500 > >So, "half-off" is better but it's still not good enough past a very >small starting set. Does anyone have a solution that is much better? >I'm really hoping for something that is roughly (Events * 2) >comparisons or not so far off of that. At lest something that doesn't >have a rocket-tail shaped curve up. Such a sharp curve is frightening >as the behavior of the program may change dramatically given only >slightly different initial conditions. For example, using the "half >off" approach, a starting selection of 20 records takes 190 >comparisons while a starting selection of 60 records takes 1,770 >comparisons - more than 9 times as many for 3 times the starting >selection. That's a risky sort of solution. -- Doug Cottrill PTM Software, LLC ********************************************************************** Get up to $600 to spend on Amazon.com this holiday season - http://www.4d.com/serverpromo.html 4D Internet Users Group (4D iNUG) FAQ: http://lists.4d.com/faqnug.html Archive: http://lists.4D.com/archives.html Options: https://lists.4d.com/mailman/options/4d_tech Unsub: mailto:4D_Tech-Unsubscribe@... ********************************************************************** |
|
|
|
|
|
Re: Scheduling: Comparing every appointment to every other appointment time efficientlyDavid,
I should have said log 2 of N (or log e of N- not sure which). In computer algorithm terms, N log N (or N*N, or whatever) will be pretty imprecise for calculating exact numbers of operations. Generally, discussion of computer algorithm times are prefaced with "on the order of". So, with N=1000, your computation time is "on the order of" 10,000 operations. Still- much better than N*N which is on the order of 1,000,000 operations. >On Thu, Nov 12, 2009 at 2:55 PM, Doug Cottrill <dkc@...> wrote: >> David, >> >> I think the best you'll be able to do is N*(log(N)). In some cases this >> will be N*2, but not in the general case. > >N * (Log(N)) would be good enough, I'd say. That should make it cost >about 3,000 comparisons for a 1,000 starting selection - I can deal >with that kind of ratio. -- Doug Cottrill PTM Software, LLC ********************************************************************** Get up to $600 to spend on Amazon.com this holiday season - http://www.4d.com/serverpromo.html 4D Internet Users Group (4D iNUG) FAQ: http://lists.4d.com/faqnug.html Archive: http://lists.4D.com/archives.html Options: https://lists.4d.com/mailman/options/4d_tech Unsub: mailto:4D_Tech-Unsubscribe@... ********************************************************************** |
|
|
Re: Scheduling: Comparing every appointment to every other appointment time efficientlyDavid Adams:
"Now I'm sorting out how to exhaustively compare all appointments in a selection for overlaps." I am confused by this, because the queries discussed before were proposed to avoid vector comparisons in a loop construct. Thus, when dealing with a subset of records, why not simply use QUERY SELECTION? Sincerely, Sean Broderick ______________________________________________________________________ This email has been scanned by the MessageLabs Email Security System. For more information please visit http://www.messagelabs.com/email ______________________________________________________________________ ********************************************************************** Get up to $600 to spend on Amazon.com this holiday season - http://www.4d.com/serverpromo.html 4D Internet Users Group (4D iNUG) FAQ: http://lists.4d.com/faqnug.html Archive: http://lists.4D.com/archives.html Options: https://lists.4d.com/mailman/options/4d_tech Unsub: mailto:4D_Tech-Unsubscribe@... ********************************************************************** |
|
|
Re: Scheduling: Comparing every appointment to every other appointment time efficientlyOn Fri, Nov 13, 2009 at 5:23 AM, Broderick, Sean - Direct Brands
<Sean.Broderick@...> wrote: > David Adams: > > "Now I'm sorting out how to exhaustively compare all appointments in a > selection for overlaps." > > I am confused by this, because the queries discussed before were proposed to > avoid vector comparisons in a loop construct. I ended up going with indexed queries on date+time signatures as it's easy. I'm still interested in a more purely mathematical approach - if it can be made to run easily in 4D. I mean, calendars are, in one way, the most commonly used Cartesian space in day-to-day life. > Thus, when dealing with a subset of records, why not simply use QUERY > SELECTION? The earliest query was to answer the question "I want to make an appointment from 4:00-4:30 in Conference Room A, does that conflict with any other meetings in that room?" The nested loops are meant to address the question "I've got 100 appointments with various start/end times - do any of them overlap?" Yes, QUERY SELECTION would work - but you would end up with something like this: CREATE SET([Event];"EventsToCheck") for ($index;1;records in set("EventsToCheck") use set("EventsToCheck") goto selected record([Event];$index) QUERY SELECTION([event];time range check;*) QUERY SELECTION[event];&;skip matching current record) do something with the found conflict records, if any. end for ...you could load IDs into an array instead of using USE SET. If I've understood correctly, it seems like this approach would end up having me run the full query for every event. ********************************************************************** Get up to $600 to spend on Amazon.com this holiday season - http://www.4d.com/serverpromo.html 4D Internet Users Group (4D iNUG) FAQ: http://lists.4d.com/faqnug.html Archive: http://lists.4D.com/archives.html Options: https://lists.4d.com/mailman/options/4d_tech Unsub: mailto:4D_Tech-Unsubscribe@... ********************************************************************** |
|
|
Re: Scheduling: Comparing every appointment to every other appointment time efficientlyDavid Adams:
"I've got 100 appointments with various start/end times - do any of them overlap?" Oh, I misunderstood. "If I've understood correctly, it seems like this approach would end up having me run the full query for every event. Yes, but I argue the pseudo-code you wrote would perform well enough and have the benefit of not sending a future colleague to their dusty algorithm books for a bout of nostalgia. On the other hand, here is another idea... ORDER BY([event];[event]start_stamp) SELECTION TO ARRAY([event]start_stamp;$event_start) SELECTION TO ARRAY([event]end_stamp;$event_end) Now, step through the arrays from element 2 to (Size - 1). Comparing current vector boundaries to those of the previous and following vectors should easily reveal conflicts. In your language, wouldn't this be O(n)? Sincerely, Sean Broderick ______________________________________________________________________ This email has been scanned by the MessageLabs Email Security System. For more information please visit http://www.messagelabs.com/email ______________________________________________________________________ ********************************************************************** Get up to $600 to spend on Amazon.com this holiday season - http://www.4d.com/serverpromo.html 4D Internet Users Group (4D iNUG) FAQ: http://lists.4d.com/faqnug.html Archive: http://lists.4D.com/archives.html Options: https://lists.4d.com/mailman/options/4d_tech Unsub: mailto:4D_Tech-Unsubscribe@... ********************************************************************** |
|
|
Re: Scheduling: Comparing every appointment to every other appointment time efficientlyOn Fri, Nov 13, 2009 at 8:33 AM, Broderick, Sean - Direct Brands
<Sean.Broderick@...> wrote: > David Adams: > > "I've got 100 appointments with various start/end times - do any of them > overlap?" > > Oh, I misunderstood. > > "If I've understood correctly, it seems like this approach would end up > having me run the full query for every event. > > Yes, but I argue the pseudo-code you wrote would perform well enough and > have the benefit of not sending a future colleague to their dusty algorithm > books for a bout of nostalgia. > > On the other hand, here is another idea... > > ORDER BY([event];[event]start_stamp) > SELECTION TO ARRAY([event]start_stamp;$event_start) > SELECTION TO ARRAY([event]end_stamp;$event_end) > > Now, step through the arrays from element 2 to (Size - 1). Comparing current > vector boundaries to those of the previous and following vectors should > easily reveal conflicts. Yes - exactly. I tried out something similar last night and it works very well. What I've actually done is modify the nested loop I already posted so that, again, elements are not compared to themselves and no pair of elements are ever compared twice. Because the items are ordered, you can get a big reduction in overall comparisons: $comparisons_count:=0 ARRAY LONGINT($eventIDs_al;0) ARRAY STRING(20;$startExpression_as;0) ARRAY STRING(20;$endExpression_as;0) SELECTION TO ARRAY([Event]ID;$eventIDs_al;[Event]Start_Expression;$startExpression_as;[Event]End_Expression;$endExpression_as) SORT ARRAY($startExpression_as;$endExpression_as;$eventIDs_al) C_LONGINT($events_count) $events_count:=Size of array($eventIDs_al) C_LONGINT($outer_index) For ($outer_index;1;$events_count) C_LONGINT($baseID_l) C_STRING(20;$baseEndExpression_s) $baseID_l:=$eventIDs_al{$outer_index} $baseEndExpression_s:=$endExpression_as{$outer_index} C_LONGINT($startInnerLoop_l) $startInnerLoop_l:=$outer_index+1 C_LONGINT($inner_index) For ($inner_index;$startInnerLoop_l;$events_count) C_LONGINT($comparisonID_l) C_STRING(20;$comparisonStartExpression_s) $comparisonID_l:=$eventIDs_al{$inner_index} $comparisonStartExpression_s:=$startExpression_as{$inner_index} $comparisons_count:=$comparisons_count+1 If ($baseEndExpression_s>$comparisonStartExpression_s) ` Ends after next session starts APPEND TO ARRAY($conflictIDsOneArray_pointer->;$baseID_l) APPEND TO ARRAY($conflictIDsTwoArray_pointer->;$comparisonID_l) Else ` No overlap - as the array is ordered, no further (later) overlaps are possible for this session. $inner_index:=$events_count+1 ` BREAK End if End for End for $0:=$comparisons_count Feel free to laugh at me for breaking a For loop rather than using another sort of loop. The gain depends on the data. If you have 1,000 appointments with no conflicts, you need 999 comparisons to check everything. If you have 1,000 appointments all at the same time (worst case) it's, well, worst case ;-) In the real world of my data, I think we'll get results in under .1 second. ********************************************************************** Get up to $600 to spend on Amazon.com this holiday season - http://www.4d.com/serverpromo.html 4D Internet Users Group (4D iNUG) FAQ: http://lists.4d.com/faqnug.html Archive: http://lists.4D.com/archives.html Options: https://lists.4d.com/mailman/options/4d_tech Unsub: mailto:4D_Tech-Unsubscribe@... ********************************************************************** |
|
|
Re: Scheduling: Comparing every appointment to every other appointment time efficientlyOn 11/12/09 1:08 PM, "David Adams" <dpadams@...> wrote:
> I'm still interested in a more purely mathematical approach Here's my thought on this subject, which may or may not be useful at this point. If I understand correctly, your basic task is to make sure that you don't double book a room for 2 (or more) meetings at the same time. If I've misunderstood, the rest of this post is moot =) Would it just be easier to compare against the room at that time instead of against all other meetings? For example, assuming you have a 10 hour business day and 10 rooms, at most each day you'll only have to run 100 comparisons. This also assumes that you're blocking off rooms in 1 hour increments. If you block off in 15 minute increments, you're looking at 400 comparisons. But this can be improved further. If I know PartyA wants to schedule their meeting at 9:00, then I only need to check the 9 o'clock hour of the 10 rooms -- 10 comparisons max. (If they want to meet for 2 hours, then that's 20 comparisons max). What I'm trying to illustrate is that in a check for overlap, I'm only interested in if there's overlap at all; not specifically which meeting already has that time and space reserved. Kind Regards, Chris Visaya ********************************************************************** Get up to $600 to spend on Amazon.com this holiday season - http://www.4d.com/serverpromo.html 4D Internet Users Group (4D iNUG) FAQ: http://lists.4d.com/faqnug.html Archive: http://lists.4D.com/archives.html Options: https://lists.4d.com/mailman/options/4d_tech Unsub: mailto:4D_Tech-Unsubscribe@... ********************************************************************** |
|
|
|
|
|
Re: Scheduling: Comparing every appointment to every other appointment time efficientlyOn Fri, Nov 13, 2009 at 9:39 AM, Chris Visaya <cvisaya@...> wrote:
> On 11/12/09 1:08 PM, "David Adams" <dpadams@...> wrote: > >> I'm still interested in a more purely mathematical approach > > Here's my thought on this subject, which may or may not be useful at this > point. If I understand correctly, your basic task is to make sure that you > don't double book a room for 2 (or more) meetings at the same time. If I've > misunderstood, the rest of this post is moot =) > > Would it just be easier to compare against the room at that time instead of > against all other meetings? For example, assuming you have a 10 hour > business day and 10 rooms, at most each day you'll only have to run 100 > comparisons. This also assumes that you're blocking off rooms in 1 hour > increments. If you block off in 15 minute increments, you're looking at 400 > comparisons. > > But this can be improved further. If I know PartyA wants to schedule their > meeting at 9:00, then I only need to check the 9 o'clock hour of the 10 > rooms -- 10 comparisons max. (If they want to meet for 2 hours, then that's > 20 comparisons max). > > What I'm trying to illustrate is that in a check for overlap, I'm only > interested in if there's overlap at all; not specifically which meeting > already has that time and space reserved. Thanks for the suggestion - it's definitely a good one. Coincidentally, I had much the same suggestion (not on the list) from Steven Willis yesterday. In my case, I'm working on a couple of different types of conflict. Primarily, it's a matter of checking rooms for conflicts - as you say. After that, I'm left with some more ad-hoc requirements, such as checking for conflicts on the schedule for a specific person. That's what's leading me towards a search for a more generalized approach to conflict searches. With that said, I think the idea of treating the times in the day as discrete units of 5-15-15 minutes is a very solid approach, so I'm glad to have it in mind and also happy to see it here on the list/in the archives. ********************************************************************** Get up to $600 to spend on Amazon.com this holiday season - http://www.4d.com/serverpromo.html 4D Internet Users Group (4D iNUG) FAQ: http://lists.4d.com/faqnug.html Archive: http://lists.4D.com/archives.html Options: https://lists.4d.com/mailman/options/4d_tech Unsub: mailto:4D_Tech-Unsubscribe@... ********************************************************************** |
|
|
Re: Scheduling: Comparing every appointment to every other appointment time efficientlyFor anyone following along at home - or consulting the archives in the
future - I thought I'd toss in a few comments about the last method I posted. * It's a scratch method - it may not even work correctly at detection. I have to test it formally. (I know, I'm bad. No UnitTester in 2004 to keep me more honest!) * I'll probably end up with a parameter list like so: $1 Array of event IDs $2 Array of event start time signatures (2009-11-15-12060) $3 Array of event end time signatures $4 Array to store first part of conflict pairs $5 Array to store second half of conflict pairs $6 {Time arrays are already sorted?} The idea being that all of the comparison information is passed in and then two parallel arrays are populated with the IDs of each conflict pair, if any. Using arrays throughout simplifies experimenting when records don't already exist, frees the code from any particular table, lets me select the data however I like, etc. The "Time arrays are already sorted?" business is a bit of an efficiency thingy. If the arrays are already sorted, they needn't be sorted again redundantly. If the arrays aren't sorted, the calling method doesn't have to do it - the code can be put into the main method and handled there. This is pretty much a convenience a lot of the time but, hey, it's basically a free enhancement. The inner loops terminate when the first non-conflict event is found - or there's nothing left to compare. Some might consider doing a binary search to find the first not matching event as this may take fewer iterations than a linear search (what I'm doing.) In this case, I need all of the conflicting event IDs so I'll need to traverse those elements regardless. So, a binary search would (often) be a more efficient way to locate the first non-conflicting appointment but is a less efficient (very nearly always) way of visiting each of the conflicting appointments. I mention this as sometimes what people need is to find the first non-conflict (open space) - where using a binary search in the inner search might actually be a sound idea. ********************************************************************** Get up to $600 to spend on Amazon.com this holiday season - http://www.4d.com/serverpromo.html 4D Internet Users Group (4D iNUG) FAQ: http://lists.4d.com/faqnug.html Archive: http://lists.4D.com/archives.html Options: https://lists.4d.com/mailman/options/4d_tech Unsub: mailto:4D_Tech-Unsubscribe@... ********************************************************************** |
| Free embeddable forum powered by Nabble | Forum Help |