Aquila 10-07-2007, 11:16 PM I've got a question that's been killing me the last couple of days
I have a list of datetimes for arrived and departed that I need to get the total time that doesn't intersect for...
I..E
First Guy Arrived 8 am, departed 10AM
Second Guy Arrived 9am, Departed 10AM
Third Guy Arrived 2PM departed 5PM
Fourth Guy Arrived 9AM left at 1PM
IF I wrote this out the second guy's hours wouldn't be included in the total because he was there the same time as the first, the third guys hours would all count because his time on site didn't intersect anyone and only 3 hours of the fourth guys time would count because some of his time intersected the first two guys..
I need to know how long someone was at the site for, not individually, not counting time when two people where on site or the time no one was on site.
I have absolutley no clue how to write a sql statement to get that information whatsoever using datearrived and datedeparted fields.
this is a pain... can anyone help me with anything? even a slight kick in the right direction?
(SQL Server 2005)
That's an interesting problem, Aquila. I'm not coming up with any reasonable way to solve this with a single query, although it could be done with a stored procedure. Is that an option?
The method I'm thinking of off the top of my head (which isn't even a little bit elegant, but perhaps it will spark someone else to refine it) is to loop through a cursor of your raw data (you could probably group by arrival and get only max departure, to weed out some of the overlap from the beginning), then figure out what the longest first interval is (up to a time when there's no one there), then when the next arrival time is after that, then what's the longest next interval, etc.
Aquila 10-11-2007, 10:27 AM I've done something similiar that isn't working. for every minute I place an entry in a temp table if it doesn't already exists then sum them up. It's ugly as heck and has issues, but... it might work ok when I'm done.
Here's some pseudocode for what I had in mind:
Create cursor of arrival/departure pairs
Create temp table of intervals (start/end times)
For each row in cursor
{
if arrival is between an existing start/end pair in temp table
{
if departure is before end time
{
nevermind; skip this row
}
else
{
increment end time in table to departure from current row
}
}
else if departure is between an existing start/end pair in temp table
{
decrement start time in table to arrival from current row
}
else
{
insert arrival/departure as new start/end pair in temp table
}
}
At the end, your temp table should contain the list of non-overlapping intervals you're looking for, so you can just sum the times.
Aquila 10-22-2007, 01:17 PM Thanks for your help! we got this done friday and your outline helped get our heads on straight! Really appreciate it.
A-Dam 10-26-2007, 09:01 PM You can do it without using a cursor. I know that the guys over at the sister-site DbForums are pretty obsessive about not using cursors (for performance reasons).
I used a table named "minutesinrange" (one field named "minutes") that holds one value for each minute between the earliest arrival time and the last departure time. I named the table with the arrival and departure times "station".
Here is the proc I used to fill the minutesinrange table. In practice, you would probably create it as a temp table:
DECLARE @currtime smalldatetime, @mintime smalldatetime, @maxtime smalldatetime
SET @mintime = (SELECT MIN(arrivetime) FROM station)
SET @maxtime = (SELECT MAX(departtime) FROM station)
SET @currtime = @mintime
WHILE (@currtime <= @maxtime)
BEGIN
INSERT INTO minutesinrange(minutes) VALUES (@currtime)
SET @currtime = dateadd(minute, 1, @currtime)
END
Then here is the query to get the total minutes when the station was occupied:
SELECT COUNT(*) AS total
FROM (SELECT minutes FROM minutesinrange
WHERE (minutes IN
(SELECT minutesinrange.minutes
FROM minutesinrange
INNER JOIN station ON minutesinrange.minutes
BETWEEN station.arrivetime AND station.departtime))) t
This might give results different than you would expect. If someone arrives at 8:00 and departs at 8:02, then you would assume that he was present for 2 minutes, but his time range would include 8:00, 8:01, and 8:02. You can easily replace the above BETWEEN expression with the following:
BETWEEN station.arrivetime AND dateadd(minute, - 1, station.departtime)
There is also a way to do this with an integers table instead of the minutesinrange table. That's a table of consecutive integers from 0 to whatever. An intergers table is very handy to keep in a database, but for this example it would require a more complicated sql query. I can post that method if you desire.
A-Dam 10-27-2007, 09:52 PM In the last post I used a table containing every minute between the very first arrival and last departure. If taken over the span of a week's time, this would require 60*24*7 (10080) rows. And that table's field was a smalldatetime datatype.
You can instead use a table of consecutive integers with only as many rows equaling the longest time span in the other table. In the poster's example, the fourth guy arrived at 9am and departed at 1pm. That would be 4 hours, or 240 minutes (241 when you include 1:00). You can find the longest time span with a query like this:
SELECT MAX(DATEDIFF(minute, arrivetime, departtime)) AS diff FROM Station
Then generate a table (named "ints" with one field named "i") with integers from 0 to 240. An integers table (pre-made) with more values will also work, and is a handy thing to have in any database.
Then this query will return the total number of minutes when people were present at the station:
SELECT COUNT(*) AS minutespresent
FROM (SELECT DISTINCT DATEADD(minute, ints.i, station.arrivetime) AS times
FROM ints INNER JOIN station ON ints.i
BETWEEN 0 AND DATEDIFF(minute, station.arrivetime, station.departtime)) t
|