How to find the highest (latest) and lowest (earliest) date [R]

Start with some dummy data:

start <- as.Date("2010/01/01")
end <- as.Date("2010/12/31")
set.seed(1)
datewant <- seq(start, end, by = "days")[sample(15)]
tmpTimes <- data.frame(EntryTime = datewant, 
                       ExitTime = datewant + sample(100, 15))
## shuffle the rows so that EntryTime is in random order
tmpTimes <- tmpTimes[sample(NROW(tmpTimes)), ]
head(tmpTimes)

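so we have something like this (the exact rows depend on your R version, because R 3.6.0 changed how sample() draws random numbers):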

> head(tmpTimes)
    EntryTime   ExitTime
8  2010-01-14 2010-03-16
9  2010-01-05 2010-01-17
7  2010-01-10 2010-01-30
3  2010-01-08 2010-04-16
10 2010-01-01 2010-01-26
13 2010-01-12 2010-02-15

Using the above, look at Goal 1: compute the difference between the earliest and latest date. R stores a Date internally as the number of days since 1970-01-01, so you can treat dates as if they were numbers, and functions like min() and max() will work. You can use the difftime() function:

> with(tmpTimes, difftime(max(EntryTime), min(EntryTime)))
Time difference of 14 days

or use standard subtraction

> with(tmpTimes, max(EntryTime) - min(EntryTime))
Time difference of 14 days

to get the difference in days. head() and tail() will only give you the earliest and latest dates if you sort the vector first, because they return the first and last elements by position, not the lowest and highest values.
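
As a quick check of the points above, here is a minimal sketch using a small made-up vector (the name `dates` and its values are mine, not from your data). It shows the day count behind a Date, range() as a one-call alternative to min() and max(), and why head() and tail() need a sorted vector:

```r
# Hypothetical example data, deliberately unsorted
dates <- as.Date(c("2010-01-14", "2010-01-01", "2010-01-15", "2010-01-05"))

# A Date is a day count since 1970-01-01 with a class attribute on top
unclass(as.Date("2010-01-01"))  # 14610

# range() returns the earliest and latest date in one call
rng <- range(dates)

# diff() on Dates gives a difftime; as.numeric() extracts the plain number
span_days <- as.numeric(diff(rng), units = "days")

# head()/tail() work by position, so sort first if you want to use them
sorted <- sort(dates)
earliest <- head(sorted, 1)
latest <- tail(sorted, 1)
```

Using as.numeric() on the difftime is handy when you want to do further arithmetic with the span rather than just print it.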

Goal 2: You seem to be trying to convert a whole data frame to a Date. You can't do this. What you can do is reformat the individual columns of the data frame. Here I add columns to tmpTimes by formatting the EntryTime column into several different summaries of the date.

tmpTimes2 <- within(tmpTimes, weekOfYear <- format(EntryTime, format = "%W-%Y"))
tmpTimes2 <- within(tmpTimes2, monthYear <- format(EntryTime, format = "%B-%Y"))
tmpTimes2 <- within(tmpTimes2, Year <- format(EntryTime, format = "%Y"))

Giving:

> head(tmpTimes2)
    EntryTime   ExitTime weekOfYear    monthYear Year
8  2010-01-14 2010-03-16    02-2010 January-2010 2010
9  2010-01-05 2010-01-17    01-2010 January-2010 2010
7  2010-01-10 2010-01-30    01-2010 January-2010 2010
3  2010-01-08 2010-04-16    01-2010 January-2010 2010
10 2010-01-01 2010-01-26    00-2010 January-2010 2010
13 2010-01-12 2010-02-15    02-2010 January-2010 2010
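
Note that format() returns character strings, not dates, so the new columns are labels you can group by. As a rough sketch of one possible use (the vector `entry` and the other names are made up for illustration), this finds the earliest entry date within each week:

```r
# Hypothetical stand-in data: four entry dates in January 2010
entry <- as.Date(c("2010-01-01", "2010-01-05", "2010-01-10", "2010-01-14"))

# format() gives character labels, one per date
week <- format(entry, format = "%W-%Y")
class(week)  # "character"

# Split the underlying day counts by week label and take each group's minimum
earliest_num <- vapply(split(as.numeric(entry), week), min, numeric(1))

# Convert the day counts back to Date; names are the week labels
earliest <- as.Date(earliest_num, origin = "1970-01-01")
earliest
```

Going through the numeric day counts and back is a deliberate choice here: it avoids relying on how a particular grouping function treats the Date class.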

If you are American or otherwise want the US convention for the start of the week, change %W to %U: %W starts the week on a Monday, whereas %U starts it on a Sunday. ?strftime has more details of what %W and %U represent.
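
To see the difference on a concrete date, here is a small sketch. 2010-01-10 was a Sunday, so it is the last day of a %W week but the first day of a %U week (the variable name `d` is just for illustration):

```r
d <- as.Date("2010-01-10")  # a Sunday

week_monday_start <- format(d, "%W")  # "01": Sunday closes the Monday-based week
week_sunday_start <- format(d, "%U")  # "02": Sunday opens the Sunday-based week
```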


A final point on data format: in the above I have worked with dates in standard R format. You have your data stored in a data frame in a non-standard format, presumably as characters or factors. So you have something like:

tmpTimes3 <- within(tmpTimes, 
                    EntryTime <- format(EntryTime, format = "%d-%m-%y"))
tmpTimes3 <- within(tmpTimes3, 
                    ExitTime <- format(ExitTime, format = "%d-%m-%y"))

> head(tmpTimes3)
   EntryTime ExitTime
8   14-01-10 16-03-10
9   05-01-10 17-01-10
7   10-01-10 30-01-10
3   08-01-10 16-04-10
10  01-01-10 26-01-10
13  12-01-10 15-02-10

You need to convert those characters or factors to something R understands as a date. My preference would be the "Date" class. Before you try the code above on your data, convert it to the correct format:

tmpTimes3 <- 
    within(tmpTimes3, {
           EntryTime <- as.Date(as.character(EntryTime), format = "%d-%m-%y")
           ExitTime <- as.Date(as.character(ExitTime), format = "%d-%m-%y")
           })

so that your data looks like this:

> head(tmpTimes3)
    EntryTime   ExitTime
8  2010-01-14 2010-03-16
9  2010-01-05 2010-01-17
7  2010-01-10 2010-01-30
3  2010-01-08 2010-04-16
10 2010-01-01 2010-01-26
13 2010-01-12 2010-02-15
> str(tmpTimes3)
'data.frame':   15 obs. of  2 variables:
 $ EntryTime:Class 'Date'  num [1:15] 14623 14614 14619 14617 14610 ...
 $ ExitTime :Class 'Date'  num [1:15] 14684 14626 14639 14715 14635 ...
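
One thing to watch for when converting: with an explicit format, as.Date() returns NA for entries it cannot parse rather than stopping with an error. A minimal sketch, assuming a made-up factor `raw` that contains one bad entry:

```r
# Hypothetical raw input: day-month-two-digit-year strings stored as a factor
raw <- factor(c("14-01-10", "05-01-10", "unknown"))

parsed <- as.Date(as.character(raw), format = "%d-%m-%y")

# Entries that do not match the format come back as NA; inspect them
bad <- raw[is.na(parsed)]

# na.rm = TRUE skips the failed conversions when taking min and max
earliest <- min(parsed, na.rm = TRUE)
latest <- max(parsed, na.rm = TRUE)
```

Checking is.na() right after the conversion is a cheap way to catch rows whose format differs from what you assumed.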

Short answer:

  • Convert to the Date class if not already done.
  • Then use min() and max() on the vector of dates.

    date_list = structure(c(15401, 15405, 15405), class = "Date")
    date_list
    #[1] "2012-03-02" "2012-03-06" "2012-03-06"
    
    min(date_list)
    #[1] "2012-03-02"
    max(date_list)
    #[1] "2012-03-06"
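
If you also need to know where the earliest and latest dates sit (for example, to pull the matching row out of a data frame), which.min() and which.max() return their positions. A small sketch using the same date_list values; note that with ties only the first position is returned:

```r
date_list <- structure(c(15401, 15405, 15405), class = "Date")

pos_min <- which.min(date_list)  # 1
pos_max <- which.max(date_list)  # 2, the first of the two tied latest dates

# All positions holding the latest date
pos_all_max <- which(date_list == max(date_list))  # 2 3
```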
    
