Split date data (m/d/y) into 3 separate columns

Given a text variable x, like this:

> x
[1] "10/3/2001"

then:

> as.Date(x,"%m/%d/%Y")
[1] "2001-10-03"

converts it to a date object. Then, if you need it:

> julian(as.Date(x,"%m/%d/%Y"))
[1] 11598
attr(,"origin")
[1] "1970-01-01"

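gives you a Julian date: the number of days since the origin, 1970-01-01.

Don't try to pull the fields apart with substring(): month and day are not fixed-width in this format, so the character positions shift from one value to the next.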
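To illustrate why fixed character positions are fragile for m/d/y text, here is a small sketch. The second value, "1/13/2001", is made up for the comparison and is not from the question.

```r
# Two m/d/y strings whose month fields have different widths
# (the second value is a hypothetical example)
x <- c("10/3/2001", "1/13/2001")

# Fixed positions pick up the separator when the month has one digit
by_position <- substr(x, 1, 2)
by_position
# [1] "10" "1/"

# Parsing with an explicit format handles both widths
d <- as.Date(x, format = "%m/%d/%Y")
by_parsing <- format(d, "%m")
by_parsing
# [1] "10" "01"
```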

See help(as.Date) for more.


I use the format() method for Date objects to pull apart dates in R. Using Dirk's datetxt, here is how I would go about breaking up a date into its constituent parts:

datetxt <- c("2010-01-02", "2010-02-03", "2010-09-10")
datetxt <- as.Date(datetxt)
df <- data.frame(date = datetxt,
                 year = as.numeric(format(datetxt, format = "%Y")),
                 month = as.numeric(format(datetxt, format = "%m")),
                 day = as.numeric(format(datetxt, format = "%d")))

Which gives:

> df
        date year month day
1 2010-01-02 2010     1   2
2 2010-02-03 2010     2   3
3 2010-09-10 2010     9  10

Note what several others have said: you can get the Julian dates without splitting out the various date components. I added this answer to show how you could do the breaking apart if you needed it for something else.
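The same format() approach should carry over to the m/d/y text in the question, provided the input format is given to as.Date(). A minimal sketch, assuming every value follows month/day/four-digit-year; the sample values and the name parts are made up for illustration:

```r
# Hypothetical m/d/y input, in the style of the question
x <- c("10/3/2001", "1/13/2001", "12/25/1999")

# Parse once, stating the input format explicitly
d <- as.Date(x, format = "%m/%d/%Y")

# Read each component off the Date and store it as an integer column
parts <- data.frame(month = as.integer(format(d, "%m")),
                    day   = as.integer(format(d, "%d")),
                    year  = as.integer(format(d, "%Y")))
parts
#   month day year
# 1    10   3 2001
# 2     1  13 2001
# 3    12  25 1999
```

Values that do not match the format come back as NA from as.Date(), so it may be worth checking anyNA(d) before building the columns.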


Quick ones:

  1. Julian date converters already exist in base R, see e.g. help(julian).

  2. One approach may be to parse the date as a POSIXlt and to then read off the components. Other date / time classes and packages will work too but there is something to be said for base R.

  3. Handling dates as strings is almost always a bad approach.

Here is an example:

datetxt <- c("2010-01-02", "2010-02-03", "2010-09-10")
dates <- as.Date(datetxt) ## you could examine these as well
plt <- as.POSIXlt(dates)  ## now as POSIXlt types
plt[["year"]] + 1900      ## years are with offset 1900
#[1] 2010 2010 2010
plt[["mon"]] + 1          ## and months are on the 0 .. 11 interval
#[1] 1 2 9
plt[["mday"]] 
#[1]  2  3 10
df <- data.frame(year=plt[["year"]] + 1900, 
                  month=plt[["mon"]] + 1, day=plt[["mday"]])
df
#  year month day
#1 2010     1   2
#2 2010     2   3
#3 2010     9  10

And of course

julian(dates)
#[1] 14611 14643 14862
#attr(,"origin")
#[1] "1970-01-01"
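If "Julian date" is meant as the day of the year rather than days since 1970-01-01, two base R routes seem to fit. This is a sketch under that assumption; the names doy and since_jan1 are only illustrative.

```r
dates <- as.Date(c("2010-01-02", "2010-02-03", "2010-09-10"))

# POSIXlt stores the day of the year in "yday", counted from 0
doy <- as.POSIXlt(dates)[["yday"]] + 1
doy
# [1]   2  34 253

# julian() also accepts a different origin; as.numeric() drops the
# "origin" attribute from the result
since_jan1 <- as.numeric(julian(dates, origin = as.Date("2010-01-01")))
since_jan1
# [1]   1  33 252
```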
