CSV file to Histogram in R

I'm also an R newbie, and I ran into the same thing. I made two separate mistakes, actually, so I'll describe them both here.

Mistake 1: Passing a frequency table to hist(). Originally I was trying to pass a frequency table to hist() instead of passing in the raw data. One way to fix this is to use the rep() ("replicate") function to explode your frequency table back into a raw dataset, as described here:

  • Creating a histogram using aggregated data
  • Simple R (histogram) from counted csv file
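For reference, here is a minimal sketch of the rep() approach. The frequency table and its column names (value, count) are made up for illustration and are not from my actual data.

```r
# Hypothetical frequency table: each distinct value and how often it occurred
freq <- data.frame(value = c(1, 2, 3, 4), count = c(3, 5, 2, 1))

# Repeat each value 'count' times to rebuild the raw observations
raw <- rep(freq$value, times = freq$count)

# hist() now receives the raw numeric vector it expects
hist(raw)
```

The times argument takes one repeat count per element, so the rebuilt vector has as many entries as the counts add up to.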

Instead of that, though, I just decided to read in my original dataset instead of the frequency table.

Mistake 2: Wrong data type. My raw data CSV file contains two columns: hostname and bookings (the idea is to count the number of bookings each host generated during some given time period). I read it into a data frame:

> tbl <- read.csv('bookingsdata.csv')

Then when I tried to generate a histogram off the second column, I did this:

> hist(tbl[2])

This gave me the "'x' must be numeric" error you mention in a comment. (Single-bracket indexing, tbl[2], returns a one-column data frame rather than a numeric vector, and hist() only accepts a numeric vector.)

This fixed it:

> hist(tbl$bookings)
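To see the difference for yourself, here is a small sketch that compares the ways of indexing the column. The data frame is built inline with made-up values, standing in for bookingsdata.csv.

```r
# Made-up stand-in for the data read from the CSV file
tbl <- data.frame(hostname = c("a", "b", "c"), bookings = c(4, 7, 2))

class(tbl[2])         # single brackets keep the data frame wrapper
class(tbl[[2]])       # double brackets extract the column itself
class(tbl$bookings)   # $ also extracts the column itself

# Either of the last two forms gives hist() a numeric vector
hist(tbl[[2]])
```

tbl[, 2] should also work here, since a plain data frame drops a single selected column down to a vector by default.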

You should really start by reading some of the basic R manuals. CRAN offers a lot of them (look in the Manuals and Contributed sections).

In any case:

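setwd("path/to/csv/file")
myvalues <- read.csv("filename.csv")  # returns a data frame, even if the file has a single column
# hist() needs a numeric vector, so extract the column (here: the first one)
hist(myvalues[[1]], breaks = 100) # Example: 100 breaks, but you can specify them at will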

See the manual pages for those functions for more help (accessible through ?read.table, ?read.csv and ?hist).


To plot a histogram, the values passed to hist() must be of class numeric. Here, x seems to be of some other class.

Run the following command to see the class of each column:

sapply(myvalues, class)
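If a column turns out not to be numeric, one possible fix is to convert it before plotting. This is only a sketch: the data frame, the column name x, and the stray "n/a" entry are invented to show a common way a numeric column ends up being read as text.

```r
# Hypothetical data: one bad entry makes the whole column character
myvalues <- data.frame(x = c("1.5", "2.0", "n/a", "3.5"),
                       stringsAsFactors = FALSE)

sapply(myvalues, class)   # reports the class of every column

# Convert to numeric; entries that cannot be parsed become NA
x_num <- suppressWarnings(as.numeric(myvalues$x))

# Drop the NAs and plot the remaining numeric values
hist(x_num[!is.na(x_num)])
```

If the column is a factor rather than character, convert it with as.numeric(as.character(...)) instead, because as.numeric() on a factor returns the internal level codes.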
