What's better: DataSet or DataReader?

That is essentially: "which is better: a bucket or a hose?"

A DataSet is the bucket here; it allows you to carry around a disconnected set of data and work with it - but you will incur the cost of carrying the bucket (so best to keep it to a size you are comfortable with).

A data-reader is the hose: it provides one-way/once-only access to data as it flies past you; you don't have to carry all of the available water at once, but it needs to be connected to the tap/database.

And in the same way that you can fill a bucket with a hose, you can fill the DataSet with the data-reader.

The point I'm trying to make is that they do different things...

I don't personally use DataSet very often - but some people love them. I do, however, make use of data-readers for BLOB access etc.


It depends on your needs. One of the most important differences is that a DataReader will retain an open connection to your database until you're done with it while a DataSet will be an in-memory object. If you bind a control to a DataReader then it's still open. In addition, a DataReader is a forward only approach to reading data that can't be manipulated. With a DataSet you can move back and forth and manipulate the data as you see fit.

Some additional features: DataSets can be serialized and represented in XML and, therefore, easily passed around to other tiers. DataReaders can't be serialized.

On the other hand if you have a large amount of rows to read from the database that you hand off to some process for a business rule a DataReader may make more sense rather than loading a DataSet with all the rows, taking up memory and possibly affecting scalability.

Here's a link that's a little dated but still useful: Contrasting the ADO.NET DataReader and DataSet.


Further to Marc's point: you can use a DataSet with no database at all.

You can fill it from an XML file, or just from a program. Fill it with rows from one database, then turn around and write it out to a different database.

A DataSet is a totally in-memory representation of a relational schema. Whether or not you ever use it with an actual relational database is up to you.