What is the difference between Caching and Pooling?

Caching is saving a value/object for reuse - normally to save resources.

Wikipedia says:

a cache is a component that transparently stores data so that future requests for that data can be served faster.

Pooling is similar, where you have a number of such objects (a pool) - once an object has been taken out of the pool and used, it is returned to the pool for later reuse.

Wikipedia says:

A pool in computer science is a set of initialised resources that are kept ready to use, rather than allocated and destroyed on demand.


Caching usually refers to holding onto a static copy of data for quick retrieval (with the assumption that retrieval or calculation of the value is expensive).

Pooling usually refers to keeping a number of resources around for quick usage (with the assumption that the creation and/or disposal of these resources is expensive).
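
To make the caching half of this concrete, here is a minimal sketch in Java. The class name `MemoizingCache` and its `loader` function are made up for illustration; it simply assumes the expensive work can be expressed as a function of a key, and keeps the result so the work is done once per key.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical helper: remembers the result of an expensive lookup per key.
class MemoizingCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader;                     // the "expensive" retrieval
    private final AtomicInteger loads = new AtomicInteger(); // counts real loads

    MemoizingCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // computeIfAbsent only calls the loader when the key is not cached yet
        return store.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    int loadCount() {
        return loads.get();
    }
}
```

Calling `get` twice with the same key should hit the loader only once; the second call is answered from the stored copy.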


Both aim for object reuse. The distinction is usually drawn along statefulness; a pool is a collection of stateless objects, a cache is one of stateful objects. See this explanation.


Cache - stores frequently used values, typically because the lookup and/or creation is non-trivial. E.g. if a lookup table from a database is frequently used, or values are read from a file on disk, it's more efficient to keep them in memory and refresh them periodically.

A cache only manages object lifetime in the cache, but does not impose semantics on what is held in the cache. A cache also doesn't create the items, but just stores objects.
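
As a rough illustration of "manages lifetime but doesn't create the items", here is a small least-recently-used cache built on `java.util.LinkedHashMap`. The class name `LruCache` and the choice of LRU eviction are my own assumptions; other eviction policies (time-based expiry, for instance) fit the same description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a cache that only decides how long entries live.
// Callers create the values themselves and hand them over via put().
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, oldest first
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion; returning true
        // evicts the least recently used entry.
        return size() > maxEntries;
    }
}
```

Note that the cache never constructs a value and knows nothing about what the values are; it just holds whatever it is given until the entry is evicted. This version is not thread-safe.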

Pool - a term for a group of resources that are managed by the pool itself. E.g. a (database) connection pool: when a connection is needed it is obtained from the pool, and when the caller has finished with it, it is returned to the pool.

The pool itself handles creation and destruction of the pooled objects, and manages how many objects can be created at any one time.
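
A bare-bones sketch of that idea, assuming a hypothetical `SimplePool` class that is given a factory and a maximum size. Real connection pools also validate and destroy objects; that part is left out here to keep the example short.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Sketch: the pool (not the caller) creates the objects and caps their number.
class SimplePool<T> {
    private final BlockingQueue<T> idle;   // objects currently not in use
    private final Supplier<T> factory;     // how the pool creates a new object
    private final int maxSize;
    private int created = 0;

    SimplePool(int maxSize, Supplier<T> factory) {
        this.maxSize = maxSize;
        this.factory = factory;
        this.idle = new ArrayBlockingQueue<>(maxSize);
    }

    // Returns a pooled object, or null if none becomes free within the timeout.
    T borrow(long timeoutMillis) {
        T item = idle.poll();
        if (item != null) {
            return item;                   // reuse an idle object
        }
        synchronized (this) {
            if (created < maxSize) {
                created++;
                return factory.get();      // still below the cap: create one
            }
        }
        try {
            return idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    // Hands a borrowed object back so that another caller can reuse it.
    void release(T item) {
        idle.offer(item);
    }

    synchronized int createdCount() {
        return created;
    }
}
```

With `new SimplePool<>(2, StringBuilder::new)`, a third `borrow` while two objects are checked out returns `null` after the timeout, and a released object is the one handed to the next caller.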

Pools are typically used to reduce overhead and throttle access to resources. You wouldn't want every servlet request opening a new connection to the database, because then you have a 1:1 relationship between active requests and open connections. The overhead of creating and destroying these connections is wasteful, plus you could easily overwhelm your database. By using a pool, these open connections can be shared. For example, 500 active requests might share as few as 5 database connections, depending on how long a typical request needs the connection.
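
The throttling effect can be simulated without a real database. In the sketch below, plain `Object` instances stand in for open connections, and the class name `PoolThrottleDemo`, the 20 worker threads and the 2 ms "query" are all invented for the demo. It reports the highest number of stand-in connections that were in use at the same moment.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PoolThrottleDemo {
    // Runs `requests` simulated requests over `poolSize` shared resources and
    // returns the peak number of resources in use at the same time.
    static int run(int requests, int poolSize) {
        BlockingQueue<Object> pool = new ArrayBlockingQueue<>(poolSize);
        for (int i = 0; i < poolSize; i++) {
            pool.add(new Object());                // stand-in for an open connection
        }
        AtomicInteger inUse = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService workers = Executors.newFixedThreadPool(20);
        for (int i = 0; i < requests; i++) {
            workers.submit(() -> {
                try {
                    Object conn = pool.take();     // waits while all are busy
                    try {
                        int now = inUse.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(2);           // simulated query
                    } finally {
                        inUse.decrementAndGet();
                        pool.put(conn);            // return it for the next request
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        workers.shutdown();
        try {
            workers.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }
}
```

However many requests are submitted, `PoolThrottleDemo.run(50, 5)` should never report more than 5 in use at once, because a request has to hold one of the 5 objects to proceed.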

Cache Pool - mostly seems to describe the number of (independent?) caches that exist. E.g. an ASP.NET application has 1 cache per Application Domain (the cache isn't shared between ASP.NET applications). Literally a pool of caches, although this term seems to be used rarely.

Tags:

Java