Image palette reduction

  • http://en.wikipedia.org/wiki/Color_quantization
  • Octree
  • Median-cut
  • K-means
  • Gamut subdivision
  • http://www.cs.berkeley.edu/~dcoetzee/downloads/scolorq/

The reference links people have provided are good, and there are several solutions to this problem, but since I've been working on it recently (with complete ignorance of how others have solved it), I offer my approach in plain English:

Firstly, realize that (human perceived) color is 3-dimensional. This is fundamentally because the human eye has 3 distinct receptors: red, green, and blue. Likewise, your monitor has red, green, and blue pixel elements. Other representations, like hue, saturation, luminance (HSL), can be used, but basically all representations are 3-dimensional.

This means RGB color space is a cube, with red, green, and blue axes. From a 24-bit source image, this cube has 256 discrete levels on each axis. A naive approach to reducing the image to 8-bit color is to simply reduce the levels per axis. For instance, an 8x8x4 cube palette with 8 levels for red and green and 4 levels for blue is easily created by taking the high 3 bits of the red and green values and the high 2 bits of the blue value. This is easy to implement, but has several disadvantages. In the resulting 256-color palette, many entries will not be used at all. If the image has detail conveyed by very subtle color shifts, those shifts will disappear as the colors snap to the same palette entry.
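For concreteness, here is a minimal sketch of that naive 3-3-2 reduction in TypeScript (the function name and the packed 0xRRGGBB integer layout are my own assumptions, not from the original):

function naive332(rgb: number): number {
  // keep the top 3 bits of red and green and the top 2 bits of blue,
  // yielding an index into an implicit 8x8x4 = 256 entry palette
  const r = (rgb >> 16) & 0xff;
  const g = (rgb >> 8) & 0xff;
  const b = rgb & 0xff;
  return ((r >> 5) << 5) | ((g >> 5) << 2) | (b >> 6);
}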

An adaptive palette approach needs to account not just for the average or most common colors in the image, but also for which areas of color space have the greatest variance. That is, an image that has thousands of subtle shades of light green requires a different palette than an image that has thousands of pixels of exactly the same shade of light green, since the latter would ideally use a single palette entry for that color.

To this end, I took an approach that results in 256 buckets containing exactly the same number of distinct values each. So if the original image contained 256000 distinct 24-bit colors, this algorithm results in 256 buckets each containing 1000 of the original values. This is accomplished by binary spatial partitioning of color space using the median of distinct values present (not the mean).

In English, this means we first divide the whole color cube into the half of colors below the median red value and the half above it. Then we divide each resulting half by green value, then by blue, and so on. Each split requires a single bit to indicate the lower or higher half, so after 8 splits (2^8 = 256 leaves) the variance has effectively been spread across 256 equally important clusters in color space.

In pseudo-code (JavaScript-flavored):

// count distinct 24-bit colors from the source image;
// to minimize resources, an array of [color, count] pairs is used,
// with each color packed as a 0xRRGGBB integer
const paletteRoot = { colors: [/* [color0, count0], [color1, count1], ... */] }; // root node has all values

// collect the current leaves of the partition tree
function leafNodes(node) {
  return node.lo ? [...leafNodes(node.lo), ...leafNodes(node.hi)] : [node];
}

for (let i = 0; i < 8; i++) {
  const colorPlane = i % 3; // red,green,blue,red,green,blue,red,green
  const shift = 16 - 8 * colorPlane; // bit offset of that channel within 0xRRGGBB
  for (const node of leafNodes(paletteRoot)) { // on first pass, this is just the root itself
    node.colors.sort((a, b) => ((a[0] >> shift) & 0xff) - ((b[0] >> shift) & 0xff)); // sort by red, green, or blue
    const mid = node.colors.length >> 1;
    node.lo = { colors: node.colors.slice(0, mid) }; // lower half
    node.hi = { colors: node.colors.slice(mid) }; // upper half
    node.splitColor = node.hi.colors[0][0]; // remember the median color used to partition
    node.colorPlane = colorPlane; // remember which channel this node split on
    delete node.colors; // free up space! otherwise memory will explode
  }
}

You now have 256 leaf nodes, each containing the same number of distinct colors from the original image, clustered spatially in the color cube. To assign each node a single color, find the weighted average using the color counts. The weighting is an optimization that improves perceptual color matching, but is not that important. Make sure to average each color channel independently. The results are excellent. Note that it is intentional that blue is split one fewer time than red and green, since the blue receptors in the eye are less sensitive to subtle changes than the red and green ones.
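A minimal sketch of that averaging step, assuming the [color, count] pair layout from the pseudo-code above (the function name is mine):

function averageColor(colors: [number, number][]): number {
  // count-weighted mean of each channel, computed independently
  let r = 0, g = 0, b = 0, total = 0;
  for (const [color, count] of colors) {
    r += ((color >> 16) & 0xff) * count;
    g += ((color >> 8) & 0xff) * count;
    b += (color & 0xff) * count;
    total += count;
  }
  return (Math.round(r / total) << 16) | (Math.round(g / total) << 8) | Math.round(b / total);
}

Calling averageColor(leaf.colors) on each of the 256 leaves produces the final palette.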

There are other optimizations possible. By using HSL you could instead put the finer quantization in the luminance dimension rather than in red and green. Also, the above algorithm will slightly reduce overall dynamic range (since it ultimately averages color values), so dynamically expanding the resulting palette is another possibility.
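Finally, the splitColor and colorPlane values recorded during partitioning are what let you map an arbitrary pixel back to its bucket. A sketch of that lookup, again assuming the node shape from the pseudo-code (paletteIndex is a hypothetical field you would assign to each leaf alongside its averaged color):

function lookup(node: any, rgb: number): number {
  // descend the partition tree, comparing the pixel's channel value
  // against the median channel value recorded at each split
  while (node.lo) {
    const shift = 16 - 8 * node.colorPlane; // red=16, green=8, blue=0
    const pixelChannel = (rgb >> shift) & 0xff;
    const splitChannel = (node.splitColor >> shift) & 0xff;
    node = pixelChannel < splitChannel ? node.lo : node.hi;
  }
  return node.paletteIndex; // hypothetical: assigned when averaging the leaves
}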