At 99designs we love great design, and a big part of good design is use of color. We were interested to see how designers make use of color in their designs, so we built an automatic color extractor to enable us to analyse color usage at a massive scale.
Think of a design you love. Part of the story it tells is in the colors it uses, the contrast of light and shade, and the subtle emotions those colors convey.
It's pretty easy for a human to tell what colors are important in a design. Take this logo below for example. Most people identifying the important colors would come up with something like this:
Images are made up of pixels, and if you just count the colors of every pixel, you don't get anything like the list of colors above. This post is about how we worked out a way to automatically extract a color palette from an image that is close to what a real person would pick.
Problem 1: Detecting background color
Quick quiz: What's the primary color in this image?
The problem here is that if you simply count the number of pixels of each color, the background color nearly always dominates. We need to work out the background color so we can exclude it.
We found a simple approach that works well in most cases: if the pixels in the corners of the image are all the same color, that color is the background.
Let's look again. In this case, the corner pixels are:
#ffffff -- all white. We can thus safely exclude white as a background color, leaving red as the most frequent color. Nice!
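The corner check can be sketched in a few lines of Python. This is a minimal illustration, not the code Colorific ships: the helper names (`detect_background`, `count_colors`, `hex_color`) are made up for this example, and the image is assumed to be a row-major list of (r, g, b) tuples rather than a real image file.

```python
from collections import Counter

def hex_color(rgb):
    """Format an (r, g, b) tuple as a #rrggbb string."""
    return "#%02x%02x%02x" % rgb

def detect_background(pixels, width, height):
    """Return the color shared by all four corners, or None if they differ.

    `pixels` is a row-major list of (r, g, b) tuples (an assumption of
    this sketch).
    """
    corners = {
        pixels[0],                     # top-left
        pixels[width - 1],             # top-right
        pixels[(height - 1) * width],  # bottom-left
        pixels[height * width - 1],    # bottom-right
    }
    return corners.pop() if len(corners) == 1 else None

def count_colors(pixels, width, height):
    """Count pixels per color, leaving out the detected background."""
    counts = Counter(pixels)
    background = detect_background(pixels, width, height)
    if background is not None:
        del counts[background]
    return counts, background

# A 4x3 white image with a small red mark in the middle row.
WHITE, RED = (255, 255, 255), (204, 0, 0)
image = [
    WHITE, WHITE, WHITE, WHITE,
    WHITE, RED,   RED,   WHITE,
    WHITE, WHITE, WHITE, WHITE,
]
counts, background = count_colors(image, width=4, height=3)
print(hex_color(background))                   # #ffffff
print(hex_color(counts.most_common(1)[0][0]))  # #cc0000
```

If the corners disagree, the sketch reports no background and leaves the counts untouched, which matches the "works well in most cases" caveat above.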
Problem 2: Too many colors
Quiz time again: how many colors are in this image?
4 colors, right? Not quite. Look closer at the G for example:
There are actually 255 different colors (an occult number for computer scientists). If we take the top four, we get:
Aww, not that great. It has two very similar blues, and misses the yellow entirely.
The problem is that colors that are not exactly the same are treated as different by a computer. All the different shades and variants of colors mess up the counts. Humans easily group sets of colors together though, and ideally our program would do the same.
So how can we judge if two colors look the same to the human eye or not? Fortunately, a bit of color theory comes in handy here.
Aside: Color theory
On a computer, colors are usually represented in the RGB color space. This means that a color is made up of three components: Red, Green and Blue. To work out the distance between two colors you can use the Euclidean distance between their components.
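As a quick illustration, the RGB distance is just the square root of the summed squared differences of the three channels. The function name `rgb_distance` is our own for this sketch, and colors are assumed to be (r, g, b) tuples with values from 0 to 255.

```python
import math

def rgb_distance(c1, c2):
    """Euclidean distance between two (r, g, b) tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))

# Two nearly identical reds are close together...
print(rgb_distance((255, 0, 0), (250, 5, 5)))      # sqrt(75), about 8.66
# ...while black and white are as far apart as RGB allows.
print(rgb_distance((0, 0, 0), (255, 255, 255)))    # about 441.67
```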
Comparing colors in the RGB color space works OK, but it's not perfect. Differences in RGB don't accurately match how the human eye perceives color. For example, yellow often appears brighter to humans than a blue of the same intensity. Also, humans can perceive smaller differences in green hues than in pink.
A better way to compare colors is to convert to the Lab color space. Lab is designed to allow comparison in a way that matches how the human eye perceives color. A color in the Lab color space has three components: "L" represents lightness, the "a" component ranges from green to magenta, and the "b" component ranges from blue to yellow. The distance between colors in the Lab color space is often called the delta-E. As a rule of thumb, a delta-E of less than about 1.0 means that the human eye cannot tell the difference between two colors.
We can take advantage of this to group similar colors together in a way that matches how the eye would do it naturally.
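Here is a rough sketch of what that comparison can look like in code. It assumes the input is 8-bit sRGB, uses the D65 reference white, and computes the simplest delta-E variant (plain Euclidean distance in Lab, often called CIE76). The post does not say which conversion or delta-E formula Colorific uses, so treat the function names and constants below as illustrative choices.

```python
import math

# D65 reference white (an assumption of this sketch).
WHITE_D65 = (0.95047, 1.0, 1.08883)

def _linearize(channel):
    """Undo the sRGB gamma curve for one 0-255 channel."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def _f(t):
    """Nonlinear function used by the XYZ -> Lab conversion."""
    delta = 6.0 / 29.0
    if t > delta ** 3:
        return t ** (1.0 / 3.0)
    return t / (3 * delta ** 2) + 4.0 / 29.0

def rgb_to_lab(rgb):
    """Convert an sRGB (r, g, b) tuple to an (L, a, b) tuple."""
    r, g, b = (_linearize(v) for v in rgb)
    # Linear sRGB -> CIE XYZ.
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    fx, fy, fz = (_f(v / w) for v, w in zip((x, y, z), WHITE_D65))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))

def delta_e(rgb1, rgb2):
    """Euclidean distance between two colors in Lab space (CIE76)."""
    lab1, lab2 = rgb_to_lab(rgb1), rgb_to_lab(rgb2)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab1, lab2)))

print(rgb_to_lab((255, 255, 255)))            # L close to 100, a and b close to 0
print(delta_e((255, 0, 0), (254, 0, 0)))      # well under 1.0
print(delta_e((255, 0, 0), (0, 0, 255)))      # very large: clearly different colors
```

More refined delta-E formulas exist; the Euclidean version is used here only because it is the shortest to write down.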
Merging colors together
By combining visually similar colors, we're able to address most of the earlier problems. The color theory above tells us which colors to merge: pairs with a low delta-E are visually similar, and can be safely combined. On our noisy image, this leads to a clearer list of colors:
As it turns out, there are a whole lot of situations in which extra colors get added: antialiasing on edges of shapes, image compression artifacts, textures and gradients all add to the number of different colors that occur, even if they don't change the overall palette. This technique helps to deal with these issues. It can leave behind very low counts of some noisy colors -- an additional threshold filter helps to clean up colors that don't occur very often.
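One way to sketch the merge-then-threshold idea is a greedy pass: walk the colors from most to least frequent, fold each one into the first already-kept color within a delta-E cutoff, then drop anything that ends up with a tiny share of the pixels. This is an illustrative approach rather than Colorific's exact algorithm, and the cutoffs (`max_delta_e=10.0`, `min_share=0.01`) are made-up values chosen for the example. The Lab conversion assumes sRGB input and a D65 white point.

```python
import math

WHITE_D65 = (0.95047, 1.0, 1.08883)  # reference white (assumption)

def _linearize(channel):
    c = channel / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def _f(t):
    delta = 6.0 / 29.0
    return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3 * delta ** 2) + 4.0 / 29.0

def rgb_to_lab(rgb):
    """Convert an sRGB (r, g, b) tuple to (L, a, b)."""
    r, g, b = (_linearize(v) for v in rgb)
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    fx, fy, fz = (_f(v / w) for v, w in zip((x, y, z), WHITE_D65))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))

def lab_distance(lab1, lab2):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab1, lab2)))

def merge_similar(counts, max_delta_e=10.0, min_share=0.01):
    """Greedily merge visually similar colors, then drop rare ones.

    `counts` maps (r, g, b) -> pixel count. Returns a list of
    (rgb, count) pairs, most frequent first.
    """
    total = sum(counts.values())
    merged = []  # entries are [lab, rgb, count]
    for rgb, n in sorted(counts.items(), key=lambda kv: -kv[1]):
        lab = rgb_to_lab(rgb)
        for entry in merged:
            if lab_distance(entry[0], lab) < max_delta_e:
                entry[2] += n  # fold into the more frequent color
                break
        else:
            merged.append([lab, rgb, n])
    kept = [(rgb, n) for _, rgb, n in merged if n / total >= min_share]
    return sorted(kept, key=lambda pair: -pair[1])

noisy = {
    (0, 0, 255): 500,    # blue
    (5, 5, 250): 300,    # almost the same blue
    (10, 0, 245): 50,    # and another
    (255, 200, 0): 150,  # yellow
    (128, 64, 32): 2,    # a couple of stray pixels
}
print(merge_similar(noisy))
# [((0, 0, 255), 850), ((255, 200, 0), 150)]
```

Keeping the most frequent color as the representative of each group means the reported palette only ever contains colors that actually occur in the image.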
Problem 3: "interesting" colors
Now we can work out what colors are used in a design, but it turns out that a lot of the colors we find aren't that visually interesting.
Which colors are the most interesting here?
When we ask people, they usually pick the brighter and more distinctive colors. But the palette this image uses is much more like this:
In fact, it seems like grays and subdued shades are often used as fillers to give highlights more impact. How could we isolate these distinctive colors?
Excellent question! Lab coats on!
After flicking through designs until our retinas got tired, we turned to color theory again. It turns out color theory has a name for the concept of color interestingness: "saturation".
Cutting out colors with a low saturation gives you much more interesting results.
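A saturation filter is short enough to sketch with Python's standard `colorsys` module. The post doesn't say which definition of saturation or what cutoff Colorific uses, so this example assumes HSV saturation and an arbitrary cutoff of 0.3. The sample palette simply reuses the hex values from the command-line output shown later in this post as test input.

```python
import colorsys

def saturation(rgb):
    """HSV saturation (0.0 to 1.0) of an (r, g, b) tuple with 0-255 channels."""
    r, g, b = (v / 255.0 for v in rgb)
    return colorsys.rgb_to_hsv(r, g, b)[1]

def interesting(palette, min_saturation=0.3):
    """Keep only colors at or above the saturation cutoff (illustrative value)."""
    return [rgb for rgb in palette if saturation(rgb) >= min_saturation]

palette = [
    (62, 69, 63),     # #3e453f, a dark gray-green
    (46, 163, 183),   # #2ea3b7, a strong teal
    (190, 230, 234),  # #bee6ea, a pale blue
    (81, 84, 76),     # #51544c, a muted gray
    (55, 61, 56),     # #373d38, another dark gray
]
print(interesting(palette))  # [(46, 163, 183)]
```

With this cutoff only the teal survives; a lower cutoff would let the pale blue through as well, so the threshold is a knob worth tuning against real designs.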
We can now automatically work out the palette for an image that closely matches what a human would pick.
At 99designs we love open source -- so we're releasing Colorific, our automatic color palette detector. Check it out on GitHub.
Python users can install Colorific via the pip packaging system:
pip install colorific
Colorific has been tested on Python 2.7, but if you have any problems please submit an issue on GitHub.
Colorific is designed to run in a streaming manner. You feed in image filenames as input and colorific spits out the filename and the color palette as output.
$ echo myimage.png | colorific
myimage.png #3e453f,#2ea3b7,#bee6ea,#51544c,#373d38 #ffffff
Tune in next time and we'll tell you all about how we apply Colorific at a massive scale to analyse the huge number of incoming designs at 99designs.