Just a small update on my progress. Since my last post I went snowboarding for a week in the French Alps. It was lovely, until I fell over and my arm went purple. No bones snapped but it did mean I had to spend a day in a chalet not snowboarding. Fortunately I had a laptop and an internet connection and my arm still worked well enough to type. I learned a lot that day, it was almost as much fun as hurtling down a snowy hill.
Before I get to what I learned, in the previous post I said I was unsure that my colour space conversion (from RGB to YCrCb) was losslessly invertible. I tested and it wasn’t, which means that rounding errors in the colour conversion will cause a small amount of image quality loss. A quick measurement showed that almost every pixel has either R, G or B being one off the original value. This isn’t a large error that you’d ever really notice, but it does mean that my compression scheme is demoted to being “nearly lossless”. It turns out that JPEG 2000 can use an alternative colour space transform that is losslessly invertible. It doesn’t provide as much compression as YCrCb would, but it is worth considering if losslessness is vital.
So, back to my day in the Chalet. My friend Harith had read my previous post and suggested that the YCrCb colour space transform looks a bit like the result of principal component analysis. Actually he muttered something about eigenvectors and wandered off, but it was enough to make me think. What exactly does the YCrCb transform do? And is it as good as it can be? Does it work on all images?
The problem with RGB is that the data in the each of the colour channels is correlated. Neighbouring pixels in a photograph tend to vary in brightness more than colour. So, if you know the colour of one pixel, and you know the next pixel’s green value, you can make a good guess at the red and blue values. Try and convince yourself of this by looking at the cute doggy below.
This correlation is bad because effectively we are duplicating the same information in each of the colour channels. Good general-purpose compression algorithms, like PPM or LZMA are probably exploiting some of this duplication to achieve their compression. But we can do better by making assumptions that those algorithms cannot. The YCrCb colour space has the desirable effect of decorrelating the colour channels – greatly reducing the amount of duplicated information.
The YCrCb colour space works on well on average images. However, it is a compromise and my theory is that it is not as good as having a colour space transform that has been tailored for the particular image in question. The first thing I decided to do was to plot a graph of the colour data for a bunch of images in RGB and YCrCb formats, so I could see how well YCrCb was decorrelating the data. Click on the images below to see animated versions:
The correlation of different channels appears as diagonals in these plots. You can see that the main diagonal in the YCrCb plot has gone. However, if we look at the plot down the luminance axis you can see that there is still a diagonal present in the main block of grey-ish pixels. You can also see various other bulges, but lets ignore those for the moment because they don’t contain many pixels and therefore don’t contribute much to the image size.
The remaining diagonal is there because the Cr and Cb channels are still correlated. So, there ought to be a colour space transform that can identify and remove that correlation. This is exactly what principal component analysis is for. PCA is an algorithm used by statisticians to automate finding patterns in large multi-dimensional data sets. In our case we can use it to form the optimal colour space transform for each image. Or even more temptingly, we could use different transforms on different parts of the image – one for each bulge in the colour space graph maybe.
So that’s what I’m going to try next.