This article walks through how the chroma subsampling algorithm for JPEG is handled. It should be a useful reference if you are solving the same problem, so read on and work through it with us!

Problem description

I'm attempting to write a jpeg encoder and am stumbling at creating the algorithms that gather the appropriate Y, Cb, and Cr color components in order to pass to the method performing the transform.

As I understand it, the four most common subsampling variants are set up as follows (I could be way off here; see the sampling-factor sketch after this list):

  • 4:4:4 - An MCU block of 8x8 pixels with Y, Cb, and Cr represented in each pixel.
  • 4:2:2 - An MCU block of 16x8 pixels with Y in each pixel and Cb, Cr every two pixels
  • 4:2:0 - An MCU block of 16x16 pixels with Y every two pixels and Cb, Cr every four
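
For reference, these variants map onto the per-component sampling factors written in the SOF0 frame header; roughly, in C# (illustrative names, not from any particular library):

// Per-component sampling factors (H, V) for Y, Cb, Cr as they appear in the SOF0 header.
// The MCU covers (8 * maxH) x (8 * maxV) pixels.
static readonly byte[,] Factors444 = { { 1, 1 }, { 1, 1 }, { 1, 1 } }; // MCU  8x8:  blocks Y, Cb, Cr
static readonly byte[,] Factors422 = { { 2, 1 }, { 1, 1 }, { 1, 1 } }; // MCU 16x8:  blocks Y0, Y1, Cb, Cr
static readonly byte[,] Factors420 = { { 2, 2 }, { 1, 1 }, { 1, 1 } }; // MCU 16x16: blocks Y0..Y3, Cb, Cr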

The most explicit description of the layout I have found so far is described here.

What I don't understand is how to gather those components in the correct order to pass as an 8x8 block for transforming and quantizing.

Would someone be able to write an example (pseudocode would be fine I'm sure, C# even better) of how to group the bytes for the transform?

I'll include the current, incorrect, code I am running.

/// <summary>
/// Writes the Start of Scan (SOS) header and the compressed image data.
/// </summary>
/// <param name="image">The image to encode from.</param>
/// <param name="writer">The writer to write to the stream.</param>
private void WriteStartOfScan(ImageBase image, EndianBinaryWriter writer)
{
    // Marker
    writer.Write(new[] { JpegConstants.Markers.XFF, JpegConstants.Markers.SOS });

    // Length (high byte, low byte), must be 6 + 2 * (number of components in scan)
    writer.Write((short)0xc); // 12

    byte[] sos = {
        3, // Number of components in a scan, usually 1 or 3
        1, // Component Id Y
        0, // DC/AC Huffman table 
        2, // Component Id Cb
        0x11, // DC/AC Huffman table 
        3, // Component Id Cr
        0x11, // DC/AC Huffman table 
        0, // Ss - Start of spectral selection.
        0x3f, // Se - End of spectral selection.
        0 // Ah + Al (Successive approximation bit position high + low)
    };

    writer.Write(sos);

    // Compress and write the pixels
    // Buffers for each Y'Cb Cr component
    float[] yU = new float[64];
    float[] cbU = new float[64];
    float[] crU = new float[64];

    // The DC coefficient values for each component.
    int[] dcValues = new int[3];

    // TODO: Why null?
    this.huffmanTable = new HuffmanTable(null);

    // TODO: Color output is incorrect after this point. 
    // I think I've got my looping all wrong.
    // For each row
    for (int y = 0; y < image.Height; y += 8)
    {
        // For each column
        for (int x = 0; x < image.Width; x += 8)
        {
            // Convert the 8x8 array to YCbCr
            this.RgbToYcbCr(image, yU, cbU, crU, x, y);

            // For each component
            this.CompressPixels(yU, 0, writer, dcValues);
            this.CompressPixels(cbU, 1, writer, dcValues);
            this.CompressPixels(crU, 2, writer, dcValues);
        }
    }

    this.huffmanTable.FlushBuffer(writer);
}

/// <summary>
/// Converts the pixel block from the RGBA colorspace to YCbCr.
/// </summary>
/// <param name="image"></param>
/// <param name="yComponant">The container to house the Y' luma componant within the block.</param>
/// <param name="cbComponant">The container to house the Cb chroma componant within the block.</param>
/// <param name="crComponant">The container to house the Cr chroma componant within the block.</param>
/// <param name="x">The x-position within the image.</param>
/// <param name="y">The y-position within the image.</param>
private void RgbToYcbCr(ImageBase image, float[] yComponant, float[] cbComponant, float[] crComponant, int x, int y)
{
    int height = image.Height;
    int width = image.Width;

    for (int a = 0; a < 8; a++)
    {
        // Pad the block by repeating the right and bottom edge pixels.
        int py = y + a;
        if (py >= height)
        {
            py = height - 1;
        }

        for (int b = 0; b < 8; b++)
        {
            int px = x + b;
            if (px >= width)
            {
                px = width - 1;
            }

            YCbCr color = image[px, py];
            int index = a * 8 + b;
            yComponant[index] = color.Y;
            cbComponant[index] = color.Cb;
            crComponant[index] = color.Cr;
        }
    }
}

/// <summary>
/// Compresses and encodes the pixels.
/// </summary>
/// <param name="componantValues">The current color component values within the image block.</param>
/// <param name="componantIndex">The component index.</param>
/// <param name="writer">The writer.</param>
/// <param name="dcValues">The DC coefficient values for each component.</param>
private void CompressPixels(float[] componantValues, int componantIndex, EndianBinaryWriter writer, int[] dcValues)
{
    // TODO: This should be an option.
    byte[] horizontalFactors = JpegConstants.ChromaFourTwoZeroHorizontal;
    byte[] verticalFactors = JpegConstants.ChromaFourTwoZeroVertical;
    byte[] quantizationTableNumber = { 0, 1, 1 };
    int[] dcTableNumber = { 0, 1, 1 };
    int[] acTableNumber = { 0, 1, 1 };

    for (int y = 0; y < verticalFactors[componantIndex]; y++)
    {
        for (int x = 0; x < horizontalFactors[componantIndex]; x++)
        {
            // TODO: This can probably be combined reducing the array allocation.
            float[] dct = this.fdct.FastFDCT(componantValues);
            int[] quantizedDct = this.fdct.QuantizeBlock(dct, quantizationTableNumber[componantIndex]);
            this.huffmanTable.HuffmanBlockEncoder(writer, quantizedDct, dcValues[componantIndex], dcTableNumber[componantIndex], acTableNumber[componantIndex]);
            dcValues[componantIndex] = quantizedDct[0];
        }
    }
}

This code is part of an open source library I am writing on GitHub.

Solution

JPEG color subsampling can be implemented in a simple, yet functional manner without much code. The basic idea is that your eyes are less sensitive to changes in color versus changes in luminance, so the JPEG file can be much smaller by throwing away some color information. There are many ways to subsample the color information, but JPEG images tend to use 4 variants: none, 1/2 horizontal, 1/2 vertical and 1/2 horizontal+vertical. There are additional TIFF/EXIF options such as the "center point" of the subsampled color, but for simplicity we'll use a straightforward sum-and-average technique.
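
Concretely, sum-and-average here just means a rounded integer mean of the chroma samples being merged. Two tiny C# helpers (illustrative names, not from any particular library; the later sketches reuse them) are enough:

// Rounded integer averages of two and four chroma samples.
// The "+ 1" / "+ 2" terms round to nearest instead of truncating.
static byte Avg2(byte a, byte b) => (byte)((a + b + 1) / 2);
static byte Avg4(byte a, byte b, byte c, byte d) => (byte)((a + b + c + d + 2) / 4);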

In the simplest case (no subsampling), each MCU (minimum coded unit) is an 8x8 block of pixels made up of 3 components - Y, Cb, Cr. The image is processed in 8x8 pixel blocks where the 3 color components are separated, passed through a DCT transform and written to the file in the order (Y, Cb, Cr). In all cases of subsampling, the DCT blocks are always composed of 8x8 coefficients or 64 values, but the meaning of those values varies due to the color subsampling.
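
As a C# sketch of this simplest case (assuming `using System;`, an image already converted to planar Y/Cb/Cr byte arrays, and illustrative CopyBlock/Encode444/encodeBlock names that are not part of the library in the question; encodeBlock stands in for the FDCT, quantize, zigzag and Huffman steps), the gathering is just:

// Copy one 8x8 block out of a full-resolution component plane, replicating
// edge pixels when the image dimensions are not multiples of 8.
static void CopyBlock(byte[,] plane, int x, int y, byte[] block)
{
    int height = plane.GetLength(0);
    int width = plane.GetLength(1);

    for (int b = 0; b < 8; b++)
    {
        int py = Math.Min(y + b, height - 1);    // replicate the bottom edge
        for (int a = 0; a < 8; a++)
        {
            int px = Math.Min(x + a, width - 1); // replicate the right edge
            block[b * 8 + a] = plane[py, px];
        }
    }
}

// 4:4:4: each 8x8 pixel block is one MCU; the blocks are emitted in the order Y, Cb, Cr.
static void Encode444(byte[,] yPlane, byte[,] cbPlane, byte[,] crPlane,
                      Action<byte[], int> encodeBlock)
{
    byte[] block = new byte[64];
    for (int y = 0; y < yPlane.GetLength(0); y += 8)
    {
        for (int x = 0; x < yPlane.GetLength(1); x += 8)
        {
            CopyBlock(yPlane, x, y, block);
            encodeBlock(block, 0);  // Y  -> luma quantization and Huffman tables

            CopyBlock(cbPlane, x, y, block);
            encodeBlock(block, 1);  // Cb -> chroma tables

            CopyBlock(crPlane, x, y, block);
            encodeBlock(block, 2);  // Cr -> chroma tables
        }
    }
}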

The next simplest case is subsampling in one dimension (horizontal or vertical). Let's use 1/2 horizontal subsampling for this example. The MCU is now 16 pixels wide by 8 pixels tall. The compressed output of each MCU will now be four 8x8 DCT blocks (Y0, Y1, Cb, Cr). Y0 represents the luma values of the left 8x8 pixel block and Y1 represents the luma values of the right 8x8 pixel block. The Cb and Cr values are each 8x8 blocks based on the average value of horizontal pairs of pixels. I couldn't find any good images to insert here, but some pseudo-code can come in handy.

(Update: an image illustrating the subsampling layout originally appeared here.)

Here's a simple loop which does the color subsampling of our 1/2 horizontal case:

unsigned char srcCb[8][16], srcCr[8][16]; // source chroma samples for one 16x8 MCU
unsigned char ucCb[8][8], ucCr[8][8];     // subsampled 8x8 output blocks
int x, y;

for (y=0; y<8; y++)
{
   for (x=0; x<8; x++)
   {
      ucCb[y][x] = (srcCb[y][x*2] + srcCb[y][(x*2)+1] + 1)/2; // average each horiz pair
      ucCr[y][x] = (srcCr[y][x*2] + srcCr[y][(x*2)+1] + 1)/2;
   } // for x
} // for y

As you can see, there's not much to it. Each pair of Cb and Cr pixels from the source image is averaged horizontally to form a new Cb/Cr pixel. These are then DCT transformed, zigzagged and encoded in the same form as always.
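
In C# terms (again a sketch with illustrative names, reusing the CopyBlock, Avg2 and encodeBlock pieces from the snippets above rather than anything from a real library), gathering and emitting one 16x8 MCU for the 1/2 horizontal case could look like this:

// 4:2:2: the MCU is 16x8 pixels; emit Y0 (left 8x8), Y1 (right 8x8),
// then one Cb and one Cr block built by averaging horizontal pairs.
static void EncodeMcu422(byte[,] yPlane, byte[,] cbPlane, byte[,] crPlane,
                         int mcuX, int mcuY, Action<byte[], int> encodeBlock)
{
    byte[] block = new byte[64];

    CopyBlock(yPlane, mcuX, mcuY, block);
    encodeBlock(block, 0);                 // Y0: left 8x8 luma block

    CopyBlock(yPlane, mcuX + 8, mcuY, block);
    encodeBlock(block, 0);                 // Y1: right 8x8 luma block

    byte[] cb = new byte[64];
    byte[] cr = new byte[64];
    int height = cbPlane.GetLength(0);
    int width = cbPlane.GetLength(1);

    for (int y = 0; y < 8; y++)
    {
        int py = Math.Min(mcuY + y, height - 1);
        for (int x = 0; x < 8; x++)
        {
            // Each output chroma sample is the average of a horizontal pair of source samples.
            int px0 = Math.Min(mcuX + x * 2, width - 1);
            int px1 = Math.Min(mcuX + x * 2 + 1, width - 1);
            cb[y * 8 + x] = Avg2(cbPlane[py, px0], cbPlane[py, px1]);
            cr[y * 8 + x] = Avg2(crPlane[py, px0], crPlane[py, px1]);
        }
    }

    encodeBlock(cb, 1); // Cb
    encodeBlock(cr, 2); // Cr
}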

Finally, for the 2x2 subsample case, the MCU is now 16x16 pixels and the DCT blocks written will be Y0, Y1, Y2, Y3, Cb, Cr, where Y0 represents the upper left 8x8 luma pixels, Y1 the upper right, Y2 the lower left and Y3 the lower right. The Cb and Cr values in this case represent 4 source pixels (2x2) that have been averaged together. Just in case you were wondering, the color values are averaged together in the YCbCr colorspace. If you average the pixels together in RGB colorspace, it won't work correctly.
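
A matching C# sketch for the 2x2 case (same caveats; CopyBlock and Avg4 come from the earlier snippets) gathers one 16x16 MCU and emits Y0, Y1, Y2, Y3, Cb, Cr:

// 4:2:0: the MCU is 16x16 pixels; emit Y0 (upper left), Y1 (upper right),
// Y2 (lower left), Y3 (lower right), then Cb and Cr averaged over 2x2 pixels.
static void EncodeMcu420(byte[,] yPlane, byte[,] cbPlane, byte[,] crPlane,
                         int mcuX, int mcuY, Action<byte[], int> encodeBlock)
{
    byte[] block = new byte[64];

    CopyBlock(yPlane, mcuX, mcuY, block);         encodeBlock(block, 0); // Y0
    CopyBlock(yPlane, mcuX + 8, mcuY, block);     encodeBlock(block, 0); // Y1
    CopyBlock(yPlane, mcuX, mcuY + 8, block);     encodeBlock(block, 0); // Y2
    CopyBlock(yPlane, mcuX + 8, mcuY + 8, block); encodeBlock(block, 0); // Y3

    byte[] cb = new byte[64];
    byte[] cr = new byte[64];
    int height = cbPlane.GetLength(0);
    int width = cbPlane.GetLength(1);

    for (int y = 0; y < 8; y++)
    {
        int py0 = Math.Min(mcuY + y * 2, height - 1);
        int py1 = Math.Min(mcuY + y * 2 + 1, height - 1);
        for (int x = 0; x < 8; x++)
        {
            // Each output chroma sample is the average of a 2x2 square of source samples.
            int px0 = Math.Min(mcuX + x * 2, width - 1);
            int px1 = Math.Min(mcuX + x * 2 + 1, width - 1);
            cb[y * 8 + x] = Avg4(cbPlane[py0, px0], cbPlane[py0, px1],
                                 cbPlane[py1, px0], cbPlane[py1, px1]);
            cr[y * 8 + x] = Avg4(crPlane[py0, px0], crPlane[py0, px1],
                                 crPlane[py1, px0], crPlane[py1, px1]);
        }
    }

    encodeBlock(cb, 1); // Cb
    encodeBlock(cr, 2); // Cr
}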

FYI - Adobe supports JPEG images in the RGB colorspace (instead of YCbCr). These images can't use color subsampling because R, G and B are of equal importance and subsampling them in this colorspace would lead to much worse visual artifacts.

That concludes this article on JPEG chroma subsampling. We hope the answer recommended above helps, and thank you for your continued support!
