Audio and Video Basics 11: The PNG File Format in Detail

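PNG File Format Breakdown

A PNG image file consists of an 8-byte PNG file signature followed by at least three chunks (IHDR, IDAT, IEND).

The file begins with the 8-byte signature (89 50 4E 47 0D 0A 1A 0A in hexadecimal), which identifies it as a PNG file.

Open any PNG file in a hex viewer and you will see these eight bytes at the very start of the file.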
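The signature check can be sketched in a few lines of C. This is only a minimal illustration: `png_has_signature` is a hypothetical helper name, and the function assumes the start of the file has already been read into memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The 8-byte PNG signature: 89 50 4E 47 0D 0A 1A 0A. */
static const uint8_t PNG_SIGNATURE[8] = {
    0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A
};

/* Returns 1 if buf starts with the PNG signature, 0 otherwise.
 * png_has_signature is a hypothetical helper, not part of any library. */
static int png_has_signature(const uint8_t *buf, size_t len) {
    assert(buf != NULL);
    if (len < sizeof PNG_SIGNATURE) {
        return 0; /* too short to hold the signature */
    }
    return memcmp(buf, PNG_SIGNATURE, sizeof PNG_SIGNATURE) == 0;
}
```

A caller would typically read the first 8 bytes of a file and refuse to go further if this returns 0.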

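PNG defines two kinds of chunks. Critical chunks are essential to reading the image, and all reading and writing software must support them. Ancillary chunks are optional, and PNG allows software to ignore any ancillary chunk it does not recognize. This chunk-based design lets the format be extended while staying compatible with older versions.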

Chunk Overview

The table below lists the PNG chunk types; the critical chunks are highlighted to set them apart:

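Chunk type	Chunk name	Multiple allowed	Optional	Position constraint
`IHDR`	`Image header chunk`	`No`	`No`	`First chunk`
cHRM	Primary chromaticities and white point chunk	No	Yes	Before PLTE and IDAT
gAMA	Image gamma chunk	No	Yes	Before PLTE and IDAT
sBIT	Significant bits chunk	No	Yes	Before PLTE and IDAT
`PLTE`	`Palette chunk`	`No`	`Yes`	`Before IDAT`
bKGD	Background color chunk	No	Yes	After PLTE, before IDAT
hIST	Image histogram chunk	No	Yes	After PLTE, before IDAT
tRNS	Transparency chunk	No	Yes	After PLTE, before IDAT
oFFs	(special-purpose public chunk)	No	Yes	Before IDAT
pHYs	Physical pixel dimensions chunk	No	Yes	Before IDAT
sCAL	(special-purpose public chunk)	No	Yes	Before IDAT
`IDAT`	`Image data chunk`	`Yes`	`No`	`Consecutive with other IDAT chunks`
tIME	Image last-modification time chunk	No	Yes	None
tEXt	Textual data chunk	Yes	Yes	None
zTXt	Compressed textual data chunk	Yes	Yes	None
fRAc	(special-purpose public chunk)	Yes	Yes	None
gIFg	(special-purpose public chunk)	Yes	Yes	None
gIFt	(special-purpose public chunk)	Yes	Yes	None
gIFx	(special-purpose public chunk)	Yes	Yes	None
`IEND`	`Image trailer chunk`	`No`	`No`	`Last chunk`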

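For now we only need to look at the critical chunks.

There are four critical chunks:

Image header chunk IHDR (header chunk): holds the basic image information; it is the first chunk and appears exactly once.
Palette chunk PLTE (palette chunk): must come before the image data chunks.
Image data chunk IDAT (image data chunk): stores the actual image data. A PNG datastream may contain several consecutive IDAT chunks.
Image trailer chunk IEND (image trailer chunk): placed at the end of the file to mark the end of the PNG datastream.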

Strung together, the chunks look roughly like this:

PNG signature	PNG chunk (IHDR)	PNG chunk (other chunk types)	…	PNG trailer chunk (IEND)

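Chunk Structure

Every chunk in a PNG file (IHDR, IDAT, and so on) is made up of four parts:

Name	Size	Description
Length	4 bytes	Length of the chunk's data field; must not exceed 2^31 - 1 bytes
Chunk Type Code	4 bytes	The chunk type code, made up of ASCII letters (A-Z and a-z)
Chunk Data	Variable	The data, interpreted according to the Chunk Type Code
CRC	4 bytes	Cyclic redundancy code used to detect errors

The CRC (cyclic redundancy check) value is computed over the Chunk Type Code field and the Chunk Data field.
Note: Length counts only the Chunk Data bytes. It does not include the Length field itself, the Chunk Type Code, or the CRC.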
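To make the four-part layout concrete, here is a minimal sketch of reading one chunk from a buffer that already holds the file contents. `PngChunk`, `read_be32`, and `png_read_chunk` are hypothetical names chosen for this illustration; the sketch does not verify the CRC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical struct describing one chunk located inside a memory buffer. */
typedef struct {
    uint32_t length;      /* length of Chunk Data only */
    char type[5];         /* 4 ASCII letters plus a terminating NUL */
    const uint8_t *data;  /* points at Chunk Data inside the buffer */
    uint32_t crc;         /* CRC stored after Chunk Data */
} PngChunk;

/* PNG stores multi-byte integers most significant byte first. */
static uint32_t read_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Parses the chunk starting at buf. Returns the total number of bytes the
 * chunk occupies (12 + length), or 0 if the buffer is too short or the
 * length field is out of range. */
static size_t png_read_chunk(const uint8_t *buf, size_t len, PngChunk *out) {
    assert(buf != NULL && out != NULL);
    if (len < 12) {
        return 0; /* need at least Length + Chunk Type Code + CRC */
    }
    out->length = read_be32(buf);
    if (out->length > 0x7FFFFFFFu || out->length > len - 12) {
        return 0;
    }
    memcpy(out->type, buf + 4, 4);
    out->type[4] = '\0';
    out->data = buf + 8;
    out->crc = read_be32(buf + 8 + out->length);
    return 12 + (size_t)out->length;
}
```

Calling `png_read_chunk` repeatedly, advancing by the returned size each time and starting right after the 8-byte signature, walks the whole file chunk by chunk.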

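Chunk: IHDR, the Image Header Chunk

IHDR holds the basic information about the image stored in the PNG file. It must be the first chunk in the PNG datastream, and a datastream may contain only one IHDR chunk.

The IHDR chunk data is 13 bytes long:

Field	Size	Description
Width	4 bytes	Image width in pixels
Height	4 bytes	Image height in pixels
Bit depth	1 byte	Bit depth: `indexed-color image: 1, 2, 4 or 8` `grayscale image: 1, 2, 4, 8 or 16` `truecolor image: 8 or 16`
ColorType	1 byte	Color type, followed by its allowed bit depths: `0: grayscale image, 1, 2, 4, 8 or 16` `2: truecolor image, 8 or 16` `3: indexed-color image, 1, 2, 4 or 8` `4: grayscale image with alpha channel, 8 or 16` `6: truecolor image with alpha channel, 8 or 16`
Compression method	1 byte	Always 0 per the PNG Spec, meaning deflate compression (an LZ77 derivative)
Filter method	1 byte	Filter method; always 0 per the PNG Spec
Interlace method	1 byte	Interlace method: `0: no interlace` `1: Adam7 (a 7-pass interlacing scheme developed by Adam M. Costello)`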

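Open a PNG file in a hex viewer:

Hex	Description
00 00 00 0D	Chunk length: 13 bytes
49 48 44 52	Chunk type code: the ASCII letters "IHDR"
00 00 04 1D	Image width: 1053
00 00 02 B3	Image height: 691
08	Bit depth: 8
06	Color type: truecolor image with alpha channel
00	Compression method
00	Filter method
00	Interlace method: 00, no interlace
52 C3 75 3A	CRC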

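Chunk: PLTE, the Palette Chunk

PLTE holds the color-mapping data for indexed-color images. It is required only for indexed-color images and must come before the image data chunks (IDAT).

The PLTE chunk defines the image's palette. It can contain 1 to 256 palette entries, each made up of 3 bytes:

Color	Size	Meaning
Red	1 byte	0 = black, 255 = red
Green	1 byte	0 = black, 255 = green
Blue	1 byte	0 = black, 255 = blue

The length of the palette data must be a multiple of 3; otherwise the palette is illegal.
For an indexed-color image the palette is mandatory. Palette indices start at 0 and continue with 1, 2, and so on. The number of palette entries must not exceed the number of colors the bit depth can represent (for example, with a bit depth of 4 the palette may hold at most 2^4 = 16 entries); otherwise the PNG image is invalid.
Truecolor images, with or without an alpha channel, may also carry a palette chunk so that display programs that cannot show truecolor can use it to quantize the image data and still display the image.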

Open an indexed-color PNG file in a hex viewer:

Hex	Description
00 00 00 27	Chunk length: 39 bytes
50 4C 54 45	Chunk type code: the ASCII letters "PLTE"
`B7 00 34` `FF 99 00` `60 00 73` `FF 0F 00` `FF ED 00` `09 00 B2` `FF 66 00` `FF 3B 00` `E2 00 15` `8B 00 54` `FF C1 00` `33 00 99` `FF FF 00`	13 palette colors
48 29 75 2C	CRC
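The palette rules (length divisible by 3, between 1 and 256 entries, no more entries than the bit depth allows) can be checked with a small helper. This is a sketch for indexed-color images only; `png_palette_entries` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the number of palette entries in a PLTE chunk whose data is
 * plte_length bytes long, or -1 if the palette is illegal for an
 * indexed-color image with the given bit depth (1, 2, 4 or 8). */
static int png_palette_entries(uint32_t plte_length, uint8_t bit_depth) {
    assert(bit_depth >= 1 && bit_depth <= 8);
    if (plte_length == 0 || plte_length % 3 != 0) {
        return -1; /* each entry is exactly 3 bytes: R, G, B */
    }
    uint32_t entries = plte_length / 3;
    uint32_t max_entries = 1u << bit_depth; /* 2^bit_depth */
    if (entries > 256 || entries > max_entries) {
        return -1;
    }
    return (int)entries;
}
```

For the chunk shown above, a length of 39 bytes with bit depth 4 gives 13 entries, which fits within the limit of 16.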

Preview of the colors in the palette:

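Chunk: IDAT, the Image Data Chunk

IDAT stores the actual image data. A datastream may contain several IDAT chunks, which must appear consecutively.

Because IDAT holds the real image data, understanding its structure makes it straightforward to generate PNG images ourselves.

Open an indexed-color PNG file in a hex viewer:

Hex	Description
00 00 00 D3	Chunk length: 211 bytes
49 44 41 54	Chunk type code: the ASCII letters "IDAT"
78 9C ……	211 bytes of compressed data, compressed with an LZ77 derivative
52 98 5D 9D	CRC

The details of the IDAT chunk are analyzed in the second half of this article.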

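Chunk: IEND, the Image Trailer Chunk

IEND marks the end of the PNG file or datastream and must be placed at the very end of the file.

Look closely at PNG files and you will find that the last 12 bytes always look like this:

00 00 00 00 49 45 4E 44 AE 42 60 82

Opening a PNG file in a hex viewer confirms this. It follows from the chunk structure: the length of the IEND chunk is always 0 (00 00 00 00, unless someone deliberately adds data), the chunk type is always IEND (49 45 4E 44), and therefore the CRC is always AE 42 60 82.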
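The fixed CRC of IEND can be reproduced with a short CRC-32 routine. The sketch below uses a simple bit-by-bit loop rather than the faster table-driven variant, and `png_crc32` is a hypothetical name. The CRC is computed over the Chunk Type Code and Chunk Data; for IEND the data is empty, so only the four type bytes are fed in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Computes the CRC-32 used by PNG over len bytes of buf.
 * Bitwise implementation: slower than a lookup table, but short. */
static uint32_t png_crc32(const uint8_t *buf, size_t len) {
    assert(buf != NULL || len == 0);
    uint32_t crc = 0xFFFFFFFFu;              /* initial value: all ones */
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            /* 0xEDB88320 is the reflected CRC-32 polynomial */
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
        }
    }
    return crc ^ 0xFFFFFFFFu;                /* final inversion */
}
```

Passing the four bytes 49 45 4E 44 ("IEND") returns 0xAE426082, matching the last four bytes of every PNG file.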

IDAT Chunk Details

IDAT Compressed Data Details

From the compression section of the PNG Spec:

PNG compression method 0 is deflate/inflate compression with a sliding window (which is an upper bound on the distances appearing in the deflate stream) of at most 32768 bytes. Deflate compression is an LZ77 derivative [ZL].

Deflate-compressed datastreams within PNG are stored in the "zlib" format, which has the structure:

- zlib compression method/flags code	1 byte
- Additional flags/check bits	1 byte
- Compressed data blocks	n bytes
- Check value	4 bytes

Further details on this format are given in the zlib specification [RFC-1950].

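PNG uses the DEFLATE compression algorithm.
DEFLATE is a lossless data compression algorithm that combines LZ77 with Huffman coding.
The DEFLATE-compressed data is stored in zlib format.

    zlib (RFC 1950): a format that is a thin wrapper around deflate; it is also the name of an implementation library (Delphi has zlib and zlibex).
    gzip (RFC 1952): a format that also wraps deflate.

    gzip = gzip header + deflate-encoded payload + gzip trailer
    zlib = zlib header + deflate-encoded payload + zlib trailer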

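Extracting & Decompressing the Compressed Data in IDAT

On Windows, you can copy the compressed bytes out of the file with [hexeditor](https://www.hhdsoftware.com/free-hex-editor)
On Mac, you can use [hexfiend](http://ridiculousfish.com/hexfiend/) or [Hopper Disassembler](https://www.hopperapp.com/)

Use zlib to decompress the compressed bytes 78 9C ......: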

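#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include "zlib.h"

int main(void) {
    // Input: the raw zlib stream copied out of the IDAT chunk data (starts with 78 9C).
    FILE *inFile = fopen("/Users/staff/Desktop/indexed-color-image.data", "rb");
    FILE *outFile = fopen("/Users/staff/Desktop/indexed-color-image-uncompress.data", "wb");
    if (inFile == NULL || outFile == NULL) {
        perror("fopen");
        return 1;
    }

    // Determine the size of the compressed input.
    fseek(inFile, 0L, SEEK_END);
    long size = ftell(inFile);
    fseek(inFile, 0L, SEEK_SET);
    if (size <= 0) {
        fprintf(stderr, "input file is empty or unreadable\n");
        return 1;
    }

    // Heap buffers avoid overflowing the stack with large arrays.
    // destLen is in/out: capacity of destBuf on entry, decompressed size on return.
    uLongf destLen = 1500000;
    uint8_t *dataBuf = malloc((size_t)size);
    uint8_t *destBuf = malloc(destLen);
    if (dataBuf == NULL || destBuf == NULL ||
        fread(dataBuf, 1, (size_t)size, inFile) != (size_t)size) {
        fprintf(stderr, "failed to read input\n");
        return 1;
    }
    printf("Compressed size: %ld\n", size);

    int ret = uncompress(destBuf, &destLen, dataBuf, (uLong)size);
    if (ret != Z_OK) {
        fprintf(stderr, "uncompress failed: %d\n", ret);
        return 1;
    }
    printf("Decompressed size: %lu\n", (unsigned long)destLen);

    fwrite(destBuf, 1, destLen, outFile);

    free(dataBuf);
    free(destBuf);
    fclose(inFile);
    fclose(outFile);

    return 0;
}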

Analyzing the Decompressed Data

From PNG Spec 7.1, Integers and byte order:

All integers that require more than one byte shall be in network byte order (as illustrated in figure 7.1): the most significant byte comes first, then the less significant bytes in descending order of significance (MSB LSB for two-byte integers, MSB B2 B1 LSB for four-byte integers). The highest bit (value 128) of a byte is numbered bit 7; the lowest bit (value 1) is numbered bit 0. Values are unsigned unless otherwise noted. Values explicitly noted as signed are represented in two's complement notation.

PNG uses network byte order, that is, big-endian.

From PNG Spec 7.2, Scanlines:

In PNG images of colour type 0 (greyscale) each pixel is a single sample, which may have precision less than a byte (1, 2, or 4 bits). These samples are packed into bytes with the leftmost sample in the high-order bits of a byte followed by the other samples for the scanline.

In PNG images of colour type 3 (indexed-colour) each pixel is a single palette index. These indices are packed into bytes in the same way as the samples for colour type 0.

When the bit depth is less than 8, multiple samples are packed into each byte.

From PNG Spec 7.3, Filtering:

PNG allows the scanline data to be filtered before it is compressed. Filtering can improve the compressibility of the data. The filter step itself results in a sequence of bytes of the same size as the incoming sequence, but in a different representation, preceded by a filter type byte. Filtering does not reduce the size of the actual scanline data. All PNG filters are strictly lossless.

Different filter types can be used for different scanlines, and the filter algorithm is specified for each scanline by a filter type byte. The filter type byte is not considered part of the image data, but it is included in the datastream sent to the compression step. An intelligent encoder can switch filters from one scanline to the next. The method for choosing which filter to employ is left to the encoder.

Each scanline is preceded by one byte that specifies the filter type.

From the Filters section of the PNG Spec:

Filtering transforms the PNG image with the goal of improving compression. PNG allows for a number of filter methods. All the reduced images in an interlaced image shall use a single filter method. Only filter method 0 is defined by this International Standard. Other filter methods are reserved for future standardization (see 4.9 Extension and registration). Filter method 0 provides a set of five filter types, and individual scanlines in each reduced image may use different filter types.
......

The Filter method field in the IHDR chunk can only be 0.
Filter method 0 defines five filter types: 0: None, 1: Sub, 2: Up, 3: Average, 4: Paeth.
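To show what undoing these five filter types involves, here is a sketch of reconstructing one scanline. It follows the reconstruction rules of filter method 0 in the PNG Spec, where each byte is predicted from the byte to its left (a), the byte above (b), and the byte above-left (c). `paeth_predictor` and `png_unfilter_row` are hypothetical names, and missing neighbors (first row, first pixel) are treated as 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Paeth predictor: picks whichever of a, b, c is closest to a + b - c. */
static uint8_t paeth_predictor(uint8_t a, uint8_t b, uint8_t c) {
    int p = (int)a + (int)b - (int)c;
    int pa = abs(p - (int)a);
    int pb = abs(p - (int)b);
    int pc = abs(p - (int)c);
    if (pa <= pb && pa <= pc) return a;
    if (pb <= pc) return b;
    return c;
}

/* Reconstructs one scanline in place.
 * type:      filter type byte that preceded the scanline (0..4)
 * cur:       filtered bytes of this scanline, without the filter type byte
 * prev:      reconstructed previous scanline, or NULL for the first row
 * row_bytes: number of bytes in the scanline
 * bpp:       bytes per complete pixel, rounded up to at least 1
 * Returns 1 on success, 0 for an unknown filter type. */
static int png_unfilter_row(uint8_t type, uint8_t *cur, const uint8_t *prev,
                            size_t row_bytes, size_t bpp) {
    assert(cur != NULL && bpp >= 1);
    if (type > 4) {
        return 0;
    }
    for (size_t i = 0; i < row_bytes; i++) {
        uint8_t a = (i >= bpp) ? cur[i - bpp] : 0;           /* left */
        uint8_t b = prev ? prev[i] : 0;                       /* above */
        uint8_t c = (prev && i >= bpp) ? prev[i - bpp] : 0;   /* above-left */
        uint8_t pred = 0;                                     /* 0: None */
        switch (type) {
        case 1: pred = a; break;                              /* Sub */
        case 2: pred = b; break;                              /* Up */
        case 3: pred = (uint8_t)(((unsigned)a + b) / 2); break; /* Average */
        case 4: pred = paeth_predictor(a, b, c); break;       /* Paeth */
        default: break;
        }
        cur[i] = (uint8_t)(cur[i] + pred); /* addition wraps modulo 256 */
    }
    return 1;
}
```

Because `cur` is updated in place from left to right, the "left" byte read for Sub, Average, and Paeth is already the reconstructed value, which is what the spec requires.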

When the PNG is an indexed-color image (data shown below: bit depth 4, size 256x256, filter type 0: None, interlace method 0: no interlace):

indexed-color-image.png

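The 00 byte in front of each highlighted region is the filter type, 0: None [PNG Spec 7.3 Filtering] [PNG Spec Filters].
If the byte in front of a highlighted region is not 00, the region does not hold the scanline's color indices directly; the bytes must first be reconstructed as described in [PNG Spec 9.2 Filter types for filter method 0].
Each region highlighted in one color is 128 bytes and holds the color indices of one scanline. Because the bit depth is 4, each byte holds two color indices, so 256 pixels take 128 bytes [PNG Spec 7.2 Scanlines].
For example, the byte 55 (hexadecimal) is 01010101 in binary: the high four bits, 0101, are one color index (decimal 5), and the low four bits, 0101, are the next color index (also decimal 5).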
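Pulling individual indices out of the packed bytes can be sketched as follows. The leftmost sample sits in the high-order bits, as the spec quote on scanlines describes. `png_get_sample` is a hypothetical helper and assumes the scanline has already been unfiltered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Extracts sample number `index` (0-based, counted from the left) from a
 * packed scanline with the given bit depth (1, 2, 4 or 8). */
static uint8_t png_get_sample(const uint8_t *row, size_t index,
                              unsigned bit_depth) {
    assert(row != NULL);
    assert(bit_depth == 1 || bit_depth == 2 || bit_depth == 4 || bit_depth == 8);
    unsigned per_byte = 8 / bit_depth;              /* samples per byte */
    uint8_t byte = row[index / per_byte];
    /* The first sample in a byte occupies the highest bits. */
    unsigned shift = 8 - bit_depth * (unsigned)(index % per_byte + 1);
    return (uint8_t)((byte >> shift) & ((1u << bit_depth) - 1));
}
```

With bit depth 4, the byte 55 yields index 5 for both the first and the second pixel, matching the manual calculation above.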

When the PNG is a truecolor image (data shown below: bit depth 8, size 70x70, filter type 0: None, interlace method 0: no interlace):

true-color-image.png

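The 00 byte in front of each highlighted region is the filter type, 0: None [PNG Spec 7.3 Filtering] [PNG Spec Filters].
If the byte in front of a highlighted region is not 00, the region does not hold the scanline's color data directly; the bytes must first be reconstructed as described in [PNG Spec 9.2 Filter types for filter method 0].
Each region highlighted in one color is 210 bytes and holds the color data of one scanline. Because this is a truecolor image with bit depth 8, every three bytes represent the color of one pixel, so 70 pixels take 210 bytes.
For example, the bytes FF 00 00 are the RGB color of one pixel (pure red).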
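The scanline sizes seen in both examples (128 and 210 bytes) follow from width, color type, and bit depth. The sketch below computes them for non-interlaced images only; `png_channels`, `png_row_bytes`, and `png_raw_size` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of samples per pixel for each color type; 0 for an invalid type. */
static unsigned png_channels(uint8_t color_type) {
    switch (color_type) {
    case 0: return 1; /* grayscale */
    case 2: return 3; /* truecolor: R, G, B */
    case 3: return 1; /* indexed-color: one palette index */
    case 4: return 2; /* grayscale + alpha */
    case 6: return 4; /* truecolor + alpha */
    default: return 0;
    }
}

/* Bytes in one scanline, not counting the leading filter type byte. */
static size_t png_row_bytes(uint32_t width, uint8_t color_type,
                            uint8_t bit_depth) {
    unsigned channels = png_channels(color_type);
    assert(channels != 0);
    uint64_t bits = (uint64_t)width * channels * bit_depth;
    return (size_t)((bits + 7) / 8); /* round up to whole bytes */
}

/* Size of the decompressed IDAT data for a non-interlaced image:
 * every scanline is preceded by one filter type byte. */
static size_t png_raw_size(uint32_t width, uint32_t height,
                           uint8_t color_type, uint8_t bit_depth) {
    return (size_t)height * (1 + png_row_bytes(width, color_type, bit_depth));
}
```

For the 256x256 indexed image with bit depth 4 this gives 128 bytes per scanline, and for the 70x70 truecolor image with bit depth 8 it gives 210, matching the highlighted regions described above.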