Storage format of video and audio data in AVFrame

2018-10-31
 

How AVFrame stores video and audio data

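AVFrame has two very important data members: data and linesize. data holds the raw, unencoded data (whether video or audio). linesize holds the size in bytes of each row of that data for video, and the size in bytes of each plane for audio.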

data is defined as follows:

    /**
     * pointer to the picture/channel planes.
     * This might be different from the first allocated byte
     *
     * Some decoders access areas outside 0,0 - width,height, please
     * see avcodec_align_dimensions2(). Some filters and swscale can read
     * up to 16 bytes beyond the planes, if these filters are to be used,
     * then 16 extra bytes must be allocated.
     */
    uint8_t *data[AV_NUM_DATA_POINTERS];

linesize is defined as follows:

    /**
     * For video, size in bytes of each picture line.
     * For audio, size in bytes of each plane.
     *
     * For audio, only linesize[0] may be set. For planar audio, each channel
     * plane must be the same size.
     *
     * For video the linesizes should be multiplies of the CPUs alignment
     * preference, this is 16 or 32 for modern desktop CPUs.
     * Some code requires such alignment other code can be slower without
     * correct alignment, for yet other it makes no difference.
     *
     * @note The linesize may be larger than the size of usable data -- there
     * may be extra padding present for performance reasons.
     */
    int linesize[AV_NUM_DATA_POINTERS];

Note: for audio, only linesize[0] holds a valid value, because every channel plane (left and right, for example) is the same size.

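3. Storage layout

1. Video

Video is by far the simpler case. Taking planar yuv420 as an example, the image data is stored in AVFrame like this:
data[0] holds Y
data[1] holds U
data[2] holds V
and the corresponding row sizes are:
linesize[0] is the number of bytes per row of the Y plane
linesize[1] is the number of bytes per row of the U plane
linesize[2] is the number of bytes per row of the V plane
Each linesize value is a per-row stride, not the size of the whole plane. The memory a plane occupies is linesize[i] multiplied by the number of rows in that plane: height rows for Y, and half that for U and V.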
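
The relationship between linesize and plane size can be sketched in a few lines of C. This is only an illustration under stated assumptions: Yuv420pView is a hypothetical stand-in for the three AVFrame fields involved (it is not the real struct), the format is 8-bit planar yuv420, and every linesize is positive. Both function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the AVFrame fields used here; NOT the real struct. */
typedef struct {
    uint8_t *data[3];   /* Y, U, V plane pointers */
    int linesize[3];    /* bytes per row of each plane, padding included */
    int width, height;  /* luma dimensions in pixels */
} Yuv420pView;

/* Bytes a plane occupies in memory: row stride times number of rows. */
size_t plane_buffer_size(const Yuv420pView *f, int plane)
{
    assert(plane >= 0 && plane < 3);
    /* In 4:2:0 the chroma planes have half the rows (rounded up). */
    int rows = plane == 0 ? f->height : (f->height + 1) / 2;
    return (size_t)f->linesize[plane] * (size_t)rows;
}

/* Copy one plane into dst without the per-row padding.
 * Returns the number of bytes written (visible columns times rows). */
size_t copy_plane_tight(const Yuv420pView *f, int plane, uint8_t *dst)
{
    assert(plane >= 0 && plane < 3);
    int cols = plane == 0 ? f->width  : (f->width  + 1) / 2;
    int rows = plane == 0 ? f->height : (f->height + 1) / 2;
    for (int y = 0; y < rows; y++) {
        /* Source rows are linesize bytes apart; destination rows are cols apart. */
        memcpy(dst + (size_t)y * cols,
               f->data[plane] + (size_t)y * f->linesize[plane],
               (size_t)cols);
    }
    return (size_t)cols * (size_t)rows;
}
```

Copying row by row matters because linesize may be larger than the visible width; a single memcpy of width * height bytes would pull padding bytes into the output.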

2. Audio

Audio data is somewhat more complicated. Audio sample formats are divided into planar and non-planar (packed) types. The sample formats are defined as follows:

    /**
     * Audio Sample Formats
     *
     * @par
     * The data described by the sample format is always in native-endian order.
     * Sample values can be expressed by native C types, hence the lack of a signed
     * 24-bit sample format even though it is a common raw audio data format.
     *
     * @par
     * The floating-point formats are based on full volume being in the range
     * [-1.0, 1.0]. Any values outside this range are beyond full volume level.
     *
     * @par
     * The data layout as used in av_samples_fill_arrays() and elsewhere in FFmpeg
     * (such as AVFrame in libavcodec) is as follows:
     *
     * For planar sample formats, each audio channel is in a separate data plane,
     * and linesize is the buffer size, in bytes, for a single plane. All data
     * planes must be the same size. For packed sample formats, only the first data
     * plane is used, and samples for each channel are interleaved. In this case,
     * linesize is the buffer size, in bytes, for the 1 plane.
     */
    enum AVSampleFormat {
        AV_SAMPLE_FMT_NONE = -1,
        AV_SAMPLE_FMT_U8,          ///< unsigned 8 bits
        AV_SAMPLE_FMT_S16,         ///< signed 16 bits
        AV_SAMPLE_FMT_S32,         ///< signed 32 bits
        AV_SAMPLE_FMT_FLT,         ///< float
        AV_SAMPLE_FMT_DBL,         ///< double

        AV_SAMPLE_FMT_U8P,         ///< unsigned 8 bits, planar
        AV_SAMPLE_FMT_S16P,        ///< signed 16 bits, planar
        AV_SAMPLE_FMT_S32P,        ///< signed 32 bits, planar
        AV_SAMPLE_FMT_FLTP,        ///< float, planar
        AV_SAMPLE_FMT_DBLP,        ///< double, planar

        AV_SAMPLE_FMT_NB           ///< Number of sample formats. DO NOT USE if linking dynamically
    };

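The formats whose names end in P are the planar types. You can call av_sample_fmt_is_planar to check whether a given sample format is planar.
Non-planar (packed) data first:
For a two-channel (left and right) stream, the samples are interleaved as LRLRLR... (in FFmpeg's standard stereo layout the left channel comes first). All of the data sits in data[0], and the size of that buffer is linesize[0].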
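
As a rough sketch of what "interleaved" means for indexing, the helper below reads one sample out of a packed S16 buffer. It is an illustration only: the function names are hypothetical, the buffer is assumed to be what data[0] points at for a packed signed 16-bit format, and no FFmpeg headers are used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return sample number `i` of channel `ch` from a packed (interleaved)
 * signed 16-bit buffer, i.e. the memory data[0] would point at. */
int16_t packed_s16_sample(const uint8_t *data0, int channels, int i, int ch)
{
    assert(ch >= 0 && ch < channels);
    int16_t value;
    /* Samples of one time step are adjacent: index = i * channels + ch. */
    size_t offset = ((size_t)i * (size_t)channels + (size_t)ch) * sizeof value;
    memcpy(&value, data0 + offset, sizeof value);  /* avoids alignment assumptions */
    return value;
}

/* Bytes of actual sample data in the single packed plane. */
size_t packed_data_size(int bytes_per_sample, int channels, int nb_samples)
{
    return (size_t)bytes_per_sample * (size_t)channels * (size_t)nb_samples;
}
```

For stereo, channel 0 is L and channel 1 is R, so sample i of the right channel sits at element index 2 * i + 1.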
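Planar data:
This is rather like the YUV planes on the video side. For the same two-channel PCM audio, taking S16P as an example, the layout would be
plane 0: LLLLLLLLLLLLLLLLLLLLLLLLLL...
plane 1: RRRRRRRRRRRRRRRRRRRRRRRRRR...
...
and the planes are stored as:
data[0] holds plane 0
data[1] holds plane 1
The buffer size of every plane is linesize[0]. The number of bytes of actual sample data in one plane can be computed as av_get_bytes_per_sample(out_stream->codec->sample_fmt) * out_frame->nb_samples; linesize[0] may be larger than this because of padding.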
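
To make the planar layout concrete, here is a minimal sketch that interleaves S16P planes into a packed S16 buffer. It rests on assumptions: the function names are hypothetical, the planes are passed as plain int16_t pointers rather than through a real AVFrame, and bytes_per_sample stands for the value av_get_bytes_per_sample would return (2 for S16P).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes of actual sample data in ONE plane of a planar format. */
size_t plane_data_size(int bytes_per_sample, int nb_samples)
{
    return (size_t)bytes_per_sample * (size_t)nb_samples;
}

/* Interleave planar S16 (one plane per channel) into packed S16.
 * planes[ch][i] is sample i of channel ch; dst must hold
 * channels * nb_samples samples. */
void s16p_to_s16(const int16_t *const *planes, int channels,
                 int nb_samples, int16_t *dst)
{
    assert(channels > 0 && nb_samples >= 0);
    for (int i = 0; i < nb_samples; i++) {
        for (int ch = 0; ch < channels; ch++) {
            /* LLL... / RRR... becomes LRLRLR... */
            dst[(size_t)i * (size_t)channels + (size_t)ch] = planes[ch][i];
        }
    }
}
```

In real code this kind of conversion is normally left to libswresample (swr_convert); the loop above only shows how the two layouts map onto each other.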



Reposted from: http://blog.csdn.net/dancing_night/article/details/45642493