Storage Format of Video and Audio Data in AVFrame

2018-10-31

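AVFrame has two very important data members: data and linesize. data holds the raw, unencoded data (for both video and audio). For video, linesize holds the size in bytes of each line (row) of the corresponding data plane; for audio, it holds the size in bytes of each plane.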

data is defined as follows:

    /**
     * pointer to the picture/channel planes.
     * This might be different from the first allocated byte
     *
     * Some decoders access areas outside 0,0 - width,height, please
     * see avcodec_align_dimensions2(). Some filters and swscale can read
     * up to 16 bytes beyond the planes, if these filters are to be used,
     * then 16 extra bytes must be allocated.
     */
    uint8_t *data[AV_NUM_DATA_POINTERS];

linesize is defined as follows:

    /**
     * For video, size in bytes of each picture line.
     * For audio, size in bytes of each plane.
     *
     * For audio, only linesize[0] may be set. For planar audio, each channel
     * plane must be the same size.
     *
     * For video the linesizes should be multiples of the CPUs alignment
     * preference, this is 16 or 32 for modern desktop CPUs.
     * Some code requires such alignment other code can be slower without
     * correct alignment, for yet other it makes no difference.
     *
     * @note The linesize may be larger than the size of usable data -- there
     * may be extra padding present for performance reasons.
     */
    int linesize[AV_NUM_DATA_POINTERS];

Note: for audio, only linesize[0] holds a valid value, because every channel plane (for example left and right) must be the same size.

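3. Storage layout

1. Video

Video is much the simpler case. Taking planar YUV 4:2:0 (AV_PIX_FMT_YUV420P) as an example, the image data is stored in AVFrame like this: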

data[0] stores the Y plane

data[1] stores the U plane

data[2] stores the V plane

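The corresponding line sizes are:

linesize[0] is the size in bytes of one row of the Y plane

linesize[1] is the size in bytes of one row of the U plane

linesize[2] is the size in bytes of one row of the V plane

Note that these are per-row sizes (strides), not the total size of each plane, and that they may be larger than the visible width because of alignment padding. In YUV 4:2:0 the U and V planes have half the width and half the height of the Y plane.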
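To make the role of linesize concrete, here is a minimal sketch in plain C of copying one padded plane into a tightly packed buffer. It does not call FFmpeg: copy_plane and yuv420p_packed_size are hypothetical helper names, src and stride stand in for data[i] and linesize[i], and the sketch assumes 8-bit samples, even image dimensions, and a positive stride.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not an FFmpeg function): copy one padded plane into a
 * tightly packed buffer. `src` and `stride` play the roles of data[i] and
 * linesize[i]; `row_bytes` is the number of usable bytes in each row. */
void copy_plane(uint8_t *dst, const uint8_t *src, int stride,
                int row_bytes, int rows)
{
    assert(stride >= row_bytes);   /* assumption: positive, top-down stride */
    for (int y = 0; y < rows; y++) {
        /* Row y starts y * stride bytes into the source, not y * row_bytes. */
        memcpy(dst + (size_t)y * row_bytes,
               src + (size_t)y * stride,
               (size_t)row_bytes);
    }
}

/* Size in bytes of a tightly packed 8-bit YUV 4:2:0 image with even
 * dimensions: a full-resolution Y plane plus U and V planes that each have
 * half the width and half the height. */
size_t yuv420p_packed_size(int width, int height)
{
    size_t luma   = (size_t)width * (size_t)height;
    size_t chroma = (size_t)(width / 2) * (size_t)(height / 2);
    return luma + 2 * chroma;
}
```

With a real frame, one would presumably call copy_plane once per plane, passing the full width and height for Y and the halved values for U and V.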

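2. Audio

Audio data is somewhat more complicated. Audio sample formats are divided into planar and non-planar (packed) types. The sample formats are defined as follows: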

    /**
     * Audio Sample Formats
     *
     * @par
     * The data described by the sample format is always in native-endian order.
     * Sample values can be expressed by native C types, hence the lack of a signed
     * 24-bit sample format even though it is a common raw audio data format.
     *
     * @par
     * The floating-point formats are based on full volume being in the range
     * [-1.0, 1.0]. Any values outside this range are beyond full volume level.
     *
     * @par
     * The data layout as used in av_samples_fill_arrays() and elsewhere in FFmpeg
     * (such as AVFrame in libavcodec) is as follows:
     *
     * For planar sample formats, each audio channel is in a separate data plane,
     * and linesize is the buffer size, in bytes, for a single plane. All data
     * planes must be the same size. For packed sample formats, only the first data
     * plane is used, and samples for each channel are interleaved. In this case,
     * linesize is the buffer size, in bytes, for the 1 plane.
     */
    enum AVSampleFormat {
        AV_SAMPLE_FMT_NONE = -1,
        AV_SAMPLE_FMT_U8,          ///< unsigned 8 bits
        AV_SAMPLE_FMT_S16,         ///< signed 16 bits
        AV_SAMPLE_FMT_S32,         ///< signed 32 bits
        AV_SAMPLE_FMT_FLT,         ///< float
        AV_SAMPLE_FMT_DBL,         ///< double

        AV_SAMPLE_FMT_U8P,         ///< unsigned 8 bits, planar
        AV_SAMPLE_FMT_S16P,        ///< signed 16 bits, planar
        AV_SAMPLE_FMT_S32P,        ///< signed 32 bits, planar
        AV_SAMPLE_FMT_FLTP,        ///< float, planar
        AV_SAMPLE_FMT_DBLP,        ///< double, planar

        AV_SAMPLE_FMT_NB           ///< Number of sample formats. DO NOT USE if linking dynamically
    };

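The formats whose names end in P are the planar types. av_sample_fmt_is_planar can be used to check whether a given sample format is planar.

Non-planar (packed) data first:

For two-channel (left/right) audio, the samples are interleaved as LRLRLR......... (in FFmpeg's standard stereo layout the left channel comes first). All of the data sits in data[0], and the size of that buffer is linesize[0].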
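The interleaved layout can be illustrated with a short plain-C sketch. It does not use FFmpeg: packed_sample_s16 and packed_bytes are hypothetical helper names, packed stands in for data[0] cast to int16_t *, and 16-bit signed samples are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an FFmpeg function): read one sample from a packed
 * S16 buffer. Sample `index` of channel `channel` sits at position
 * index * channels + channel, because the channels are interleaved. */
int16_t packed_sample_s16(const int16_t *packed, int channels,
                          int index, int channel)
{
    assert(channel >= 0 && channel < channels);
    return packed[(size_t)index * (size_t)channels + (size_t)channel];
}

/* Number of bytes actually occupied by samples in a packed buffer. The real
 * linesize[0] may be larger than this because of padding. */
size_t packed_bytes(int bytes_per_sample, int channels, int nb_samples)
{
    return (size_t)bytes_per_sample * (size_t)channels * (size_t)nb_samples;
}
```

For stereo, channel 0 would be the left channel and channel 1 the right, matching the LRLRLR order.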

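Planar data:

This is somewhat like the YUV planes in video. For the same two-channel PCM audio, taking S16P as an example, the layout would be
plane 0: LLLLLLLLLLLLLLLLLLLLLLLLLL...
plane 1: RRRRRRRRRRRRRRRRRRRR....

The planes are stored as follows:

data[0] stores plane 0

data[1] stores plane 1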

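Both planes have the same buffer size, given by linesize[0]. The number of bytes actually occupied by samples in each plane can be computed as av_get_bytes_per_sample(out_stream->codec->sample_fmt) * out_frame->nb_samples; linesize[0] may be larger than this value because of alignment padding. In newer FFmpeg versions, where AVStream no longer has a codec member, the frame's own format field carries the sample format.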
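The relationship between the packed and planar layouts can be sketched in plain C as a de-interleaving loop. This is only an illustration under stated assumptions, not FFmpeg's own conversion path (which is libswresample): deinterleave_s16 and plane_bytes are hypothetical helper names, planes stands in for data[] cast to int16_t *, and 16-bit signed samples are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an FFmpeg function): split a packed S16 buffer
 * (LRLRLR...) into one plane per channel (LLL..., RRR...). */
void deinterleave_s16(const int16_t *packed, int16_t *const planes[],
                      int channels, int nb_samples)
{
    assert(channels > 0 && nb_samples >= 0);
    for (int i = 0; i < nb_samples; i++) {
        for (int ch = 0; ch < channels; ch++) {
            /* Packed position index * channels + channel goes to planes[ch][i]. */
            planes[ch][i] = packed[(size_t)i * (size_t)channels + (size_t)ch];
        }
    }
}

/* Usable bytes in one plane: the same product as
 * av_get_bytes_per_sample(fmt) * nb_samples in the text above. */
size_t plane_bytes(int bytes_per_sample, int nb_samples)
{
    return (size_t)bytes_per_sample * (size_t)nb_samples;
}
```

For example, an S16P frame with 1024 samples per channel carries 2 * 1024 = 2048 usable bytes in each plane.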

Reposted from: http://blog.csdn.net/dancing_night/article/details/45642493
