All the motion vector data is predicted from previously
encoded data from nearest neighbours. In predicting the data
a number of conventions are observed.
The first convention is that all the so-called block data (prediction
modes, the motion vectors themselves, and/or any DC values) is
associated with the top-left block of the prediction unit to
which it refers.
This allows for a consistent prediction and coding structure to
be adopted.
Example. If splitting level=1, then the
prediction units in a MB are sub-MBs. Nevertheless, the
prediction mode and any motion vectors are associated with the
top-left block of each sub-MB, and values need not be coded for
the other blocks in the sub-MB.
The second convention is that all MB data is scanned in
raster order for encoding purposes. All block data is scanned
first by MB in raster order, and then in raster order within
each MB. That is, taking each MB in raster order, each block
value which needs to be coded within that MB is coded in raster
order:
Figure: block data is scanned in raster order by MB
and then in raster order within each MB
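The scan order can be sketched as follows. This is a minimal illustration, not the codec's actual implementation: it assumes a MB of 4x4 blocks, sub-MBs of 2x2 blocks, and a splitting level of 2 meaning individual blocks, none of which is stated above. The function name `coded_blocks` is hypothetical.

```python
def coded_blocks(split_levels, blocks_per_mb=4):
    """List the (x, y) block positions that carry a coded value, in scan order.

    split_levels is a 2D list [mb_y][mb_x] of splitting levels (0, 1 or 2).
    Assumption: a MB is blocks_per_mb x blocks_per_mb blocks, and level n
    divides the MB into 2**n by 2**n prediction units.
    """
    order = []
    for mb_y, row in enumerate(split_levels):        # MBs in raster order
        for mb_x, level in enumerate(row):
            step = blocks_per_mb >> level            # unit size: 4, 2 or 1 blocks
            for by in range(0, blocks_per_mb, step): # raster order within the MB
                for bx in range(0, blocks_per_mb, step):
                    # Only the top-left block of each prediction unit is coded.
                    order.append((mb_x * blocks_per_mb + bx,
                                  mb_y * blocks_per_mb + by))
    return order
```

For a row of two MBs with splitting levels 0 and 1, `coded_blocks([[0, 1]])` gives the single top-left block of the first MB followed by the four sub-MB corners of the second.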
The third convention concerns the availability of values for
prediction purposes when they may not be coded for every block.
Since prediction is based on neighbouring values, it is
necessary to propagate values for the purposes of prediction
whenever the splitting level and prediction modes mean that
values are not coded for every block.
Example. In the next diagram, we can see the effect of
this. Suppose we are coding REF1_x. In the first MB, splitting level=0
and so at most only the top-left block needs a value, which can
be predicted from values in previously coded MBs. As it
happens, the prediction mode is REF1_ONLY and so a value v is coded. The
value v is then deemed to apply to every block in the MB.
In the next MB, splitting level=1, so the unit of
prediction is the sub-MB. In the top-left sub-MB the prediction mode is,
say, REF1AND2 and so a value x is coded for the top-left
block of that sub-MB. It can be predicted from any available
values in neighbouring blocks, and in particular the value v is
available from the adjacent block.
Figure: For the purposes of prediction, values are
deemed to be propagated within MBs or sub-MBs.
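The propagation in this example can be sketched in a few lines. The 4x4-block MB, the helper name `propagate` and the numeric values chosen for v and x are assumptions made for illustration only.

```python
def propagate(grid, x0, y0, size, value):
    """Copy value into every block of the size-by-size unit with top-left block (x0, y0)."""
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            grid[y][x] = value

# One row of two MBs, assuming 4x4 blocks per MB; None marks "no value yet".
grid = [[None] * 8 for _ in range(4)]
v, x = 5, 7                      # illustrative values only
propagate(grid, 0, 0, 4, v)      # first MB: splitting level 0, v fills the MB
propagate(grid, 4, 0, 2, x)      # second MB: x fills its top-left sub-MB
left_neighbour = grid[0][3]      # block to the left of block (4, 0)
```

After the first call, the block adjacent to the second MB holds v, which is why v is available when x is predicted.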
Prediction methods
The prediction used depends on the MV data being
coded, but in all cases the
aperture for the predictor is shown in the figure below. This aperture
is interpreted as blocks where block data is concerned and MBs where MB data is
concerned.
Figure: Aperture for MV prediction.
The splitting level is predicted as the mean of the levels
of the three MBs in the aperture.
Of the block data, the MV prediction mode is predicted for reference 1
and reference 2 separately. The MV prediction mode is interpreted as a two-bit
number encoding the four possibilities: INTRA=0, REF1_ONLY=1, REF2_ONLY=2 and
REF1AND2=3. In this way the first bit (bit 0) is a flag indicating that reference 1 is
used, and the second bit (bit 1) that reference 2 is used. Bit 0 and bit 1 are predicted
separately: if most of the 3 predicting blocks use
reference 1 (i.e. their mode is REF1_ONLY or REF1AND2, so bit 0 is set) then
it is predicted that bit 0 is set, and likewise for bit 1.
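The per-bit majority vote might look like the sketch below. The function name `predict_mode` is hypothetical, and the handling of fewer than three neighbours (a strict majority of those present) is an assumption, since the text only describes the three-block case.

```python
INTRA, REF1_ONLY, REF2_ONLY, REF1AND2 = 0, 1, 2, 3

def predict_mode(neighbour_modes):
    """Predict a two-bit prediction mode from the modes of the aperture blocks."""
    prediction = 0
    for bit in (0, 1):                       # bit 0: reference 1, bit 1: reference 2
        votes = sum((mode >> bit) & 1 for mode in neighbour_modes)
        # Assumption: a strict majority of the supplied neighbours sets the bit.
        if 2 * votes > len(neighbour_modes):
            prediction |= 1 << bit
    return prediction
```

For instance, neighbours REF1_ONLY, REF1AND2 and INTRA give two votes for bit 0 and one for bit 1, so the predicted mode is REF1_ONLY.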
The DC values are predicted by the average of the three values in the aperture.
The motion vectors themselves are predicted by predicting the horizontal and
vertical components separately. The predictor for motion vector components is the median of the three corresponding
values in the prediction aperture for the block.
In many cases MV or DC values are not available from all blocks in
the aperture, for example because a neighbouring block uses a different
prediction mode. Such blocks are simply excluded from
consideration. Where only two values are available, the median motion
vector predictor reverts to a mean. Where only one value is available, this
is the prediction. Where no value is available, no prediction is made, except for
the DC values, where 0 is used by default.
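These fallback rules can be summarised in a short sketch. The function names are hypothetical, and the rounding of the mean (floor division here) is an assumption, as the text does not specify it.

```python
def predict_mv_component(values):
    """Predict one motion vector component from the available aperture values.

    values holds 0 to 3 integers. Returns None when nothing is available,
    meaning that no prediction is made.
    """
    if not values:
        return None
    if len(values) == 3:
        return sorted(values)[1]               # median of three
    if len(values) == 2:
        return (values[0] + values[1]) // 2    # mean; rounding is an assumption
    return values[0]                           # the single available value

def predict_mv(neighbours):
    """Predict (x, y) by treating horizontal and vertical components separately."""
    xs = [mv[0] for mv in neighbours]
    ys = [mv[1] for mv in neighbours]
    return predict_mv_component(xs), predict_mv_component(ys)

def predict_dc(values):
    """Average of the available DC values, or 0 when none is available."""
    if not values:
        return 0
    return sum(values) // len(values)          # rounding is an assumption
```

Because the components are handled separately, the predicted vector need not equal any one neighbouring vector: `predict_mv([(4, 1), (-2, 9), (7, 3)])` takes 4 from the first neighbour and 3 from the third.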
Some of the data can take only a few values: 3 in the case of MB_split,
and 2 in a second case. The prediction for such data can therefore use modulo arithmetic, producing
an unsigned prediction residue of 0, 1 or 2 in the first case
and 0 or 1 in the second. All other predictions produce signed
prediction residues.
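A minimal sketch of modulo prediction, assuming the residue is the difference between value and prediction reduced modulo the number of possible values (the exact formula is not given above; the function names are hypothetical):

```python
def encode_residue(value, prediction, num_values):
    """Unsigned residue in the range 0 .. num_values - 1."""
    return (value - prediction) % num_values

def decode_value(residue, prediction, num_values):
    """Inverse of encode_residue for the same prediction."""
    return (residue + prediction) % num_values
```

With three possible values, an actual value of 0 against a prediction of 2 gives a residue of 1 rather than -2, and the decoder recovers 0 from (1 + 2) mod 3.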