Dirac: RDO motion estimation metric
RDO motion estimation metric

Previous: Motion estimation
Next: Macroblock structures - contents

The performance of motion estimation and motion-vector coding is critical to the performance of a video coding scheme. With motion vectors at 1/4 or 1/8 pixel accuracy, a naive strategy of finding the best match between frames can greatly inflate the resulting bitrate for little or no gain in quality, because the additional accuracy is very sensitive to noise. What is required is the ability to trade off the vector bitrate against prediction accuracy, and hence against the bitrate required to code the residual frame and the eventual quality of that frame, whilst at the same time making the estimator more robust.

The simplest way to do this is to incorporate a smoothing factor into the metric used for matching blocks. So the metric consists of a basic block matching metric, plus some constant times a measure of the local motion vector smoothness. The basic block matching metric used by Dirac is Sum of Absolute Differences (SAD). Given two blocks X,Y of samples, this is given by:

$$\mathrm{SAD}(X,Y) = \sum_{i,j} \left| X_{i,j} - Y_{i,j} \right|$$
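
As an illustration of the formula above, here is a minimal sketch of SAD in plain Python. The function name `sad` and the representation of a block as a list of rows are assumptions made for this example; they are not taken from the Dirac source code.

```python
def sad(block_x, block_y):
    """Sum of Absolute Differences between two equally sized blocks.

    Each block is a list of rows of integer samples (an assumed,
    illustrative representation).
    """
    if len(block_x) != len(block_y):
        raise ValueError("blocks must have the same number of rows")
    total = 0
    for row_x, row_y in zip(block_x, block_y):
        if len(row_x) != len(row_y):
            raise ValueError("rows must have the same length")
        # Accumulate |X[i][j] - Y[i][j]| over one row.
        total += sum(abs(a - b) for a, b in zip(row_x, row_y))
    return total


current = [[10, 12], [11, 13]]
reference = [[9, 15], [11, 10]]
print(sad(current, reference))  # 1 + 3 + 0 + 3 = 7
```

A perfect match gives a SAD of zero, and the value grows with the total sample-wise mismatch between the two blocks.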

The smoothness measure used is based on the difference between the candidate motion vector and the median of the neighbouring, previously computed motion vectors. Since the blocks are estimated in raster-scan order, vectors for the blocks to the left and above are available for calculating the median:

Figure: neighbouring vectors available in raster-scan order for local variance calculation

The vectors chosen for computing the local median predictor are V2, V3 and V4; this has the merit of being the same predictor as is used in coding the motion vectors.
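
The median predictor can be sketched as follows. This example assumes that the median is taken separately on the x and y components, which the text does not state explicitly, and it takes the three neighbouring vectors as arguments rather than deriving them from block positions, since their positions are defined by the figure. The names `median3` and `median_predictor` are hypothetical.

```python
def median3(a, b, c):
    """Median of three numbers."""
    return sorted((a, b, c))[1]


def median_predictor(v2, v3, v4):
    """Component-wise median of three neighbouring vectors.

    Each vector is an (x, y) tuple. Taking the median per component
    is an assumption made for this sketch.
    """
    return (
        median3(v2[0], v3[0], v4[0]),
        median3(v2[1], v3[1], v4[1]),
    )


# Two neighbours agree roughly; the third is an outlier.
pred = median_predictor((8, -4), (10, 0), (-40, -2))
print(pred)  # (8, -2)
```

Using a median rather than a mean means that a single outlying neighbour, such as the third vector here, does not drag the predictor away from the other two.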

The total metric is a combination of these two metrics. Given a vector V which maps the current frame block X to a block Y=V(X) in the reference frame, the metric is given by:

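$$\mathrm{SAD}(X,Y) + \lambda \cdot \min\left( \left| V_x - \mathrm{pred}_x \right| + \left| V_y - \mathrm{pred}_y \right|,\; 48 \right)$$

where pred = (pred_x, pred_y) is the local median predictor computed from the neighbouring vectors.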

The value λ is a coding parameter used to control the trade-off between the smoothness of the motion vector field and the accuracy of the match. When λ is very large, the local variance dominates the calculation and the motion vector which gives the smallest metric is simply that which is closest to its neighbours. When λ is very small, the metric is dominated by the SAD term, and so the best vector will simply be that which gives the best match for that block. For values in between, varying degrees of smoothness can be achieved. The parameter λ is calculated as a multiple of the RDO parameters for the L1 and L2 frames, so that if the inter frames are compressed more heavily then smoother motion vector fields will also result.

The smoothness term is limited to 48 units of 1/8 pixel, that is, 6 pixels. This prevents the motion field from being smoothed excessively: where there is a genuine motion transition, the motion vector will legitimately differ from its neighbours and should not be excessively penalised.
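
The effect of λ and of the limit of 48 can be illustrated with a small sketch. It assumes vectors are expressed in units of 1/8 pixel, that the SAD for each candidate has already been computed, and that the predictor is given; the function names and the candidate values are invented for the example and are not taken from the Dirac source code.

```python
MAX_SMOOTHNESS = 48  # limit from the text, in units of 1/8 pixel


def smoothness(vector, pred):
    """Capped L1 distance between a candidate vector and the predictor."""
    diff = abs(vector[0] - pred[0]) + abs(vector[1] - pred[1])
    return min(diff, MAX_SMOOTHNESS)


def total_metric(sad_value, vector, pred, lam):
    """Block-matching cost plus lambda times the capped smoothness term."""
    return sad_value + lam * smoothness(vector, pred)


def best_candidate(candidates, pred, lam):
    """Pick the (vector, sad_value) pair with the smallest total metric."""
    return min(candidates, key=lambda c: total_metric(c[1], c[0], pred, lam))


pred = (8, -2)
# Hypothetical candidates: (vector, precomputed SAD).
candidates = [((8, -2), 500), ((24, 6), 420), ((120, 40), 400)]

for lam in (0, 1, 10):
    vector, _ = best_candidate(candidates, pred, lam)
    print(lam, vector)
# 0 (120, 40)   best raw match wins
# 1 (24, 6)     compromise between match and smoothness
# 10 (8, -2)    vector equal to the predictor wins
```

The distant candidate (120, 40) differs from the predictor by 154 units, but it is charged only 48, so at moderate λ a genuinely different vector with a much better match can still be selected.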
