The new dirt and grain reducers from Digital Vision rely heavily on motion estimation technology. So why does that matter, or is it just another buzzword?
Before moving on, we need to define some terminology. The term motion estimation is used for the process of finding the motion of the objects in a picture relative to its surrounding pictures. Motion compensation denotes the process of using that information to compensate for the motion between the pictures used in the temporal processing, so that only pixels from the same object are combined.
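To make the term concrete, the following is a minimal sketch of motion estimation by exhaustive block matching, a generic textbook technique and not the PHAME algorithms described later. The function names, block size, search range and test pictures are all assumptions made for illustration.

```python
def sad(ref, cur, ry, rx, cy, cx, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(ref[ry + i][rx + j] - cur[cy + i][cx + j])
        for i in range(size)
        for j in range(size)
    )


def estimate_motion(ref, cur, top, left, size=2, search=2):
    """Return (dy, dx) such that the block at (top, left) in `cur`
    best matches the block at (top + dy, left + dx) in `ref`."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = top + dy, left + dx
            if ry < 0 or rx < 0 or ry + size > height or rx + size > width:
                continue  # candidate block falls outside the picture
            cost = sad(ref, cur, ry, rx, top, left, size)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2]


def make_frame(obj_top, obj_left, height=6, width=8):
    """Hypothetical test picture: a 2x2 textured object on black."""
    frame = [[0] * width for _ in range(height)]
    values = [[200, 150], [100, 250]]
    for i in range(2):
        for j in range(2):
            frame[obj_top + i][obj_left + j] = values[i][j]
    return frame


previous = make_frame(2, 1)
current = make_frame(2, 3)  # the object has moved 2 pixels to the right
vector = estimate_motion(previous, current, top=2, left=3)
print(vector)  # (0, -2): the block is found 2 pixels to the left in `previous`
```

Motion compensation then amounts to reading the previous picture at the displaced position given by the vector, rather than at the same coordinates.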
Motion estimation can drastically improve the quality of any type of temporal processing, i.e. processing where information from several pictures is used to produce an improved version of a specific picture. Let’s look at two examples, starting with dirt removal.
Dirt removal is a two-stage process. First the dirt must be detected, and then it must be replaced with more appropriate data. Dirt is detected by comparing one picture with the surrounding pictures to identify anomalies present only in the centre picture. This detection can easily be fooled by moving objects, which create a signal that is hard to differentiate from single-frame dirt. Motion estimation greatly improves the detection process by excluding false detections caused by motion. Such false detections would otherwise lead to processing in portions of the picture that require none, with the risk of introducing artefacts.
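The detection stage can be illustrated with a small sketch on one-dimensional scanlines. This is an assumed, simplified detector and not Digital Vision's implementation: a pixel is flagged when it differs from both neighbouring pictures while those neighbours agree with each other. The `motion` list stands in for the output of a motion estimator, and the threshold and pixel values are hypothetical.

```python
def detect_dirt(prev, cur, nxt, motion, threshold=40):
    """Return positions in `cur` that differ from both neighbouring
    pictures while those neighbours agree with each other.

    motion[x] is the displacement per picture of the content at x;
    it is assumed to come from a motion estimator."""
    last = len(cur) - 1
    flagged = []
    for x, value in enumerate(cur):
        v = motion[x]
        before = prev[min(max(x - v, 0), last)]  # where this content was
        after = nxt[min(max(x + v, 0), last)]    # where it will be
        if (abs(value - before) > threshold
                and abs(value - after) > threshold
                and abs(before - after) <= threshold):
            flagged.append(x)
    return flagged


# Hypothetical scanlines: an object (200, 220) moves 2 pixels per picture
# over a background of 10; the centre picture has a dirt speck at x = 8.
prev = [10, 10, 200, 220, 10, 10, 10, 10, 10, 10]
cur = [10, 10, 10, 10, 200, 220, 10, 10, 255, 10]
nxt = [10, 10, 10, 10, 10, 10, 200, 220, 10, 10]

no_motion = [0] * len(cur)
estimated = [0, 0, 0, 0, 2, 2, 0, 0, 0, 0]  # assumed estimator output

print(detect_dirt(prev, cur, nxt, no_motion))  # [4, 5, 8]
print(detect_dirt(prev, cur, nxt, estimated))  # [8]
```

Without motion information the moving object at positions 4 and 5 is mistaken for dirt. With the motion vectors only the real speck at position 8 remains.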
The second stage of the dirt removal process is the actual replacement of dirt pixels with corrected data. This is done by extracting information about a specific pixel from the surrounding pictures at the corresponding location, i.e. from the same object. Extracting this information from the correct location in the surrounding pictures again requires motion estimation to compensate for the motion between the pictures.
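A sketch of the replacement stage, again on one-dimensional scanlines and under assumed names and values: each dirt pixel is replaced by the average of the motion-compensated pixels in the previous and next pictures. Averaging the two neighbours is one simple choice made here for illustration; the source does not say how the corrected data is computed.

```python
def repair_dirt(prev, cur, nxt, dirt_positions, motion):
    """Replace each dirt pixel in `cur` with the average of the
    corresponding (motion-compensated) pixels in `prev` and `nxt`."""
    repaired = list(cur)
    last = len(cur) - 1
    for x in dirt_positions:
        v = motion[x]  # displacement per picture, assumed known
        before = prev[min(max(x - v, 0), last)]
        after = nxt[min(max(x + v, 0), last)]
        repaired[x] = (before + after) // 2
    return repaired


# Hypothetical scanlines: the object (200, 220) moves 2 pixels per picture
# and a dirt speck (value 0) sits on the object at x = 5.
prev = [10, 10, 200, 220, 10, 10, 10, 10, 10, 10]
cur = [10, 10, 10, 10, 200, 0, 10, 10, 10, 10]
nxt = [10, 10, 10, 10, 10, 10, 200, 220, 10, 10]

estimated = [0, 0, 0, 0, 2, 2, 0, 0, 0, 0]  # assumed estimator output
compensated = repair_dirt(prev, cur, nxt, [5], estimated)
uncompensated = repair_dirt(prev, cur, nxt, [5], [0] * len(cur))

print(compensated[5])    # 220: taken from the same object
print(uncompensated[5])  # 10: background copied into the object
```

The uncompensated repair removes the speck but leaves a hole of background in the moving object, which is exactly the kind of artefact the text warns about.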
The second example is noise and grain reduction. Most noise reduction techniques rely on averaging pixels to cancel out the noise. There are two basic techniques: spatial filtering, where only information from a single picture is used, and temporal processing, where information from several pictures is used. Spatial averaging will always introduce a loss of resolution, since pixels from different locations in the picture are combined. The only way to reduce noise without losing resolution is to combine pixels corresponding to the same coordinates from a number of consecutive pictures. However, if there is motion between the pictures used in the temporal process, an artefact known as "smearing" will occur. By compensating for the motion and aligning the corresponding pixels in consecutive frames, temporal filtering can be applied in areas of motion, even at higher levels, without "smearing". This should not be mistaken for motion adaptive processing, a much simpler technique in which only the presence of motion is detected, so that the filter can fall back to spatial filtering in those areas, again introducing a loss of resolution. There are, however, situations where spatial processing is still needed, such as areas where the motion cannot be compensated for. An advanced noise reducer should therefore have both types of processing, so that the most appropriate balance between the two can be used for every individual pixel.
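The difference between plain and motion-compensated temporal averaging can be sketched as follows. Everything here is an assumption for illustration: a scanline panning 2 pixels per picture with wrap-around, a single global motion value, and a deterministic stand-in for noise chosen so that it cancels exactly across the three aligned pictures. Real noise only cancels statistically.

```python
def shift(scene, k):
    """Pan a scanline k pixels to the right with wrap-around."""
    n = len(scene)
    return [scene[(x - k) % n] for x in range(n)]


def add_noise(line, amplitude):
    """Deterministic stand-in for noise: alternating sign per pixel."""
    return [p + (amplitude if x % 2 == 0 else -amplitude)
            for x, p in enumerate(line)]


def temporal_average(prev, cur, nxt, v):
    """Average three pictures along a motion of v pixels per picture.
    v = 0 is plain temporal averaging at identical coordinates."""
    n = len(cur)
    return [(prev[(x - v) % n] + cur[x] + nxt[(x + v) % n]) / 3
            for x in range(n)]


def max_error(result, reference):
    """Largest absolute deviation from the noise-free picture."""
    return max(abs(r - c) for r, c in zip(result, reference))


scene = [10, 10, 200, 220, 10, 10, 10, 10, 10, 10]
clean_cur = shift(scene, 2)
prev = add_noise(shift(scene, 0), 6)
cur = add_noise(clean_cur, -3)
nxt = add_noise(shift(scene, 4), -3)

plain_error = max_error(temporal_average(prev, cur, nxt, 0), clean_cur)
compensated_error = max_error(temporal_average(prev, cur, nxt, 2), clean_cur)
print(plain_error, compensated_error)
```

In this constructed case the plain average smears the moving object into the background and ends up far from the clean picture, while the motion-compensated average removes the noise and keeps the object intact.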
PHAME algorithms
Digital Vision has a long history of motion estimation technology, dating back to the foundation of the company in 1988. The basis for this technology is the PHAME family of algorithms, patented by Digital Vision in the early 1990s and awarded an Emmy in 1992 for their use in standards conversion. The PHAME algorithms make it possible to predict motion in the picture at a very detailed level. Although the technology has been continuously developed and improved, the latest-generation motion estimator, now used for the first time in the ASC3 and AGR4, is based on the same PHAME principles.

by Göran Appelquist
VP Digital Vision Product Unit