New Approaches to Video Matting: Eye-tracking and Parallelization
By gelautz - Posted on October 22nd, 2007
Video matting is the problem of estimating foreground and background colors, as well as an opacity value, for every pixel in an image sequence. Video object segmentation and matting are fundamental operations in many image and video editing and compositing tasks, with numerous applications in, for instance, the entertainment industry. Once a video object has been successfully extracted from its background by appropriate matting techniques, it can be inserted into another video scene and/or combined with artificial objects to create mixed reality applications that merge real and computer-generated worlds.

The matting problem is severely ill-posed and generally cannot be solved without prior knowledge. Early methods constrain the problem by filming the object in front of a constant-colored background, a technique known as blue screen matting. Although this technique has been used effectively in the film industry, it cannot be applied to natural images or video sequences. Subsequently, a number of approaches have been proposed that extract the foreground object directly from natural scenes. In general, these approaches rely on a so-called trimap: a segmentation of the image into definite foreground, definite background, and an unknown region. Fractional opacity values are then computed only for pixels in the unknown region. Creating a good trimap manually requires a considerable amount of user interaction, particularly for foreground objects with many holes and for images with large semi-transparent foreground regions. Recently, a matting system was proposed that iteratively computes an opacity value for every pixel of an image based on a small sample of foreground and background pixels marked by the user. This method creates impressive results for a variety of images without the use of a trimap.
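The per-pixel model behind matting and compositing is usually written as C = αF + (1 − α)B, where α is the pixel's opacity. A minimal NumPy sketch of this compositing step and of a trimap, for illustration only (the 0/128/255 gray-level encoding of the trimap is a common convention, not something stated in this article):

```python
import numpy as np

def composite(fg, bg, alpha):
    """Alpha-composite a foreground over a background, per pixel:
    C = alpha * F + (1 - alpha) * B.
    fg, bg: (H, W, 3) float arrays; alpha: (H, W) float array in [0, 1]."""
    a = alpha[..., None]          # broadcast alpha over the color channels
    return a * fg + (1.0 - a) * bg

def unknown_mask(trimap):
    """Return a boolean mask of the pixels whose opacity must be estimated.
    Assumed trimap convention: 0 = definite background,
    255 = definite foreground, anything else = unknown."""
    return (trimap != 0) & (trimap != 255)
```

With a trimap, a matting algorithm only has to solve for α inside `unknown_mask(trimap)`; the definite regions pin α to 0 or 1 directly.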
However, the iterative matting system leads to a significant increase in computation time and may also get stuck in a local minimum. Although the aforementioned advances in matting have led to increasingly good results, many problems remain to be tackled. In particular, the extraction of foreground objects from videos, as opposed to still images, is a very challenging topic: state-of-the-art video matting systems require far more user interaction than still-image systems and exceed the power of today's general-purpose computer hardware. In this PhD project, we will explicitly tackle these problems in video matting by developing new and efficient algorithms to quickly extract high-quality video objects from natural image sequences. We seek to develop a new level of advanced user interaction techniques based on eye-tracking technology and parallelization. This will allow us to design interactive matting approaches that iteratively improve the matte by alternating between user interaction and the computation of intermediate results at high processing rates. The user input will be supported by novel eye-tracking technology in a multimodal interface design.