- Free viewpoint television
Free viewpoint television (FTV) is a system for viewing natural video, allowing the user to interactively control the viewpoint and generate new views of a dynamic scene from any 3D position. The equivalent system for synthetic video is known as virtual reality. With FTV, the focus of attention can be controlled by the viewers rather than a director, meaning that each viewer may be observing a unique viewpoint. It remains to be seen how FTV will affect television watching as a group activity.
History
Systems for rendering arbitrary views of natural scenes have long been known in the computer vision community, but only in recent years have speed and quality reached levels suitable for serious consideration as an end-user system.
QuickTime VR might be considered a predecessor to FTV.
Professor Masayuki Tanimoto of Nagoya University (Japan) has done much to promote the term "free viewpoint television" and has published many papers on the ray space representation, although other techniques can be, and are, used for FTV.
Capture and display
To acquire the views needed for a high-quality rendering of the scene from any angle, several cameras are placed around the scene, either in a studio environment or at an outdoor venue such as a sporting arena. The output Multiview Video (MVV) must then be packaged so that the data can be compressed and so that the users' viewing device can easily access the relevant views to interpolate new views.
It is not enough to simply place cameras around the scene to be captured. The geometry of the camera setup must be measured by a process known in computer vision as "camera calibration". Precise manual alignment would be too cumbersome, so typically a "best effort" alignment is performed, after which a test pattern is captured and used to generate the calibration parameters.
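As a rough illustration of how a captured test pattern can yield calibration parameters, the sketch below fits the intrinsic parameters of an idealized pinhole camera by least squares. It is a simplification under stated assumptions: the 3D positions of the pattern points relative to the camera are taken as known, lens distortion is ignored, and the function names (`calibrate_intrinsics`, `project`, `rms_reprojection_error`) are hypothetical. Practical systems also estimate each camera's position and orientation.

```python
import math


def fit_line(xs, us):
    """Least-squares fit of u = f * x + c; returns (f, c)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_u = sum(us) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxu = sum((x - mean_x) * (u - mean_u) for x, u in zip(xs, us))
    f = sxu / sxx
    c = mean_u - f * mean_x
    return f, c


def calibrate_intrinsics(points_3d, pixels):
    """Estimate (fx, fy, cx, cy) of a distortion-free pinhole camera.

    points_3d: pattern points (X, Y, Z) in the camera frame (assumed known).
    pixels:    the observed image positions (u, v) of those points.
    """
    xn = [X / Z for X, Y, Z in points_3d]  # normalized image coordinates
    yn = [Y / Z for X, Y, Z in points_3d]
    fx, cx = fit_line(xn, [u for u, v in pixels])
    fy, cy = fit_line(yn, [v for u, v in pixels])
    return fx, fy, cx, cy


def project(point, fx, fy, cx, cy):
    """Project a camera-frame 3D point to pixel coordinates."""
    X, Y, Z = point
    return fx * X / Z + cx, fy * Y / Z + cy


def rms_reprojection_error(points_3d, pixels, params):
    """Root-mean-square distance between observed and re-projected points."""
    fx, fy, cx, cy = params
    total = 0.0
    for point, (u, v) in zip(points_3d, pixels):
        pu, pv = project(point, fx, fy, cx, cy)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return math.sqrt(total / len(points_3d))
```

The reprojection error gives a simple check on the result: a large value suggests that the pattern was detected poorly or that the pinhole assumption does not hold for the lens.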
Multiview video capture varies from partial (usually about 30 degrees) to complete (360 degrees) coverage of the scene. Because several views are captured, it is also possible to output stereoscopic views suitable for viewing with a 3D display or other 3D methods. Systems with more physical cameras can capture images with more coverage of the viewable scene; however, it is likely that certain regions will always be occluded from any viewpoint. A larger number of cameras should make it possible to obtain high-quality output because less interpolation is needed.
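The interpolation of a new view between two physical cameras can be sketched as follows. This is only one possible approach (disparity-based forward warping of a single scanline), and it assumes a rectified camera pair, integer per-pixel disparities that are already known, and the convention that a left-image pixel at column x with disparity d appears at column x - d in the right image. The function names are hypothetical.

```python
def warp_scanline(row, disparity, shift_scale):
    """Forward-warp one scanline.

    Pixel x moves to round(x + shift_scale * disparity[x]). When two source
    pixels land on the same target, the nearer one (larger disparity) wins.
    Targets that receive no pixel stay None (holes caused by occlusion).
    """
    width = len(row)
    out = [None] * width
    best = [-1.0] * width  # largest disparity seen so far at each target
    for x in range(width):
        target = round(x + shift_scale * disparity[x])
        if 0 <= target < width and disparity[x] > best[target]:
            out[target] = row[x]
            best[target] = disparity[x]
    return out


def interpolate_view(left, right, disp_left, disp_right, alpha):
    """Synthesize a scanline for a virtual camera between two real ones.

    alpha = 0 gives the left camera position, alpha = 1 the right one.
    """
    from_left = warp_scanline(left, disp_left, -alpha)
    from_right = warp_scanline(right, disp_right, 1.0 - alpha)
    out = []
    for a, b in zip(from_left, from_right):
        if a is not None and b is not None:
            out.append((1 - alpha) * a + alpha * b)  # blend both sources
        elif a is not None:
            out.append(a)
        else:
            out.append(b)  # may still be None: seen by neither camera
    return out
```

Pixels that remain `None` correspond to regions occluded in both source views. The closer together the cameras are, the smaller the disparities and the fewer such holes appear, which is one way to see why more cameras reduce the burden on interpolation.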
More cameras mean that efficient coding of the Multiview Video is required. This may not be a serious disadvantage, as there are representations that can remove the redundancy in MVV, such as inter-view coding using MPEG-4, the ray space representation, and geometry videos.
In terms of hardware, the user requires a viewing device that can decode MVV and synthesize new viewpoints, and a 2D or 3D display.
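The redundancy between neighbouring views, and the decoding step required of the viewing device, can be illustrated with a toy inter-view predictor. The sketch below is not the MPEG-4 bitstream or any standardized scheme: it assumes one scanline per view, equal-length views, lossless residuals, and a brute-force search over horizontal shifts, and all function names are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between two equal-length blocks."""
    return sum(abs(x - y) for x, y in zip(a, b))


def predict_block(reference, target, start, size, max_disp):
    """Predict target[start:start+size] from a shifted block of `reference`.

    Tries shifts d = 0..max_disp and keeps the one with the lowest SAD.
    Returns (d, residual), where residual = target block - predicted block.
    """
    block = target[start:start + size]
    best_d, best_cost = 0, None
    for d in range(max_disp + 1):
        lo = start + d
        if lo + size > len(reference):
            break  # shifted block would run past the end of the reference
        cost = sad(reference[lo:lo + size], block)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    predicted = reference[start + best_d:start + best_d + size]
    return best_d, [t - p for t, p in zip(block, predicted)]


def encode_view(reference, target, size, max_disp):
    """Encode `target` as per-block (shift, residual) pairs.

    Trailing samples that do not fill a whole block are dropped here.
    """
    return [predict_block(reference, target, s, size, max_disp)
            for s in range(0, len(target) - size + 1, size)]


def decode_view(reference, encoded, size):
    """Rebuild the target view from the reference and the encoded blocks."""
    out = []
    for i, (d, residual) in enumerate(encoded):
        start = i * size + d
        out.extend(r + e for r, e in zip(reference[start:start + size], residual))
    return out
```

When the second view is largely a shifted copy of the first, most residuals are zero or small, so they cost far fewer bits than coding each view independently. A real codec would additionally quantize and entropy-code the residuals.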
Standardization
The Moving Picture Experts Group (MPEG) is currently investigating Multiview Video Coding under a group called '3DAV' (3D Audio and Visual) headed by Aljoscha Smolic [http://iphome.hhi.de/smolic/personal.html] at the Heinrich-Hertz Institute. This activity falls under ISO/IEC JTC1/SC29/WG11 [http://www.chiariglione.org/mpeg/technologies/mp-mv/index.htm] and is expected to be adopted as part of MPEG-4 when finished. The key technology to be standardized is the specification of the view synthesis engine.
External links
* [http://www.bbc.co.uk/rd/projects/iview iview] is a British DTI project between the BBC, Snell & Wilcox and the University of Surrey to develop an FTV system.
* [http://www.ri.cmu.edu/events/sb35/tksuperbowl.html Eye Vision] is a system developed by Professor Takeo Kanade at CMU for CBS's coverage of Super Bowl XXXV. The user is not able to change viewpoint, but the camera operator is able to choose any virtual viewpoint by synthesizing images from an active vision system.
References
Wikimedia Foundation. 2010.