New MPEG standard with Fraunhofer HHI technology: MPEG glTF 2.0 extension enables photorealistic and animatable avatars
After two years of standardization work, the Moving Picture Experts Group (MPEG) has finalized its scene description activity with an extension to the "Graphics Language Transmission Format" (glTF). The new extension allows, for the first time, the integration of volumetric video and audio into an immersive scene. The Fraunhofer Heinrich Hertz Institute (HHI) played a major role in the research and development of the new standard. In addition to Fraunhofer HHI, companies such as Qualcomm, Nokia, InterDigital, Intel, TNO, Sony, Philips and Xiaomi were actively involved in the standardization process with individual contributions.
The glTF standard is widely used. It was developed by the Khronos Group for the description of three-dimensional models and scenes. The Khronos Group is a consortium of over 150 industry-leading companies creating advanced interoperability standards for 3D graphics, augmented and virtual reality, parallel programming, vision acceleration and machine learning. The MPEG extension builds on glTF version 2.0 and now allows, among other things, video textures, animated meshes, and spatial audio to be included in glTF. This means that volumetric videos can now be distributed to end devices with the glTF format.
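Since glTF is a JSON-based format, the MPEG additions appear as named extension objects inside an ordinary glTF 2.0 document. The following is an illustrative sketch only: the extension names `MPEG_media` and `MPEG_texture_video` come from the MPEG-I Scene Description specification (ISO/IEC 23090-14), but the URI, field values, and overall structure shown here are simplified placeholders, not a complete or normative example.

```python
import json

# Minimal sketch of a glTF 2.0 document that declares MPEG timed-media
# extensions. All concrete values (URI, indices) are placeholders.
gltf = {
    "asset": {"version": "2.0"},
    # The document announces which MPEG extensions it uses
    "extensionsUsed": ["MPEG_media", "MPEG_texture_video"],
    "extensions": {
        "MPEG_media": {
            # Timed media sources (e.g. a streamed volumetric capture)
            "media": [
                {
                    "name": "volumetric_capture",
                    "alternatives": [
                        {"uri": "capture.mp4", "mimeType": "video/mp4"}
                    ],
                }
            ]
        }
    },
    "textures": [
        {
            # A texture whose pixels are fed by the timed video source
            # above rather than by a static image
            "extensions": {"MPEG_texture_video": {"accessor": 0}}
        }
    ],
}

print(json.dumps(gltf, indent=2))
```

A compliant player that does not understand these extensions can still parse the file as regular glTF, which is how the extension mechanism preserves backward compatibility.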
Part of the MPEG extension is the mesh linking technology developed by Fraunhofer HHI. This technology makes it possible to combine the high realism of volumetric videos with the animatability of computer graphics models. Photorealistic virtual persons can now be integrated into mixed reality scenes in order to interact with the users. For example, a photorealistic avatar can now actively maintain eye contact with the user. Previously, this was only possible with computer graphics models, but not with the much more realistic-looking, yet more complex volumetric video content.
"We brought our many years of experience and expertise in the field of computer graphics to the standardization process," said Prof. Peter Eisert, head of the "Vision and Imaging Technology" department at Fraunhofer HHI. "The core idea is to transfer the motion of computer graphics models to the volumetric scan." To do this, the geometry of the computer model is first adapted to the volumetric scan. After this "fitting" process, a correspondence between the two representations is computed so that animations of the computer graphics model can be transferred to the volumetric scan.
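The fitting-and-correspondence idea described above can be sketched in a heavily simplified form. The sketch below is purely illustrative and is not HHI's actual algorithm: it assumes each scan vertex has already been mapped to a single nearest template vertex (a real system would use richer surface correspondences, e.g. barycentric coordinates), and it replays a template animation on the scan by applying the template's per-vertex displacements through that mapping.

```python
import numpy as np

def transfer_motion(scan_vertices, template_rest, template_animated, correspondence):
    """Illustrative motion transfer via a precomputed correspondence.

    scan_vertices:     (N, 3) rest-pose vertices of the volumetric scan
    template_rest:     (M, 3) rest-pose vertices of the CG template
    template_animated: (M, 3) template vertices in the animated pose
    correspondence:    (N,) index of the matched template vertex per scan vertex
    """
    # Motion of the template, expressed as per-vertex displacements
    displacement = template_animated - template_rest
    # Replay that motion on the scan through the correspondence
    return scan_vertices + displacement[correspondence]

# Toy example: a template translated by +1 along x drags the scan with it.
template_rest = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
template_animated = template_rest + np.array([1.0, 0.0, 0.0])
scan = np.array([[0.1, 0.0, 0.0], [0.9, 0.0, 0.0], [0.5, 0.5, 0.0]])
corr = np.array([0, 1, 0])  # matched template vertex for each scan vertex

moved = transfer_motion(scan, template_rest, template_animated, corr)
print(moved)
```

The point of the standardized mesh linking data is that a glTF player only needs this correspondence information at playback time; the expensive fitting step happens once, offline.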
"With the new MPEG extension, the glTF 2.0 format is able to deliver photorealistic volumetric video with the flexibility of a computer graphics model," said Yago Sanchez, researcher in the "Multimedia Communications" group and one of the editors of the standard. "With this information, a compatible glTF player can now freely animate the volumetric video."
"MPEG is always looking for technologies that deliver the highest quality and extend previous capabilities. Therefore, we are proud to have contributed our technologies to the new standard, which thus makes it possible to describe animatable volumetric videos in glTF 2.0," commented Dr. Cornelius Hellge, Group Leader "Multimedia Communications" at Fraunhofer HHI.
Following the MPEG meeting in January 2022, the "Final Draft International Standard" (FDIS) of the Scene Description standard will be issued. This means that the standard is technically finalized and only needs to be formally ratified by ISO, the global federation of national standardization organizations, and IEC, the International Electrotechnical Commission, as the leading standardization organizations.
In 2022, Fraunhofer HHI will focus on further developing the MPEG extension towards interactivity for immersive scenes. The researchers have already created a live demonstration that shows the potential of the newly standardized technology.
Scientific contact:
Dr. Cornelius Hellge
cornelius.hellge@hhi.fraunhofer.de
Tel. +49 30 31002-239