Virtual Reality holds great promise, but streaming VR360 video over today’s networks is a considerable challenge. Delivering a video stream of acceptable quality that also responds to head movements simply requires too much bandwidth. The result is either a low-quality image, or an experience that can make you feel sick, or both. ClearVR tiled streaming, a technology that we originally developed at TNO, solves these problems. ClearVR enables streaming of high-quality VR360 video over existing networks, with a very snappy response to head motion. Tiledmedia’s founders have worked on tiled streaming research since 2011, and the technology is a natural fit for the challenges posed by VR streaming.
Only some 12% of the image is in the user’s viewport
Imagine viewing 360 video content through a Head-Mounted Device (HMD). You can look around, and at any moment in time, you see only a part of the full panorama. In fact, you only see about one-eighth (12%). Streaming the entire panorama is hugely inefficient. Doing so in high quality is downright impossible unless you have an extremely fast internet connection. And note that while today’s HMDs have a resolution that is too low for a truly immersive VR experience, their resolution will increase significantly in the coming years, making the need for an efficient solution only more urgent.
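As a rough sanity check on the one-eighth figure, the share of an equirectangular frame covered by the viewport can be estimated from the field of view. The sketch below assumes a 90° × 90° field of view centered on the horizon, which is an illustrative value and not a figure from this article, and it ignores the distortion that an equirectangular projection introduces towards the poles. The function name is hypothetical.

```python
def erp_viewport_fraction(h_fov_deg: float, v_fov_deg: float) -> float:
    """Approximate share of an equirectangular frame covered by a viewport.

    The frame spans 360 degrees horizontally and 180 degrees vertically,
    so the covered share is the product of the two angular ratios.
    This is only a first-order estimate for a viewport near the horizon.
    """
    return (h_fov_deg / 360.0) * (v_fov_deg / 180.0)


# Assumed field of view: 90 x 90 degrees (not stated in the article).
fraction = erp_viewport_fraction(90.0, 90.0)
print(f"{fraction:.1%}")  # 12.5%
```

With these assumed numbers, seven-eighths of the transmitted pixels would never be seen if the whole panorama were streamed.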
There are two major approaches to bringing the bandwidth down to realistic levels:
- Creating many different versions of the panorama, and streaming the one that best fits the viewpoint;
- Dividing the image into tiles, and only sending the tiles that are in view.
We use the second method, because it is much more scalable than the first one, and requires much less encoding and server resources.
Tiled Streaming enables distribution of VR content:
- At extremely high quality
- With virtually zero motion-to-photon latency
- On any display device (dedicated head-mounted devices, phones, tablets)
- Using standard encoding / decoding systems
- For on-demand and live content
- In a way that is massively scalable to millions of simultaneous users over any CDN, using standard HTTP streaming technology, with no per-user edge processing required
- At bitrates comparable to normal video.
Cutting it up in tiles
Tiled Streaming works with all relevant devices including the popular Oculus Rift and mobile devices like Samsung’s Gear VR. The Tiled Streaming software in these so-called “clients” retrieves only the tiles that are actually visible in the HMD. The panoramic video needs to be encoded in a special way, as a set of independently coded tiles, but this can be done with industry-standard encoders. Typically, there are over a hundred such tiles. They are stored on a Content Distribution Network (CDN), where the client can find them. The client has the logic to request the tiles it needs, decode them, and then rearrange them for rendering on the device.
Sending only the tiles that are in the viewport saves a huge amount of bandwidth
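The client-side selection step can be illustrated with a minimal sketch. It is not the ClearVR implementation: it assumes an equirectangular frame split into a hypothetical 16 × 8 grid (128 tiles), a 90° × 90° field of view, and a viewport approximated as a simple longitude/latitude box. All names and defaults are illustrative.

```python
import math


def visible_tiles(yaw_deg, pitch_deg, h_fov_deg=90.0, v_fov_deg=90.0,
                  cols=16, rows=8):
    """Return sorted (row, col) indices of tiles overlapping the viewport.

    Rows count from the top of the frame (latitude +90) downwards,
    columns from longitude -180 eastwards. Columns wrap around at the
    360-degree seam; rows are clamped at the poles.
    """
    tile_w = 360.0 / cols
    tile_h = 180.0 / rows

    # Vertical extent of the viewport, clamped to the poles.
    top = min(90.0, pitch_deg + v_fov_deg / 2)
    bottom = max(-90.0, pitch_deg - v_fov_deg / 2)
    first_row = int((90.0 - top) // tile_h)
    last_row = min(rows - 1, math.ceil((90.0 - bottom) / tile_h) - 1)

    # Horizontal extent; indices may fall outside 0..cols-1 before wrapping.
    left = yaw_deg - h_fov_deg / 2
    first_col = math.floor((left + 180.0) / tile_w)
    last_col = math.ceil((left + h_fov_deg + 180.0) / tile_w) - 1

    tiles = set()
    for row in range(first_row, last_row + 1):
        for col in range(first_col, last_col + 1):
            tiles.add((row, col % cols))  # wrap across the seam
    return sorted(tiles)


looking_ahead = visible_tiles(yaw_deg=0.0, pitch_deg=0.0)
print(len(looking_ahead), "of", 16 * 8, "tiles")  # 16 of 128 tiles
```

A real client would typically also request a margin of tiles around the viewport and account for projection distortion; both are left out here to keep the sketch short.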
There is also a lower-resolution version of the panorama that is always transmitted. When you move your attention to a different part of the panorama, the device needs to fetch new content from the network. While this happens extremely fast (within 20-40 ms), it still takes a bit of time. The fall-back layer ensures there are no black holes while new tiles are fetched when you turn your head. It also provides an incredibly short “motion-to-photon delay” (the delay that will make you sick if it is too long). That delay is as low as it can possibly be, because it depends only on local processing.
A low-resolution image is always present, ensuring that motion-to-photon latency is minimal
By choosing the tile size in a clever way, the amount of data can be reduced by a factor of approximately five. Put another way, we can send five times as many pixels at the same bitrate, which equates to a much higher resolution and quality.
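To see how a factor of roughly five can arise, consider a back-of-the-envelope calculation. The numbers are illustrative assumptions, not figures from this article: 16 of 128 tiles in view, a fall-back layer at one quarter of the full width and height, and data volume taken as proportional to pixel count.

```python
def reduction_factor(tiles_sent: int, tiles_total: int,
                     fallback_scale: float) -> float:
    """Estimate the data reduction versus streaming the full panorama.

    fallback_scale is the linear downscale of the always-transmitted
    fall-back layer (0.25 means a quarter of the width and height),
    so its pixel share is fallback_scale squared.
    """
    tiled_share = tiles_sent / tiles_total
    fallback_share = fallback_scale ** 2
    return 1.0 / (tiled_share + fallback_share)


# Assumed values: 16 of 128 tiles visible, fall-back layer at 1/4 scale.
factor = reduction_factor(tiles_sent=16, tiles_total=128, fallback_scale=0.25)
print(round(factor, 2))  # 5.33
```

Under these assumptions the client receives about 19% of the pixels of the full panorama. Actual savings depend on tile size, viewport margin, and how the encoder behaves on individual tiles.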
Cubemap rather than ERP
To keep the explanation simple, the examples above use an “Equirectangular Projection” (ERP). In practice, we use a cubemap, which is much better suited to tiling approaches. A cubemap looks like the picture below.
Adaptive bitrate streaming
The original Tiled Streaming technology was designed to support adaptive streaming, with multiple layers to allow zooming and panning in ultra-high resolution imagery. We now apply these principles to VR streaming, where the client has the logic and the flexibility to retrieve the layer that best suits the viewport and network conditions.
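A common way to make that choice is to pick the highest-quality layer whose bitrate fits within the measured throughput, leaving some headroom. The sketch below shows this generic heuristic; it is not necessarily how the ClearVR client decides, and the layer names, bitrates, and safety margin are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    kbps: int  # assumed bitrate needed to stream the viewport at this layer


def pick_layer(layers, throughput_kbps, safety=0.8):
    """Pick the highest-bitrate layer that fits the throughput budget.

    Only a fraction (safety) of the measured throughput is budgeted, to
    absorb fluctuations. If nothing fits, fall back to the cheapest layer.
    """
    budget = throughput_kbps * safety
    affordable = [layer for layer in layers if layer.kbps <= budget]
    if not affordable:
        return min(layers, key=lambda layer: layer.kbps)
    return max(affordable, key=lambda layer: layer.kbps)


# Hypothetical layer ladder, for illustration only.
ladder = [Layer("low", 2000), Layer("mid", 6000), Layer("high", 12000)]
print(pick_layer(ladder, throughput_kbps=10000).name)  # mid
```

With 10,000 kbps measured and a 0.8 safety margin, the budget is 8,000 kbps, so the 12,000 kbps layer is skipped in favor of the 6,000 kbps one.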