
Conference Research Tests Adaptive Video and Quality Benchmarks


The Society for Imaging Science and Technology hosts the annual International Symposium on Electronic Imaging, held this year in San Francisco, California, from February 14 to 18. The Symposium has eight tracks across a range of disciplines, where researchers from industry and academia present papers and findings.

I attended primarily to learn the latest in two arenas: adaptive streaming and video quality benchmarks. In this article, I'll present an overview of the sessions and papers I found most interesting and relevant. The first two relate to work on adaptive streaming performed by Google. The second two discuss how to measure the quality of adaptive streaming experiences.

A Subjective Study for the Design of Multi-Resolution ABR Video Streams With the VP9 Codec

One common problem facing encoding professionals is identifying when to switch between streams in an adaptive group. This paper, authored by Chao Chen, Sasi Inguva, and Anil Kokaram from YouTube/Google, presented a hybrid objective/subjective technique for identifying the appropriate data rate for switching stream resolutions. Though the experiment focused on the 4K/2K decision point using the VP9 codec, the technique can be used for any decision point and codec.

Adaptive streaming involves a group of encoding configurations at various resolution and data rate pairs. At each data rate in the ladder, the player has to choose the appropriate resolution. Intuitively, it's the resolution that delivers the highest quality at that data rate, and Google broke no new ground in sharing this observation.

As mentioned, Google's focus was on the appropriate data rate to switch between 2K and 4K videos, and the short answer is that it's between 4 Mbps and 5 Mbps when encoding with the VP9 codec. How Google got there is the interesting part.

Google selected 7,966 4K videos uploaded to YouTube, created 2K versions, encoded both the 4K and 2K versions with VP9 at various data rates, and computed their Structural Similarity Index (SSIM) scores. Based upon these scores, the average switching rate between 2K and 4K was 4 Mbps. That is, below 4 Mbps, most 2K clips had a higher SSIM rating, while above 4 Mbps, the 4K videos had a higher SSIM rating.
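To make the approach concrete, here's a minimal sketch in Python of how you might reproduce the SSIM crossover test using FFmpeg's libvpx-vp9 encoder and its ssim filter. The file names, bitrates, and encoder settings are my own illustrative choices, not Google's actual pipeline.

import re, subprocess

def encode(src, out, height, mbps):
    # Scale to the target resolution and encode with VP9 at a fixed bitrate.
    subprocess.run(["ffmpeg", "-y", "-i", src, "-vf", f"scale=-2:{height}",
                    "-c:v", "libvpx-vp9", "-b:v", f"{mbps}M", out], check=True)

def ssim(encoded, reference):
    # Upscale the encode to the 4K reference's resolution so the comparison
    # is apples to apples; ffmpeg's ssim filter prints an "All:" score.
    result = subprocess.run(
        ["ffmpeg", "-i", encoded, "-i", reference,
         "-lavfi", "[0:v]scale=3840:2160[e];[e][1:v]ssim", "-f", "null", "-"],
        capture_output=True, text=True)
    return float(re.search(r"All:([\d.]+)", result.stderr).group(1))

for mbps in (2, 3, 4, 5, 6):
    encode("source_4k.mp4", "enc_4k.webm", 2160, mbps)
    encode("source_4k.mp4", "enc_2k.webm", 1440, mbps)
    print(mbps, "Mbps:",
          "4K", round(ssim("enc_4k.webm", "source_4k.mp4"), 4),
          "2K", round(ssim("enc_2k.webm", "source_4k.mp4"), 4))
    # The switching rate is roughly where the 4K encode starts to win.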

To test this premise, Google ran subjective tests, but even Google doesn't have the time, patience, or funds to test 7,966 videos. In fact, the Google researchers wanted to subjectively test just 10. So the question was how to choose ten videos that represent the entire universe of 4K clips that are and ever will be uploaded to YouTube. Not surprisingly, the answer is a bit wonky, though comprehensible if you simply trust Google's math.

The researchers reasoned that the encoding complexity of a video involves two main factors: the amount of motion in the clip and the amount of detail. To assess the amount of detail, the authors used I-frame size, since at a constant quantization parameter, more detail requires a larger file size to preserve. To measure the amount of motion in a clip, the researchers used average P-frame size divided by I-frame size, to decouple the effect that large I-frames can have on P-frames (this is the trust-the-math part).

With these metrics decided upon, the researchers encoded 3,226 of the highest-quality clips in the 4K library to H.264 using FFmpeg with a constant quantization parameter of 28. Then, they measured I-frame size and the P-frame/I-frame ratio, which in essence creates a nine-region taxonomy of 4K clips based upon the amount of motion and detail. In each region, they selected the 20 clips closest to the center of the region, and from those selected the highest-quality clip.
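If you want to experiment with these two features yourself, the sketch below shows one way to compute them with FFmpeg and FFprobe. The constant-QP value matches the paper, but the parsing details and file names are mine, and the tercile thresholds that define the nine regions would have to come from your own corpus.

import subprocess

def complexity_features(src):
    # Constant-QP H.264 encode, per the paper's methodology.
    subprocess.run(["ffmpeg", "-y", "-i", src, "-c:v", "libx264",
                    "-qp", "28", "-an", "probe.mp4"], check=True)
    out = subprocess.run(
        ["ffprobe", "-v", "quiet", "-select_streams", "v:0",
         "-show_entries", "frame=pkt_size,pict_type",
         "-of", "csv=p=0", "probe.mp4"],
        capture_output=True, text=True).stdout
    sizes = {"I": [], "P": []}
    for line in out.splitlines():
        a, b = line.split(",")[:2]
        size, ptype = (a, b) if a.isdigit() else (b, a)  # field order varies
        if ptype in sizes:
            sizes[ptype].append(int(size))
    detail = sum(sizes["I"]) / len(sizes["I"])             # mean I-frame size
    motion = (sum(sizes["P"]) / len(sizes["P"])) / detail  # P/I ratio
    return detail, motion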

From these clips, they produced 2K variants, and encoded these variants and the original 4K clips at 2 Mbps, 3 Mbps, 6 Mbps, and 11 Mbps. Then they scaled the 2K versions back to 4K for side-by-side subjective testing. The result was an average switching rate of about 5 Mbps. From this, the authors concluded: "In this sense, SSIM is probably a good quality index for the purpose of estimating the average resolution switching bitrate for a large amount of videos. Although SSIM may overestimate or underestimate the quality for a particular video, its estimation error will be averaged out when estimating average quality for a large collection of videos."
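For anyone curious what preparing such a side-by-side pair might look like in practice, here's an illustrative Python sketch. The resolutions assume YouTube's 2K is 1440p, and the lossless x264 step for the upscaled clip is my own choice to avoid adding new compression artifacts to the viewing copy.

import subprocess

def prep_pair(src_4k, mbps):
    # 2K variant and 4K original, both encoded with VP9 at the test bitrate.
    subprocess.run(["ffmpeg", "-y", "-i", src_4k, "-vf", "scale=2560:1440",
                    "-c:v", "libvpx-vp9", "-b:v", f"{mbps}M", "enc_2k.webm"],
                   check=True)
    subprocess.run(["ffmpeg", "-y", "-i", src_4k,
                    "-c:v", "libvpx-vp9", "-b:v", f"{mbps}M", "enc_4k.webm"],
                   check=True)
    # Scale the 2K encode back to 4K so both clips display identically.
    subprocess.run(["ffmpeg", "-y", "-i", "enc_2k.webm",
                    "-vf", "scale=3840:2160:flags=lanczos",
                    "-c:v", "libx264", "-qp", "0", "view_2k_at_4k.mp4"],
                   check=True)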

What's significant about this study? Multiple items. First, it validates using SSIM as the basis for determining how to configure streams in adaptive groups. Second, the 4/5 Mbps switch point between 2K/4K video is interesting, though this will vary from codec to codec. Finally, if you find yourself having to select a limited set of clips that accurately reflect the characteristics of a larger group, the I-frame and P-frame/I-frame technique described might just do the trick.

Optimizing Transcode Quality Targets Using a Neural Network With an Embedded Bitrate Model

One of the more significant recent events in the encoding world was Netflix's per-title encode blog post, where the authors discussed their schema for creating a custom encoding ladder for each video distributed by the service. Netflix's approach involves multiple trial encodes, which works well when you distribute a large, but limited, set of content. The compressionists at YouTube have a completely different problem to manage: in essence, how to pull off per-title encoding when you have 300 hours of video uploaded every minute of every day. This talk, and the above-titled paper, discussed their approach.

The paper, authored by Google's Michele Covell, Martin Arjovsky, Yao-chung Lin, and Anil Kokaram, starts by describing the conditions under which YouTube must work. First, YouTube encodes files in parallel, splitting each source into chunks and sending them off to different encoding instances. Since communications between these instances would complicate system design and operation, the solution couldn't involve communications between these instances.

Second, any approach must be codec-agnostic, because YouTube deploys multiple codecs. To make this work, the solution had to depend upon a single rate control parameter for each codec, though it can vary from codec to codec. For x264, which was the focus of the paper, YouTube used the Constant Rate Factor (CRF) value as the single rate control parameter.

CRF is a rate control technique that adjusts the quantization level to optimize quality over the duration of the file (or file segment). The problem with CRF is that it has no bitrate control mechanism; you set the CRF value, and x264 produces a file at whatever data rate is necessary to meet the selected quality level. YouTube's files have to meet a target data rate, so the object of the exercise was how to choose the CRF level that would deliver the required data rate.
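You can see CRF's behavior for yourself with a quick experiment like the one sketched below; run it on a talking-head clip and a sports clip, and the same CRF value will produce very different bitrates, which is exactly the problem YouTube needed to solve.

import subprocess

def crf_encode_bitrate(src, crf):
    # Single-pass CRF encode; no bitrate target is specified anywhere.
    subprocess.run(["ffmpeg", "-y", "-i", src, "-c:v", "libx264",
                    "-crf", str(crf), "-an", "out.mp4"], check=True)
    bps = subprocess.run(
        ["ffprobe", "-v", "quiet", "-select_streams", "v:0",
         "-show_entries", "stream=bit_rate", "-of", "csv=p=0", "out.mp4"],
        capture_output=True, text=True).stdout.strip()
    return int(bps) / 1000  # resulting bitrate in kbps, decided by the content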

One obvious solution would be to run a first encoding pass on all incoming files, and distribute this data to all encoding instances. However, implementing two-pass encoding would dramatically increase the encoding horsepower necessary to process the incoming load. For this reason, YouTube had to implement the solution, if at all possible, in a single pass.

As the paper describes, while YouTube can't afford a first pass on all incoming files, it does gather some information from a high-bitrate mezzanine file produced from all incoming files. Essentially, because users upload files in a variety of formats, sizes, bit rates, and frame rates, this mezzanine file is necessary to normalize these files before encoding. When creating this mezzanine file, YouTube gleans many details about the file, though not up to the level of information gained from a true first encoding pass.

Schooling the Neural Network

The issue was how to predict the right CRF value from this limited information, and for this, YouTube deployed a neural network. At a high level, a neural network is a multiple-CPU system with the ability to learn via training. To train the network, YouTube performed over 137,000 encodes on 14,000 clips, and fed the data into the network. The researchers then encoded 1,000 test clips based upon input from the network and found that the system chose the right CRF value to meet the target data rate 65 percent of the time, with a tolerable bitrate error of under 20 percent. This would mean that 35 percent of the clips would have to be re-encoded to meet the target bitrate.
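To give a flavor of the idea, here's a toy sketch of a regression network that maps clip features to a CRF value, using scikit-learn's MLPRegressor. The feature list and the random training data are placeholders of my own invention; Google's actual model, features, and training corpus are far more elaborate.

import numpy as np
from sklearn.neural_network import MLPRegressor

# Hypothetical per-clip features gleaned from the mezzanine transcode:
# duration, width, height, frame rate, mezzanine bitrate, target bitrate.
X_train = np.random.rand(14000, 6)          # stand-in for the clip corpus
y_train = np.random.uniform(18, 40, 14000)  # CRF values that hit each target

model = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=500)
model.fit(X_train, y_train)

# Predict a CRF for a new upload, encode once at that value, and re-encode
# only if the resulting bitrate misses the target by more than the tolerance.
predicted_crf = model.predict(np.random.rand(1, 6))[0]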

The researchers next evaluated the learning benefit of incorporating the results of a fast, low-quality CRF encode into the system. Specifically, the system encoded a 240-pixel-height video file at a CRF value of 40, and incorporated data from this encode into the neural network training. This boosted accuracy to 80 percent, which means that only 20 percent of the files needed re-encoding.
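A probe encode along these lines is cheap enough to sketch in a few lines. The settings below (240-line height, CRF 40) follow the paper's description, while the choice to use the output file size as the extra feature is my simplification.

import os, subprocess

def probe_feature(src):
    # Fast, low-quality encode whose statistics feed the neural network.
    subprocess.run(["ffmpeg", "-y", "-i", src, "-vf", "scale=-2:240",
                    "-c:v", "libx264", "-crf", "40", "-an", "probe240.mp4"],
                   check=True)
    return os.path.getsize("probe240.mp4")  # bytes; one more model input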

It's tough to say how the average compressionist might use this research, though it does provide a fascinating look into the scale of YouTube's operations, and an interesting example of what neural networks are and the type of work they can perform. Whatever technique you use to optimize encodes, if you're not thinking about per-title or per-category encoding optimization, you're behind the curve.

Subjective Analysis and Objective Characterization of Adaptive Bitrate Videos

Assessing the quality of a single video file via subjective and objective testing is well-traveled ground. However, the Quality of Experience (QoE) of adaptive streaming is much more complicated, involving multiple streams with different quality levels, and different algorithms that determine when and how often to switch streams. This paper, authored by Jacob Søgaard (Technical University of Denmark), Samira Tavakoli (Universidad Politécnica de Madrid), Kjell Brunnström (Acreo Swedish ICT AB and Mid Sweden University), and Narciso García (Universidad Politécnica de Madrid), provides a great explanation of the types of testing performed to assess the QoE of adaptive streaming. Unfortunately, it shows that highly accessible and easy-to-apply objective tests are poor predictors of actual subjective ratings.

Rating the QoE of Adaptive Streaming

Near the beginning of their paper, the authors reference a highly useful paper entitled Quality of Experience and HTTP Adaptive Streaming: A Review of Subjective Studies, which I Googled and was able to download. I suggest that you do the same. As the title suggests, this paper reviewed previous studies and summarized their conclusions, which are relevant to all streaming producers.


Related Articles

Live Video Encoding and Transcoding Techniques

In this session, Jan Ozer presents a live video comparison that includes cost, stream redundancy, packaging flexibility, bandwidth requirements, DRM and captioning support, and scalability.

How Netflix Pioneered Per-Title Video Encoding Optimization

One-size-fits-all encoding doesn't produce the best possible results, so Netflix recently moved to per-title optimization. Learn why this improves video quality and saves on bandwidth, but isn't the right model for every company.

Netflix Re-Encoding Entire Catalog to Reduce File Sizes By 20%

By recognizing that some titles are more visually demanding than others, Netflix has revolutionized the way it encodes video and will dramatically cut down bandwidth requirements.


Source: https://www.streamingmedia.com/Articles/Editorial/Featured-Articles/Conference-Research-Tests-Adaptive-Video-and-Quality-Benchmarks-109907.aspx?CategoryID=422
