
AOMedia Video 1 (AV1) is an open, royalty-free video coding format designed for video transmissions over the Internet. It is being developed by the Alliance for Open Media (AOMedia), a consortium of leading firms from the semiconductor industry, video on demand providers, and web browser developers, founded in 2015. It is the primary contender for standardization by the video standard working group NetVC of the Internet Engineering Task Force (IETF).[1] The group has put together a list of criteria to be met by the new video standard.[2] It is meant to succeed its predecessor VP9 and compete with HEVC/H.265 from the Moving Picture Experts Group.[3]

AOMedia Video 1
Developed by: Alliance for Open Media
Type of format: Compressed video
Contained by: WebM
Extended from: VP9
Open format?: Yes

AV1 can be used together with the audio format Opus in a future version of the WebM format for HTML5 web video and WebRTC.[4]

History

The project was first officially announced in the press release on the formation of the Alliance. The growing usage of its predecessor VP9 is attributed to confidence in the Alliance and in the development of AV1, as well as to the costly and complicated licensing situation of HEVC (High Efficiency Video Coding).[5][6]

The roots of the project precede the Alliance, however. Individual contributors started experimental technology platforms years before: Xiph's/Mozilla's Daala already published code in 2010, VP10 was announced on 12 September 2014, and Cisco's Thor was published on 11 August 2015. The first version 0.1.0 of the AV1 reference codec was published on 7 April 2016.

The bitstream format is projected to be frozen in Q4 of 2017.[7][8] According to Mukund Srinivasan, chief business officer of AOM member Ittiam, early hardware support will be dominated by software running on non-CPU hardware (such as GPGPU, DSP or shader programs, as is the case with some VP9 hardware implementations), because fixed-function hardware will take 12–18 months after the bitstream freeze until chips are available, plus 6 months for products based on those chips to reach the market.[8]

Purpose

The purpose of AV1 is to be as good as possible under royalty-free patent licensing. It is therefore crucial to ensure, during development, that the format does not infringe on patents of competing companies.[9] This contrasts with its main competitor HEVC, for which IPR review was not part of the standardization process.[5] The latter practice is stipulated in ITU-T's definition of an open standard. The case of HEVC's independent patent pools has been characterized by critical observers as a failure of price management.[10][11]

Under patent rules adopted from the World Wide Web Consortium (W3C), technology contributors license their AV1-connected patents to anyone, anywhere, anytime based on reciprocity, i.e. as long as the user does not engage in patent litigation.[12] As a defensive condition, anyone engaging in patent litigation loses the right to the patents of all patent holders.[5]

It aims for state-of-the-art performance with a noticeable compression efficiency advantage at only slightly increased coding complexity. The efficiency goal is a 25% improvement over HEVC.[2] AV1 is primarily intended for lossy encoding, although lossless compression is supported as well.[13]

It is specifically designed for real-time applications (especially WebRTC) and for higher resolutions (wider color gamuts, higher frame rates, UHD) than the typical usage scenarios of the current generation of video formats (H.264), where it is expected to achieve its biggest efficiency gains. It is therefore planned to support the color space from ITU-R Recommendation BT.2020 and 10 and 12 bits of precision per color component.[14]

Cisco is a manufacturer of videoconferencing equipment, and their Thor contributions aim at "reasonable compression at only moderate complexity".[11]

Technology

 
Image caption: AV1 introduces "T-shaped" partitioning schemes for coding units, a feature from VP10.

AV1 is a traditional block-based frequency transform format featuring new techniques taken from several experimental formats that have been testing technology for a next-generation format after HEVC and VP9.[15] Based on Google's experimental VP9 evolution project VP10,[16] AV1 incorporates additional techniques developed in Xiph's/Mozilla's Daala and Cisco's Thor.

AV1 performs internal processing in higher precision (10 or 12 bits per sample), which improves compression because rounding errors in reference imagery are smaller. For intra prediction, there are more than 8 angles for directional prediction, as well as weighted filters for per-pixel extrapolation. Temporal prediction can use more references. Prediction can happen for bigger units (≤128×128), and they can be subpartitioned in more ways. Predictions can be combined within a block in more advanced ways than a uniform average, including smooth and sharp gradients in different directions. This allows either inter–inter or inter–intra predictions to be combined in the same block.[17][18]
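The idea of combining two predictions with something other than a uniform average can be sketched as follows. This is an illustrative toy and not the AV1 algorithm: the function names, the 0 to 64 weight range and the mask shapes are all assumptions chosen for the example.

```python
def blend(pred_a, pred_b, mask, max_weight=64):
    """Per-pixel weighted blend of two equally sized prediction blocks.

    mask[y][x] is the weight given to pred_a (0..max_weight); pred_b
    receives the remainder. The result is rounded to the nearest integer.
    """
    half = max_weight // 2
    return [
        [(m * a + (max_weight - m) * b + half) // max_weight
         for a, b, m in zip(row_a, row_b, row_m)]
        for row_a, row_b, row_m in zip(pred_a, pred_b, mask)
    ]


def uniform_mask(size, max_weight=64):
    """Plain average: every pixel weights both predictions equally."""
    return [[max_weight // 2] * size for _ in range(size)]


def vertical_edge_mask(size, max_weight=64):
    """Sharp split: the left half takes pred_a, the right half pred_b."""
    return [[max_weight if x < size // 2 else 0 for x in range(size)]
            for _ in range(size)]


def horizontal_ramp_mask(size, max_weight=64):
    """Smooth gradient from pred_a (left) to pred_b (right)."""
    return [[max_weight - (max_weight * x) // (size - 1) for x in range(size)]
            for _ in range(size)]


size = 4
pred_a = [[100] * size for _ in range(size)]  # stands in for an inter prediction
pred_b = [[20] * size for _ in range(size)]   # stands in for an intra prediction

averaged = blend(pred_a, pred_b, uniform_mask(size))
split = blend(pred_a, pred_b, vertical_edge_mask(size))
ramped = blend(pred_a, pred_b, horizontal_ramp_mask(size))
```

With the uniform mask every pixel is the plain average of both predictions; the edge mask gives a sharp boundary, and the ramp mask a smooth transition between the two.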

Two different non-binary arithmetic coding entropy coders were considered for replacing VP9's binary entropy coder: Daala's entropy coder (Daala EC) and Asymmetric Numeral Systems (ANS). The use of non-binary coding helps evade patents, but also adds bit-level parallelism to an otherwise serial process, reducing clock rate demands on hardware implementations.[6] Of the two contenders, ANS is faster to decode in software, but Daala EC is more hardware friendly.[5] As of late 2017, Daala EC has replaced VP9's entropy coder, with ANS still retained.
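As a rough illustration of coding non-binary symbols with probabilities that adapt as symbols are seen, the sketch below measures the ideal code length (the sum of -log2 p) of a four-symbol sequence under a fixed uniform model and under a simple adaptive frequency-count model. It is neither Daala EC nor ANS: the class, the update rule and the sample data are assumptions made for the example.

```python
import math


class AdaptiveSymbolModel:
    """Frequency-count model over an alphabet with more than two symbols."""

    def __init__(self, n_symbols):
        self.counts = [1] * n_symbols  # start from a uniform distribution

    def probability(self, symbol):
        return self.counts[symbol] / sum(self.counts)

    def update(self, symbol):
        self.counts[symbol] += 1  # hypothetical update rule, not AV1's


def ideal_cost_bits(symbols, n_symbols, adapt=True):
    """Sum of -log2(p): the size an ideal arithmetic coder would approach."""
    model = AdaptiveSymbolModel(n_symbols)
    total = 0.0
    for s in symbols:
        total += -math.log2(model.probability(s))
        if adapt:
            model.update(s)
    return total


# A skewed sequence over the alphabet {0, 1, 2, 3}.
data = [0] * 40 + [1] * 6 + [2] * 3 + [3] * 1

static_bits = ideal_cost_bits(data, 4, adapt=False)   # 2 bits per symbol
adaptive_bits = ideal_cost_bits(data, 4, adapt=True)  # fewer bits on skewed data
```

Each four-symbol decision here carries what a binary coder would need two sequential decisions to express, which is the sense in which non-binary coding can do more work per step.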

The integration of Daala's Perceptual Vector Quantization (PVQ) proved too complex on the encoder side within the framework of AV1.[6] The Rate Distortion heuristic framework aims to speed up the encoder by a sizable factor, with or without PVQ,[6] but PVQ was ultimately dropped.

For the in-loop filtering step, the integration of Thor's constrained low-pass filter and Daala's directional deringing filter has been fruitful: The combined Constrained Directional Enhancement Filter (CDEF) exceeds the results of using the original filters separately or together.[19][20]
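The "constrained" part of such a filter can be illustrated with a one-dimensional sketch: each neighbour pulls the centre sample towards itself, but a neighbour that differs too much (probably a real edge, not ringing) is ignored. The constraint function, the threshold and the fixed horizontal direction below are assumptions made for illustration; the actual CDEF chooses a direction per block and uses its own taps and constraint.

```python
def constrain(diff, threshold):
    """Limit a neighbour's influence on the centre sample.

    The contribution grows with |diff| up to the threshold, then tapers
    off and reaches zero at twice the threshold (hypothetical shape).
    """
    magnitude = abs(diff)
    limited = min(magnitude, max(0, 2 * threshold - magnitude))
    return limited if diff >= 0 else -limited


def filter_row(row, threshold):
    """Constrained low-pass along one row, using the two nearest neighbours."""
    out = []
    for i, centre in enumerate(row):
        left = row[max(i - 1, 0)]               # replicate samples at the borders
        right = row[min(i + 1, len(row) - 1)]
        correction = (constrain(left - centre, threshold)
                      + constrain(right - centre, threshold))
        step = abs(correction) // 4             # sign-symmetric scaling
        out.append(centre + step if correction >= 0 else centre - step)
    return out


# Small ripples (ringing) on both sides of a strong edge between 14 and 60.
noisy = [10, 14, 10, 14, 60, 64, 60, 64]
filtered = filter_row(noisy, threshold=8)
```

In this example the ripples on each side are smoothed, while the large step in the middle is left in place because differences across it exceed the constraint.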

 
Parallelism within a frame is possible in tiles (vertical) and tile rows (horizontal).

More encoder parallelism is possible thanks to configurable prediction dependency between tile rows.[21]
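The independence of tiles can be sketched by splitting a frame into a grid of rectangles and handing each one to a separate worker. The grid layout, the function names and the per-tile "work" (a simple sum) are placeholders assumed for the example, and Python threads are used only to show the structure, not to demonstrate a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def tile_bounds(width, height, tile_cols, tile_rows):
    """Split a frame into a grid of (x0, y0, x1, y1) rectangles."""
    xs = [width * c // tile_cols for c in range(tile_cols + 1)]
    ys = [height * r // tile_rows for r in range(tile_rows + 1)]
    return [(xs[c], ys[r], xs[c + 1], ys[r + 1])
            for r in range(tile_rows) for c in range(tile_cols)]


def process_tile(frame, bounds):
    """Stand-in for per-tile coding work: sum the samples inside the tile."""
    x0, y0, x1, y1 = bounds
    return sum(sum(row[x0:x1]) for row in frame[y0:y1])


width, height = 8, 4
frame = [[x + y for x in range(width)] for y in range(height)]
tiles = tile_bounds(width, height, tile_cols=2, tile_rows=2)

# Each tile touches only its own samples, so the tiles can run concurrently.
with ThreadPoolExecutor() as pool:
    per_tile = list(pool.map(lambda b: process_tile(frame, b), tiles))
```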

The Alliance publishes a reference implementation written in C and assembly language (aomenc, aomdec) as free software under the terms of the BSD 2-Clause License.[22]

Quality and efficiency

A first comparison from the beginning of June 2016[23] found AV1 roughly on par with HEVC, as did one using code from late January 2017.[24]

As of April 2017, using the 8 experimental features enabled at the time (of 77 total), Bitmovin was able to demonstrate favorable objective metrics, as well as visual results, compared to HEVC on the Sintel and Tears of Steel animated films.[25] A follow-up comparison by Jan Ozer of Streaming Media Magazine confirmed this, and concluded that "AV1 is at least as good as HEVC now".[26]

Ozer noted that his and Bitmovin's results contradicted a comparison by Fraunhofer Institute for Telecommunications from late 2016[27] that had found AV1 38.4% less efficient than HEVC, underperforming even H.264/AVC. He attributed this discrepancy to his use of encoding parameters endorsed by each encoder vendor, and to the newer AV1 encoder having more features.

Tests from Netflix, based on measurements with PSNR and VMAF at 720p, showed that AV1 could be about 25% more efficient than HEVC, at the expense of a 4–10 fold increase in encoding time.[28]
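To make the trade-off concrete, the following back-of-the-envelope calculation applies those figures to a hypothetical stream. It assumes that "25% more efficient" means a 25% lower bitrate at equal quality; the 4000 kbit/s bitrate and the one-hour HEVC encoding time are invented inputs, not measurements.

```python
def av1_estimate(hevc_bitrate_kbps, hevc_encode_hours,
                 saving=0.25, slowdown=(4, 10)):
    """Scale HEVC figures by the quoted saving and slowdown range."""
    bitrate = hevc_bitrate_kbps * (1 - saving)
    hours = tuple(hevc_encode_hours * factor for factor in slowdown)
    return bitrate, hours


# Hypothetical title: 4000 kbit/s in HEVC, encoded in 1 hour.
bitrate, hours = av1_estimate(4000, 1.0)
# bitrate is 3000.0 kbit/s; hours is (4.0, 10.0)
```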

Adoption

Alliance members are expected to have an interest in adopting the format, each in its own way, once the bitstream is frozen.[14][25] The member companies represent several industries, including browser vendors (Google, Mozilla, Microsoft), content providers (Google, Netflix, Amazon, Hulu) and hardware manufacturers (Intel, AMD, ARM, Nvidia).[5][6]

Video streaming service YouTube declared its intent to transition to the new format as fast as possible, starting with the highest resolutions within six months after the finalization of the bitstream format.[14]

Netflix "expects to be an early adopter of AV1".[9]

Like its predecessor VP9, AV1 will be used together with the WebM and Opus formats. These are well supported among web browsers, with the exception of Safari (desktop and mobile versions) and the discontinued Internet Explorer (prior to Edge) (see VP9 in HTML5 video § browser support).

On 28 November 2017, Mozilla and Bitmovin announced a demo of AV1 playback viewable in Firefox Nightly.[29]

Coding tools

As of late November 2017, 34 of 90 experimental coding tools are enabled by default in the developmental software codebase.[30] Some former experiments have also been fully integrated, meaning that their build-time flags have been removed.

The development process is such that coding tools are added as experiments in the codebase, controlled by build-time flags, for review by hardware and legal teams. Once reviews are passed, the experiment can be enabled by default.[8]

Experiment names are lowercased in the configure script and uppercased in conditional compilation flags.[31][32]
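The naming convention can be sketched as a pair of helper functions. The `CONFIG_` prefix appears in the commit titles cited in the references (for example CONFIG_CB4X4); the `--enable-` option prefix and the normalisation of hyphens to underscores are assumptions made for this illustration, and the helper names are hypothetical.

```python
def normalise(experiment):
    """Treat 'dist-8x8' and 'dist_8x8' as the same experiment (assumption)."""
    return experiment.strip().replace("-", "_")


def configure_option(experiment):
    """Lowercased form, as it might be passed to the configure script."""
    return "--enable-" + normalise(experiment).lower()


def compile_flag(experiment):
    """Uppercased form, as used in conditional compilation."""
    return "CONFIG_" + normalise(experiment).upper()


option = configure_option("DAALA_TX")  # '--enable-daala_tx'
macro = compile_flag("daala_tx")       # 'CONFIG_DAALA_TX'
```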

Former experiments that have been fully integrated

This list may be incomplete.

Image caption: Warped motion, as seen from the front of a train, in a pioneering slow-TV program featuring 7 hours of warped motion. The warped_motion and global_motion tools in AV1 aim to reduce redundant information in motion vectors by recognizing patterns arising from camera motion.
Historic build-time flag Explanation
alt_intra[33] A new prediction mode suitable for smooth regions[34]
cb4x4[35]
cdef[36] Constrained Directional Enhancement Filter: The merge of Daala's directional deringing filter + Thor's constrained low pass filter[19][37]
chroma_sub8x8[38]
compound_segment[39]
delta_q[40]
daala_ec[41] The Daala entropy coder (a non-binary arithmetic coder)
ec_adapt[42] Adapts symbol probabilities on the fly.[34] As opposed to per frame, as in VP9.[5]
ec_smallmul[43] A hardware optimization of daala_ec[37]
ext_inter[44] Extended inter[21]
ext_refs[45] Adds more reference frames, as described in Adaptive multi-reference prediction using a symmetric framework[46]
ext_tx[47]
filter_7bit[48] 7-bit interpolation filters[49]
global_motion[50] Global Motion[21][34]
interintra[51] Inter-intra prediction, part of wedge partitioned prediction[18]
motion_var[52] Renamed from obmc.[53] Overlapped Block Motion Compensation: Reduce discontinuities at block edges using different motion vectors[34]
new_multisymbol[54]
one_sided_compound[55]
palette[56] Palette prediction: Intra coding tool for screen content.[57]
rect_intra_pred[58]
rect_tx[59] Rectangular transforms[60]
ref_mv[61] Better methods for coding the motion vector predictors through implicit list of spatial and temporal neighbor MVs[34]
smooth_hv[62]
tile_groups[63]
var_tx[64]
warped_motion[65] Warped Motion[34]
wedge[39] Wedge partitioned prediction[18]

Current experiments

Enabled by default Build-time flag[30] Explanation
No adapt_scan
No add_4bytes_obusize
No amvr
Yes aom_qm Quantization Matrices[66]
No bgsprite
Yes cdef_singlepass An optimization of cdef[20]
Yes cfl Chroma from Luma[34]
No colorspace_headers
No compound_round
Yes convolve_round
No daala_tx4 Daala Transforms[67][68]
No daala_tx8
No daala_tx16
No daala_tx32
No daala_tx64
No daala_tx Shorthand for daala_tx{4,8,16,32,64}[31]
No dct_only
Yes deblock_13tap
No dependent_horztiles
Yes dist_8x8 A merge of former experiments cdef_dist and daala_dist.[32] Daala_dist is Daala's distortion function.[6]
Yes dual_filter
No eighth_pel_mv_only
No entropy_stats
No eob_first
Yes ext_comp_refs
Yes ext_delta_q
Yes ext_intra Extended intra[21]
Yes ext_intra_mod
Yes ext_partition
Yes ext_partition_types
No ext_partition_types_ab
No ext_qm
No ext_skip
No ext_tile
No ext_warped_motion
No filter_intra Interpolate the reference samples before prediction to reduce the impact of quantization noise[34]
No fp_mb_stats
Yes frame_marker
No frame_refs
No frame_sign_bias
Yes frame_size
No frame_superres
No hash_me
No horzonly_frame_superres
No inter_stats_only
No intrabc
Yes intra_edge
No jnt_comp
Yes kf_ctx
Yes loopfiltering_across_tiles
Yes loopfilter_level
Yes loop_restoration
No lpf_sb
No lv_map
No lv_map_multi
No masked_tx
No max_tile
No mfmv
No mono_video
Yes mv_compress
No new_quant
No no_frame_context_signaling
No obu
No opt_ref_mv
Yes palette_delta_encoding
Yes palette_throughput
Yes parallel_deblocking
Yes q_adapt_probs
No q_segmentation
No rd_debug
No rect_tx_ext
Yes reference_buffer
No ref_adapt
Yes segment_globalmv
No segment_pred_last
Yes short_filter
Yes simple_bwd_adapt
No simplify_tx_mode
Yes striped_loop_restoration
Yes tempmv_signaling
Yes tmv
Yes tx64x64
No txk_sel
Yes txmg
No xiphrc Xiph Rate Controller[69]

References

  1. ^ Merritt, Rick (30 June 2016). "Video Compression Feels a Pinch". EE Times. 
  2. ^ a b Sebastian Grüner (19 July 2016). "Der nächste Videocodec soll 25 Prozent besser sein als H.265" (in German). golem.de. Retrieved 1 March 2017. 
  3. ^ Zimmerman, Steven (15 May 2017). "Google's Royalty-Free Answer to HEVC: A Look at AV1 and the Future of Video Codecs". XDA Developers. Archived from the original on 14 June 2017. Retrieved 10 June 2017. 
  4. ^ Tsahi Levent-Levi (3 September 2015). "WebRTC Codec Wars: Rebooted". BlogGeek.me. Retrieved 1 March 2017. The beginning of the end of HEVC/H.265 video codec 
  5. ^ a b c d e f Timothy B. Terriberry (18 January 2017). "Progress in the Alliance for Open Media" (video). linux.conf.au. Retrieved 1 March 2017. 
  6. ^ a b c d e f Timothy B. Terriberry (18 January 2017). "Progress in the Alliance for Open Media (slides)" (PDF). Retrieved 22 June 2017. 
  7. ^ https://fosdem.org/2017/schedule/event/om_av1/
  8. ^ a b c Ozer, Jan (30 August 2017). "AV1: A status update". Retrieved 14 September 2017. 
  9. ^ a b Frost, Matt (31 July 2017). "VP9-AV1 Video Compression Update". Retrieved 21 November 2017. Obviously, if we have an open source codec, we need to take very strong steps, and be very diligent in making sure that we are in fact producing something that's royalty free. So we have an extensive IP diligence process which involves diligence on both the contributor level – so when Google proposes a tool, we are doing our in-house IP diligence, using our in-house patent assets and outside advisors – that is then forwarded to the group, and is then again reviewed by an outside counsel that is engaged by the alliance. So that's a step that actually slows down innovation, but is obviously necessary to produce something that is open source and royalty free. 
  10. ^ "Standards are Failing the Streaming Industry". 4 May 2017. Retrieved 20 May 2017. 
  11. ^ a b Steinar Midtskogen, Arild Fuldseth, Gisle Bjøntegaard, Thomas Davies (13 September 2017). "Integrating Thor tools into the emerging AV1 codec" (PDF). Retrieved 2 October 2017. Royalty-free video codecs: The deployment of recent compression technologies such as HEVC/H.265 may have been delayed or restricted due to their licensing terms. (…) What can Thor add to VP9/AV1? Since Thor aims for reasonable compression at only moderate complexity, we considered features of Thor that could increase the compression efficiency of VP9 and/or reduce the computational complexity. 
  12. ^ McAllister, Neil (1 September 2015). "Web giants gang up to take on MPEG LA, HEVC Advance with royalty-free streaming codec – Joining forces for cheap, fast 4K video". 
  13. ^ "examples/lossless_encoder.c - aom - Git at Google". aomedia.googlesource.com. Retrieved 2017-10-29. 
  14. ^ a b c Ozer, Jan (3 June 2016). "What is AV1?". Streaming Media. Information Today, Inc. Archived from the original on 26 November 2016. Retrieved 26 November 2016. ... Once available, YouTube expects to transition to AV1 as quickly as possible, particularly for video configurations such as UHD, HDR, and high frame rate videos ... Based upon its experience with implementing VP9, YouTube estimates that they could start shipping AV1 streams within six months after the bitstream is finalized. ... 
  15. ^ Romain Bouqueau (12 June 2016). "A view on VP9 and AV1 part 1: specifications". GPAC Project on Advanced Content. Retrieved 1 March 2017. 
  16. ^ Ozer, Jan (26 May 2016). "What Is VP9?". 
  17. ^ Debargha Mukherjee, Hui Su, Jim Bankoski, Alex Converse, Jingning Han, Zoe Liu, Yaowu Xu (Google Inc.), International Society for Optics and Photonics, ed., "An overview of new video coding tools under consideration for VP10 – the successor to VP9", SPIE Optical Engineering+ Applications 9599, doi:10.1117/12.2191104 
  18. ^ a b c Converse, Alex (16 November 2015). "New video coding techniques under consideration for VP10 – the successor to VP9". YouTube. Retrieved 3 December 2016. 
  19. ^ a b "Constrained Directional Enhancement Filter". 28 March 2017. Retrieved 15 September 2017. 
  20. ^ a b "Thor update". July 2017. Retrieved 2 October 2017. 
  21. ^ a b c d "Decoding the Buzz over AV1 Codec". 9 June 2017. Retrieved 22 June 2017. 
  22. ^ https://aomedia.googlesource.com/aom/+/master/LICENSE
  23. ^ Sebastian Grüner (9 June 2016). "Freie Videocodecs teilweise besser als H.265" (in German). golem.de. Retrieved 1 March 2017. 
  24. ^ "Results of Elecard's latest benchmarks of AV1 compared to HEVC". 24 April 2017. Retrieved 14 June 2017. The most intriguing result obtained after analysis of the data lies in the fact that the developed codec AV1 is currently equal in its performance with HEVC. The given streams are encoded with AV1 update of 2017.01.31 
  25. ^ a b "Bitmovin Supports AV1 Encoding for VoD and Live and Joins the Alliance for Open Media". 18 April 2017. Retrieved 20 May 2017. 
  26. ^ Ozer, Jan. "HEVC: Rating the contenders" (PDF). Streaming Learning Center. Retrieved 22 May 2017. 
  27. ^ D. Grois, T. Nguyen, and D. Marpe, "Coding efficiency comparison of AV1/VP9, H.265/MPEG-HEVC, and H.264/MPEG-AVC encoders", IEEE Picture Coding Symposium (PCS) 2016. 
  28. ^ "Netflix on AV1 - Streaming Learning Center". Streaming Learning Center. 2017-11-30. Retrieved 2017-12-08. 
  29. ^ Giles, Ralph; Smole, Martin (28 November 2017). "DASH playback of AV1 video in Firefox". Mozilla Hacks – the Web developer blog. Retrieved 29 November 2017. 
  30. ^ a b "AV1 experiment flags". 29 September 2017. Retrieved 2 October 2017. 
  31. ^ a b Egge, Nathan (13 September 2017). "Add the DAALA_TX experiment". Retrieved 2 October 2017. 
  32. ^ a b Cho, Yushin (30 August 2017). "Delete daala_dist and cdef-dist experiments in configure". Retrieved 2 October 2017. Since those two experiments have been merged into the dist-8x8 experiment 
  33. ^ Joshi, Urvang (1 June 2017). "Remove ALT_INTRA flag". Retrieved 19 September 2017. 
  34. ^ a b c d e f g h "Analysis of the emerging AOMedia AV1 video coding format for OTT use-cases" (PDF). Retrieved 19 September 2017. 
  35. ^ Mukherjee, Debargha (21 October 2017). "Remove CONFIG_CB4X4 config options". Retrieved 29 October 2017. 
  36. ^ Barbier, Frederic (10 November 2017). "Remove experimental flag of CDEF". Retrieved 23 October 2017. 
  37. ^ a b "NETVC Hackathon Results IETF 98 (Chicago)". Retrieved 15 September 2017. 
  38. ^ Su, Hui (23 October 2017). "Remove experimental flag of chroma_sub8x8". Retrieved 29 October 2017. 
  39. ^ a b Mukherjee, Debargha (29 October 2017). "Remove compound_segment/wedge config flags". Retrieved 23 November 2017. 
  40. ^ Davies, Thomas (19 September 2017). "Remove delta_q experimental flag". Retrieved 2 October 2017. 
  41. ^ Egge, Nathan (25 May 2017). "This patch forces DAALA_EC on by default and removes the dkbool coder". Retrieved 14 September 2017. 
  42. ^ Egge, Nathan (18 June 2017). "Remove the EC_ADAPT experimental flags". Retrieved 23 September 2017. 
  43. ^ Terriberry, Timothy (25 August 2017). "Remove the EC_SMALLMUL experimental flag". Retrieved 15 September 2017. 
  44. ^ Alaiwan, Sebastien (2 October 2017). "Remove compile guards for CONFIG_EXT_INTER". Retrieved 29 October 2017. This experiment has been adopted 
  45. ^ Alaiwan, Sebastien (16 October 2017). "Remove compile guards for CONFIG_EXT_REFS". Retrieved 29 October 2017. This experiment has been adopted 
  46. ^ Zoe Liu; Debargha Mukherjee; Wei-Ting Lin; Paul Wilkins; Jingning Han; Yaowu Xu (4 July 2017). "Adaptive Multi-Reference Prediction Using A Symmetric Framework". Retrieved 29 October 2017. 
  47. ^ Alaiwan, Sebastien (2 November 2017). "Remove experimental flag of EXT_TX". Retrieved 23 November 2017. 
  48. ^ Davies, Thomas (19 September 2017). "Remove filter_7bit experimental flag". Retrieved 29 October 2017. 
  49. ^ Fuldseth, Arild (26 August 2017). "7-bit interpolation filters". Retrieved 29 October 2017. Purpose: Reduce dynamic range of interpolation filter coefficients from 8 bits to 7 bits. Inner product for 8-bit input data can be stored in a 16-bit signed integer. 
  50. ^ Alaiwan, Sebastien (30 October 2017). "Remove experimental flag of GLOBAL_MOTION". Retrieved 23 November 2017. 
  51. ^ Chen, Yue (30 October 2017). "Remove CONFIG_INTERINTRA". Retrieved 23 November 2017. 
  52. ^ Alaiwan, Sebastien (31 October 2017). "Remove experimental flag of MOTION_VAR". Retrieved 23 November 2017. 
  53. ^ Chen, Yue (13 October 2017). "Renamings for OBMC experiment". Retrieved 19 September 2017. 
  54. ^ Barbier, Frederic (15 November 2017). "Remove experimental flag of NEW_MULTISYMBOL". Retrieved 23 October 2017. 
  55. ^ Liu, Zoe (7 November 2017). "Remove ONE_SIDED_COMPOUND experimental flag". Retrieved 23 November 2017. 
  56. ^ Joshi, Urvang (1 June 2017). "Remove PALETTE flag". Retrieved 19 September 2017. 
  57. ^ "Overview of the Decoding Process (Informative)". Retrieved 12 November 2017. For certain types of image, such as PC screen content, it is likely that the majority of colors come from a very small subset of the color space. This subset is referred to as a palette. AV1 supports palette prediction, whereby non-inter frames are predicted from a palette containing the most likely colors. 
  58. ^ Joshi, Urvang (26 September 2017). "Remove rect_intra_pred experimental flag". Retrieved 2 October 2017. 
  59. ^ Mukherjee, Debargha (29 October 2017). "Remove experimental flag for rect-tx". Retrieved 23 November 2017. 
  60. ^ Mukherjee, Debargha (1 July 2016). "Rectangular transforms 4x8 & 8x4". Retrieved 14 September 2017. 
  61. ^ Alaiwan, Sebastien (27 April 2017). "Merge ref-mv into codebase". Retrieved 23 September 2017. 
  62. ^ Joshi, Urvang (9 November 2017). "Remove smooth_hv experiment flag". Retrieved 23 November 2017. 
  63. ^ Davies, Thomas (18 July 2017). "Remove the CONFIG_TILE_GROUPS experimental flag". Retrieved 19 September 2017. 
  64. ^ Alaiwan, Sebastien (24 October 2017). "Remove compile guards for VAR_TX experiment". Retrieved 29 October 2017. This experiment has been adopted 
  65. ^ Alaiwan, Sebastien (31 October 2017). "Remove experimental flag of WARPED_MOTION". Retrieved 23 November 2017. 
  66. ^ Davies, Thomas (9 August 2017). "AOM_QM: enable by default". Retrieved 19 September 2017. 
  67. ^ "Daala-TX" (PDF). 22 August 2017. Retrieved 26 September 2017. Replaces the existing AV1 TX with the lifting implementation from Daala. Daala TX is better in every way: ● Fewer multiplies ● Same shifts, quantizers for all transform sizes and depths ● Smaller intermediaries ● Low-bitdepth transforms wide enough for high-bitdepth ● Less hardware area ● Inherently lossless 
  68. ^ Egge, Nathan (27 October 2017). "Daala Transforms in AV1". 
  69. ^ Pehlivanov, Rostislav (15 February 2017). "Implement a new rate control system". Retrieved 19 September 2017. This commit implements a new rate control system which was ported from Daala's rate control system (which was based off of Theora's rate control system) (…) Bitrate targeting works much better than the current rate control system's targeting and will actually closely match the rate specified by the user without the current rate control system's bursty behaviour. 

External links