Head-of-line blocking

Head-of-line blocking (HOL blocking) in computer networking is a performance-limiting phenomenon that occurs when a queue of packets is held up by the first packet in the queue. This occurs, for example, in input-buffered network switches, out-of-order delivery and multiple requests in HTTP pipelining.

Network switches

Head-of-line blocking example: The 1st and 3rd input flows are competing to send packets to the same output interface. In this case if the switching fabric decides to transfer the packet from the 3rd input flow, the 1st input flow cannot be processed in the same time slot. Note that the 1st input flow is blocking a packet for output interface 3, which is available for processing.

A switch may be composed of buffered input ports, a switch fabric, and buffered output ports. If first-in first-out (FIFO) input buffers are used, only the oldest packet is available for forwarding. If the oldest packet cannot be transmitted due to its target output being busy, then more recent arrivals cannot be forwarded. The output may be busy if there is output contention.

Without HOL blocking, the new arrivals could potentially be forwarded around the stuck oldest packet to their respective destinations. HOL blocking can produce performance-degrading effects in input-buffered systems.

This phenomenon limits the throughput of switches. For FIFO input buffers, a simple model of fixed-sized cells to uniformly distributed destinations, causes the throughput to be limited to 58.6% of the total as the number of links becomes large.^[1]

One way to overcome this limitation is by using virtual output queues.^[2]

Only switches with input buffering can suffer HOL blocking. With sufficient internal bandwidth, input buffering is unnecessary; all buffering is handled at outputs and HOL blocking is avoided. This no-input-buffering architecture is common in small to medium-sized ethernet switches.

Out-of-order delivery

Out-of-order delivery occurs when sequenced packets arrive out of order. This may happen due to different paths taken by the packets or from packets being dropped and resent. HOL blocking can significantly increase packet reordering.^[3]^[4]

Reliably broadcasting messages across a lossy network among a large number of peers is a difficult problem. While atomic broadcast algorithms solve the single point of failure problem of centralized servers, those algorithms introduce a head-of-line blocking problem.^[5] The Bimodal Multicast algorithm, a randomized algorithm that uses a gossip protocol, avoids head-of-line blocking by allowing some messages to be received out-of-order.^[6]

In HTTP

One form of HOL blocking in HTTP/1.1 is when the number of allowed parallel requests in the browser is used up, and subsequent requests need to wait for the former ones to complete. HTTP/2 addresses this issue through request multiplexing, which eliminates HOL blocking at the application layer, but HOL still exists at the transport (TCP) layer.^[7]^[8]

In reliable byte streams

Head-of-line blocking can occur in reliable byte streams: if packets are reordered or lost and need to be retransmitted (and thus arrive out-of-order), data from sequentially later parts of the stream may be received before sequentially earlier parts of the stream; however, the later data cannot typically be used until the earlier data has been received, incurring network latency. If multiple independent higher-level messages are encapsulated and multiplexed onto a single reliable byte stream, then head-of-line blocking can cause processing of a fully-received message that was sent later to wait for delivery of a message that was sent earlier.^[9] This affects, for example, HTTP/2, which frames multiple request–response pairs onto a single stream; HTTP/3, which has an application-layer framing design and uses datagram rather than stream transport, avoids this problem.^[10]^[11] The latency degradation from head-of-line blocking depends on the underlying packet loss rate and round-trip time, with higher losses producing worse latency.^[12]^[13] Without changing the stream abstraction, reducing packet loss can reduce the harm from head-of-line blocking; an alternative is to implement the reliable byte stream using forward error correction to send redundant data so that a certain amount of loss can be tolerated without incurring retransmissions.^[9]

References

^ M. Karo; M. Hluchyj; S. Morgan (December 1987). "Input Versus Output Queuing on a Space-Division Packet Switch". IEEE Transactions on Communications. 35 (12): 1347–1356. doi:10.1109/TCOM.1987.1096719.
^ Nick McKeown; Adisak Mekkittikul; Venkat Anantharam; Jean Walrand (August 1999). "Achieving 100% Throughput in an Input-Queued Switch" (PDF). IEEE Transactions on Communications. 47 (8): 1260–1267. CiteSeerX 10.1.1.18.7529. doi:10.1109/26.780463.
^ Jon C. R. Bennett; Craig Partridge; Nicholas Shectman (December 1999). "Packet reordering is not pathological network behavior". IEEE/ACM Transactions on Networking. 7 (6): 789–798. CiteSeerX 10.1.1.461.7629. doi:10.1109/90.811445. S2CID 26573611.
^ Bennett, J. C. R.; Partridge, C.; Shectman, N. (April 2000). Sarisky, Dan (ed.). "Packet Reordering is Not Pathological Network Behavior [Slides]" (PDF). SC N Research. Archived from the original (PDF) on 2017-08-20. Retrieved 2017-08-19.
^ Defago, X.; Schiper; A., Urban, P. (2004). "Total order broadcast and multicast algorithms: taxonomy and survey" (PDF). ACM Computing Surveys. 36 (4): 372-421. doi:10.1145/1041680.1041682. S2CID 207155989.{{cite journal}}: CS1 maint: multiple names: authors list (link)
^ Tyler McMullen (2015). "It Probably Works". ACM Queue.
^ Grigorik, Ilya (October 2013). "Making the Web Faster with HTTP 2.0". ACM Queue. 11 (10): 40. doi:10.1145/2542661.2555617. S2CID 34623442. Retrieved 10 June 2019.
^ Javier Garza (October 2017). "How does HTTP/2 solve the Head of Line blocking (HOL) issue".
^ ^a ^b Briscoe et al. 2016, pp. 29–30.
^ Langley et al. 2017, pp. 184, 186.
^ Marx et al. 2018, pp. 22–23.
^ Nowlan, Wolinsky & Ford 2013, p. 6.
^ Heijligers 2021, p. 65.

Bibliography

Briscoe, Bob; Brunstrom, Anna; Petlund, Andreas; Hayes, David; Ros, David; Tsang, Ing-Jyh; Gjessing, Stein; Fairhurst, Gorry; Griwodz, Carsten; Welzl, Michael (2016). "Reducing Internet Latency: A Survey of Techniques and Their Merits". IEEE Communications Surveys & Tutorials. 18 (3): 2149–2196. doi:10.1109/COMST.2014.2375213. hdl:2164/8018. S2CID 206576469.
Heijligers, Jaap (2021). Tor over QUIC (Thesis).
Langley, Adam; Riddoch, Alistair; Wilk, Alyssa; Vicente, Antonio; Krasic, Charles; Zhang, Dan; Yang, Fan; Kouranov, Fedor; Swett, Ian; Iyengar, Janardhan; Bailey, Jeff; Dorfman, Jeremy; Roskind, Jim; Kulik, Joanna; Westin, Patrik; Tenneti, Raman; Shade, Robbie; Hamilton, Ryan; Vasiliev, Victor; Chang, Wan-Teh; Shi, Zhongyi (2017). "The QUIC Transport Protocol". Proceedings of the Conference of the ACM Special Interest Group on Data Communication. pp. 183–196. doi:10.1145/3098822.3098842. ISBN 9781450346535. S2CID 2768765.
Marx, Robin; Wijnants, Maarten; Quax, Peter; Faes, Axel; Lamotte, Wim (2018). "Web Performance Characteristics of HTTP/2 and Comparison to HTTP/1.1". Web Information Systems and Technologies. Lecture Notes in Business Information Processing. Vol. 322. pp. 87–114. doi:10.1007/978-3-319-93527-0_5. hdl:1942/26146. ISBN 978-3-319-93526-3. S2CID 52009597.
Nowlan, Michael F.; Wolinsky, David; Ford, Bryan (2013). Reducing Latency in Tor Circuits with Unordered Delivery. 3rd USENIX Workshop on Free and Open Communications on the Internet.

[Karol-1] M. Karo; M. Hluchyj; S. Morgan (December 1987). "Input Versus Output Queuing on a Space-Division Packet Switch". IEEE Transactions on Communications. 35 (12): 1347–1356. doi:10.1109/TCOM.1987.1096719.

[2] Nick McKeown; Adisak Mekkittikul; Venkat Anantharam; Jean Walrand (August 1999). "Achieving 100% Throughput in an Input-Queued Switch" (PDF). IEEE Transactions on Communications. 47 (8): 1260–1267. CiteSeerX 10.1.1.18.7529. doi:10.1109/26.780463.

[3] Jon C. R. Bennett; Craig Partridge; Nicholas Shectman (December 1999). "Packet reordering is not pathological network behavior". IEEE/ACM Transactions on Networking. 7 (6): 789–798. CiteSeerX 10.1.1.461.7629. doi:10.1109/90.811445. S2CID 26573611.

[4] Bennett, J. C. R.; Partridge, C.; Shectman, N. (April 2000). Sarisky, Dan (ed.). "Packet Reordering is Not Pathological Network Behavior [Slides]" (PDF). SC N Research. Archived from the original (PDF) on 2017-08-20. Retrieved 2017-08-19.

[5] Defago, X.; Schiper; A., Urban, P. (2004). "Total order broadcast and multicast algorithms: taxonomy and survey" (PDF). ACM Computing Surveys. 36 (4): 372-421. doi:10.1145/1041680.1041682. S2CID 207155989.{{cite journal}}: CS1 maint: multiple names: authors list (link)

[6] Tyler McMullen (2015). "It Probably Works". ACM Queue.

[7] Grigorik, Ilya (October 2013). "Making the Web Faster with HTTP 2.0". ACM Queue. 11 (10): 40. doi:10.1145/2542661.2555617. S2CID 34623442. Retrieved 10 June 2019.

[8] Javier Garza (October 2017). "How does HTTP/2 solve the Head of Line blocking (HOL) issue".

[FOOTNOTEBriscoeBrunstromPetlundHayes201629–30-9] Briscoe et al. 2016, pp. 29–30.

[FOOTNOTELangleyRiddochWilkVicente2017184,_186-10] Langley et al. 2017, pp. 184, 186.

[FOOTNOTEMarxWijnantsQuaxFaes201822–23-11] Marx et al. 2018, pp. 22–23.

[FOOTNOTENowlanWolinskyFord20136-12] Nowlan, Wolinsky & Ford 2013, p. 6.

[FOOTNOTEHeijligers202165-13] Heijligers 2021, p. 65.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]