Consistent Overhead Byte Stuffing

Consistent Overhead Byte Stuffing (COBS) is an algorithm for encoding data bytes that results in efficient, reliable, unambiguous packet framing regardless of packet content, thus making it easy for receiving applications to recover from malformed packets. It employs a particular byte value, typically zero, to serve as a packet delimiter (a special value that indicates the boundary between packets). When zero is used as a delimiter, the algorithm replaces each zero data byte with a non-zero value, so that no zero data byte appears in the packet where it could be misinterpreted as a packet boundary. The value substituted for each zero data byte is equal to one plus the number of non-zero data bytes that follow it, up to the next zero data byte or the end of the packet.

Byte stuffing is a process that transforms a sequence of data bytes that may contain 'illegal' or 'reserved' values (such as the packet delimiter) into a potentially longer sequence that contains no occurrences of those values. The extra length of the transformed sequence is typically referred to as the overhead of the algorithm. The COBS algorithm tightly bounds the worst-case overhead, limiting it to no more than one byte in 254, so the maximum overhead depends only on the number of data bytes and not on their values. Consequently, the encoded byte sequence has a predictable maximum length, which makes COBS useful for real-time applications in which jitter may be problematic. The algorithm is computationally inexpensive and its average overhead is low compared to other unambiguous framing algorithms.[1][2]

Packet framing and stuffing

When packetized data is sent over any serial medium, a protocol is needed by which to demarcate packet boundaries. This is done by using a framing marker, which is a special bit-sequence or character value that indicates where the boundaries between packets fall. Data stuffing is the process that transforms the packet data before transmission to eliminate all occurrences of the framing marker, so that when the receiver detects a marker, it can be certain that the marker indicates a boundary between packets.

COBS transforms a data set of up to 254 bytes in the range [0,255] into bytes in the range [1,255]. Because all zero bytes have been eliminated from the data, a zero byte can be used to mark the end of the transformed data unambiguously. A zero byte is appended to the transformed data, forming a packet that consists of the COBS-encoded data (the payload) followed by the zero-byte end-of-packet marker.

The overhead of COBS encoding is constant regardless of data content. Every data set of up to 254 bytes is encoded with an overhead of exactly one byte, so that N bytes are always transformed into exactly N+1 encoded bytes (not counting the end-of-packet marker). The overhead byte, which equals one plus the number of non-zero bytes that follow, appears at the beginning of the encoded data. Note that the overhead byte is not a transformed data byte; it is an additional byte that precedes the transformed data bytes.
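
As a minimal sketch of the single-section case described above (at most 254 data bytes), the following C function writes the overhead byte, replaces each zero data byte with the distance to the next zero, and appends the end-of-packet marker. The name cobs_encode_short and its signature are assumptions made for illustration; they do not come from any published COBS implementation.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (name and signature are illustrative assumptions).
 * Encodes n <= 254 data bytes from src into out, which must hold n + 2
 * bytes: one overhead byte, n transformed data bytes, and the zero
 * end-of-packet marker. Returns the number of bytes written, or 0 if n
 * is too large for a single section.
 */
size_t cobs_encode_short(const uint8_t *src, size_t n, uint8_t *out)
{
    if (n > 254)
        return 0;

    size_t last = 0;  /* index in out of the most recent pointer byte */

    for (size_t i = 0; i < n; i++) {
        size_t pos = i + 1;  /* data byte i lands at out[i + 1] */
        if (src[i] == 0) {
            /* Previous pointer byte now knows where this zero is. */
            out[last] = (uint8_t)(pos - last);
            last = pos;
        } else {
            out[pos] = src[i];  /* non-zero bytes are copied unchanged */
        }
    }

    out[last] = (uint8_t)(n + 1 - last);  /* last pointer reaches the marker */
    out[n + 1] = 0x00;                    /* end-of-packet marker */
    return n + 2;
}
```

For the input 11 22 00 33 this sketch writes 03 11 22 02 33 00: N = 4 data bytes become N+1 = 5 encoded bytes, followed by the end-of-packet marker.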

In the case of byte streams, or fixed-size data sets larger than 254 bytes, COBS requires data to be encoded a section at a time, such that no section exceeds 254 bytes in size. The unambiguous zero byte packet delimiter allows a receiver to synchronize reliably with the beginning of the next packet, even after an error. It also allows new listeners, which might join a broadcast stream at any time, to reliably detect the beginning of the first complete packet in the received byte stream.
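
The resynchronization behaviour described above can be sketched as a small receiver that ignores everything up to the first zero byte and then collects one frame at a time. The type frame_reader, the function frame_reader_push, and the FRAME_MAX limit are hypothetical names chosen for this example, and the buffer size assumes frames of a single section.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_MAX 255  /* assumed limit: one overhead byte plus 254 data bytes */

/* Hypothetical receiver state; zero-initialize before use. */
typedef struct {
    uint8_t buf[FRAME_MAX];  /* encoded bytes of the frame being received */
    size_t  len;             /* number of bytes collected so far */
    bool    synced;          /* false until a delimiter has been seen */
} frame_reader;

/*
 * Feed one received byte. Returns the length of a completed frame (its
 * encoded bytes are in r->buf, delimiter excluded), or -1 if no frame
 * was completed by this byte.
 */
long frame_reader_push(frame_reader *r, uint8_t byte)
{
    if (byte == 0x00) {
        /* A delimiter always marks a boundary, whatever came before. */
        long done = (r->synced && r->len > 0) ? (long)r->len : -1;
        r->synced = true;
        r->len = 0;
        return done;
    }
    if (!r->synced)
        return -1;              /* still inside a partial packet: discard */
    if (r->len == FRAME_MAX) {
        r->synced = false;      /* oversized frame: drop it and resync */
        r->len = 0;
        return -1;
    }
    r->buf[r->len++] = byte;
    return -1;
}
```

A listener that joins mid-packet discards the partial packet's tail, because nothing is stored until the first delimiter arrives; a corrupted frame costs at most that one frame.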

Encoding examples

These examples show how various data sequences would be encoded by the COBS algorithm. In the examples, all bytes are expressed as hexadecimal values, and encoded data is shown with text formatting to illustrate various features:

  • An overhead byte appears at the beginning of every encoded packet. This byte does not correspond to a data byte; it is an additional byte that is prepended to the encoded output. Its value equals either (1) one plus the number of non-zero data bytes that follow, or (2) one plus the total number of data bytes (applies when no zero bytes occur in the data). It is effectively a pointer to the next packet byte that requires interpretation: if the addressed byte is non-zero then it is an encoded zero data byte that points to the next byte requiring interpretation; if the addressed byte is zero then it is the end of packet.
  • Bold indicates a data byte that has not been altered by encoding. All non-zero data bytes remain unaltered.
  • Green indicates a zero data byte that was altered by encoding. All zero data bytes are replaced during encoding by one plus the number of non-zero bytes that follow.
  • A zero byte appears at the end of every packet to indicate end-of-packet to the data receiver. This packet delimiter byte does not correspond to a data byte; it is an additional byte that is appended to the encoded output.
Example   Unencoded data (hex)   Encoded with COBS (hex)
1         00                     01 01 00
2         00 00                  01 01 01 00
3         11 22 00 33            03 11 22 02 33 00
4         11 22 33 44            05 11 22 33 44 00
5         11 00 00 00            02 11 01 01 01 00
6         01 02 ... FE           FF 01 02 ... FE 00

The diagram below uses example 3 from the table above to illustrate how each modified data byte is located, and how it is identified as either a data byte or an end-of-frame byte.

   [OHB]                                : Overhead byte (Start of frame)
     3+ -------------->|                : Points to relative location of first zero symbol
                       2+-------->|     : Is a zero data byte, pointing to next zero symbol
                                  [EOP] : Location of end-of-packet zero symbol.
     0     1     2     3     4    5     : Byte Position
     03    11    22    02    33   00    : COBS Data Frame
           11    22    00    33         : Extracted Data
     
OHB = Overhead Byte (Points to next zero symbol)
EOP = End Of Packet
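
The pointer chain in the diagram can be followed in code. The sketch below decodes one frame of a single section (overhead byte, up to 254 payload bytes, end-of-packet zero) by walking from the overhead byte to each encoded zero in turn. The function name cobs_decode_short and the convention of returning -1 for a malformed frame are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (name and error convention are assumptions).
 * frame holds frame_len bytes: OHB, payload, EOP. out must hold
 * frame_len - 2 bytes. Returns the number of decoded data bytes, or -1
 * if the frame is malformed.
 */
long cobs_decode_short(const uint8_t *frame, size_t frame_len, uint8_t *out)
{
    if (frame_len < 2 || frame_len > 256 || frame[frame_len - 1] != 0x00)
        return -1;

    size_t n = frame_len - 2;  /* data bytes between OHB and EOP */
    size_t next = frame[0];    /* position the overhead byte points to */

    for (size_t pos = 1; pos <= n; pos++) {
        if (frame[pos] == 0x00)
            return -1;                  /* delimiter inside the payload */
        if (pos == next) {
            out[pos - 1] = 0x00;        /* pointer byte: restore the zero */
            next = pos + frame[pos];    /* and follow it to the next one */
        } else {
            out[pos - 1] = frame[pos];  /* ordinary data byte, unchanged */
        }
    }

    /* The chain must end exactly on the end-of-packet byte. */
    return next == n + 1 ? (long)n : -1;
}
```

Applied to the frame 03 11 22 02 33 00 from the diagram, the walk visits positions 0, 3 and 5 and yields 11 22 00 33.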

Implementation

The following code implements a COBS encoder and decoder in the C programming language:

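/*
 * StuffData byte stuffs "length" bytes of
 * data at the location pointed to by "ptr",
 * writing the output to the location pointed
 * to by "dst". At most length + length/254 + 1
 * bytes are written. The end-of-packet zero
 * byte is not written; the caller appends it.
 */
#include <stdint.h>
#include <stddef.h>

/*
 * Store the finished block's code byte in its
 * reserved slot, reserve the next output byte
 * for the following code, and restart the count.
 */
#define FinishBlock(X) (*code_ptr = (X), code_ptr = dst++, code = 0x01)

void StuffData(const uint8_t *ptr, size_t length, uint8_t *dst)
{
  const uint8_t *end = ptr + length;
  uint8_t *code_ptr = dst++;  /* slot reserved for the first code byte */
  uint8_t code = 0x01;        /* one plus the non-zero bytes in this block */

  while (ptr < end)
  {
    if (*ptr == 0)
      FinishBlock(code);      /* a zero data byte ends the current block */
    else
    {
      *dst++ = *ptr;          /* non-zero bytes are copied unchanged */
      if (++code == 0xFF)
        FinishBlock(code);    /* block is full: 254 non-zero bytes */
    }
    ptr++;
  }

  FinishBlock(code);          /* write the code byte of the last block */
}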

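/*
 * UnStuffData decodes "length" bytes of
 * data at the location pointed to by "ptr"
 * (the encoded data without the end-of-packet
 * zero byte), writing the output to the
 * location pointed to by "dst".
 */

void UnStuffData(const uint8_t *ptr, size_t length, uint8_t *dst)
{
  const uint8_t *end = ptr + length;
  while (ptr < end)
  {
    int code = *ptr++;              /* one plus the block's data bytes */
    for (int i = 1; i < code; i++)  /* copy the block's non-zero bytes */
      *dst++ = *ptr++;
    /* A code below 0xFF stands for a zero data byte after the block,
       except after the last block, where no data byte follows. */
    if (code < 0xFF && ptr < end)
      *dst++ = 0;
  }
}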

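/*
 * Defensive UnStuffData, an alternative to the
 * version above (use one or the other, not both).
 * It stops copying at the end of the input, so
 * poorly conditioned data at *ptr cannot cause
 * reads past ptr + length or more than "length"
 * bytes to be written to the buffer at *dst.
 */

void UnStuffData(const uint8_t *ptr, size_t length, uint8_t *dst)
{
  const uint8_t *end = ptr + length;
  while (ptr < end)
  {
    int code = *ptr++;              /* one plus the block's data bytes */
    for (int i = 1; ptr < end && i < code; i++)
      *dst++ = *ptr++;              /* bounded copy of the block */
    /* A code below 0xFF stands for a zero data byte after the block,
       except after the last block, where no data byte follows. */
    if (code < 0xFF && ptr < end)
      *dst++ = 0;
  }
}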
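
StuffData does not report how many bytes it wrote, so a caller needs another way to size "dst" and to find where the end-of-packet zero belongs. One possible approach, sketched here, is a companion function that mirrors the encoder's loop and only counts. The names StuffedLength and STUFFED_MAX are hypothetical and are not part of the code above.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed worst-case output size of StuffData for n input bytes. */
#define STUFFED_MAX(n) ((n) + (n) / 254 + 1)

/*
 * Hypothetical companion to StuffData: returns the number of bytes
 * StuffData would write for the same input, without writing anything.
 */
size_t StuffedLength(const uint8_t *ptr, size_t length)
{
    size_t out = 1;        /* the first code byte */
    uint8_t code = 0x01;   /* same block counter as in StuffData */

    for (size_t i = 0; i < length; i++) {
        out++;             /* every input byte occupies one output byte */
        if (ptr[i] == 0) {
            code = 0x01;   /* the zero's slot becomes the next code byte */
        } else if (++code == 0xFF) {
            out++;         /* full block: an extra code byte is started */
            code = 0x01;
        }
    }
    return out;
}
```

A caller could allocate STUFFED_MAX(length) + 1 bytes, call StuffData, and then store the zero delimiter at dst[StuffedLength(ptr, length)].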

References

  1. Cheshire, Stuart; Baker, Mary (1999). "Consistent Overhead Byte Stuffing" (PDF). IEEE/ACM Transactions on Networking. doi:10.1109/90.769765. Retrieved November 30, 2015.
  2. Cheshire, Stuart; Baker, Mary. "Consistent Overhead Byte Stuffing" (PDF). ACM. Retrieved November 23, 2010.

External links