Talk:Adaptive differential pulse-code modulation

Latest comment: 5 months ago by 92.12.87.15 in topic How it works?

All wrong?

Subband coding is not the same as ADPCM. Is there any source that supports the content of this article? Or should I just flush it and put the scheme that James L. Flanagan invented? Dicklyon (talk) 18:15, 20 December 2008 (UTC)

I think it's closer to a real article now. I haven't found good support for my recollection that Flanagan invented it, but I think it's out there... Dicklyon (talk) 02:07, 21 December 2008 (UTC)

What was the solution to this in the end, seeing as the subband thing is still included? I thought ADPCM + sub-band splitting was basically ATRAC? (A link to *that* might be worth adding?) Kinda feels like it might be time to tie it up after, you know... fifteen years. 92.12.87.15 (talk) 18:16, 31 October 2023 (UTC)

This article is absolutely useless.

Where are the details?! There's nothing that actually matters in this article at all. — Preceding unsigned comment added by Bumblebritches57 (talk • contribs) 23:57, 15 September 2014 (UTC)

At least one editor looks forward to your improvements, Bumblebritches57. SovalValtos (talk) 15:57, 25 September 2014 (UTC)
SovalValtos, there are established authorities who could contribute to Wikipedia, but they won't. You know why? It's because Wikipedia has no peer review and allows anyone to make changes willy-nilly. Oh, sure, someone who's willing to bird-dog the subject can review the changes and revert, in part or in toto, but that's exactly what the established authorities don't want to be obliged to do. --MarkFilipak (talk) 01:41, 19 April 2015 (UTC)

This article is a joke

Why is this so damn short? It actually contains the following information:
- ADPCM is DPCM with a variable scale factor (yes, this is right)
- ADPCM was developed in 1970... (might be correct... I don't know)
- It is used in telephony instead of μ-law/A-law to double the capacity of a phone line
- There is a sub-band ADPCM that uses two sub-bands

I'm so lucky that I already know what ADPCM is. I switched to the English version because the German one doesn't look much better so far. Has mankind already forgotten what ADPCM is/was/will be?

- Where is the basic explanation of how it works? A non-technician will not understand any of this. That explanation doesn't need to be in-depth at developer/code level.
- Where are the famous and frequently used IMA-ADPCM and MS-ADPCM? They are never mentioned anywhere in this article.
- Where is it used? (Telephony isn't the only purpose. It is used in many old computer games and many well-known gaming consoles, voice recorders, old cell phones, some cameras, anti-shock buffers in portable CD players and some ancient digital storage media including CD-I.)
- There are some other ADPCM-based G.XXX standards not mentioned here.
Franky666 — Preceding unsigned comment added by 2003:58:EC37:EF22:FD3B:9CB3:E5EE:297A (talk) 00:29, 30 November 2016 (UTC)

How it works?

How does ADPCM work? How are 8-bit μ-law samples mapped to 4-bit? What does each of the 16 4-bit values mean? 24691358r (talk) 18:32, 11 August 2017 (UTC)

It is so frustrating that nobody explains how ADPCM works. My wild guess is that it takes the difference between the last sample and the one before it, reapplies that as a prediction, then takes the difference between the last and the current sample and stores -8÷32768, ... 0, 1÷32768, 2÷32768, 3÷32768, 4÷32768, 5÷32768, ... 7÷32768 in a signed nibble. Please tell me that's ADPCM! 27.125.43.78 (talk) 19:10, 13 May 2018 (UTC)

More to the point, why are you guys' words each separated by a huge and variable number of spaces in the Wiki editor? Is this some kind of espionage steganography thing?
Anyway, the second contributor was pretty much right to my knowledge, except for the guess at the magnitude of each value. It's closer to being + or - an entire bit of magnitude for each value of the nibble, with a bit of stretch, as it would otherwise compromise the dynamic range and frequency, and there's no "zero", just a positive or negative minimum value. I expect the values are carefully chosen so any given value can be represented with a short string of differentials, and asymmetrically so you don't get stuck in a buzzing loop with a nonzero DC signal. Simply dividing down to 32768/8 would be no good: that'd give you an effective resolution no better than 4-bit PCM for maximum-frequency tones, and you'd need to go to quite a low frequency (about 500 Hz, on a phone line) to get even 8-bit equivalent (9-bit would put you below the start of the 300 Hz filter rolloff), with a lot of additional artefact noise *added* on top.
That is, it works somewhat like error diffusion dithering for images, which gives lower apparent "noise" and a smoother overall signal in that domain much like ADPCM does for audio, vs simply quantising the original with fewer bits. In images, the differential part is missing (though that would certainly be an interesting thing to try, it would, like ADPCM, be only really useful as a final step and have a lot of generational propagation error if you tried to edit and re-encode a decoded copy), but you get some of the adaptive part if using a custom palette tuned to the individual image. The algorithm picks the closest colour to the first pixel on a line, records the error between the two, then adds that to (averages it with?) the next original pixel and repeats the process until it reaches the end of the line. Possibly also doing it vertically for the first pixel of the next line at least (actual 2D diffusion across the whole image is something I haven't seen, and really you'd want it to be bidirectional which causes all kinds of extra processing headache).
From which you get a dithered appearance rather naturally, kind-of regular patterns across solid colours (if it wasn't possible to assign a palette value to that exact colour and the "reduce bleed" button isn't clicked to make the output snap completely to a palette colour if within a certain range), and, at least theoretically, large blocks of plain colour remaining as such (...ditto).
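For anyone who wants to poke at the image analogy, here is a minimal sketch in Python of the one-line error diffusion described above. It assumes the error is simply added in full to the next pixel (not averaged, which was left as an open question above), and the function name `diffuse_line` and the two-colour palette are made up for the example.

```python
def diffuse_line(pixels, palette):
    """Quantise one row of greyscale pixels to `palette`, carrying the error forward."""
    out = []
    error = 0.0
    for p in pixels:
        target = p + error                       # original value plus carried-over error
        nearest = min(palette, key=lambda c: abs(c - target))
        out.append(nearest)
        error = target - nearest                 # what is still "owed" to the next pixel
    return out

row = [100] * 8                                  # a flat mid-dark grey
print(diffuse_line(row, [0, 255]))               # [0, 255, 0, 255, 0, 0, 255, 0]
```

The flat grey comes out as a dither pattern whose running total never drifts more than half a palette step away from the original, which is the property that carries over to the audio case.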
Audio is more like a purely greyscale version of this, but starting out with a wider dynamic range, and a very wide image that's many thousands or millions of pixels wide (with changes usually being relatively gradual and smeared over many pixels) but only one pixel high, and there isn't really a palette in this case. Actually mu-law / A-law are kind of like mapping a high dynamic range picture to a standard 8-bit greyscale palette using a gamma curve, but ADPCM ditches that idea as it'd be hideously noisy. Instead each "pixel" in the stream is compared to its predecessor to create a dynamic encoding, and that's what gets compressed according to a set palette and with the error diffusion to keep it as close as possible to the original signal (an uncorrected quantised-differential version would sound increasingly different from the original and quickly become unlistenable). Sometimes the error will be enough to shift the differential value up or down by one or two steps, sometimes not.
This of course introduces additional noise, but not as much as you'd expect, and it scales somewhat with the input signal volume so you don't need to be super bothered about amplifying or normalising it to be as loud as possible, outside of the speaker talking louder if the person on the other end asks them to. The noise is somewhat spread across time instead of being concentrated entirely in the instantaneous amplitude (much as it's spread horizontally across an image with error diffusion dithering), and the human auditory system tends to mask little bits of noise in the quiet areas surrounding a louder sound... which also means an additional part of ADPCM encoding can come into play - it being broken up into NICAM-style blocks which have a volume scalefactor attached (I think that's maybe exclusive to DVI, or to whatever the other version was...?). Which does technically mean it's a little more than 4 bits on average, and you need that little extra bandwidth in your system or to subtly reduce the actual sample rate to compensate, but it can provide further dynamic range benefits, especially if the signal happens to be consistently low.
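To make the feedback loop above a bit more concrete, here is a toy 4-bit adaptive differential coder in Python. It is only a sketch under my own assumptions and is NOT IMA, MS or G.726 ADPCM: the step multipliers in `STEP_SCALE`, the step limits and the "previous decoded sample" predictor are all invented for illustration. What it does share with the description above is that the encoder compares each sample with what the decoder will have reconstructed (so errors don't accumulate), each nibble is a sign bit plus a 3-bit magnitude, no code decodes to a zero difference, and the step size scales with the signal.

```python
import math

# Hypothetical adaptation rule: small codes shrink the step, large codes grow it.
STEP_SCALE = [0.9, 0.9, 0.9, 0.9, 1.2, 1.6, 2.0, 2.4]
MIN_STEP, MAX_STEP = 16.0, 8192.0

def _clamp(v, lo, hi):
    return max(lo, min(hi, v))

def _decode_one(code, predicted, step):
    """Apply one 4-bit code; shared by encoder and decoder so both stay in sync."""
    magnitude = code & 7
    delta = (magnitude + 0.5) * step             # midpoint of the interval, never zero
    if code & 8:                                 # top bit is the sign
        delta = -delta
    predicted = _clamp(predicted + delta, -32768, 32767)
    step = _clamp(step * STEP_SCALE[magnitude], MIN_STEP, MAX_STEP)
    return predicted, step

def encode(samples):
    predicted, step = 0.0, MIN_STEP
    codes = []
    for s in samples:
        diff = s - predicted                     # compare with the *decoded* history
        magnitude = min(7, int(abs(diff) / step))
        code = magnitude | (8 if diff < 0 else 0)
        codes.append(code)
        predicted, step = _decode_one(code, predicted, step)
    return codes

def decode(codes):
    predicted, step = 0.0, MIN_STEP
    out = []
    for code in codes:
        predicted, step = _decode_one(code, predicted, step)
        out.append(int(round(predicted)))
    return out

# Usage: a 440 Hz tone at 8 kHz, 16-bit range, squeezed into one nibble per sample.
samples = [int(10000 * math.sin(2 * math.pi * 440 * n / 8000)) for n in range(400)]
codes = encode(samples)
decoded = decode(codes)
print(len(codes), "nibbles for", len(samples), "samples")
```

The first few samples are badly off while the step size ramps up from its minimum; after that the decoded tone tracks the input within about half a step.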
In fact, it may have been meant just for telephone compression, but in the early days of "real" audio coming out of home computers, before MPG was a practical option, I used it to compress audio ripped from CDs and tapes to fit individual tracks onto floppy disks and to take up less hard drive space. The results weren't as good as the same-sized MP3 of course, but a 14 kHz mono 4-bit ADPCM file gave surprisingly acceptable results when squishing a typical 3'20" single into 1.38MB... (the MP3 equivalent would have been 56 kbit/s for 22 kHz stereo, however). I experimented a bit with 3-bit but it was rather crunchier and nowhere near as compatible. It would still have been fine for speech, I expect, but I don't know what conditions would have demanded it. Maybe that's how the 48 and 56 kbit/s rates were achieved for the wideband version: use 3-bit for one or both of the sub-bands?
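The arithmetic behind those figures is easy to check. A small Python sketch follows; the helper names are made up, and it counts raw sample data only, ignoring any file header or per-block scalefactor overhead.

```python
def bitrate_kbit_s(sample_rate_hz, bits_per_sample, channels=1):
    """Raw bitrate of the sample data in kbit/s."""
    return sample_rate_hz * bits_per_sample * channels / 1000

def size_bytes(sample_rate_hz, bits_per_sample, seconds, channels=1):
    """Raw size of the sample data in bytes."""
    return sample_rate_hz * bits_per_sample * channels * seconds // 8

print(bitrate_kbit_s(14_000, 4))            # 56.0  (14 kHz mono, 4-bit)
print(size_bytes(14_000, 4, 3 * 60 + 20))   # 1400000 bytes for a 3'20" track
print(bitrate_kbit_s(8_000, 8))             # 64.0  (8-bit companded telephony)
print(bitrate_kbit_s(8_000, 4))             # 32.0  (4-bit ADPCM, half the rate)
```

So the 14 kHz mono 4-bit file runs at the same 56 kbit/s as the MP3 mentioned above and comes to about 1.4 million bytes, which just fits on a 1.44 MB floppy, and the last two lines are where the "double the capacity of a phone line" claim comes from.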
As for converting 8-bit A/mu-law to ADPCM... really, starting from 16-bit is preferable, but the companded versions are at least a bit less wasteful than converting from 8-bit PCM (where some of the differential levels of ADPCM simply don't get used, unless you're using some version specially tuned for 8-bit sources). In any case, it's simple enough. You take the compressed 8-bit stream, decompress it to 13 or 14 bits (...or in a modern system, more likely 16), and then encode that into the 4-bit ADPCM. The process at each step is just like it would be in any other situation, except that the decompressed version doesn't get output to a speaker or analogue line (...well, perhaps a very short analogue line if the two parts aren't integrated in the same unit and the separate modules aren't digitally connected, but that seems unlikely). 92.12.87.15 (talk) 18:50, 31 October 2023 (UTC)
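The first half of that pipeline, expanding an 8-bit mu-law byte back to linear PCM, can be sketched in Python as below. This follows the commonly used G.711 expansion with a bias of 0x84 and produces values in a 16-bit range; the function name `ulaw_to_linear` is my own, and the ADPCM stage that would consume the output is deliberately left out.

```python
BIAS = 0x84  # bias used by the usual G.711 mu-law expansion

def ulaw_to_linear(u):
    """Expand one 8-bit mu-law byte to a signed linear sample (16-bit range)."""
    u = ~u & 0xFF                    # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07       # segment number
    mantissa = u & 0x0F              # position within the segment
    sample = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -sample if sign else sample

print(ulaw_to_linear(0xFF))          # 0
print(ulaw_to_linear(0x80))          # 32124
print(ulaw_to_linear(0x00))          # -32124

# The expanded samples would then be handed to whichever ADPCM encoder is in use:
linear = [ulaw_to_linear(b) for b in bytes([0xFF, 0xE0, 0xC0, 0xA0, 0x80])]
```

Note how unevenly the 256 codes are spread over the linear range, which is why expanding first and then taking differences works better than differencing the companded bytes directly.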