Talk:BCD (character encoding)


Merge? How does this differ from Six-bit character code?

Six-bit character code Peter Flass (talk) 13:54, 7 June 2012 (UTC)Reply
BCD (6-bit) Peter Flass (talk) 13:55, 7 June 2012 (UTC)Reply

There are many reasons for keeping BCD (6-bit) separate from Six-bit character code (I've used it for 35 years as a systems engineer at Bull, and it was a milestone):
  1. For the same reason we do not merge EBCDIC and ASCII - Seven-bit character codes (8-bit later on).
  2. So that it is not confused with BCD.
  3. Six-bit character code is generic (trying to cover all the others, among them Fieldata).
  4. Braille six-bit code should go inside Six-bit character code, but cannot be included in BCD (6-bit).
  5. It was the first six-bit character code, giving birth to its offspring EBCDIC and ASCII.
  6. Do you want to merge all 8-bit code sets into one article?
  7. All code pages have separate articles. Do you want to merge all code pages into one article?

(to be continued)--Mcapdevila (talk) 16:57, 18 June 2012 (UTC)Reply

Fair 'nuff. Peter Flass (talk) 20:54, 18 June 2012 (UTC)Reply
According to Wikipedia (which of course must be correct) Fieldata is a 7-bit code. As far as I can see there are no articles Seven-bit character code or Eight-bit character code, just six-bit. Peter Flass (talk) 12:34, 19 June 2012 (UTC)Reply
On the other hand, as far as I can see there are no articles Seven-bit character code or Eight-bit character code, just six-bit. The other major code pages each have their own article. I recently added an article on IBM Transcode, which is a different 6-bit code and there are probably others. Do we need a generic article for six-bit codes? Peter Flass (talk) 12:42, 19 June 2012 (UTC)Reply
See talk for Six-bit character code. Is the definition of BCD a character set where the digits 0-9 have binary codes 00 thru 09? This article should say that. The article Binary-coded decimal covers the arithmetic details of BCD. Peter Flass (talk) 12:48, 19 June 2012 (UTC)Reply
Fieldata (as per 1956) was a sixbit code: "Fieldata is the original character set used internally in UNIVAC computers of the 1100 series, represented by the sixth of the 36-bit word of that computer."--Mcapdevila (talk) 18:07, 19 June 2012 (UTC)Reply
I think there will be no need for the Six-bit character code article once all the included code systems have their own articles, as the other families do. In the meantime they have to appear somewhere.--Mcapdevila (talk) 18:55, 20 June 2012 (UTC)Reply
Makes sense. Peter Flass (talk) 20:11, 20 June 2012 (UTC)Reply

Per Wikipedia:Merging, proposed merge discussions should take place on a single page. As I was confused by the separate discussion at talk:Six-bit character code, I took the liberty to merge it. Thanks, Wbm1058 (talk) 17:36, 19 October 2012 (UTC)Reply

I guess there's no reason to merge and some reason not to, so I removed the merge proposal. I see the distinction between various BCD codes and other codes. Confusingly most 6-bit codes are called BCD even where this is incorrect. I did decide to "be bold" and made changes to this article, the most significant being discussion of the development of BCD based on information from Punched card and Binary-coded decimal. Peter Flass (talk) 14:06, 31 December 2013 (UTC)Reply

Yes, I always wondered, for example reading the IBM 704 Fortran manual, why they call it BCD. I suppose it makes more sense for the BCD arithmetic processors, and continued over to the binary processors. Gah4 (talk) 19:02, 22 February 2017 (UTC)Reply

Definition of BCD

To distinguish BCD from other six-bit codes I added the following: "BCD encodes the characters '0' through '9' as the corresponding binary values". Peter Flass (talk) 23:50, 31 December 2013 (UTC)Reply

I backed this off because some BCD character sets seem to have zero in another position. Does anyone have access to the Mackenzie reference? What the heck is his definition of BCD? Peter Flass (talk) 23:46, 3 January 2014 (UTC)Reply
I guess this is not true. I've done some research and there are a number of BCD encodings. Peter Flass (talk) 00:09, 16 February 2014 (UTC)Reply
You can't write the all zero bits character on even parity NRZI tape. (That is why they went to odd parity for 9 track.) So the '0' character moves around. Note that the table for the 704 doesn't include '+' or '=', for the machine that Fortran was written for, and presumably what it was sold for. At that point, the punch codes were not unique. The 026 had the ability to punch a specific number of characters, which were reassigned for scientific machines. I believe some of the tables have '&' where '+' should be. (Fortran would be sad without '+'.) Gah4 (talk) 22:25, 21 July 2020 (UTC)Reply
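
The parity point above can be illustrated with a small sketch. It assumes a tape frame of six data bits plus one parity bit, and models NRZI simply as "a 1 bit is a flux transition, a 0 bit is none". The function names are hypothetical and are not taken from any IBM document.

```python
def parity_bit(code: int, odd: bool) -> int:
    """Return the parity bit for a six-bit character under odd or even parity."""
    ones = bin(code & 0o77).count("1")
    # Even parity: data bits plus parity bit must contain an even number of 1s.
    bit = ones % 2
    # Odd parity: flip the bit so the total number of 1s is odd.
    return bit ^ 1 if odd else bit


def frame_has_transition(code: int, odd: bool) -> bool:
    """Under the NRZI model assumed here, a frame is visible only if some track holds a 1."""
    return (code & 0o77) != 0 or parity_bit(code, odd) == 1


# The all-zero character leaves no transition with even parity,
# but the parity track carries a 1 with odd parity.
print(frame_has_transition(0o00, odd=False))  # False
print(frame_has_transition(0o00, odd=True))   # True
```

Under this model every nonzero code is visible with either parity, so only the all-zero character needs a different tape code.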

Two commas in IBM 704 BCD

Does it really contain two commas and no period? Well, I've seen stranger things, but if so, the table should annotate this explicitly as a bizarre anomaly. — MaxEnt 10:36, 25 June 2014 (UTC)Reply

Good catch, thanks. Peter Flass (talk) 11:03, 25 June 2014 (UTC)Reply

No, but the 704 code does contain two minus signs. From the IBM 704 Fortran manual: "Note 1: There are two - signs. Only the 11 punch minus may be used in source program cards. Either minus may be used in input data to the object program; object program output has the 8-4 minus." In the BCDIC days, there were two different maps of characters to punches, one used for business, the other for science. I just changed the 704 table to match Fortran commonly used on the 704. I wonder if there should be tables indicating the punched card codes in addition to the in-memory codes. Gah4 (talk) 19:44, 6 May 2015 (UTC)Reply

I just did a revert on removing one of the minus signs. Though in those days the characters didn't belong to the hardware, but the software. The card reader on the 704 reads a card row into two 36 bit words. Software has to convert that to columnwise characters, using whichever code it desires. But since Fortran made the 704 famous, I believe that we should use its code. Gah4 (talk) 04:54, 23 April 2017 (UTC)Reply
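
As a rough illustration of the row-to-column conversion described above, here is a minimal sketch. It assumes each card row arrives as a pair of 36-bit words with column 1 in the most significant bit of the left word; that bit ordering, the row naming, and the function name are assumptions made for the example, not details from the 704 manual.

```python
from typing import Dict, List, Tuple

# Card rows from top to bottom, named by their punch positions.
ROWS = ["12", "11", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]


def rows_to_columns(row_images: Dict[str, Tuple[int, int]]) -> List[List[str]]:
    """Transpose row images (two 36-bit words per row) into per-column punch lists."""
    columns: List[List[str]] = [[] for _ in range(72)]
    for row in ROWS:
        left, right = row_images.get(row, (0, 0))
        bits = (left << 36) | right  # 72-bit image of one card row
        for col in range(72):
            # Assumed ordering: column 1 is the most significant bit.
            if (bits >> (71 - col)) & 1:
                columns[col].append(row)
    return columns


# Column 1 punched 12+1, column 2 punched 11, everything else blank.
card = {"12": (1 << 35, 0), "1": (1 << 35, 0), "11": (1 << 34, 0)}
cols = rows_to_columns(card)
print(cols[0])  # ['12', '1']
print(cols[1])  # ['11']
```

Turning each column's punch list into a six-bit character would then be a table lookup, and which table to use is the software's choice, as noted above.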

hex values for octal codes?

At 0x0032 is the record mark, which was not proposed separately due to its similarity to the double dagger. At 0x0077 is the group mark, which is not in Unicode yet.

Note that 0x0032 is a 16 bit hexadecimal value meant to represent a six bit octal value, but it doesn't.

Also, should the row headers be of the form 0x0 where x is the appropriate octal digit? Otherwise, there is no reason for the row and column headers to be three digit octal characters. Gah4 (talk) 08:33, 22 February 2017 (UTC)Reply

I notice a recent edit changing to hex constants. I believe that is usual for wikipedia pages, though traditionally when these systems were in use, and in the original documentation, octal was more usual. I am fine with hex, but others might disagree. Discuss here instead of an edit war. Gah4 (talk) 00:22, 18 April 2017 (UTC)Reply
I vote for octal: not only is this the traditional representation, but the tables make a lot more sense viewed in octal. Peter Flass (talk) 03:20, 18 April 2017 (UTC)Reply
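
The notation problem raised in this section can be checked with a few lines of arithmetic. This is only a sketch; the helper names are hypothetical. It treats the labels 32 and 77 as octal six-bit codes and shows why writing them as 0x0032 and 0x0077 gives different numbers.

```python
def octal_label_to_code(label: str) -> int:
    """Interpret a label such as '32' as octal and return the six-bit value."""
    code = int(label, 8)
    if not 0 <= code <= 0o77:
        raise ValueError("not a six-bit code")
    return code


def table_position(code: int) -> tuple:
    """Split a six-bit code into (row, column) of an 8 x 8 table, one octal digit each."""
    return divmod(code, 8)


record_mark = octal_label_to_code("32")
group_mark = octal_label_to_code("77")
print(record_mark, hex(record_mark), f"{record_mark:06b}")  # 26 0x1a 011010
print(group_mark, hex(group_mark), f"{group_mark:06b}")     # 63 0x3f 111111
print(0x0032 == record_mark)                                # False (0x32 is 50)
print(table_position(record_mark))                          # (3, 2)
```

Because six bits split evenly into two groups of three, each octal digit maps directly to a table row or column, which is one reason the tables read more naturally in octal.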

Sorry for the edit confusion

@Gah4: I just undid your undo with additional citations and explanations (I was actually in the middle of it when I hit your undo as an edit conflict). I think I have it straight now. Is that any clearer? 71.41.210.146 (talk) 05:24, 23 April 2017 (UTC)Reply

@Gah4: I checked your recent edit adding the 716 printer info, and I'm not sure it's right. The table in the source (and, indeed, the entire printer) seems to go from virtual punched cards to print; there is no 6-bit binary representation at all.

And indeed as explained on p. 52, it's up to software to provide this. The printer is fed with transposed cards, 72 columns (2 36-bit words) at a time. Repeat for 12 virtual card rows per line printed.

I still don't fully understand it; there's no obvious way to print a blank space! An unpunched column corresponds to "*". (Is that actually a symbol or some sort of footnote/special handling notation?) It's a pretty baroque system by modern standards, so I probably just don't understand it, but are you sure you do? 71.41.210.146 (talk) 08:49, 23 April 2017 (UTC)Reply

As I noted earlier, this is all in software. The card reader reads cards row by row, each row into two 36 bit words. Software has to convert that to internal code. The printer came from the Type 407 accounting machine, where it does print from card codes. So, printing software converts to card codes, then sends those codes to the printer. Both the card reader and printer have a plug board that allows much rewiring. For the card reader, you can select which 72 columns are actually read, but it is normally 1 to 72. I am less sure what you can do with the printer plug board, but I presume also that you can move columns around. Also, you can't write a character with all bits zero to even parity NRZI tape, as it will have no transitions. That partly explains the tape code. In addition, the card codes are not unique. Scientific and business machines use the same card codes for different characters. Note that neither table III nor V has an equal sign. These are the tables for the business character set, even though it is supposed to be a scientific machine. I suspect that they change the printer wheels for actual scientific programming. Yes, I am not sure why there is a '*' in position zero. I suppose the type 407 manual will explain that one. Otherwise, my choice was to use the Fortran character codes, as that is what users were more likely to see. But table V does explain the source of the two minus signs in Fortran. Gah4 (talk) 09:35, 23 April 2017 (UTC)Reply
Some is explained in[1] including that the second '*' is meant for blank filling when printing checks. It seems that there are complications for the row without a digit punch, and 0 is a zone punch. So, the problem isn't how to print a blank, but how to print a zero. As for the two minus signs, from the 704 Fortran manual: "Only the 11 punch minus may be used in source program cards. Either minus may be used in input data to the object program; object program output has the 8-4 minus." An extra challenge for programmers! Gah4 (talk) 09:58, 23 April 2017 (UTC)Reply
@Gah4: Yes, I follow all that, but the fact that it's all software means that the printer doesn't have a 6-bit binary code. There is a well-defined convention for the digit punches 1–9, 8+3 and 8+4, but there are two conventions for the zone punches. At the least, the article needs some text explaining that this is the character set but the actual code assignments are a software convention.
If you look at the picture of the print wheel in fig. 3 on p. 8 of your citation, you'll see that the 407 numbers zones in none/12/11/10 order, so the table you have is reasonable, but it's still an interpretation.
As for the blank filling, I mentioned that at the end of the 48-character BCDIC section: the first specials added to the basic 10+26+blank were - (for credit balances and hyphenated names), & (for names & addresses), and * (for blank filling). 71.41.210.146 (talk) 18:23, 23 April 2017 (UTC)Reply
Above table III, it describes that the code differs from the 702[2] which has the numerical values of the characters not in alphabetical order. The 702 manual gives the card and tape codes. The 704 keeps the same card and tape codes, but changes the in-memory form. In any case, '0' cannot be X'00' on tape, as BCD tape uses even parity. For binary tape, they have to use odd parity, as you can't avoid writing zeros. My choice is to go back to the Fortran 704 code, as that is well defined, and is what Fortran programmers would see. Gah4 (talk) 08:28, 24 April 2017 (UTC)Reply
@Gah4: Yes, it's documenting that the 704 uses a different binary code than the 702. But this is irrelevant to the 716 printer accessory, which is what we're discussing. The 716 works the same way as a 407, and a 407 doesn't have a 6-bit binary code. It uses a 12-bit code (with at most one zone punch and one digit punch, or the 3+8 and 4+8 digit combinations) based on a virtual IBM card column. Translation from binary is the job of 704 software, and not a fixed property of the machine.
My complaint is that the current mainspace page attributes a 6-bit binary code to the 716 printer, and the 716 printer doesn't deal with binary codes. Not as input, not internally, nothing.
(For those following along, the "Table III" mentioned is the one on p. 24 of the 704 manual.)
I just figured out how to make the page accurate. Have a look in a few minutes and tell me what you think... 71.41.210.146 (talk) 09:15, 24 April 2017 (UTC)Reply
Well, the real problem is that codes are software defined more than we like to believe. Table III is the tape code for the 704 because it is the tape code for the 702. Tape codes are pretty much always in software, but by naming them, we can pretend that they are hardware. Especially since the usual way to run the 704 was to copy cards to tape, carry it to the 704, write the output to tape, and carry the tape somewhere else to print. Tape drives were much faster than card readers and printers. Even more, note that you can't run Fortran with the printer code that you show. By the time of the 7094[3], there were seven different character codes for that printer, shown on page 113, along with the Fortran code. (You at least need an equal sign for Fortran.) The tables that you show don't have an equal sign, and so are not useful for Fortran, while it is Fortran that made the 704 famous. I am not so sure why this happened. With hardware floating point, the 704 was obviously a scientific machine, yet they show the business character set in the manual. Printers and keypunches could only generate 48 printable characters, so they gave some punch codes, and then the corresponding tape codes and printer codes, more than one meaning. Fixing this was the driver for EBCDIC. For the 716, you had to change the print wheels to change the code. For the 1403 chain/train printers, it is an easily replaceable chain or train (depends on the model), along with software to tell it which character is in which position. Character sets are set more by convention than by hardware. The Fortran character set was built into the compiler and run-time library. Programs could rely on that. Gah4 (talk) 18:56, 24 April 2017 (UTC)Reply

Don't confuse printer codes (or keypunch codes) with character sets. You can run FORTRAN, or any other compiler, on machines with printers that don't have some of the characters. I remember running FORTRAN on machines where the printed/punched characters didn't match the values the compiler used. Peter Flass (talk) 19:19, 24 April 2017 (UTC)Reply

@Gah4: Absolutely agreed about software; in most applications, the codes are arbitrary. But the codes are "anchored" by a significant ecosystem of hardware or software that interprets them a certain way. If Wikipedia's database and a browser client's fonts were updated to reverse A–Z to Z–A, 99% of the text would be displayed unchanged with only some minor collating order issues because the layers in between don't care. But <html>, <head>, <body>, etc. would cause problems.
The 407/716 printers come from the "unit record" (punched card) world. Although electronic, they aren't binary. They aren't big-endian or little-endian, they're face down, 9-edge first. The only "anchor" the 716 has to a 6-bit code is via a 704 CPU. It makes no sense to attribute a six-bit code to the 716 in isolation from a 704. Now, you can sensibly talk about the six-bit code implemented by the standard "716 printer driver" on the 704, but that requires a different source than what's in the article now.
I fully agree that the 716 code in the article now can't be used to run Fortran. So what? You added it to the article; why did you think it was a useful addition? My objection was (and is) to the inclusion of binary equivalents to the printed symbols. I addressed that by leaving the character table in, but labelling the axes with card punch codes rather than hexadecimal numbers.
In an earlier edit, I changed the 704 Fortran character set to the 704 manual of operation character set, because I thought that if we were going to list one, that was "more authoritative". Not including the Fortran character set as well was just laziness on my part; it wasn't a deliberate deletion. Feel free to add it back!
@Gah4 and Peter Flass: Talking about the Fortran character set makes sense because the Fortran compiler (and associated software ecosystem) is such a heavyweight "anchor". Even if the printer mangles the characters, the compiler interprets certain bit patterns as =, (, ), <, >, +, -, *, /, etc. This interpretation was consistent across all 704 users, and it's therefore reasonable to talk about the "704 Fortran character encoding". 71.41.210.146 (talk) 21:58, 24 April 2017 (UTC)Reply
Yes, the reason the Fortran codes aren't in the manual is that they hadn't finished it yet. The Fortran manual, at least, isn't until October 1956. As Peter Flass notes, I am sure that often enough someone would use the printer with the wrong print wheels. Just one of the costs of business in those days. I do remember punching ALGOL programs on 026 keypunches, which don't have many of the characters needed. There were big signs on the wall telling the multipunch needed for those. Seems to me that the problem with the printer codes is that they aren't BCD anymore, the subject of this page. Otherwise, yes, both the table from figure III and the Fortran code from the Fortran manual can go in as separate tables. It may be that the print wheels for Fortran hadn't been made yet in 1954. Gah4 (talk) 22:31, 24 April 2017 (UTC)Reply
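
For readers following the 12-bit column code described in this thread (at most one zone punch and one digit punch, or the 8+3 and 8+4 combinations), a minimal validity check might look like the sketch below. Treating rows 12, 11 and 0 as the zone punches is an assumption based on the comments above, and the names are hypothetical. The sketch says nothing about which character each combination prints.

```python
from itertools import combinations

ROWS = ["12", "11", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]
ZONES = {"12", "11", "0"}                 # assumption: row 0 counts as a zone punch
DIGITS = {str(d) for d in range(1, 10)}   # digit punches 1 through 9
DOUBLE_DIGITS = ({"8", "3"}, {"8", "4"})  # the two allowed two-digit combinations


def is_valid_column(punches) -> bool:
    """Check one card column against the convention sketched above."""
    punches = set(punches)
    if not punches <= set(ROWS):
        return False
    zones = punches & ZONES
    digits = punches & DIGITS
    if len(zones) > 1:
        return False
    return len(digits) <= 1 or digits in DOUBLE_DIGITS


def count_valid_columns() -> int:
    """Count every punch combination that passes is_valid_column."""
    return sum(
        is_valid_column(combo)
        for size in range(len(ROWS) + 1)
        for combo in combinations(ROWS, size)
    )


print(is_valid_column({"11"}))        # True  (the 11 punch)
print(is_valid_column({"8", "4"}))    # True  (the 8-4 combination)
print(is_valid_column({"12", "11"}))  # False (two zone punches)
print(count_valid_columns())          # 48
```

Under these assumptions there are 4 zone choices (none, 12, 11, 0) times 12 digit choices (none, 1 through 9, 8+3, 8+4), which gives the 48 combinations counted above.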


References

  1. ^ "407 accounting machine" (PDF). bitsavers.trailing-edge.com. IBM. Retrieved 23 April 2017.
  2. ^ "Electronic data processing machines type 702" (PDF). bitsavers.informatik.uni-stuttgart.de. IBM. Retrieved 24 April 2017.
  3. ^ "IBM 7094 Principles of Operation" (PDF). IBM. p. 113. Retrieved 24 April 2017.

power of two

As I understand it, there is a metal plate in the 026 and 029 keypunches that contains the dot patterns. There are x and y positions, that are not powers of two, though I don't remember the numbers. There is even one character on the 029 that doesn't have a dot pattern, 0-8-2, but does have a key. There was no room to add it. For wheel or train printers, there isn't much reason for a power of two, as long as the printer keeps track of which character is where. Gah4 (talk) 19:35, 26 July 2020 (UTC)Reply

I don't remember 0-8-2 having a key. I always used to multi-punch it. Peter Flass (talk) 20:17, 26 July 2020 (UTC)Reply
See the T Gah4 (talk) 22:32, 26 July 2020 (UTC)Reply
Never used it, didn't even know it was there. God, that keyboard brings back a lot of memories! Peter Flass (talk) 22:34, 26 July 2020 (UTC)Reply
I don't understand the question. To what numbers does "powers of two" refer? How does the concept even apply to combinations of holes in a column, for anything other than column binary, which none of the keypunches support?