Talk:Bush hid the facts

This is the talk page for discussing improvements to the Bush hid the facts article.
This is not a forum for general discussion of the article's subject.

Computing: Low-importance

This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Low-importance on the project's importance scale.

Microsoft Windows: Computing Low-importance

This article is within the scope of WikiProject Microsoft Windows, a collaborative effort to improve the coverage of Microsoft Windows on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Low-importance on the project's importance scale.
This article is supported by WikiProject Computing.

Software: Computing Low-importance

This article is within the scope of WikiProject Software, a collaborative effort to improve the coverage of software on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Low-importance on the project's importance scale.
This article is supported by WikiProject Computing.

More lines

"i'll get the linux" also works, according to an old revision of the Easter eggs in Microsoft products page. —Woddfellow2|✎ 22:56, 24 June 2008 (UTC)

Bug

It's not really a bug. It's just a best-effort guess that's not good enough. --193.43.89.206 18:30, 28 June 2007 (UTC)

It's still a bug. Software bug: "A software bug (or just "bug") is an error, flaw, mistake, failure, or fault in a computer program that prevents it from behaving as intended (e.g., producing an incorrect result)." --Goblin ^›talk 19:09, 12 January 2008 (UTC)

You are incorrect, sir. Entering the Chinese characters and saving would be storing the exact same information, yet when the text file was opened up the result of showing Chinese characters would be correct. The bug is with the formats, not with the application that adheres to them. —Preceding unsigned comment added by 66.119.171.18 (talk) 00:42, 9 April 2009 (UTC)

Yup, seconded. There is no way of determining the correct encoding without applying a natural language model. —Preceding unsigned comment added by 150.101.232.226 (talk) 04:06, 1 June 2009 (UTC)

It's not a bug. A bug means behavior the programmers did not intend, something they didn't want the program to do. This is by design; the function IsTextUnicode is not buggy, it works as specified. —Preceding unsigned comment added by 188.112.72.45 (talk) 17:02, 14 March 2010 (UTC)

It IS a bug. The correct implementation would have been to require the BOM on the new, incompatible encoding, i.e. UCS-2/UTF-16. There should NOT be a BOM on text in old encodings. Whether there should be one on UTF-8 is questionable. I don't think so, because an IsTextUnicode to distinguish UTF-8 and CP1252 is almost impossible to get wrong, but others disagree. The fact that Microsoft thought there should be no distinguishing mark on the new encoding that is easily confused with the old one is a design error and thus a bug. Spitzak (talk) 22:23, 21 April 2010 (UTC)

However, note that even though Microsoft Notepad saves a BOM in UTF-16 files, it still has to be able to process text files storing raw text for compatibility purposes. The 8-bit text file for "bush hid the facts" is short enough to cause ambiguity in encoding detection. In newer versions of Notepad, Microsoft slightly altered the detection heuristic. 2A01:119F:220:9400:58C3:7988:DAF7:149F (talk) 14:37, 12 February 2023 (UTC)
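For illustration, the BOM-only policy argued for in this thread could be sketched roughly as follows. This is a hypothetical sketch, not how Notepad or IsTextUnicode works: the function name `sniff_bom`, the `legacy` parameter and the choice of cp1252 as the fallback are assumptions made for the example, and UTF-32 marks are ignored.

```python
import codecs

# Byte order marks checked by this sketch (UTF-32 is deliberately ignored).
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes, legacy: str = "cp1252") -> str:
    """Pick an encoding from a leading BOM only (hypothetical policy)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # No BOM: fall back to the old 8-bit encoding and never guess UTF-16.
    return legacy


print(sniff_bom(b"bush hid the facts"))
print(sniff_bom(codecs.BOM_UTF16_LE + "bush".encode("utf-16-le")))
```

Under this policy the 18-byte file is always read as 8-bit text, at the cost of misreading BOM-less UTF-16 files, which is the compatibility concern raised in the last comment.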

Cleanup

Why is there a 'cleanup' sticker on this? It seems fine. I am removing it. Seth J. Frantzman 18:06, 17 August 2007 (UTC)

What do the Chinese characters say?

just a nonsensical string —Preceding unsigned comment added by 204.174.12.18 (talk) 22:42, 5 May 2008 (UTC)

The characters are: 畂桳栠摩琠敨映捡獴. You can see one analysis at Wikipedia:Reference_desk/Archives/Language/2008_September_25#Pseudo-Chinese_question.... AnonMoos (talk) 06:12, 8 October 2008 (UTC)
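The characters can be reproduced without Notepad by reinterpreting the ASCII bytes of the phrase as little-endian UTF-16. The sketch below only assumes that the file was saved as plain 8-bit text; the variable names are illustrative.

```python
text = "Bush hid the facts"
raw = text.encode("ascii")            # the 18 bytes an 8-bit save would store
misread = raw.decode("utf-16-le")     # read each byte pair as one UTF-16LE unit

print(misread)
print([f"U+{ord(c):04X}" for c in misread])

# The all-lowercase phrase differs only in its first byte pair.
lower = "bush hid the facts".encode("ascii").decode("utf-16-le")
print(lower[0], misread[0], lower[1:] == misread[1:])
```

Each resulting character is just two adjacent letters packed into one 16-bit value, which is consistent with the string being nonsensical as Chinese.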

Sounds like a lie

I don't buy it. If this sentence were terminated because of the number of letters in the four words, it's all a lie. I tried typing "reagan hid the facts" and the same happened. The only thing I can think is that Microsoft doesn't like people typing negative things about their favourite politicians when editing an html-page, or what looks like it. —Preceding unsigned comment added by 80.163.25.176 (talk) 13:28, 21 September 2007 (UTC)

Well it does work with Reagan too, so the bug is more extensive than originally thought. Odd. violet/riga (t) 14:05, 21 September 2007 (UTC)

Removal of "Explanation" section.

I removed the "Explanation" section from this page because technically, it makes no sense, or is incomplete. The section read as follows (wrapped so it doesn't cause long lines - an unedited version is in this page's source):

==Explanation==
Text files containing [[Unicode]] [[UTF-16]]-encoded Unicode start with a
"[[Byte-Order Mark]]" (BOM), which is a 2-byte flag that tells a reader how the
following UTF-16 data is encoded.  When you save a file in Notepad, by default you
are saving to 8-bit Extended [[ASCII]].  When the file is opened again, the bit
pattern tells notepad that you are reading from 16-bit [[Unicode]].  This causes
the eighteen 8-bit [[ASCII]] characters to be displayed as nine 16-bit [[Unicode]]
characters.
'''
Verified by JT (KNUSTComputerScience2004)'''

It certainly seems as though there's a misinterpretation of the file as UTF-16, but there's a vital part missing from this; the BOM itself isn't present in the text entered, or in the saved file. (The BOM being U+FEFF). Therefore, the explanation that the BOM is causing this makes no sense.

Maybe something in the text is causing Notepad to mistakenly think that it's UTF-16 and that it found a BOM, but the explanation doesn't mention that and it's technically incomplete. I don't know what the bug is in Notepad, but I have confirmed it myself. --Ciaran H 15:57, 1 October 2007 (UTC)

Notepad calls IsTextUnicode, which runs a heuristic on the text. (Heuristic is programmerese ≈ educated guess.) Sometimes it guesses wrong, and the likelihood of guessing wrong is greater for short texts. Shinobu (talk) 19:55, 29 November 2007 (UTC)
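Two points from this thread can be checked directly: the saved bytes contain no BOM, and eighteen 8-bit characters make nine 16-bit units. A minimal sketch, assuming the file holds exactly the ASCII bytes of the phrase with no trailing newline:

```python
import codecs

raw = "Bush hid the facts".encode("ascii")

print(raw.hex(" "))                 # hex dump of the saved bytes
print(len(raw), len(raw) // 2)      # byte count and number of 16-bit units

# A UTF-16 BOM would be the bytes FF FE or FE FF at the very start.
has_bom = raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))
print(has_bom)
```

The file starts with the bytes for "Bu", so any UTF-16 reading has to come from a guess about the content rather than from a BOM.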

Removal of the comment at the bottom

I removed that "This arcticle is bullshit" comment at the bottom of the article. —Preceding unsigned comment added by 87.180.56.34 (talk) 15:36, 19 December 2007 (UTC)

Good call. It's original research, and would need to be presented in the form "according to NOTABLE SOURCE, QUOTE this article is bullshit UNQUOTE REFERENCE". -Ashley Pomeroy (talk) 20:30, 28 September 2008 (UTC)

Not a notepad-only bug?

I saved "bush hid the facts" with Metapad and it still bugged when I closed and reopened the text file. The exact error message I received: "Detected non-ANSI characters in this Unicode file. Data will be lost if this file is saved!" When I hit OK the text became nine question marks. Does this occur with other text editors, perhaps across operating systems? 71.115.6.93 (talk) 04:07, 27 April 2008 (UTC)

Metapad is yet another Win32 editor so the problem is probably caused by IsTextUnicode(), too. There is some technical insight if you care. saimhe (talk) 20:37, 21 June 2008 (UTC)

Conspiracy theory

I think the following points should be made (provided they're true and you can prove it...):

Bush is a reference to George W. Bush
The trigger text "Bush hid the facts" was chosen for a lark after the original bug had been discovered, and helped knowledge of the bug spread as it made a fairly mundane bug seem funnier/weirder
Some people probably thought the "Bush hid the facts" was a specific Easter Egg phrase planted by hackers or anti-Bush programmers inside MS.
Did MS go to the trouble of denying any such conspiracy?

jnestorius^(talk) 20:55, 25 September 2008 (UTC)

There are so many combinations that make this happen, it's actually nothing to do with Bush at all, e.g. "Pete ate the pasta", "Pete ate the pasta", "Bush ate the files" all trigger it in Notepad in WinXP SP2. 79.78.200.240 (talk) 16:05, 7 January 2009 (UTC)

That's what the guy above you just said. Note his second point in particular. It should be made more clear in the article that it's not anything to do with Bush himself, but that the phrase was chosen on purpose because it worked. —Preceding unsigned comment added by 66.119.171.18 (talk) 00:46, 9 April 2009 (UTC)

Newline

If you save the file with a trailing newline (0D 0A, since we're talking about the DOS/Windows world), this bug won't be triggered. This may go some way toward explaining why the bug is less frequent than it could be (as misparsing ASCII text as UTF-16 is not statistically rare). —Preceding unsigned comment added by 89.0.2.3 (talk) 00:22, 4 February 2009 (UTC)

All even-length ASCII text when decoded as UTF-16 will result in code points in the first half of the BMP. Some of those code points are unallocated but afaict all of them are potentially allocatable. CRLF interpreted as UTF-16LE would map to U+0A0D, which is an unallocated character in the "Gurmukhi" block. Presumably the MS implementation either knows that character is unallocated or considers a mixture of Chinese and Gurmukhi unlikely. Plugwash (talk) 16:59, 7 February 2018 (UTC)
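Both statements in the comment above can be checked with a short script. This is only a sketch: it enumerates every pair of 7-bit ASCII bytes, and the "unallocated" check relies on whichever Unicode version ships with the local Python's `unicodedata` module.

```python
import itertools
import unicodedata

# Every pair of 7-bit ASCII bytes, read as one little-endian UTF-16 code unit.
highest = max(
    ord(bytes(pair).decode("utf-16-le"))
    for pair in itertools.product(range(0x80), repeat=2)
)
print(f"U+{highest:04X}")           # highest code point reachable from ASCII pairs

# CR LF (0D 0A) read as a single UTF-16LE code unit.
crlf = b"\r\n".decode("utf-16-le")
print(f"U+{ord(crlf):04X}", unicodedata.category(crlf))
```

A general category of `Cn` means the code point is unassigned in that Unicode version. How IsTextUnicode actually treats it is not something this sketch can show.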

"This app can break" doesn't work! :)

It's funny, but I tried this bug and it behaves in the described way for all strings I tried except for "This app can break". If I changed any letter in this (e.g. "app" to "apa") and saved in the new Notepad window, it still changes to Chinese, but this one phrase fails. :)

Can anybody confirm that? I restarted Notepad for each phrase and saved only new files, so I shouldn't have made any mistakes. m_gol (talk) 12:25, 1 March 2009 (UTC)

I suspect that the sequence " app" isn't mapping to two valid Unicode Chinese characters. But I'm unable to verify that. --Alvestrand (talk) 17:28, 4 March 2009 (UTC)

It might be the capital T. Since Chinese doesn't have capitals, in some cases, it may cause the (istextunicode) thingy to recognize it as English. I don't know the exact cause, only speculating. Annihilatron (talk) 16:46, 25 March 2009 (UTC)

It is the capital T, but it has nothing to do with the fact that Chinese may not have capitals. A capital T has a different 8-bit value than a small t, so "Th" has a different 16-bit value than "th". Obviously that different 16-bit value doesn't represent a Chinese character (or perhaps a "typical" Chinese character) so that is why it doesn't work. —Preceding unsigned comment added by 66.119.171.18 (talk) 00:50, 9 April 2009 (UTC)
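The difference between "Th" and "th" as 16-bit values can be printed directly. This sketch only shows the two code points; the helper name `as_utf16le` is made up for the example, and it says nothing about how IsTextUnicode scores them.

```python
def as_utf16le(pair: str) -> str:
    """Code point produced when two ASCII characters are read as one UTF-16LE unit."""
    ch = pair.encode("ascii").decode("utf-16-le")
    return f"U+{ord(ch):04X}"


print(as_utf16le("Th"))
print(as_utf16le("th"))
```

Both values fall inside the CJK Unified Ideographs block (U+4E00 to U+9FFF), so the change in value alone does not explain why one phrase is detected differently from the other.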

Exceptions

There are exceptions to this bug. For example, if you type 1234 123 123 12345, it stays the same!--Jupiter.solarsyst.comm.arm.milk.universe (talk) 22:14, 13 May 2009 (UTC)

�F�v��ֲ� —Preceding unsigned comment added by 149.169.212.191 (talk) 00:19, 8 January 2010 (UTC)

Inaccuracy

"Bush hid the facts" becomes 畂桳栠摩琠敨映捡獴. The characters shown in the article (only the first one is different) are for "bush hid the facts" instead (first letter uncapitalised). I would edit the article myself but these characters probably also have a different transliteration than the one provided, so I fear I'd only make it more inaccurate. 91.107.57.148 (talk) 15:35, 5 January 2011 (UTC)

Fixed in SP1 of Windows 7?

If it was already there, I think it has been fixed in Service Pack 1 of Windows 7. Updated Windows doesn't have this behavior. When I type anything like Bush hid the facts etc. and follow the procedure, it remains the same. Others, please confirm this and edit it here if you find the same! — Preceding unsigned comment added by 223.181.0.206 (talk) 03:23, 30 January 2012 (UTC)

Doesn't work for me either. 2Awwsome (talk) 17:27, 23 May 2013 (UTC)

According to this article it was fixed well before that, in Vista. Spitzak (talk) 03:25, 29 May 2013 (UTC)

Bug does still exist in Windows 10

I am using Windows 10. This bug just occurred to me while I was writing to a txt file through the "ofstream" class in C++. To preempt the question: no, the sequence written to the file was not "Bush hid the facts" ;) I randomly typed things on my keyboard. — Preceding unsigned comment added by Copperazide (talk • contribs) 02:13, 30 January 2018 (UTC)

Dubious phrase

In the Workarounds section, I find the dubious phrase "Notepad prepends a UTF-8 byte order mark". WTF is a "UTF-8 byte-order mark"? UTF-8 is a byte-stream encoding; unlike UTF-16 it has no endianness. Accordingly, BOMs on UTF-8-encoded files are the exception, and getting rarer. Wegesrand (talk) 15:33, 26 January 2022 (UTC)

It's sometimes useful to begin a file with the three-byte UTF-8 encoding of FEFF to indicate unambiguously that a text file is in UTF-8 to a program which can understand it. (Of course in other cases the three bytes can cause problems.) There's discussion of this at Byte order mark... AnonMoos (talk) 18:20, 27 January 2022 (UTC)

Most Windows programs (at least those supplied by Microsoft) insist on writing the code for U+FEFF at the start of any file that is "unicode", including UTF-8. This of course destroys the compatibility of UTF-8 with ASCII, which is the entire point behind the design of UTF-8, and likely delayed the adoption of Unicode in files and the internet by decades. However this pattern does not trigger the Bush hid the facts bug so it "fixes" this. Spitzak (talk) 18:49, 27 January 2022 (UTC)
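The "three-byte UTF-8 encoding of FEFF" mentioned in this thread can be shown concretely. The sketch below uses Python's standard `utf-8-sig` codec as a stand-in for a program that writes the mark. It does not reproduce Notepad's own save logic.

```python
import codecs

bom = "\ufeff".encode("utf-8")
print(bom.hex(" "))                       # the three bytes of the UTF-8 "BOM"

# "utf-8-sig" writes the mark on encode and strips it on decode.
data = "bush hid the facts".encode("utf-8-sig")
print(data[:3] == codecs.BOM_UTF8, len(data))
print(data.decode("utf-8-sig"))
```

With the mark present the file is 21 bytes rather than 18 and no longer begins with plain ASCII, which is the ASCII-compatibility cost described in the last comment.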

Self-published source

Following a YouTube video about the subject (arguably demonstrating that the Wikipedia article is wrong), that video has now been added as a source. Can we consider the author, FlyTech Videos, a subject matter expert? I am reminded of the recent discussion here where Karl Jobst was rejected as a subject matter expert on something he made a video about. --Renerpho (talk) 18:04, 4 July 2023 (UTC)

I have asked the relevant Wikipedia projects to comment on this.[1][2][3] --Renerpho (talk) 19:39, 5 July 2023 (UTC)

A tool to create sequences that trigger the bug, or check if a given sequence will do so, has been put on Github.[4] It also explains how the bug works. Of course it is also self-published (same person who made the YouTube video). --Renerpho (talk) 19:45, 5 July 2023 (UTC)

Why should we care about self-publishing? This video not only explains in more detail why the bug happens, it also provides a (close enough) oracle to prove that what he says is true (up to newlines) and that the IsTextUnicode() function is faulty. What more do you want as proof? I mean, if the article was talking about history, or politics, well, ok. But it's science here; any repeatable source, self-published or not, should be accepted. (Link to the video for info https://www.youtube.com/watch?v=sPShnuBSvBg) 2001:861:4286:8CA0:79D0:F61A:A524:1AF8 (talk) 17:02, 7 July 2023 (UTC)

I don't care about proof. This is an encyclopedia, not a truth-seeking machine. If reputable sources mention it, it belongs in the article. If the user has to confirm it themselves, by doing their own tests or by trusting the video, it does not. The fact that this is science, not politics, doesn't change that. If anything, the standards regarding WP:OR and WP:NOTTRUTH should be more strict, not less. Renerpho (talk) 02:00, 8 July 2023 (UTC)

I wonder what kind of publication this sort of thing can go in. In the old days it would probably end up in a computer magazine and become citable for us. There would at least be an editor.

(The author noticeably hand-waves over how the Oracle is produced: probably decompilation of the IsTextUnicode function, not a wise thing to admit from a legal perspective indeed. What did the old magazines do with this sort of knowledge?) Artoria 2e5 🌉 09:13, 13 February 2024 (UTC)
