Wikipedia:Reference desk/Archives/Computing/2011 January 29

Computing desk
Welcome to the Wikipedia Computing Reference Desk Archives
The page you are currently viewing is an archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


January 29

Legacy pan-European variable-width character encodings

Why aren't there any legacy pan-European variable-width character encodings? --84.62.200.57 (talk) 21:01, 29 January 2011 (UTC)[reply]

You can easily understand this by looking at the article ISO/IEC 8859 and its subarticles for the various encodings. The encodings had to be compatible with ASCII, i.e. they could only use the upper 128 code points. Moreover, some of these code points were reserved for special purposes. Therefore separate standards for the different regions were created. Hans Adler 00:30, 30 January 2011 (UTC)[reply]
The question was about a variable-width encoding. I don't know the answer, but it's probably because a variable-width encoding would have been more of a hassle than several single-byte encodings. Among other things, terminal windows traditionally worked on a one-byte-per-character-cell principle, meaning that double-byte characters are twice as wide as single-byte characters. That's fine for Japanese kanji but not for accented Latin letters. -- BenRG (talk) 04:29, 30 January 2011 (UTC)[reply]
OK, I missed the "variable-width" bit, but I think historically it just doesn't make much sense in Europe. There wasn't much of an internet to speak of, and each European language itself could be represented with an ISO 8859 code that had much better compatibility with ASCII than any variable-width code could have had. As soon as the problem became more pressing, Unicode was under development anyway. Once you start moving everything to a variable-width (or fixed multi-byte) encoding in order to get all European languages on board at the same time, it is logical to include the rest of the world, or at least to be sufficiently extensible that it could be included. Hans Adler 17:31, 30 January 2011 (UTC)[reply]
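
To make the points above concrete, here is a minimal Python sketch. It is not from the discussion: the byte 0xE6, the four ISO 8859 parts, and the helper names `decode_in_each` and `utf8_widths` are chosen purely for illustration. It shows that an ASCII byte decodes identically in every ISO 8859 part, that the same upper-half byte means a different letter in each part, and that a variable-width encoding such as UTF-8 spends a different number of bytes on different characters.

```python
# Illustrative only: the sample byte and the choice of ISO 8859 parts are assumptions.
HIGH_BYTE = bytes([0xE6])  # one byte from the upper 128 code points
PARTS = ["iso8859_1", "iso8859_2", "iso8859_5", "iso8859_7"]


def decode_in_each(raw, codecs=PARTS):
    """Return {codec name: decoded text} for the same raw bytes."""
    return {name: raw.decode(name) for name in codecs}


def utf8_widths(text):
    """Return the number of UTF-8 bytes used by each character of text."""
    return [len(ch.encode("utf-8")) for ch in text]


if __name__ == "__main__":
    # ASCII range: every part agrees.
    print(decode_in_each(b"A"))
    # Upper half: each regional part assigns its own letter to the same byte.
    print(decode_in_each(HIGH_BYTE))
    # Variable width: 'a', e-acute and the euro sign take different byte counts.
    print(utf8_widths("a\u00e9\u20ac"))
```

Each ISO 8859 part stays one byte per character, which fits the one-byte-per-cell terminal model mentioned above; the price is that a byte stream alone does not say which regional part it was written in.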

Are there any legacy pan-European character encodings? --84.62.200.57 (talk) 14:43, 30 January 2011 (UTC)[reply]

If you include single-byte encodings, the Cork encoding might qualify, and there are probably others. There are no variable-width encodings to my knowledge. One of the CJK encodings may qualify, but JIS X 0208 doesn't even contain Latin-1. -- BenRG (talk) 23:28, 1 February 2011 (UTC)[reply]

Google

Does Google keep copies of its web cache, or donate them to internet archivers or something? Or does Google just overwrite or delete it when crawling the web? It would be a shame to think that possibly the most complete copy of the internet is not being preserved. 82.44.55.25 (talk) 22:37, 29 January 2011 (UTC)[reply]

Would Google have enough space to keep multiple copies of the internet? Fly by Night (talk) 03:16, 30 January 2011 (UTC)[reply]
I'm pretty sure that they would, yes. I don't know, but my guess is that they never delete old pages and that they do cooperate with organizations like the Internet Archive, because both of those strike me as typical Google behavior. -- BenRG (talk) 04:36, 30 January 2011 (UTC)[reply]
Is there any official word from google about what it does with its cache? 82.44.55.25 (talk) 14:39, 30 January 2011 (UTC)[reply]
The Internet Archive does store old versions of the things it archives. I don't think there'd be any benefit to Google giving them its cached copies, though; they can just retrieve those pages themselves, at their own pace. Paul (Stansifer) 22:16, 30 January 2011 (UTC)[reply]
I am aware of the Internet Archive, but it is much less complete than Google's cache in terms of coverage, at least in my experience. I assume the benefit to the Internet Archive would be the enormous amount of bandwidth saved, in the terabyte range, and the benefit to Google would be the same as with their Usenet archive: being able to say "we helped build this awesome archive". 82.44.55.25 (talk) 23:09, 30 January 2011 (UTC)[reply]
The data still has to make it to the Internet Archive somehow. Mailing hard drives back and forth might be cheaper than buying bandwidth (I know that it can be faster), but I don't think that it's worth all that much, given that (1) Google and the Internet Archive have different goals and thus different storage formats and update frequencies, (2) Google doesn't save images, but the Internet Archive does, and (3) the Internet Archive needs an immense amount of bandwidth anyways, since they host large video and audio files. Paul (Stansifer) 23:23, 30 January 2011 (UTC)[reply]

This speculation is interesting but is there any official word from google about what they do with their cache data? I tried searching their faq pages but couldn't come up with anything relevant myself. Maybe someone with better search skills can find something 82.44.55.25 (talk) 23:57, 30 January 2011 (UTC)[reply]

They won't say anything about it if they don't keep it. It's called a "cache" after all, and caches are by their nature temporary. Paul (Stansifer) 01:23, 31 January 2011 (UTC)[reply]
Why wouldn't they say anything about it? They document even the most mundane and obvious features of their search page with in-depth guides. 82.44.55.25 (talk) 12:29, 31 January 2011 (UTC)[reply]
It's not a feature to throw away data, and this data has little value, since other people could have acquired it, if they were so inclined. There are lots of things that Google doesn't do; listing them all would take forever. Paul (Stansifer) 14:34, 31 January 2011 (UTC)[reply]
Google Cache is a feature of Google. I cannot be the first person to wonder whether Google keeps versions of cached pages or just overwrites them with each new crawl. Maybe it seems obvious to you, but it doesn't to me. In the age of wiki-like revision histories and virtually unlimited hard drive space, it would be so easy for them to do it. And I strongly disagree with you that "the data has little value"; if people had had that attitude to books centuries ago, years and years of written history would have been lost forever instead of being preserved in libraries and archives. I thank you for your ideas and input to my question, but I really would rather have an official page from Google than speculation. 82.44.55.25 (talk) 14:56, 31 January 2011 (UTC)[reply]
Here are the services that Google makes available to the general public: Google Products. If you are a Google employee, a developer with a special agreement with Google, or if you work at a research or development organization affiliated with Google, numerous additional products and services are available for you. It is almost assured that Google has some extra caching, beyond that which it reveals through its public Search interface; but those extra services are probably both non-public and require technical skills to operate. Nimur (talk) 21:14, 31 January 2011 (UTC)[reply]

Internet Explorer active even though I have not used it

Ccleaner cleans my computer when I switch it on at least once a day. Sometimes, when I tell Ccleaner to clean my computer after using it for a while, it says it has cleaned Internet Explorer internet files and cookies. But I use Firefox. Internet Explorer is still on my computer, but I never use it. Why is Internet Explorer active even though I haven't used it? Thanks 92.28.244.55 (talk) 22:39, 29 January 2011 (UTC)[reply]

Many other programs use the Internet Explorer engine to render webpages, so using them will cause internet files and cookies to show up as if you had used Internet Explorer. Examples of this that I know of are FeedReader and Maxthon; there are many more. This is one possible reason why Ccleaner reports Internet Explorer usage even though you haven't directly used it. 82.44.55.25 (talk) 23:32, 29 January 2011 (UTC)[reply]
Windows' Help system also uses IE. Hit <F1> in Windows or any other Microsoft application and IE goes to work, though in disguise. Roger (talk) 13:52, 30 January 2011 (UTC)[reply]

What companies (aside from Google) keep the exact number of their servers a secret?

So I was watching Modern Marvels on the History Channel on their episode about 90's tech. On the part about Google, they mentioned that they keep the number of their servers a secret. While I had already known about that for some time, seeing it made me think "What other companies do it?" Is it a common strategy or does only Google keep the number of their servers a secret? Narutolovehinata5 tccsdnew 23:22, 29 January 2011 (UTC)[reply]

I know plenty of companies that don't even know the number of servers they are running. Does that count? --Stephan Schulz (talk) 23:36, 29 January 2011 (UTC)[reply]
I was thinking along similar lines myself. Large companies are constantly (read "daily") updating their servers, replacing them, noticing that they have died, combining small ones into bigger ones, geographically spreading them, reassigning the tasks each performs.... A question I would ask is, what do you mean by a server? HiLo48 (talk) 23:41, 29 January 2011 (UTC)[reply]
  • Stephan Schulz, that does not count. I was thinking of companies which actually explicitly say that the exact number of their servers is a secret, much like how China (and possibly Vietnam, North Korea, Cuba and other communist countries) explicitly mention that they keep the number of their executions a secret. As for HiLo48, our article defines a server as "a computer program running as a service, to serve the needs or requests of other programs (referred to in this context as "clients") which may or may not be running on the same computer, a physical computer dedicated to running one or more such services, to serve the needs of programs running on other computers on the same network, a software/hardware system (i.e. a software service running on a dedicated computer) such as a database server, file server, mail server, or print server.". Does that answer your question? Narutolovehinata5 tccsdnew 23:55, 29 January 2011 (UTC)[reply]
Er... you kind of lost me with the whole Communist-country thing. Regardless, I think it's normal for corporations to keep quiet about things they're not legally required to disclose. I don't know how many have explicitly said their number of servers is a secret, but I presume it's roughly the same as the number that have been asked. -- BenRG (talk) 03:43, 30 January 2011 (UTC)[reply]
Not HiLo48, but I doubt it. Are you including servers used only for internal usage? If someone shares files on their desktop computer, is that a server? Are you including specialised NAS, routers and firewalls? If not, are you including computers dedicated to running m0n0wall, pfSense and the like? If someone is running multiple VMs, each functioning as a server however you define it, does that count as one server or multiple? If you are, say, Microsoft, developing server software and OSes, and you have some computers used for testing which of course function as servers and are accessed over the network for testing purposes, do those count as servers? If a company has fairly lax policies about what you can do with their computers and someone temporarily runs a game server for their office LAN party, is that a server? Is a computer running a P2P client a server? If not, what if the company uses P2P to aid in distribution of their content and runs specialised computers for it: are these servers, and if yes, when does the switchover happen? These are just a small number of the questions you haven't answered if you want to define a server sufficiently to work out numbers. Nil Einne (talk) 12:02, 30 January 2011 (UTC)[reply]
I agree with BenRG. At any company, the default position is to keep everything secret which is internal to the company, unless there's a specific strategy underway to gain some sort of benefit by blabbing about the internal mechanisms of the company. Comet Tuttle (talk) 18:32, 31 January 2011 (UTC)[reply]