Wikipedia:Reference desk/Archives/Computing/2020 December 27

Computing desk
Welcome to the Wikipedia Computing Reference Desk Archives
The page you are currently viewing is a transcluded archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


December 27


Encoding problem in text document recovery


I kept a large number of Hebrew text documents on a hard disk with a certain encoding (I do not remember whether it was UTF-8, Unicode, or something else), and I accidentally formatted the disk, which erased all of its contents.

I used recovery software that was able to recover photos, movies and more. However, the Hebrew text documents were all restored with ANSI encoding, which turned everything into gibberish.

Is it possible to return them to the original state in Hebrew to be readable? 37.142.49.192 (talk) 09:59, 27 December 2020 (UTC)[reply]

Is the new encoding ISO/IEC 8859-1? There is not enough information for us to determine whether the conversion was round-trip. The software may have used a round-trip method or one that threw away bytes it did not grok. UTF-8 is a way (and the most common way) to encode Unicode in a file. Which software did you use to read the Hebrew text documents? Were the documents in Modern Hebrew or Classical Hebrew? If you could paste in a paragraph or two, code breakers among us might try to make sense of the gibberish. It would obviously help tremendously if you can associate some fragment with its equivalent in plaintext.  --Lambiam 00:39, 28 December 2020 (UTC)[reply]
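To illustrate the round-trip behaviour Lambiam describes, here is a small Python sketch (the sample word and the encodings involved are assumptions for illustration): UTF-8-encoded Hebrew misread as ISO/IEC 8859-1 turns into Latin gibberish, but as long as no bytes were discarded, the misreading can be undone.

```python
# Hebrew text, assumed to have originally been saved as UTF-8.
original = "שלום"
utf8_bytes = original.encode("utf-8")

# A tool that misinterprets these bytes as ISO/IEC 8859-1 ("Latin-1")
# produces gibberish -- but no information is lost, because every byte
# value maps to some Latin-1 code point (a round-trip conversion).
gibberish = utf8_bytes.decode("latin-1")
print(repr(gibberish))  # '×©×\x9c×\x95×\x9d'

# Undo the misinterpretation: re-encode as Latin-1, decode as UTF-8.
recovered = gibberish.encode("latin-1").decode("utf-8")
print(recovered == original)  # True
```

A conversion that instead dropped or replaced bytes it did not recognize would not be reversible this way, which is why the question of round-trip behaviour matters.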
Lambiam, Thanks for the help.
1. In most cases, the new encoding looks like ISO/IEC 8859-1, but there are also documents with other odd encodings.
2. The text documents in Hebrew were originally created as standard text documents on a PC. A few months ago, when I transferred them to my current computer (a Mac) and opened them in a text editor, everything became gibberish. So I went back to the PC and changed the encoding of the documents (to UTF-8 or Unicode), and after that the transfer to the Mac was fine.
3. The documents were in modern Hebrew (in some cases, parts of the text were in English or Arabic).
4. I attach photos for comparison:
The three lines in Hebrew here: [image],
were at the beginning of the document that underwent restoration and became gibberish here: [image].
מנחם.אל (talk) 17:16, 29 December 2020 (UTC)[reply]
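The PC-to-Mac step described in point 2 amounts to transcoding from the Windows "ANSI" code page to UTF-8. A minimal sketch, assuming the PC documents used windows-1255 (the Windows code page for Hebrew; the actual code page is not stated above):

```python
def transcode(data: bytes, src: str = "windows-1255", dst: str = "utf-8") -> bytes:
    """Re-encode bytes from one text encoding to another.

    The source encoding windows-1255 is an assumption for illustration;
    the actual documents may have used a different code page.
    """
    return data.decode(src).encode(dst)

# Example: the word שלום in windows-1255 is the four bytes F9 EC E5 ED.
cp1255_bytes = bytes([0xF9, 0xEC, 0xE5, 0xED])
utf8_bytes = transcode(cp1255_bytes)
print(utf8_bytes.decode("utf-8"))  # שלום
```

Note that this only works on intact bytes; if the recovery software already replaced unrecognized bytes, the original text cannot be reconstructed by transcoding alone.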
In case it helps prospective code breakers, here is a list of the Unicode characters comprising the words displayed in the Hebrew alphabet, rendered in ASCII as HTML entities:
  1. סכסוך = &#x05e1;&#x05db;&#x05e1;&#x05d5;&#x05da;
  2. צבאי  = &#x05e6;&#x05d1;&#x05d0;&#x05d9;
  3. מערכה = &#x05de;&#x05e2;&#x05e8;&#x05db;&#x05d4;
  4. מלחמת = &#x05de;&#x05dc;&#x05d7;&#x05de;&#x05ea;
  5. הסאהל = &#x05d4;&#x05e1;&#x05d0;&#x05d4;&#x05dc;
  6. מלחמה = &#x05de;&#x05dc;&#x05d7;&#x05de;&#x05d4;
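A list like the one above can be produced mechanically; this small sketch renders each character of a Hebrew word as a 4-digit hexadecimal HTML entity, matching the entries listed:

```python
def to_entities(word: str) -> str:
    """Render each character as a 4-digit hexadecimal HTML entity."""
    return "".join(f"&#x{ord(ch):04x};" for ch in word)

print(to_entities("סכסוך"))
# &#x05e1;&#x05db;&#x05e1;&#x05d5;&#x05da;
```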
I see no feasible way to produce a similar digital version of the stream of characters from the second image; it stumps any OCR apps I know of. Can you copy the text into your copy buffer and paste it here? One would expect vanilla ASCII to escape conversion unscathed; it is not promising that we do not see [[...]] anywhere. Or does this occur in the parts that are cut off in the image?  --Lambiam 18:05, 30 December 2020 (UTC)[reply]
Lambiam, in the meantime I have seen that I have the same problem with PHP files as well. I will attach text from a PHP file, and the same text in its gibberish form, here:
Original PHP code (I did not copy the whole file):
<?php

require __DIR__ . '/../../vendor/autoload.php';
use Elasticsearch\ClientBuilder;

$url=$_SERVER['REQUEST_URI'];
$parts = parse_url($url);
parse_str($parts['query'], $params);

$from = $params['from'];
$to = $params['to'];
$step = $params['step'];


Code in gibberish display:


3È uØáàT´|�´U|¥≠Ë2��&N'Eï‡Ñq�—2¿7ãFV<�çåpäh˛yxx.�H≠�Yªflfi ⁄MV+6«ó›Y:√�5�÷‚HºK]±v«„p∏ÔΩ)¢;?∆7™_ç�ΩπkÒ|5Õ¥.˝�>°T™§—#�ˇ’ÍTíıÿ��dR∏}ÿÁ–rÀF<-`�È�«ì�&∏O÷“Áô|U+FªÒ�¿
á=Ö‚ùXP®zö|ºc¸˜/ ¶LDUføû≤ƒî¥(�_o9�öΩ√∂
ÇRué∆•9�Å�j��˘"∂}fl�hAgŒ�∞¿��÷‚�B>fi∆ÒÑ�2≠áè?@6üâJWOe’�Ìüd$_ΩÌ∆#Bo…0∞C�[ÓÈß
ûÈ?˜wZøpŒiV6�¡Ñ�≤ ?€Ì≤ã-8OΩÇ iˇ≥b„�I\U<�òO�Üxø[π õqs˙ƒ÷YBu[a˜ú5m≥�«ÿËπ:∆ôK©˙óR� úÚ€‚Dúflzæ�õõ¥ö«÷™∑Æ¢°)—Ú§˝w�mõ˝$Ɇ-FAúUŸÕÈW,'t?]ì#ñ.lVÏñâ=y÷ŒÔå∑YøP�@A©Å<Ôÿ/∂oê�fŒ�kˇ��„ࢬeí$IìºÍO-“Ê
C≈£]î��∏
Ê�fly]P%{á√¡FËäX�Vä‰˙?é¨Ü8�1DÚ¶ãÓ≤ˇ÷^\Iπ¿��f©ï™›fi<u fi◊MÉSA™WÏ⁄̺2`s˘oXZfl£Ãª��l?™�€�&ı~q�´r·;,˛©S�˝g£C»z=∑ôÙŒ-h≤˚.µK†xÇBŸ¥�‹£áQ∞/Ûƒ\¡ÄCé?¶?� âyk p�ÑzM�∑�Ôw/C Ωm|Ô¡�ò≈�C_Ì|`»°


You can see more details on this question (and a view of the gibberish) at Stack Overflow מנחם.אל (talk) 11:39, 1 January 2021 (UTC)[reply]
I hope that this has inspired you to assemble a rigorous program of online+offline backup and recovery, so that in the future, you will not need to rely on forensics to rescue your valuable files. Elizium23 (talk) 14:32, 1 January 2021 (UTC)[reply]
Unfortunately, I can offer virtually no hope. What we see here cannot be the result of a simple encoding mismatch: all conventional text encodings leave the original ASCII characters untouched. In what follows, byte values are rendered as 2-digit hexadecimal numbers, from 00 to FF. Anyone wishing to analyze this should look at the wikitext, because it contains tab characters (hex 09) that are displayed as spaces. I tallied the occurrences of the byte values in the gibberish. The distribution is extremely skewed; clearly, the gibberish is not a product of data compression. Of the 256 possible byte values, 92 do not occur at all, including 00–08. In a random sequence of the same length, the expected value for the number of byte values with zero occurrences is only 2.68. In the range D0–FF only two byte values occur: 70 × E2 and 77 × EF. The absolute winner is 133 × C3, with the second place for, ex aequo, 77 × BF and EF (already mentioned), with bronze for E2 (also already mentioned). The average value of the 256 tallies is 4.57, so these tallies are truly exceptional outliers. The only glimmer of hope is that these bytes are apparently not just random, but I have no ideas for a process that could explain all this.  --Lambiam 16:50, 1 January 2021 (UTC)[reply]
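A tally of this kind takes only a few lines of Python. The sketch below uses made-up random data, since the exact recovered bytes are not reproduced here; the length of about 1170 bytes is inferred from the average tally of 4.57 quoted above (4.57 × 256 ≈ 1170).

```python
import collections
import random

def tally_report(data: bytes):
    """Count byte-value frequencies and compare against a uniform model."""
    counts = collections.Counter(data)
    zero = 256 - len(counts)  # byte values that never occur
    # Expected number of never-occurring values in a uniformly random
    # sequence of the same length: 256 * (255/256)**len(data).
    expected_zero = 256 * (255 / 256) ** len(data)
    average = len(data) / 256  # mean tally per byte value
    return zero, expected_zero, average

# Illustrative only: random data of roughly the implied length.
random.seed(0)
sample = bytes(random.randrange(256) for _ in range(1170))
zero, expected_zero, average = tally_report(sample)
print(zero, round(expected_zero, 2), round(average, 2))
```

For truly random data the observed count of never-occurring byte values stays close to the expected value near 2.7; the 92 missing values in the gibberish are what make its distribution so strikingly non-random.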

Save location of call logs (Android phone)


Where is the list of the last calls stored in an Android phone - or where can it possibly be stored? Motivation of the question: I have found a phone with SIM card removed and with the last call on the list displayed, say, in September. Assuming that there has not been an active deletion of entries on the list - can I assume that the phone has not been used for calls after September? Or may the list of later calls be gone with the SIM that had been removed? --Anonymous question (talk) 11:31, 27 December 2020 (UTC)[reply]

Do I understand the purpose of your question correctly? You found a phone, and instead of returning it, you are snooping around in it? ◅ Sebastian 13:53, 29 December 2020 (UTC)[reply]
They might not know whose it is. Trying to figure out when it was lost might help.--47.152.93.24 (talk) 02:36, 31 December 2020 (UTC)[reply]
SIMs store at most contact information and a handful of last-dialled numbers, not the phone's call log. Call logs are kept in internal storage along with all the other data on the phone. If you can't figure out whose it is, you could try contacting the mobile carrier if it's registered to one. Otherwise turn it over to law enforcement I guess. --47.152.93.24 (talk) 02:36, 31 December 2020 (UTC)[reply]

What is the SRAM cell size of an Intel 4004? (micron^2)


I am only finding mentions of later cell sizes with "Intel 4004" coincidentally on the page. Sagittarian Milky Way (talk) 16:36, 27 December 2020 (UTC)[reply]

It was manufactured using a 10 μm process, but information about the cell size itself is patchy. Ruslik_Zero 20:41, 27 December 2020 (UTC)[reply]