Wikipedia:Reference desk/Archives/Computing/2020 March 16

Computing desk
Welcome to the Wikipedia Computing Reference Desk Archives
The page you are currently viewing is a transcluded archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


March 16


Binary splitting encoding/decoding recursive idea for data compression. Will it work?


Let's say we have two sets of possible digits 0,1,2,3, n we want to encode them in an efficient way from a data compression point of view. To be able to decode the higher rank digit of this set of two encoded digits, we might need the encoding counting base to be only 3.5, for example, and I admit that for the lower digit we might run out of some precision. Anyway, what I am trying to suggest is that the order of the encoding/decoding actions may offer us the possibility of compressing data. For example, given some string of 2^n digits that are already encoded in some efficient way, we might be able to decode it by first splitting it into 2 strings of 2^(n-1) digits, using some formulas that can be just a lil bit more complicated than the classic ones that can usually be applied for such issues. Alrite, no need to thank me, no need to send me to h--k, just try to tell me how far I might be from a decent way of judging things, cz I am already about one week late for my Clopixol shot, n I feel like doing something about that too. Thank You, Florin747 (talk) 07:59, 16 March 2020 (UTC)

The question is in need of some clarifications. Do you mean a set as in mathematics (no duplicate elements) or a list of items? What are the items? Is each one a choice from the four-element set {0, 1, 2, 3}? Why two sets? Is this different than compressing each set individually? Otherwise this might as well be about seven sets, or 42 sets. Data compression only works if the data are not uniformly random, like for English text in which ation or icall are common letter sequences, but ngvba and vpnyy are not. (Also, could you take the effort of spelling out the words properly?) --Lambiam 15:54, 16 March 2020 (UTC)
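
One rough way to check the "not uniformly random" condition is to estimate the entropy per digit of a sample. The sketch below is only an illustration: the function name `entropy_bits_per_digit` is made up here, and it assumes the digits are drawn independently, so it does not capture sequence patterns such as "ation" in English text.

```python
from collections import Counter
from math import log2


def entropy_bits_per_digit(digits: str) -> float:
    """Empirical Shannon entropy of a digit string, in bits per digit.

    Assumption (not from the thread): digits are treated as independent
    draws, so only their relative frequencies matter.
    """
    counts = Counter(digits)
    total = len(digits)
    return -sum((c / total) * log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # All four digits equally frequent: 2 bits per digit, nothing to gain.
    print(entropy_bits_per_digit("0123" * 4))
    # Skewed frequencies: fewer than 2 bits per digit, so compression is possible.
    print(entropy_bits_per_digit("0001" * 4))
```

With four equally likely digits the estimate is 2 bits per digit, which is exactly what the plain base-4 representation already uses; only a skewed or patterned source leaves room for compression.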

Alrite, I took My Clopixol yesterday, I feel more relaxed now about accomplishing that task of mine. I think I was talking about some counting system with four digits, 0,1,2,3, and also some counting base b with 3<b<4, which could allow us some data compression on the condition of considering some special order for decoding digits, the one that I put above about splitting a string/encoded number of 2^n digits into 2 strings of 2^(n-1) digits, assuming that is actually possible. I mean, this last part seems to be the hardest part of this paradigm. If it works, it may do data compression over and over, no matter how many times it is repeated, but this seems highly unlikely for some simple idea like this one. Thank You for staying in touch, and please forgive me about my English, never was my strong point. Florin747 (talk) 06:59, 17 March 2020 (UTC)

If n > 0, then 2^n = 2^(n−1) + 2^(n−1), so you can split a string of 2^n digits into two strings of 2^(n−1) digits by splitting it in the middle: 23310012 → (2331, 0012), or in fact in any of a number of ways, such as "unzipping" by taking alternate digits: 23310012 → (2301, 3102). Again, as I said, you need some property of the input source that is not quite random for data compression to be possible. Nothing in what you write contains a suggestion in that direction. --Lambiam 16:12, 17 March 2020 (UTC)
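
The two ways of splitting mentioned in the reply above can be sketched in a few lines. The function names (`split_middle`, `unzip`, `zip_back`) are chosen here for illustration only; neither split changes the total number of digits, so neither compresses anything by itself.

```python
def split_middle(s: str) -> tuple[str, str]:
    """Split an even-length digit string into its first and second half."""
    half = len(s) // 2
    return s[:half], s[half:]


def unzip(s: str) -> tuple[str, str]:
    """Split a digit string by taking alternate digits."""
    return s[0::2], s[1::2]


def zip_back(a: str, b: str) -> str:
    """Inverse of unzip for two strings of equal length."""
    return "".join(x + y for x, y in zip(a, b))


if __name__ == "__main__":
    print(split_middle("23310012"))  # ('2331', '0012')
    print(unzip("23310012"))         # ('2301', '3102')
    print(zip_back("2301", "3102"))  # 23310012
```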

Hi, I guess I was trying to state the problem in its most general form, I mean for random numbers too. I am not so sure that it will work for the random data case. In any case, I guess it may serve as a study for students, trying to find out what could go wrong when decoding some large number (by splitting it) under the given condition of some k digits and a counting base b, k-1<b<k, that are just about enough to ensure that the maximum possible value of the lower half code, at least at first look, shouldn't interfere with the value of the higher half code. Anyway, it is just a study, something like searching for the impossible, maybe just good enough to make students wonder about some possibilities. Thank You, Florin747 (talk) 07:40, 18 March 2020 (UTC)

... I guess that a good intermediate problem to solve first is to see whether the possible solution to the above problem, in the case of k=4 digits (0,1,2,3) and a counting base of b=4.1, which should normally be able to provide decent decoding by serialisation, might also work for the variant of splitting a larger code into its two halves. If that works, THEN we might be able to try the harder problem that we had first, k-1<b<k. Alrite, thank You for helping me make these clearer in my mind. I guess that You are welcome with any more help. Thank You, Florin747 (talk) 08:13, 18 March 2020 (UTC)

My oh my, am I some dizzy mind from that Clopixol action, or what. Anyway, I tried my best to replace "k<b<k+1" with the one that You can see a few lines above. I am truly sorry about the mistake(s). Thank You! Florin747 (talk) 12:55, 18 March 2020 (UTC)
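
For anyone who wants to try the "study" suggested in this thread, here is a small counting sketch. It rests on an assumption that is not spelled out in the posts: a word of digits from {0,1,2,3} is encoded positionally in base b and then stored as a whole number by rounding down. The function name `distinct_codes` is made up for this example. It counts how many different codes come out; if that count is smaller than the number of input words, some words can no longer be told apart when decoding.

```python
from fractions import Fraction
from itertools import product
from math import floor


def distinct_codes(base: Fraction, length: int = 2, digits=range(4)) -> int:
    """Count distinct integer codes for all digit words of the given length.

    Assumption (not from the thread): each word d1 d2 ... is encoded as the
    positional value in `base`, then rounded down to a whole number.
    """
    codes = set()
    for word in product(digits, repeat=length):
        value = Fraction(0)
        for d in word:
            value = value * base + d  # Horner evaluation, exact arithmetic
        codes.add(floor(value))
    return len(codes)


if __name__ == "__main__":
    print(distinct_codes(Fraction(7, 2)))    # base 3.5, two digits
    print(distinct_codes(Fraction(4)))       # base 4, two digits
    print(distinct_codes(Fraction(41, 10)))  # base 4.1, two digits
```

Under this assumption, base 3.5 yields only 14 distinct codes for the 16 possible two-digit words (for instance 03 and 10 both land on 3), while base 4 and base 4.1 keep all 16 apart but need the full range 0..15 and therefore save nothing. That is the pigeonhole argument in miniature: 16 equally likely inputs cannot fit into fewer than 16 codes.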

Pi symbol


It runs in my mind that some Greek letters (at least pi and mu) have two identical-looking versions that are different Unicode characters: one is the actual letter, and the other is the mathematical symbol. If this is true, is π the letter or the symbol? And how would I learn which one it is? Note that the π character appears in the Pi (letter) article, but it links to Pi, which represents the mathematical constant. Nyttend backup (talk) 15:27, 16 March 2020 (UTC)

The articles Pi (letter) and Mu (letter) each have a section on character encodings. If you use your browser's search function on these pages after copying the character into its search box, it will direct you to the version in question. An online service for identifying Unicode characters in general is offered by BabelStone; probably there are others as well. --Lambiam 16:21, 16 March 2020 (UTC)
Pi actually hasn't got separate Unicode characters, as far as I know. The symbol for the mathematical constant is supposed to be encoded by the normal Greek textual letter (i.e. U+03C0 π GREEK SMALL LETTER PI). There are a few alternates, such as U+03D6 ϖ GREEK PI SYMBOL, U+1D70B 𝜋 MATHEMATICAL ITALIC SMALL PI etc., but those are for other specialized purposes. In the case of Mu, there's U+00B5 µ MICRO SIGN, but as far as I understand its presence in Unicode is mostly due to legacy compatibility considerations (our article on Micro-#Symbol encoding in character sets has some of the gritty details). Fut.Perf. 21:57, 17 March 2020 (UTC)
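
Another way to learn which character you have is to ask a Unicode database directly. A minimal sketch using Python's standard `unicodedata` module follows; the helper names `describe` and `folds_to` are invented for this example, and the characters listed are the ones mentioned in the replies above plus U+03BC, the ordinary Greek small letter mu.

```python
import unicodedata


def describe(ch: str) -> tuple[str, str]:
    """Return the code point (as U+XXXX) and the Unicode name of a character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch)


def folds_to(ch: str) -> str:
    """Return the character after NFKC compatibility normalization."""
    return unicodedata.normalize("NFKC", ch)


if __name__ == "__main__":
    for ch in ["\u03c0", "\u03d6", "\U0001d70b", "\u00b5", "\u03bc"]:
        code, name = describe(ch)
        print(code, name, "-> NFKC:", describe(folds_to(ch))[0])
```

Pasting a character copied from a page in place of one of the escapes shows which variant it is. Compatibility normalization (NFKC) maps MICRO SIGN to the Greek letter mu, which fits the remark that the micro sign exists mostly for legacy reasons.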

Bioinformatics OSS needed for COVID-19 research


What open-source bioinformatics software projects are important for COVID-19 research and in need of new contributors? NeonMerlin 16:49, 16 March 2020 (UTC)

This could be a good start. Wylie39 (talk) 17:53, 16 March 2020 (UTC)
@NeonMerlin: Does Folding@Home fit what you are looking for? See this: [1]. RudolfRed (talk) 19:28, 16 March 2020 (UTC)
Thanks. Both of those seem like good fits. NeonMerlin 16:17, 17 March 2020 (UTC)
Here is another project/dataset: White House seeks AI help in answering coronavirus questions, COVID-19 Open Research Dataset (CORD-19), COVID-19 Open Research Dataset Challenge (CORD-19) StrayBolt (talk) 16:59, 17 March 2020 (UTC)