Wikipedia:Reference desk/Archives/Mathematics/2021 December 5



December 5


Improving secrecy


This question is inspired by quantum cryptography, but you don't need to be familiar with that.

Alice and Bob have a method for generating a sequence of random bits $b_1, b_2, \ldots, b_n$ that they both know, such that Eve only knows 50% of them. More formally, each $b_i$ is independently uniformly distributed, and Eve has probability 1/2 of knowing the value of $b_i$ (again, independent of anything else). For each bit, Eve knows whether or not she knows it.

If Alice and Bob want Eve to know fewer, they can split the sequence into two and perform bitwise addition mod 2. So $c_i = b_i \oplus b_{i+n/2}$, for $i = 1, \ldots, n/2$. Then Eve has probability 1/4 of knowing any given $c_i$, but the sequence is only half as long.

In general, Alice and Bob can create a sequence of length $n/k$ where Eve has probability $2^{-k}$ of knowing any given bit. Note that if $\ell$ is the length of the sequence and $p$ is Eve's probability of knowing a bit, this maintains $p^\ell = 2^{-n}$.

Now the question: Can Alice and Bob do better? Is there a process such that $p^\ell < 2^{-n}$? I suspect the answer is no, but I don't know how to show this. --72.80.81.137 (talk) 21:02, 5 December 2021 (UTC)
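A minimal Monte Carlo sketch of the block-combining construction (illustrative Python added for this archive, not part of the original question; the function name and the model of Eve's knowledge are my own):

import random

def construction(n, k, rng=random):
    """Group n original bits into blocks of k and XOR each block.
    For the statistic only Eve's knowledge pattern matters: she knows an
    output bit exactly when she knows every input bit of its block."""
    known = [rng.random() < 0.5 for _ in range(n)]          # bits Eve knows
    out_known = [all(known[i:i + k]) for i in range(0, n - k + 1, k)]
    return len(out_known), sum(out_known) / len(out_known)

for k in (1, 2, 3, 4):
    length, p = construction(1_000_000, k)
    # p should be close to 2^-k while length = n/k, so p^length = 2^-n.
    print(f"k={k}: length={length}, Eve knows {p:.4f} (expected {2**-k:.4f})")

The observed fraction tracks $2^{-k}$, illustrating how the invariant $p^\ell = 2^{-n}$ is preserved under this family of constructions.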

I should add: it's important that Eve's probability of knowing a given bit remains independent of other bits. Otherwise you could do much better. 72.80.81.137 (talk) 21:46, 5 December 2021 (UTC)
I assume that Eve knows the recipe. In the most general formulation, each output bit is produced by a different function, each of which is a function of some subsequence (not necessarily a contiguous segment) from the original sequence, containing the bits that are used in the computation. So taking the output to be $c_1, c_2, \ldots, c_\ell$, we could have $c_j = f_j(b_{j_1}, b_{j_2}, \ldots)$, in which each input bit is "live", meaning it may influence the outcome. Assume that for some pair $j \ne k$, $f_j$ and $f_k$ share some bit $b_i$, so $c_j = f_j(\ldots, b_i, \ldots)$ and $c_k = f_k(\ldots, b_i, \ldots)$. If we know that Eve knows $c_j$, this increases the likelihood she knows $c_k$ as well. This will not do for a solid proof, but the requirement of knowledge independence seems to imply that input bits cannot be "reused", but that each can contribute to one output bit only. This would mean that you cannot do better than $p^\ell = 2^{-n}$. --Lambiam 22:38, 5 December 2021 (UTC)
Thinking about how to formalize your argument, I came up with the following: Consider the probability space $\{0,1\}^n$, where an element indicates which bits of the original sequence are known to Eve. The measure of a singleton is $2^{-n}$. The probability of Eve knowing every bit of the new sequence is $p^\ell$, so there must be at least one point from the space giving this outcome, meaning $p^\ell \ge 2^{-n}$, which gives $\ell \log_2(1/p) \le n$, as desired. However, this is assuming that which bits Eve knows of the new sequence depends only on which she knew of the old sequence, and not on the values of the bits in the old sequence. I can't find a way to incorporate the values of the original bits without breaking independence, but I'm not seeing an argument that it's impossible. 72.80.81.137 (talk) 05:35, 6 December 2021 (UTC)
Based on the view given by $c_j = f_j(b_{j_1}, b_{j_2}, \ldots)$, Eve knows $c_j$ if and only if its value is the same for all assignments to $b_1, \ldots, b_n$ that agree with what Eve already knew of the original sequence – which I understood to mean that she knows the actual values of $b_i$ for $i \in K$, where set $K$ is some set of indices, specific to Eve and known to her, the size of which is about $n/2$. This means that the knowability of $c_j$ does not depend on any values of bits of the original sequence that are not among those she is already privy to. If $c_j = (b_1 \wedge b_2) \vee (\neg b_1 \wedge b_3)$ and $b_1$ and $b_2$ are in $K$ but $b_3$ is not, then Eve knows $c_j$ iff $b_1 = 1$. So knowability of an output bit can depend on the actual value of a (known) input bit. --Lambiam 10:52, 6 December 2021 (UTC)

Density of logically valid statements


Let's say I can encode each statement in propositional calculus, for example by letters, and ignore syntactically invalid statements (for example a->->b). I ask: what is the density of the valid statements (among strings of size not larger than n)? --Exx8 (talk) 23:04, 5 December 2021 (UTC)

I don't think you're going to be able to come up with a satisfying expression in terms of n for a fixed n; it depends too much on the details of the coding.
You might have more luck with the asymptotic behavior — I would be strongly tempted to conjecture that logically valid statements have asymptotic density zero given any reasonable coding. That's basically because a valid statement has to be true in every structure of a given signature, even without imposing any non-logical axioms whatsoever on the structure. That seems like an unusual condition that's going to get more and more unusual as statements get more complex.
However I don't have either a proof or a source; maybe someone else does. --Trovatore (talk) 23:51, 5 December 2021 (UTC)

Oh, I just noticed that you specified the propositional calculus, whereas I was thinking of first-order predicate calculus. My conjecture is still the same, though. A valid statement has to be true no matter what values you assign to the propositional variables. Again that seems like something that's not going to happen very often. --Trovatore (talk) 23:53, 5 December 2021 (UTC)
Among the formulae of length 3, in which case it does not matter whether the notation is infix or Polish, when there are $v$ propositional variables, the density of tautologies is $2v+5$ over $(v+2)(2v+5)$, that is, $\tfrac{1}{v+2}$. --Lambiam 00:34, 6 December 2021 (UTC)
At least, this is the case using only logical constants $\top$ (true), $\bot$ (false), $\neg$, $\wedge$ and $\vee$. The density $\tfrac{1}{v+2}$ then also applies for formulae of length 1 and 2, making it plausible that this may remain constant for longer formulae. Adding $\rightarrow$ may make a difference, but if the $\tfrac{1}{v+2}$ conjecture holds for formulae with the original set of five constants, this too will have a non-vanishing asymptotic density. --Lambiam 04:02, 6 December 2021 (UTC)
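A brute-force check of the $\tfrac{1}{v+2}$ density for lengths 1 to 3 (an illustrative Python sketch; the prefix encoding with N, K, A for $\neg$, $\wedge$, $\vee$ is my own choice of concrete syntax, not from the thread):

from itertools import product

def check(v=3):
    variables = [f"p{i}" for i in range(v)]

    def is_tautology(f):
        return all(f(dict(zip(variables, vals)))
                   for vals in product([False, True], repeat=v))

    # Length-1 formulae: the variables plus the constants T and F.
    def atom(a):
        return lambda env: True if a == "T" else False if a == "F" else env[a]

    len1 = [(a, atom(a)) for a in variables + ["T", "F"]]
    # Length-2 formulae: a negated atom.
    len2 = [("N" + a, lambda env, f=f: not f(env)) for a, f in len1]
    # Length-3 formulae: a double negation, or a binary connective on two atoms.
    len3 = [("NN" + a, lambda env, f=f: f(env)) for a, f in len1]
    for (a, fa), (b, fb) in product(len1, repeat=2):
        len3.append(("K" + a + b, lambda env, fa=fa, fb=fb: fa(env) and fb(env)))
        len3.append(("A" + a + b, lambda env, fa=fa, fb=fb: fa(env) or fb(env)))

    for label, fs in (("length 1", len1), ("length 2", len2), ("length 3", len3)):
        t = sum(1 for _, f in fs if is_tautology(f))
        print(f"{label}: {t}/{len(fs)} tautologies;"
              f" density {t/len(fs):.4f}, 1/(v+2) = {1/(v+2):.4f}")

check()

For v = 3 this reports 1/5, 1/5 and 11/55 = 1/5, agreeing with the counts above.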
With $v$ propositional variables there are only $2^{2^v}$ propositional functions, so sure, if you cap the number of variables, it makes sense that you would have nonzero asymptotic density (maybe equal to $1/2^{2^v}$?). But that doesn't seem like the natural reading of the question. As the size grows without bound, the number of variables would also be expected to grow without bound. --Trovatore (talk) 06:27, 6 December 2021 (UTC)
Then we need to be precise as to the formal language. In most texts, including our own article, the set of propositional variables is infinite. So then, of the infinitely many formulae of length 1, only one is a (trivial) tautology. For formulae of length greater than 2, we additionally need to specify a distribution over the formulae; otherwise one can enumerate them so as to asymptotically achieve any desired density between 0 and 1. --Lambiam 07:40, 6 December 2021 (UTC)
Can you clarify whether you’re asking about the density of syntactically valid statements (i.e. well-formed formulae) or logically valid statements (i.e. tautologies)? Regarding Lambiam’s point about variable names, one natural approach would be to consider two strings equivalent if they differ only in variable names. Quick calculation assuming this equivalence: Suppose we only have conjunction and negation, as well as parentheses and variables (although I suspect the end result will be similar regardless of the details of the encoding). Let $C_n$ be the number of syntactically valid formulas of length $n$. Then $C_n$ is approximately $\sum_k C_k C_{n-k-3} + C_{n-1}$ (the first term coming from making conjunctions, and the second from negations). This should grow around $2^n$. The number of total strings will grow at least like $4^n$, since we have four non-variable symbols. So the density of syntactically valid formulas will go to 0 exponentially fast. JoelleJay (talk) 22:29, 6 December 2021 (UTC)
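The recurrence is easy to tabulate; a quick sketch (illustrative Python; it counts formulas over a single variable symbol x, so it tracks the shape of the growth rather than the exact renaming-equivalence count):

def formula_counts(n_max):
    """C[n] = number of formulas of length n built from a variable x,
    negation !f (length |f|+1) and conjunction (f&g) (length |f|+|g|+3)."""
    C = [0] * (n_max + 1)
    C[1] = 1
    for n in range(2, n_max + 1):
        C[n] = C[n - 1]                                            # negations
        C[n] += sum(C[k] * C[n - 3 - k] for k in range(1, n - 3))  # conjunctions
    return C

C = formula_counts(40)
for n in (10, 20, 30, 40):
    print(f"n={n}: C_n = {C[n]}, C_n/C_(n-1) = {C[n]/C[n-1]:.4f}, "
          f"C_n/4^n = {C[n]/4.0**n:.3e}")

The ratio $C_n/C_{n-1}$ settles near 2, consistent with growth around $2^n$ and hence with a density against the at least $4^n$ strings that vanishes exponentially.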
(And logically valid formulas go to 0 even faster.) JoelleJay (talk) 22:46, 6 December 2021 (UTC)
@JoelleJay: As I read the question, it wants to know the frequency of logically valid formulas as a fraction of the number of syntactically valid formulas -- in other words, a quantification of how much faster "even faster" is. --JBL (talk) 16:35, 7 December 2021 (UTC)
Looking at the original question again I agree with you, somehow I skipped over the "ignore" before "syntax invalid statements". I still think it should go to 0 very quickly. JoelleJay (talk) 17:22, 7 December 2021 (UTC)
I ran a little experiment. For simplicity, I used an encoding with a single operator, the Sheffer stroke (NAND), which I'll denote by the symbol §. As suggested, formulae that are interconvertible by variable substitution are identified with each other, so a§(b§a) ≡ c§(a§c). The formulae a§(b§b) and (b§b)§a are not identified, even though interconvertible by appealing to the commutativity of §. Experimental results:
 n :     x      =        T /           S
 1 : 0.00000000 =        0 /           1
 2 : 0.00000000 =        0 /           2
 3 : 0.20000000 =        2 /          10
 4 : 0.00000000 =        0 /          75
 5 : 0.05494505 =       40 /         728
 6 : 0.00821018 =       70 /        8526
 7 : 0.02052452 =     2376 /      115764
 8 : 0.00662590 =    11768 /     1776060
 9 : 0.00871389 =   263510 /    30240210
10 : 0.00405652 =  2287352 /   563870450
11 : 0.00406548 = 46335296 / 11397261720
(Legend: n denotes the number of slots for variables, and x is the result of computing the fraction T/S, where T is the number of tautologies and S the number of syntactically valid formulae.)
The sequence $S_n$ is OEIS sequence A289679. The density sequence $x_n$ is seen to oscillate, but I conjecture that it will eventually descend monotonically. The results appear to confirm the hypothesis that the density goes to zero, but do not suggest an order of decrease, such as $\sim n^{-\lambda}$ or $\sim \exp(-\lambda n)$. --Lambiam 16:53, 9 December 2021 (UTC)
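A sketch of how such an experiment can be reproduced (illustrative Python, not Lambiam's actual program; it enumerates binary §-trees by number of leaves and canonicalises variable names via restricted-growth strings, which implements the identification up to renaming described above):

from itertools import product

def trees(n):
    # All binary trees with n leaves; a leaf is None, an inner node a pair.
    if n == 1:
        return [None]
    return [(left, right)
            for k in range(1, n)
            for left in trees(k)
            for right in trees(n - k)]

def labellings(n):
    # Restricted-growth strings of length n: variable labels in canonical
    # first-occurrence order, so renamings of variables are identified.
    def go(prefix, used):
        if len(prefix) == n:
            yield tuple(prefix)
        else:
            for v in range(used + 1):
                yield from go(prefix + [v], max(used, v + 1))
    yield from go([], 0)

def eval_tree(t, leaves):
    # Evaluate a §-tree, consuming leaf truth values from left to right.
    if t is None:
        return next(leaves)
    a = eval_tree(t[0], leaves)
    b = eval_tree(t[1], leaves)
    return not (a and b)                    # § is NAND

def is_tautology(tree, labels):
    nvars = max(labels) + 1
    return all(eval_tree(tree, (vals[v] for v in labels))
               for vals in product([False, True], repeat=nvars))

for n in range(1, 7):                       # n >= 7 gets slow in pure Python
    T = S = 0
    for tree in trees(n):
        for labels in labellings(n):
            S += 1
            T += is_tautology(tree, labels)
    print(f"{n:2d}: {T/S:.8f} = {T} / {S}")

Under this identification the count factors as $S_n = C_{n-1}B_n$ (a Catalan number times a Bell number), which agrees with the $S$ column above, e.g. $S_7 = 132 \cdot 877 = 115764$.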