Wikipedia:Reference desk/Archives/Mathematics/2021 December 5



December 5


Improving secrecy


This question is inspired by quantum cryptography, but you don't need to be familiar with that.

Alice and Bob have a method for generating a sequence of random bits $b_1, b_2, \ldots, b_n$ that they both know, such that Eve only knows 50% of them. More formally, each $b_i$ is independently uniformly distributed, and Eve has probability 1/2 of knowing the value of $b_i$ (again, independent of anything else). For each bit, Eve knows whether or not she knows it.

If Alice and Bob want Eve to know fewer, they can split the sequence into two and perform bitwise addition mod 2. So $c_i = b_i \oplus b_{i+n/2}$, for $i = 1, \ldots, n/2$. Then Eve has probability 1/4 of knowing any given $c_i$, but the sequence is only half as long.

In general, Alice and Bob can create a sequence of length $n/k$ where Eve has probability $2^{-k}$ of knowing any given bit. Note that if $\ell$ is the length of the sequence and $p$ is Eve's probability of knowing a bit, this maintains $p^\ell = 2^{-n}$.

Now the question: Can Alice and Bob do better? Is there a process such that $p^\ell < 2^{-n}$? I suspect the answer is no, but I don't know how to show this. --72.80.81.137 (talk) 21:02, 5 December 2021 (UTC)
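A minimal Monte Carlo sketch of the block-combining construction (illustrative Python added for this archive, not part of the original question; the function name and the model of Eve's knowledge are my own):

import random

def construction(n, k, rng=random):
    """Group n original bits into blocks of k and XOR each block.
    For the statistic only Eve's knowledge pattern matters: she knows an
    output bit exactly when she knows every input bit of its block."""
    known = [rng.random() < 0.5 for _ in range(n)]          # bits Eve knows
    out_known = [all(known[i:i + k]) for i in range(0, n - k + 1, k)]
    return len(out_known), sum(out_known) / len(out_known)

for k in (1, 2, 3, 4):
    length, p = construction(1_000_000, k)
    # p should be close to 2^-k while length = n/k, so p^length = 2^-n.
    print(f"k={k}: length={length}, Eve knows {p:.4f} (expected {2**-k:.4f})")

The observed fraction tracks $2^{-k}$, illustrating how the invariant $p^\ell = 2^{-n}$ is preserved under this family of constructions.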

I should add: it's important that Eve's probability of knowing a given bit remains independent of other bits. Otherwise you could do much better. 72.80.81.137 (talk) 21:46, 5 December 2021 (UTC)
I assume that Eve knows the recipe. In the most general formulation, each output bit is produced by a different function, each of which is a function of some subsequence (not necessarily a contiguous segment) from the original sequence, containing the bits that are used in the computation. So taking the output to be $c_1, c_2, \ldots, c_\ell$, we could have $c_j = f_j(b_{j_1}, b_{j_2}, \ldots)$, in which each input bit is "live", meaning it may influence the outcome. Assume that for some pair $j \ne k$, $f_j$ and $f_k$ share some bit $b_i$, so $c_j = f_j(\ldots, b_i, \ldots)$ and $c_k = f_k(\ldots, b_i, \ldots)$. If we know that Eve knows $c_j$, this increases the likelihood she knows $c_k$ as well. This will not do for a solid proof, but the requirement of knowledge independence seems to imply that input bits cannot be "reused", but that each can contribute to one output bit only. This would mean that you cannot do better than $p^\ell = 2^{-n}$. --Lambiam 22:38, 5 December 2021 (UTC)
Thinking about how to formalize your argument, I came up with the following: Consider the probability space $\{0,1\}^n$, where an element indicates which bits of the original sequence are known to Eve. The measure of a singleton is $2^{-n}$. The probability of Eve knowing every bit of the new sequence is $p^\ell$, so there must be at least one point from the space giving this outcome, meaning $p^\ell \ge 2^{-n}$, which gives $\ell \log_2(1/p) \le n$, as desired. However, this is assuming that which bits Eve knows of the new sequence depends only on which she knew of the old sequence, and not on the values of the bits in the old sequence. I can't find a way to incorporate the values of the original bits without breaking independence, but I'm not seeing an argument that it's impossible. 72.80.81.137 (talk) 05:35, 6 December 2021 (UTC)
Based on the view given by $c_j = f_j(b_{j_1}, b_{j_2}, \ldots)$, Eve knows $c_j$ if and only if its value is the same for all assignments to $b_1, \ldots, b_n$ that agree with what Eve already knew of the original sequence – which I understood to mean that she knows the actual values of $b_i$ for $i \in K$, where set $K$ is some set of indices, specific to Eve and known to her, the size of which is about $n/2$. This means that the knowability of $c_j$ does not depend on any values of bits of the original sequence that are not among those she is already privy to. If $c_j = (b_1 \wedge b_2) \vee (\neg b_1 \wedge b_3)$ and $b_1$ and $b_2$ are in $K$ but $b_3$ is not, then Eve knows $c_j$ iff $b_1 = 1$. So knowability of an output bit can depend on the actual value of a (known) input bit. --Lambiam 10:52, 6 December 2021 (UTC)

Density of logically valid statements


Let's say I can encode each statement in propositional calculus, for example by letters, and ignore syntactically invalid statements (for example a->->b). I ask: what is the density of the valid statements (among strings of size not larger than n)? --Exx8 (talk) 23:04, 5 December 2021 (UTC)

I don't think you're going to be able to come up with a satisfying expression in terms of n for a fixed n; it depends too much on the details of the coding.
You might have more luck with the asymptotic behavior — I would be strongly tempted to conjecture that logically valid statements have asymptotic density zero given any reasonable coding. That's basically because a valid statement has to be true in every structure of a given signature, even without imposing any non-logical axioms whatsoever on the structure. That seems like an unusual condition that's going to get more and more unusual as statements get more complex.
However I don't have either a proof or a source; maybe someone else does. --Trovatore (talk) 23:51, 5 December 2021 (UTC)

Oh, I just noticed that you specified the propositional calculus, whereas I was thinking of first-order predicate calculus. My conjecture is still the same, though. A valid statement has to be true no matter what values you assign to the propositional variables. Again that seems like something that's not going to happen very often. --Trovatore (talk) 23:53, 5 December 2021 (UTC)
Among the formulae of length 3, in which case it does not matter whether the notation is infix or Polish, when there are $v$ propositional variables, the density of tautologies is $2v+5$ over $(v+2)(2v+5)$, that is, $\tfrac{1}{v+2}$. --Lambiam 00:34, 6 December 2021 (UTC)
At least, this is the case using only logical constants $\top$ (true), $\bot$ (false), $\neg$, $\wedge$ and $\vee$. The density $\tfrac{1}{v+2}$ then also applies for formulae of length 1 and 2, making it plausible that this may remain constant for longer formulae. Adding $\rightarrow$ may make a difference, but if the $\tfrac{1}{v+2}$ conjecture holds for formulae with the original set of five constants, this too will have a non-vanishing asymptotic density. --Lambiam 04:02, 6 December 2021 (UTC)
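A brute-force check of the $\tfrac{1}{v+2}$ density for lengths 1 to 3 (an illustrative Python sketch; the prefix encoding with N, K, A for $\neg$, $\wedge$, $\vee$ is my own choice of concrete syntax, not from the thread):

from itertools import product

def check(v=3):
    variables = [f"p{i}" for i in range(v)]

    def is_tautology(f):
        return all(f(dict(zip(variables, vals)))
                   for vals in product([False, True], repeat=v))

    # Length-1 formulae: the variables plus the constants T and F.
    def atom(a):
        return lambda env: True if a == "T" else False if a == "F" else env[a]

    len1 = [(a, atom(a)) for a in variables + ["T", "F"]]
    # Length-2 formulae: a negated atom.
    len2 = [("N" + a, lambda env, f=f: not f(env)) for a, f in len1]
    # Length-3 formulae: a double negation, or a binary connective on two atoms.
    len3 = [("NN" + a, lambda env, f=f: f(env)) for a, f in len1]
    for (a, fa), (b, fb) in product(len1, repeat=2):
        len3.append(("K" + a + b, lambda env, fa=fa, fb=fb: fa(env) and fb(env)))
        len3.append(("A" + a + b, lambda env, fa=fa, fb=fb: fa(env) or fb(env)))

    for label, fs in (("length 1", len1), ("length 2", len2), ("length 3", len3)):
        t = sum(1 for _, f in fs if is_tautology(f))
        print(f"{label}: {t}/{len(fs)} tautologies;"
              f" density {t/len(fs):.4f}, 1/(v+2) = {1/(v+2):.4f}")

check()

For v = 3 this reports 1/5, 1/5 and 11/55 = 1/5, agreeing with the counts above.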
With $v$ propositional variables there are only $2^{2^v}$ propositional functions, so sure, if you cap the number of variables, it makes sense that you would have nonzero asymptotic density (maybe equal to $1/2^{2^v}$?). But that doesn't seem like the natural reading of the question. As the size grows without bound, the number of variables would also be expected to grow without bound. --Trovatore (talk) 06:27, 6 December 2021 (UTC)
Then we need to be precise as to the formal language. In most texts, including our own article, the set of propositional variables is infinite. So then, of the infinitely many formulae of length 1, only one is a (trivial) tautology. For formulae of length greater than 2, we additionally need to specify a distribution over the formulae; otherwise one can enumerate them so as to asymptotically achieve any desired density between 0 and 1. --Lambiam 07:40, 6 December 2021 (UTC)
Can you clarify whether you’re asking about the density of syntactically valid statements (i.e. well-formed formulae) or logically valid statements (i.e. tautologies)? Regarding Lambiam’s point about variable names, one natural approach would be to consider two strings equivalent if they differ only in variable names. Quick calculation assuming this equivalence: Suppose we only have conjunction and negation, as well as parentheses and variables (although I suspect the end result will be similar regardless of the details of the encoding). Let $C_n$ be the number of syntactically valid formulas of length $n$. Then $C_n$ is approximately $\sum_k C_k C_{n-k-3} + C_{n-1}$ (the first term coming from making conjunctions, and the second from negations). This should grow around $2^n$. The number of total strings will grow at least like $4^n$, since we have four non-variable symbols. So the density of syntactically valid formulas will go to 0 exponentially fast. JoelleJay (talk) 22:29, 6 December 2021 (UTC)
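The recurrence is easy to tabulate; a quick sketch (illustrative Python; it counts formulas over a single variable symbol x, so it tracks the shape of the growth rather than the exact renaming-equivalence count):

def formula_counts(n_max):
    """C[n] = number of formulas of length n built from a variable x,
    negation !f (length |f|+1) and conjunction (f&g) (length |f|+|g|+3)."""
    C = [0] * (n_max + 1)
    C[1] = 1
    for n in range(2, n_max + 1):
        C[n] = C[n - 1]                                            # negations
        C[n] += sum(C[k] * C[n - 3 - k] for k in range(1, n - 3))  # conjunctions
    return C

C = formula_counts(40)
for n in (10, 20, 30, 40):
    print(f"n={n}: C_n = {C[n]}, C_n/C_(n-1) = {C[n]/C[n-1]:.4f}, "
          f"C_n/4^n = {C[n]/4.0**n:.3e}")

The ratio $C_n/C_{n-1}$ settles near 2, consistent with growth around $2^n$ and hence with a density against the at least $4^n$ strings that vanishes exponentially.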
(And logically valid formulas go to 0 even faster.) JoelleJay (talk) 22:46, 6 December 2021 (UTC)
@JoelleJay: As I read the question, it wants to know the frequency of logically valid formulas as a fraction of the number of syntactically valid formulas -- in other words, a quantification of how much faster "even faster" is. --JBL (talk) 16:35, 7 December 2021 (UTC)
Looking at the original question again I agree with you, somehow I skipped over the "ignore" before "syntax invalid statements". I still think it should go to 0 very quickly. JoelleJay (talk) 17:22, 7 December 2021 (UTC)
I ran a little experiment. For simplicity, I used an encoding with a single operator, the Sheffer stroke (NAND), which I'll denote by the symbol §. As suggested, formulae that are interconvertible by variable substitution are identified with each other, so a§(b§a) ≡ c§(a§c). The formulae a§(b§b) and (b§b)§a are not identified, even though interconvertible by appealing to the commutativity of §. Experimental results:
 n :     x      =        T /           S
 1 : 0.00000000 =        0 /           1
 2 : 0.00000000 =        0 /           2
 3 : 0.20000000 =        2 /          10
 4 : 0.00000000 =        0 /          75
 5 : 0.05494505 =       40 /         728
 6 : 0.00821018 =       70 /        8526
 7 : 0.02052452 =     2376 /      115764
 8 : 0.00662590 =    11768 /     1776060
 9 : 0.00871389 =   263510 /    30240210
10 : 0.00405652 =  2287352 /   563870450
11 : 0.00406548 = 46335296 / 11397261720
(Legend: n denotes the number of slots for variables, and x is the result of computing the fraction T/S, where T is the number of tautologies and S the number of syntactically valid formulae.)
The sequence $S_n$ is OEIS sequence A289679. The density sequence $x_n$ is seen to oscillate, but I conjecture that it will eventually descend monotonically. The results appear to confirm the hypothesis that the density goes to zero, but do not suggest an order of decrease, such as $\sim n^{-\lambda}$ or $\sim \exp(-\lambda n)$. --Lambiam 16:53, 9 December 2021 (UTC)
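A sketch of how such an experiment can be reproduced (illustrative Python, not Lambiam's actual program; it enumerates binary §-trees by number of leaves and canonicalises variable names via restricted-growth strings, which implements the identification up to renaming described above):

from itertools import product

def trees(n):
    # All binary trees with n leaves; a leaf is None, an inner node a pair.
    if n == 1:
        return [None]
    return [(left, right)
            for k in range(1, n)
            for left in trees(k)
            for right in trees(n - k)]

def labellings(n):
    # Restricted-growth strings of length n: variable labels in canonical
    # first-occurrence order, so renamings of variables are identified.
    def go(prefix, used):
        if len(prefix) == n:
            yield tuple(prefix)
        else:
            for v in range(used + 1):
                yield from go(prefix + [v], max(used, v + 1))
    yield from go([], 0)

def eval_tree(t, leaves):
    # Evaluate a §-tree, consuming leaf truth values from left to right.
    if t is None:
        return next(leaves)
    a = eval_tree(t[0], leaves)
    b = eval_tree(t[1], leaves)
    return not (a and b)                    # § is NAND

def is_tautology(tree, labels):
    nvars = max(labels) + 1
    return all(eval_tree(tree, (vals[v] for v in labels))
               for vals in product([False, True], repeat=nvars))

for n in range(1, 7):                       # n >= 7 gets slow in pure Python
    T = S = 0
    for tree in trees(n):
        for labels in labellings(n):
            S += 1
            T += is_tautology(tree, labels)
    print(f"{n:2d}: {T/S:.8f} = {T} / {S}")

Under this identification the count factors as $S_n = C_{n-1}B_n$ (a Catalan number times a Bell number), which agrees with the $S$ column above, e.g. $S_7 = 132 \cdot 877 = 115764$.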