Wikipedia:Reference desk/Archives/Mathematics/2012 September 25

Mathematics desk
< September 24 << Aug | September | Oct >> September 26 >
Welcome to the Wikipedia Mathematics Reference Desk Archives
The page you are currently viewing is an archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


September 25

edit

a thousand coins a thousand times?

edit

This is not a homework question, and it's probably a little frivolous, so if you don't like it don't answer. But I'm curious, how would you go about solving this problem: Toss a thosand coins a thousand times, on average, what is the longest streak of either heads or tails? If it's not clear, you toss one coin a thousand times, in that sequence you get a streak of 13 heads. Repeat that a thousand times and take the average of the longest streaks. I could maybe get a decent answer if I wrote a program, but it would take a few hours for me to work it out. Is there a mathy way to do this with a few equations? Vespine (talk)

A sequence of identical outcomes is known as a run. This page from Wolfram Mathworld reviews the methods that are needed to answer your question. The simplest way to get an answer is to use formula 15 from that page, imagining that all of your trials are concatenated together, so that the whole experiment is reduced to a sequence of a million coin tosses. Plugging into the formula gives an answer of about 13.1. That will be a little bit of an overestimate, because it includes runs that cross trial boundaries, but for runs of this length those will only be about 1% of the total. Looie496 (talk) 05:19, 25 September 2012 (UTC)[reply]
I ran a simulation with a million trials of 1000 flips each, and got 10.3 for the average length of the longest run of either heads or tails in each trial. If you only allow heads runs or only allow tails runs, then it drops to 9.3. This information is only intended as a check against the formula answer given above. StuRat (talk) 06:05, 25 September 2012 (UTC)[reply]
A pseudo-random generator would not be suited for this kind of simulations I think. They won't generate all possible sequences.
Looie496: You seem to have used the natural logarithm; it should be the logaritm base 1/p = 2 giving 18.9 Ssscienccce (talk) 06:21, 25 September 2012 (UTC)[reply]
A way to show that 13.1 would likely be too low is: assume trials of 15 throws: on average 1 out of 32768 would give all 15 heads. Thats for a total of 491520 throws; so a million throws would on average give two runs of 15 heads, and that's without counting the more frequent runs crossing boundaries. Ssscienccce (talk) 06:58, 25 September 2012 (UTC)[reply]
How do you get 491520 throws ? And how did you get 2 out of a million from that ? As for the average longest run being 18.9, that seems way too high. The chances of getting exactly 19 heads in a row, are, by my calculations: (1001-19)/(0.5)19 = 982/524288 = 0.00187, or 0.00374 for either a run of heads or tails. That's just too unlikely to be the average run length. StuRat (talk) 08:19, 25 September 2012 (UTC)[reply]
Yep, mistake, I was responding to what I assume Looie was talking about, based on the formula he mentioned and the result he got. As far as I can tell, the only way to get 13.1 would be ln(1000000*(1-0.5)) Ssscienccce (talk) 09:02, 25 September 2012 (UTC)[reply]
I see. What do you get when you plug in 1000 flips instead of a million ? (I don't know how to do log2(500) on my calculator.) StuRat (talk) 09:13, 25 September 2012 (UTC)[reply]
That gives 8.96 for a specific outcome (head or tails), 9.96 if it doesn't matter. You can calculate the logb(x) with log(x)/log(b) or ln(x)/ln(b) Ssscienccce (talk) 10:19, 25 September 2012 (UTC)[reply]
I agree with StuRat, it's 10.4 in my simulation. For the 1000 trials, the longest runs are 7 in sequence (17 times), 8 (104), 9 (237), 10 (245), 11 (137), 12 (95), 13 (62), 14 (35), 15 (16), 16 (9), 17 (2), 18 (3), 19 (1), 20 (1). Dragons flight (talk) 08:01, 25 September 2012 (UTC)[reply]
When I try to do it mathematically, from scratch, I get approximately 13, as well:
Let's just look at the case of getting 13 heads in a row. To do that out of 13 throws would be a 1/213 probability, or 1/8192. Now, the number of positions where you could possibly get a run of 13 is 1001-13 or 988. So, the probability of 13 heads appearing anywhere within the 1000 flips is approximately 988/8192, or 0.1206. If we consider the possibility of either 13 heads or tails in a row, then we double the probability to 0.2412. So, you then need to repeat these calculations for every possible run length, from 1 to 1000, then combine this info together to figure out the average run. Using the above method as a reasonable approximation, I wrote a program to do the math for me (no simulation, just crunching formulas). So, I took the probability of having a 1000 coin run, added that to the probability of a 999 coin run, etc., until I got to a 50% chance. I get a result of about 13.
So, we have an odd discrepancy between 13.0-13.1 in theory and 10.3-10.4 in simulations (and Ssscienccce's answer of 18.9 seems way high). I suspect the difference is due to the handling of multiple runs of the same length (the math is counting these multiple times, when it should only count them once). I believe the experimental result to be the correct one. StuRat (talk) 05:39, 25 September 2012 (UTC)[reply]
I think Ssscienccce is calculating the expected value of the longest run in a sequence of 1,000,000 trials. However, what Vespine is asking for is the expected value of the average of the longest run in a set of 1,000 sequences of 1,000 trials each. By the law of large numbers, this will be close to the expected value of the longest run in a single sequence of 1,000 trials. Gandalf61 (talk) 08:28, 25 September 2012 (UTC)[reply]
Yep, that's what I thought Looie was talking about. (Seems that random number generators are better than I assumed btw.) Ssscienccce (talk) 08:32, 25 September 2012 (UTC)[reply]
StuRat's simulation seems to deal with tossing 1 coin 1000 times. The OP asked about tossing 1000 coins 1000 times and taking the max, which is higher.
I ran 20 trials with 1000 coins and got an average of 20.25 and stdev 1.5, meaning the true mean is likely to be between 19.5 and 21. -- Meni Rosenfeld (talk) 09:41, 25 September 2012 (UTC)[reply]
His clarification was "you toss one coin a thousand times, in that sequence you get a streak of 13 heads. Repeat that a thousand times and take the average of the longest streaks." That seems to support my interpretation. StuRat (talk) 09:47, 25 September 2012 (UTC)[reply]
Sorry, I see that now and you beat me to deleting my comment. -- Meni Rosenfeld (talk) 09:50, 25 September 2012 (UTC)[reply]
Let me explain in more detail what I think is at the root of the discrepancy between my mathematical answer (which I think is wrong) and my experimental simulation answer (which I think is right). Consider a simplified problem where we toss a coin 4 times and want to know the probability of getting either 3 heads in a row or 3 tails in a row. Let's list every possible combo and mark those with 3 in a row with a star:
HHHH *
HHHT *
HHTH
HHTT
HTHH
HTHT
HTTH
HTTT *
THHH *
THHT
THTH
THTT
TTHH
TTHT
TTTH *
TTTT *
So, we have 6/16 cases which match our goal, giving us a 3/8 probability. Now let me use the same math I used previously. The chance of flipping 3 in a row heads out of 3 is 1/8. The chances of flipping 3 in a row either heads or tails out of 3 is 1/4. There are two ways to get 3 in row, either the first 3 of 4 flips, or the last 3 of 4 flips. So, multiply that 1/4 probability by 2 and we get a 1/2 probability. Therefore, the math predicts a higher probability than it should. I believe the reason for this is that the first and last cases listed above each have 2 series of 3 in a row, overlapping. So, instead of 6/16 cases, we get 8/16 cases.
Now let's look at the longest run in each case above:
HHHH 4
HHHT 3
HHTH 2
HHTT 2
HTHH 2
HTHT 1
HTTH 2
HTTT 3
THHH 3
THHT 2
THTH 1
THTT 2
TTHH 2
TTHT 2
TTTH 3
TTTT 4
The average length of a run is therefore 2.375. I don't see how you get that from the formula R = log(1/p)[n(1-p)] listed in the Wolfram Mathworld link. So, either I don't understand how to use the formula, it's incorrect, or should not be applied to this problem. StuRat (talk) 10:25, 25 September 2012 (UTC)[reply]
The formula gives the longest run most likely to get, but to calculate the average of the longest runs, I don't see an obvious way... Ssscienccce (talk) 11:48, 25 September 2012 (UTC)[reply]
What do you mean by "longest run most likely to get"? The expected longest run (~ average of longest runs over many trials) will be a rational number, whereas that Mathworld formula in general yields an irrational number, so it either means something else (don't know what) or it is incorrect. 86.176.214.152 (talk) 12:52, 25 September 2012 (UTC)[reply]
We can calculate the expected length of the longest run directly as follows:
  1. Consider a sequences of 1,000 heads and tails in which all the runs are of length k or less. We can map this to a restricted composition of 1,000 in which all the components are less than or equal to k.
  2. Suppose there are f(k) such compositions. Then there are 2f(k) sequences in which the length of the longest run is less than or equal to k (the factor of 2 occurs because the first toss can be either H or T, but after that the sequence is entirely determined by the composition).
  3. We can calculate f(k) using k-step Fibonacci numbers.
  4. So the number of sequences in which the length of the longest run is exactly k is g(k) = 2(f(k) - f(k-1)).
  5. There are a total of 21000 sequences, so to find the expected value of the longest run we sum k*g(k) and divide by 21000. In practice, we don't need to sum all 1000 terms, as only the first 100 terms make any significant contribution to the sum (because the probability of seeing a run of 100 or more is negligible).
With a little help from Python, the expected length of the longest run in a sequence of 1,000 throws turns out to be 10.298615 (to 6 d.p.), which is close to the experimental results. Gandalf61 (talk) 13:10, 25 September 2012 (UTC)[reply]
... and the expected distribution of the length of the longest run in 1,000 sequences of 1,000 trials each is 0, 0, 0, 0, 0, 0, 18, 121, 237, 239, 170, 101, 55, 29, 15, 7, 4, 2, 1, 0 (these sum to 999 due to rounding) for lengths of 1 to 20, which is close to the distribution observed by Dragons flight above. Gandalf61 (talk) 13:52, 25 September 2012 (UTC)[reply]


It's actually easy to approximate this because the longest streak in the sequence of N = 1000 coin throws will obviously have a low frequency, so one can ingore the proabability of overlap. This means that you can use the Poisson distribution. The probability of some given coin being the starting point of a streak of length n is 2^[-(n+1)]. The total number of configurations is 2^1000 as there are two ways to choose the state of each coin. If you condition on there being a streak of length n starting at some coin, the state of that coin is fixed to be different from the previous coin, all the n coins are fixed to be in that state, but also the first coin not in the treak is fixed to be different from the coins in the streak. So, the state of n+1 coins are fixed, and you are thus left with N-(n+1) free to chose states of coins, The probability is thus

 

The expected number of streaks is thus

 

If we impose periodic boundary conditions so that every coin can be the starting point (for large N this doesn't matter much). Then the actual number of streaks in sequence of N coins is approximately a Poisson process, the probability of there being k streaks of length n is thus:

 

Then the probability that the largest streak that appears has length n is the probability that you have zero streaks of lengths larger than n, while having more than zero streaks of length n. The probability of that is

 

Using

 

we can write q(n) as:

 

Count Iblis (talk) 16:53, 25 September 2012 (UTC)[reply]

According to this (approximate) formula for q(n), the average is 10.2985, and the standard deviation is 1.8727. Count Iblis (talk) 18:10, 25 September 2012 (UTC)[reply]

OK, we now have two different mathematical results and two different experimental simulation results, all of which give the answers as around 10.3, so we seem to have a solution. I will mark this Q resolved. StuRat (talk) 19:24, 25 September 2012 (UTC)[reply]
  Resolved
Excellent replies! Thanks heaps, I only just got back here since yesterday. I also realized that my question and title were not quite the same thing, sorry about that, but wouldn't they yield pretty much the same answer? Vespine (talk) 00:54, 26 September 2012 (UTC)[reply]
The title is too vague as a question to be able to answer this. You'd have to clarify. — Quondum 04:04, 26 September 2012 (UTC)[reply]
Your initial statement of the problem was vague: "Toss a thousand coins a thousand times, on average, what is the longest streak of either heads or tails?". This could mean the average of the longest streaks with each individual coin (which is what I think you meant), or it could mean you concatenate all the coin tosses together, to effectively toss one coin a million times, and find the longest streak for all that. This is how Meni interpreted it, and it produces a longer streak, somewhere around 20. Your clarification afterwards removed the ambiguity. StuRat (talk) 04:48, 26 September 2012 (UTC)[reply]

Error in published paper?

edit

According to page 314 of

Garza, G.; Young, J. (2004), "Wieferich Primes and Period Lengths for the Expansions of Fractions", Math. Mag., 77 (4): 314–319, doi:10.2307/3219294,

the period p of a number x in base b is the smallest number p such that bp ≡ 1 (mod x). If x is a prime number satisfying bx−1 ≡ 1 (mod x2), then x is called a Wieferich prime in base b. They claim at the beginning of the paper that the period of x = 1093 (the first base 2 Wieferich prime) is 1092 and that this is also the period of x2. However, I do not understand how this fits with the result of W. Meissner, who in

Meissner, W. (1913), "Über die Teilbarkeit von 2p − 2 durch das Quadrat der Primzahl p=1093", Sitzungsber. D. Königl. Preuss. Akad. D. Wiss. (in German), Zweiter Halbband. Juli bis Dezember, Berlin: 663–667

shows that 2364 ≡ 1 (mod 10932). So why is the period of 1093 1092 and not 364? -- Toshio Yamaguchi (tlkctb) 10:28, 25 September 2012 (UTC)[reply]

To clarify, the term period here refers to the period of the expansion of  . -- Toshio Yamaguchi (tlkctb) 10:45, 25 September 2012 (UTC)[reply]

My guess: They are a bit casual with the word period, using it to mean a number p (not necessarily the smallest one) such that bp ≡ 1 (mod x). Whether a smaller number exists that satisfies the condition is likely not relevant in the context (I haven't seen the paper). It's hard to believe they would write a paper on Wieferich primes and not know about 364, since that's how 1093 was discovered. Ssscienccce (talk) 03:54, 26 September 2012 (UTC)[reply]

vectors and matrices

edit

given a vector vi i∈[1,N] is there an algebraic operation that yields the matrix diag[v1, v2 ... vN ]? — Preceding unsigned comment added by 92.11.64.50 (talk) 12:35, 25 September 2012 (UTC)[reply]

Does Σi ei eiT v eiT work? — Preceding unsigned comment added by 92.11.64.50 (talk) 12:47, 25 September 2012 (UTC)[reply]
I think it does! Thanks mr. ref desk — Preceding unsigned comment added by 92.11.64.50 (talk) 13:02, 25 September 2012 (UTC)[reply]
You're welcome. :) — Preceding unsigned comment added by 92.11.64.50 (talk) 13:02, 25 September 2012 (UTC)[reply]
Guess I'm too late .. vT(1,1,1,1..)In (Hadamard product) Ssscienccce (talk) 15:50, 25 September 2012 (UTC)[reply]
Let me provide a link to the above Hadamard product and mark this Q resolved. StuRat (talk) 19:20, 25 September 2012 (UTC)[reply]
  Resolved

Maximing product x*y under assumption x y<=A(B-x)(B-y)

edit

Hello,

I am trying to find an elegant solution to the following problem.

Suppose A and B are positive real numbers. Assuming x and y are nonnegative real numbers, what is the supremum/maximum for values of  , assuming  

The clue is that x and y have to be equal, but how to prove this elegantly? I considered the AM-GM inequality, but that didn't work. Is there are more general approach to problems like this?

Many thanks, Evilbu (talk) 21:08, 25 September 2012 (UTC)[reply]

Let r = A / B. Using algebra, your condition becomes x + y < r, or y < r - x.[read < as less than or equal] Obviously, for a fixed x, xy is larger for larger y, thus given x, the maximal product is x(r - x). Hence, we need only find the maximum for x(r - x); we can do this by finding where the derivative is 0: r - 2x = 0 iff x = r / 2. Finally, y = r - x = r / 2, hence x = y. When you say "more general approach for problems like this", what kind of problems do you mean exactly?Phoenixia1177 (talk) 21:28, 25 September 2012 (UTC)[reply]
Wow! I can't believe I forgot about the A in front, sorry about that, I feel foolish now.Phoenixia1177 (talk) 03:06, 26 September 2012 (UTC)[reply]
The previous analysis is only valid if A=1; then the constraint is x + y ≤ B. If A > 1, then a convexity argument can be used; if (x, y) = (a, b) is a solution, then (by symmetry) so is (b, a), and then, by convexity, so is   which, by the AM-GM inequality, has a larger value of the objective function. It could still be accurate, but that argument won't work. — Arthur Rubin (talk) 21:50, 25 September 2012 (UTC)[reply]
As per the above AM-GM inequality-based demonstration, we must have x=y. Then the constraint is x^2 < or = A(B-x)^2 hence x < or = sqrt(A)|B-x|. There are 3 cases: If A>1, the constraint is, when x>B, x < or = -B.sqrt(A)+x.sqrt(A), hence x > or = Bsqrt(A)/[sqrt(A) - 1]. So let x go to infinity--the problem is unbounded. If A=1, the constraint becomes x^2 < or = B^2-2Bx+x^2 hence x < or = B/2. So set x=B/2. And if A<1, if x>B the constraint is x < or = (x-B).sqrt(A) hence x.(1 - sqrt(A)) < or = -B.sqrt(A) hence a non-negative is less than or equal to a negative, which has no solution. If you set x=B,the constraint is violated. So let x<B. Then the constraint is x < or = sqrt(A).B - sqrt(A).x hence x(1+sqrt(A)) < or = B.sqrt(A). So set x = B.sqrt(A)/[1+sqrt(A)].
Not very elegant, but I doubt there's a more elegant way to approach it than that. Duoduoduo (talk) 23:06, 25 September 2012 (UTC)[reply]
As I said, the AM-GM inequality doesn't directly work unless A ≥ 1. However, something can probably be done.... — Arthur Rubin (talk) 00:55, 26 September 2012 (UTC)[reply]
For any A: If an optimum (x*,y*) exists, there exists (x*+y*). Now could x1, y1 with x1 < y1 be the optimum? No, because if we infinitesimally increase x and decrease y by the same amount,0 < dx = -dy and we have the same (x+y) = (x*+y*), but d(xy) = y dx +x dy = (y-x)dx > 0, so increasing x and decreasing y toward each other increases the objective function value. Thus x<y, and by symmetry y<x, cannot characterize the optimum. Duoduoduo (talk) 02:27, 26 September 2012 (UTC)[reply]
Nice try. If A < 1, your infinitesimal change might move it outside the constraint. — Arthur Rubin (talk) 03:48, 26 September 2012 (UTC)[reply]
(ec)The above argument must be supplemented when A<1. The constraint is xy <= A(B-x)(B-y) hence (1-A)xy <= AB^2 - AB(x+y). If A = 1 or A > 1, holding (x+y) fixed while varying x and y to increase xy does not cause the constraint to be violated. Also if A < 1 and the constraint is not binding, the change does not violate the constraint. But suppose we have x1 and y 1 such that x1 < y 1 (not necessarily summing to (x*+y*)), but say the constraint is binding at x1, y 1. Differentiate the constraint (as an equality) to get dx = dy{-AB-(1-A)x} / [(1-A)y+AB]. Use this in d(xy) = x.dy + y.dx to get d(xy) = dy(x-y)AB / [(1-A)y+AB]. With A<1 and x<y and dx>0 and dy<0 so as to stay on the binding constraint, we get d(xy) > 0, so increasing x and decreasing y toward each other and staying on the constraint does increase the objective function xy; hence x<y could not characterize the optimum in this case either. Duoduoduo (talk) 04:08, 26 September 2012 (UTC)[reply]
Wouldn't it be more convenient to: Let   and  , then: z= x/r =yr or y=z/r and x=rz
A(B-x)(B-y)=A(B-rz)(B-z/r)= A(B2+z2) -ABz(r+1/r); for a given value of xy, only r is variable, and r+1/r is minimal when r equals 1. Ssscienccce (talk) 06:23, 26 September 2012 (UTC)[reply]
I have a question to Arthur Rubin: how exactly did you use the convexity argument (I am only familiar with convexity of functions in one variable) Evilbu (talk) 06:07, 26 September 2012 (UTC)[reply]
There's a simpler approach. If, instead of setting dx = -dy, we set dx/x = -dy/y, then xy is constant, but x+y decreases (as x and y move toward each other), so the constraint is still met (LHS = constant, RHS increases). Hence any optimum might as well have x = y.
On convexity; the claim is, if A ≥ 1, then the region meeting the constraint is a convex set. Actually, that's not right. Let me give the argument I had in mind for A ≥ 1. The constraint can be rewritten as  . If (a, b) meets the condition, then  .
  by the AM-GM inequality. Hence
 
so   meets the constraint, but has a larger x y.
Rewriting the infintesimal version above in a macroscopic manner, if   meets the constraints, then so does   (by the AM-GM inequality), and has the same value of the objective function  . Hence we can take  . — Arthur Rubin (talk) 10:24, 26 September 2012 (UTC)[reply]

Show that a Fourier series converges at particular points

edit

I can show that the Fourier series   converges at all points in the interval   for all   by the Dirichlet test, except the points   and  . How do you show the series converges for those particular values? Widener (talk) 23:10, 25 September 2012 (UTC)[reply]

Actually, it's except x = a and x = b. — Arthur Rubin (talk) 03:47, 26 September 2012 (UTC)[reply]
In detail, it follows by splitting the sum into n > 0 and n < 0. The sum for n > 0 can be verified by:
 
 
and
 
Arthur Rubin (talk) 10:09, 26 September 2012 (UTC)[reply]
Yes, that's the Dirichlet test.
I didn't phrase the question right. Let me rephrase it:
"I am already able to show that the Fourier series   converges at all points in the interval   for all   using the Dirichlet test, except the test doesn't work for the points   and  . How do you show the series converges for the values   and  ?"
In other words, show that the series   (equals   I think) converges. Widener (talk) 20:36, 26 September 2012 (UTC)[reply]
By the way, the mistake I made earlier (where I thought the test didn't work for -a and -b) was that I forgot to make the n in e^{inx} negative when I wrote the part for n<0. Widener (talk) 20:39, 26 September 2012 (UTC)[reply]
Well, I don't think it converges in the usual sense:
  converges, but
 
Which one of these expressions is
 
is a matter of interpretation; I'd usually assume the latter, but some sources assume the former. — Arthur Rubin (talk) 10:12, 27 September 2012 (UTC)[reply]
It is the former. Widener (talk) 10:27, 27 September 2012 (UTC)[reply]