Wikipedia:Reference desk/Archives/Mathematics/2013 June 24

Mathematics desk
< June 23 << May | June | Jul >> June 25 >
Welcome to the Wikipedia Mathematics Reference Desk Archives
The page you are currently viewing is an archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


June 24 edit

Derivation of a ternary plot from Cartesian coordinates edit

Is there a simpler proof than using a vector equation of the distance from a point to a line (below) to show that a' , b' and c' is linearly proportional to a, b and c, respectively? Thanks, cmɢʟee୯ ͡° ̮د ͡° ੭ 11:42, 24 June 2013 (UTC)[reply]

Try similar triangles. Sławomir Biały (talk) 12:35, 24 June 2013 (UTC)[reply]
Thanks, Sławomir Biały. Could you point out which triangles are similar? cmɢʟee୯ ͡° ̮د ͡° ੭ 11:56, 25 June 2013 (UTC)[reply]
Draw a plane through the point P parallel to the plane of interest. Then (a,b,c) are just the lengths of the three altitudes of P in that triangle. There are now three pairs of similar triangles (sharing legs OP and OP'). Sławomir Biały (talk) 12:19, 25 June 2013 (UTC)[reply]
I am confused. Doesn't the "plane of interest" already contain P, since it is the plane x + y + z = a + b + c ? So what is P' ? And since the lines with lengths a, b and c are mutually orthogonal, they can't all lie in the same plane can they ? Gandalf61 (talk) 14:03, 25 June 2013 (UTC)[reply]
My understanding is that we're projecting P onto that plane (x+y+z=K). Sławomir Biały (talk) 14:45, 25 June 2013 (UTC)[reply]
But K = a + b + c so P = (a,b,c) is already on the plane x + y + z = K. No need to project it anywhere. Gandalf61 (talk) 16:27, 25 June 2013 (UTC)[reply]

Yes, right. I misunderstood. But similar triangles still works. The triangles OPx and OPy are similar with x the projection of P onto the coordinate plane and y the projection of P onto the side of the triangle contained in that coordinate plane. Sławomir Biały (talk) 17:45, 25 June 2013 (UTC)[reply]

 
Derivation of a ternary plot from Cartesian coordinates

Figure (1) shows an oblique projection of point P(a,b,c) in a 3-dimensional Cartesian space with axes a, b and c, respectively.

If a + b + c = K (a positive constant), P is restricted to a plane containing A(K,0,0), B(0,K,0) and C(0,0,K). If a, b and c each cannot be negative, P is restricted to the triangle bounded by A, B and C, as in (2).

In (3), the axes are rotated to give an isometric view. The triangle, viewed face-on, appears equilateral.

In (4), the distances of P from lines BC, AC and AB are denoted by a' , b' and c' , respectively.

For any line l = s + t in vector form ( is a unit vector) and a point p, the distance from a point to a line from p to l is   .

In this case, point P is at   .

Line BC has   and   .

Using the perpendicular distance formula,

 

Substituting K = a + b + c,

  .

Similar calculation on lines AC and AB gives

  and   .

This shows that the distance of the point from the respective lines is linearly proportional to the original values a, b and c.
Ref: Ternary plots

Weighted average edit

 
The top line lists the mass of the individual rectangles and the middle lists the cumulative mass (not to scale). X = somewhere between 30 and 40 (around 35) but what is its exact value?

I want to describe a small set of numbers using some form of weighted average but I'm not sure which one it is (or even if it is a normal way of dealing with numbers). The average should describe the point at which 50% of the total is reached (like μ of a normal distribution) giving larger numbers more weight than smaller ones.

An example of the data is: 2, 3, 8, 15, 33, 52, 55, 61, 61. The total of them is 290, so I want to know the point at which 145 is reached which is somewhere between 52 and 55.

Does this have a name and is there some quick and easy way of calculating it, preferably in excel? Is there some better way of describing it? Cheers SmartSE (talk) 15:13, 24 June 2013 (UTC)[reply]

Can you explain what you mean by "the point at which 145 is reached"? Usually, the various types of means are applied to an unordered sample set of numbers, and the mean stays the same, even if the numbers are rearranged. If you are thinking "first came 2, then a 3, then an 8...", then that would be some sort of time series, where order is important. Finally, you seem to be implying that you are adding together these numbers as you go along, leading to some sequence of partial sums. So here's how I see your question/problem: Given an ordered set {s_N}, you create a list of partial sums S_n=s_1+ ... _+s_n, and would like to find the smallest n, such that S_n>S_N/2. For your example, this gives n=7, s_7=55. -- Is that right? I'm really not sure I'm understanding. If you can clarify, we can probably give better help. SemanticMantis (talk) 20:29, 24 June 2013 (UTC)[reply]
I've added a diagram which hopefully makes it clearer. The individual numbers are masses and I want to know the individual mass at which 50% of the total mass is found either side the value. I can work out x myself in the following way: "=SUM(30+(((77.5-55)/40)*10))" giving x=35.625 - hopefully you can see where I got the numbers from and apologies if you prefer algebra. I really need to find a way automate it though as I'll have hundreds of sequences that I'd like to describe. SmartSE (talk) 22:34, 24 June 2013 (UTC)[reply]
I've managed to rewrite that as "=SUM(A4+((((SUM(A1:A6))/2)-B4)/A5)*(A5-A4))" where A1:A6 are individual weights and B1:B6 are cumulative weights, but that still requires me to manually identify A4,A5 and B4. SmartSE (talk) 22:45, 24 June 2013 (UTC)[reply]
What Smartse is referring to is the weighted median I believe. As for how to find it in Excel, I'm sure there's a better way of doing it, but I'd suggest first making a "total" at the bottom of the column of values, then to the right of the values make a running cumulative by using the formula "=A2+B1" then "=B2+C1" etc where column 1 is the values and 2 is the running averages (cell B1 will just be "=A1"), then in the third column you will need an IF statement which starting in A3 saying "=IF(A2=(Z1/2),"Weighted median",""). This will deliver the phrase "Weighted median" if that number is the said median, and a blank cell if it is not. This assumes the total is in cell Z1 - change that to wherever the total actually is. Then B3 would be the same only with B2 in the formula, and so on. That would only work if at one point the cumulative total hit the exact midpoint of the total. In the highly likely chance that the cumulative total never hits the exact point but it shoots past, you will need to start in B3 as you will need to reference the line above, and B3 would need a formula saying something like "=IF(AND(A2 < Z1/2,B2 > Z1/2),"Weighted median",""). That basically tests if the cumulative total after the previous number was added is below the half-way point, but the cumulative of the current line is above it. Please note that I have not tested this in Excel so it might need some tweaking, that's just off the top of my head. Falastur2 Talk 20:46, 24 June 2013 (UTC)[reply]
Thanks for your reply - I should have probably mentioned that I can calculate it manually, but I need a formula. I came across weighted median and I agree that it does sound like it could be that, but when I search to see how to do it in excel the examples don't look similar to mine. Microsoft have some code here that I have tried using but "=SUMPRODUCT(C2:C7,C2:C7)/SUM(C2:C7)" for where C2:C7 are the numbers in the diagram gives x = 41.18. SmartSE (talk) 22:34, 24 June 2013 (UTC)[reply]
I'm not sure what you mean by "formula". Does COUNTIF(range,"<"&AVERAGE(range)) count? Sławomir Biały (talk) 23:13, 24 June 2013 (UTC)[reply]
I mean something that can be entered in one cell (or two next to each other) to get the result I'm looking for. I tried that code and got an answer of 3... SmartSE (talk) 14:19, 25 June 2013 (UTC)[reply]
Well yes, that failed because you entered C2:C7 twice whereas the MS example used A2:A4 and B2:B4. What you've done by using SUMPRODUCT(C2:C7,C2:C7) is found the sum of the squares of that range of numbers, which is not what you were intending. To make it work you'd need to insert a new column listing the frequency with which each number in your data set appears. Besides, what that formula is doing is giving you the weighted mean, not the weighted median, so this is kind of all beside the point. Falastur2 Talk 00:22, 25 June 2013 (UTC)[reply]
Hmm I did realise that was a bit of a bodge, but even if I make another column with the frequency of each number (all 1) then the result is different (25.833). SmartSE (talk) 14:19, 25 June 2013 (UTC)[reply]
If you plan on using this in Excel, I think the easiest method would be to create a user defined function using VBA; you could have it take two ranges of cells; it wouldn't be too difficult to code it to find the correct cells to use. Moreover, it would make applying the function simpler than having to use than some horrid looking formula with subtotals and other junk. If you've never used VBA, just looking up "Excel VBA" should lead to a ton of useful results to get the gust of how it works; it's fairly easy to use, especially if you've ever programmed before.Phoenixia1177 (talk) 10:39, 25 June 2013 (UTC)[reply]
I haven't ever used VBA, but looking at this it looks as if it should be within my capabilities to use it. SmartSE (talk) 14:19, 25 June 2013 (UTC)[reply]

Resolved edit

Thanks for everyone's help. I've managed to come up with a way that will allow me to automate it relatively easily. In case you're interested you can see it in sheet 2 of this spreadsheet. SmartSE (talk) 22:52, 25 June 2013 (UTC)[reply]