Talk:Theil–Sen estimator

tau

Quote: "As Sen observed, this estimator is the value that makes the Kendall tau rank correlation coefficient comparing the sample data values yi with their estimated values mxi + b become approximately zero."

Really? So the method gives an estimate (mxi + b) that is completely uncorrelated with the variable being estimated (yi)? Olaf (talk) 00:49, 27 April 2014 (UTC)

No, it means that roughly half the yi are greater than the corresponding mxi+b, and roughly half are less. Deltahedron (talk) 19:49, 27 April 2014 (UTC)
No, it's not the median error that is supposed to be zero, as it would be under your interpretation; it's Kendall's tau rank correlation. Counterexample: if yi = xi, then the estimate is mxi + b = 1xi + 0 = xi = yi, so the tau correlation between the estimate mxi + b and the original value yi is equal to one, not zero. Olaf (talk) 20:07, 27 April 2014 (UTC)
That's not a particularly good counterexample, since the number of concordant and the number of discordant pairs are both zero, and hence tau=0. Deltahedron (talk) 20:11, 27 April 2014 (UTC)
Let's check: y1=1, y2=2, y3=3.
Estimates: Y1=1, Y2=2, Y3=3
Concordant pairs:
y1 < y2 and Y1 < Y2
y1 < y3 and Y1 < Y3
y2 < y3 and Y2 < Y3
Tied pairs: none
Discordant pairs: none.
Tau = 1
In the absence of tied ranks, the tau correlation has the same property as Pearson's correlation: tau(A,A) = 1. There are no tied ranks when ai ≠ aj for all i ≠ j.
Olaf (talk) 20:23, 27 April 2014 (UTC)
No, it's the residuals that are all equal and hence uncorrelated. Deltahedron (talk) 20:37, 27 April 2014 (UTC)
Yes, and the article said it was the estimated values, not their residuals. Now it's fixed ([1]). Thank you for the references. Olaf (talk) 20:43, 27 April 2014 (UTC)
However, what's important is what independent reliable sources say. Searching "Theil Sen" "Kendall tau" in Google Books gave me: [2], [3], [4] which support the assertion of the text (unlike the reference to Rousseeuw & Leroy (2003), pp. 67, 164 which did not). Deltahedron (talk) 20:19, 27 April 2014 (UTC)
OK, so it is the tau correlation between the estimation error and the x value that equals zero, not the one between the estimator and the estimated value (per the second reference)! Olaf (talk) 20:26, 27 April 2014 (UTC)
Thanks for clearing this up. —David Eppstein (talk) 22:36, 27 April 2014 (UTC)
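
The distinction settled in this thread can be checked numerically. The following is a minimal sketch in plain Python, not taken from the article or its sources. The helper names `theil_sen` and `kendall_tau` are hypothetical, `kendall_tau` computes the simple tau-a variant (tied pairs contribute zero), and the noisy data set is made up for illustration. It compares tau between the fitted values and yi with tau between the residuals and xi.

```python
from itertools import combinations
from statistics import median


def theil_sen(x, y):
    """Slope = median of pairwise slopes; intercept = median of y_i - m*x_i."""
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i, j in combinations(range(len(x)), 2)
              if x[j] != x[i]]
    m = median(slopes)
    b = median([yi - m * xi for xi, yi in zip(x, y)])
    return m, b


def kendall_tau(a, b):
    """Tau-a: (concordant - discordant) / number of pairs; ties count as 0."""
    def sign(v):
        return (v > 0) - (v < 0)
    pairs = list(combinations(range(len(a)), 2))
    s = sum(sign(a[j] - a[i]) * sign(b[j] - b[i]) for i, j in pairs)
    return s / len(pairs)


def tau_report(x, y):
    """Return (tau(y, fitted), tau(x, residuals)) for the Theil-Sen fit."""
    m, b = theil_sen(x, y)
    fitted = [m * xi + b for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return kendall_tau(y, fitted), kendall_tau(x, residuals)


# Olaf's example: y_i = x_i, so the fit is exact and all residuals are zero.
print(tau_report([1, 2, 3], [1, 2, 3]))          # (1.0, 0.0)

# Made-up noisy data: the residuals are no longer all equal.
x = [1, 2, 3, 4, 5]
y = [1.2, 1.9, 3.4, 3.8, 5.1]
print(tau_report(x, y))                          # (1.0, 0.0)
```

In both cases the fitted values are perfectly rank-correlated with yi, while the residuals have zero tau against xi, which is the property the corrected article text describes.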

Bias

The statement on unbiasedness,

The Theil–Sen estimator is an unbiased estimator of the true slope in simple linear regression

is unfounded. The corresponding source explicitly states that Sen's claim to that effect is incorrect. It should be removed. Muhali (talk) 08:38, 14 February 2017 (UTC)

Just dug a little deeper. Their counterexample is built on asymmetric noise, which is somewhat rare, so maybe we just keep it the way it is stated now. Muhali (talk) 09:04, 14 February 2017 (UTC)

Accuracy of the estimated slope

The description seems to be of a kind of percentile bootstrap, but as far as I can see, it is incorrect. The procedure described here would yield a 95% interval for the sampled slopes, not (as it should) for their median. A reference for the described procedure is missing. Maybe someone has a good reference to a good way of doing this? (I don't have one handy now.) --Han691 (talk) 17:23, 19 August 2019 (UTC)
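
The concern raised here can be illustrated with a small simulation. This is a sketch under stated assumptions, not the procedure described in the article and not a referenced method: it swaps in a plain percentile bootstrap over the data points (resample the points, recompute the median slope each time). The data, the seeds, and the helper names `pairwise_slopes`, `percentile_interval`, and `bootstrap_slope_interval` are all made up for illustration. The bootstrap interval is contrasted with the 95% percentile range of the raw pairwise slopes, which describes the spread of the slopes rather than the uncertainty of their median.

```python
import random
from itertools import combinations
from statistics import median


def pairwise_slopes(x, y):
    """Slopes through every pair of points with distinct x values."""
    return [(y[j] - y[i]) / (x[j] - x[i])
            for i, j in combinations(range(len(x)), 2)
            if x[j] != x[i]]


def percentile_interval(values, level=0.95):
    """Crude nearest-rank percentile interval over the sorted values."""
    s = sorted(values)
    tail = (1 - level) / 2
    lo = s[round(tail * (len(s) - 1))]
    hi = s[round((1 - tail) * (len(s) - 1))]
    return lo, hi


def bootstrap_slope_interval(x, y, n_boot=500, level=0.95, rng=None):
    """Percentile bootstrap for the median slope (illustrative assumption)."""
    if rng is None:
        rng = random.Random(0)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        slopes = pairwise_slopes([x[i] for i in idx], [y[i] for i in idx])
        if slopes:  # skip degenerate resamples in which all x are equal
            estimates.append(median(slopes))
    return percentile_interval(estimates, level)


# Made-up data: true slope 2 with Gaussian noise of standard deviation 1.
rng = random.Random(42)
x = list(range(30))
y = [2.0 * xi + rng.gauss(0, 1) for xi in x]

slopes = pairwise_slopes(x, y)
raw_lo, raw_hi = percentile_interval(slopes)
boot_lo, boot_hi = bootstrap_slope_interval(x, y, rng=rng)

print("median slope:", median(slopes))
print("95% range of raw pairwise slopes:", (raw_lo, raw_hi))
print("95% bootstrap interval for the median slope:", (boot_lo, boot_hi))
```

On data like this, the range of the raw pairwise slopes is typically much wider than the bootstrap interval for the median slope, which is the gap between the two quantities that the comment points out.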

Broken Link

The link for reference 24 is broken. 172.56.200.238 (talk) 23:18, 15 July 2024 (UTC)

Updated. —David Eppstein (talk) 10:10, 16 July 2024 (UTC)