Wikipedia:Reference desk/Archives/Computing/2016 May 13

Computing desk
Welcome to the Wikipedia Computing Reference Desk Archives
The page you are currently viewing is an archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


May 13

Smartphone turned off by itself one night

Cannot be turned on again. Recharging the phone makes it hot very quickly. It's the HTC One S. Is it worth taking it to a repair shop? Thanks. Imagine Reason (talk) 01:22, 13 May 2016 (UTC)[reply]

How old is it? I'm guessing it's out of warranty. If you are very lucky, it's just the battery; a repair shop will be able to fix it, but it won't be free. If you are unlucky, they might want money to find out it's not the battery. Doesn't look like the easiest phone to open up yourself. Vespine (talk) 04:59, 13 May 2016 (UTC)[reply]
Funny, the instructions are mainly about melting the adhesive that holds it all together, then it says to do the reverse to put it back together. What's the reverse of melting adhesive ? StuRat (talk) 03:13, 14 May 2016 (UTC)[reply]
As for whether it's worth trying a repair shop, it only costs about $65, right  ? I'm skeptical that you can get it repaired for much less than that, and there's also the risk they will charge you to diagnose the problem, but not fix it. So, unless you can find a place that does the diagnosis for free, and only charges for the actual repairs, I wouldn't bother. (There's also the security risk that they might misuse data on your phone.) Spend your money on a new phone, instead. StuRat (talk) 03:18, 14 May 2016 (UTC)[reply]
BTW, it sounds like a fire hazard, so don't attempt to charge it again, unless you monitor it the whole time, with a fire extinguisher handy. StuRat (talk) 03:44, 14 May 2016 (UTC)[reply]
I would probably ask around if I knew someone with that same phone, and if so I would ask that person if I can borrow their battery for a couple of minutes. That way you can test if the battery is the problem, which seems likely, and if the phone works with that battery then you can consider if you want to buy a new battery. Do not buy a fake Chinese copy of the battery, those things can be very dangerous. The Quixotic Potato (talk) 20:58, 15 May 2016 (UTC)[reply]
But the instructions above mean getting a battery in and out of the phone is quite tricky and requires special tools. I doubt if his friend would be willing to put his phone through the procedure. I would not buy a phone where I can't pop the battery in and out easily, for this reason, and because that's the only way to really ever turn a phone off when you don't want to be tracked. StuRat (talk) 21:27, 15 May 2016 (UTC)[reply]
Good point, I actually like disassembling hardware and trying to put it back together again, but not everybody does. And I have to admit that I have accidentally destroyed a couple of things because of my curiosity. The Quixotic Potato (talk) 22:14, 15 May 2016 (UTC)[reply]
Yep, I managed to fix the faulty plug on my IBM ThinkPad, but tore the absurdly fragile paper cable to the TrackPoint in the process, and now must use an external mouse. StuRat (talk) 22:39, 15 May 2016 (UTC)[reply]

Double floating-point: "recalling that J has exactly 53 bits (is >= 2**52 but < 2**53)"?

This is from Floating Point Arithmetic: Issues and Limitations. The part in parentheses: shouldn't it be simply "(is = 2**53)"?

More context:

the computer strives to convert 0.1 to the closest fraction it can of the form J/2**N where J is an integer containing exactly 53 bits. Rewriting
1 / 10 ~= J / (2**N)
as
J ~= 2**N / 10
and recalling that J has exactly 53 bits (is >= 2**52 but < 2**53), the best value for N is 56:
>>> 2**52 <=  2**56 // 10  < 2**53
True
That is, 56 is the only value for N that leaves J with exactly 53 bits. The best possible value for J is then that quotient rounded:

--Llaanngg (talk) 14:00, 13 May 2016 (UTC)[reply]

(moved from another desk)

This really belongs on the Math or the Computer desk, but it's a simple question and I'll answer here: No, the wording is correct. Think of a smaller example: the numbers 4, 5, 6, 7 each have exactly 3 bits: binary 100, 101, 110, 111. Those are the numbers that are >= 2**2 and < 2**3. The numbers with exactly 4 bits are 8 (1000) to 15 (1111); they are >= 2**3 and < 2**4. Well, similarly for 53 bits. --69.159.60.83 (talk) 14:09, 13 May 2016 (UTC)[reply]
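To see the ranges concretely, here is a quick check in a Python 3 interpreter (in the same REPL style as the tutorial excerpt above); int.bit_length() counts the bits of an integer:
>>> (2**52).bit_length(), (2**53 - 1).bit_length()   # smallest and largest 53-bit integers
(53, 53)
>>> (2**53).bit_length()   # the first integer that needs 54 bits
54
>>> J = 2**56 // 10   # the J from the tutorial, with N = 56
>>> 2**52 <= J < 2**53, J.bit_length()
(True, 53)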

Efficient algorithms on collections of pixels (Java)

I have a situation where I have two groups of pixels. These groups are generated using a Random walk, so the pixels in each group are close to other pixels of that same group. Here is what I need to find out:

Is one random walk "too close" to another? In other words, are ALL the pixels in Group 1 at least "x" pixels from ALL the pixels in group 2?

Here is the algorithm I currently use, in pseudocode for Java:

First, make a Rectangle object "r" for each group. Add each point to its respective rectangle, using the r.add() method.
When it is time to see if one group is "too close" to another, check the distance between the rectangles. If the rectangles are more than "x" pixels away from each other, then the groups are far enough apart, because it is not possible for any given pixel in groups 1 and 2 to be closer than "x" to each other. If the rectangles are CLOSER than "x" pixels to each other, this isn't necessarily a problem, but now we have to check EVERY pixel in group 1 against EVERY pixel in group 2, because the rectangles are not very tight boundaries to the actual shape of the random walk. This check is done in a double for loop, e.g.: "for (Point p : group1) { for (Point q : group2) { ... } }".

As you may have guessed, this algorithm works perfectly, but is not very efficient, because if the random walks are close enough to each other, you have to do a very large number of checks. If each random walk consists of 200 pixels, that's 200^2 = 40,000 checks. Can the time complexity be improved?

A couple of thoughts I've had on this... perhaps we can get a tighter bound than a rectangle? Something that has an .add() method and is close to the "convex hull" of the walk? I know convex hull is not the proper word, since this word is usually used in the context of polygons, but I hope it is understood what I mean. If we had a tighter bound like this, checking the distance between the two bounds might help.

Another thing that MIGHT help is knowing on what "side" each walk is relative to the other. For instance, if we know that "Group 1 is northwest of Group 2", then we know theoretically which pixels are closest to each other, and we wouldn't bother checking the southernmost or easternmost pixels in group 2. This idea, although it seems good, is probably more problematic than it is worth. The reason for this is that we could have two random walks that make sort of hook shapes, which go into each other's proximity, but don't make a statement like "Group 1 is in x direction from Group 2" meaningful. In cases like this, the groups are in almost every conceivable direction from each other, if you look at specific points.

Is there any nice way to improve the time complexity of my algorithm in Java? Thanks in advance!

216.173.144.188 (talk) 15:29, 13 May 2016 (UTC)[reply]

If you want to ensure a minimum Euclidean distance, then according to "Optimal Algorithms for Computing the Minimum Distance Between Two Finite Planar Sets" by Toussaint and Bhattacharya, you can do it by computing the Delaunay triangulation (or Euclidean minimum spanning tree) of the points, then iterating over those edges to find the shortest one that spans the two subsets. Computing the Delaunay triangulation (or EMST) is the hard part, but you can probably find a free graph library in Java that will do it, since it's an important and well studied problem. This algorithm takes O(n log n) time (the same as sorting a list) and O(n) space (a constant amount of memory per point).
If you want taxicab distance instead, then you should be able to do the same thing with the rectilinear minimum spanning tree. (The paper doesn't prove this, but as far as I can tell the proof goes through without change in this case.) The RMST can be computed in the same time and space as the Delaunay triangulation or EMST, but it may be harder to find a library that will do it.
I strongly advise against inventing your own homebrew algorithm, as suggested by Nimur, Stephan Schulz, StuRat, and Akld guy below this point. It will almost certainly be slower, more complicated, and incorrect. -- BenRG (talk) 20:54, 13 May 2016 (UTC)[reply]
Indeed. I agree wholeheartedly. On Wikipedia, as in the rest of your life, you should avoid everybody's advice except for one self-proclaimed expert, because nobody else has any experience, education, or ideas. Once you correctly identify which person is the expert, ignore everyone else. There is no chance that anybody else is ever competent enough to provide alternate ideas; it is a certainty that everything everybody else says is wrong, even if they are a professor of computer science, a hobbyist-enthusiast who spends hours each week on similar problems, or a new student who is progressing through the same educational process as the originator of the question. The advice of all of these individuals is absolutely wrong and irrelevant. You should never design or write new software, because all important problems are already solved; and because an otherwise-anonymous expert volunteer from the Internet has doubts about your ability to implement it correctly and efficiently. Your explanation of the problem was sufficient to prescribe a specific methodology, and you ought not think about any alternative methods, even if your use-case differs from our preconceived notions. In fact, save yourself some time by skipping over any suggestions except the first, correct, and only answer. Whatever you do, avoid reading any non-expert opinions, because those might be dangerous and wrong, and you have already identified the only person who can provide you with correct ideas. This approach to information-gathering is an excellent strategy derived from analytic decision-theory. As I am a great enthusiast and proponent of statistical methods, my only follow-up suggestion is that when you pick the single expert whose advice you follow, you ought to select them randomly, rather than simply selecting the advice whose vertical position on the page is closest to the top. Unlike certain other websites, our MediaWiki software permits any contributor to move content around on the vertical axis, so vertical position on the page is a bit of a non-signal. We're a little bit more fluid and nonlinear around here. It takes some getting-used-to. Nimur (talk) 14:56, 14 May 2016 (UTC)[reply]
LOL. But satire aside, even in cases where it's highly unlikely we will come up with a more efficient method, like a prime number sieve, there's still value in trying, as the attempt teaches us many valuable skills. This is quite common in science classes, where students are asked to experimentally confirm some principle, rather than just accept it on blind faith. And occasionally, some piece of "expert wisdom" turns out to be completely wrong. For example, the Wright Brothers made little progress until they threw out the lift calculation tables and formula they had been using, and instead calculated the values directly, using a wind tunnel they made. StuRat (talk) 16:08, 14 May 2016 (UTC) [reply]
The algorithm I linked is O(n log n), which is provably optimal for general sets (it's in the title of the paper). I think the constant factor is reasonable. It's also optimally easy to implement, because almost all of it has already been written by others, except for a few lines of glue. There's always room for new research, but both of you are far from having any competence in algorithm design, and this is not the right place to practice. Stephan Schulz's answer is the only one that outlines an approach that might work provably and efficiently. I think it would have poor big-O complexity because complicated interleavings of the sets are possible in principle, but it might do well with high probability for sets generated by random walks. But it would still be much harder to turn it into working code. Using a well-tested library in place of homebrew code is just good software engineering practice.
Both of you post terrible answers to the ref desk all the time. I don't think they're terrible because I have something against you or against the world. I think they're terrible because of their content. They would be terrible no matter who posted them, but it happens that a lot of them are signed by you two. Many people have said so in StuRat's case, though not in Nimur's, probably because it takes more domain-specific knowledge to evaluate his answers. Stephan Schulz posts high-quality answers (again, this is based on their content, not on his being a CS professor, which I didn't know). I felt bad about lumping his answer in with the others, but I still think homebrew algorithms in general are a bad idea when something better is available.
I know that you don't realize how bad your answers are. I'm sure you wouldn't post them if you did. And this doesn't apply to everything you've ever written on the reference desk. Just a noticeable fraction of it. -- BenRG (talk) 18:28, 14 May 2016 (UTC)[reply]
Actually, the criticism of my answers isn't typically that they are poor, just unreferenced (and those who make that criticism frequently post poor and/or unreferenced answers themselves). However, ad hominem attacks are out of place at the Ref Desk (as they are in general), and each post should stand or fall on its own merits, not based on appeal to authority. StuRat (talk) 19:58, 14 May 2016 (UTC)[reply]
As I said in another thread, getting all of the answers on a true-false test wrong is as difficult as getting them all right. If you don't do any research, you get half of them right. The information content of 0% correct answers is 1 bit per answer. The information content of 50% correct answers is zero. That's why people ask you to do research.
Some people can get away with not doing the research because their internal confidence in their own beliefs correlates well with reality. Your answers here suggest that you aren't one of those people.
I'm not arguing that your answers in this thread are bad because your answers in other threads are bad. I'm trying to make you realize that you frequently post bad answers, in the hope that you'll stop doing it. -- BenRG (talk) 17:36, 16 May 2016 (UTC)[reply]
Then this is still the wrong place to post. And, if it's your contention that evidence is required for all posts, where is your evidence that my answers are wrong more often than others ? Either provide a detailed analysis, say, of my last year's worth of answers versus everyone else's, or stop posting your personal opinions here. BTW, I don't like to use your appeal to authority, but I am a computer programmer/consultant who has worked at EDS, General Dynamics, and IBM, among others, on database and computer graphics applications. StuRat (talk) 19:03, 16 May 2016 (UTC)[reply]
Here's a pseudocode/Python implementation of the algorithm for Euclidean distance:
        def minimumDistanceBetweenTwoCollectionsOfPoints(points1, points2):
            points = points1 + points2  # concatenate the two lists
            edges = delaunayTriangulation(points)  # or euclideanMST(points)
            minDistSquared = infinity
            for (i, j) in edges:  # i and j are indices into the points array
                if (i < len(points1)) != (j < len(points1)):  # if one end is in points1 and the other isn't
                    minDistSquared = min(minDistSquared, squaredDistance(points[i], points[j]))
            return sqrt(minDistSquared)
It will probably be longer in Java, but that's all of the logic you need to write. -- BenRG (talk) 19:16, 14 May 2016 (UTC)[reply]
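For anyone who wants to run the above as actual code, here is one possible fleshing-out of the pseudocode in Python. It assumes SciPy is available for the Delaunay triangulation (SciPy is not named above; any library that can return the triangulation's edges would do), and it expects the point sets as lists of (x, y) tuples that are not all collinear:
        import math
        from itertools import combinations
        from scipy.spatial import Delaunay  # assumed library; any Delaunay/EMST implementation works

        def minimum_distance_between_sets(points1, points2):
            points = points1 + points2                # concatenate the two lists of (x, y) tuples
            tri = Delaunay(points)                    # Delaunay triangulation of the combined set
            edges = set()
            for simplex in tri.simplices:             # each simplex is a triangle of point indices
                for i, j in combinations(sorted(simplex), 2):
                    edges.add((i, j))
            best = float('inf')
            for i, j in edges:                        # keep only edges with one end in each subset
                if (i < len(points1)) != (j < len(points1)):
                    dx = points[i][0] - points[j][0]
                    dy = points[i][1] - points[j][1]
                    best = min(best, dx * dx + dy * dy)
            return math.sqrt(best)
Checking whether two walks are "too close" is then just minimum_distance_between_sets(group1, group2) < x.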
That doesn't seem well optimized, in that it's still doing a comparison of every point in set 1 with every point in set 2. It does avoid the square root operation until the min distance is found, as I had recommended in my original post item #2. But my main criticism is that it seems to be a solution to the wrong problem. Specifically, it finds the minimum distance between two entirely random sets of coords. But the problem was to find if the minimum distance is below a threshold, and the point sets are not random, but generated by a random walk method, meaning adjacently generated points in each set must be close together. We can take advantage of these to only do the calcs in the cases where there is a possibility the distance will be below the threshold, as I outlined in my latest (15:36, 14 May 2016 (UTC)) post. Also, there should be the trivial check to stop the calcs once we find a pair of pixels that are within the threshold (and if you really did need to find the minimum distance, not just if it was within the threshold, then the trivial check would stop at a distance of zero). StuRat (talk) 19:58, 14 May 2016 (UTC)[reply]
The run time is dominated by the Delaunay triangulation. The loop at the end examines each edge in the triangulation just once, and there are O(n) edges. I wrote a minimumDistance function rather than an isDistanceLessThan function because it's more generally useful. I assume the original poster can figure out how to add the early bailout if it's important.
It's certainly possible that an algorithm could do better by exploiting the special structure of these sets. If the optimum for general sets was O(n^(3/2)), then I'd hold out for a faster special-case algorithm. But O(n log n) is usually fast enough. That said, if you can turn your multi-pass sampling idea into an algorithm covering all cases, and prove an O(n log n) or better time bound on it, post it. -- BenRG (talk) 17:36, 16 May 2016 (UTC)[reply]
I intend to, but posting it before this Q is archived might not be possible. Maybe I will open a follow-up Q. StuRat (talk) 19:06, 16 May 2016 (UTC)[reply]
Pixels are incredible data types. Pixels are legion. We don't just process pixels: we expect them. There are so many pixels that we invented a unit, "megapixel," just to count them. The incredible thing about pixels is that one individual pixel rarely makes any difference. It is only in the aggregate case, when we have many pixels, that we have an image.
So, in the context of image processing algorithms, perhaps the most useful trick you may avail yourself of is that of statistical signal processing. While you may be able to make great use of incredibly difficult algorithm theory to reduce time-complexity from, say, O(n^5) to O(n^3 log(n^2)) - usually by trading off some important algorithmic characteristic, or whatever... you're often still going to find that in actual practice, processing that many pixels takes too darned long.
But enter the realm of statistics! Can you get away with calculating the original, complete algorithm, on a tiny number of pixels? What statistical error rate do you get if you only process, say, one pixel? What if you process a randomly-selected set of ten pixels, or a thousand, or five hundred thousand pixels? In any case, it has to be better than processing megapixels. If you have normally-distributed characteristics, chances are high that your algorithm will give you the exact same outcome, even if you use an abjectly tiny subset of the input data. It's a stupidly simple way to reduce execution time without reducing the actual time-complexity.
Of course, lots of image-processing works with non-normal statistical distributions: but don't worry, because there are ways around those problems, too. Take some inspiration from Monte Carlo methods.
So: addressing your specific problem, comparing "distance" between two data sets. Consider computing the centroid of each random-walk, and comparing the distance only between centroids. If that works, great! You reduced O(n^2) comparisons to two centroid calculations and exactly one comparison operation. O(2n + 1) ... that's a good bit of improvement! Next, see if you can sneak by without actually computing the centroid: compute a centroid for only a subset of elements in the walk. Now you've gone sub-linear in algorithmic speed! Theory guys will drop jaws at that accomplishment! For the advanced reader, figure out how to compute the centroid without computing anything at all, assuming a normal random walk. Hey, a quadratic algorithm can be reduced to a single-instruction constant-time algorithm! Real math is hard; wars are won using sloppy math. Learn to love the predictability of random variables.
But let's suppose that the centroid is insufficient: when you compare the full algorithm against your optimized shortcut, perhaps you just can't squeeze by with this measure alone. Well, don't fear: compute the centroid and a few higher order moments. You're still linear - O( m × (2n + 1) ) for small m (the number of higher order moments required to appropriately match the golden standard of a complete run of the original algorithm on the complete data set).
Write a program to test-run and parameter-sweep number of moments, and quantity of sub-samples, against the golden-standard algorithm. How low can you drop the sample-size before your accuracy falls below, say, 99.999% accurate? 99.0% accurate? 90% accurate? How do higher-order moments help?
You can keep tuning and tweaking: how much may you subsample without sacrificing accuracy?
Nimur (talk) 16:10, 13 May 2016 (UTC)[reply]
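To make the centroid shortcut above concrete, here is a minimal sketch in Python; the subsample size and the squared-distance comparison are illustrative choices, not something prescribed in the post:
        import random

        def centroid(points):
            n = len(points)
            return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

        def centroids_far_apart(group1, group2, threshold, sample_size=50):
            # Optionally estimate each centroid from a random subsample rather than every pixel.
            s1 = random.sample(group1, min(sample_size, len(group1)))
            s2 = random.sample(group2, min(sample_size, len(group2)))
            (cx1, cy1), (cx2, cy2) = centroid(s1), centroid(s2)
            dx, dy = cx1 - cx2, cy1 - cy2
            return dx * dx + dy * dy > threshold * threshold   # compare squared distances
As the reply below notes, a centroid test alone can misjudge arbitrarily shaped walks, so it is best treated as a cheap first filter rather than a definitive check.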

Okay... What I am getting out of this, sans the specific insight using statistics, is that this really seems to be one of those tasks that isn't possible to reduce much without a tradeoff in accuracy, right? I may indeed try to follow your logic and build a tester for various algorithms where I omit a number of points, etc., but I think the centroid thing won't work very well. Random walks can be very arbitrarily shaped, and I want to make walks that are close to each other without touching. Centroid-based thinking won't do this correctly, I feel, so I will maybe try something based on sampling or similar.

Anyone have another approach?

216.173.144.188 (talk) 16:24, 13 May 2016 (UTC)[reply]

Sorry to have glossed over this important point: if you're concerned about the geometric shape of the walk, that's exactly what each higher order moment provides. The centroid sort of tells you "where the clump is"; the first moment sort of tells you "which direction" the clump of points is going (i.e., along which straight line the walk has sort of clumped); the second moment sort of tells you how "spread out" they are; the third order moment is difficult to describe visually, but tells you how "spread out" the "spreading out" gets - in other words, how "wrong" your guesses are. If you compute n moments, you have completely described the entire data set using a linearly-independent basis set. The point of using geometric moments is that most of the useful information is contained in the first one, two, or three moments. (I'm leaving aside some mathematical nitpicks about transforming a normal distribution - we're talking about applying a heuristic here).
You could do the same with a spatial Fourier transform of the point coordinates; or with a pole decomposition; or any other of a variety of methods. In the aggregate, these are transform theory applied to the purpose of principal component analysis. There is a lot of difficult formal math you can learn to formally prove why this stuff works; but if you just need to use it, you can accept on blind faith that the first three geometric moments will "do what you need." This is why it allows you to sub-sample so significantly, which means it's fast; but if you need accuracy, you can use the entire data set, and (depending on your hardware), an FFT might be faster than a bunch of applications of the Pythagorean theorem.
Nimur (talk) 16:34, 13 May 2016 (UTC)[reply]
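For concreteness, here is a small sketch of the first- and second-order geometric moments of a point set (the centroid plus the three entries of the second central moment); treating exactly these quantities as the walk's "shape summary" is an assumption, since the post above does not pin down a convention:
        def shape_summary(points):
            n = len(points)
            mx = sum(p[0] for p in points) / n                        # centroid (first-order moments)
            my = sum(p[1] for p in points) / n
            mxx = sum((p[0] - mx) ** 2 for p in points) / n           # second-order central moments:
            myy = sum((p[1] - my) ** 2 for p in points) / n           # spread along x, spread along y,
            mxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n  # and the x-y cross term
            return (mx, my), (mxx, mxy, myy)
Comparing these few numbers for two walks costs O(n) per walk rather than O(n^2) for the full pairwise check, which is the appeal of the moment-based shortcut.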
Maybe you can describe your actual application? One approach to your problem would be to organise the page into successively smaller squares. Squares are either empty, red or blue. Whenever you need to put a blue pixel into a red square, you split the square until the sub-squares are pure again (or until you have a purple square that is smaller than your safety distance). When looking for conflicts, you only need to check adjacent squares. Squares can be efficiently indexed via their coordinates. See octree for the 3-d analogy. --Stephan Schulz (talk) 16:38, 13 May 2016 (UTC)[reply]
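A simplified, fixed-grid variant of this idea is sketched below: instead of splitting squares adaptively, it buckets every pixel into a cell of side x and only compares a pixel against pixels in its own and adjacent cells. The function and variable names are made up for illustration:
        from collections import defaultdict

        def too_close(group1, group2, x):
            cell = lambda p: (p[0] // x, p[1] // x)      # bucket each pixel into an x-by-x grid cell
            grid = defaultdict(list)
            for q in group2:
                grid[cell(q)].append(q)
            for p in group1:
                cx, cy = cell(p)
                for dx in (-1, 0, 1):                    # any pixel within distance x of p must lie
                    for dy in (-1, 0, 1):                # in the same cell or an adjacent one
                        for q in grid.get((cx + dx, cy + dy), ()):
                            if (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 < x * x:
                                return True
            return False
When the walks stay apart, most cells of one colour have no neighbouring cells of the other colour, so the inner loops do very little work.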

Nimur, apologies... I was simply unfamiliar with this type of information, and I did not mean to disregard it in that way. I at least conceptually understand the idea of moments, based on your explanation. I will have a look into what is involved in calculating them. Thanks.

Stephan: your subdivision method does seem interesting as well, thanks for the input!

216.173.144.188 (talk) 17:01, 13 May 2016 (UTC)[reply]

Some points (my advice is far more practical and less theoretical):
1) 40000 checks isn't all that much, as long as each check is quick.
2) One method to make each check quicker is to omit the square root. So, if the min distance is 10, instead of checking that sqrt[(x1-x2)^2 + (y1-y2)^2] < 10, just check that [(x1-x2)^2 + (y1-y2)^2] < 100. (A short code sketch of this check appears below the diagram.)
3) There are methods that we could use to improve speed, but they may not be accurate 100% of the time, and may not be an improvement over the suggestion in part 2. The method I am guessing was proposed above is to take the two start points of the random walks, measure each pixel's distance from the opposing start point, and only consider the closest pair. So, the pixel in set A closest to start point B and the pixel in set B closest to start point A. Again, I'd look at the squares of the distances rather than the distances themselves. If your 200×200 example is the upper limit, I really doubt if you need to do this, but if you have 1 million × 1 million, then I can certainly see the need, as without it you would need to do a trillion checks.
4) The accuracy of method 3 could be improved (but the speed decreased a bit) if, instead of looking at the distance from each point in set A to start point B, you instead look at the distance from each point in set A, as projected onto the line connecting starting points A and B, to start point B. Of course, here the possibility exists that the point might project onto the line past point B, in which case the pixels are likely too close. Here's a diagram, with A and B being the respective start points, a's and b's being pixels in sets A and B, and the a's and b's on the line being their projections:
                     a
       a              
                      
---A---a---b------B--a----
            
           
           b
Of course, there's no reason to presume that the line connecting start points A and B will actually be horizontal; it's just simpler to draw that way. StuRat (talk) 00:39, 14 May 2016 (UTC)[reply]
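Here is the code sketch referred to in point 2: a plain double loop that compares squared distances and bails out as soon as a too-close pair is found (variable names are illustrative):
        def far_enough_apart(group1, group2, min_dist):
            min_dist_sq = min_dist * min_dist            # compare squared distances; no sqrt needed
            for (x1, y1) in group1:
                for (x2, y2) in group2:
                    if (x1 - x2) ** 2 + (y1 - y2) ** 2 < min_dist_sq:
                        return False                     # found a pair closer than the threshold
            return True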
  • If you've left it till after the map is drawn, you've left it too late. Also, the step where the rectangles are compared can be eliminated. I'd do it this way: Initialize four variables, N1, S1, E1, W1, for the northernmost, southernmost, easternmost, and westernmost pixel values of the first walk. For a 640 x 480 pixel display, the initial values would be 479, 0, 0, 639 respectively. For the first path, each time a pixel is plotted, if its vertical-axis value is less than the current N1 value, overwrite N1 so it becomes the new northernmost value. If its vertical-axis value is greater than the current S1 value, overwrite S1. Do the same for the W1 and E1 variables with the horizontal-axis values. Now initialize another four variables, N2, S2, E2, W2, and do the same while plotting the second path. This process yields the northernmost, southernmost, easternmost, and westernmost pixel values of each walk. Now all you need do is compare the following values: N1 versus S2, N2 versus S1, E1 versus W2, and E2 versus W1. Four comparisons. If any of those four results in less than the minimum distance, the paths are too close. Akld guy (talk) 08:47, 14 May 2016 (UTC)[reply]
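A sketch of the running-extremes bookkeeping described above; the final comparison is written here as the gap between the two bounding boxes (a slight strengthening of the four raw comparisons), and the 479, 0, 0, 639 initial values are the ones given for a 640 x 480 display:
        def update_extremes(extremes, px, py):
            # extremes = (northernmost, southernmost, easternmost, westernmost);
            # start from (479, 0, 0, 639) and call this for every plotted pixel of a walk.
            n, s, e, w = extremes
            return (min(n, py), max(s, py), max(e, px), min(w, px))

        def boxes_too_close(ext1, ext2, min_dist):
            n1, s1, e1, w1 = ext1
            n2, s2, e2, w2 = ext2
            hgap = max(0, w1 - e2, w2 - e1)              # horizontal gap between the boxes (0 if they overlap)
            vgap = max(0, n1 - s2, n2 - s1)              # vertical gap between the boxes
            return hgap * hgap + vgap * vgap < min_dist * min_dist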
  • Then they might be too close. And the term "drawing the rectangles" refers to just what you described, with the rectangles being imaginary. Here's a case where the rectangles overlap but the points are not too close:
                 b------------+
                 |            |
    a-----------++------------b
    |           |
    |           |
    +-----------a
So, as the OP stated, this might be a good first step, since, if it passes the rectangle test we know we are good, but if it doesn't, then some more sophisticated method is required. StuRat (talk) 12:26, 14 May 2016 (UTC)[reply]
Note that the test 'are ALL the pixels in Group 1 at least "x" pixels from ALL the pixels in group 2?' might occasionally produce results where one set spirals around the other, but never gets within the min distance. Is it your intention that such a case should pass, or should it fail ? If it should fail, we need to be more careful about how we define the test. StuRat (talk) 12:38, 14 May 2016 (UTC)[reply]
The OP himself stated "the rectangles are not very tight boundaries to the actual shape of the random walk.", so there's some discrepancy that he hasn't accounted for. He had previously said that if the rectangles are closer than "x" pixels to each other, every pixel in one path would have to be checked against each pixel in the other. That's an enormous amount of processing time wasted. I have some experience in this, with GPS track plotting in APRS, but have only ever written in Assembler. Akld guy (talk) 13:45, 14 May 2016 (UTC)[reply]
Right, and that's why they asked for more efficient checking methods for those cases where the rectangles are close enough that the points might be too close. StuRat (talk) 14:19, 14 May 2016 (UTC)[reply]
Another method to prune down how many checks you do is by sampling, rather than testing against every point. For example, if you check every tenth point, then your 200×200 case becomes only a 20×20 case, so takes 1/100th the time to calculate. You could then go into more detail on a 2nd pass, say by comparing the 20×20 points on either side of the two closest points in the first pass. This would still only add up to 1/50th the CPU time. If you are still worried you might miss the closest two points, you could expand the checks on the closest 9 sets of points in the first pass, and still take 1/10th the time of the original full calcs.
You can also apply some logic, such as if the max distance between adjacent points in a walk is w, then you know the points between the sample points can only be so much closer than the sample points:
>|-|<- max walk distance between adjacent points, w
 | |
 A |                      B  <- First set of sample points.
   a                    b
     a                b
       a            b
         a        b
          >a    b<------------- Minimum possible distance, m            
         a        b
       a            b
     a                b
   a                    b
 A                        B  <- 2nd set of sample points.
 |                        |
>|------------------------|<--- min measured distance between sample points, M 
So, with every tenth point sampling, it works out that:
m = M - 10w
Or, for every nth point sampling:
m = M - nw

However, this is based on the assumption that midway the walk must "turn around" to go back to the next sample point. This may not always be so, in which case the specific formula becomes:
m = M - 20w
And the general formula becomes:
m = M - 2nw 
Therefore, if you calculate that m is greater than the minimum allowed distance, you don't need to do a more detailed pass to figure it out. So, these calcs will tell you exactly how many 2nd passes you need to take, so you are no longer guessing. If you had a larger set of points to start with, you could go to 3 passes, each at increasing resolution, or even 4 passes or more. StuRat (talk) 15:36, 14 May 2016 (UTC)[reply]
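A sketch of this two-pass scheme in Python: compare every nth point first, and only drop to a detailed check around sample pairs that the conservative m = M - 2nw bound cannot clear. The +/- n window around each flagged sample is an illustrative choice:
        def dist_sq(p, q):
            return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

        def too_close_two_pass(walk1, walk2, min_dist, n, w):
            # n = sampling interval, w = max distance between adjacent points in a walk
            min_dist_sq = min_dist * min_dist
            for i in range(0, len(walk1), n):            # pass 1: every nth point of each walk
                for j in range(0, len(walk2), n):
                    M_sq = dist_sq(walk1[i], walk2[j])
                    if M_sq < min_dist_sq:
                        return True                      # even the sample points are too close
                    M = M_sq ** 0.5
                    if M - 2 * n * w < min_dist:         # the bound can't rule this pair out...
                        # ...so pass 2: check all points near this pair of samples
                        for p in walk1[max(0, i - n): i + n + 1]:
                            for q in walk2[max(0, j - n): j + n + 1]:
                                if dist_sq(p, q) < min_dist_sq:
                                    return True
            return False
With n = 10 and 200-point walks, pass 1 is 20 × 20 comparisons, and pass 2 only runs for the sample pairs that the bound fails to clear.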