Wikipedia:Reference desk/Archives/Mathematics/2020 May 24

Mathematics desk
Welcome to the Wikipedia Mathematics Reference Desk Archives
The page you are currently viewing is a transcluded archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


May 24


Optimization: Necessary and Sufficient Conditions for Superlinear Convergence of Quasi-Newton Method


Hello,

I am reading *Numerical Optimization* by Nocedal & Wright, and I am having trouble understanding some aspects of the proof of Theorem 3.7. I have written out the theorem, its proof, and my questions in LaTeX; I'm sorry I couldn't figure out how to make it render nicely on Wikipedia.

I have also asked this question on Math Stackexchange, if you would prefer to see it there: https://math.stackexchange.com/questions/3686947/proof-of-superlinear-convergence-of-quasi-newton-methods-in-nocedal-wright

I have been stuck on this theorem for many hours, so any help is greatly appreciated.

There are two things I don't understand:

1) The theorem is an iff statement. The author proves one direction, but I don't see how to prove the reverse.

2) The author seems to use the assumption that the Hessian is Lipschitz continuous, but this is not an explicit hypothesis of the theorem. Is this an oversight by the author? (I checked the errata, and it isn't listed there.)

The following is a line the author references in the proof; the theorem and its proof follow.

<math>\|x_k + p_k^N - x^*\| = O(\|x_k - x^*\|^2)</math>

[The above is where my point #2 comes from. This inequality was derived in the proof of an earlier theorem (the one about the quadratic convergence of Newton's method), and there we had the hypothesis that the Hessian is Lipschitz continuous, which was used to prove the inequality.]
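[For context, here is a sketch of how the Lipschitz hypothesis produces the quadratic bound <math>\|x_k + p_k^N - x^*\| = O(\|x_k - x^*\|^2)</math> — standard Newton-analysis material, not quoted from the book:]

```latex
% Newton step: p_k^N = -\nabla^2 f_k^{-1} \nabla f_k, with \nabla f(x^*) = 0.
% By the fundamental theorem of calculus applied to \nabla f along the segment
% from x_k to x^*,
\begin{align*}
x_k + p_k^N - x^*
  &= \nabla^2 f_k^{-1}\left[\nabla^2 f_k (x_k - x^*)
     - \bigl(\nabla f(x_k) - \nabla f(x^*)\bigr)\right] \\
  &= \nabla^2 f_k^{-1}\int_0^1
     \left[\nabla^2 f(x_k) - \nabla^2 f\bigl(x_k + t(x^* - x_k)\bigr)\right]
     (x_k - x^*)\,dt .
\end{align*}
% If \|\nabla^2 f(x) - \nabla^2 f(y)\| \le L\,\|x - y\| (the Lipschitz
% hypothesis), the integrand has norm at most L\,t\,\|x_k - x^*\|^2, so
\[
\|x_k + p_k^N - x^*\|
  \le \tfrac{L}{2}\,\|\nabla^2 f_k^{-1}\|\,\|x_k - x^*\|^2 ,
\]
% which is O(\|x_k - x^*\|^2) because \|\nabla^2 f_k^{-1}\| is bounded
% near x^* (the limiting Hessian is positive definite).
```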

'''Theorem 3.7.''' Suppose that <math>f</math> is twice continuously differentiable. Consider the iteration <math>x_{k+1} = x_k + p_k</math> (that is, the step length <math>\alpha_k</math> is uniformly <math>1</math>) and that <math>p_k</math> is given by <math>p_k = -B_k^{-1} \nabla f_k</math>. Let us assume also that <math>\{x_k\}</math> converges to a point <math>x^*</math> such that <math>\nabla f(x^*) = 0</math> and <math>\nabla^2 f(x^*)</math> is positive definite. Then <math>\{x_k\}</math> converges superlinearly if and only if

<math>\lim_{k \to \infty} \frac{\|(B_k - \nabla^2 f(x_k))\,p_k\|}{\|p_k\|} = 0 \qquad (3.36)</math>

holds.

'''Proof.''' We first show that (3.36) is equivalent to

<math>p_k - p_k^N = o(\|p_k\|),</math>

where <math>p_k^N = -\nabla^2 f_k^{-1} \nabla f_k</math> is the Newton step. Assuming (3.36) holds, we have that

<math>p_k - p_k^N = \nabla^2 f_k^{-1}\left(\nabla^2 f_k\,p_k + \nabla f_k\right) = \nabla^2 f_k^{-1}\left(\nabla^2 f_k - B_k\right)p_k = O\!\left(\|(\nabla^2 f_k - B_k)\,p_k\|\right) = o(\|p_k\|),</math>

where we have used the fact that <math>\|\nabla^2 f_k^{-1}\|</math> is bounded above for <math>x_k</math> sufficiently close to <math>x^*</math>, since the limiting Hessian <math>\nabla^2 f(x^*)</math> is positive definite. The converse follows readily if we multiply both sides of <math>p_k - p_k^N = o(\|p_k\|)</math> by <math>\nabla^2 f_k</math> and recall that <math>p_k = -B_k^{-1} \nabla f_k</math>.

By combining <math>\|x_k + p_k^N - x^*\| = O(\|x_k - x^*\|^2)</math> and <math>p_k - p_k^N = o(\|p_k\|)</math>, we obtain that

<math>\|x_k + p_k - x^*\| \le \|x_k + p_k^N - x^*\| + \|p_k - p_k^N\| = O(\|x_k - x^*\|^2) + o(\|p_k\|).</math>

A simple manipulation of this inequality reveals that <math>\|p_k\| = O(\|x_k - x^*\|)</math>, so we obtain

<math>\|x_{k+1} - x^*\| = o(\|x_k - x^*\|),</math>

giving the superlinear convergence result.
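[As a numerical sanity check on the theorem — my own hypothetical experiment, not from the book — the following runs BFGS with unit step length on the strongly convex test function <math>f(x) = \tfrac{1}{2}x^T A x + \tfrac{1}{4}\sum_i x_i^4</math> (minimizer <math>x^* = 0</math>) and tracks the Dennis–Moré ratio <math>\|(B_k - \nabla^2 f(x^*))p_k\|/\|p_k\|</math> alongside the error; both should tend to zero together:]

```python
import numpy as np

# Hypothetical test problem (not from Nocedal & Wright):
# f(x) = 0.5 x^T A x + 0.25 * sum(x_i^4), minimized at x* = 0.
A = np.array([[3.0, 1.0], [1.0, 2.0]])

def grad(x):
    return A @ x + x**3

def hess(x):
    return A + np.diag(3.0 * x**2)

x_star = np.zeros(2)
H_star = hess(x_star)          # limiting Hessian, positive definite

x = np.array([0.3, 0.3])
B = hess(x)                    # seed B_0 with the true Hessian so unit steps are safe
errors, ratios = [], []

for k in range(50):
    g = grad(x)
    if np.linalg.norm(g) < 1e-10:
        break
    p = -np.linalg.solve(B, g)                       # quasi-Newton step, alpha_k = 1
    ratios.append(np.linalg.norm((B - H_star) @ p) / np.linalg.norm(p))
    x_new = x + p
    s, y = x_new - x, grad(x_new) - g
    if y @ s > 1e-12:                                # standard BFGS update of B_k
        Bs = B @ s
        B = B + np.outer(y, y) / (y @ s) - np.outer(Bs, Bs) / (s @ Bs)
    x = x_new
    errors.append(np.linalg.norm(x - x_star))

print("errors:", errors)
print("Dennis-More ratios:", ratios)
```

[The starting point, matrix <math>A</math>, and the choice <math>B_0 = \nabla^2 f(x_0)</math> are all my own; with a poor <math>B_0</math>, unit steps need not converge at all, which is consistent with the theorem being a purely local statement.]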

In an earlier edition of the book (ISBN 978-0-387-22742-9), the statement of Theorem 3.7 starts with: "Suppose that <math>f</math> is twice differentiable and that the Hessian <math>\nabla^2 f</math> is _Lipschitz continuous_ ..." [my emphasis by underscoring — L.] The statement of Theorem 3.7 in a later edition (ISBN 978-0-387-40065-5) is as above, but the form of (3.36) is subtly different:

<math>\lim_{k \to \infty} \frac{\|(B_k - \nabla^2 f(x^*))\,p_k\|}{\|p_k\|} = 0.</math>

So which edition is the above from? The presentation is not entirely self-contained; I assume that <math>\nabla^2 f_k</math> is shorthand notation for <math>\nabla^2 f(x_k)</math>.  --Lambiam 10:29, 24 May 2020 (UTC)[reply]

Thanks for the response. I'm using the second edition; it's good to hear that the Lipschitz hypothesis appears in the first edition. Its omission must have been an oversight on the author's part, but unfortunately it isn't listed in the errata.

As for the differences in (3.36): the author does use <math>\nabla^2 f(x^*)</math>, while I use <math>\nabla^2 f(x_k)</math>. But from going through the proof of Theorem 3.7, I am quite confident that (3.36) is a little wrong; instead of <math>\nabla^2 f(x^*)</math>, it should have <math>\nabla^2 f(x_k)</math>.
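Incidentally, the two forms should at least be interchangeable in the limit, since the Hessian is continuous and <math>x_k \to x^*</math> — here is a sketch (my own, not from the book):

```latex
\[
\frac{\|(B_k - \nabla^2 f(x^*))\,p_k\|}{\|p_k\|}
\;\le\;
\frac{\|(B_k - \nabla^2 f(x_k))\,p_k\|}{\|p_k\|}
\;+\;
\|\nabla^2 f(x_k) - \nabla^2 f(x^*)\| ,
\]
% and symmetrically with the roles of x_k and x^* exchanged.
% Since \nabla^2 f is continuous and x_k \to x^*, the last term tends to 0,
% so either form of (3.36) implies the other.
```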

Yes indeed, <math>\nabla^2 f_k</math> stands for <math>\nabla^2 f(x_k)</math>. I have tried to make the proof as self-contained as possible; did I miss something, or are you referring to the "A simple manipulation of this inequality..." step?  --BlueDream30 15:01, 24 May 2020 (UTC)[reply]
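[For future readers: the "simple manipulation" alluded to above can be spelled out as follows — a sketch using only the triangle inequality, starting from the bound <math>\|x_k + p_k - x^*\| \le O(\|x_k - x^*\|^2) + o(\|p_k\|)</math> obtained in the proof:]

```latex
% By the triangle inequality and the displayed bound,
\begin{align*}
\|p_k\|
  &\le \|x_k + p_k - x^*\| + \|x_k - x^*\| \\
  &\le O(\|x_k - x^*\|^2) + o(\|p_k\|) + \|x_k - x^*\| ,
\end{align*}
% so (1 - o(1))\|p_k\| \le \|x_k - x^*\| + O(\|x_k - x^*\|^2), giving
% \|p_k\| = O(\|x_k - x^*\|).  Substituting back,
\[
\|x_{k+1} - x^*\| = \|x_k + p_k - x^*\|
  \le O(\|x_k - x^*\|^2) + o(\|p_k\|)
  = o(\|x_k - x^*\|) ,
\]
% which is the superlinear convergence claim.
```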