A generic linesearch class. Most methods of this class should be overridden by subclassing.
Given a descent direction d for function func at the current iterate x, compute a steplength t such that func(x + t * d) satisfies a linesearch condition when compared to func(x). The value of the argument slope should be the directional derivative of func in the direction d: slope = f’(x;d) < 0. If given, f should be the value of func(x). If not given, it will be evaluated.
func can point to a defined function or be a lambda function. For example, in the univariate case:
test(lambda x: x**2, 2.0, -1, 4.0)
Bases: linesearch.LineSearch
An Armijo linesearch with backtracking. This class implements the simple Armijo test
f(x + t * d) <= f(x) + t * beta * f’(x;d)
where 0 < beta < 1/2 and f’(x;d) is the directional derivative of f in the direction d. Note that f’(x;d) < 0 must be true.
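For illustration, a backtracking loop implementing this test might be sketched as follows; the helper name armijo_backtrack and its default parameter values are assumptions, not the class's actual interface:

    def armijo_backtrack(func, x, d, slope, f=None, beta=1.0e-4,
                         factor=0.5, t=1.0, maxiter=50):
        # slope = f'(x;d) must be negative: d is a descent direction.
        if f is None:
            f = func(x)
        for _ in range(maxiter):
            if func(x + t * d) <= f + t * beta * slope:
                return t      # the Armijo test is satisfied
            t *= factor       # backtrack
        return None           # no acceptable step within maxiter reductions

With the univariate example above, armijo_backtrack(lambda x: x**2, 2.0, -1.0, -4.0) returns t = 1.0, since f(2 - 1) = 1 already satisfies the test.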
Given a descent direction d for function func at the current iterate x, compute a steplength t such that func(x + t * d) satisfies the Armijo linesearch condition when compared to func(x). The value of the argument slope should be the directional derivative of func in the direction d: slope = f’(x;d) < 0. If given, f should be the value of func(x). If not given, it will be evaluated.
func can point to a defined function or be a lambda function. For example, in the univariate case:
test(lambda x: x**2, 2.0, -1, 4.0)
Linesearch methods guaranteeing satisfaction of the strong Wolfe conditions.
A general-purpose linesearch procedure enforcing the strong Wolfe conditions
f(x+td) <= f(x) + ftol * t * <g(x),d>      (Armijo condition)
| <g(x+td),d> | <= gtol * | <g(x),d> |     (curvature condition)
This is a Python interface to the Moré and Thuente linesearch.
Instantiate as follows
SWLS = StrongWolfeLineSearch(f, g, d, obj, grad)
where
f is the objective value at the current iterate x
g is the objective gradient at the current iterate x
d is the current search direction
obj is a function to evaluate the objective at x + t d, given t,
grad is a function to evaluate the gradient of the objective at x + t d, given t.
To ensure existence of a step satisfying the strong Wolfe conditions, d should be a descent direction for f at x and ftol <= gtol.
The final value of the step will be held in SWLS.stp.
In case an error happens, the return value SWLS.stp will be set to None and SWLS.message will describe what happened.
After the search, SWLS.armijo will be set to True if the step computed satisfies the Armijo condition and SWLS.curvature will be set to True if the step satisfies the curvature condition.
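A possible usage pattern, assuming the search is triggered by a search() method (the exact name of the driver method may differ):

    SWLS = StrongWolfeLineSearch(f, g, d, obj, grad)
    SWLS.search()                            # assumed name of the driver method
    if SWLS.stp is None:
        print(SWLS.message)                  # an error occurred during the search
    else:
        t = SWLS.stp                         # computed step
        ok = SWLS.armijo and SWLS.curvature  # both True for a strong Wolfe step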
PyMSWolfe: Jorge Nocedal's modified Moré and Thuente linesearch guaranteeing satisfaction of the strong Wolfe conditions.
A general-purpose linesearch procedure enforcing the strong Wolfe conditions
f(x+td) <= f(x) + ftol * t * <g(x),d>      (Armijo condition)
| <g(x+td),d> | <= gtol * | <g(x),d> |     (curvature condition)
This is a Python interface to Jorge Nocedal's modification of the Moré and Thuente linesearch. Usage of this class is slightly different from the original Moré and Thuente linesearch.
Instantiate as follows
SWLS = StrongWolfeLineSearch(f, x, g, d, obj, grad)
where
f is the objective value at the current iterate x
x is the current iterate
g is the objective gradient at the current iterate x
d is the current search direction
obj is a function to evaluate the objective at a given point,
grad is a function to evaluate the gradient of the objective at a given point.
To ensure existence of a step satisfying the strong Wolfe conditions, d should be a descent direction for f at x and ftol <= gtol.
The final value of the step will be held in SWLS.stp.
After the search, SWLS.armijo will be set to True if the step computed satisfies the Armijo condition and SWLS.curvature will be set to True if the step satisfies the curvature condition.
Class definition for Trust-Region Algorithm
Initializes an object allowing management of a trust region.
Subclass and override UpdateRadius() to implement custom trust-region management rules.
See, e.g.,
A. R. Conn, N. I. M. Gould and Ph. L. Toint, Trust-Region Methods, MP01 MPS-SIAM Series on Optimization, 2000.
Reset radius to original value
Compute the ratio of actual versus predicted reduction rho = (f - f_trial)/(-m)
Update the trust-region radius. The rule implemented by this method is:
Delta = gamma1 * stepNorm   if ared/pred <  eta1,
Delta = gamma2 * Delta      if ared/pred >= eta2,
Delta unchanged otherwise,
where ared/pred is the quotient computed by self.Rho().
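A direct transcription of this rule (the default parameter values shown here are typical choices and are assumptions; the class stores its own):

    def update_radius(Delta, rho, stepNorm,
                      eta1=0.25, eta2=0.75, gamma1=0.5, gamma2=2.0):
        # rho = ared/pred as computed by Rho().
        if rho < eta1:
            return gamma1 * stepNorm   # poor agreement: shrink the radius
        if rho >= eta2:
            return gamma2 * Delta      # excellent agreement: enlarge the radius
        return Delta                   # acceptable agreement: leave unchanged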
A generic template class for implementing solvers for the trust-region subproblem
minimize q(d) subject to ||d|| <= radius,
where q(d) is a quadratic function of the n-vector d, i.e., q has the general form q(d) = g’ d + 1/2 d’ H d,
where g is an n-vector, typically interpreted as the gradient of some merit function, and H is a real symmetric n-by-n matrix. Note that H need not be positive semi-definite.
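For instance, with NumPy arrays g and H of the shapes described above, the model value may be computed as:

    import numpy as np

    def model_value(g, H, d):
        # q(d) = g'd + 1/2 d'Hd, the quadratic model being minimized
        return np.dot(g, d) + 0.5 * np.dot(d, H.dot(d))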
The trust-region constraint ||d|| <= radius can be defined in any norm, although most derived classes currently implement the Euclidean norm only. Note, however, that any elliptical norm may be used via a preconditioner.
For more information on trust-region methods, see
A. R. Conn, N. I. M. Gould and Ph. L. Toint, Trust-Region Methods, MP01 MPS-SIAM Series on Optimization, 2000.
Solve the trust-region subproblem. This method must be overridden.
Bases: trustregion.TrustRegionSolver
Instantiate a trust-region subproblem solver based on the truncated conjugate gradient method of Steihaug and Toint. See the pcg module for more information.
The main difference between this class and the TrustRegionPCG class is that TrustRegionPCG only accepts explicit preconditioners (i.e. in matrix form). This class accepts an implicit preconditioner, i.e., any callable object.
Solve trust-region subproblem using the truncated conjugate-gradient algorithm.
Bases: trustregion.TrustRegionSolver
Instantiate a trust-region subproblem solver based on the projected truncated conjugate gradient of Gould, Hribar and Nocedal. See the ppcg module for more information.
The trust-region subproblem has the form
minimize q(d) subject to Ad = 0 and ||d|| <= radius,
where q(d) is a quadratic function of the n-vector d, i.e., q has the general form q(d) = g’ d + 1/2 d’ H d,
where g is an n-vector, typically interpreted as the gradient of some merit function, and H is a real symmetric n-by-n matrix. Note that H need not be positive semi-definite.
The trust-region constraint ||d|| <= radius can be defined in any norm, although most derived classes currently implement the Euclidean norm only. Note, however, that any elliptical norm may be used via a preconditioner.
For more information on trust-region methods, see
A. R. Conn, N. I. M. Gould and Ph. L. Toint, Trust-Region Methods, MP01 MPS-SIAM Series on Optimization, 2000.
Solve trust-region subproblem using the projected truncated conjugate gradient algorithm.
Bases: trustregion.TrustRegionSolver
Instantiate a trust-region subproblem solver based on the Generalized Lanczos iterative method of Gould, Lucidi, Roma and Toint. See pygltr for more information.
Solve the trust-region subproblem using the generalized Lanczos method.
Solve the least-squares problem
minimize ||Ax-b||
using LSQR. This is a line-by-line translation from Matlab code available at http://www.stanford.edu/~saunders/lsqr.
Michael P. Friedlander, University of British Columbia
Dominique Orban, Ecole Polytechnique de Montreal
LSQR solves Ax = b, or minimizes ||b - Ax|| in the Euclidean norm if damp = 0, or minimizes ||b - Ax||^2 + damp^2 * ||x||^2 if damp > 0.
A is an (m x n) matrix defined by y = aprod(mode, m, n, x), where aprod refers to a function that performs matrix-vector products. If mode = 1, aprod must return y = Ax without altering x. If mode = 2, aprod must return y = A’x without altering x.
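For example, with a dense NumPy array A, a conforming aprod can be built as follows (a sketch; only the calling convention described above is assumed):

    import numpy as np

    def make_aprod(A):
        def aprod(mode, m, n, x):
            if mode == 1:
                return A.dot(x)    # y = A x
            return A.T.dot(x)      # mode == 2: y = A'x
        return aprod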
LSQR uses an iterative (conjugate-gradient-like) method.
For further information, see
C. C. Paige and M. A. Saunders, LSQR: An Algorithm for Sparse Linear Equations and Sparse Least Squares, ACM Transactions on Mathematical Software, 8(1):43-71, 1982.
Solve the linear system, linear least-squares problem or regularized linear least-squares problem with specified parameters. All return values below are stored in members of the same name.
Display vital statistics about the input problem.
Returns the max step length from x to the boundary of the nonnegative orthant in the direction d.
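This amounts to the standard ratio test over the negative components of d; a NumPy sketch (the helper name is hypothetical):

    import numpy as np

    def max_step(x, d):
        # Largest t >= 0 such that x + t*d remains in the nonnegative orthant.
        neg = d < 0
        if not neg.any():
            return np.inf           # unbounded in this direction
        return (-x[neg] / d[neg]).min()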
Equilibrate the constraint matrix of the linear program. Equilibration is done by first dividing every row by its largest element in absolute value and then by dividing every column by its largest element in absolute value. In effect the original problem
minimize c’x subject to A1 x + A2 s = b, x >= 0
is converted to
minimize (Cc)’x subject to R A1 C x + R A2 C s = Rb, x >= 0,
where the diagonal matrices R and C operate row and column scaling respectively.
Upon return, the matrix A and the right-hand side b are scaled and the members row_scale and col_scale are set to the row and column scaling factors.
The scaling may be undone by subsequently calling unscale(). It is necessary to unscale the problem in order to unscale the final dual variables. Normally, the solve() method takes care of unscaling the problem upon termination.
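For a dense matrix free of zero rows and columns, the two equilibration passes can be sketched as follows (the actual implementation operates on the sparse data structure in place):

    import numpy as np

    def equilibrate(A, b):
        # Row pass: divide each row by its largest entry in absolute value.
        row_scale = np.abs(A).max(axis=1)
        A = A / row_scale[:, None]
        b = b / row_scale                  # right-hand side picks up R
        # Column pass: divide each column by its largest entry in absolute value.
        col_scale = np.abs(A).max(axis=0)
        A = A / col_scale[None, :]
        return A, b, row_scale, col_scale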
Compute an initial guess according to Mehrotra's heuristic. Initial values of x are computed as the solution to the least-squares problem
minimize ||s|| subject to A1 x + A2 s = b
which is also the solution to the augmented system
[ 0    0    A1' ] [x]   [0]
[ 0    I    A2' ] [s] = [0]
[ A1   A2   0   ] [w]   [b].
Initial values for (y,z) are chosen as the solution to the least-squares problem
minimize ||z|| subject to A1’ y = c, A2’ y + z = 0
which can be computed as the solution to the augmented system
[ 0    0    A1' ] [w]   [c]
[ 0    I    A2' ] [z] = [0]
[ A1   A2   0   ] [y]   [0].
To ensure stability and nonsingularity when A does not have full row rank, the (1,1) block is perturbed to 1.0e-4 * I and the (3,3) block is perturbed to -1.0e-4 * I.
The values of s and z are subsequently adjusted to ensure they are positive. See [Mehrotra, 1992] for details.
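As an illustration, the first of these augmented systems can be assembled and solved densely as follows (a sketch with illustrative names, for exposition only):

    import numpy as np

    def mehrotra_x_s(A1, A2, b, reg=1.0e-4):
        # Assemble the perturbed augmented system and solve for (x, s, w).
        m, n1 = A1.shape
        n2 = A2.shape[1]
        N = n1 + n2 + m
        K = np.zeros((N, N))
        K[:n1, :n1] = reg * np.eye(n1)        # perturbed (1,1) block
        K[:n1, n1+n2:] = A1.T
        K[n1:n1+n2, n1:n1+n2] = np.eye(n2)    # identity block
        K[n1:n1+n2, n1+n2:] = A2.T
        K[n1+n2:, :n1] = A1
        K[n1+n2:, n1:n1+n2] = A2
        K[n1+n2:, n1+n2:] = -reg * np.eye(m)  # perturbed (3,3) block
        rhs = np.concatenate([np.zeros(n1 + n2), b])
        sol = np.linalg.solve(K, rhs)
        return sol[:n1], sol[n1:n1+n2]        # initial x and s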
Solve the input problem with the primal-dual-regularized interior-point method. Accepted input keyword arguments are
Upon exit, the following members of the class instance are set:
x..............final iterate
y..............final value of the Lagrange multipliers associated
               to A1 x + A2 s = b
obj_value......final cost
iter...........total number of iterations
kktResid.......final relative residual
solve_time.....time to solve the LP
status.........string describing the exit status
short_status...short version of status, used for printing.
Restore the constraint matrix A, the right-hand side b and the cost vector c to their original value by undoing the row and column equilibration scaling.
Bases: lp.RegLPInteriorPointSolver
Display vital statistics about the input problem.
Returns the max step length from x to the boundary of the nonnegative orthant in the direction d.
Scale the constraint matrix of the linear program. The scaling is done so that the scaled matrix has all its entries near 1.0 in the sense that the sum of the squares of the logarithms of the entries is minimized.
In effect the original problem
minimize c’x subject to A1 x + A2 s = b, x >= 0
is converted to
minimize (Cc)’x subject to R A1 C x + R A2 C s = Rb, x >= 0,
where the diagonal matrices R and C operate row and column scaling respectively.
Upon return, the matrix A and the right-hand side b are scaled and the members row_scale and col_scale are set to the row and column scaling factors.
The scaling may be undone by subsequently calling unscale(). It is necessary to unscale the problem in order to unscale the final dual variables. Normally, the solve() method takes care of unscaling the problem upon termination.
Compute an initial guess according to Mehrotra's heuristic. Initial values of x are computed as the solution to the least-squares problem
minimize ||s|| subject to A1 x + A2 s = b
which is also the solution to the augmented system
[ 0    0    A1' ] [x]   [0]
[ 0    I    A2' ] [s] = [0]
[ A1   A2   0   ] [w]   [b].
Initial values for (y,z) are chosen as the solution to the least-squares problem
minimize ||z|| subject to A1’ y = c, A2’ y + z = 0
which can be computed as the solution to the augmented system
[ 0    0    A1' ] [w]   [c]
[ 0    I    A2' ] [z] = [0]
[ A1   A2   0   ] [y]   [0].
To ensure stability and nonsingularity when A does not have full row rank, the (1,1) block is perturbed to 1.0e-4 * I and the (3,3) block is perturbed to -1.0e-4 * I.
The values of s and z are subsequently adjusted to ensure they are positive. See [Mehrotra, 1992] for details.
Solve the input problem with the primal-dual-regularized interior-point method. Accepted input keyword arguments are
Upon exit, the following members of the class instance are set:
x..............final iterate
y..............final value of the Lagrange multipliers associated
               to A1 x + A2 s = b
obj_value......final cost
iter...........total number of iterations
kktResid.......final relative residual
solve_time.....time to solve the LP
status.........string describing the exit status
short_status...short version of status, used for printing.
Restore the constraint matrix A, the right-hand side b and the cost vector c to their original value by undoing the row and column equilibration scaling.
Display vital statistics about the input problem.
Returns the max step length from x to the boundary of the nonnegative orthant in the direction d.
Equilibrate the constraint matrix of the linear program. Equilibration is done by first dividing every row by its largest element in absolute value and then by dividing every column by its largest element in absolute value. In effect the original problem
minimize c’ x + 1/2 x’ Q x subject to A1 x + A2 s = b, x >= 0
is converted to
minimize (Cc)’ x + 1/2 x’ (CQC’) x subject to R A1 C x + R A2 C s = Rb, x >= 0,
where the diagonal matrices R and C operate row and column scaling respectively.
Upon return, the matrix A and the right-hand side b are scaled and the members row_scale and col_scale are set to the row and column scaling factors.
The scaling may be undone by subsequently calling unscale(). It is necessary to unscale the problem in order to unscale the final dual variables. Normally, the solve() method takes care of unscaling the problem upon termination.
Compute an initial guess according to Mehrotra's heuristic. Initial values of x are computed as the solution to the least-squares problem
minimize ||s|| subject to A1 x + A2 s = b
which is also the solution to the augmented system
[ 0    0    A1' ] [x]   [0]
[ 0    I    A2' ] [s] = [0]
[ A1   A2   0   ] [w]   [b].
Initial values for (y,z) are chosen as the solution to the least-squares problem
minimize ||z|| subject to A1’ y = c, A2’ y + z = 0
which can be computed as the solution to the augmented system
[ 0    0    A1' ] [w]   [c]
[ 0    I    A2' ] [z] = [0]
[ A1   A2   0   ] [y]   [0].
To ensure stability and nonsingularity when A does not have full row rank, the (1,1) block is perturbed to 1.0e-4 * I and the (3,3) block is perturbed to -1.0e-4 * I.
The values of s and z are subsequently adjusted to ensure they are positive. See [Mehrotra, 1992] for details.
Solve the input problem with the primal-dual-regularized interior-point method. Accepted input keyword arguments are
Upon exit, the following members of the class instance are set:
x..............final iterate
y..............final value of the Lagrange multipliers associated
               to A1 x + A2 s = b
z..............final value of the Lagrange multipliers associated
               to s >= 0
obj_value......final cost
iter...........total number of iterations
kktResid.......final relative residual
solve_time.....time to solve the QP
status.........string describing the exit status
short_status...short version of status, used for printing.
Restore the constraint matrix A, the right-hand side b and the cost vector c to their original value by undoing the row and column equilibration scaling.
Bases: cqp.RegQPInteriorPointSolver
Display vital statistics about the input problem.
Returns the max step length from x to the boundary of the nonnegative orthant in the direction d.
Scale the constraint matrix of the linear program. The scaling is done so that the scaled matrix has all its entries near 1.0 in the sense that the sum of the squares of the logarithms of the entries is minimized.
In effect the original problem
minimize c’x + 1/2 x’Qx subject to A1 x + A2 s = b, x >= 0
is converted to
minimize (Cc)’x + 1/2 x’ (CQC’) x subject to R A1 C x + R A2 C s = Rb, x >= 0,
where the diagonal matrices R and C operate row and column scaling respectively.
Upon return, the matrix A and the right-hand side b are scaled and the members row_scale and col_scale are set to the row and column scaling factors.
The scaling may be undone by subsequently calling unscale(). It is necessary to unscale the problem in order to unscale the final dual variables. Normally, the solve() method takes care of unscaling the problem upon termination.
Compute an initial guess according to Mehrotra's heuristic. Initial values of x are computed as the solution to the least-squares problem
minimize ||s|| subject to A1 x + A2 s = b
which is also the solution to the augmented system
[ 0    0    A1' ] [x]   [0]
[ 0    I    A2' ] [s] = [0]
[ A1   A2   0   ] [w]   [b].
Initial values for (y,z) are chosen as the solution to the least-squares problem
minimize ||z|| subject to A1’ y = c, A2’ y + z = 0
which can be computed as the solution to the augmented system
[ 0    0    A1' ] [w]   [c]
[ 0    I    A2' ] [z] = [0]
[ A1   A2   0   ] [y]   [0].
To ensure stability and nonsingularity when A does not have full row rank, the (1,1) block is perturbed to 1.0e-4 * I and the (3,3) block is perturbed to -1.0e-4 * I.
The values of s and z are subsequently adjusted to ensure they are positive. See [Mehrotra, 1992] for details.
Solve the input problem with the primal-dual-regularized interior-point method. Accepted input keyword arguments are
Upon exit, the following members of the class instance are set:
x..............final iterate
y..............final value of the Lagrange multipliers associated
               to A1 x + A2 s = b
z..............final value of the Lagrange multipliers associated
               to s >= 0
obj_value......final cost
iter...........total number of iterations
kktResid.......final relative residual
solve_time.....time to solve the QP
status.........string describing the exit status
short_status...short version of status, used for printing.
Restore the constraint matrix A, the right-hand side b and the cost vector c to their original value by undoing the row and column equilibration scaling.
TRUNK Trust-Region Method for Unconstrained Programming.
A first unconstrained optimization solver in Python: the Python version of the celebrated F90/95 solver.
D. Orban, Montreal, September 2003.
An abstract framework for a trust-region-based algorithm for nonlinear unconstrained programming. Instantiate using
TRNK = TrunkFramework(nlp, TR, TrSolver)
Once a TrunkFramework object has been instantiated and the problem is set up, solve the problem by issuing a call to TRNK.solve(). The algorithm stops as soon as the Euclidean norm of the gradient falls below
max(abstol, reltol * g0)
where g0 is the Euclidean norm of the gradient at the initial point.
Override this method to perform work at the end of an iteration. For example, use this method for preconditioners that need updating, e.g., a limited-memory BFGS preconditioner.
Generic preconditioning method—must be overridden
Bases: trunk.TrunkFramework
Class TrunkLbfgsFramework is a subclass of TrunkFramework. The method is based on the same trust-region algorithm with Nocedal-Yuan backtracking. The only difference is that a limited-memory BFGS preconditioner is used and maintained along the iterations. See class TrunkFramework for more information.
This method updates the limited-memory BFGS preconditioner by appending the most recent (s,y) pair to it and possibly discarding the oldest one if all the memory has been used.
This method implements limited-memory BFGS preconditioning. It overrides the default Precon() of class TrunkFramework.
Class InverseLBFGS is a container used to store and manipulate limited-memory BFGS matrices. It may be used, e.g., in a LBFGS solver for unconstrained minimization or as a preconditioner. The limited-memory matrix that is implicitly stored is a positive definite approximation to the inverse Hessian. Therefore, search directions may be obtained by computing matrix-vector products only. Such products are efficiently computed by means of a two-loop recursion.
Instantiation is as follows
lbfgsupdate = InverseLBFGS(n)
where n is the number of variables of the problem.
Member functions are
store to store a new (s,y) pair, discarding the oldest pair in case the maximum storage has been reached,
matvec to compute a matrix-vector product between the current positive-definite approximation to the inverse Hessian and a given vector.
Compute a matrix-vector product between the current limited-memory positive-definite approximation to the inverse Hessian matrix and the vector v using the LBFGS two-loop recursion formula. The ‘iter’ argument is the current iteration number.
When the inner product <y,s> of one of the pairs is nearly zero, the function returns the input vector v, i.e., no preconditioning occurs. In this case, a safeguarding step should probably be taken.
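The two-loop recursion itself can be sketched as follows (a generic version, not the class's exact internals; the initial scaling by s'y / y'y is one common choice):

    import numpy as np

    def lbfgs_two_loop(v, pairs):
        # pairs: list of (s, y) tuples, oldest first; returns approx. H_inv @ v.
        q = v.copy()
        alphas = []
        for s, y in reversed(pairs):          # newest pair first
            rho = 1.0 / np.dot(y, s)
            alpha = rho * np.dot(s, q)
            q = q - alpha * y
            alphas.append((alpha, rho))
        if pairs:                             # scale by gamma = s'y / y'y
            s, y = pairs[-1]
            q = (np.dot(s, y) / np.dot(y, y)) * q
        for (s, y), (alpha, rho) in zip(pairs, reversed(alphas)):
            beta = rho * np.dot(y, q)
            q = q + (alpha - beta) * s
        return q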
This is an alias for matvec used for preconditioning.
Store the new pair (new_s,new_y) computed at iteration iter.
Class LBFGSFramework provides a framework for solving unconstrained optimization problems by means of the limited-memory BFGS method.
Instantiation is done by
lbfgs = LBFGSFramework(nlp)
where nlp is an instance of a nonlinear problem. A solution of the problem is obtained by calling the solve member function, as in
lbfgs.solve().
Other keyword arguments will be passed to InverseLBFGS.
The linesearch used in this version is Jorge Nocedal's modified Moré and Thuente linesearch, attempting to ensure satisfaction of the strong Wolfe conditions. The modifications attempt to limit the effects of rounding error inherent to the Moré and Thuente linesearch.
This module implements a purely primal-dual interior-point method for bound-constrained optimization. The method uses the primal-dual merit function of Forsgren and Gill and solves subproblems by means of a truncated conjugate gradient method with trust region.
References:
[1] A. Forsgren and Ph. E. Gill, Primal-Dual Interior Methods for Nonconvex Nonlinear Programming, SIAM Journal on Optimization, 8(4):1132-1152, 1998.
[2] Ph. E. Gill and E. M. Gertz, A primal-dual trust region algorithm for nonlinear optimization, Mathematical Programming, 100(1):49-94, 2004.
[3] P. Armand and D. Orban, A Trust-Region Interior-Point Method for Bound-Constrained Programming Based on a Primal-Dual Merit Function, Cahier du GERAD G-xxxx, GERAD, Montreal, Quebec, Canada, 2008.
Shortcut.
Evaluate the gradient of the primal-dual merit function at (x,z). See PDMerit() for a description of z.
Compute the matrix-vector product between the Hessian matrix of the primal-dual merit function at (x,z) and the vector p. See help(PDMerit) for a description of z. If there are b bounded variables and q two-sided bounds, the vector p should have length n+b+2q. The Hessian matrix has the general form
[ H + 2 mu X^{-2}     I          ]
[ I                   mu Z^{-2}  ].
Assemble the modified Hessian matrix of the primal-dual merit function at (x,z). See PDMerit() for a description of z. The Hessian matrix has the general form
[ H + 2 X^{-1} Z      I          ]
[ I                   Z^{-1} X   ].
Compute the matrix-vector product between the modified Hessian matrix of the primal-dual merit function at (x,z) and the vector p. See help(PDMerit) for a description of z. If there are b bounded variables and q two-sided bounds, the vector p should have length n+b+2q. The Hessian matrix has the general form
[ H + 2 X^{-1} Z      I          ]
[ I                   Z^{-1} X   ].
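Restricting attention to the simplified case where every variable has a single lower bound (so p = (px, pz) with both blocks of length n), this product can be sketched as follows; hess_prod, standing for a product with H, is an assumed callable:

    import numpy as np

    def pd_hess_prod(hess_prod, x, z, p):
        # Blocks follow the modified matrix displayed above; X = diag(x), Z = diag(z).
        n = x.shape[0]
        px, pz = p[:n], p[n:]
        top = hess_prod(px) + 2.0 * (z / x) * px + pz   # (H + 2 X^{-1} Z) px + pz
        bot = px + (x / z) * pz                         # px + Z^{-1} X pz
        return np.concatenate([top, bot])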
Assemble the part of the modified Hessian matrix of the primal-dual merit function that is iteration independent. The function PDHess() fills in the blanks by updating the rest of the matrix.
Evaluate the primal-dual merit function at (x,z). If there are b >= 0 one-sided bound constraints and q >= 0 two-sided bound constraints, the vector z should have length b+2q. The components z[i] (0 <= i < b+q) correspond, in increasing order of variable indices to
- variables with a lower bound only,
- variables with an upper bound only,
- the lower bound on variables that are bounded below and above.
The components z[i] (b+q <= i < b+2q) correspond to the upper bounds on variables that are bounded below and above.
This function returns the value of the primal-dual merit function. The current value of the objective function can be supplied via the keyword argument f.
Generic preconditioning method—must be overridden.
Return the vector of primal multipliers at x. The value of the barrier parameter can be supplied via the keyword argument mu; otherwise, the current value stored in the instance is used.
Construct or set up the preconditioner—must be overridden.
Perform a series of inner iterations so as to minimize the primal-dual merit function with the current value of the barrier parameter to within some given tolerance. The only optional argument recognized is
stopTol: stopping tolerance (default: muerrfact * mu).
Compute a strictly feasible initial primal-dual estimate (x0, z0).
Update the barrier parameter before the next round of inner iterations.
Override this method for preconditioners that need updating, e.g., a limited-memory BFGS preconditioner.
where 0 < tau < 1. By default, tau = 0.9.
Todo: Insert this module.