Optimization Tools

Linesearch Methods

The linesearch Module

class linesearch.LineSearch(**kwargs)

A generic linesearch class. Most methods of this class should be overridden by subclassing.

search(func, x, d, slope, f=None, **kwargs)

Given a descent direction d for function func at the current iterate x, compute a steplength t such that func(x + t * d) satisfies a linesearch condition when compared to func(x). The value of the argument slope should be the directional derivative of func in the direction d: slope = f’(x;d) < 0. If given, f should be the value of func(x). If not given, it will be evaluated.

func may refer to a named function or be a lambda. For example, in the univariate case:

test(lambda x: x**2, 2.0, -1, 4.0)
class linesearch.ArmijoLineSearch(**kwargs)

Bases: linesearch.LineSearch

An Armijo linesearch with backtracking. This class implements the simple Armijo test

f(x + t * d) <= f(x) + t * beta * f’(x;d)

where 0 < beta < 1/2 and f’(x;d) is the directional derivative of f in the direction d. Note that f’(x;d) < 0 must be true.

Keywords :
beta:Value of beta. Default: 0.001
tfactor:Amount by which to reduce the steplength during the backtracking. Default: 0.5.
search(func, x, d, slope, f=None, **kwargs)

Given a descent direction d for function func at the current iterate x, compute a steplength t such that func(x + t * d) satisfies the Armijo linesearch condition when compared to func(x). The value of the argument slope should be the directional derivative of func in the direction d: slope = f’(x;d) < 0. If given, f should be the value of func(x). If not given, it will be evaluated.

func may refer to a named function or be a lambda. For example, in the univariate case:

test(lambda x: x**2, 2.0, -1, 4.0)
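
For illustration, here is a minimal standalone sketch of the backtracking Armijo rule described above. The function name and default values are illustrative only and do not reproduce the module's implementation.

    def armijo_backtrack(func, x, d, slope, f=None, beta=1.0e-3, tfactor=0.5, tmin=1.0e-12):
        # Backtrack from t = 1 until f(x + t*d) <= f(x) + t*beta*f'(x;d).
        if f is None:
            f = func(x)
        t = 1.0
        while t >= tmin:
            if func(x + t * d) <= f + t * beta * slope:
                return t                       # Armijo condition satisfied
            t *= tfactor                       # backtrack
        return None                            # no acceptable step found

    # Univariate illustration: f(x) = x**2 at x = 2 with d = -1, so f(x) = 4 and slope = -4.
    t = armijo_backtrack(lambda x: x**2, 2.0, -1.0, -4.0, f=4.0)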

The pyswolfe Module

Linesearch methods guaranteeing satisfaction of the strong Wolfe conditions.

class pyswolfe.StrongWolfeLineSearch(f, g, d, obj, grad, **kwargs)

A general-purpose linesearch procedure enforcing the strong Wolfe conditions

f(x+td) <= f(x) + ftol * t * <g(x),d> (Armijo condition)

| <g(x+td),d> | <= gtol * | <g(x),d> |   (curvature condition)

This is a Python interface to the Moré and Thuente linesearch.

Instantiate as follows

SWLS = StrongWolfeLineSearch(f, g, d, obj, grad)

where

  • f is the objective value at the current iterate x

  • g is the objective gradient at the current iterate x

  • d is the current search direction

  • obj is a scalar function used to evaluate the value of

    the objective at x + t d, given t.

  • grad is a scalar function used to evaluate the gradient

    of the objective at x + t d, given t.

Keywords :
ftol:the constant used in the Armijo condition (1e-3)
gtol:the constant used in the curvature condition (0.9)
xtol:a minimal relative step bracket length (1e-10)
stp:an initial step value (1.0)
stpmin:the initial lower bound of the bracket
stpmax:the initial upper bound of the bracket

To ensure existence of a step satisfying the strong Wolfe conditions, d should be a descent direction for f at x and ftol <= gtol.

The final value of the step will be held in SWLS.stp

In case an error happens, the return value SWLS.stp will be set to None and SWLS.message will describe what happened.

After the search, SWLS.armijo will be set to True if the step computed satisfies the Armijo condition and SWLS.curvature will be set to True if the step satisfies the curvature condition.

search()
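
A hypothetical usage sketch on the quadratic f(x) = 1/2 ||x||^2 with the steepest-descent direction follows. The module name in the import, and the assumption that grad(t) returns the slope <g(x+td),d> rather than the full gradient at x + t d, are guesses and may need adjusting.

    import numpy as np
    from pyswolfe import StrongWolfeLineSearch     # module name as documented above (assumption)

    x = np.array([1.0, -2.0])
    g = x.copy()                                   # gradient of 1/2 ||x||^2 at x
    d = -g                                         # a descent direction

    obj  = lambda t: 0.5 * np.dot(x + t * d, x + t * d)   # objective value at x + t d
    grad = lambda t: np.dot(x + t * d, d)                 # assumed to return the slope along d

    SWLS = StrongWolfeLineSearch(obj(0.0), g, d, obj, grad, ftol=1.0e-3, gtol=0.9)
    SWLS.search()
    print("stp = %s, Armijo: %s, curvature: %s" % (SWLS.stp, SWLS.armijo, SWLS.curvature))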

The pymswolfe Module

PyMSWolfe: Jorge Nocedal’s modified Moré and Thuente linesearch guaranteeing satisfaction of the strong Wolfe conditions.

class pymswolfe.StrongWolfeLineSearch(f, x, g, d, obj, grad, **kwargs)

A general-purpose linesearch procedure enforcing the strong Wolfe conditions

f(x+td) <= f(x) + ftol * t * <g(x),d> (Armijo condition)

| <g(x+td),d> | <= gtol * | <g(x),d> |   (curvature condition)

This is a Python interface to Jorge Nocedal’s modification of the Moré and Thuente linesearch. Usage of this class is slightly different from that of the original Moré and Thuente linesearch.

Instantiate as follows

SWLS = StrongWolfeLineSearch(f, x, g, d, obj, grad)

where

  • f is the objective value at the current iterate x

  • x is the current iterate

  • g is the objective gradient at the current iterate x

  • d is the current search direction

  • obj is a scalar function used to evaluate the value of

    the objective at a given point

  • grad is a scalar function used to evaluate the gradient

    of the objective at a given point.

Keywords :
ftol:the constant used in the Armijo condition (1e-4)
gtol:the constant used in the curvature condition (0.9)
xtol:a minimal relative step bracket length (1e-16)
stp:an initial step value (1.0)
stpmin:the initial lower bound of the bracket (1e-20)
stpmax:the initial upper bound of the bracket (1e+20)
maxfev:the maximum number of function evaluations permitted (20)

To ensure existence of a step satisfying the strong Wolfe conditions, d should be a descent direction for f at x and ftol <= gtol.

The final value of the step will be held in SWLS.stp

After the search, SWLS.armijo will be set to True if the step computed satisfies the Armijo condition and SWLS.curvature will be set to True if the step satisfies the curvature condition.

search()

Trust-Region Methods

The trustregion Module

Class definition for Trust-Region Algorithm

class trustregion.TrustRegionFramework(**kwargs)

Initializes an object allowing management of a trust region.

Keywords :
Delta:Initial trust-region radius (default: 1.0)
eta1:Step acceptance threshold (default: 0.01)
eta2:Radius increase threshold (default: 0.99)
gamma1:Radius decrease factor (default: 1/3)
gamma2:Radius increase factor (default: 2.5)

Subclass and override UpdateRadius() to implement custom trust-region management rules.

See, e.g.,

A. R. Conn, N. I. M. Gould and Ph. L. Toint, Trust-Region Methods, MP01 MPS-SIAM Series on Optimization, 2000.

ResetRadius()

Reset radius to original value

Rho(f, f_trial, m)

Compute the ratio of actual versus predicted reduction rho = (f - f_trial)/(-m)

UpdateRadius(rho, stepNorm)

Update the trust-region radius. The rule implemented by this method is:

Delta = gamma1 * stepNorm   if ared/pred < eta1
Delta = gamma2 * Delta      if ared/pred >= eta2
Delta unchanged             otherwise,

where ared/pred is the quotient computed by self.Rho().
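
As an illustration, one acceptance test and radius update using the methods above might be written as follows; this is a sketch, and it assumes the keyword values (eta1, etc.) are stored as attributes of the same name.

    def tr_step(TR, f, f_trial, m, step_norm):
        # TR is a TrustRegionFramework instance; m is the model value q(d) <= 0
        # predicted by the subproblem solver.
        rho = TR.Rho(f, f_trial, m)        # ared/pred = (f - f_trial) / (-m)
        accept = rho >= TR.eta1            # accept the trial point if the decrease is sufficient
        TR.UpdateRadius(rho, step_norm)    # shrink, enlarge or keep Delta per the rule above
        return accept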

class trustregion.TrustRegionSolver(g, **kwargs)

A generic template class for implementing solvers for the trust-region subproblem

minimize q(d) subject to ||d|| <= radius,

where q(d) is a quadratic function of the n-vector d, i.e., q has the general form q(d) = g’ d + 1/2 d’ H d,

where g is an n-vector typically interpreted as the gradient of some merit function and H is a real symmetric n-by-n matrix. Note that H need not be positive semi-definite.

The trust-region constraint ||d|| <= radius can be defined in any norm, although most derived classes currently implement the Euclidean norm only. Note, however, that any elliptical norm may be used via a preconditioner.

For more information on trust-region methods, see

A. R. Conn, N. I. M. Gould and Ph. L. Toint, Trust-Region Methods, MP01 MPS-SIAM Series on Optimization, 2000.

Solve()

Solve the trust-region subproblem. This method must be overridden.

class trustregion.TrustRegionCG(g, **kwargs)

Bases: trustregion.TrustRegionSolver

Instantiate a trust-region subproblem solver based on the truncated conjugate gradient method of Steihaug and Toint. See the pcg module for more information.

The main difference between this class and the TrustRegionPCG class is that TrustRegionPCG only accepts explicit preconditioners (i.e. in matrix form). This class accepts an implicit preconditioner, i.e., any callable object.

Solve(**kwargs)

Solve trust-region subproblem using the truncated conjugate-gradient algorithm.

class trustregion.TrustRegionPCG(g, A, **kwargs)

Bases: trustregion.TrustRegionSolver

Instantiate a trust-region subproblem solver based on the projected truncated conjugate gradient of Gould, Hribar and Nocedal. See the ppcg module for more information.

The trust-region subproblem has the form

minimize q(d) subject to Ad = 0 and ||d|| <= radius,

where q(d) is a quadratic function of the n-vector d, i.e., q has the general form q(d) = g’ d + 1/2 d’ H d,

where g is an n-vector typically interpreted as the gradient of some merit function and H is a real symmetric n-by-n matrix. Note that H need not be positive semi-definite.

The trust-region constraint ||d|| <= radius can be defined in any norm, although most derived classes currently implement the Euclidean norm only. Note, however, that any elliptical norm may be used via a preconditioner.

For more information on trust-region methods, see

A. R. Conn, N. I. M. Gould and Ph. L. Toint, Trust-Region Methods, MP01 MPS-SIAM Series on Optimization, 2000.

Solve()

Solve trust-region subproblem using the projected truncated conjugate gradient algorithm.

class trustregion.TrustRegionGLTR(g, **kwargs)

Bases: trustregion.TrustRegionSolver

Instantiate a trust-region subproblem solver based on the Generalized Lanczos iterative method of Gould, Lucidi, Roma and Toint. See pygltr for more information.

Solve()

Solve the trust-region subproblem using the generalized Lanczos method.

Complete Solvers

Linear Least-Squares Problems

Solve the least-squares problem

minimize ||Ax-b||

using LSQR. This is a line-by-line translation from Matlab code available at http://www.stanford.edu/~saunders/lsqr.

Michael P. Friedlander, University of British Columbia; Dominique Orban, Ecole Polytechnique de Montreal

class lsqr.LSQRFramework(m, n, aprod)

LSQR solves Ax = b or minimizes ||b - Ax|| in the Euclidean norm if damp = 0, or minimizes ||b - Ax||^2 + damp^2 ||x||^2 if damp > 0.

A is an (m x n) matrix defined by y = aprod(mode, m, n, x), where aprod refers to a function that performs matrix-vector products. If mode = 1, aprod must return y = Ax without altering x. If mode = 2, aprod must return y = A’x without altering x.
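
For illustration, an aprod built around a small dense numpy matrix could look as follows; this is a sketch only, since in practice A is usually sparse or available only through operators.

    import numpy as np

    A = np.array([[1.0, 0.0],
                  [1.0, 1.0],
                  [0.0, 2.0]])              # a 3x2 example matrix

    def aprod(mode, m, n, x):
        # mode 1: return A*x (x has length n); mode 2: return A'*x (x has length m).
        if mode == 1:
            return np.dot(A, x)
        return np.dot(A.T, x)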

LSQR uses an iterative (conjugate-gradient-like) method.

For further information, see

  1. C. C. Paige and M. A. Saunders (1982a). LSQR: An algorithm for sparse linear equations and sparse least squares, ACM TOMS 8(1), 43-71.
  2. C. C. Paige and M. A. Saunders (1982b). Algorithm 583. LSQR: Sparse linear equations and least squares problems, ACM TOMS 8(2), 195-209.
  3. M. A. Saunders (1995). Solution of sparse rectangular systems using LSQR and CRAIG, BIT 35, 588-604.
solve(rhs, itnlim=0, damp=0.0, atol=1.0e-6, btol=1.0e-7, conlim=1.0e+8, show=False, wantvar=False)

Solve the linear system, linear least-squares problem or regularized linear least-squares problem with specified parameters. All return values below are stored in members of the same name.

Parameters :
rhs:right-hand side vector.
itnlim:is an explicit limit on iterations (for safety).
damp:damping/regularization parameter.
Keywords :
atol, btol:stopping tolerances. If both are 1.0e-9 (say), the final residual norm should be accurate to about 9 digits. (The final x will usually have fewer correct digits, depending on cond(A) and the size of damp.)
conlim:is also a stopping tolerance. lsqr terminates if an estimate of cond(A) exceeds conlim. For compatible systems Ax = b, conlim could be as large as 1.0e+12 (say). For least-squares problems, conlim should be less than 1.0e+8. Maximum precision can be obtained by setting atol = btol = conlim = zero, but the number of iterations may then be excessive.
show:if set to True, gives an iteration log. If set to False, suppresses output.
Returns:

x:is the final solution.
istop:gives the reason for termination: 1 means x is an approximate solution to Ax = b; 2 means x approximately solves the least-squares problem.
r1norm:= norm(r), where r = b - Ax.
r2norm:= sqrt(norm(r)^2 + damp^2 * norm(x)^2) = r1norm if damp = 0.
anorm:= estimate of Frobenius norm of (regularized) A.
acond:= estimate of cond(Abar).
arnorm:= estimate of norm(A’r - damp^2 x).
xnorm:= norm(x).
var:(if present) estimates all diagonals of (A’A)^{-1} (if damp=0) or more generally (A’A + damp^2*I)^{-1}. This is well defined if A has full column rank or damp > 0. (Not sure what var means if rank(A) < n and damp = 0.)
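
A hypothetical driver, reusing the aprod sketched earlier, might read as follows; the import assumes the module is available under the name shown above, and the members accessed are those documented in the return list.

    import numpy as np
    from lsqr import LSQRFramework          # module name as documented above (assumption)

    solver = LSQRFramework(3, 2, aprod)     # m = 3 rows, n = 2 columns
    solver.solve(np.array([1.0, 2.0, 3.0]), itnlim=50, damp=0.0, show=False)
    print("istop = %d, r1norm = %g" % (solver.istop, solver.r1norm))
    print("x = %s" % solver.x)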

Linear Programming

class lp.RegLPInteriorPointSolver(lp, **kwargs)
display_stats()

Display vital statistics about the input problem.

maxStepLength(x, d)

Returns the max step length from x to the boundary of the nonnegative orthant in the direction d.
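
The quantity returned is the largest alpha such that x + alpha * d remains in the nonnegative orthant. An illustrative standalone version (not the solver's own code) is:

    import numpy as np

    def max_step_to_boundary(x, d):
        # Largest alpha with x + alpha*d >= 0, assuming x >= 0.
        decreasing = d < 0
        if not np.any(decreasing):
            return np.inf                   # the boundary is never reached
        return np.min(-x[decreasing] / d[decreasing])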

scale(**kwargs)

Equilibrate the constraint matrix of the linear program. Equilibration is done by first dividing every row by its largest element in absolute value and then by dividing every column by its largest element in absolute value. In effect the original problem

minimize c’x subject to A1 x + A2 s = b, x >= 0

is converted to

minimize (Cc)’x subject to R A1 C x + R A2 C s = Rb, x >= 0,

where the diagonal matrices R and C operate row and column scaling respectively.

Upon return, the matrix A and the right-hand side b are scaled and the members row_scale and col_scale are set to the row and column scaling factors.

The scaling may be undone by subsequently calling unscale(). It is necessary to unscale the problem in order to unscale the final dual variables. Normally, the solve() method takes care of unscaling the problem upon termination.
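
A rough dense illustration of this equilibration (the solver itself operates on sparse matrices, so this is only a sketch) is:

    import numpy as np

    def equilibrate(A, b):
        # Divide each row by its largest absolute entry, then each column of the
        # result by its largest absolute entry; assumes no zero rows or columns.
        row_scale = np.abs(A).max(axis=1)
        A = A / row_scale[:, None]
        col_scale = np.abs(A).max(axis=0)
        A = A / col_scale[None, :]
        return A, b / row_scale, row_scale, col_scale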

set_initial_guess(lp, **kwargs)

Compute an initial guess according to Mehrotra’s heuristic. Initial values of x are computed as the solution to the least-squares problem

minimize ||s|| subject to A1 x + A2 s = b

which is also the solution to the augmented system

[ 0    0    A1’ ] [x]   [0]
[ 0    I    A2’ ] [s] = [0]
[ A1   A2   0   ] [w]   [b].

Initial values for (y,z) are chosen as the solution to the least-squares problem

minimize ||z|| subject to A1’ y = c, A2’ y + z = 0

which can be computed as the solution to the augmented system

[ 0    0    A1’ ] [w]   [c]
[ 0    I    A2’ ] [z] = [0]
[ A1   A2   0   ] [y]   [0].

To ensure stability and nonsingularity when A does not have full row rank, the (1,1) block is perturbed to 1.0e-4 * I and the (3,3) block is perturbed to -1.0e-4 * I.

The values of s and z are subsequently adjusted to ensure they are positive. See [Mehrotra, 1992] for details.

solve(**kwargs)

Solve the input problem with the primal-dual-regularized interior-point method. Accepted input keyword arguments are

Keywords :
itermax:The maximum allowed number of iterations (default: 10n)
tolerance:Stopping tolerance (default: 1.0e-6)
PredictorCorrector:
 Use the predictor-corrector method (default: True). If set to False, a variant of the long-step method is used. The long-step method is generally slower and less robust.

Upon exit, the following members of the class instance are set:

  • x..............final iterate

  • y..............final value of the Lagrange multipliers associated

    to A1 x + A2 s = b

  • z..............final value of the Lagrange multipliers associated

    to s>=0

  • obj_value......final cost

  • iter...........total number of iterations

  • kktResid.......final relative residual

  • solve_time.....time to solve the LP

  • status.........string describing the exit status

  • short_status...short version of status, used for printing.
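
For illustration, a hypothetical driver using the keywords and members listed above might look as follows; the import assumes the module is available under the name shown in the class signature, and `problem` stands for a previously constructed LP instance.

    from lp import RegLPInteriorPointSolver     # module name as documented above (assumption)

    solver = RegLPInteriorPointSolver(problem)  # `problem` is an LP instance (assumed given)
    solver.display_stats()
    solver.solve(itermax=100, tolerance=1.0e-8, PredictorCorrector=True)
    print("%s: objective = %g after %d iterations (%.2f s)"
          % (solver.status, solver.obj_value, solver.iter, solver.solve_time))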

solveSystem(rhs, itref_threshold=1.0e-5, nitrefmax=3)
unscale(**kwargs)

Restore the constraint matrix A, the right-hand side b and the cost vector c to their original value by undoing the row and column equilibration scaling.

class lp.RegLPInteriorPointSolver29(lp, **kwargs)

Bases: lp.RegLPInteriorPointSolver

display_stats()

Display vital statistics about the input problem.

maxStepLength(x, d)

Returns the max step length from x to the boundary of the nonnegative orthant in the direction d.

scale(**kwargs)

Scale the constraint matrix of the linear program. The scaling is done so that the scaled matrix has all its entries near 1.0 in the sense that the sum of the squares of the logarithms of the entries is minimized.

In effect the original problem

minimize c’x subject to A1 x + A2 s = b, x >= 0

is converted to

minimize (Cc)’x subject to R A1 C x + R A2 C s = Rb, x >= 0,

where the diagonal matrices R and C operate row and column scaling respectively.

Upon return, the matrix A and the right-hand side b are scaled and the members row_scale and col_scale are set to the row and column scaling factors.

The scaling may be undone by subsequently calling unscale(). It is necessary to unscale the problem in order to unscale the final dual variables. Normally, the solve() method takes care of unscaling the problem upon termination.

set_initial_guess(lp, **kwargs)

Compute an initial guess according to Mehrotra’s heuristic. Initial values of x are computed as the solution to the least-squares problem

minimize ||s|| subject to A1 x + A2 s = b

which is also the solution to the augmented system

[ 0    0    A1’ ] [x]   [0]
[ 0    I    A2’ ] [s] = [0]
[ A1   A2   0   ] [w]   [b].

Initial values for (y,z) are chosen as the solution to the least-squares problem

minimize ||z|| subject to A1’ y = c, A2’ y + z = 0

which can be computed as the solution to the augmented system

[ 0    0    A1’ ] [w]   [c]
[ 0    I    A2’ ] [z] = [0]
[ A1   A2   0   ] [y]   [0].

To ensure stability and nonsingularity when A does not have full row rank, the (1,1) block is perturbed to 1.0e-4 * I and the (3,3) block is perturbed to -1.0e-4 * I.

The values of s and z are subsequently adjusted to ensure they are positive. See [Mehrotra, 1992] for details.

solve(**kwargs)

Solve the input problem with the primal-dual-regularized interior-point method. Accepted input keyword arguments are

Keywords :
itermax:The maximum allowed number of iterations (default: 10n)
tolerance:Stopping tolerance (default: 1.0e-6)
PredictorCorrector:
 Use the predictor-corrector method (default: True). If set to False, a variant of the long-step method is used. The long-step method is generally slower and less robust.

Upon exit, the following members of the class instance are set:

  • x..............final iterate

  • y..............final value of the Lagrange multipliers associated

    to A1 x + A2 s = b

  • z..............final value of the Lagrange multipliers associated

    to s>=0

  • obj_value......final cost

  • iter...........total number of iterations

  • kktResid.......final relative residual

  • solve_time.....time to solve the LP

  • status.........string describing the exit status

  • short_status...short version of status, used for printing.

solveSystem(rhs, itref_threshold=1.0e-5, nitrefmax=3)
unscale(**kwargs)

Restore the constraint matrix A, the right-hand side b and the cost vector c to their original value by undoing the row and column equilibration scaling.

Convex Quadratic Programming

class cqp.RegQPInteriorPointSolver(qp, **kwargs)
display_stats()

Display vital statistics about the input problem.

maxStepLength(x, d)

Returns the max step length from x to the boundary of the nonnegative orthant in the direction d.

scale(**kwargs)

Equilibrate the constraint matrix of the quadratic program. Equilibration is done by first dividing every row by its largest element in absolute value and then by dividing every column by its largest element in absolute value. In effect the original problem

minimize c’ x + 1/2 x’ Q x subject to A1 x + A2 s = b, x >= 0

is converted to

minimize (Cc)’ x + 1/2 x’ (CQC’) x subject to R A1 C x + R A2 C s = Rb, x >= 0,

where the diagonal matrices R and C operate row and column scaling respectively.

Upon return, the matrix A and the right-hand side b are scaled and the members row_scale and col_scale are set to the row and column scaling factors.

The scaling may be undone by subsequently calling unscale(). It is necessary to unscale the problem in order to unscale the final dual variables. Normally, the solve() method takes care of unscaling the problem upon termination.

set_initial_guess(qp, **kwargs)

Compute an initial guess according to Mehrotra’s heuristic. Initial values of x are computed as the solution to the least-squares problem

minimize ||s|| subject to A1 x + A2 s = b

which is also the solution to the augmented system

[ 0    0    A1’ ] [x]   [0]
[ 0    I    A2’ ] [s] = [0]
[ A1   A2   0   ] [w]   [b].

Initial values for (y,z) are chosen as the solution to the least-squares problem

minimize ||z|| subject to A1’ y = c, A2’ y + z = 0

which can be computed as the solution to the augmented system

[ 0    0    A1’ ] [w]   [c]
[ 0    I    A2’ ] [z] = [0]
[ A1   A2   0   ] [y]   [0].

To ensure stability and nonsingularity when A does not have full row rank, the (1,1) block is perturbed to 1.0e-4 * I and the (3,3) block is perturbed to -1.0e-4 * I.

The values of s and z are subsequently adjusted to ensure they are positive. See [Mehrotra, 1992] for details.

solve(**kwargs)

Solve the input problem with the primal-dual-regularized interior-point method. Accepted input keyword arguments are

Keywords :
itermax:The maximum allowed number of iterations (default: 10n)
tolerance:Stopping tolerance (default: 1.0e-6)
PredictorCorrector:
 Use the predictor-corrector method (default: True). If set to False, a variant of the long-step method is used. The long-step method is generally slower and less robust.

Upon exit, the following members of the class instance are set:

  • x..............final iterate

  • y..............final value of the Lagrange multipliers associated

    to A1 x + A2 s = b

  • z..............final value of the Lagrange multipliers associated

    to s>=0

  • obj_value......final cost

  • iter...........total number of iterations

  • kktResid.......final relative residual

  • solve_time.....time to solve the QP

  • status.........string describing the exit status.

  • short_status...short version of status, used for printing.

solveSystem(rhs, itref_threshold=1.0e-5, nitrefmax=5)
unscale(**kwargs)

Restore the constraint matrix A, the right-hand side b and the cost vector c to their original value by undoing the row and column equilibration scaling.

class cqp.RegQPInteriorPointSolver29(qp, **kwargs)

Bases: cqp.RegQPInteriorPointSolver

display_stats()

Display vital statistics about the input problem.

maxStepLength(x, d)

Returns the max step length from x to the boundary of the nonnegative orthant in the direction d.

scale(**kwargs)

Scale the constraint matrix of the quadratic program. The scaling is done so that the scaled matrix has all its entries near 1.0 in the sense that the sum of the squares of the logarithms of the entries is minimized.

In effect the original problem

minimize c’x + 1/2 x’Qx subject to A1 x + A2 s = b, x >= 0

is converted to

minimize (Cc)’x + 1/2 x’ (CQC’) x subject to R A1 C x + R A2 C s = Rb, x >= 0,

where the diagonal matrices R and C operate row and column scaling respectively.

Upon return, the matrix A and the right-hand side b are scaled and the members row_scale and col_scale are set to the row and column scaling factors.

The scaling may be undone by subsequently calling unscale(). It is necessary to unscale the problem in order to unscale the final dual variables. Normally, the solve() method takes care of unscaling the problem upon termination.

set_initial_guess(qp, **kwargs)

Compute an initial guess according to Mehrotra’s heuristic. Initial values of x are computed as the solution to the least-squares problem

minimize ||s|| subject to A1 x + A2 s = b

which is also the solution to the augmented system

[ 0    0    A1’ ] [x]   [0]
[ 0    I    A2’ ] [s] = [0]
[ A1   A2   0   ] [w]   [b].

Initial values for (y,z) are chosen as the solution to the least-squares problem

minimize ||z|| subject to A1’ y = c, A2’ y + z = 0

which can be computed as the solution to the augmented system

[ 0    0    A1’ ] [w]   [c]
[ 0    I    A2’ ] [z] = [0]
[ A1   A2   0   ] [y]   [0].

To ensure stability and nonsingularity when A does not have full row rank, the (1,1) block is perturbed to 1.0e-4 * I and the (3,3) block is perturbed to -1.0e-4 * I.

The values of s and z are subsequently adjusted to ensure they are positive. See [Mehrotra, 1992] for details.

solve(**kwargs)

Solve the input problem with the primal-dual-regularized interior-point method. Accepted input keyword arguments are

Keywords :
itermax:The maximum allowed number of iterations (default: 10n)
tolerance:Stopping tolerance (default: 1.0e-6)
PredictorCorrector:
 Use the predictor-corrector method (default: True). If set to False, a variant of the long-step method is used. The long-step method is generally slower and less robust.

Upon exit, the following members of the class instance are set:

  • x..............final iterate

  • y..............final value of the Lagrange multipliers associated

    to A1 x + A2 s = b

  • z..............final value of the Lagrange multipliers associated

    to s>=0

  • obj_value......final cost

  • iter...........total number of iterations

  • kktResid.......final relative residual

  • solve_time.....time to solve the QP

  • status.........string describing the exit status.

  • short_status...short version of status, used for printing.

solveSystem(rhs, itref_threshold=1.0e-5, nitrefmax=5)
unscale(**kwargs)

Restore the constraint matrix A, the right-hand side b and the cost vector c to their original value by undoing the row and column equilibration scaling.

Unconstrained Programming

TRUNK Trust-Region Method for Unconstrained Programming.

A first unconstrained optimization solver in Python: the Python version of the celebrated F90/95 solver. D. Orban, Montreal, September 2003.

class trunk.TrunkFramework(nlp, TR, TrSolver, **kwargs)

An abstract framework for a trust-region-based algorithm for nonlinear unconstrained programming. Instantiate using

TRNK = TrunkFramework(nlp, TR, TrSolver)

Parameters :
nlp:an NLPModel object representing the problem. For instance, nlp may arise from an AMPL model
TR:a TrustRegionFramework object
TrSolver:a TrustRegionSolver object.
Keywords :
x0:starting point (default nlp.x0)
reltol:relative stopping tolerance (default nlp.stop_d)
abstol:absolute stopping tolerance (default 1.0e-6)
maxiter:maximum number of iterations (default max(1000,10n))
inexact:use inexact Newton stopping tol (default False)
ny:apply Nocedal/Yuan linesearch (default False)
nbk:max number of backtracking steps in Nocedal/Yuan linesearch (default 5)
monotone:use monotone descent strategy (default False)
nIterNonMono:number of iterations for which non-strict descent can be tolerated if monotone=False (default 25)
silent:verbosity level (default False)

Once a TrunkFramework object has been instantiated and the problem is set up, solve the problem by issuing a call to TRNK.Solve(). The algorithm stops as soon as the Euclidean norm of the gradient falls below

max(abstol, reltol * g0)

where g0 is the Euclidean norm of the gradient at the initial point.
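
A hypothetical assembly of the pieces documented on this page might read as follows; the module names in the imports and the use of a solver class for the TrSolver argument are assumptions.

    from trustregion import TrustRegionFramework, TrustRegionCG   # assumed module names
    from trunk import TrunkFramework                               # assumed module name

    # `nlp` is assumed to be an NLPModel instance, e.g. built from an AMPL model.
    TR = TrustRegionFramework(Delta=1.0, eta1=0.01, eta2=0.99)     # documented keywords
    TRNK = TrunkFramework(nlp, TR, TrustRegionCG, maxiter=500, ny=True)
    TRNK.Solve()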

PostIteration(**kwargs)

Override this method to perform work at the end of an iteration. For example, use this method for preconditioners that need updating, e.g., a limited-memory BFGS preconditioner.

Precon(v, **kwargs)

Generic preconditioning method—must be overridden

Solve(**kwargs)
class trunk.TrunkLbfgsFramework(nlp, TR, TrSolver, **kwargs)

Bases: trunk.TrunkFramework

Class TrunkLbfgsFramework is a subclass of TrunkFramework. The method is based on the same trust-region algorithm with Nocedal-Yuan backtracking. The only difference is that a limited-memory BFGS preconditioner is used and maintained along the iterations. See class TrunkFramework for more information.

PostIteration(**kwargs)

This method updates the limited-memory BFGS preconditioner by appending the most recent (s,y) pair to it and possibly discarding the oldest one if all the memory has been used.

Precon(v, **kwargs)

This method implements limited-memory BFGS preconditioning. It overrides the default Precon() of class TrunkFramework.

Solve(**kwargs)
class lbfgs.InverseLBFGS(n, npairs=5, **kwargs)

Class InverseLBFGS is a container used to store and manipulate limited-memory BFGS matrices. It may be used, e.g., in an LBFGS solver for unconstrained minimization or as a preconditioner. The limited-memory matrix that is implicitly stored is a positive definite approximation to the inverse Hessian. Therefore, search directions may be obtained by computing matrix-vector products only. Such products are efficiently computed by means of a two-loop recursion.

Instantiation is as follows

lbfgsupdate = InverseLBFGS(n)

where n is the number of variables of the problem.

Keywords :
npairs:the number of (s,y) pairs stored (default: 5)
scaling:enable scaling of the ‘initial matrix’. Scaling is done as ‘method M3’ in the LBFGS paper by Liu and Nocedal; the scaling factor is <sk,yk>/<yk,yk> (default: False).

Member functions are

  • store to store a new (s,y) pair and discard the oldest one

    in case the maximum storage has been reached,

  • matvec to compute a matrix-vector product between the current

    positive-definite approximation to the inverse Hessian and a given vector.

matvec(iter, v)

Compute a matrix-vector product between the current limited-memory positive-definite approximation to the inverse Hessian matrix and the vector v using the LBFGS two-loop recursion formula. The ‘iter’ argument is the current iteration number.

When the inner product <y,s> of one of the pairs is nearly zero, the function returns the input vector v, i.e., no preconditioning occurs. In this case, a safeguarding step should probably be taken.
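
The two-loop recursion itself is standard; a standalone sketch (not the class's own implementation), operating on a list of (s, y) pairs stored oldest first, is:

    import numpy as np

    def two_loop(v, pairs, gamma=1.0):
        # pairs: list of (s, y) numpy arrays, oldest first; gamma scales the initial matrix H0.
        q = v.copy()
        alphas = []
        for s, y in reversed(pairs):               # newest pair first
            rho = 1.0 / np.dot(y, s)
            alpha = rho * np.dot(s, q)
            q -= alpha * y
            alphas.append((rho, alpha, s, y))
        r = gamma * q                              # apply the (scaled) initial matrix H0 = gamma * I
        for rho, alpha, s, y in reversed(alphas):  # oldest pair first
            beta = rho * np.dot(y, r)
            r += (alpha - beta) * s
        return r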

solve(iter, v)

This is an alias for matvec used for preconditioning.

store(iter, new_s, new_y)

Store the new pair (new_s,new_y) computed at iteration iter.

class lbfgs.LBFGSFramework(nlp, **kwargs)

Class LBFGSFramework provides a framework for solving unconstrained optimization problems by means of the limited-memory BFGS method.

Instantiation is done by

lbfgs = LBFGSFramework(nlp)

where nlp is an instance of a nonlinear problem. A solution of the problem is obtained by calling the solve member function, as in

lbfgs.solve().

Keywords :
npairs:the number of (s,y) pairs to store (default: 5)
x0:the starting point (default: nlp.x0)
maxiter:the maximum number of iterations (default: max(10n,1000))
abstol:absolute stopping tolerance (default: 1.0e-6)
reltol:relative stopping tolerance (default: nlp.stop_d)

Other keyword arguments will be passed to InverseLBFGS.

The linesearch used in this version is Jorge Nocedal’s modified Moré and Thuente linesearch, attempting to ensure satisfaction of the strong Wolfe conditions. The modifications attempt to limit the effects of rounding error inherent to the Moré and Thuente linesearch.

solve()

Bound-Constrained Programming

This module implements a purely primal-dual interior-point method for bound-constrained optimization. The method uses the primal-dual merit function of Forsgren and Gill and solves subproblems by means of a truncated conjugate gradient method with trust region.

References:

[1] A. Forsgren and Ph. E. Gill, Primal-Dual Interior Methods for Nonconvex
    Nonlinear Programming, SIAM Journal on Optimization, 8(4):1132-1152, 1998.
[2] Ph. E. Gill and E. M. Gertz, A primal-dual trust region algorithm for
    nonlinear optimization, Mathematical Programming, 100(1):49-94, 2004.
[3] P. Armand and D. Orban, A Trust-Region Interior-Point Method for
    Bound-Constrained Programming Based on a Primal-Dual Merit Function,
    Cahier du GERAD G-xxxx, GERAD, Montreal, Quebec, Canada, 2008.

D. Orban, Montreal
class pdmerit.PrimalDualInteriorPointFramework(nlp, TR, TrSolver, **kwargs)
AtOptimality(x, z, **kwargs)

Shortcut.

GradPDMerit(x, z, **kwargs)

Evaluate the gradient of the primal-dual merit function at (x,z). See PDMerit() for a description of z.

HessProd(x, z, p, **kwargs)

Compute the matrix-vector product between the Hessian matrix of the primal-dual merit function at (x,z) and the vector p. See help(PDMerit) for a description of z. If there are b bounded variables and q two-sided bounds, the vector p should have length n+b+2q. The Hessian matrix has the general form

[ H + 2 mu X^{-2}    I         ]
[ I                  mu Z^{-2} ].

PDHess(x, z, **kwargs)

Assemble the modified Hessian matrix of the primal-dual merit function at (x,z). See PDMerit() for a description of z. The Hessian matrix has the general form

[ H + 2 X^{-1} Z    I        ]
[ I                 Z^{-1} X ].

PDHessProd(x, z, p, **kwargs)

Compute the matrix-vector product between the modified Hessian matrix of the primal-dual merit function at (x,z) and the vector p. See help(PDMerit) for a description of z. If there are b bounded variables and q two-sided bounds, the vector p should have length n+b+2q. The Hessian matrix has the general form

[ H + 2 X^{-1} Z    I        ]
[ I                 Z^{-1} X ].

PDHessTemplate(**kwargs)

Assemble the part of the modified Hessian matrix of the primal-dual merit function that is iteration independent. The function PDHess() fills in the blanks by updating the rest of the matrix.

PDMerit(x, z, **kwargs)

Evaluate the primal-dual merit function at (x,z). If there are b >= 0 one-sided bound constraints and q >= 0 two-sided bound constraints, the vector z should have length b+2q. The components z[i] (0 <= i < b+q) correspond, in increasing order of variable indices, to

  • variables with a lower bound only,
  • variables with an upper bound only,
  • the lower bound on variables that are bounded below and above.

The components z[i] (b+q <= i < b+2q) correspond to the upper bounds on variables that are bounded below and above.

This function returns the value of the primal-dual merit function. The current value of the objective function can be supplied via the keyword argument f.

Precon(v, **kwargs)

Generic preconditioning method—must be overridden.

PrimalMultipliers(x, **kwargs)

Return the vector of primal multipliers at x. The value of the barrier parameter used can either be supplied using the keyword argument mu or the current value of the instance is used.

SetupPrecon(**kwargs)

Construct or set up the preconditioner—must be overridden.

SolveInner(**kwargs)

Perform a series of inner iterations so as to minimize the primal-dual merit function with the current value of the barrier parameter to within some given tolerance. The only optional argument recognized is

stopTol:stopping tolerance (default: muerrfact * mu).

SolveOuter(**kwargs)
StartingPoint(**kwargs)

Compute a strictly feasible initial primal-dual estimate (x0, z0).

UpdateMu(**kwargs)

Update the barrier parameter before the next round of inner iterations.

UpdatePrecon(**kwargs)

Override this method for preconditioners that need updating, e.g., a limited-memory BFGS preconditioner.

ftb(x, z, step, **kwargs)

Compute the largest alpha in ]0,1] such that

(x,z) + alpha * step >= (1 - tau) * (x,z)

where 0 < tau < 1. By default, tau = 0.9.
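
An illustrative standalone version of this fraction-to-the-boundary rule (not the class's own code), applied to one strictly positive vector v and step dv, is:

    import numpy as np

    def fraction_to_boundary(v, dv, tau=0.9):
        # Largest alpha in (0, 1] with v + alpha*dv >= (1 - tau)*v, assuming v > 0.
        decreasing = dv < 0
        if not np.any(decreasing):
            return 1.0
        return min(1.0, np.min(-tau * v[decreasing] / dv[decreasing]))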

General Nonlinear Programming

Todo

Insert this module.