Abstract

A random perturbation of the generalized reduced gradient method for optimization under nonlinear differentiable constraints is proposed. Generally speaking, an iteration of this method proceeds in two phases. In the Restoration Phase, feasibility is restored by solving an auxiliary problem, in general a nonlinear system of equations. In the Optimization Phase, optimality is improved by taking the objective function into account on the subspace tangent to the constraints. In this paper, assumptions are stated on the Restoration Phase and the Optimization Phase that establish the global convergence of the algorithm. Numerical examples involving a mixture problem and the octagon problem are also given.

1. Introduction

We consider the problem

maximize f(x) subject to g(x) = 0 and x ∈ S,   (1.1)

where f : ℝⁿ → ℝ and g : ℝⁿ → ℝᵐ (m < n) are continuously differentiable, and S ⊂ ℝⁿ is a closed and convex set (e.g., the box S = {x ∈ ℝⁿ : a ≤ x ≤ b}).

Notice that any nonlinear program can be put into the standard form (1.1) by introducing nonnegative slack variables if there are inequalities (other than bounds on the variables) among the constraints, and by allowing some of the bounds to be −∞ or +∞ if necessary. The standard form is adopted here for ease of notation and discussion.
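For illustration, here is a minimal sketch of this slack-variable device on a hypothetical inequality constraint (the names h, g and the test point are not from the paper; they only show the mechanics of the reformulation):

```python
# Minimal sketch: rewriting an inequality constraint h(x) <= 0 in the
# equality standard form (1.1) by adding a nonnegative slack variable s.
import numpy as np

def h(x):
    # hypothetical inequality constraint: x1^2 + x2^2 - 1 <= 0
    return x[0]**2 + x[1]**2 - 1.0

def g(z):
    # standard-form equality constraint: h(x) + s = 0, with slack s = z[2] >= 0
    x, s = z[:2], z[2]
    return h(x) + s

x = np.array([0.6, 0.5])
s = -h(x)                      # feasible slack value (s >= 0 iff h(x) <= 0)
print(g(np.append(x, s)))      # prints 0.0 -> the augmented point satisfies (1.1)
```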

Feasible methods for constrained optimization such as the generalized reduced gradient method play an important role in practice and are still widely used in technological applications. The main technique proposed in this work for solving constrained optimization problems is the generalized reduced gradient method combined with random perturbation. We are mainly interested in the situation where, on the one hand, the objective function is not concave and, on the other hand, the constraints are in general nonlinear. It is worth noting that some variant of the generalized reduced gradient method reduces, in the case where all the constraints are linear, to the reduced gradient method [1], and some other variant, in the case of linear programming, to the Dantzig simplex method.

The problem (1.1) can be numerically approached by using sequential quadratic programming [2], other methods for nonlinear programming [3, 4], and the generalized reduced gradient method [5], which generates a sequence (x_k), where x_0 is an initial feasible point and, for each k ≥ 0, a new feasible point x_{k+1} is generated from x_k by using an operator Q_{k+1} (see Section 3). Thus the iterations are given by

x_{k+1} = Q_{k+1}(x_k),  k = 0, 1, 2, ....

A fundamental difficulty arises due to the lack of concavity: the convergence of the sequence (x_k) to a global maximum point is not ensured in the general situation considered. In order to prevent the iterates from converging to a local maximum, various modifications of these basic methods have been introduced in the literature. For instance, we can find in the literature modifications of the basic descent methods [6-9], stochastic methods combined with penalty functions [10], evolutionary methods [11], and simulated annealing [12]. We introduce in this paper a different approach, inspired by the method of random perturbations introduced in [13] for the unconstrained minimization of continuously differentiable functions and adapted to linearly constrained problems in [14].

In such a method, the sequence (x_k) is replaced by a sequence of random vectors (X_k), and the iterations are modified as follows:

X_{k+1} = Q_{k+1}(X_k) + P_{k+1},

where P_{k+1} is a suitable random variable, called the stochastic perturbation. The perturbation goes to zero slowly enough in order to prevent convergence to a local maximum (see Section 4). The notation is introduced in Section 2, the generalized reduced gradient method is recalled in Section 3, and the results of some numerical experiments are given in Section 5.
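As a schematic illustration of this recursion (Q and the Gaussian perturbation law below are placeholders, and the feasibility restoration handled later in the paper is omitted here):

```python
import numpy as np

def perturbed_step(x, Q, sigma, rng):
    """One perturbed iteration X_{k+1} = Q(X_k) + P_{k+1}, with a Gaussian
    perturbation of standard deviation sigma (restoration of feasibility omitted)."""
    return Q(x) + sigma * rng.standard_normal(x.size)
```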

2. Notations and Assumptions

We use the following notation:
(i) ℝⁿ is the n-dimensional real Euclidean space;
(ii) … stands for …;
(iii) x is the column vector whose components are x_1, …, x_n;
(iv) ‖x‖ is the Euclidean norm of x, ‖x‖ = (x_1² + ⋯ + x_n²)^{1/2};
(v) Aᵗ is the transpose matrix associated with the matrix A.

Definition 2.1. A point x ∈ S is said to be feasible if |g_j(x)| ≤ ε for j = 1, …, m, where ε is some preassigned small positive constant.

Assume that we know some feasible point x. We make the following nondegeneracy assumption: the vector x can be split into two components, an m-dimensional component x_B (the basic part) and x_N, a component of dimension n − m (the nonbasic part), such that the following two properties hold: (H1) x_B is strictly between its bounds; (H2) the square matrix ∂g/∂x_B, computed at x, is nonsingular.

If property (H2) does not hold, we add artificial variables to the constraints so that properties (H1)-(H2) hold in the associated problem of the form (1.1).
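As an illustration, a minimal sketch (not the paper's procedure; J, a, b and the greedy column choice are assumptions) of one way to produce a splitting satisfying (H1)-(H2):

```python
# Sketch: given the m-by-n Jacobian J of g at x and the bounds a, b, pick m
# basic columns whose variables are strictly between bounds and whose
# submatrix B = J[:, basic] is nonsingular, as required by (H1)-(H2).
import numpy as np

def split_basic_nonbasic(J, x, a, b, tol=1e-8):
    m, n = J.shape
    interior = [j for j in range(n) if a[j] + tol < x[j] < b[j] - tol]   # (H1)
    basic = []
    # greedy choice: keep adding columns while they remain linearly independent
    for j in sorted(interior, key=lambda j: -np.linalg.norm(J[:, j])):
        trial = basic + [j]
        if np.linalg.matrix_rank(J[:, trial], tol) == len(trial):        # (H2)
            basic = trial
        if len(basic) == m:
            break
    if len(basic) < m:
        raise ValueError("degenerate point: add artificial variables")
    nonbasic = [j for j in range(n) if j not in basic]
    return basic, nonbasic
```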

Let

C = {x ∈ S : g(x) = 0}

denote the feasible set of (1.1). The objective function is f, and its upper bound on C is denoted by α:

α = sup{ f(x) : x ∈ C }.

Let us introduce, for θ < α, the level set

C_θ = { x ∈ C : f(x) ≥ θ }.

We assume that

α < +∞,   (2.5)

meas({ x ∈ C : f(x) > α − ε }) > 0 for every ε > 0,   (2.6)

where meas(C_θ) is the measure of C_θ.

Since ℝⁿ is a finite-dimensional space, assumption (2.5) is verified when C is bounded or f is coercive, that is, when f(x) → −∞ as ‖x‖ → +∞. Assumption (2.6) is verified when C contains a sequence of neighborhoods of a point of optimum having strictly positive measure, that is, when such a point can be approximated by a sequence of points of the interior of C. We observe that assumptions (2.5)-(2.6) yield that

for every θ < α, the level set C_θ is nonempty and has strictly positive measure.   (2.7)

From (2.5)-(2.6) we also obtain bounds involving the direction d determined on the basis of the gradient of f.

Thus, these bounds hold with positive real constants; the resulting relation (2.11) is used in the convergence analysis of Section 4.

3. Generalized Reduced Gradient Method

By the implicit function theorem, there exists, in some neighborhood of x, a unique continuous function (mapping), say x_B = φ(x_N), such that g(φ(x_N), x_N) is identically zero in that neighborhood. In addition, φ has a continuous derivative dφ/dx_N, which can be computed by the chain rule:

(∂g/∂x_B)(dφ/dx_N) + ∂g/∂x_N = 0,

or, more conveniently,

dφ/dx_N = −B⁻¹ N.

In what follows, we call B = ∂g/∂x_B the Jacobian of g with respect to x_B computed at x. Similarly, we set N = ∂g/∂x_N. Substituting x_B = φ(x_N) into the objective function f, we obtain the reduced function

f̃(x_N) = f(φ(x_N), x_N),

the gradient of which at x_N is, by the chain rule again,

∇f̃ = ∇_N f + (dφ/dx_N)ᵗ ∇_B f

(all derivatives computed at x). Setting

uᵗ = (∇_B f)ᵗ B⁻¹,

we have the following formula for the reduced gradient r:

rᵗ = (∇_N f)ᵗ − uᵗ N.

The generalized reduced gradient method tries to extend the methods of linear optimization to the nonlinear case. These methods are close to, or equivalent to, projected gradient methods [15]; only the presentation of the methods is frequently quite different.
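Numerically, the reduced gradient can be formed without inverting B explicitly; a minimal sketch under the notation above (B, N and the two gradient blocks are assumed to be available):

```python
# Sketch of the reduced-gradient formula: B = dg/dx_B, N = dg/dx_N,
# gB = grad_B f, gN = grad_N f, all evaluated at the current point.
import numpy as np

def reduced_gradient(B, N, gB, gN):
    """Return the multipliers u and the reduced gradient r = gN - N^T u,
    where u solves B^T u = gB (equivalently u^T = gB^T B^{-1})."""
    u = np.linalg.solve(B.T, gB)
    r = gN - N.T @ u
    return u, r
```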

Let us define the projected reduced gradient r̄ (in the space of the nonbasic variables x_N) by its components: (i) r̄_j = 0 if x_j = a_j and r_j < 0; (ii) r̄_j = 0 if x_j = b_j and r_j > 0; (iii) r̄_j = r_j otherwise.

It is convenient to extend r̄ by zero components on the basic part, and hence to consider it as a vector of ℝⁿ. It is a simple matter to verify that the Kuhn-Tucker conditions for the problem in standard form reduce to r̄ = 0, and that u is the row vector of multipliers corresponding to the equations g(x) = 0. We assume from now on that r̄ ≠ 0.

The following relations,

y_j = a_j if x_j = a_j and r_j < 0,
y_j = b_j if x_j = b_j and r_j > 0,

define what we call the face (at x), denoted by D(x). The row vector r̄ (the projected reduced gradient) is also named the projection of r onto D(x).
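A minimal sketch of this projection for the nonbasic variables (the names xN, aN, bN and the tolerance are assumptions):

```python
# Sketch: project the reduced gradient r onto the face, given the nonbasic
# variables xN and their bounds aN, bN.
import numpy as np

def project_reduced_gradient(r, xN, aN, bN, tol=1e-10):
    """Zero out components that would push an active nonbasic variable
    outside its bound (items (i)-(ii) of the definition above)."""
    rbar = r.copy()
    at_lower = np.isclose(xN, aN, atol=tol) & (r < 0)
    at_upper = np.isclose(xN, bN, atol=tol) & (r > 0)
    rbar[at_lower | at_upper] = 0.0
    return rbar
```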

Let d_N be any nonzero column vector in D(x) such that r̄ d_N > 0; the vector d_N is an ascent direction for the reduced function f̃.

There is a striking analogy with what is usually done in linear programming, where φ can be computed in closed form. This is generally not the case if the constraints are nonlinear. Even if a closed form is available, actual substitution may very well be undesirable.

The generalized reduced gradient algorithm consists of the following steps.

Step 1. Assume that some feasible point x_0 is known. Set k = 0 and go to the next step.

Step 2. Step 2 is conveniently divided into substeps.
(2.1) Compute the Jacobian of the constraints and the gradient of the objective function.
(2.2) Determine a splitting of x_k into (x_B, x_N) and the corresponding splitting of the Jacobian into (B, N), such that x_B is strictly between its bounds and B is nonsingular; invert B.
(2.3) Compute the Lagrange multipliers u and the reduced gradient r.
(2.4) Determine the face D(x_k) and the projection r̄ of r onto it [16].
(2.5) If r̄ is zero (or almost zero in some sense), then terminate: x_k is a KKT point [17, 18]. Otherwise, go to the next step.

Step 3. Choose an ascent direction d_N [18], that is, a nonzero vector in the face such that r̄ d_N > 0.

Step 4. Choose a first stepsize θ_0.

Step 5. Maximize, with respect to θ, the function f(φ(x_N + θ d_N), x_N + θ d_N) with more or less accuracy (the line search). For each value of θ under consideration, this step requires solving the following system of equations:

g(y, x_N + θ d_N) = 0,

where y is the m-dimensional vector of unknowns (the basic variables).
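The restoration of the basic variables in Step 5 amounts to solving a nonlinear system for y; a minimal sketch using Newton's method (g, J_B and the stopping rule are assumptions, not the paper's code):

```python
# Sketch of the restoration in Step 5: g(xB, xN) returns the m constraint
# values and J_B(xB, xN) its Jacobian with respect to the basic variables.
import numpy as np

def restore_basic(g, J_B, xB, xN, tol=1e-8, max_iter=50):
    """Solve g(y, xN) = 0 for the basic variables y by Newton's method,
    starting from the current basic values xB."""
    y = xB.copy()
    for _ in range(max_iter):
        gv = g(y, xN)
        if np.linalg.norm(gv) <= tol:
            return y                                  # feasibility restored
        y = y - np.linalg.solve(J_B(y, xN), gv)       # Newton step on y
    raise RuntimeError("restoration failed: reduce the stepsize")
```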

Step 6. Assuming that Step 5 succeeds, an improved feasible point x_{k+1} is obtained, which replaces x_k. Replace k by k + 1 and go back to Step 2.

Let us recall briefly the essential points of the generalized reduced gradient method: an initial feasible guess x_0 is given, and a sequence (x_k) is generated by using iterations of the general form

x_{k+1} = Q_{k+1}(x_k),  k ≥ 0,   (3.13)

where Q_{k+1} stands for the map defined by Steps 2-6 (choice of an ascent direction in the face, line search on the reduced function, and restoration of feasibility).

Remark 3.1. The theoretical convergence of the generalized reduced gradient method has been proved in [17, 19, 20].

4. Two-Phase Generalized Reduced Gradient (TPGRG) Method

The main difficulty remains the lack of concavity: if f is not concave, the Kuhn-Tucker points may not correspond to a global maximum (see, e.g., [7, 8]). In what follows, this point is addressed by using an appropriate random perturbation.

The deterministic sequence (x_k) is replaced by a sequence of random vectors (X_k), obtained by a random perturbation of the deterministic iteration (3.13); then we have

X_{k+1} = Q_{k+1}(X_k) + P_{k+1},   (4.1)

where the stepsize satisfies Step 5 of the GRG algorithm (see [9]). Equation (4.1) can be viewed as a perturbation of the ascent direction, which is replaced by a new direction, and the iterations (4.1) become the perturbed GRG iterations. General properties defining convenient sequences of perturbations (P_k) can be found in the literature [13, 14]: usually, a sequence of Gaussian laws may be used in order to produce perturbations satisfying these properties.

We introduce a random vector and denote by appropriate symbols its cumulative distribution function and its probability density. We denote similarly the conditional cumulative distribution function and the conditional probability density of X_{k+1} given X_k.

Let us introduce a sequence of n-dimensional random vectors (Z_k), distributed as the random vector introduced above. We consider also (ξ_k), a suitable decreasing sequence of strictly positive real numbers converging to 0 slowly enough (the precise condition appears in Theorem 4.3).

The optimal choice of the stepsize is determined by Step 5. The relation (2.11) then furnishes bounds used in the sequel. We assume that there exists a decreasing function, defined on (0, +∞), furnishing a lower bound of the conditional density of X_{k+1} given X_k; this bound is used in the proof of Lemma 4.1. For simplicity, let

P_{k+1} = ξ_k Z_{k+1},   (4.11)

where Z_{k+1} is a random variable as above.
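A minimal sketch of sampling a perturbation of the form (4.11) with a Gaussian law follows; the decay schedule shown is an assumption chosen for illustration, the paper's precise choice being stated in Theorem 4.4:

```python
# Sketch: sample P_{k+1} = xi_k * Z_{k+1} with Z_{k+1} standard Gaussian and
# xi_k decreasing slowly in k (slow decay prevents trapping in local maxima).
import numpy as np

def perturbation(k, dim, rng, c=1.0, d=2.0):
    xi_k = c / np.sqrt(np.log(k + d))      # assumed schedule, cf. Theorem 4.4
    return xi_k * rng.standard_normal(dim)
```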

The procedure generates a sequence (f(X_k)). By construction, this sequence is increasing and upper bounded by α:

f(X_k) ≤ f(X_{k+1}) ≤ α.

Thus, there exists a random variable U ≤ α such that f(X_k) → U for k → +∞.

Lemma 4.1. Let θ < α, and let the perturbation be given by (4.11); then there exists a strictly positive real number c_k(θ) such that P(f(X_{k+1}) ≥ θ | f(X_k) < θ) ≥ c_k(θ), where c_k(θ) depends only on θ and ξ_k.

Proof. Let , for .

Since θ < α, it follows from (2.7) that C_θ is not empty and has a strictly positive measure.

If for any , the result is immediate, since we have on .

Let us assume that there exists such that . For , we have and .

for any , since the sequence is increasing, and we have also Thus Let ; we have from (4.12) But the Markov property yields that By the conditional probability rule Moreover From (4.15), we have Thus Taking (4.8) into account, we have The relation (2.11) shows that and (4.10) yields that Hence the result follows.

4.1. Global Convergence

The global convergence is a consequence of the following result, which follows from the Borel-Cantelli lemma (see, e.g., [13]):

Lemma 4.2. Let (U_k) be an increasing sequence of random variables, upper bounded by α. Then there exists a random variable U ≤ α such that U_k → U for k → +∞. Assume that, for any θ < α, there is a sequence of strictly positive real numbers (c_k(θ)) such that P(U_{k+1} ≥ θ | U_k < θ) ≥ c_k(θ) and Σ_k c_k(θ) = +∞. Then U = α almost surely.

Proof. For instance, see [13, 21].

Theorem 4.3. Assume that the perturbation is given by (4.11), that the sequence (ξ_k) is nonincreasing and converges to 0, and that the divergence condition (4.29) holds. Then U = α almost surely.

Proof. Since the sequence (ξ_k) is nonincreasing, the bounds furnished by Lemma 4.1 can be estimated from below. Thus, (4.29) shows that Σ_k c_k(θ) = +∞ for every θ < α. Using Lemmas 4.1 and 4.2, we have U = α almost surely.

Theorem 4.4. Let the perturbation be defined by (4.11), and let

ξ_k = c / (log(k + d))^{1/2},   (4.33)

where c > 0, d > 1, and k is the iteration number. If the constant c is chosen large enough, then U = α almost surely.

Proof. We have, from (4.33), an explicit expression for ξ_k. So, for c chosen large enough, the divergence condition (4.29) is satisfied and, from Theorem 4.3, we have U = α almost surely.

4.2. Practical Implementation

The above results suggest the following numerical algorithm.
(1) An initial guess x_0 is given.
(2) At iteration number k, x_k is known and x_{k+1} is determined by performing the following three substeps.
(2.1) Unperturbed ascent: we determine the ascent direction and the step using the ascent method (3.13); this generates a first trial point.
(2.2) Perturbation: we determine a sample of new trial points by adding the stochastic perturbation to the first trial point.
(2.3) Dynamics: we determine x_{k+1} by selecting it from the set of available trial points.

The computation of x_{k+1} is performed in two phases. In the first phase, we determine the trial points (unperturbed ascent step and perturbation step). In the second phase, we determine x_{k+1} by selecting it from these trial points (dynamics step).

As was shown in Theorem 4.4, Substep (2.2) may use perturbations of the form ξ_k Z_{k+1}, where Z_{k+1} is a sample of the Gaussian vector introduced in Section 4 and ξ_k is given by (4.33).

For instance, we can consider the elitist dynamics, in which x_{k+1} is the available trial point with the largest objective value.
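A minimal sketch of one iteration of the above algorithm with elitist dynamics (grg_step stands for the unperturbed iteration (3.13) and restore for the feasibility restoration of Step 5; both, as well as the parameter names, are assumptions rather than the paper's Fortran code):

```python
import numpy as np

def tpgrg_iteration(x, f, grg_step, restore, xi_k, n_pert, rng):
    """One two-phase iteration: unperturbed GRG ascent, random perturbation
    of the trial point, feasibility restoration, then elitist selection."""
    trials = [grg_step(x)]                        # Substep (2.1): unperturbed ascent
    for _ in range(n_pert):                       # Substep (2.2): perturbed trial points
        z = trials[0] + xi_k * rng.standard_normal(x.size)
        trials.append(restore(z))                 # bring the perturbed point back to feasibility
    return max(trials, key=f)                     # Substep (2.3): elitist dynamics
```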

5. Numerical Results

In order to apply the method presented in (4.41), we start at an initial value x_0. At step k, x_k is known and x_{k+1} is determined.

We generate a fixed number of perturbations at each iteration; the iterations are stopped when the current point is a Kuhn-Tucker point. We record the value of k at which the iterations are stopped (it corresponds to the number of evaluations of the gradient of f). The optimal value and the optimal point found are reported for each problem. The perturbation is normally distributed, and samples are generated by using the log-trigonometric generator and the standard random number generator of the FORTRAN library. We use ξ_k given by (4.33); the remaining parameter values are given for each test problem.
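The log-trigonometric generator referred to here is, presumably, the Box-Muller transform; a minimal sketch (in Python rather than the paper's Fortran):

```python
import numpy as np

def log_trig_normal(n, rng):
    """Box-Muller ("log-trigonometric") transform: map pairs of uniform
    variates to independent standard normal variates."""
    u1 = rng.random(n)
    u2 = rng.random(n)
    r = np.sqrt(-2.0 * np.log(u1))
    return r * np.cos(2.0 * np.pi * u2)   # r*sin(2*pi*u2) gives a second independent sample
```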

The experiments were performed on an HP workstation with an Intel(R) Celeron(R) M processor at 1.30 GHz and 224 MB of RAM. The row "cpu" gives the mean CPU time in seconds for one run.

5.1. Octagon Problem

Consider polygons in the plane with eight sides (octagons for short) and unit diameter. Which of them has maximum area? (See, e.g., [22].)

Graham's conjecture states that the optimal octagon can be illustrated as in Figure 1, in which a solid line between two vertices indicates that the distance between these points is one.

This question can be formulated as a quadratically constrained quadratic optimization problem defining this configuration, which appears as follows:

There are 33 variables and 34 constraints.
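To indicate the structure of this model (a sketch only, not the exact formulation used in the paper): the objective is the polygon area, a quadratic function of the vertex coordinates, and each constraint bounds a squared distance between vertices by the unit diameter.

```python
import numpy as np

def area(v):
    """Shoelace formula: area of the polygon with vertices v (8x2 array),
    a quadratic function of the coordinates."""
    x, y = v[:, 0], v[:, 1]
    return 0.5 * np.abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def diameter_constraints(v):
    """Quadratic constraints ||v_i - v_j||^2 - 1 <= 0 for all vertex pairs."""
    return [np.sum((v[i] - v[j])**2) - 1.0
            for i in range(len(v)) for j in range(i + 1, len(v))]
```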

We use the regular octagon as the starting point (its area is indicated in Figure 2) and stop the iterations when the error is less than a prescribed tolerance.

The Fortran code of TPGRG furnishes the following optimal solution:

The branch-and-cut method [23] also solves the octagon problem, using the regular octagon as a starting point, but its CPU time is more than 30 hours.

5.2. Mixture Problem

In this petrochemical mixture example, we have four reservoirs. The first two receive three distinct source products, and their contents are then combined in the other two in order to create the desired blends. The question is to determine the quantity of each product to buy in order to maximize the profit (see, e.g., [24, 25]).

The first reservoir receives two products, of qualities 3 and 1, in certain quantities. The second reservoir contains a product from the third source, of quality 2, in a certain quantity. One wants to obtain the final blends in the two remaining reservoirs, of capacities 10 and 20, respectively.

Figure 3, where the variables represent the various quantities, illustrates this situation. The unit prices of the products bought are, respectively, 60, 160, and 100; those of the finished products are 90 and 150. Therefore the profit to be maximized is the difference between the sale revenue and the purchase cost. The quality of the mixture contained in the pooling reservoir, and the qualities of the final blends, are obtained as quantity-weighted averages of the input qualities. The addition of the volume-conservation constraints permits the elimination of some of the variables.
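To make the structure concrete, here is a small sketch of a Haverly-type pooling model with assumed variable names (q1, q2, q3 are the purchased quantities of qualities 3, 1, and 2; yP1, yP2 are the pool outflows towards the two final reservoirs; z1, z2 are the direct flows of the quality-2 product); this illustrates the model class only, not the paper's exact formulation:

```python
def profit(q1, q2, q3, yP1, yP2, z1, z2):
    """Sale revenue minus purchase cost with the unit prices of the text
    (purchases at 60, 160, 100; finished products at 90 and 150)."""
    return 90.0 * (yP1 + z1) + 150.0 * (yP2 + z2) - 60.0 * q1 - 160.0 * q2 - 100.0 * q3

def pool_quality(q1, q2):
    """Quality of the mixture in the pool: quantity-weighted average of the
    input qualities 3 and 1 (assumes q1 + q2 > 0)."""
    return (3.0 * q1 + 1.0 * q2) / (q1 + q2)

def final_quality(pool_q, y, z):
    """Quality of a final blend combining pool outflow y with direct flow z of quality 2."""
    return (pool_q * y + 2.0 * z) / (y + z)
```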

Remark 5.1. One has the stated relation by (5.5). Suppose that the contrary holds; then a contradiction with inequality (5.8) is obtained.

The transformed mathematical model is the following; there are 11 variables and 10 constraints.

We use suitable values of the parameters, and the Fortran code of TPGRG furnishes the following optimal solution, with cpu = 100 seconds; it coincides with the known global solution of the mixture problem.

6. Concluding Remarks

A two-phase generalized reduced gradient method is presented for nonlinear constraints, involving the addition of a stochastic perturbation. This approach leads to a stochastic ascent method where the deterministic sequence generated by the two-phase generalized reduced gradient method is replaced by a sequence of random variables.

The TPGRG method converges to a global maximum for any differentiable objective function, whereas the GRG method may converge only to a local maximum.

The numerical experiments show that the method is effective for global optimization problems. Here again, we observe that the addition of the stochastic perturbation improves the result, at the price of a larger number of evaluations of the objective function. The main difficulty in the practical use of the stochastic perturbation is connected to the tuning of its parameters.