Optimal control theory and static optimization in economics ebook

These loci are drawn in Figure 2. The last equilibrium is a saddle point, the first one is an unstable node, and the other two are locally asymptotically stable nodes. This can be verified from the matrix of coefficients of the linearized system; the configuration of the other equilibria can be checked similarly. In this section we concentrate on a class of systems often encountered in optimal control theory. The variable x can be extracted from the first equation; this yields x(t) in terms of the remaining variables, and the rest is straightforward.

Figure 2. Appendix, Example 2. It is clear that all antiderivatives of a given function differ from one another only by a constant. The symbol for the indefinite integral is $\int$. As we shall see, it is in implicit form. Find the indefinite integral of $xe^x$ (Example A.8). Evaluate the following definite integral: $\int_a^b f(u)\,du = F(b) - F(a)$, where $F(x)$ is any antiderivative of $f(x)$ (equation A.9).
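As a quick check of these rules, the two examples can be reproduced symbolically. The sketch below uses Python's sympy; the integrand $xe^x$ and the $F(b) - F(a)$ identity come from the text, the rest is illustrative.

```python
# Minimal sketch: reproduce the appendix's integration examples with sympy.
import sympy as sp

x, a, b = sp.symbols('x a b')

# Indefinite integral of x*exp(x); antiderivatives differ by a constant.
F = sp.integrate(x * sp.exp(x), x)
print(F)                                     # (x - 1)*exp(x)

# Definite integral as F(b) - F(a), for any antiderivative F.
f = x * sp.exp(x)
print(sp.integrate(f, (x, a, b)))            # direct evaluation
print(sp.simplify(F.subs(x, b) - F.subs(x, a)
                  - sp.integrate(f, (x, a, b))))  # 0: the two agree
```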

In the definite integral (A.9), a and b are called the lower and upper limits of integration. Suppose that aggregate output adjusts gradually. Let u(w) be a utility function for which the coefficient of absolute risk aversion is constant.

Find the equilibrium and show whether it is stable. This economy's labor force grows or declines according to the difference between the marginal product of labor and the wage w: $\dot L(t) = L(t)\,[F_L(K(t), L(t)) - w]$. Is it stable? We use a simple dynamic macroeconomic model to introduce several important concepts through numerical examples.

In such a model we no longer solve for the static equilibrium values but determine the functional form of the variables in terms of t. Let the size of the loan be L, with continuously compounded interest at the rate r; the amount to be repaid at the expiry date of the loan, say T, is $Le^{rT}$. This implies that we jump onto a new trajectory at time $T_1$, as illustrated in Figure 3.
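The repayment formula $Le^{rT}$ is easy to verify numerically. The sketch below compares continuous compounding with per-period compounding; the loan size, rate, and horizon are illustrative values, not the text's.

```python
# Continuously compounded repayment L*e^{rT} versus periodic compounding.
import math

L, r, T = 1000.0, 0.05, 10.0          # illustrative loan, rate, horizon
print(L * math.exp(r * T))            # continuous: ~1648.72

n = 12                                # compounding periods per year
print(L * (1 + r / n) ** (n * T))     # monthly: ~1647.01, approaches e^{rT}
```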

This emphasizes the importance of correctly specifying boundary conditions. The exact conditions needed to ensure this can be derived but are intricate. Note that L could be negative. Our objective is to maximize utility over some horizon T, that is, to maximize $\int_0^T U(C(t))\,dt$. Letting c vary over time could only improve the integral maximand, if it had any effect. In that sense the above solution is suboptimal.

Clearly, then, it is not possible to solve equation 3.x directly. The elementary calculus techniques used in this chapter are of no help in solving the problem of (3.25). This is indeed a simple example of what we call a control problem; the solution of such problems is the subject of the next chapter. It is well understood in a financial context.

There are several criticisms of this criterion. A second criticism concerns the treatment of future generations and the choice of the discount rate δ. Another concerns the summation of utility levels across time, which is implicit in the integral formulation of the criterion: whereas it is reasonable to assume that money flows can be added, it is less clear that utilities can. (This assumes that interest is compounded each period, that is, reinvested after each period.)

One remarkable feature of the exponential form of the discount factor is that it is the only one that ensures that the planner will be able to formulate a plan that he will actually wish to follow, even if offered the opportunity to change it at a later date. This is the issue of consistency that will be briefly discussed in Chapter 4. In continuous-time models, that is, when the time variable is real-valued, equation 3.x applies.

Discounting is familiar in economies operating near full employment; hence its existence is not often questioned. Furthermore, we can give it a neat theoretical justification by the argument that production processes take time, and that the use of a productive asset, or of the money to buy it, for some length of time must attract an economic rent. Thus, if our objective is expressed in money terms, it seems appropriate to use a discount factor: we discount and aggregate the flow of profit at each date much as we would a monetary stream, using the subjective discount rate.

This is the exponential decay typical of radioactive materials. We can express this phenomenon as a differential equation. Reconsider the consumption path problem of Section 3. Suppose now that the planner's horizon is the time interval [0, 3]. He has more latitude in the choice of the propensity to consume in the sense that he can choose one value, c1, during the time interval [0, 1] and another value, c2, during the time interval [1, 3].
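The decay phenomenon corresponds to the differential equation $\dot{x} = -kx$ with solution $x(t) = x_0 e^{-kt}$. A minimal numerical check (the decay constant and initial value are illustrative):

```python
# Exponential decay: x'(t) = -k*x(t) has solution x(t) = x0 * exp(-k*t).
import numpy as np
from scipy.integrate import odeint

k, x0 = 0.3, 1.0                        # illustrative decay rate and start value
t = np.linspace(0.0, 10.0, 6)

numeric = odeint(lambda x, t: -k * x, x0, t).ravel()
exact = x0 * np.exp(-k * t)
print(np.max(np.abs(numeric - exact)))  # agreement to odeint's tolerance
```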

Calculate V1 as a function of c1; calculate s(1); use this initial condition to determine the consumption path over [1, 3] and calculate V2. Can you explain why? Attempt now a more elaborate exercise in which the horizon [0, 4] is split into four intervals of length 1.

On each interval the propensity to consume is constant. The task is to choose c1, c2, c3, and c4 to maximize the total utility (logarithm of consumption) on the whole interval; a numerical sketch follows.
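The exercise can be attacked numerically. Since the underlying model is not restated here, the sketch below assumes a simple wealth dynamic $\dot{s} = s(r - c_i)$ with consumption flow $c_i s(t)$ and logarithmic utility; every functional form and parameter value is an illustrative assumption, not the book's specification.

```python
# Sketch: choose piecewise-constant propensities c1..c4 on [0,4] to maximize
# total log utility, under the ASSUMED model s'(t) = s(t)*(r - c_i) with
# consumption flow C(t) = c_i * s(t).
import numpy as np
from scipy.optimize import minimize

r, s0 = 0.05, 1.0
breaks = [0.0, 1.0, 2.0, 3.0, 4.0]   # horizon [0, 4] split into four intervals

def total_utility(c):
    V, s = 0.0, s0
    for i in range(4):
        h = breaks[i + 1] - breaks[i]
        # On the interval, s(t) = s * exp((r - c_i)*(t - t0)), so the
        # integral of log(c_i * s(t)) dt has the closed form used here.
        V += h * np.log(c[i] * s) + (r - c[i]) * h**2 / 2
        s *= np.exp((r - c[i]) * h)  # wealth at the end of the interval
    return V

res = minimize(lambda c: -total_utility(c), x0=[0.3] * 4,
               bounds=[(1e-3, 1.0)] * 4)
print(res.x)  # optimal (c1, c2, c3, c4) under these assumptions
```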

Determine their exact forms by using the boundary conditions. The first one depends only on c1; the last one depends on c1, c2, c3, and c4. Can you rationalize it? Do your answers satisfy equation 3.x? Reconsider the optimal sale problem: you wish to choose a date for selling the bottle that maximizes its discounted value. How does the storage cost affect the optimal time of sale? Redo the calculations of Section 3. Reconsider the fiscal policy problem: find the form of W and the value of θ that maximizes W. The maximum principle is the central result of the theory.

It was originally developed by Pontryagin and his associates; see Pontryagin et al. (1962). To help the reader become thoroughly acquainted with it, we proceed with the analysis of a simple case, without paying undue attention to some technical regularity conditions.

These and other matters will be dealt with in a later chapter. Some variables can be identified that describe the state of the system: they are called state variables - for instance, the distance of the spaceship from earth or the stock of goods present in the economy. The rate of change over time in the value of a state variable may depend on the value of that variable, time itself, or some other variables, which can be controlled at any time by the operator of the system.

These other variables are called control variables - for instance, the pitch of the motor or the flow of goods consumed at any instant. The equations describing the rates of change in the state variables are usually differential equations, as discussed in Chapter 2.

Once values are chosen for the control variables at each date, the rates of change in the values of the state variables are determined. For instance, the pitch of the spaceship engine determines its speed and hence its distance from earth once its initial position is known; the consumption path of the economy determines net investment and hence capital stock accumulation over time.

The object of controlling a system is usually to contribute to a given objective. For instance, the values of all the relevant variables determine the fuel consumption of the spaceship at any time, and the objective is to minimize total fuel consumption so that some destination is reached within a given time period. A salient feature of optimal control problems that emerges from the foregoing discussion is that it is necessary to choose a value for the control variable or variables at each instant; when, as is usually the case, time is taken to be real-valued, there are infinitely many values of the control to be chosen.

Another way of putting this is to say that we must find a functional form for the control variable over some time interval. Fortunately, the maximum principle provides a framework that makes these choices manageable.

This is a very restrictive assumption; see the final remark of Section 4. The necessary conditions that constitute the maximum principle are most conveniently stated after some auxiliary variables, akin to multipliers, have been introduced: with the state variable s(t) is associated an auxiliary variable called a costate variable. We define, at each instant, a new function called a Hamiltonian, similar to a Lagrangean.

Using the definition of the Hamiltonian in 4.x, the optimal triplet is a solution of equations 4.x. These consist of two differential equations and an algebraic equation, often called the first-order condition since it optimally selects the control.
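For reference, the necessary conditions referred to here take the following standard form (a sketch in generic notation: v for the instantaneous value function, f for the state equation, π for the costate):

$$H(s, c, \pi, t) = v(s, c, t) + \pi f(s, c, t),$$

$$\frac{\partial H}{\partial c} = 0, \qquad \dot{\pi} = -\frac{\partial H}{\partial s}, \qquad \dot{s} = \frac{\partial H}{\partial \pi} = f(s, c, t).$$

The first line defines the Hamiltonian; the second displays the algebraic first-order condition and the two differential equations mentioned above.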

Before attempting to apply the maximum principle to a specific problem, we will show how easy it is to derive it in a discrete analog of the problem presented in this section. We present a proof for the continuous-time case in Section 4. Thus, it is a constrained maximum problem with a special recursive structure, the difference equation 4.x describing the change in the state at each period. It is instructive to apply our result to a simple example. These three sets of equations can be written more compactly.

The independent time argument has been suppressed here to simplify the notation; its inclusion does not affect this derivation. To this end we attach a multiplier to each constraint.

Example 4. It is interesting to examine equation 4.x: there is a different optimal path for each terminal condition. It is worth remarking that for some terminal conditions no feasible solution exists: since the control must remain positive, we get from 4.x a restriction on the admissible terminal values. In Section 3 the propensity to consume was held constant. Although it was pointed out that this was unduly restrictive, more flexibility was shown to yield an insoluble problem. In this section we reconsider this problem. Before leaving this example it is useful to reflect on one hidden assumption.

Our purpose is not to argue for or against this assumption, as already mentioned in Section 3. Note that s could become negative because the control was free of all constraints. Had we wished to restrict it to values between 0 and 1, we would have needed more general results; the constrained control problem and other extensions are the subject of Chapter 6. To conclude this section we shall solve a slightly more general version of this model.

The rate of interest r and the values of the state variable s0 and sT are exogenously specified. It is obvious how to proceed: we can use the second equation to get the general solution for x, substitute it in the first equation to get c, and substitute c into the last equation to get the general solution for s.
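The three-step procedure can be carried out symbolically. The sketch below assumes the special case $v = e^{-dt}\log c$ and $\dot{s} = rs - c$; these functional forms are assumptions chosen to make the steps concrete, not the text's own specification.

```python
# Sketch of the three-step solution described above, for the ASSUMED case
# v(s, c, t) = exp(-d*t) * log(c) and s' = r*s - c.
import sympy as sp

t = sp.symbols('t')
r, d, A = sp.symbols('r d A', positive=True)
x = sp.Function('x')  # costate
s = sp.Function('s')  # state

# Step 1: costate equation x' = -r x  =>  x(t) = A*exp(-r*t)
x_sol = sp.dsolve(sp.Eq(x(t).diff(t), -r * x(t)), x(t), ics={x(0): A}).rhs

# Step 2: first-order condition exp(-d*t)/c = x  =>  c(t)
c_sol = sp.exp(-d * t) / x_sol

# Step 3: state equation s' = r*s - c, general solution for s
s_sol = sp.dsolve(sp.Eq(s(t).diff(t), r * s(t) - c_sol), s(t))
print(sp.simplify(c_sol))
print(s_sol)
```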

Nevertheless, these are often difficult to solve analytically. If all functional forms and relevant parameter values were specified, it would be possible to use numerical methods to derive the solution. This may be useful in physics or engineering; however, since our ultimate purpose in using control theory is to gain insight into the dynamic behavior of economic models, we often deal with problems involving unspecified functional forms.

For example, often for the sake of generality, we do not wish to specify a particular form of the utility function. In such cases the explicit solution of the differential equations is impossible. The best we can hope for is a qualitative characterization of the optimal solution, as was often the case in static optimization problems in economics. This seems a formidable task. Fortunately, we have just the device needed: the representation of the solution on a phase diagram, as described in Section 2.

We will be able to partition the phase space into regions in which we know whether the variables increase or decrease over time. Further analysis will yield restrictions on the shape of the trajectories that are candidates for the optimal path.

As with most qualitative tools this is not a perfect device. In many cases it will require some ingenuity to pinpoint exactly the optimal trajectory. Nonetheless, it provides a structure for detailed qualitative analysis. Initially we shall illustrate the technique with a numerical example.

The procedure used to derive 4.x is worth retracing. First, we totally differentiated the first-order condition and obtained an equation involving a ċ term and an ẋ term. We eliminated the latter using the differential equation in x. This in turn introduced an x term, which was eliminated by using the original first-order condition again.

Finally, some algebraic manipulation yielded a simpler form. This procedure is often useful in solving simple problems, but it sometimes fails. In that event we will attempt to devise another way out of the difficulty. We follow the procedure outlined in Section 2. The main task is to use equations 4.x to locate the critical loci. The curve goes through the origin, where it has an infinitely large slope. The loci define four regions or isosectors, which are labeled I-IV.

The expressions defining ṡ and ċ are continuous in the positive orthant (check 4.x). In order to ascertain the signs of ṡ and ċ in any one of the four regions, it is sufficient to evaluate them at an arbitrary point of the region; a numerical sketch follows.
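The region-by-region sign check can also be automated. The sketch below draws the vector field for an assumed system $\dot{s} = F(s) - c$, $\dot{c} = c\,(F'(s) - \delta)$ with $F(s) = \sqrt{s}$; the functional form and the value of δ are illustrative stand-ins for the text's numerical example.

```python
# Phase portrait for the ASSUMED system s' = F(s) - c, c' = c*(F'(s) - delta),
# with F(s) = sqrt(s). The two critical loci split the orthant into regions.
import numpy as np
import matplotlib.pyplot as plt

delta = 0.1
F = lambda s: np.sqrt(s)
Fp = lambda s: 0.5 / np.sqrt(s)

s, c = np.meshgrid(np.linspace(0.1, 40, 25), np.linspace(0.1, 6, 25))
ds, dc = F(s) - c, c * (Fp(s) - delta)

plt.streamplot(s, c, ds, dc, density=1.2)
sg = np.linspace(0.1, 40, 200)
plt.plot(sg, F(sg), label="s' = 0 locus")                  # c = F(s)
plt.axvline((0.5 / delta) ** 2, ls='--', label="c' = 0 locus")  # F'(s) = delta
plt.xlabel('s'); plt.ylabel('c'); plt.legend(); plt.show()
```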

We must gather another piece of information before we can draw the general shape of trajectories. This information, along with the direction of trajectories in each region, allows us to draw the shape of trajectories when crossing a critical locus from one region to the next; this is done in Figure 4. The general solution to 4.x contains arbitrary constants; knowledge of the boundary conditions will determine the specific solution.

We did not assign specific values to s0 and sT in order to discuss the influence of the boundary conditions on the choice of the optimal trajectory. To this end we represent a few possible trajectories in Figure 4. We know from the theory of differential equations that trajectories cover the whole space; therefore, there is one trajectory in region I that reaches the equilibrium point E. We refer to it as a stable path.

We can check the saddle-point property locally by linearizing the system 4.x and examining the matrix of coefficients obtained. The arms of the saddle point and some other trajectories are drawn in Figure 4. Vertical dashed lines have been drawn through those values to emphasize the fact that although boundary values are specified for the state variable, all values for the control are to be optimally chosen.
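The local check can be carried out numerically: linearize and inspect the eigenvalues of the coefficient matrix. The sketch below uses the same assumed system as before ($F(s) = \sqrt{s}$, δ = 0.1); real eigenvalues of opposite signs confirm a saddle point.

```python
# Saddle-point check for the ASSUMED system s' = F(s) - c, c' = c*(F'(s) - delta)
# linearized around the equilibrium E where F'(s*) = delta and c* = F(s*).
import numpy as np

delta = 0.1
s_star = (0.5 / delta) ** 2          # from F'(s) = 0.5/sqrt(s) = delta
c_star = np.sqrt(s_star)             # from c* = F(s*)
Fpp = -0.25 * s_star ** (-1.5)       # F''(s*) for F(s) = sqrt(s)

J = np.array([[delta, -1.0],         # d(s')/ds = F'(s*) = delta, d(s')/dc = -1
              [c_star * Fpp, 0.0]])  # d(c')/ds = c* F''(s*), d(c')/dc = 0
print(np.linalg.eigvals(J))          # one positive, one negative => saddle point
```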

The optimal path will be within regions I and IV. Indeed, there is no way to reach region III beginning at s0, and were the path to enter region II, it would never reach sT. The optimal path may be wholly in region IV, as are i and ii, or begin in I, as does iii.

However, to travel the length of a path such as i takes some fixed amount of time. Since the time horizon has been specified exogenously as [0, T], we must select the path that will go from s0 to sT in exactly T units of time. Hence, if T is small, the optimal path will be a high path, such as i; the larger T, the further down we go until we reach path ii.

This is, of all paths wholly within region IV, the one that takes the longest time to go from s0 to sT. If T is still larger, we must select as optimal a path such as iii that begins in region I.

No matter how large T is, we shall always be able to select an appropriate path, because we can choose one that goes arbitrarily close to E before turning left into region IV, and in the neighborhood of E movement along the path would be very slow. The effect of T on the choice of an optimal path can be explained in simple terms. Recall that c is consumption and s a stock of capital. You have been given T periods to eat into your capital from s0 to sT.

The shorter the time you have, the higher the rate of consumption you will be able to afford. If you must plan for a long enough time, you will find it optimal to begin with a level of consumption low enough that you accumulate capital initially.

In any event, consumption increases monotonically through time. The length of the planning horizon determines exactly which path is chosen: the higher the path, the more quickly it is traversed; the closer it comes to E, the more slowly. This completes the analysis of the (s, c) phase diagram for this problem. In some cases it may be useful, necessary, or the only feasible choice to conduct the phase diagrammatic analysis in the (state, costate) plane.

Recall that we attempted this task but encountered difficulties because of an independent time term in the costate differential equation.

We now show how to deal with this problem by a change of variable. We had to deal with equations 4.x; substituting 4.x, the phase diagram in the (s, θ) space can be constructed in the same manner as the preceding one. We shall again restrict our attention to the positive orthant, since it is obvious from the optimality condition 4.x that the solution remains there.

There are again four regions in the phase space. We have numbered them in a way that is consistent with the numbering in Figure 4; in fact it is very important to grasp the correspondence between the two diagrams. The first-order condition 4.x links c and θ; therefore, whereas the horizontal axes in the two figures both measure s, the vertical axes correspond inversely. Thus, region I of one figure maps into region I of the other, and similar correspondences apply between the other regions, as can readily be checked.

With boundary values s0 and sT, a short planning horizon will lead to an optimal path such as i. The most time-consuming monotone path is ii. If more time is available, a path such as iii would be selected. For different boundary values, a path such as iv would be chosen. Paths i-iv in Figure 4.x correspond to the optimal solutions just discussed. Note that if a numerical solution were required, the insights gained from the phase diagram analysis would be very useful in tracking it down.

Our next task in this section is to introduce a far more general growth model from which the previous examples were drawn. It will be shown that the phase diagram analysis is no more complicated in the general case than it was in the example. Hence, the technique is more powerful than this example and those of Section 2 suggest. Find c(t) to maximize the objective. The similarities between Figures 4.x and 4.y are apparent. We can prove the existence of a local saddle point by linearizing 4.x.

In Example 4.x, equations 4.x define the critical loci; the θ̇ = 0 locus exists under the stated conditions. These loci are drawn in Figure 4, and substitution into 4.x completes the derivation. Nonetheless, all the important features of the solution, given specific values for T, s0, and sT, are common to the two problems, whether or not the above special assumptions are made.

We have established our contention that phase diagram analysis is a powerful technique that can be used on models without specified functional forms. This concludes our introduction to phase diagram analysis. Before we move on to the next section, it is convenient to include here an alternative statement of the maximum principle that yields equations of the type 4.x directly.

This procedure applies to control problems in which the only independent time term appears in the discount factor in the maximand; these are an important class of problems. Instead of introducing the costate variable π(t) and the Hamiltonian of Section 4, we work with their current-value counterparts. Furthermore, we know that both u′ and F are positively valued and defined for all positive values of their arguments.

Hence, this point of intersection is the unique equilibrium. The phase diagram in Figure 4.x retains the same qualitative features. The economic interpretation of the maximum principle is taken up in the next section, but it is already apparent that if the current-value Hamiltonian and θ reflect current values, then H and π reflect discounted or present values (see Section 3).

These objects are called the current-value Hamiltonian and the current-value costate θ, respectively. Had we instead used the more general formulation of 4.x, one can easily check that 4.x still holds; it is an instructive exercise to verify this and to substitute the result into 4.x. Before concluding this section we must mention one caveat. The difficulty with any other form of discount factor is that the individual would persistently wish to change her optimal plan, thereby rendering planning rather meaningless.
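In the standard statement (assumed notation, with δ the discount rate), the current-value objects are related to the present-value ones by

$$\tilde H = e^{\delta t} H, \qquad \theta(t) = e^{\delta t} \pi(t),$$

and the costate equation becomes

$$\dot{\theta} = \delta \theta - \frac{\partial \tilde H}{\partial s},$$

which removes the independent time term from the necessary conditions.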

This problem is known as a problem of dynamic inconsistency; it is perhaps one of the reasons for the almost universal use of the exponential discount factor.

At time 0, the individual would want to solve 4.x; one can prove the equivalence of 4.x and 4.y. She would do this if the discount factor were of a different form. Economic interpretation of the maximum principle. In this section we show that the maximum principle can be given an appealing economic interpretation that gives us further insight into the optimality of dynamic choice.

We deal with the general control problem introduced in Section 4. First we need to transform the above expression using the method of integration by parts, the device we know from the interpretation of Lagrange multipliers. Thus, 4.x follows, and the meaning of the results in 4.x can be read off. The meaning of this costate variable parallels that of the multipliers and dual variables encountered in static optimization. The difference in signs between 4.x and 4.y will become clear shortly.

We must formalize the notion that at some time in the interval [0, T], a small amount of capital stock is suddenly added to the existing stock. The number ε is arbitrarily small and positive. Now that the meaning of each variable is understood, we can turn to the maximum principle itself.

Recall that v(s, c, t) is some instantaneous value function and that f(s, c, t) describes the growth of s. One could call H a dynamic value function because it also takes into account the effect of current stock and control on the size and valuation of future stock.

Therefore, maximizing H at each instant t yields a dynamic optimum at that time. That is the general structure of the maximum principle. Let us now examine each of the equations 4.x in turn, beginning with the Hamiltonian of 4.x; when the control is chosen to maximize that function, 4.x holds. The second term is the product of the value of stock at time t and the marginal effect of consumption on the rate of growth of stock at that time; hence, it is the marginal contribution of current consumption to future value, through its effect on the growth of stock.

Together these two terms account for the full marginal effect of current consumption. Consider the first equation, 4.x. The first term is the marginal effect of stock on the instantaneous value function, that is, on the current flow of value.

This interpretation is appropriate for a central planner who seeks the optimal pricing of stock over time. Let us now imagine the reasoning of an individual agent to whom the price of stock is exogenously given. There is no need for the agent to take into account the whole planning problem.

Indeed, we may view agents at different times as distinct individuals who simply maximize the Hamiltonian at one instant. Their knowledge of x and their myopic optimizing behavior at each instant are sufficient to guide the economy along the optimal path. While the agent at each instant t need only choose the control, it is instructive to reconsider equation 4.x. Therefore, it is as if the agent were following an optimal investment plan at each instant.

We now illustrate our economic interpretation of the maximum principle with a more specific problem. To begin with, we restate the problem and describe it in economic terms. Consider an economy with a single capital resource that can be used to produce a single good according to some technology and can also be consumed. If we denote the stock of capital at time t by s(t), gross production is F(s(t)), where F is the production function.

Thus, the rate of change of capital stock is described by the accumulation equation below. Note that value is measured in units of utility. To reach a dynamic maximum the Hamiltonian is maximized at each instant; the influence of the consumption flow on the growth of capital is taken into account. This maximization is characterized by 4.x. This reflects the fact that consumption is simply subtracted from the growth in capital stock.
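With depreciation at a rate μ (the symbol is an assumption; the text introduces depreciation just below), the accumulation equation takes the standard form

$$\dot{s}(t) = F(s(t)) - \mu s(t) - c(t),$$

so consumption is simply subtracted from gross production net of depreciation.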

The net marginal physical product is the gross marginal physical product less the rate of depreciation. Hence, equation 4.x follows. This sum represents the net marginal gain of a would-be investor holding s units of capital and facing the price path π. If it were negative or positive, the investor would be holding too much or too little capital, respectively.

We can derive further insights into the nature of the optimal solution by examining the consumption path. To this end we totally differentiate 4.x; the resulting consumption-growth rule is sketched below.
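Carrying out this total differentiation in the standard discounted model, with first-order condition $u'(c)\,e^{-\delta t} = \pi$ (notation assumed), gives

$$u''(c)\,\dot{c} = u'(c)\left[\delta - F'(s) + \mu\right],$$

that is,

$$\dot{c} = \frac{u'(c)}{-u''(c)}\left(F'(s) - \mu - \delta\right),$$

so consumption rises whenever the net marginal product of capital exceeds the discount rate.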

Suppose that a central planner is in charge of charting the path of the economy over some time horizon [0, T]. He knows that at time 0 there are s0 units of capital available and that he must reach time T with sT units of the stock. Meanwhile, he has chosen as his objective the maximization of the total discounted utility over the planning horizon.

This implicitly assumes that utility at any instant is not directly dependent on the stock of capital at that time (no 'Scrooge' element: capital is not enjoyed for its own sake). Consider the control problem stated at the beginning of Section 4. We assume for simplicity that all variables are continuously differentiable functions of time. This class of functions is denoted by A. Clearly, c(t) is a member of A.

The resulting state variable path follows from 4.x. It pays to have higher future consumption relative to present consumption if the cost of postponing it is smaller than the gains it provides through its marginal effect on the net capital return, where the cost of postponing consumption, evaluated at the margin, is indicated by the discount rate δ.

To conclude this section we now briefly present a phase diagram analysis of this problem in the (s, c) plane, using equations 4.x. If T is very small, a high path such as i is chosen; for larger T, ii may be optimal. Along both these paths capital decreases monotonically. If T is even larger, we may, as in case iii, go through an initial phase when capital stock is increasing. This diagram will be referred to in Section 6.

We recognize here the maximum principle stated in more detail in equations 4.x. We will shortly state conditions guaranteeing this. For simplicity of notation, we will suppress all arguments of functions, and let an asterisk denote optimality when superscripted to variables or functions; the absence of an asterisk denotes any other feasible solution (in particular, s satisfies the boundary conditions).

Consider the problem defined in 4.x and suppose that the maximum principle of 4.x holds. If the Hamiltonian of 4.x is concave, the conditions are also sufficient. First we must define a new concept and restate the maximum principle. For the problem of 4.x, it follows that if H is concave in (s, c), then any solution satisfying the maximum principle yields a value V* at least as high as the value V yielded by any feasible solution. We now state the result formally. This is why we now provide a set of conditions that ensure the concavity of H.

We can now restate the maximum principle. Consider the problem of 4.x. If the maximized Hamiltonian of 4.x is concave in s, the conditions are sufficient. Note that attention must still be paid to the properties of H in terms of c, given s and x, to ensure that 4.x selects a maximum. The proof proceeds along the same lines as that of Theorem 4.x. We now illustrate Theorems 4.x and 4.y; here we illustrate the remark made after the theorem. Consider a stock whose growth is enhanced by its own size and by the effort made to tend it. Utility depends on the size of the stock and is negatively related to effort.

However, the first-order condition did not select a maximum over c but indeed a global minimum, since the Hamiltonian H is clearly convex in c. Before leaving this section, recall that we have assumed throughout that the variables are continuously differentiable functions of time. It is an attractive feature of optimal control theory that it can deal with a wider class of solutions. All that is needed is that the control variable be piecewise-continuous.

This means that it is acceptable to have a control function c(t) that exhibits some jump discontinuity at a finite number of points. It follows that ṡ and ẋ themselves are piecewise-continuous, while s and x are continuous and piecewise-differentiable.

A more formal statement of the maximum principle is postponed until Chapter 6, and the special features of problems exhibiting such discontinuities are the subject of Chapter 8.

What restriction must be placed on T, s0, and sT to ensure that the optimal control is positive? Without this a solution cannot exist. Consider a modified version of the problem of exercise 5. Apply the maximum principle and solve the problem. R, δ, and a are fixed positive constants. Set up the control problem.

Which is the state variable and which is the control? Write down the Hamiltonian and apply the maximum principle (you will have three equations). Solve the differential equation in x; you will need to use an arbitrary constant of integration, say A. Use this to solve the remaining equations. Obtain the optimal path for c; it depends on A. Repeat exercise 8 with the modified parameter values given there. Determine the approximate value of T. Show that there is a problem with this outcome. s0 and sT are specified positive constants.

Apply the maximum principle. Obtain a differential equation for c and solve it. Use the result to solve the differential equation in s. Use the boundary conditions on s to determine the constants of integration. How does the optimal trajectory change if a larger T value is selected? Show that the results are qualitatively identical if an arbitrary U function is selected, as in exercise 2.

Use the problem of exercise 3 to derive autonomous differential equations for c and s, as in exercise 11, and draw a phase diagram in the positive quadrant of the (c, s) plane. Show that the optimal trajectory is a straight line of slope 1.

How does the trajectory change when a larger T value is selected? Repeat the exercise with the problem of exercise 4. Show that the optimal trajectory has slope 1/a. Does the exact value of a change the general shape of the optimal trajectory? Consider the problem of exercise 5. Apply the maximum principle and derive autonomous differential equations for the state and the control variable (i.e., free of the independent time argument). Draw the phase diagram. Derive autonomous differential equations for the state and costate variables.

Draw the phase diagrams in the (c, s) plane and the (x, s) plane. Show that c always increases over time but s is not always monotone. Can the value of T have a qualitative effect on the trajectory of s? Repeat exercise 13 for the problem of exercise 6. Obtain a system of autonomous differential equations for c and s and draw a phase diagram in the (c, s) plane.

Draw the phase diagrams in the (c, s) plane for exercises 8 and 9. Consider the mineral spring problem of exercise 7. Give an economic interpretation of the costate variable. Discuss the conditions obtained from the maximum principle and comment on the trajectories identified in the previous exercise. Any introductory treatment of optimal control theory would be incomplete without explicit mention of its predecessor, the calculus of variations, and the parallel development of dynamic programming.

The calculus of variations owes much to the eighteenth-century mathematician Euler, but many developments and refinements were made in the following centuries. Optimal control theory, developed by Pontryagin and his co-workers in the late 1950s, may be regarded as a generalization of the calculus of variations: not only is its field of applicability broadened, but the general problem is approached from a fresh and more insightful viewpoint.

Dynamic programming was developed by Bellman, also in the late 1950s. In this chapter, we examine the connection of optimal control theory with the calculus of variations and dynamic programming. We illustrate how all three approaches lead to the same solution and comment on their relative usefulness in analytical economics.

We assume that F possesses continuous second-order partial derivatives. In optimal control problems we require only that the function s(t) be piecewise-differentiable (i.e., differentiable except at a finite number of points).

In the calculus of variations, at least in its early development, attention is restricted to the class of functions s(t) that have continuous second-order derivatives for all t in [0, T]. This class of functions is denoted by C²[0, T]. In the remainder of this section we present a necessary condition for a maximum and show how it can be used to derive the solution.

We first state an important result: the Euler equation, given below. Let us define the difference between the right-hand side and the left-hand side of 5.x; any function s(t) satisfying 5.x is a candidate solution, and we can use 5.x to construct it. Our main reasons for not devoting a great deal of space to the calculus of variations are, first, that optimal control theory is capable of dealing with a wider class of problems and, second, that the necessary conditions obtained from the calculus of variations contain no new information.
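In the notation of this section, with maximand $\int_0^T F(s, \dot{s}, t)\,dt$, the Euler equation reads

$$F_s - \frac{d}{dt} F_{\dot{s}} = 0 \quad \text{for all } t \in [0, T],$$

a sketch of the standard form; the boundary conditions on s(0) and s(T) complete the characterization.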

This we now proceed to demonstrate. Problem 5.x: let c(t) be a control variable, and let c = ṡ. Recall that F_c is then the same as F_ṡ. Example 5.

In order to formulate this problem in the calculus of variations format, we use 4.x. We need to find ṡ(t), and hence s(t), that maximize 5.x. This concludes our brief introduction to the calculus of variations. For a thorough treatment of economic growth using the calculus of variations see Hadley and Kemp (1971). We will not discuss this topic again, because it is more economical to approach continuous-time optimization problems using optimal control theory. The latter is a more unified, more elegant, and more systematic body of knowledge that contains all of the results of the calculus of variations as special cases.

Therefore, the solution follows from applying the necessary conditions 4.x. An alternative method of solving this type of problem is the dynamic programming approach, which successfully exploits the recursive nature of problem 4.x. To explain this, we first rewrite 4.x. The problem is to find c1, c2, .... The usual dynamic programming terminology is as follows. In 5.x, equation 5.x defines the value function for problem 5.x; more precisely, properties (i) and (ii) hold. This theorem is known as the principle of optimality.

This follows the fundamental observation made earlier that the value of the state variables at some time summarizes all the relevant information about the system at that time. Clearly, the principle of optimality relies on properties (i) and (ii) stated earlier.

This principle does not apply to problems that cannot be put in the form 5. Maximize R 5. We can prove the principle of optimality formally by establishing a con tradiction. This would contradict the hypothesis that the latter is an optimal solution of 5. This completes the proof of the principle of optimality. We call this the return function The prin ciple of optimality implies that.

It provides the basis for an efficient method of solution called backward induction, which we explain below. Substituting for V₃, we have Example 5.x. We then obtain the optimal solution by retracing our steps. We now give a more formal account of the procedure.

We now apply the backward solution method to an economics problem; a numerical sketch follows. The term $a^t$ is called the discount factor. When we consider equations 5.x and work back to initial time, for which we actually know the state of the system, we can begin to derive the particular optimal solution corresponding to this initial condition. The solution as described by 5.x is a closed-loop control, in which the control is given as a function of the current state.
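A compact numerical rendering of backward induction, assuming (for illustration only) logarithmic utility, a linear capital transition $s' = R(s - c)$, and discount factor a; the grid approximation and all parameter values are hypothetical, not the text's example.

```python
# Backward induction for an ASSUMED discrete consumption problem:
# utility log(c), transition s' = R*(s - c), horizon N, discount factor a.
import numpy as np

N, R, a = 4, 1.05, 0.95
grid = np.linspace(0.01, 10.0, 400)          # grid of capital stocks s
V = np.zeros_like(grid)                      # V_N = 0: no value after horizon

policies = []
for _ in range(N):                           # work backward from period N-1 to 0
    V_new = np.empty_like(grid)
    c_opt = np.empty_like(grid)
    for i, s in enumerate(grid):
        c = np.linspace(0.005, s - 0.005, 200)   # feasible consumption levels
        s_next = R * (s - c)
        cont = np.interp(s_next, grid, V)        # interpolated continuation value
        total = np.log(c) + a * cont
        j = np.argmax(total)
        V_new[i], c_opt[i] = total[j], c[j]
    V = V_new
    policies.append(c_opt)

policies.reverse()                            # policies[t] is the rule at time t
print(np.interp(5.0, grid, policies[0]))      # optimal c at t=0 when s0 = 5
```

Note that the result is a closed-loop rule: each `policies[t]` gives consumption as a function of the current stock, matching the discussion above.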

This is in contrast to an open-loop control, in which the solution is given as a function of time only. Typically dynamic programming yields a closed-loop solution, whereas the maximum principle yields an open-loop solution. The reader must be warned that closed-loop solutions such as 5.x are not always available. This method consists of showing that if 5.x holds, the candidate is optimal. As we shall see in the next section, dynamic programming compares even less favorably with optimal control when time is represented as a continuous (i.e., real-valued) variable.

In this section we show that the continuous-time counterpart of the functional recurrence equation 5.x is the Hamilton-Jacobi-Bellman equation. In addition, the Hamilton-Jacobi-Bellman equation can be used to derive heuristically the basic version of the maximum principle as presented in Chapter 4. We must hasten to add that for more complicated control problems the techniques provided by optimal control theory are more powerful; this is why we will concentrate on optimal control theory from Chapter 6 onward.

It follows from 5.x that the Hamilton-Jacobi-Bellman equation, stated below, holds. In order to give the reader a better appreciation of it, we illustrate it with a simplified version of problem 5.x.
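In the notation used earlier (instantaneous value v, state equation $\dot{s} = f(s, c, t)$, return function V; standard form assumed), the Hamilton-Jacobi-Bellman equation is

$$-\frac{\partial V(s, t)}{\partial t} = \max_{c}\left\{ v(s, c, t) + \frac{\partial V(s, t)}{\partial s}\, f(s, c, t) \right\},$$

with a terminal condition supplied by the boundary data, e.g. $V(s, T) = 0$ when there is no salvage value.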

Find c(t) that maximizes the integral objective of 5.x; V is defined by 5.x. Substitute the above equation into 5.x. Unfortunately, the analytical solution of partial differential equations is a formidable problem, and we will not attempt to solve 5.x. We wish, however, to convince the reader that the solution obtained through dynamic programming is the same as that obtained from the maximum principle.

Show that if the integrand in equation 5.x is independent of time, the result follows. (Hint: take the time derivative of equation 5.x.) This completes our exposition of dynamic programming. Although not well suited to the analytical approach we wish to follow in this book, it is a very useful technique. Interested readers may consult Bellman and Dreyfus (1962).

We now wish to consider a more general control problem involving many state variables and many control variables. We also wish to introduce constraints on the values that the control variables may take on at any point of time and also constraints on their overall paths from time 0 to time T.

For ease of exposition, we retain the assumption that initial time and terminal time are exogenously specified, as are the initial and terminal values of the state variables; these conditions will be relaxed in Chapter 7.

For example, consumption cannot be greater than output, and the rate of extraction of a natural resource may not exceed a certain upper bound specified by environmental control laws. In order to be more specific, let us consider a version of the optimal control problem studied in Chapter 4.

Suppose that the first m′ constraints are inequality constraints and that the remaining m − m′ constraints are equality constraints. Then the set of admissible controls can be represented by m′ inequalities and m − m′ equations. We can then write the constraints as 6.x. Condition 6.x follows.

The set of admissible controls is therefore the striped area in Figure 6. Note that the northeast frontier of the set moves with time and depends on the value of the state variable s.

Since the set of admissible controls depends on t and s(t), we shall denote it by a symbol such as Ω(s(t), t). It is, of course, possible to specify constraints on the values of the state variables alone, e.g. non-negativity. In addition, since we will have to find values of the control variables that maximize a Hamiltonian function subject to the constraints 6.x, a constraint qualification is needed. The most convenient constraint qualification is the rank condition, which we restate here.

In what follows we shall assume that the rank condition is satisfied. Heretofore we have been implicitly assuming that the variables were differentiable functions of time. We know that if the variables were simply continuous, we would have to deal with left-hand-side and right-hand-side derivatives, and this would somewhat complicate the algebra.

However, control theory actually encompasses far more general problems. Formally we say that a variable is a piecewise-continuous function of time if and only if it is continuous almost everywhere, that is, anywhere but at a finite number of points, where it may exhibit jump discontinuities.

It would seem rather queer to complicate things so much were it not for the fact that even in some simple problems, no solution would exist if jump discontinuities were disallowed. To prove our point it suffices to consider a trivial example. Some problems also impose constraints on the overall paths of the variables; these will naturally take the form of integral constraints. On reflection, an equality integral constraint such as 6.x can be converted into an ordinary state equation, as sketched below. Similarly, any inequality integral constraint can be handled. Although we have not dealt with inequality constraints yet, they are examined in the next chapter.
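The conversion works as follows (a standard construction; g and b are generic names). Given a constraint $\int_0^T g(s, c, t)\,dt = b$, define a new state variable

$$z(t) = \int_0^t g(s(\tau), c(\tau), \tau)\, d\tau,$$

so that

$$\dot{z} = g(s, c, t), \qquad z(0) = 0, \qquad z(T) = b,$$

and the integral constraint becomes an ordinary state equation with fixed endpoint conditions.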

This simplifies the exposition without loss of generality. In this problem, both the initial time and the terminal time are fixed. As one would expect, the necessary conditions for this problem are quite similar to the ones presented in Chapter 4.

There are, however, certain differences. We should, strictly speaking, include an additional constant π₀ in our statement of the necessary conditions, where π₀ is the constant multiplier associated with the integrand of the objective function.

This may reflect the fact that higher-yielding ore is obtained when s is relatively large. Only in pathological cases does π₀ = 0; we shall not be concerned with these cases. The reader may consult Long and Vousden for further details. Example 6.x: let s(t) denote the stock; output is denoted accordingly. Our aim, however, is to introduce the reader to the use of the maximum principle in a constrained problem.

It follows from this and 6.x that, differentiating 6.x and substituting 6.y, equation 6.z holds. Thus, our interpretation in Section 4 carries over. We now proceed to show how an explicit solution may be obtained when both the utility function and the production function are specified.

In economic theory, it is desirable to find out the extent to which a particular result (e.g., the constancy of the consumption path) depends on the choice of functional forms. The remainder of this section is devoted to this task. It does affect x and λ, however. Furthermore, suppose that for some utility function u we have found an optimal path with the property that c is constant.

Let us look at the previous optimal (asterisked) solution as a candidate solution. Equations 6.x apply; use 6.y. Sufficiency is guaranteed because u is concave. We can state our result more formally, as in Example 6.x.

The constancy of the consumption path relies heavily on the Cobb-Douglas form of the production function. Thus, the following familiar complementary slackness conditions hold (see below). The first example of this section falls in that category; subsequently, we present a more complex example. Example 6.x: in Sections 4.x we did not impose the restriction that consumption be no greater than output. Thus, along paths such as i in Figures 4.x the constraint could be violated. Unfortunately, that method will not work here, because it would introduce a λ̇ term that cannot be eliminated with 6.x.
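For an inequality constraint written $g(s, c, t) \ge 0$ with multiplier $\lambda$ (generic notation), these conditions are

$$\lambda(t) \ge 0, \qquad g(s, c, t) \ge 0, \qquad \lambda(t)\, g(s, c, t) = 0,$$

so the multiplier can be positive only when the constraint binds.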

The way to deal with such problems is in general to construct a diagram in the (state, costate) space; this is done in the next example for a more complicated case. Here, however, there is a single control variable and the inequality constraint is a simple bound on the control. This makes it possible to proceed directly to a phase diagram in the (s, c) space.

If the bound is inactive, we can use the same method as in Chapter 4; if the control is on the boundary, we make use of that information to obtain the solution. In what follows we apply this simple idea to the problem at hand.

Let us consider the two cases separately: case A, in which the bound is active, and case B, in which it is not. In case B, the results are the same as in Section 4.

Using equations 4.x, this is done in Figure 6; it is similar to Figure 4. Note that these phase lines are valid only when c > 0; when the equality holds we are in case A, to which we now turn.
