dc.description.abstract 
This thesis proposes and analyzes a new parallel-in-time gradient-type method for time-dependent optimal control problems. When the classical gradient method is applied to time-dependent optimal control problems, each iteration requires the forward solution of the state equations followed by the backward solution of the adjoint equations before the gradient can be computed and the controls can be updated. The solution of the state equations and the adjoint equations is computationally expensive, is carried out sequentially in the time dimension, and consumes most of the computation time. The proposed new parallel-in-time gradient-type method introduces parallelism by splitting the time domain into N subdomains and executes the forward and backward computation in each time subdomain in parallel, using state and adjoint variables at time subdomain boundaries from the last optimization iteration as initial values. Ignoring communication cost and assuming computation load balance, N parallel-in-time gradient-type iterations can be executed in the computing time required by a single classical gradient iteration. The basic proposed parallel-in-time gradient-type method is also generalized to allow for different time domain partitions for forward and backward computations and for overlapping time subdomains.
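The iteration described above can be sketched on a toy scalar linear-quadratic discrete-time problem. The dynamics, cost coefficients, and the sequential loop over subdomains (standing in for genuinely parallel execution) are illustrative assumptions, not taken from the thesis:

```python
import numpy as np

# Toy problem (illustrative assumption): scalar dynamics y_{k+1} = a*y_k + b*u_k,
# objective J(u) = 0.5*sum_{k=1..T} y_k^2 + 0.5*alpha*sum_{k=0..T-1} u_k^2,
# with adjoint recursion p_k = y_k + a*p_{k+1} and terminal condition p_T = y_T.
a, b, alpha = 0.95, 0.1, 1e-2
T, y0 = 40, 1.0

def forward(u, y_start, ks, ke):
    """Solve the state equation on time steps [ks, ke) from initial value y_start."""
    y = np.empty(ke - ks + 1)
    y[0] = y_start
    for i, k in enumerate(range(ks, ke)):
        y[i + 1] = a * y[i] + b * u[k]
    return y

def backward(y, p_end, ks, ke):
    """Solve the adjoint p_k = y_k + a*p_{k+1} on [ks, ke) from terminal value p_end."""
    p = np.empty(ke - ks + 1)
    p[-1] = p_end
    for i in range(ke - ks - 1, -1, -1):
        p[i] = y[i] + a * p[i + 1]
    return p

def objective(u):
    y = forward(u, y0, 0, T)
    return 0.5 * np.sum(y[1:] ** 2) + 0.5 * alpha * np.sum(u ** 2)

def parallel_gradient_step(u, y_bd, p_bd, bounds, step):
    """One parallel-in-time gradient-type iteration with N time subdomains.
    y_bd[j] / p_bd[j] hold state / adjoint values at the left boundary of
    subdomain j from the previous iteration.  The loop over subdomains is
    sequential here, but every trip is independent of the others and could
    execute concurrently."""
    g = np.empty_like(u)
    new_y_bd, new_p_bd = y_bd.copy(), p_bd.copy()
    for j, (ks, ke) in enumerate(bounds):
        y = forward(u, y_bd[j], ks, ke)            # local forward sweep
        p_end = y[-1] if ke == T else p_bd[j + 1]  # terminal condition or lagged boundary adjoint
        p = backward(y, p_end, ks, ke)             # local backward sweep
        g[ks:ke] = alpha * u[ks:ke] + b * p[1:]    # local gradient contribution
        if ke < T:
            new_y_bd[j + 1] = y[-1]                # state boundary value for next iteration
        new_p_bd[j] = p[0]                         # adjoint boundary value for next iteration
    return u - step * g, new_y_bd, new_p_bd

# Run with N = 4 time subdomains and a fixed small step size.
N = 4
edges = np.linspace(0, T, N + 1, dtype=int)
bounds = list(zip(edges[:-1], edges[1:]))
u = np.zeros(T)
y_bd = np.zeros(N); y_bd[0] = y0   # interior boundary states start at zero
p_bd = np.zeros(N)                 # interior boundary adjoints start at zero
J_start = objective(u)
for _ in range(1000):
    u, y_bd, p_bd = parallel_gradient_step(u, y_bd, p_bd, bounds, 0.1)
```

Setting N = 1 in this sketch recovers the classical gradient method exactly, which makes the role of the lagged boundary values easy to isolate.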
Due to the time domain decomposition, the state and the adjoint equations are not satisfied, since states and adjoints exhibit jump discontinuities at the time subdomain boundaries. Therefore the control update direction in the parallel gradient-type method is not a negative gradient direction, and existing convergence theories for the gradient method do not apply. I provide convergence analyses for the new parallel-in-time gradient-type method applied to discrete-time optimal control problems. For linear-quadratic discrete-time optimal control problems, I show that the new parallel-in-time gradient-type method can be interpreted as a multiple-part splitting iteration scheme, in which the control update in one iteration is determined by control variable iterates from multiple previous iterations, and I prove convergence of the new method for sufficiently small fixed step sizes by showing that the spectral radius of a corresponding implicitly constructed iteration matrix is less than one. For general nonlinear discrete-time optimal control problems, the parallel-in-time gradient-type method is combined with metric projection onto a closed convex set to handle simple control constraints. Convergence is proved for sufficiently small step sizes. Convergence theorems are given under different assumptions on the problem, such as a convex objective function or compact control constraints. For linear-quadratic optimal control problems, I also interpret the parallel-in-time gradient-type method using a multiple shooting reformulation of the optimization problem. The new method can be seen as applying a gradient-type iteration to the optimality saddle point system of the multiple shooting formulation. An alternative convergence proof for linear-quadratic problems is given from this multiple shooting point of view.
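For the simplest control constraints mentioned above, componentwise bounds, the metric projection onto the closed convex set is just a clip. A minimal sketch, assuming box constraints [lo, hi] (the function name and interface are illustrative, not from the thesis):

```python
import numpy as np

def projected_update(u, g, step, lo, hi):
    """Gradient-type step followed by metric projection onto the box [lo, hi].
    For a box, the Euclidean projection acts componentwise, so it reduces
    to clipping each control value into its admissible interval."""
    return np.clip(u - step * g, lo, hi)

# Example: the unprojected step [-0.5, 2.5] is clipped back into [-1, 1].
u_new = projected_update(np.array([0.0, 2.0]), np.array([1.0, -1.0]), 0.5, -1.0, 1.0)
```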
The new parallel-in-time gradient-type method is applied to linear-quadratic optimal control problems governed by linear advection-diffusion partial differential equations (PDEs), and to a well-rate optimization problem governed by a system of highly nonlinear PDEs that models two-phase immiscible incompressible subsurface flow. For the linear-quadratic optimal control problems, each iteration of the parallel gradient-type method with N time subdomains takes roughly 1/Nth of the time required for one iteration of the classical gradient method. For moderate numbers of time subdomains (up to N=50 in one example), the parallel gradient-type method converges in approximately the same number of iterations as the classical gradient method and thus exhibits excellent scaling. For larger N, the parallel gradient-type method may require significantly more iterations than the classical gradient method, which negatively impacts scaling. The parallelization in time is on top of parallelization already used to solve the state and adjoint equations (e.g., through parallel linear solvers/preconditioners). This is exploited for the larger and more complex well-rate optimization problem. If existing parallelism in space scales well up to K processors, the addition of time domain decomposition scales well up to K * N processors for small to moderate numbers N of time subdomains.
