Discrete Time Linear Quadratic Regulator

Discrete Time Linear Quadratic Regulator (DT-LQR) is a cornerstone of modern control theory for designing optimal feedback laws for discrete-time, linear dynamical systems. When the system dynamics are linear in the state and input and the performance objective is quadratic, the optimal controller is a linear state feedback u_k = -K x_k. The gain K is computed from the system matrices A and B and the weight matrices Q and R, with Q weighting state deviation and R weighting control effort. DT-LQR is particularly well suited for digital implementations, where the controller is updated at fixed sampling intervals and executed on microcontrollers or digital signal processors.

DT-LQR is widely used across industries such as aerospace, robotics, automotive, and industrial automation because of its clear performance guarantees, analytical tractability, and the ability to tune performance via simple weightings. The method assumes a known model and, in its simplest form, seeks to minimize a predictable cost over time. Proponents emphasize transparency, rigorous guarantees, and the ease with which the design translates into implementable code. Critics often point to real-world mismatches between the model and the operating environment, but the framework remains a central reference point for principled, cost-aware control design. See Linear-quadratic regulator for the broader continuous-time counterpart and Discrete-time algebraic Riccati equation for the mathematical core behind the solution.

Mathematical formulation

Consider a discrete-time, linear time-invariant system

x_{k+1} = A x_k + B u_k,

where x_k ∈ R^n is the state and u_k ∈ R^m is the control input. The DT-LQR problem seeks to minimize the infinite-horizon quadratic cost

J = ∑_{k=0}^{∞} ( x_k^T Q x_k + u_k^T R u_k ),

subject to the dynamics above, with Q ∈ S_+^n (symmetric positive semidefinite) and R ∈ S_++^m (symmetric positive definite). Additional terminal costs or finite-horizon variants are treated similarly with corresponding modifications to the cost and recursions.

When a stabilizing solution exists, the optimal policy is a linear state feedback u_k = -K x_k. The gain K and the associated value functions are derived from the solution of a discrete-time Riccati equation. The central objects are

  • P: the symmetric positive semidefinite matrix solving the discrete-time algebraic Riccati equation (DARE), P = A^T P A - A^T P B (R + B^T P B)^{-1} B^T P A + Q,
  • K: the feedback gain, given by K = (R + B^T P B)^{-1} B^T P A.

The DARE guarantees that, for the stabilizing solution P, the closed-loop eigenvalues of A - B K lie inside the unit circle, ensuring asymptotic stability under the optimal policy. See Riccati equation and Discrete-time algebraic Riccati equation for formal treatments, and State-space representation for the modeling framework.
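
As a minimal numerical sketch, the stabilizing P and the gain K can be computed with off-the-shelf routines. The following Python fragment uses scipy.linalg.solve_discrete_are; the plant matrices (a discretized double integrator with a 0.1 s sample time) and the weights Q and R are illustrative assumptions, not part of the formulation above.

    import numpy as np
    from scipy.linalg import solve_discrete_are

    # Hypothetical plant: discretized double integrator, sample time 0.1 s.
    A = np.array([[1.0, 0.1],
                  [0.0, 1.0]])
    B = np.array([[0.005],
                  [0.1]])
    Q = np.diag([1.0, 0.1])   # state weight, symmetric positive semidefinite
    R = np.array([[0.01]])    # input weight, symmetric positive definite

    # Stabilizing solution of the DARE:
    #   P = A^T P A - A^T P B (R + B^T P B)^{-1} B^T P A + Q
    P = solve_discrete_are(A, B, Q, R)

    # Optimal feedback gain: K = (R + B^T P B)^{-1} B^T P A
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

    # The closed-loop eigenvalues of A - B K lie inside the unit circle.
    assert np.all(np.abs(np.linalg.eigvals(A - B @ K)) < 1.0)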

Infinite-horizon solution and conditions

  • Q must be symmetric positive semidefinite, R must be symmetric positive definite.
  • Under standard conditions, namely (A, B) stabilizable and (Q^{1/2}, A) detectable, there exists a unique stabilizing solution P to the DARE.
  • The resulting K yields a closed-loop system x_{k+1} = (A - B K) x_k that is asymptotically stable, with all eigenvalues of A - B K strictly inside the unit circle.
  • The optimal cost from a known initial state x_0 is J* = x_0^T P x_0; if the initial state is random with covariance X_0, the expected optimal cost is trace(P X_0).
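
The deterministic cost identity in the last bullet can be checked numerically. The Python fragment below reuses the illustrative double-integrator matrices from the previous sketch (repeated here so it runs on its own), simulates the closed loop u_k = -K x_k, and compares the accumulated stage cost against x_0^T P x_0.

    import numpy as np
    from scipy.linalg import solve_discrete_are

    # Same illustrative matrices as in the previous sketch.
    A = np.array([[1.0, 0.1], [0.0, 1.0]])
    B = np.array([[0.005], [0.1]])
    Q = np.diag([1.0, 0.1])
    R = np.array([[0.01]])

    P = solve_discrete_are(A, B, Q, R)
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

    # Roll out u_k = -K x_k from a known x_0 and accumulate stage costs;
    # the sum converges to x_0^T P x_0 as the state decays to zero.
    x0 = np.array([[1.0], [0.0]])
    x, J = x0.copy(), 0.0
    for _ in range(2000):
        u = -K @ x
        J += float(x.T @ Q @ x + u.T @ R @ u)
        x = A @ x + B @ u

    print(f"simulated cost: {J:.6f}")
    print(f"x0^T P x0:      {float(x0.T @ P @ x0):.6f}")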

Finite-horizon solution

For finite horizon N with a terminal cost P_N, the problem is solved by backward recursion:

P_N = Q_f (the terminal weight, if present),

P_k = Q + A^T P_{k+1} A - A^T P_{k+1} B (R + B^T P_{k+1} B)^{-1} B^T P_{k+1} A, for k = N-1, ..., 0,

and K_k = (R + B^T P_{k+1} B)^{-1} B^T P_{k+1} A,

yielding the time-varying gain for a finite-horizon optimal policy.
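
A direct implementation of this backward recursion is short. The Python sketch below is one way to write it (the function and variable names are illustrative choices); it returns the time-varying gains K_0, ..., K_{N-1} together with the cost-to-go matrix P_0.

    import numpy as np

    def finite_horizon_lqr(A, B, Q, R, Qf, N):
        """Backward Riccati recursion for the finite-horizon DT-LQR problem."""
        P = Qf                    # P_N: terminal weight
        gains = [None] * N
        for k in range(N - 1, -1, -1):
            S = R + B.T @ P @ B
            K = np.linalg.solve(S, B.T @ P @ A)     # K_k
            gains[k] = K
            P = Q + A.T @ P @ A - A.T @ P @ B @ K   # P_k
        return gains, P           # gains[k] = K_k; P is P_0

As N grows, the gains produced this way converge (under the stabilizability and detectability conditions above) to the constant infinite-horizon gain, which is one practical way to cross-check a DARE solver.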

Implementation, computation, and practical considerations

  • Computing P (and K) typically relies on numerical solution of the DARE using reliable linear-algebra routines; widely used control and scientific-computing libraries provide DARE solvers that implement these steps efficiently and robustly.
  • In practice, model errors, disturbances, and unmodeled nonlinearities can degrade performance. Engineers address this with design choices on Q and R, robustness analysis, and, when necessary, more advanced frameworks such as robust control or Model predictive control.
  • Discrete-time implementation naturally aligns with digital hardware, sampling, quantization, and actuator saturation. Real-time constraints motivate careful selection of sampling rate and data precision, and sometimes necessitate augmentations such as integral action to prevent steady-state error (see the sketch after this list).
  • Extensions to handle partial state information commonly combine LQR with estimation via a Kalman filter, producing the LQG (Linear-Quadratic-Gaussian) framework. The separation principle shows that, under standard assumptions, estimation and control can be designed separately. See Kalman filter and LQG for details.
  • For nonlinear systems, DT-LQR serves as a local linear approximation around a nominal trajectory. Nonlinear extensions include iterative methods such as iLQR (iterative LQR), which successively linearize and solve LQR subproblems to approach a nonlinear optimum.
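
As one example of such an augmentation, integral action can be added by extending the state with an integrator on the regulated output error and solving a standard DT-LQR problem for the augmented system. The Python sketch below is illustrative only: the plant, the output matrix C, and the weights are assumptions chosen for the example.

    import numpy as np
    from scipy.linalg import solve_discrete_are

    # Illustrative plant (the same hypothetical double integrator as above)
    # with a regulated output y_k = C x_k; all numbers are assumptions.
    A = np.array([[1.0, 0.1], [0.0, 1.0]])
    B = np.array([[0.005], [0.1]])
    C = np.array([[1.0, 0.0]])

    n, m, p = A.shape[0], B.shape[1], C.shape[0]

    # Augment the state with an integrator on the output error,
    #   q_{k+1} = q_k + (r_k - C x_k),
    # so the augmented state is [x_k; q_k] (the reference enters separately).
    A_aug = np.block([[A, np.zeros((n, p))],
                      [-C, np.eye(p)]])
    B_aug = np.vstack([B, np.zeros((p, m))])

    Q_aug = np.diag([1.0, 0.1, 0.5])  # last entry weights the integrator state
    R = np.array([[0.01]])

    P = solve_discrete_are(A_aug, B_aug, Q_aug, R)
    K_aug = np.linalg.solve(R + B_aug.T @ P @ B_aug, B_aug.T @ P @ A_aug)
    Kx, Kq = K_aug[:, :n], K_aug[:, n:]  # control law: u_k = -Kx x_k - Kq q_k

For constant references, the integrator state drives the steady-state output error to zero, at the cost of one extra state per regulated output.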

Extensions and related approaches

  • Robust and constrained variants: When model mismatch or input constraints are a primary concern, practitioners turn to robust control methods (e.g., H2 or H-infinity frameworks) or to constraint-aware approaches such as Model predictive control (MPC), which can incorporate hard constraints and nonlinear dynamics.
  • Nonlinear and time-varying settings: iLQR and differential dynamic programming extend LQR ideas to nonlinear dynamics by iteratively solving local LQR problems along a trajectory.
  • Estimation and sensing: In systems where the full state is not measurable, the LQG combination with a Kalman filter provides an optimal framework under Gaussian noise assumptions.

Controversies and debates

Debates within the field often center on how to balance simplicity, robustness, and constraint handling. Proponents of the classic DT-LQR prize its mathematical clarity, stability guarantees, and the easy tunability of the cost weights. Critics emphasize that real-world systems frequently violate model assumptions, have hard constraints, or operate under uncertainties that the basic LQR formulation does not address without augmentation. In practice, the move toward robust or predictive methods is a continuum: DT-LQR remains a foundational, well-understood starting point whose assumptions and limitations are well documented, while practitioners layer on additional tools as needed.

From a practical standpoint, decisions about DT-LQR design are driven by performance, reliability, and cost-efficiency. While some critics argue for broader social or policy considerations in engineering practice, the core mathematics remains value-neutral: it provides a way to balance tracking performance against control effort in a predictable, analyzable fashion.

See also