{Agorithm}
Algorithms/Matrix Chain Multiplication
DPadvancedoptimizationinterval-dpparenthesization

Matrix Chain Multiplication

Find the optimal parenthesization of matrix products to minimize operations.

How it works

Matrix Chain Multiplication asks: given a sequence of matrices, in what order should we perform the multiplications to minimise the total number of scalar operations? Because matrix multiplication is associative, the order of parenthesization can dramatically affect cost — the difference can be orders of magnitude for long chains.

The algorithm builds a 2-D DP table where dp[i][j] holds the minimum cost to multiply matrices i through j. It fills the table by increasing chain length, and for each subproblem tries every possible split point k, combining the costs of the left and right sub-chains with the cost of the final merge.

Recurrence: dp[i][j] = min over k in [i, j-1] of { dp[i][k] + dp[k+1][j] + p[i-1]·p[k]·p[j] }, where p is the dimension array.

Base case: dp[i][i] = 0 for all i (a single matrix needs no multiplication).

The O(n³) time complexity comes from the three nested loops: chain length, start index, and split point. This classic interval-DP pattern recurs throughout competitive programming in problems involving optimal tree structures, polygon triangulation, and optimal BST construction.

Complexity

Best case
O(n³)
Average case
O(n³)
Worst case
O(n³)
Space
O(n²)

Visualizer

Generating steps…

Where it's used in the real world

Compiler expression optimization

Choosing evaluation order for chained linear-algebra expressions

Compilers and JIT engines for array languages (NumPy, Julia, TensorFlow XLA) model a series of tensor contractions as a matrix chain and use DP to select the cheapest evaluation order before emitting code, saving redundant FLOPS at runtime.

Graphics and game engine matrix pipelines

Optimising batched transform chains for scene graph nodes

A scene graph may accumulate dozens of 4×4 transform matrices per object. Choosing the cheapest parenthesization of these chains — especially when many objects share a common prefix — reduces CPU/GPU work per frame and improves throughput.

Quantum circuit compilation

Minimising gate count in unitary matrix decompositions

Quantum compilers must decompose a target unitary into a product of hardware-native gates. Selecting the splitting order that minimises entangling-gate count is structurally identical to matrix chain optimisation over the SU(2) gate alphabet.

Implementation

function matrixChain(dims: number[]): number {
  const n = dims.length - 1;
  // dp[i][j] = min cost to multiply matrices i..j (1-indexed)
  const dp: number[][] = Array.from({ length: n + 1 }, () =>
    Array(n + 1).fill(0)
  );

  for (let l = 2; l <= n; l++) {
    for (let i = 1; i <= n - l + 1; i++) {
      const j = i + l - 1;
      dp[i][j] = Infinity;
      for (let k = i; k < j; k++) {
        const cost = dp[i][k] + dp[k + 1][j] + dims[i - 1] * dims[k] * dims[j];
        if (cost < dp[i][j]) dp[i][j] = cost;
      }
    }
  }

  return dp[1][n];
}

// Usage — 6 matrices with dimensions from [30,35,15,5,10,20,25]
console.log(matrixChain([30, 35, 15, 5, 10, 20, 25])); // 15125

Related algorithms