The single master rule: einsum iterates over all index combinations, multiplies the corresponding elements, and sums over any index not in the output.
"inputs -> output", where inputs are comma-separated index strings for each operand.
Each letter represents one axis (dimension) of a tensor. The size of that dimension must be consistent wherever the letter appears.
Any index that appears on the right side of -> is a free index — it survives into the output.
Any index that appears in the input(s) but not in the output is a dummy index — it gets summed over (contracted).
"ij,jk->ik": j is dummy → summed → matrix multiply"ij->i": j is dummy → summed → row sumsSame letter twice in one operand constrains to the diagonal along those axes.
"ii->i": diagonal elements → vector"ii->": diagonal then sum → traceSame letter in two operands: aligned along that dimension before multiplying.
With an arrow (explicit mode): you fully control which indices appear in the output and in what order.
Without an arrow (implicit mode): each index that appears exactly once is included in the output (in alphabetical order); any index that appears more than once is summed.
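A quick sketch of implicit mode: in "ij,jk", j appears twice and is summed, while i and k each appear once and are output in alphabetical order, so the result matches the explicit "ij,jk->ik":

```python
import numpy as np

A = np.arange(6).reshape(2, 3)
B = np.arange(12).reshape(3, 4)

# Implicit mode: no "->". j is summed (appears twice);
# i and k survive, ordered alphabetically -> output shape (2, 4).
C_implicit = np.einsum("ij,jk", A, B)
```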
An index that appears in multiple operands and in the output is a batch index: no summation, and the operation is performed independently for each value of that index.
"bij,bjk->bik": b is batch → batched matrix multiply
The order of indices in the output string determines the shape. This lets you transpose implicitly: "ij->ji"
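For example, reordering the output indices transposes without any summation:

```python
import numpy as np

A = np.arange(6).reshape(2, 3)

# "ij->ji": both indices are free; only their output order changes.
At = np.einsum("ij->ji", A)
```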
| Expression | What happens |
|---|---|
"ij->ij" | Identity |
"ij->ji" | Transpose |
"ij->i" | Row sums |
"ij->" | Sum all → scalar |
"ii->i" | Diagonal |
"ii->" | Trace |
"ij,jk->ik" | Matrix multiply |
"ij,ij->ij" | Element-wise multiply |
"ij,ij->i" | Row-wise dot products |
"i,j->ij" | Outer product |
"i,i->" | Dot product |
"bij,bjk->bik" | Batched matmul |
"ij,kj->ik" | A @ B.T |
"i,i->": the shared index i is contracted, leaving no free indices → scalar (dot product).
"i,k->ik": both indices are free, so the result is 2D (outer product).
Broadcasting view of the outer product: reshape a to a column via a[:, None] and b to a row via b[None, :], then broadcast-multiply.
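The equivalence between the einsum form and the broadcasting view can be sketched directly:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0])

# Outer product two ways: einsum vs. explicit broadcasting.
outer_einsum = np.einsum("i,j->ij", a, b)
outer_bcast = a[:, None] * b[None, :]   # (3, 1) * (1, 2) broadcasts to (3, 2)
```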