Reduced Row Echelon Form
Definition 1. A matrix is said to be in row echelon form (REF) if the following conditions are satisfied:
(1) Any row containing a nonzero entry precedes any row in which all the entries are zero.
(2) The first nonzero entry in each nonzero row is $1$, called the leading $1$, or the pivot.
(3) The leading $1$ in each nonzero row occurs in a column strictly to the right of the leading $1$ in the preceding row; consequently, below each leading $1$ is a column of zeros.
A matrix is said to be in reduced row echelon form (RREF) if it is in row echelon form and satisfies the following additional condition:
(4) Each column containing a leading $1$ has zeros everywhere else.
Informally, a matrix in REF resembles an upper triangular matrix; RREF goes further by turning every entry of each column containing a leading $1$, other than the leading $1$ itself, into $0$.
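As a quick sanity check of the definitions, the made-up matrix below is in REF but not RREF: the column of its second leading $1$ has a nonzero entry ($2$) above the pivot, violating condition (4). SymPy's `Matrix.rref()` computes the RREF.

```python
from sympy import Matrix

# In REF but not RREF: column 2 contains a leading 1
# with a nonzero entry (2) above it.
A = Matrix([[1, 2, 3],
            [0, 1, 4],
            [0, 0, 0]])

# Matrix.rref() returns the RREF together with the pivot column indices.
R, pivots = A.rref()
print(R)       # Matrix([[1, 0, -5], [0, 1, 4], [0, 0, 0]])
print(pivots)  # (0, 1)
```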
Gauss-Jordan Elimination
Definition 2. The procedure for reducing an augmented matrix $(A \,|\, b)$ corresponding to the system $Ax = b$ to REF by performing elementary row operations is called Gaussian elimination. This procedure is also called forward elimination.
Furthermore, if this procedure reduces $(A \,|\, b)$ all the way to RREF, then it is called Gauss-Jordan elimination; the additional pass from REF to RREF is also called the backward pass, or back-substitution.
Gauss-Jordan elimination also refers, more generally, to the process of bringing an arbitrary matrix to RREF. The reason the definition is stated in terms of the system $Ax = b$, however, is that Gauss-Jordan elimination is in practice a very effective method for solving systems of linear equations. As Theorem 1 and the corollary below show, Gauss-Jordan elimination transforms every matrix into a unique RREF.
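To sketch this use, the following reduces the augmented matrix $(A \,|\, b)$ of a made-up $2 \times 2$ system to RREF with SymPy and reads the solution off the last column; reading it off this way is valid here because the system is consistent and $A$ has full column rank.

```python
from sympy import Matrix

# The system 2x + y = 5, x + 3y = 10 (an arbitrary example).
A = Matrix([[2, 1], [1, 3]])
b = Matrix([5, 10])

aug = A.row_join(b)     # form the augmented matrix (A | b)
R, pivots = aug.rref()  # Gauss-Jordan: reduce to RREF
x = R[:, -1]            # the unique solution (x, y) = (1, 3)
```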
Theorem 1
Theorem 1. Gauss-Jordan elimination transforms any matrix into its RREF.
Proof. We proceed by induction on $m$, where $O \neq A \in M_{m \times n}(F)$ (the zero matrix is already in RREF). If $m = 1$, scaling the row by the reciprocal of its first nonzero entry (a type 2 operation) yields the RREF of $A$.
Suppose the theorem holds for $m - 1$, where $m > 1$. Let column $j$ be the first nonzero column of $A$. By a type 1 operation, bring a row whose $j$-th entry is nonzero to the top; by a type 2 operation, scale this row so that its leading entry is $1$; by at most $m - 1$ type 3 operations, clear the entries below this leading $1$. The result has the form $A' = \begin{pmatrix} \mathbf{0} & 1 & * \\
\mathbf{0} & \mathbf{0} & B \end{pmatrix}$, where $B \in M_{(m-1) \times (n-j)}(F)$. By the induction hypothesis, $B$ can be transformed into its RREF by row operations on the last $m - 1$ rows of $A'$. Finally, at most $n$ type 3 operations clear the entries of the first row lying above the leading $1$s of $B$, and the resulting matrix is the RREF of $A$. $\blacksquare$
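The inductive proof translates directly into an algorithm. The sketch below (names are my own) is an iterative rendering of the same idea in plain Python, using exact `Fraction` arithmetic: bring a pivot row up (type 1), scale its leading entry to $1$ (type 2), and clear its column (type 3).

```python
from fractions import Fraction

def rref(rows):
    """Return the RREF of `rows` (a list of lists) using exact arithmetic."""
    M = [[Fraction(x) for x in row] for row in rows]
    m, n = len(M), len(M[0])
    r = 0  # row in which the next leading 1 will be placed
    for c in range(n):
        # Type 1: find a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, m) if M[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        # Type 2: scale the pivot row so its leading entry becomes 1.
        M[r] = [x / M[r][c] for x in M[r]]
        # Type 3: clear every other entry of column c (above and below).
        for i in range(m):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1  # descend to the next row, as in the induction step
    return M
```

For example, `rref([[2, 1, 5], [1, 3, 10]])` returns `[[1, 0, 1], [0, 1, 3]]` (as `Fraction`s).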
Theorem 2
Theorem 2. Let $A \in M_{m \times n}(F)$ with rank($A$) = $r > 0$, and let $B$ be the RREF of $A$. Then the following hold:
(a) The number of nonzero rows in $B$ is $r$.
(b) For each $i \in \{1, ..., r\}$, there exists a column $[B]^{j_i}$ of $B$ such that $[B]^{j_i} = e_i$, where $e_i$ is the $i$-th standard basis vector of $F^m$.
(c) $\{[A]^{j_1}, \cdots, [A]^{j_r}\}$ are linearly independent.
(d) $\forall k \in \{1, ..., n\}$, if $[B]^k = \sum_{i=1}^r d_ie_i$, then $[A]^k = \sum_{i=1}^r d_i[A]^{j_i}$.
Proof.
(a) Since elementary row operations are rank-preserving and $B$ is the RREF of $A$, rank($B$) = $r$. This means that the maximum number of linearly independent rows of $B$ is $r$. A zero row cannot belong to a linearly independent set, while the nonzero rows of $B$ are linearly independent, since each contains a leading $1$ in a column whose other entries are zero. Thus the number of nonzero rows in $B$ is $r$.
(b) By (a), $B$ has $r$ nonzero rows, and hence $r$ leading $1$s. By the definition of RREF, the column containing the $i$-th leading $1$ has a $1$ in row $i$ and zeros everywhere else; that is, it equals $e_i$. Thus for each $i = 1, ..., r$ there exists a column $[B]^{j_i}$ such that $[B]^{j_i} = e_i$.
(c) Suppose that $\sum_{i=1}^r c_i[A]^{j_i} = \mathbf{0}$ for some $c_i \in F$. Since $B$ is the RREF of $A$, we may write $B = R_p \cdots R_1 (A) = R_p(I) \cdots R_1(I)A = MA$, where $M := R_p(I) \cdots R_1(I)$ is invertible. Note that $[B]^{j_i} = M[A]^{j_i}$ $(i = 1, ..., r)$. Thus $$M\left(\sum_{i=1}^r c_i[A]^{j_i}\right) = \sum_{i=1}^r c_i[B]^{j_i} = \sum_{i=1}^r c_ie_i = (c_1, \cdots, c_r, 0, \cdots, 0)^t = \mathbf{0} \\
\Longrightarrow c_1 = \cdots = c_r = 0.$$ Hence $\{[A]^{j_1}, \cdots, [A]^{j_r}\}$ is linearly independent.
(d) Since $M$ is invertible, $A = M^{-1}B$, i.e., $[A]^k = M^{-1}[B]^k$ $(k = 1, ..., n)$. Since $[B]^k = \sum_{i=1}^r d_ie_i = \sum_{i=1}^r d_i[B]^{j_i}$, we get $[A]^k = M^{-1}\sum_{i=1}^r d_i[B]^{j_i} = \sum_{i=1}^r d_iM^{-1}[B]^{j_i} = \sum_{i=1}^r d_i[A]^{j_i}$. $\blacksquare$
The theorem above packages several important facts. Briefly: the RREF of a matrix has exactly $r$ nonzero rows, and the matrix itself has $r$ linearly independent columns that generate all of its other columns, where $r$ is the rank of the matrix. In other words, the rank of a matrix equals the number of its linearly independent columns, and these columns generate the remaining columns. By Corollary 2, the same result holds for linearly independent rows.
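The claims of Theorem 2 can be checked computationally. The snippet below uses SymPy on a made-up rank-$2$ matrix: there are $r$ pivots (part (a)), the pivot columns of the RREF $B$ are standard basis vectors (part (b)), and each column of $A$ is the combination of $A$'s pivot columns with coefficients read off the corresponding column of $B$ (part (d)).

```python
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [2, 4, 7],
            [1, 2, 4]])
B, pivots = A.rref()
r = A.rank()

assert len(pivots) == r                  # (a): r pivots / nonzero rows
for i, j in enumerate(pivots):           # (b): pivot columns of B are e_i
    assert B[:, j] == Matrix.eye(3)[:, i]
for k in range(A.cols):                  # (d): coefficients transfer to A
    d = B[:, k]
    assert A[:, k] == sum((d[i] * A[:, j] for i, j in enumerate(pivots)),
                          Matrix.zeros(3, 1))
print(r, pivots)  # 2 (0, 2)
```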
Corollary
Corollary. The RREF of a matrix is unique.
Proof. Assume that $B$ and $C$ are both RREFs of $A \in M_{m \times n}(F)$ with rank($A$) = $r > 0$. By Theorem 2(c)–(d), the pivot columns of an RREF of $A$ are precisely the columns of $A$ that do not lie in the span of the columns preceding them, so $B$ and $C$ share the same pivot column indices $j_1, \ldots, j_r$, with $[B]^{j_i} = [C]^{j_i} = e_i$ $(i = 1, ..., r)$. For each $k$, write $[B]^k = \sum_{i=1}^r c_ie_i$ and $[C]^k = \sum_{i=1}^r d_ie_i$ for some $c_i, d_i \in F$ $(i = 1, ..., r)$. Then by Theorem 2(d), $$[A]^k = \sum_{i=1}^r c_i[A]^{j_i} = \sum_{i=1}^r d_i[A]^{j_i} \\ \Longrightarrow \sum_{i=1}^r (c_i - d_i)[A]^{j_i} = \mathbf{0} \\ \Longrightarrow c_i = d_i \ (i = 1, ..., r),$$ where the last implication uses the linear independence of $\{[A]^{j_1}, \cdots, [A]^{j_r}\}$ from Theorem 2(c). Since $k$ was arbitrary, $B = C$. $\blacksquare$
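The corollary can also be illustrated computationally: row-equivalent matrices share one RREF. Below, a made-up matrix is scrambled with one elementary row operation of each type (the particular operations are arbitrary), and SymPy returns the same RREF and pivot columns for both.

```python
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [2, 4, 7],
            [1, 2, 4]])

S = A.copy()
S.row_swap(0, 2)                           # type 1: swap two rows
S.row_op(1, lambda v, j: 5 * v)            # type 2: scale a row by 5
S.row_op(2, lambda v, j: v - 3 * S[0, j])  # type 3: add -3 times row 0

assert S.rref() == A.rref()  # same RREF and same pivot columns
```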