How is a Generator Matrix for a (7, 4) Hamming code created?

$G$ represents the mapping from the input to the codewords of the code. You get the codeword vector $c$ by multiplying the input vector $v$ from the left of $G$: $$c = vG.$$

So, you see that the input vector $\begin{pmatrix} 1 & 0 & 0 & 0 \end{pmatrix}$ maps to the first row of $G$ (if you doubt me, do the calculation), and the other standard basis vectors map to the other rows of $G$.

Thus, you can construct a matrix $G$ by taking a basis of your input vectors, and deciding which codewords these map on, these codewords will be the rows of your $G$. Of course, when you do this, it is not guaranteed that your matrix $G$ will be on the from $\left[I \mid P \right]$, to achieve this you will have to perform row reductions on your matrix.

This describes in very general terms of how you construct a linear code. It is usually easier for Hamming codes.

So, let's construct $G$ for the $(7,4)$ Hamming code. Note that we are working over the finite field with two elements $\mathbb F_2$.

We have four input bits $b_1, b_2, b_3, b_4$. We define three parity bits as: $$\begin{align} p_1 &= b_1 + b_2 + b_3 \\ p_2 &= b_2 + b_3 + b_4 \\ p_3 &= b_1 + b_2 + b_4 \\ \end{align}$$

and our codeword is $$\begin{pmatrix} b_1 & b_2 & b_3 & b_4 & p_1 & p_2 & p_3 \end{pmatrix}.$$

We can see that this makes it easy to construct the first part of our $G$, it will just be the identity matrix, since we can see that $$ \begin{pmatrix} b_1 & b_2 & b_3 & b_4 \end{pmatrix} \begin{pmatrix} I_4 & P \end{pmatrix} = \begin{pmatrix} b_1 & b_2 & b_3 & b_4 & - & - & - \end{pmatrix} $$ independent of what $P$ is. Now we just use the equations for $p_1, p_2, p_3$ above and transform it to a matrix: $$ \begin{pmatrix} p_1 & p_2 & p_3 \end{pmatrix} = \begin{pmatrix} b_1 & b_2 & b_3 & b_4 \end{pmatrix} \underbrace{ \begin{pmatrix} 1 & 0 & 1\\ 1 & 1 & 1\\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}}_{= P} $$

Here's an alternative way to see it. We use the standard basis for our input words and calculate what the codewords for the basis vectors will be, using the parity equations above. This gives us the rows of our $G$ matrix.

For $e_1 = \begin{pmatrix} 1 & 0 & 0 & 0\end{pmatrix}$ we get $p_1 = 1, p_2 = 0, p_3 = 1$, so $$e_1G = \begin{pmatrix} 1 & 0 & 0 & 0 & 1 & 0 & 1\end{pmatrix}$$ Likewise for the other basis vectors: $$\begin{align} e_2G &= \begin{pmatrix} 0 & 1 & 0 & 0 & 1 & 1 & 1 \end{pmatrix} \\ e_3G &= \begin{pmatrix} 0 & 0 & 1 & 0 & 1 & 1 & 0 \end{pmatrix} \\ e_4G &= \begin{pmatrix} 0 & 0 & 0 & 1 & 0 & 1 & 1 \end{pmatrix} \end{align}$$ and these will be the rows of our $G$, i.e. $$ G = \begin{pmatrix} e_1G \\ e_2G \\ e_3G \\ e_4G \end{pmatrix} $$


The standard way of finding out the parity matrix $G_{k,n}$ for a Hamming code is constructing first the check parity matrix $H_{n-k,n}$ in systematic form.

For this, we recall that a Hamming code has $d=3$ (minimum distance). Hence the columns of $H$ have the property that we can find a set of $3$ linearly dependent columns, but not $2$ columns or less. Because we are in $GF(2)$ (binary code) this is equivalent to saying that the columns of $H$ must be distinct (and different from zero). Then the columns of $H$ just correspond to the $2^{n-k}-1$ different non-zero binary tuples. We can choose to construct it systematic, hence, for example:

$$ H= \begin{pmatrix} 1 &1 &0& 1 &1& 0 &0\\ 0 &1 &1 &1 &0 &1 &0\\ 1 &1& 1 &0& 0 &0& 1 \end{pmatrix} = [P^t \mid I_{n-k}] $$

Be aware that you could also choose other permutations of the first 4 columns.

From this, you get $G=[I_k \mid P ]$