Tensors and Matrices

Firstly, in tensor index notation the order of indices matters. If there's one upstairs index and one downstairs index, then one should either write $$A^\mu_{\ \ \nu}$$ or $$A_\nu^{\ \ \mu}$$. Traditionally the first index (whether upper or lower) is used to designate the row, and the second is used to designate the column, so for example $$A^0_{\ \ 1}=a_{01}$$.

Your next question gets to the heart of why it's dangerous to naively say that a tensor with two indices is simply a matrix. In linear algebra, we often first deal with objects which eat vectors and spit out other vectors, e.g.

$$\pmatrix{a_1\\a_2}=\pmatrix{m_{11}&m_{12}\\m_{21}&m_{22}}\pmatrix{b_1\\b_2}$$

Such an object is a $$(1,1)$$-tensor, and the above equation would be written $$a^\mu = m^\mu_{\ \ \nu} b^\nu$$. $$(1,1)$$-tensors have familiar properties, such as the fact that their trace and determinant are basis-independent.
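To make the index equation concrete, here is a minimal NumPy sketch. The numbers in `m`, `b`, and the change-of-basis matrix `d` are made up for illustration, and the helper names are hypothetical. It writes $$a^\mu = m^\mu_{\ \ \nu} b^\nu$$ as an `einsum` contraction and then checks that the trace and determinant survive a similarity transformation, which is how the components of a $$(1,1)$$-tensor change under a change of basis.

```python
import numpy as np


def apply_linear_map(m, b):
    """Contract the second index of m with b: a^mu = m^mu_nu b^nu."""
    return np.einsum("mn,n->m", m, b)


def similarity_transform(m, d):
    """Components of a (1,1)-tensor in the new basis: D^{-1} m D."""
    return np.linalg.inv(d) @ m @ d


# Illustrative values only; any matrix m and any invertible d would do.
m = np.array([[2.0, 1.0],
              [0.0, 3.0]])
b = np.array([1.0, -1.0])
d = np.array([[1.0, 2.0],
              [1.0, 3.0]])  # invertible, since det(d) = 1

a = apply_linear_map(m, b)          # same result as the matrix product m @ b
m_new = similarity_transform(m, d)  # components of the same map in the new basis

print("a =", a)
print("trace:", np.trace(m), "->", np.trace(m_new))
print("det:  ", np.linalg.det(m), "->", np.linalg.det(m_new))
```

The entries of `m_new` differ from those of `m`, but the trace and determinant agree up to floating-point rounding.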

$$(2,0)$$-tensors and $$(0,2)$$-tensors can also be laid out in matrix form. This is a convenient way to write out what their components are. For example, one often says that the Minkowski metric, a $$(0,2)$$-tensor, is given by

$$\eta = \pmatrix{-1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1}$$

This is perfectly fine. However, it's critical to remember that the determinants and traces of such objects are not basis-independent, because their transformation rules under a change of basis differ from those of $$(1,1)$$-tensors.
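As a quick numerical illustration, the sketch below applies the $$(0,2)$$ transformation rule $$\eta \to D^T \eta D$$ to the Minkowski metric. The particular change of basis `d` (doubling the time basis vector) is an arbitrary choice for the example, and the convention assumed is that the columns of `d` hold the new basis vectors expressed in the old basis.

```python
import numpy as np


def congruence_transform(g, d):
    """Components of a (0,2)-tensor in the new basis: D^T g D."""
    return d.T @ g @ d


# Minkowski metric with signature (-, +, +, +), as written above.
eta = np.diag([-1.0, 1.0, 1.0, 1.0])

# Hypothetical change of basis: double the length of the time basis vector.
d = np.diag([2.0, 1.0, 1.0, 1.0])

eta_new = congruence_transform(eta, d)

print("trace:", np.trace(eta), "->", np.trace(eta_new))
print("det:  ", np.linalg.det(eta), "->", np.linalg.det(eta_new))
```

Here the new components are diag(-4, 1, 1, 1), so the trace goes from 2 to -1 and the determinant from -1 to -4. In general the determinant of a $$(0,2)$$-tensor picks up a factor of $$(\det D)^2$$, so its sign is preserved even though its value is not.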

So, to answer your question: if you write down a couple of rows and columns of numbers, it's impossible to tell the difference between a $$(1,1)$$-tensor, a $$(0,2)$$-tensor, or a $$(2,0)$$-tensor without knowing how it transforms when you change bases, and knowing the difference is critical to understanding the properties of the tensor in question.

It depends on whether you want the matrix to represent a linear map or a quadratic form. In the former case you should replace $$a_{\mu\nu}$$ by $${A^\mu}_\nu$$, so that $${\bf y}={\bf A}{\bf x}$$ becomes $$y^\mu= {A^\mu}_\nu x^\nu$$, and in the latter case $$Q[{\bf x}]={\bf x}^T {\bf A}{\bf x}$$ becomes $$Q= A_{\mu\nu}x^\mu x^\nu.$$ The different placement of the indices reflects how the matrices change under a change of basis. A linear map transforms as $${\bf A}\to {\bf D}^{-1} {\bf A}{\bf D}$$ (a similarity transformation), while a quadratic form transforms as $${\bf A}\to {\bf D}^T {\bf A}{\bf D}$$ (a congruence transformation). This makes a big difference when you have to diagonalize the matrices.
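The two rules can be checked side by side. The following is only a sketch with made-up numbers for `A`, `D`, and `x`, assuming that the columns of `D` are the new basis vectors written in the old basis, so that vector components transform with $${\bf D}^{-1}$$. It takes the same array of numbers `A`, treats it once as a linear map and once as a quadratic form, and confirms that each transformation rule keeps the corresponding equation valid in the new basis.

```python
import numpy as np

# Hypothetical example data: any A and any invertible D would do.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
D = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])  # columns = new basis vectors in the old basis
x = np.array([1.0, 2.0, 3.0])

D_inv = np.linalg.inv(D)

# Vector components in the new basis.
x_new = D_inv @ x

# A as a linear map: y = A x, with A -> D^{-1} A D (similarity).
y = A @ x
A_map_new = D_inv @ A @ D
y_new = A_map_new @ x_new            # components of y in the new basis

# A as a quadratic form: Q = x^T A x, with A -> D^T A D (congruence).
Q = x @ A @ x
A_form_new = D.T @ A @ D
Q_new = x_new @ A_form_new @ x_new   # the scalar Q must not change

print("linear map consistent:    ", np.allclose(y_new, D_inv @ y))
print("quadratic form consistent:", np.isclose(Q_new, Q))
print("same new components?      ", np.allclose(A_map_new, A_form_new))
```

The two transformed matrices differ here because this `D` is not orthogonal. When $${\bf D}$$ is orthogonal, $${\bf D}^{-1}={\bf D}^T$$ and the two rules coincide, which is why the distinction is easy to overlook when working only with orthonormal bases in Euclidean space.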