How do basis functions work?

First we need to be precise about what space of functions we're talking about. A natural choice, where your question makes sense, is $L^2([-\pi,\pi])$: the functions $f:[-\pi,\pi]\to\mathbb C$ satisfying $$\int_{-\pi}^\pi|f(x)|^2dx<\infty.$$ (There are also issues of measurability in specifying these functions, but let's ignore that complication; see Stein's Real Analysis.) Now we can address your question about how to take the dot product of two functions. In this context we call it an inner product, and (with Stein's normalization) it's written $$(f,g)_{L^2}=\frac{1}{2\pi}\int_{-\pi}^\pi f(x)\overline{g(x)}dx.$$ (Notice the parallel between this and the finite-dimensional Euclidean dot product.)
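To see this pairing in action, here is a minimal numerical sketch (the grid and the test exponentials are my own choices, purely for illustration):

```python
import numpy as np

# Riemann-sum approximation of the inner product (f, g)_{L^2} defined above,
# including its 1/(2π) normalization.
x = np.linspace(-np.pi, np.pi, 20000, endpoint=False)
dx = x[1] - x[0]

def inner(f, g):
    return np.sum(f(x) * np.conj(g(x))) * dx / (2 * np.pi)

e3 = lambda t: np.exp(3j * t)
e5 = lambda t: np.exp(5j * t)
print(inner(e3, e3).real)  # ≈ 1.0: e^{3ix} has unit norm
print(abs(inner(e3, e5)))  # ≈ 0.0: distinct exponentials are orthogonal
```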

From here you can prove that the functions $\{e^{inx}\}_{n\in\mathbb Z}$ (these encode the sines and cosines you asked about, via $e^{inx}=\cos(nx)+i\sin(nx)$) are an orthonormal basis; see pp. 200-201 in Stein. (Note that you need a topology to talk about a basis in this context; it is the one determined by the metric coming from the above inner product.)

Now, as you know from the finite-dimensional case, you can write a function in a basis by taking dot products: for $n\in\mathbb Z$, $$\hat f(n)=(f,e^{inx})_{L^2}=\frac{1}{2\pi}\int_{-\pi}^\pi e^{-inx}f(x)dx.$$ So $\hat f(n)$ is just the $n$th Fourier coefficient! This hints that the Fourier transform is an isomorphism between $L^2([-\pi,\pi])$ and $\ell^2(\mathbb Z)$, where the latter is defined as the space of square-summable sequences (you can probably guess how its inner product is defined).
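Here is a minimal numerical sketch of this recipe (the grid size and the square-wave example are my own choices):

```python
import numpy as np

# Sketch: compute f̂(n) = (f, e^{inx})_{L^2} by a Riemann sum, then form the
# partial sum Σ_{|n|≤N} f̂(n) e^{inx}, which converges to f in L^2 as N grows.
x = np.linspace(-np.pi, np.pi, 4096, endpoint=False)
dx = x[1] - x[0]
f = np.sign(x)  # a square wave, chosen so the convergence is visible

fhat = {n: np.sum(f * np.exp(-1j * n * x)) * dx / (2 * np.pi)
        for n in range(-30, 31)}
partial = sum(c * np.exp(1j * n * x) for n, c in fhat.items())

print(np.max(np.abs(partial.imag)))                    # ≈ 0, since f is real
print(np.sqrt(np.sum(np.abs(partial - f) ** 2) * dx))  # L^2 error, shrinking with N
```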

P.S. If you're looking for search terms: "Hilbert space" refers to a vector space (often of functions) with a "dot product" that behaves enough like the Euclidean one that you can use it to do calculus.

P.P.S. I realized too late that you're asking about the discrete Fourier transform, but I hope this is nonetheless useful.


for a basis which was a function instead of a constant

See, there is not really a dichotomy between functions and constants. A function is simply a constant of a different type than a number or a tuple. What's not constant are function values, i.e. the results of evaluating a function at some argument. Unfortunately, the notation many people use completely obscures this distinction. To make it clear: let $f : \mathbb{R} \to \mathbb{R}$ be a function. Then

  • $f(x)$ is not a function. It is just an expression that depends on the free variable $x$. Thus it is not constant, and indeed it does not make sense to consider it as an element of a vector space.
    Lots of people will tell you that $f(x)$ is a function. They're wrong.
  • The function is just $f$, not applied to any argument. If you have a proper definition like $$ \forall x .\ f(x) = \frac1{1+x^2} $$ (note that the universal quantifier $\forall$ – which is usually not written out but assumed implicitly – makes $x$ a bound variable), then $f$ as a whole is a perfectly good function-constant, just like $5$ or $\sqrt{2}$ or $b$ are perfectly good number-constants (provided you previously defined $b = 17$).

As a matter of fact, you might also consider vectors $v\in\mathbb{R}^3$ as functions: $$ v : \{0,1,2\} \to \mathbb{R} $$ or, less precisely, $$ v : \mathbb{N} \to \mathbb{R}. $$ This means that $v(i)$ (or, as it's more commonly written, $v_i$) is a number, whereas $v$ as a whole is a vector.
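If it helps, the same point can be made executable; in the Python sketch below (hypothetical names, purely for illustration), $v$ is a constant while $v(i)$ is a computed value:

```python
# A vector in R^3, literally implemented as a function from {0, 1, 2} to R.
def v(i: int) -> float:
    return (2.0, -1.0, 0.5)[i]

print(v(1))                      # -1.0: v(1) is a number, not a vector
print([v(i) for i in range(3)])  # the whole of v plays the role of the vector
```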

Now, with such finite-dimensional vectors, a scalar product will generally be a simple sum: $$ \langle v,w\rangle_{_{\mathbb{R}^3}} = \sum_{i=0}^2 v_i\cdot w_i $$ Such a definition is equally possible if the vectors aren't functions of a natural number but of a real one; for instance, you might define $$ \langle f,g\rangle_{_\text{Bogus}} = \sum_{x\in \{1,3,\pi^2\}} f(x) \cdot g(x) $$ Only, that's not a scalar product: it is not positive definite. For instance, if $f$ has nonzero values only for negative $x$, then $\langle f,f\rangle_{_\text{Bogus}} = 0$ even though $f\neq 0$.
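A quick numerical check of that failure, using the sample points from the definition above and a function supported on the negative axis:

```python
import numpy as np

# The "Bogus" pairing from above: sample the functions at just three points.
pts = (1.0, 3.0, np.pi ** 2)

def bogus(f, g):
    return sum(f(x) * g(x) for x in pts)

f = lambda x: max(-x, 0.0)  # nonzero only for negative x
print(bogus(f, f))          # 0.0 although f ≠ 0: not positive definite
```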

Actually useful scalar products avoid this by evaluating the functions everywhere on their domain. This is not really feasible with sums, but you can do it with integrals, like in the classical $L^2$ Hilbert space: $$ \langle f,g\rangle_{_{L^2(\mathbb{R})}} = \int_{\mathbb{R}}\!\mathrm{d}x \ f(x) \cdot g(x). $$
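As a sketch of how this integral pairing behaves in practice (the Gaussian test functions are my own, arbitrary choice of square-integrable functions):

```python
import numpy as np
from scipy.integrate import quad

# The L^2(R) scalar product, evaluated by quadrature over the whole real line.
f = lambda x: np.exp(-x ** 2)      # even, square integrable
g = lambda x: x * np.exp(-x ** 2)  # odd, square integrable

print(quad(lambda x: f(x) * g(x), -np.inf, np.inf)[0])  # ≈ 0: even ⊥ odd
print(quad(lambda x: f(x) ** 2, -np.inf, np.inf)[0])    # ≈ √(π/2) ≈ 1.2533 > 0
```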


You may object that this is still not positive definite: if you plug in a function that's zero almost everywhere but has e.g. $f(0)=1$, then the $L^2$ norm yields $0$ although $f\neq0$. The elements of the $L^2$ Hilbert space are in fact not general square-integrable functions, but equivalence classes of such functions modulo differences on null sets; hence we may say $f=0$ if it is zero almost everywhere. (Alternatively, you can just say: the $L^2$ norm is a genuine norm on the continuous functions.)


Let us consider the DST (Discrete Sine Transform) which is very close to the DFT (Discrete Fourier Transform) and slightly simpler to understand:

The DST of order $n$ is defined by a matrix acting on vectors that represent discretized functions in this way:

$$\begin{pmatrix}\sin(a)&\sin(2a)&\cdots&\sin(qa)&\cdots&\sin(na)\\ \sin(2a)&\sin(4a)&\cdots&\sin(2qa)&\cdots&\sin(2na)\\ \vdots&\vdots& &\vdots& &\vdots\\ \sin(pa)&\sin(2pa)&\cdots&\sin(pqa)&\cdots&\sin(pna)\\ \vdots&\vdots& &\vdots& &\vdots\\ \sin(na)&\sin(2na)&\cdots&\sin(nqa)&\cdots&\sin(n^2a) \end{pmatrix} \begin{pmatrix}f(h)\\f(2h)\\\vdots\\f(qh)\\\vdots\\f(nh) \end{pmatrix}=\begin{pmatrix}g(h)\\g(2h)\\\vdots\\g(ph)\\\vdots\\g(nh) \end{pmatrix}$$

with $a=\dfrac{\pi}{n+1}$, and $h$ the discretization step. (The constant $a=\frac{\pi}{n+1}$ is what makes the matrix invertible; with $a=\frac{2\pi}{n+1}$, for instance, column $n+1-q$ would equal $-1$ times column $q$.)

This matrix transforms the discretized function $f$ into the discretized function $g$.

The generic row-by-column product (row number $p$ times the column vector of $f$) is:

$$\sin(pa)f(h)+\sin(2pa)f(2h)+\cdots+\sin(pqa)f(qh)+\cdots+\sin(pna)f(nh)=g(ph)$$

which can be put into the form:

$$g(ph)=\sum_{q=1}^n \sin(pqa)f(qh)$$
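For concreteness, here is a small Python sketch of this matrix-vector product; the cross-check assumes SciPy's DST-I convention, which uses the same kernel up to an overall factor of $2$:

```python
import numpy as np
from scipy.fft import dst

# Build the n×n matrix S[p, q] = sin(pqa) with a = π/(n+1), and apply it to a
# discretized f.
n = 8
a = np.pi / (n + 1)
idx = np.arange(1, n + 1)
S = np.sin(np.outer(idx, idx) * a)

f = np.random.default_rng(0).normal(size=n)  # the samples f(qh)
g = S @ f                                    # g(ph) = Σ_q sin(pqa) f(qh)

print(np.allclose(dst(f, type=1), 2 * g))    # True, assuming SciPy's DST-I
```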

It now suffices to replace, "by analogy", the $\sum$ signs by $\int$ signs, and discrete values by continuous ones:

$$g(s)=\int_{t=0}^1 \sin(2\pi st)f(t)dt$$
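As a quick sanity check of this formula, here is a quadrature sketch (the test function $\sin(2\pi t)$ is my own choice):

```python
import numpy as np

# Quadrature sketch of g(s) = ∫_0^1 sin(2π s t) f(t) dt.
t = np.linspace(0.0, 1.0, 4001)
dt = t[1] - t[0]

def g(s, f):
    return np.sum(np.sin(2 * np.pi * s * t) * f(t)) * dt

print(g(1.0, lambda u: np.sin(2 * np.pi * u)))  # ≈ 1/2: the s = 1 mode is picked out
```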

Remark: the "details" concerning the replacement of a finite range of values by a continuous one may look rather arbitrary. In fact, other choices "by analogy" could be made, in particular for dealing with infinite ranges of continuous variables; this would be necessary to recover the so-called classical "continuous" sine transform. But our purpose here is only pedagogical.

These transformation methods, illustrated here by building a continuous transform out of a finite discrete one, constitute a very powerful "heuristic tool": a method which helps you discover (and understand) properties, relationships, etc.

But, let us repeat: we work by analogy, and no more than that.

Remark 1: It can be proved that the columns of the matrix of the discrete sine transform are orthogonal and have a common norm, namely $\sqrt{\tfrac{n+1}{2}}$. Dividing by this norm, one gets a privileged orthonormal basis, a point that is important in the "analog" switch to the Hilbert space $L^2$ (mentioned by @Stan Palasek), where so many things depend upon decompositions over an orthonormal (Hilbert space) basis.
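Here is a numerical confirmation of this remark (a sketch: the Gram matrix $S^\top S$ comes out as $\frac{n+1}{2}I$):

```python
import numpy as np

# Columns of S[p, q] = sin(pqa), with a = π/(n+1), are pairwise orthogonal and
# all have squared norm (n+1)/2, so S / sqrt((n+1)/2) is an orthogonal matrix.
n = 8
a = np.pi / (n + 1)
idx = np.arange(1, n + 1)
S = np.sin(np.outer(idx, idx) * a)

print(np.allclose(S.T @ S, (n + 1) / 2 * np.eye(n)))  # True
```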

Remark 2: I could have taken a different path and considered extending our matrix to infinity ($p,q=1,2,\cdots$) with a convenient mathematical apparatus. This would have given a different "world"; but it is better not to mix up things that are not of the same nature.

Remark 3: You may have a look at a question I asked some time ago about the relationship between the discrete and continuous "worlds" and the bridges that can be built between them: (Looking for examples of Discrete / Continuous complementary approaches).