What is a function?

I see there are already a few answers to this question but I'm going to try my hand at answering it the way I see it.

A function $f$ is a mathematical object that relates elements of two sets, one called the domain $A$ and one called the codomain $B$. The notation $f : A \to B$ denotes the fact that $f$ is a function with domain $A$ and codomain $B$.

What it means to be a function $f : A \to B$ is this: $f$ assigns to each element of $A$ exactly one element of $B$. If $a \in A$, the notation $f(a)$ denotes the element of $B$ to which $a$ is assigned by $f$.

Those elements of $B$ which can be written in the form $f(a)$, for some $a \in A$, are called values of $f$. The set of all values of $f$ is the image of $f$, which is a subset of $B$ (and not necessarily all of $B$).


There are various ways of specifying functions. For example:

  • If $A$ is finite, you can simply list the values of $f$. For example we can define a function $f : \{ 1,2,3 \} \to \{ \text{red}, \text{green}, \text{blue} \}$ by $$f(1) = \text{green}, \quad f(2) = \text{blue}, \quad f(3) = \text{red}$$
  • Sometimes functions can be defined by an equation. An example of a function $f : \mathbb{R} \to \mathbb{R}$ is the one defined by the equation $$f(x) = x^2+3$$ This equation is not itself a function. What it means is, given an element $x \in \mathbb{R}$, the element of $\mathbb{R}$ associated by $f$ with $x$ is $x^2+3$. The expressions $f(x)$ and $x^2+3$ both denote exactly the same thing here: the real number associated with the number $x$. For example $f(2)$ denotes the same thing as $2^2+3$, which in turn denotes the same thing as $7$. The function is $f$, and $f(x)$ denotes the value of $f$ at a given number $x$.
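If it helps to see the two ways of specifying a function side by side, here is a minimal Python sketch. The names `f_table`, `value_at_2` and `image` are my own labels for illustration, and Python's floats only approximate $\mathbb{R}$, so treat this as an analogy rather than a definition.

```python
# A function on a finite domain, specified by listing its values.
f_table = {1: "green", 2: "blue", 3: "red"}

# A function specified by an equation: f(x) = x^2 + 3.
def f(x):
    return x**2 + 3

# f is the function; f(2) is one of its values (here the number 7).
value_at_2 = f(2)

# The image of the finite function is the set of all its values.
image = set(f_table.values())

print(f_table[1])
print(value_at_2)
print(image == {"red", "green", "blue"})
```

Note that `f` on its own is an object you can pass around, while `f(2)` is just a number, which mirrors the distinction between $f$ and $f(x)$.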

The graph of a function $f : A \to B$ is the set $$\{ (a,b) \in A \times B : b = f(a) \}$$ Sometimes, particularly when $f : \mathbb{R} \to \mathbb{R}$, it is convenient to define a function in terms of its graph. (Indeed, $\mathbb{R} \times \mathbb{R}$ is what you're depicting when you draw a pair of coordinate axes.) For example the equation $y=x^2+3$ specifies a function $f : \mathbb{R} \to \mathbb{R}$ whose graph is the set $$\{ (x,y) \in \mathbb{R} \times \mathbb{R} : y=x^2+3 \}$$ That is, the function $f$ specified by this equation is the one which associates to each $x \in \mathbb{R}$ the value $x^2+3$. Some people would then say '$y$ is a function of $x$', but this is slightly misleading: what it means is that there is a function whose values are exactly the values of $y$ satisfying the given equation.


An example of how a function works is as follows. Suppose a bird is flying in a straight line at a constant speed of $12$ metres per second. The distance the bird flies 'is a function of time', in the following sense: if $t$ is a positive real number, then the distance flown by the bird in $t$ seconds is $12t$ metres. Thus the relationship between distance and time defines a function $d : \mathbb{R}^+ \to \mathbb{R}$, which is defined by the equation $$d(t) = 12t$$ for all $t \in \mathbb{R}^+$. Thus you can consider $d$ as being the 'distance function', and for each $t$ you can consider $d(t)$ as the 'distance travelled at time $t$'.
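As a rough sketch of the bird example in Python: the function is `d`, and the check on `t` is my own way of modelling the domain $\mathbb{R}^+$ (raising an error outside it is an assumption of this sketch, not something forced by the mathematics).

```python
def d(t):
    """Distance in metres flown by the bird in t seconds at 12 m/s."""
    # The domain is the positive reals, so reject anything else.
    if t <= 0:
        raise ValueError("t must be a positive real number")
    return 12 * t

print(d(5))    # the value of d at t = 5
print(d(0.5))  # the value of d at t = 0.5
```

Here `d` plays the role of the 'distance function', and `d(5)` is the distance flown in $5$ seconds.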


Some non-examples of functions are:

  • $f : \mathbb{R} \to \mathbb{Q}$ defined by $f(x) = x$. This is not a function because, for example, $f(\sqrt{2})=\sqrt{2}$, which is not an element of $\mathbb{Q}$.
  • $f : \mathbb{R}^+ \to \mathbb{R}$ defined by $f(x)^2=x$. This is not a function because, for example, the expression $f(1)$ has more than one possible value satisfying the equation, namely $1$ or $-1$.
  • $f : \mathbb{R} \to \mathbb{R}$ defined by the graph $x^2+y^2=1$. This is not a function because the values of $y$ are not uniquely determined by the values of $x$, for example $0^2+1^2=1$ and $0^2+(-1)^2=1$.
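For finite sets you can test these failure modes mechanically. The sketch below uses a hypothetical helper `is_function` (my own name) that takes a set of pairs and checks that every element of the domain is paired with exactly one element of the codomain. The sample sets are small finite stand-ins for the non-examples above, since $\mathbb{R}$ cannot be enumerated.

```python
def is_function(pairs, domain, codomain):
    """Check whether `pairs` is the graph of a function from domain to codomain."""
    for a in domain:
        outputs = {b for (x, b) in pairs if x == a}
        # Each input needs exactly one output, and it must lie in the codomain.
        if len(outputs) != 1 or not outputs <= set(codomain):
            return False
    # No pair may start from something outside the domain.
    return all(x in domain for (x, _) in pairs)

# The colour example from earlier: a genuine function.
colours = {(1, "green"), (2, "blue"), (3, "red")}
print(is_function(colours, {1, 2, 3}, {"red", "green", "blue"}))

# A value outside the codomain (finite analogue of f(x) = x with codomain Q).
print(is_function({(1, 1), (2, 2)}, {1, 2}, {1}))

# Integer points on the unit circle: x = 0 is paired with both 1 and -1.
circle = {(0, 1), (0, -1), (1, 0), (-1, 0)}
print(is_function(circle, {-1, 0, 1}, {-1, 0, 1}))
```

Only the first call should report a function; the other two fail for the same reasons as the first and third non-examples.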

In summary, if $f : A \to B$ is a function, then

  • $f$ is the function itself, which has domain $A$ and codomain $B$;
  • $f(a)=(\text{expression in terms of}\ a)$ is an equation specifying $f$ by declaring its effect on the elements of $A$; the expression $f(a)$ is not itself a function ($f$ is the function), but a function is determined by its values, so specifying $f(a)$ suffices;
  • $y=f(x)$ is an equation that specifies $f$ in terms of its graph.

In the mathematical branch of set theory, which is used as a foundation for most mainstream mathematics, we need to specify precisely in terms of sets what it means for $f$ to be a function. In this setting, a function $f : A \to B$ is usually defined to be its graph: that is, if $f$ assigns to each $a \in A$ the value $f(a) \in B$, then we'd write $$f = \{ (a,b) \in A \times B : b=f(a) \} \subseteq A \times B$$ This formal approach isn't something you need to worry about if you're learning about functions for the first time. All that matters is that for every element of the domain $A$, $f$ identifies that element with exactly one element of the codomain $B$.


This explanation is woefully incomplete, but there's only so much you can do in an MSE answer... let me know if you need more clarifications.


A function requires some inputs and for each valid combination of inputs produces one output. What is valid is determined by the domain, which is sometimes specified but sometimes left for the reader to infer. The issue is when talking about graphs, because historically people have used single letters to refer to changing quantities, and still do so in many areas of mathematics. When we say that "$y$ is a function of $x$" it means that "$y$ changes together with $x$". That concept does not correspond directly to the proper definition of functions.

As you realize, when you have a straight line it can be described by the linear function $f$ where $f(x) = mx+c$ for any $x \in \mathbb{R}$, with $m,c$ as some fixed values. Notice that everything here is fixed. $f$ is a fixed unchanging object called a function. "$f(x)$" is a notation to denote the output when $x$ is given as input to $f$.

To use your other example, if $f$ is a function such that $f(t) = t^2$ for any $t \in \mathbb{R}$, then $f(0) = 0^2 = 0$, and $f(2) = 2^2 = 4$, and so on. This shows that if you give a different input to a function, you may get a different output. But the function itself hasn't changed. It always does the same thing. If you give it the same input many times in a row, it is going to give you the same output each time. Neither is the input changing. You can give it different inputs, but you cannot give an input that 'changes'.

Note also that the variable used as the input in defining a function is not important and is called a dummy variable. In the above example we could have defined $f$ to be such that $f(u) = u^2$, and it would make no difference.
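Here is a small Python sketch of the same point, with `f` and `g` defined using different parameter names. Checking a handful of inputs only illustrates the idea and proves nothing in general; the sample inputs are arbitrary choices of mine.

```python
def f(t):
    return t**2

def g(u):
    # Same rule as f; only the name of the dummy variable differs.
    return u**2

inputs = [-2, 0, 2, 3.5]

# f and g agree on every input we try.
agree = all(f(x) == g(x) for x in inputs)

# Giving f the same input repeatedly gives the same output each time.
repeated = [f(2) for _ in range(3)]

print(agree)
print(repeated)
```

The names `t` and `u` do not exist outside the two definitions, which is exactly what being a dummy variable means.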

So "$y = f(x)$" and "$y = x^2$" are actually misleading notations, because they are used to suggest that $y$ changes when $x$ is changed. I would avoid them as far as possible. If you really need to write something like "$y = x^2$", you should understand it as follows: When $x = 3$, $y = 9$. When $x = -2$, $y = 4$. Note that such notation is often used in the context of plotting graphs, but then remember that these are just equations describing which points are in the graph, and equations are not the same as functions.

Moreover, some graphs (and their associated equations) do not even correspond to a function, such as "$x^2+y^2 = 1$". What we can say is that this equation describes the set of points in the Cartesian plane that make up the unit circle, but it does not correspond to a function that produces $y$ given $x$, because for instance there are two solutions when $x = 0$, namely $y = 1$ or $y = -1$. A function cannot produce two outputs.

For a concrete example, if say you are told that the position of a ball above the earth's surface is $at-\frac{1}{2}gt^2$ at time $t$, that can be represented by the function $f$ such that $f(t) = at-\frac{1}{2}gt^2$, and we can say that the position of the ball is $f(t)$ at time $t$. We can also plot the ball's trajectory over time by plotting the points $(t,f(t))$ for various values of $t$. It is not correct to say that $f$ is the position at time $t$, because $f$ is just a function and not a value, and remember that "$t$" in the definition of $f$ is a dummy variable and hence meaningless outside the definition. Instead, you either say that $f(t)$ is the position at time $t$ or you say that $f$ is the function that maps any given time to the position of the ball at that time.
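To make the difference between $f$ and $f(t)$ concrete, here is a sketch in Python. The numerical values of `A` and `G` are assumptions I picked for illustration (the text leaves $a$ and $g$ unspecified), and `times` is an arbitrary sample.

```python
A = 20.0   # assumed value of a, in m/s (hypothetical)
G = 9.81   # assumed value of g, in m/s^2 (hypothetical)

def f(t):
    """Position of the ball above the surface at time t."""
    return A * t - 0.5 * G * t**2

# f is a function object; f(1.0) is a number (a position).
print(callable(f))
print(f(1.0))

# The points (t, f(t)) that one would plot to show the trajectory over time.
times = [0.0, 0.5, 1.0, 1.5, 2.0]
points = [(t, f(t)) for t in times]
print(points)
```

Notice that `t` inside the definition of `f` has nothing to do with the `t` in the list comprehension; each is local to where it is used.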

For an example of a function with multiple inputs, consider a function $d$ such that $d(x,y) = \sqrt{x^2+y^2}$ for any $x,y \in \mathbb{R}$. $d(x,y)$ is in fact the Euclidean distance of a point $(x,y)$ from the origin. Note that it is valid to take the square-root since $x^2+y^2 \ge 0$ for any $x,y \in \mathbb{R}$. In general whenever we define a function we should think about whether the definition is valid, which means that it must give a well-defined output for any valid combination of inputs.
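A quick sketch of this two-input function in Python, using the standard `math.sqrt`. The second part shows what goes wrong with a definition that is not valid on all of its claimed inputs; the name `bad` is hypothetical and only there for contrast.

```python
import math

def d(x, y):
    # x**2 + y**2 is never negative, so the square root is always defined.
    return math.sqrt(x**2 + y**2)

print(d(3, 4))
print(d(0, 0))

def bad(x):
    # Claimed to accept any real x, but the square root fails for x < 0.
    return math.sqrt(x)

try:
    bad(-1.0)
    well_defined = True
except ValueError:
    # math.sqrt raises ValueError for negative floats.
    well_defined = False

print(well_defined)
```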

For another example, consider $\max$ such that $$\max(x,y) = \begin{cases} x & \text{if } x \ge y \\ y & \text{if } x < y \end{cases}$$ The notation used here is a way of defining different values in different cases. In this example $\max(x,y) = x$ if $x \ge y$ and $\max(x,y) = y$ if $x < y$. Note that in general the order of the inputs to a function matters, although for $\max$ swapping the two inputs happens to give the same output.

In general a lot of things in mathematics can be thought of as functions. $x+y$ can be thought of as $\text{sum}(x,y)$ where $\text{sum}$ is a 2-input function. $x<y$ can be thought of as $\text{lessthan}(x,y)$ where $\text{lessthan}$ is a 2-input function that outputs a boolean ($true$ or $false$). When a function's output is always boolean, the function can be called a predicate, which plays a major role in logic.
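These multi-input functions translate almost directly into Python. In this sketch the names `maximum`, `sum2` and `lessthan` are my own (chosen to avoid shadowing Python's built-in `max` and `sum`), and the sample inputs are arbitrary.

```python
def maximum(x, y):
    # Case definition: x if x >= y, otherwise y.
    if x >= y:
        return x
    return y

def sum2(x, y):
    # x + y viewed as a 2-input function.
    return x + y

def lessthan(x, y):
    # A predicate: its output is always a boolean.
    return x < y

print(maximum(3, 5), maximum(5, 3))
print(sum2(2, 2))
print(lessthan(1, 2), lessthan(2, 1))
```

Swapping the inputs changes the output of `lessthan` but not of `maximum` or `sum2`, so whether the order matters depends on the function.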

Finally, I did not talk about set theory because that is not the only way of defining a function and is in my opinion not the most intuitive way. What I said above holds whether functions are defined in terms of sets or not.


I'm going to take a little bit of a different approach from that commonly found in textbooks, and instead of giving you mathematical explanations and examples, I'll try to explain it in terms of the semantics and usage of functions. It seems to me that this is where the confusion actually lies (not just for you - for most other people too).

First off, the definitions others have posted here and that you probably can find in your textbook are correct. I'll put it into layman's terms:

A function is a relationship between a set of inputs and outputs, where each input produces only one output.

That's a pretty broad definition, which probably doesn't sound too useful, and which also probably doesn't go far in explaining all of the $y = x$ and $f(x) = x^2$ stuff you are seeing.

So let's think about this definition a little bit first. What we are talking about is an abstract idea - we are talking about the concept of a "thing" which takes in information and gives back some other information. What makes a function special is the requirement that it has to be predictable - one input can't give two or more different outputs.

This isn't too hard of an idea to grasp - you are already familiar with lots of things that do that. For example, 2 + 2 = ? You know the answer to this is four - you look at the + symbol, then look at the inputs: 2 and 2, and you know you are supposed to add them together. Here the + symbol is an operator that represents the addition function. This is a language used to represent the abstract concept of addition, and we use it so that we can communicate the idea of adding two numbers together to get a sum in a concise manner.

I think this is where you are getting stuck. You know that if you have something like:

$$ y = mx + b $$

that you can "plug in numbers" for m, x, and b, then follow the order of operations to get an answer for y = ?

You also said that we can just write $f(x) = mx + b$, which is true. Why don't we? The answer is: it's just semantics. The concept of taking some numbers, plugging them into an expression, and evaluating to get a single answer is the function. $y = mx + b$ is a representation of the function - it's a convenient, concise notation. The same is true for $f(x) = mx + b$.

So why do we have both notations, and what is their use? Well, the one you are used to: $y = mx + b$ is good when y represents some quantity, measurement, or other thing that you understand pretty well. For example, this is a linear equation, and since we typically plot lines on Cartesian axes and label them x and y, in this case y would represent the y-coordinate of points on the line. We usually graph independent variables on the x axis, so $x$ is typically our input. $m$ is the slope and $b$ is the y-intercept, and these don't change for a given linear relationship, so we call them parameters - they define the linear function. Since we are used to seeing it written like this in this context, there is not much confusion about what the inputs and outputs are - $x$ is usually the input, $y$ is usually the output. The notation matches up with the common usage, and it's convenient and simple, so we use it.

The new notation - the one that is confusing, with the $f(x)$ [and later the $g(x)$ and $h(x)$ and really anything(anything_else)] serves a slightly different purpose. The $f(x)$ part tells you what the inputs are, explicitly. If I say $f(x) = mx + b$, I am telling you directly that x is an input. Once I "plug in" a value for $x$, then whatever the right hand side ends up evaluating to is the output.

So why would we want to use this? For a linear equation, we probably wouldn't, at least not in most cases. When learning this stuff though, we often start with that so that you can see how it all works. There are two major advantages to $f(x)$ notation, though:

  1. You can exactly communicate what the intended input variables are
  2. You can write an expression for a function even when you don't know what the exact relationship is

#2 is very important. This is why you need to learn this notation at this level - because soon you are going to start working with functions in an even more abstract sense - you will do things like this:

"I have this big equation with $y$ in it that I can't solve yet. I know that y is a function of x and t, but I don't yet know what that relationship is. So, I'll let $y = f(x,t)$, then use that as a placeholder while I do a bunch of things like take derivatives and integrate in order to solve the main problem."

In other words, it lets us write down what we do know when we don't know everything, which happens pretty often.

Now you probably have the answers to your three questions from reading that, but just in case it wasn't clear:

$y=f(x)$ → This is one of the main reasons I have difficulties understanding functions... What does the above statement tell me, and if $y$ is a function why do we use $y=$ at all for a formula like $y=mx+b$? Would it be the same as writing $f(x)=mx+b$?

It would be the same - we use $y$ when we don't need to be very clear about which variable(s) are inputs.

Something like $y=x^2$ is apparently a function...... but where is the function name? Which is the input and which is the output?

This function has no name yet - although you could just say "the function y equals x squared." More likely you would say "y is a function of x." The function happens to be the one given by $f(x) = x^2$.

Lastly another example: Let me suppose $f$ = distance, $f(t)=t^2$, $f(2)=4$ → Does this mean distance is 4.. which is the input which is the output?

Yes, if distance $d$ is a function of t, and $d = f(t) = t^2$, then when $t = 2$, $d = 4$. The input is 2, which is "plugged in" to the variable $t$, and the output is 4 - the answer you get after evaluating the expression.