Confusion with derivative of the Dirac Delta function.

OK, let me stitch all of the comments together into an answer. Pretty much every comment above touches on an important issue, but these points are not independent of one another.

First my own comment. The "correct" definition of the convolution of two functions $f,g:\mathbb{R}\to \mathbb{R}$ is $$ (f*g)(t)=\int_{-\infty}^\infty f(x)g(t-x)dx $$ You explained in a comment that you're using a less mainstream definition in your class. But in doing so you're encountering a problem, and for good reason. I assume your definition is (I will denote your "convolution" by $f\star g$ to make the distinction) $$ (f\star g)(t)=\int_0^t f(x)g(t-x)dx $$ But under what circumstances are the two definitions equivalent? As @user1952009 points out, $f*g$ reduces to $f\star g$ only if $f(t)=g(t)=0$ for all $t\leq 0$.
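If you want to see this equivalence concretely, here is a minimal numerical sketch (Python with NumPy; the helper names `conv_full` and `conv_half` are mine): for $f(t)=e^{-t}\theta(t)$ and $g(t)=t\,\theta(t)$, both of which vanish for $t\le 0$, the two definitions produce the same value.

```python
import numpy as np

def trapezoid(y, x):
    # plain trapezoidal rule (avoids relying on numpy's trapz/trapezoid rename)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

def conv_full(f, g, t, lo=-50.0, hi=50.0, n=200001):
    # (f * g)(t): integral of f(x) g(t - x) over the whole real line
    x = np.linspace(lo, hi, n)
    return trapezoid(f(x) * g(t - x), x)

def conv_half(f, g, t, n=200001):
    # (f star g)(t): integral of f(x) g(t - x) over [0, t] only
    x = np.linspace(0.0, t, n)
    return trapezoid(f(x) * g(t - x), x)

# both f and g vanish for t <= 0
f = lambda t: np.where(t > 0, np.exp(-t), 0.0)
g = lambda t: np.where(t > 0, t, 0.0)

print(conv_full(f, g, 2.0))  # ~1.1353 = 2 - 1 + e^{-2}
print(conv_half(f, g, 2.0))  # same value, as claimed
```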

Now comes the issue you are encountering: it is true that under the true convolution $\delta * f=f$ for any function $f$; in fact, it is more appropriate to take this as the definition of $\delta$. But $\delta \star f=f$ only for functions $f$ such that $f(t)=0$ for all $t\leq 0$. As a result, if $f=1$, then $$ (\delta\star 1)(t)=\int_0^t \delta(x)dx\neq 1 $$ since $1$ is not zero for $t\leq 0$. But what is this integral? As @paul mentions, $\frac{d}{dt}\int_0^t \delta(x) dx\neq \delta'(t)$; it actually equals $\delta(t)$. This brings us to the point of @tst. By definition, $\int_0^t \delta(x) dx$ is the anti-derivative of $\delta$, as we just saw by the fundamental theorem of calculus (assuming it makes sense for the delta "function"). But what can this function be? Well, it needs to be pretty much exactly like the constant function $1$, except it must vanish for $t\leq 0$. This is exactly the step function $$ \theta(t) = \begin{cases} 1 & t>0\\ 0 & t\leq 0 \end{cases}=\int_0^t \delta(x) dx $$ So let me wrap up the first half of my answer: the definition $f\star g$ of convolution is fine as long as you keep in mind that it is defined for functions which vanish for non-positive numbers.
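You can watch the step function emerge numerically. In this sketch (assumptions mine: a narrow Gaussian of width `eps` stands in for $\delta$, centered at a small `a > 0` to dodge the ambiguity of a spike sitting exactly at the endpoint $0$), the running integral $\int_0^t$ flips from $0$ to $1$ as $t$ crosses the spike:

```python
import numpy as np

def delta_eps(x, a=0.01, eps=1e-3):
    # nascent delta: narrow normalized Gaussian centered at a > 0
    return np.exp(-((x - a) / eps) ** 2 / 2) / (eps * np.sqrt(2 * np.pi))

def antiderivative(t, n=100001):
    # integral of delta_eps from 0 to t (zero for t <= 0)
    if t <= 0:
        return 0.0
    x = np.linspace(0.0, t, n)
    y = delta_eps(x)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

for t in [-1.0, 0.005, 0.5, 2.0]:
    print(t, antiderivative(t))  # ~theta(t - a): 0, ~0, ~1, ~1
```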


In this part, we will explore what this $\delta$-"function" actually is; in doing so, I hope to clarify a few things. As I said, let us take as the definition $$ f(t)=\int_{-\infty}^\infty \delta(x)f(t-x)dx= \int_{-\infty}^\infty f(x) \delta(t-x)dx\qquad(1) $$ for all $f$.

Point I: Let $a<b$ and consider $\rho(x)=\theta(x-a)-\theta(x-b)$, which is zero in $(-\infty, a]\cup(b, \infty)$ and one in $(a,b]$. What you find then is $$ f(t)\rho(t)= \int_{a}^b f(x) \delta(t-x)dx\Longrightarrow \begin{cases} f(t)=\int_{a}^{b} \delta(t-x)f(x)dx & a<t\leq b\\ 0=\int_{a}^{b} \delta(t-x)f(x)dx & \text{otherwise} \end{cases} $$ By defining a function $\tilde{\theta}(x)$ such that $\tilde{\theta}(0)=1$ and $\tilde{\theta}(t)=\theta(t)$ for $t\neq 0$, one can actually prove $$ \begin{cases} f(t)=\int_{a}^{b} \delta(t-x)f(x)dx & a{\color{red}\leq } t\leq b\\ 0=\int_{a}^{b} \delta(t-x)f(x)dx & \text{otherwise} \end{cases} $$ recall $a{\color{red}<}b$ throughout.

Point II: Now consider an even function $f$. Then $$ f(t)=f(-t)=\int_{-\infty}^\infty \delta(x)f(-t-x)dx= \int_{-\infty}^\infty\delta(x)f(t+x)dx= \int_{-\infty}^\infty\delta({\color{red}-x})f(t-x)dx $$ Now suppose $f$ is instead odd; then $$ f(t)=-f(-t)=\int_{-\infty}^\infty \delta(x)[-f(-t-x)]dx= \int_{-\infty}^\infty\delta(x)f(t+x)dx= \int_{-\infty}^\infty\delta({\color{red}-x})f(t-x)dx $$ Now consider a general function $f(x)$. Define $e(x)=[f(x)+f(-x)]/2$ and $o(x)=[f(x)-f(-x)]/2$. Then $f(x)=e(x)+o(x)$. Applying the above to $e$ and $o$ separately and adding, we have just found that $$ f(t)=\int_{-\infty}^\infty \delta(x)f(t-x)dx= \int_{-\infty}^\infty\delta({\color{red}-x})f(t-x)dx $$ So as far as $\delta(x)$ and $\delta(-x)$ interact with functions, we have $\delta(x)=\delta(-x)$. By abuse of language we say $\delta$ is an even "function".

Point III: Combining the two points, $$ \begin{cases} f(t)=\int_{a}^{b} \delta(x-t)f(x)dx & a\leq t\leq b\\ 0=\int_{a}^{b} \delta(x-t)f(x)dx & \text{otherwise} \end{cases} $$ for all functions $f$ and all $a<b$. Specifically I'm interested in the case $t=0$: $$ \begin{cases} f(0)=\int_{a}^{b} \delta(x)f(x)dx & a\leq 0\leq b\\ 0=\int_{a}^{b} \delta(x)f(x)dx & \text{otherwise} \end{cases}\qquad(2) $$ If you stare at the above equation long enough, trying to rationalize it and reconcile it with the ordinary intuition of functions, you will realize that the integrand $\delta(x)f(x)$ is somehow completely annihilating all the details of the function $f(x)$ except at the point $x=0$. It is as if this $\delta$ "function" is zero everywhere but at the origin. If you want to push the "function" agenda even further, you then ask: what is the value of $\delta(0)$? Well, we know that $1=\int_{-\infty}^\infty \delta(x) dx$. This would be impossible if $\delta(0)$ were any finite number, since then the integral would be zero! This immediately means $\delta(x)$ is NOT a function. But if one is really attached to functions, one can say $\delta(0)=\infty$.
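Here is a hedged numerical check of $(2)$ (my setup: a narrow Gaussian `delta_eps` stands in for $\delta$, and the endpoints are kept away from $0$ to avoid boundary subtleties): the integral recovers $f(0)$ when $0$ lies inside $[a,b]$ and vanishes otherwise.

```python
import numpy as np

def delta_eps(x, eps=1e-3):
    # narrow normalized Gaussian standing in for the delta function
    return np.exp(-(x / eps) ** 2 / 2) / (eps * np.sqrt(2 * np.pi))

def sift(f, a, b, n=400001):
    # integral of delta_eps(x) f(x) over [a, b]
    x = np.linspace(a, b, n)
    y = delta_eps(x) * f(x)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

print(sift(np.cos, -1.0, 2.0))  # ~1.0 = cos(0), since a < 0 < b
print(sift(np.cos, 0.5, 2.0))   # ~0.0, since 0 lies outside [a, b]
```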

Exercise: Start from (2) and prove (1), i.e. one can equivalently take $(2)$ as the definition of the delta function.

So am I saying "essentially that I am being taught a totally contradictory framework for the Dirac delta function"? No, not really! I hope you never have to teach the Dirac delta function to people who are seeing it for the first time. Because, boy, it is challenging from an educational point of view! No matter what approach the teacher chooses, something goes wrong. If the teacher does everything completely rigorously, then the essential intuition of the Dirac delta function is completely lost in all of the integral manipulations, which the students are still not completely comfortable with. If, however, the teacher chooses the "everywhere zero except at the origin, where it is infinity" route, then confusions like the one you just had (and a lot of other ones I've seen over the years) are born. My suggestion: learn both approaches at the same time, struggle to reconcile them, and figure out how far you can bend the misleading and wrong function-theoretic picture until it breaks.


Finally, the Laplace transform. By definition the Laplace transform of a function is $$ L[f]=\int_0^\infty f(x)e^{-sx}dx $$ If one insists on doing the same thing to the delta "function" (which actually has a very precise meaning in distribution theory), then $$ L[\delta] = \int_0^\infty \delta(x) e^{-sx}dx = 1 $$ Now let us understand what $\delta'(x)$ means. One defines $\delta'(x)$ via $$ \int_{-\infty}^\infty \delta'(x) f(x)dx=-f'(0) $$ Again, the function-theoretic motivation of this definition is integration by parts. A more rigorous derivation of the above definition as the "derivative" of the delta function requires us to answer "what the heck does the derivative of a not-really-a-function-thingy (a distribution) like $\delta(x)$ mean?". For that you need to read up a bit, and I can't possibly contain it here. Here the "wrong" function-theoretic picture and integration by parts let us be sloppy and not deal with this delicate issue.

With that being said, the Laplace transform now becomes $$ L[\delta']=\int_0^\infty \delta'(x) e^{-sx}dx =s $$ Note that, quite generally, for a function $f:[0, \infty)\to \mathbb{R}$, one has $L[f']=sL[f]-f(0)$. In a sense, and do not take this too seriously, the failure of $\delta$ to be a function is captured in the fact that $L[\delta']=sL[\delta]$, which is off only by a "constant" (as a polynomial in $s$), although allegedly that constant is $\delta(0)=\infty$! Again, do not read too much into this last part.
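As a sanity check, one can run the same nascent-delta game through the Laplace transform. In the sketch below (assumptions mine: a Gaussian spike at a small `a > 0` so it lies entirely inside $(0,\infty)$, with its derivative computed analytically), the transforms come out as $e^{-sa}\to 1$ and $s\,e^{-sa}\to s$ as $a\to 0^+$:

```python
import numpy as np

a, eps, s = 0.05, 1e-3, 2.0

def g(x):
    # narrow normalized Gaussian centered at a > 0 (nascent delta)
    return np.exp(-((x - a) / eps) ** 2 / 2) / (eps * np.sqrt(2 * np.pi))

def g_prime(x):
    # exact derivative of the Gaussian (nascent delta')
    return -(x - a) / eps**2 * g(x)

x = np.linspace(0.0, 1.0, 400001)
w = np.exp(-s * x)
trap = lambda y: float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

print(trap(g(x) * w), np.exp(-s * a))            # ~0.9048 = e^{-sa} -> 1
print(trap(g_prime(x) * w), s * np.exp(-s * a))  # ~1.8097 = s e^{-sa} -> s
```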


First you need to understand what the delta "function" is. Let's construct it as a limit. Let $$f_n(x) = n \,\mathbf{1}_{[-1/(2n),\,1/(2n)]}(x), $$ with $\mathbf{1}_A$ the indicator function of the set $A$. The functions $f_n$ are simple step functions, which take the value $0$ for $|x|>1/(2n)$ and the value $n$ for $|x|\le 1/(2n)$.

Now let's define the function $$ f(x) = \lim_{n\to\infty}f_n(x)$$ and wonder what meaning this limit can have. Trivially we see that if $x\ne0$ then the pointwise limit $\lim_{n\to\infty}f_n(x)$ makes sense and is $0$.

What happens when $x=0$? Then we have $\lim_{n\to\infty}f_n(0)=\infty$. This implies that we were wrong to assume that $f(x)$ was a function to begin with. Also, the fact that the pointwise limit at $0$ is infinity implies that we cannot have any stronger form of convergence. So let's try weaker ones.

First notice that for all $n$ we have $$ \int_{-\infty}^\infty f_n(x) dx =1 $$ so we can reasonably define $$ \int_{-\infty}^\infty f(x) dx =1. $$ So $f$ is not really a function but we are happy to assign a value to its integral.

Now take $\phi$ to be smooth and bounded on $\mathbb{R}$. Then what is the value of $$ I_n(\phi) := \int_{-\infty}^\infty f_n(x) \phi(x) dx? $$

Since $\phi$ is smooth, Taylor's theorem gives $\phi(x)=\phi(0)+O(x)$. OK, let's plug it in: $$ I_n(\phi)= \int_{-\infty}^\infty n \,\mathbf{1}_{[-1/(2n),\,1/(2n)]}(x) \phi(0) dx +\int_{-\infty}^\infty n \,\mathbf{1}_{[-1/(2n),\,1/(2n)]}(x) O(x) dx. $$ This implies that $$ I_n(\phi)= \phi(0) +O(1/n). $$ So we are happy to say that $$ I(\phi) := \lim_{n\to\infty}\int_{-\infty}^\infty f_n(x) \phi(x) dx = \phi(0). $$
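A quick numerical sketch of this convergence (names mine; the integral collapses to $n\int_{-1/(2n)}^{1/(2n)}\phi$ since $f_n$ vanishes elsewhere):

```python
import numpy as np

def I_n(phi, n, m=200001):
    # integral of f_n(x) phi(x): reduces to n * integral of phi over the box
    x = np.linspace(-1 / (2 * n), 1 / (2 * n), m)
    y = n * phi(x)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

for n in [1, 10, 100, 1000]:
    print(n, I_n(np.cos, n))  # 0.9589..., then closer and closer to cos(0) = 1
```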

Now the question is which object is the delta function, $f$ or $I$? If you are a physicist then your answer is probably $f$. If you are a mathematician then your answer is probably $I$.

The problem with the physicist's answer is that you would say things like "the delta function is infinite at $0$ and $0$ everywhere else", which is kind of OK but can be problematic. Let's see why. Let $$g_n(x) = n \,\mathbf{1}_{[-1/n,1/n]}(x) $$ and, as was done above, define $$ g(x) = \lim_{n\to\infty} g_n(x). $$ Then for the "function" $g$ we can say exactly the same things as above. Is it, however, the delta function? It is not, because its integral is $2$ and not $1$. The limit is actually $2$ times the delta function.
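Repeating the previous sketch with $g_n$ in place of $f_n$ makes the factor of $2$ visible (again a hedged toy computation, same conventions as above):

```python
import numpy as np

def I_n_g(phi, n, m=200001):
    # integral of g_n(x) phi(x): n * integral of phi over [-1/n, 1/n]
    x = np.linspace(-1 / n, 1 / n, m)
    y = n * phi(x)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

for n in [1, 10, 100, 1000]:
    print(n, I_n_g(np.cos, n))  # tends to 2 * cos(0) = 2, not 1
```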

So let's get to your question. What is the value of $1*\delta$? The answer depends on your definition of convolution. We saw that it is reasonable to write $$ \int_{-\infty}^\infty \delta(x)dx=1. $$ On the other hand, the integral $$ \int_{-x_0}^x \delta(t)dt $$ with some $x_0>0$ is a function of $x$: for $x<0$ it is $0$ and for $x>0$ it is $1$. We may assign a value at $x=0$ if we need to. This is the Heaviside step function, and it is reasonable to say that it is the antiderivative of the delta function. I took $x_0>0$ to avoid the issue at $0$. We can, however, do the same with your definition of convolution.

A bit about hyperfunctions.

Hyperfunctions bring complex analysis into real analysis, which I personally find extremely cool. The idea is to represent generalised functions as the difference of two analytic functions.

From now on I will say things like "$f$ is analytic on $\mathbb{R}$". This means that there exists an open set $A\subseteq\mathbb{C}$ containing $\mathbb{R}$ such that $f$ is analytic in $A$.

So how does it work? Obviously if you take two functions analytic on $\mathbb{R}$ then their difference is also analytic on $\mathbb{R}$, so we gain nothing. But if we take a function $F_+$ analytic on the open upper half-plane and a function $F_-$ analytic on the open lower half-plane, then their difference makes no sense, since there is no domain where they are both defined. :)

However! We may be able to define the difference of their limits!

So let's start by defining what a hyperfunction is. We denote a hyperfunction by $F=[F_+,F_-]$ and we say that the hyperfunction $F$ has $F_+$ as upper component and $F_-$ as lower component, with $F_+$ and $F_-$ as above.

So how does it act on test functions? Let $\phi$ be a function analytic on $\mathbb{R}$ which decays exponentially at real infinity, and let $A(\mathbb{R})$ be the space of all such functions; this space is a subspace of the Schwartz space. Then we define $$ F[\phi] = \int_{\mathbb{R}+\epsilon i} F_+(z)\phi(z)dz - \int_{\mathbb{R}-\epsilon i} F_-(z)\phi(z)dz. $$ So we integrate just above and just below the real line and take the difference. This means that $F$ is in the dual of $A(\mathbb{R})$, so let's see how many elements of the dual we can represent.

Let $$I(\phi) = \int_{\mathbb{R}} \phi(z) dz $$ Then obviously we can write $$ I = [1,0]=[1/2,-1/2]=[0,-1]. $$ This brings up an important point about hyperfunctions: let $\psi\in C^\omega(\mathbb{C})$; then $$ F=[F_+,F_-] =[F_++\psi,F_-+\psi]. $$ This comes from the fact that we subtract the two integrals. So the proper definition of the space of hyperfunctions requires taking the quotient by a suitable equivalence relation (which you can probably guess), but I will not go into details on that.

So what more can we do? Let's define $$ J(\phi) = \int_0^\infty \phi(z) dz. $$ Trivially it holds that $$ J(\phi)=I(H\cdot\phi), $$ with $H$ the Heaviside step function. Let's define $$ J = [-\frac{1}{2 \pi i}\log(-z),-\frac{1}{2 \pi i}\log(-z)], $$ then I claim that $J$ is actually the Heaviside step function. I will not go through this, but it is relatively simple to see using the definition $\log(z) = \int_1^z \frac{1}{t}dt$. What makes this work is that for $x<0$ the value of the logarithm is the same from above and below, while for $x>0$ it depends on the path of integration.
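One can see the claim numerically by comparing the boundary values from just above and just below the real axis (a sketch under my assumptions: NumPy's principal branch of the complex logarithm, and a small offset `eps` in place of the limit):

```python
import numpy as np

eps = 1e-8
F = lambda z: -np.log(-z) / (2j * np.pi)  # principal branch of log

x = np.array([-2.0, -0.5, 0.5, 2.0])
jump = F(x + 1j * eps) - F(x - 1j * eps)  # upper minus lower boundary value
print(np.round(jump.real, 6))             # ~[0, 0, 1, 1], i.e. H(x)
```

For $x<0$ the two logarithms agree and the jump vanishes; for $x>0$ they differ by $2\pi i$, and the prefactor $-\frac{1}{2\pi i}$ turns that into exactly $1$.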

Now let's do the delta function. We define as usual $$\delta(\phi) = \phi(0).$$ Let's recall Cauchy's integral formula, which states that if a function is analytic in a neighbourhood of the origin then $$ \frac{1}{2\pi i} \oint_C \frac{f(\zeta)}{\zeta} d\zeta = f(0), $$ and define $$ \delta = [-\frac{1}{2\pi i z},-\frac{1}{2\pi i z}]. $$ So how does this make any sense? First notice that for all $x\ne0$ the values of the "two" functions are the same, so the value of the hyperfunction is $0$. At $x=0$ it is complex infinity. But actually we can say much more about the origin: both functions have a simple pole there with coefficient $-\frac{1}{2\pi i}$. So when we integrate and take the difference, we can transform the integral into the Cauchy integral (I'm not going to actually do this here, but it is pretty straightforward).
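To make the pairing concrete: the difference of the boundary values of $-\frac{1}{2\pi i z}$ is the Lorentzian $\frac{\epsilon}{\pi(x^2+\epsilon^2)}$, a classic nascent delta, so integrating it against a test function recovers $\phi(0)$. A hedged sketch (my discretization choices; the limits are replaced by integrals along $\mathbb{R}\pm i\epsilon$ at finite `eps`):

```python
import numpy as np

def pair_delta(phi, eps=1e-3, L=20.0, n=1000001):
    # (upper - lower) boundary values of -1/(2 pi i z), paired with phi
    x = np.linspace(-L, L, n)
    jump = (-1.0 / (2j * np.pi * (x + 1j * eps))
            + 1.0 / (2j * np.pi * (x - 1j * eps)))  # = eps / (pi (x^2 + eps^2))
    y = (jump * phi(x)).real
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

print(pair_delta(np.cos))  # ~1 = cos(0)
```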

What is cool about the theory is that we can define the derivative in the most straightforward way possible: just by writing $$ F'=[F_+',F_-']. $$ The beauty of this can be shown by writing $$ J' = [(-\frac{1}{2 \pi i}\log(-z))',(-\frac{1}{2 \pi i}\log(-z))'] = [-\frac{1}{2\pi i z},-\frac{1}{2\pi i z}] = \delta. $$ So the delta function is the actual derivative of the Heaviside step function!

I will stop here. There is much more to this theory; it is not an easy subject, but it is a beautiful one. A relatively simple introduction is the book "Introduction to Hyperfunctions and Their Integral Transforms" by Urs Graf.