What is operational calculus and what should it be?*

Just multiply the differential.

Most people who have taken an undergraduate math course are probably familiar with the following situation. You are given an ordinary differential equation of the form

$\displaystyle \frac{\mathrm{d}y}{\mathrm{d}x}=f(x)$

and want to find the function $y(x)$ that solves it. What can you do? Well, just multiply both sides by $\mathrm{d}x$ and integrate the respective differentials,

$\displaystyle y(x)=C+\int_{x_0}^x\mathrm{d}\xi\,f(\xi)$ .

Now I might hear you scream, „No, that’s wrong. You can’t treat the derivative as a fraction,“ and you would be right. But what if I told you that it’s actually possible (assuming certain conditions are met), and that many mathematicians have been using these and similar techniques for many decades?

What may seem like a shady trick is actually known as operational calculus⁺. It was developed early on, at the birth of calculus, by Gottfried Wilhelm Leibniz, and later, during the 19th century, by British and Irish mathematicians such as Sylvester, Boole, Glaisher, Crofton, and Blizard. Another important figure was the self-taught physicist Oliver Heaviside, who used operational methods extensively in his work on electrical engineering and strongly promoted their use.

So, what is it all about? The main idea is to treat operators (especially the derivative operator) as if they were numbers and manipulate them algebraically. Take the example from before and write the derivative as $\hat{D}_x$ (I’ll denote operators with hats from now on). Note that we then identify $\hat{D}_x^nf(x)=\frac{\mathrm{d}^nf}{\mathrm{d}x^n}$ . With this, we can recast the original ODE as

$\displaystyle \hat{D}_xy(x)=f(x)$ ,

which after dividing by $\hat{D}_x$ leads to the formal solution

$\displaystyle y(x)=\hat{D}_x^{-1}f(x)$ .

„Obviously“, the inverse operation of differentiation is integration, so we get the same result as before. So why did I put „obviously“ in quotation marks? To keep it simple: whereas the derivative is a local operation on a function, the integral is not. We have to provide a lower bound for the integral to make any sense; in other words, there is no such thing as the one inverse of the derivative. While this is not an issue here (the lower bound only contributes a constant, which the derivative removes), it is something to keep in mind.
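To make the point about the lower bound concrete, here is a minimal numerical sketch. The test function $f=\cos$, the helper names `integrate`, `antiderivative` and `derivative`, and the use of Simpson’s rule are all my own illustrative choices, not part of the method itself. It builds two candidate „inverses“ with different lower bounds and shows that they differ by a constant while having the same derivative.

```python
import math

def integrate(f, a, b, n=2000):
    """Composite Simpson rule for the integral of f from a to b (n must be even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def antiderivative(f, x0):
    """One possible 'inverse derivative' of f: the integral from x0 to x."""
    return lambda x: integrate(f, x0, x)

def derivative(g, x, h=1e-4):
    """Central-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

f = math.cos
F0 = antiderivative(f, 0.0)   # lower bound 0
F1 = antiderivative(f, 1.0)   # lower bound 1

# The two candidates differ by the same constant at every x ...
offsets = [F0(x) - F1(x) for x in (0.5, 2.0, 3.5)]

# ... so differentiating either one gives back f(x).
d0 = derivative(F0, 2.0)
d1 = derivative(F1, 2.0)
print(offsets, d0, d1, f(2.0))
```

For this choice of $f$ the constant offset is the integral of $\cos$ from 0 to 1, i.e. $\sin(1)$.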

Okay, granted, the differential equation was pretty unimpressive, so how about spicing things up a bit! Take a look at the differential equation below:

$\displaystyle \hat{D}_x^2y(x)+\hat{D}_xy(x)-2y(x)=0$ .

I know it’s still not the most exciting ODE, but it’s a good demonstration of the powers of the operational method. First, we pull out the function and rewrite the equation in the form $(\hat{D}_x^2+\hat{D}_x-2)y(x)=0$ . Not much has changed, but we can now „factor“ the operator to arrive at

$\displaystyle (\hat{D}_x+2)(\hat{D}_x-1)y(x)=0$ .
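The factoring step is ordinary polynomial algebra in the symbol $\hat{D}_x$, which is legitimate here because the coefficients are constants. As a small sketch (the coefficient-list representation and the helper name `poly_mul` are my own, purely for illustration), one can multiply the two factors back together and confirm that the order does not matter:

```python
def poly_mul(p, q):
    """Multiply two polynomials in D given as coefficient lists [c0, c1, c2, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# Coefficients are stored lowest power first.
factor_a = [2, 1]     # D + 2
factor_b = [-1, 1]    # D - 1

product = poly_mul(factor_a, factor_b)    # should be D^2 + D - 2
swapped = poly_mul(factor_b, factor_a)    # same factors, other order

print(product)   # [-2, 1, 1]
print(swapped)   # [-2, 1, 1]
```

Note that this only works so simply for constant coefficients; with $x$-dependent coefficients the factors would in general not commute.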

Since the two factors commute, the equation holds whenever either factor alone annihilates $y(x)$ , that is, whenever

$\displaystyle (\hat{D}_x+2)y(x)=0\qquad\textsf{and/or}\qquad(\hat{D}_x-1)y(x)=0$ .

But these are just the ODEs of scaled exponential functions! Therefore, we find the solution as a superposition of two exponential functions:

$\displaystyle y(x)=A\mathrm{e}^{-2x}+B\mathrm{e}^{x}$ ,

for some constants $A$ and $B$ fixed by the initial conditions. Pretty interesting, and actually a shortcut compared with using an ansatz like $y(x)=c_\lambda\mathrm{e}^{\lambda x}$ . But what about non-homogeneous differential equations? For instance, take the simple ODE

$\displaystyle y(x)-\hat{D}_xy(x)=f(x)$ .

Performing the same steps as before (pulling out the function, etc.), and by treating the operator algebraically, we arrive at the formal solution

$\displaystyle y(x)=\frac{1}{1-\hat{D}_x}f(x)$ .

What to make of this expression? There are many possible interpretations of $\frac{1}{1-\hat{D}_x}$ . Quite naturally, one has the formal power series

$\displaystyle\frac{1}{1-\hat{D}_x}=1+\hat{D}_x+\hat{D}_x^2+\dotsm$ ,

which simply gives

$\displaystyle y(x)=\sum_{n=0}^\infty\hat{D}_x^nf(x)$ .
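For a polynomial $f$ the series terminates after finitely many terms, so it can be evaluated exactly. The following sketch assumes polynomials stored as coefficient lists (lowest power first); the helper names and the example $f(x)=x^2$ are hypothetical choices of mine. It sums the derivatives and then checks that $y-\hat{D}_xy$ reproduces $f$.

```python
def derivative(coeffs):
    """Derivative of a polynomial given as [c0, c1, c2, ...] (lowest power first)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def add(p, q):
    """Add two coefficient lists of possibly different length."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def series_solution(f):
    """Sum f + f' + f'' + ...; the loop ends once the derivative list is empty."""
    total, term = [], list(f)
    while term:
        total = add(total, term)
        term = derivative(term)
    return total

f = [0, 0, 1]                                     # f(x) = x^2
y = series_solution(f)                            # x^2 + 2x + 2
residual = add(y, [-c for c in derivative(y)])    # y - y'

print(y)          # [2, 2, 1]
print(residual)   # [0, 0, 1], i.e. x^2 again
```

For non-polynomial $f$ the sum does not terminate and convergence becomes a real question, which this sketch does not address.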

It’s a simple task to check the correctness of this expression (although some may notice something missing). However, computing infinitely many derivatives is probably not the best solution to a differential equation, and in fact we can do better! Recall the integral

$\displaystyle \frac{1}{a}=\int_0^\infty\mathrm{d}x\,\mathrm{e}^{-ax}$ ,

which holds for $a>0$ and is known to some (especially physicists) as Schwinger parameterization. Under the defining assumption of operational calculus, namely treating operators algebraically, we can use it to cast the formal solution in the form

$\displaystyle \frac{1}{1-\hat{D}_x}f(x)=\int_0^\infty\mathrm{d}\xi\,\mathrm{e}^{-\xi(1-\hat{D}_x)}f(x)=\int_0^\infty\mathrm{d}\xi\,\mathrm{e}^{-\xi}\mathrm{e}^{\xi\hat{D}_x}f(x)$ .

„What?“ you might say, „the exponential of the derivative? Not yet another function of the derivative“. But don’t worry (and I’ll come back to this in detail later in another post), this is the very famous representation of the shift operator, which, as the name suggests, just shifts the argument, i.e. $\mathrm{e}^{\xi\hat{D}_x}f(x)=f(x+\xi)$ . Therefore, we can write the integral as

$\displaystyle \int_0^\infty\mathrm{d}\xi\,\mathrm{e}^{-\xi(1-\hat{D}_x)}f(x)=\int_0^\infty\mathrm{d}\xi\,\mathrm{e}^{-\xi}f(x+\xi)$ ,

which has an equivalent form you would also be able to find in a textbook:

$\displaystyle y(x)=\mathrm{e}^{x}\int_x^\infty\mathrm{d}\xi\,\mathrm{e}^{-\xi}f(\xi)$ .

Again, it’s a simple task to check the accuracy of this expression. So, finally, let us talk about the two elephants in the room. Where is the homogeneous solution $C\mathrm{e}^x$ ? And what happens if $f(x)$ grows faster than $\mathrm{e}^x$ , so that the integral does not converge?

To answer the first point, for reasons I won’t go into here and which could be summarized by the question „What is $\frac{1}{1-\hat{D}_x}\,0$ ?“, this approach picks out one particular solution, namely the one for which $\mathrm{e}^{-x}y(x)$ vanishes as $x\to\infty$ . Since $C\mathrm{e}^x$ never satisfies that condition unless $C=0$ , it won’t appear. To answer the second point, let’s just say that operational calculus is still somewhat of a shortcut. We haven’t discussed convergence or the existence of the solution, so it’s really no surprise that there are limitations. Nonetheless, in this case it is fairly easy to see that the upper bound of the integral in the textbook form can actually take any value, since changing it only adds a multiple of $\mathrm{e}^x$ (which can be checked by differentiating under the integral sign, aka the Leibniz rule).
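Both claims can be checked numerically. The sketch below is only an illustration under assumptions of my own: the test function $f=\sin$, a Simpson rule truncated at $\xi=40$ in place of the infinite upper bound, and a central difference for the derivative. For $f=\sin$ the shifted integral has the closed form $\tfrac{1}{2}(\sin x+\cos x)$ , which gives something to compare against.

```python
import math

def simpson(g, a, b, n=4000):
    """Composite Simpson rule for the integral of g from a to b (n must be even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def y_shift_form(f, x, cutoff=40.0):
    """Integral of e^{-xi} f(x + xi) over xi from 0 to infinity, truncated at cutoff."""
    return simpson(lambda xi: math.exp(-xi) * f(x + xi), 0.0, cutoff)

def y_textbook_form(f, x, upper):
    """e^x times the integral of e^{-xi} f(xi) from x to a finite upper bound."""
    return math.exp(x) * simpson(lambda xi: math.exp(-xi) * f(xi), x, upper)

def residual(y, f, x, h=1e-4):
    """y(x) - y'(x) - f(x), with y' taken from a central difference."""
    dy = (y(x + h) - y(x - h)) / (2 * h)
    return y(x) - dy - f(x)

f = math.sin
x = 0.3
exact = 0.5 * (math.sin(x) + math.cos(x))   # closed form for f = sin
approx = y_shift_form(f, x)

# Residual of the ODE y - y' = f for the shifted form and for a finite upper bound.
r_inf = residual(lambda t: y_shift_form(f, t), f, x)
r_fin = residual(lambda t: y_textbook_form(f, t, upper=5.0), f, x)
print(approx - exact, r_inf, r_fin)
```

All three printed numbers should be close to zero, up to discretization error. The finite upper bound `5.0` is arbitrary; any other value changes the result only by a multiple of $\mathrm{e}^x$ .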

To conclude this brief introduction, let me say that this procedure can be generalized to much more complicated ODEs. Take a general differential equation like

$\displaystyle \hat{\mathbb{D}}_xy(x)=f(x)$ ,

for some differential operator $\hat{\mathbb{D}}_x$ . Its formal solution can always be written as

$\displaystyle y(x)=\hat{\mathbb{D}}_x^{-1}f(x)$ .

However, the treatment of the resulting inverse operator can be arbitrarily complicated and may require more in-depth study. But as we have already seen, we can reformulate the formal solution as an integral of the form

$\displaystyle y(x)=\int_0^\infty\mathrm{d}\xi\,{\color{red}\mathrm{e}^{-\xi\hat{\mathbb{D}}_x}}f(x)$ .

Maybe then it’s a good idea to focus a bit on operators (as the one highlighted in red) that are similar to the shift operator^#. But for now that should be enough. I hope I have illustrated the general idea of treating operators algebraically. In future posts I will cover more interesting examples outside of ODEs, so stay tuned!

See you next time, Cheers!

A nice read is Operational Calculus and Related Topics by Glaeske, Prudnikov, and Skórnik.

*freely adapted from Dedekind
⁺Actually, the precise definition is more nuanced than that. But it captures the idea well
^#A hint to the next post

Pseudo-Riemannian Manifold of Random Ideas