Variant of parabolic maximum principle: $\partial_t u\ge \Delta_x u + b\cdot\nabla_x u-u \;\Rightarrow\; u\ge e^{-t}\inf_y u(0,y)$

Note that it suffices to show the inequality on $[0, C] \times \mathbb R^d$ for any $C>0$.

Part I: Changing $u$ to $w$: Write $U = \inf_y u(0,y)$ and let $w = u - e^{-t} U$. Then $\inf_y w(0,y) = 0$ and

\begin{align} \partial_t w &= \partial_t u + e^{-t} U \\ &\ge \Delta u + b \cdot \nabla u - u + e^{-t} U \\ &= \Delta w + b \cdot \nabla w - w. \end{align} Thus we have $$\tag{3} \partial_t w \ge \Delta w + b \cdot \nabla w - w$$ and we need to show that $w\ge 0$ given $\inf_y w(0,y) = 0$ and (3).

Part II: apply Parabolic Maximum Principle to $w$ (Heuristic): We argue by contradiction. Assume the contrary that $w<0$ somewhere. For the moment assume that the infimum $$\inf w = \inf_{(t, y) \in [0,C] \times \mathbb R^d} w (t, y)$$ is attained at some $(t_0, y_0)$. Then $t_0>0$ and thus we have at $(t_0, y_0)$, $$\tag{4} (\partial_t-\Delta) w \le 0, \ \ \nabla w=0.$$ Then (3) implies that $w(t_0, y_0)\ge 0$, which is a contradiction.

Part III: Apply Non-compact Parabolic Maximum Principle: In general, due to the non-compactness of $\mathbb R^d$ such a point $(t_0, y_0)$ might not exist. There are several ways to deal with this. One of them is the Parabolic version of Omori-Yau maximum principle:

Parabolic Omori Yau Maximum Principle: Let $w : [0, C]\times \mathbb R^d \to \mathbb R$ be a $C^2$ function so that $\inf_y w(0,y) = 0$, $\inf w <0$, and (Sublinear growth) $|w(t, y)|\le C o(|y|)$. Then there exists a sequence $(t_n, y_n)$ so that \begin{align} \lim_{n\to \infty} w(t_n, y_n) &=\inf w,\\ \tag{5}\lim_{n\to \infty} |\nabla w( t_n, y_n)|&= 0,\\ \limsup_{n\to \infty}\left(\partial_t - \Delta\right) w (t_n, y_n) &\le 0. \end{align}

(5) should be compared to (4).

Using the above Theorem (i.e. we assume that sublinear growth of $w$), we argue similarly as in part II. Assume the contrary that $\inf w<0$. Then the Theorem implies the existence of a sequence $(t_n, y_n)$ which satisfies (5). Plug $(t_n, y_n)$ into (3) and let $n\to \infty$ (boundedness of $b$ used), we obtain $\inf w \ge 0$, which is a contradiction.

The Omori-Yau Maximum Principle is well known, in particular if you are working in Riemannian Geometry/Ricci-flow. One of the reference is this, which is in the context of Mean Curvature Flow. I will add more accessible references later.

One could assume, instead of the sublinear growth condition, some bound on the $L^p$-norm of $u$ to obtain different version of non-compact Parabolic Maximum Principle (I will add more references later). Without any growth assumption, maximum principle is false even for heat equation: the classical nontrivial solution $T(t, y)$ to the heat equation constructed by Tychonov satisfies $T(0, y) = 0$ for all $y$, but $T$ attains some negative values for all $t\neq 0$.


Here is a probabilistic solution. Let $L(t)$ be the operator which sends a function $u$ to $ \Delta_x u+ b(t,\cdot ) \cdot \nabla_x u$. This family of operators generates the diffusion process $$dX_t = dB_t + b(t,X_t)dt,$$ which is inhomogeneous Markov. Let $p_{s,t}(x,y)$ be the associated transition densities, i.e., they are defined by the relation $p_{s,t}(x,y) dy = P(X_t \in dy|X_s=x)$, and they solve the adjoint equation $\partial_t p_{s,t}(x,y) = \Delta_y p_{s,t}(x,y) -\text{div}_y\big(b(t,y) p_{s,t}(x,y)\big)$ with initial data $\delta_x$ (a more abstract way to think about $p_{s,t}$ can be found in the comments below).

Then the solution of the equation $\partial_tv = L(t)v $ can be written as $v(t,x) = \int_{\Bbb R^d} v(0,y)p_{0,t} (x,y)dy$. Thus the solution of the equation $\partial_t w = L(t) w - w$ is given by $w(t,x)=e^{-t}\int_{\Bbb R^d} w(0,y)p_{0,t} (x,y)dy$.

Now consider any $u$ satisfying $\partial_t u \geq L(t) u -u$. Define $f(x,t) := \partial_t u - L(t)u+u\ge 0.$ Then $\partial_t u = L(t)u-u+f$, so by the Duhamel principle we can write $$u(t,x) = e^{-t}\int_{\Bbb R^d} u(0,y) p_{0,t}(x,y)dy+\int_0^t\int_{\Bbb R^d} e^{-(t-s)}p_{s,t}(x,y) f(s,y)\;dy\; ds.$$ The second integral is non-negative because $f$ is non-negative. The first integral is bounded below by $e^{-t} \inf_{\Bbb R^d} u(0,\cdot)$ because $p_{0,t}$ is a probability measure, completing the proof.

As Arctic Char points out in the comments above, $u(0,\cdot)$ should be reasonable enough so that these integrals are well defined, i.e. at-worst exponential growth will suffice to show that it can be integrated against $p_{0,t}$.

Sorry if this is unclear, I can try to update later.