Uniqueness of the maximizer of the Hardy inequality

Edit: No, it turns out that the maximizing sequence is far from unique. See Final Edit at the bottom...

This is not an actual answer to your question, just a suggestion regarding how to look at it, that it seems to me may be useful.

Observe that $$Hf(x)=\int_0^1 f(tx)\,dt.$$

So if we define $$f_t(x)=f(tx)$$ we can regard $$Hf=\int_0^1 f_t\,dt$$ as a vector-valued integral. This makes (H) more or less obvious: $$||Hf||_p\le\int_0^1||f_t||_p\,dt=||f||_p\int_0^1t^{-1/p}\,dt=\frac p{p-1}||f||_p.$$This makes it clear why there is no $f\ne0$ with $||Hf||_p=\frac p{p-1}||f||_p$; that would require that the norm of the integral equal the integral of the norm, implying that for almost every $t$ we have $f_t=c_tf$, certainly impossible for $f\in L^p$, $f\ne0$.

So regarding your question, perhaps you can get somewhere by investigating (maybe by Hahn-Banach) what follows if the norm of the integral is almost the integral of the norm...

Edit in answer to the question of what makes $x^{-1/p}$ special: Note first that there's no actual rigorous math from this point on, just fuzzy heuristics.

We've seen that if $f$ were a maximizer for $H$ then $f_t=c_tf$ for every $t>0$, which is impossible for a (non-zero) $L^p$ function. The obvious way to get $f_t=c_tf$ is to set $f(x)=x^\alpha$ (I think it's clear that this is essentially the only way, at least if we assume $f$ is continuous).

So say $f(x)=x^\alpha$. Then $\int_0^1 f^p<\infty$ if and only if $\alpha p>-1$, while $\int_1^\infty f^p<\infty$ if and only if $\alpha p<-1$. So, if you promise not to tell anyone I put it this way, setting $\alpha=-1/p$ makes $\int_0^\infty f^p$ "as close as possible to finite at both endpoints".

(Hmm. An actual true fact is that $\alpha=-1/p$ is the only choice that makes $\int_{1/A}^Af^p=O(\log A)$.)

Better yet, start over and look at it this way: In fact $H$ is just a convolution operator on the multipicative group $(0,\infty)$. This is really the right way to look at it, for example it makes both Hardy's inequality and the fact that almost-maximizers are what they are completely transparent.

We wimp out and make a change of variables so we can consider convolutions and Fourier transforms on $\Bbb R$ instead of on that group:

Define a surjective isometry $T:L^p((0,\infty))\to L^p(\Bbb R)$ by $$Tf(x)=e^{x/p}f(e^x).$$Define $\tilde H:L^p(\Bbb R)\to L^p(\Bbb R)$ by $$\tilde H=THT^{-1}.$$

You can calculate that $$\tilde Hf=f*K,$$where $$K(x)=e^{-((p-1)/p)x}\chi_{(0,\infty)}(x).$$Hence $$||\tilde Hf||_p\le||K||_1||f||_p=\frac p{p-1}||f||_p.$$(Since $T$ is a surjective isometry this is exactly Hardy's inequality, made totally obvious/motivated.)

And at least formally $$\widehat{\tilde Hf}=\hat K\hat f.$$Since $K\ge0$ it's clear that $$||\hat K||_\infty=||K||_1=\hat K(0).$$ This makes it at least very plausible that the almost-maximizers for $\tilde H$ should be $f$ such that $\hat f$ is supported near the origin (for $p=2$ that's not only plausible it's even true, by Plancherel). But if $f=1$ then $\hat f$ is literally supported on $\{0\}$. And $$T^{-1}1=x^{-1/p}.$$

Final Edit: No, that's wrong.

I was unable to prove that the last paragraph above actually works, even for $p=2$, which was supposed to be clear. In fact the plausibility argument was too fuzzy. Say $p=2$.. It's true that if $f_n$ is a maximizing sequence then $\widehat {f_n}$ must in some sense have most of its mass concentrated near the origin, but it does not follow that $f_n$ must be close to $1$ on a large set. If $\widehat {f_n}$ were $L^1$ with integral $1$ that would be true, but $\widehat {f_n}$ is not integrable. What's true is that $|\widehat {f_n}|^2\to\delta_0$ weakly, but that says nothing about $f_n$ tending to $1$.

And in fact there are jillions of other maximizers. Temporarily define $$\phi_n(x)=n^{-1/p}\phi(x/n),$$and note that $$||\phi_n||_p=||\phi||_p.$$

It turns out that if $f\in L^p(\Bbb R)$ and $f_n$ is defined as above then $(f_n)$ is a maximizing sequence for $\tilde H$. If it happens that $f$ is continuous at the origin then $f_n$ is approximately constant on compact sets, consistent with our wrong conjecture about a maxmizing sequence being essentially unique. But if $f$ oscillates suitably near the origin then every $f_n$ does a lot of oscillation on $[-1,1]$. So "$f=1$, except truncated to lie in $L^p$" is far from the only sort of thing that maximizes $\tilde H$, hence similarly for $x^{-1/p}$ and $H$.

Proof: If $p=2$ then it's easy to see from Plancherel that $$||f_n*K||_2\to||K||_1||f||_2=||\tilde H||\,||f||_2.$$This was the first thing I noticed indicating that the conjecture was wrong, further evidence that looking at $H$ in terms of convolutions is the right way to look at it. It's not hard to give a direct proof valid for $p>1$:

First, a change of variable shows that $$f_n*K=(f*K^n)_n,$$where $(.)_n$ is defined as above and $$K^n(x)=nK(nx);$$hence $$||f_n*K||_p=||f*K^n||_p.$$But $(K_n)$ is an approximate identity, except for the mis-normallization $\int K_n=p/(p-1)$. So $$\left|\left|f*K_n-\frac p{p-1}f\right|\right|_p\to0,$$hence $$||\tilde Hf_n||_p=||f*K^n||_p\to\frac p{p-1}||f||_p.$$