Why do we have to dump the extra entropy created in a heat engine?

We are talking about cycles here. After a complete cycle the system must be right back where it started. Since entropy is a state variable, after one complete cycle the entropy must be back at its "initial" value. The engine takes in entropy along with the heat it absorbs, and by the second law that entropy cannot be destroyed, so it has to "go somewhere else". If you "gathered more and more" entropy, it would no longer be a cycle.

If you did want to do what you propose and keep replacing engines, it would be extremely inefficient. You would get one "run" out of the process and then have to get a new engine (I am not sure how this would actually work). It is much better to run the same engine on a cycle.

In my understanding, we could gather more and more extra entropy, while converting all the heat into work

You can't store entropy while still converting all the heat into work. Storing an amount of entropy $dS$ requires that you also store an amount of energy $TdS$, where $T$ is the temperature of the object you're storing the entropy in.
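
To put numbers on this, here is a sketch under the best-case assumption that everything is reversible. The symbols are introduced here for illustration: the engine absorbs heat $Q_H$ from a source at temperature $T_H$, stores the resulting entropy in an object held at temperature $T_C$, and delivers work $W$.

$$\Delta S = \frac{Q_H}{T_H}, \qquad Q_C = T_C\,\Delta S = \frac{T_C}{T_H}\,Q_H, \qquad W = Q_H - Q_C = Q_H\left(1 - \frac{T_C}{T_H}\right)$$

So even in the ideal case, the fraction $T_C/T_H$ of the heat goes into storing the entropy and is not available as work.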

But then we could just go to the next machine and do the same, always converting all heat into work.

You can retain some of the entropy internally inside your heat engine rather than expelling it into some external reservoir such as a river or the air. Let's say you have a tank of water that stays inside your heat engine until you throw the heat engine away. You store entropy in this tank of water, which requires heating the water. There are two issues here: (1) As the water tank gets hotter, the energy cost of storing entropy in it, $TdS$, gets worse and worse. (2) The tank is no different from an external heat reservoir. You can keep it inside the box that holds your engine, but that doesn't matter. Our description of a heat engine abstracts away questions like where the low-temperature reservoir is physically located. The only real difference is that we normally idealize the low-temperature reservoir as an infinite resource, whose temperature never changes, while the tank is actually finite, and therefore worse thermodynamically because it heats up.
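
Point (1) can be illustrated with a small numerical sketch. It is only a toy model, and everything in it is an assumption made for illustration: the function name `run_engine_with_finite_tank`, a hot source at fixed temperature, a tank with constant heat capacity, and an ideal reversible engine that stores energy $T\,dS$ in the tank for every $dS$ it takes from the source.

```python
def run_engine_with_finite_tank(t_hot, t_tank0, heat_capacity, dq_hot, steps):
    """Toy model: reversible engine storing its entropy in a finite tank.

    t_hot         -- temperature of the hot source in K (assumed constant)
    t_tank0       -- initial tank temperature in K
    heat_capacity -- tank heat capacity in J/K (assumed constant)
    dq_hot        -- heat drawn from the hot source per step, in J
    steps         -- maximum number of steps to simulate

    Returns (total_work, final_tank_temperature, per_step_efficiencies).
    """
    t_tank = t_tank0
    total_work = 0.0
    efficiencies = []
    for _ in range(steps):
        if t_tank >= t_hot:
            # No temperature difference left, so no more work can be extracted.
            break
        # Entropy that arrives with the heat taken from the hot source.
        ds = dq_hot / t_hot
        # Energy that must be stored in the tank along with that entropy: T dS.
        dq_tank = t_tank * ds
        total_work += dq_hot - dq_tank
        efficiencies.append(1.0 - t_tank / t_hot)
        # The tank is finite, so storing that energy warms it up.
        t_tank += dq_tank / heat_capacity
    return total_work, t_tank, efficiencies


if __name__ == "__main__":
    work, t_final, eff = run_engine_with_finite_tank(
        t_hot=600.0, t_tank0=300.0, heat_capacity=1000.0, dq_hot=100.0, steps=1000
    )
    print(f"first-step efficiency: {eff[0]:.3f}")
    print(f"last-step efficiency:  {eff[-1]:.3f}")
    print(f"final tank temperature: {t_final:.1f} K")
    print(f"total work: {work:.0f} J")
```

In this model the per-step efficiency starts at $1 - T_{tank}/T_{hot}$ and falls with every step as the tank warms, which is the sense in which a finite tank is worse than an idealized reservoir.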

The reason you need to dump the heat is that engines, by definition, operate in a cycle. They return to a previous configuration before continuing on. So your solution of just using an unlimited number of one-shot devices is theoretically possible. It just wouldn't be called an engine. It also wouldn't be practical. One might, however, consider the big bang itself to be the ultimate one-shot device!

Engines also want this cycle to go in one direction, so we have to engineer them to do so. In theory, one could have a device which simply goes in either direction without dumping heat. We've built such devices -- little turbines that operate at the molecular scale where random molecular motion causes things to bump one way or another. However, we can't make them do work (we tried). To get work out of them, we need to know which way around the cycle they will go.

To make that happen, we target two equilibria, rather than one. One equilibrium is at maximum entropy, such as at the fullest expansion of a piston. Once we get there, we need to reset the machine, targeting a second equilibrium (such as with a piston at its most compressed). As we do this, we have to dump the heat because this second equilibrium is not the highest entropy state with all that heat in the system. We have to get rid of the heat before this second equilibrium is achieved.

Now you are allowed to use the heat to drive another engine. This is called a multi-stage engine, and such engines are used in many power plants. They can be more efficient than a single-stage engine. However, the laws of thermodynamics provide a hard limit on how efficient they can be, no matter how many stages you use. The resulting limit on efficiency is given by Carnot's theorem, and depends only on the temperatures of the hot source and the cold sink. (Note: only heat engines have this limit. Other devices, such as fuel cells, do not operate as heat engines, so they can achieve higher efficiency.)
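
To see why adding stages cannot beat that limit, here is a sketch for two stages. It assumes each stage is an ideal reversible engine and that all the waste heat of the first stage feeds the second. The intermediate temperature $T_M$ and the heats $Q_H$, $Q_M$, $Q_C$ are symbols introduced here for illustration.

$$\eta_{total} = 1 - \frac{Q_C}{Q_H} = 1 - \frac{Q_M}{Q_H}\cdot\frac{Q_C}{Q_M} = 1 - \frac{T_M}{T_H}\cdot\frac{T_C}{T_M} = 1 - \frac{T_C}{T_H}$$

The intermediate temperature cancels, and the same cancellation happens for any number of ideal stages. Staging helps real engines get closer to the limit, but the limit itself depends only on $T_H$ and $T_C$.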

The ultimate example of this is a Matrioshka Brain. This is a fantastic megastructure wrapped around a star to get as much usable work out of the fusion engine as possible. It is a massive heat engine which has a tremendously large number of stages (thousands to millions), where the waste heat from each stage is the hot source for the next stage. The result could theoretically get close to the ultimate ideal heat engine.

For a Matrioshka Brain around our star (the sun), we can calculate its efficiency. The sun is roughly 5800 K on its surface, so that's our hot temperature. Our low temperature is the cosmic background radiation of empty space, which is a mighty frigid 2.725 K. Plugging this into Carnot's equation, $\eta_{max}=1-\frac{T_C}{T_H}$, we get a maximum efficiency of 99.95%. These brains can be amazingly efficient, but they can never avoid the slow march of entropy!
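
The arithmetic is easy to check with a few lines of Python. This is just a sketch: the function name `carnot_efficiency` is made up for illustration, and the temperatures are the rough values quoted above.

```python
def carnot_efficiency(t_hot_kelvin, t_cold_kelvin):
    """Carnot limit 1 - T_C / T_H for absolute temperatures in kelvin."""
    if t_cold_kelvin <= 0 or t_hot_kelvin < t_cold_kelvin:
        raise ValueError("need 0 < T_C <= T_H, both in kelvin")
    return 1.0 - t_cold_kelvin / t_hot_kelvin


if __name__ == "__main__":
    # Rough solar surface temperature and cosmic background temperature.
    eta = carnot_efficiency(5800.0, 2.725)
    print(f"{eta:.2%}")  # prints 99.95%
```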