If a lens focuses all incoming light to a point, how do we get 2D images?

...if a lens bends all incoming rays of light to intersect at the focal point? Shouldn't this produce a single dot of light...?

(In your diagram, the source is at infinity. I will continue the analysis along those lines.)

It is true that all rays parallel to the axis focus to that single dot. Not all rays, however, are parallel to the axis.

Rays coming from different angles focus to different points. That is how an image is formed.
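As a rough illustration of the point above, assuming an ideal thin lens: a bundle of parallel rays tilted by an angle θ to the axis converges in the focal plane at a distance f·tan θ from the axis (follow the ray through the center of the lens, which is not deflected). The function name and the 50 mm focal length below are illustrative choices, not anything from the question.

```python
import math

def focal_plane_height(focal_length, angle_deg):
    """Distance from the axis at which a parallel bundle tilted by
    angle_deg converges in the focal plane of an ideal thin lens."""
    return focal_length * math.tan(math.radians(angle_deg))

f = 50.0  # focal length in mm (illustrative value)
for angle in (0.0, 5.0, 10.0):
    height = focal_plane_height(f, angle)
    print(f"{angle:4.1f} deg -> {height:6.2f} mm from the axis")
```

Only the 0 degree bundle lands on the focal point itself; every other direction gets its own spot in the focal plane, and those spots together make up the image.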

A convex lens does not focus all rays to a single point. It focuses all rays parallel to the axis onto the focal point. It also focuses all the rays emanating from a given point onto a corresponding point on the other side. That point is the image of the original point.

The standard diagram shows that a lens sends all incoming rays parallel to the axis to the focal point. Parallel rays can be thought of as emanating from a point at infinity. So you can think of the focal point as the image of a point at infinity.
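One way to make the "point at infinity" statement concrete is the standard thin-lens equation. It is not quoted in this answer, and the symbols d_o (object distance) and d_i (image distance) are introduced here only for illustration:

```latex
\frac{1}{d_o} + \frac{1}{d_i} = \frac{1}{f}
\quad\Longrightarrow\quad
d_i = \frac{f}{1 - f/d_o},
\qquad
\lim_{d_o \to \infty} d_i = f .
```

As the object point moves off to infinity, its image moves to the focal plane, which is the case the standard diagram draws.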

More general ray diagrams trace a few principal rays from a point on the object through the lens.

The full rules for drawing ray diagrams, as given in the HyperPhysics article on the subject, are:

  1. A ray from the top of the object proceeding parallel to the centerline perpendicular to the lens. Beyond the lens, it will pass through the principal focal point. For a negative lens, it will proceed from the lens as if it emanated from the focal point on the near side of the lens.
  2. A ray through the center of the lens, which will be undeflected. (Actually, it will be jogged downward on the near side of the lens and back up on the exit side of the lens, but the resulting slight offset is neglected for thin lenses.)
  3. A ray through the principal focal point on the near side of the lens. It will proceed parallel to the centerline upon exit from the lens. The third ray is not really needed, since the first two locate the image.
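The first two rules can be turned into a small calculation. The sketch below assumes an ideal thin positive lens at x = 0 with the optical axis along y = 0; the function name and the numbers are made up for illustration.

```python
def locate_image(f, d_o, h):
    """Intersect principal rays 1 and 2 for a thin positive lens at x = 0.

    The tip of the object sits at (-d_o, h). Returns (x, y) of the image tip.
    A negative x means the rays only meet when extended backward (virtual image).
    """
    if d_o == f:
        raise ValueError("object at the focal point: rays leave parallel, no finite image")
    # Ray 1: parallel to the axis before the lens, then through the
    # far focal point (f, 0):      y = h * (1 - x / f)
    # Ray 2: straight through the lens center:  y = -(h / d_o) * x
    # Setting the two expressions equal and solving for x:
    x = 1.0 / (1.0 / f - 1.0 / d_o)
    y = -(h / d_o) * x
    return x, y

x, y = locate_image(f=50.0, d_o=150.0, h=10.0)
print(f"image tip at x = {x:.2f}, y = {y:.2f}")
```

For these numbers the image tip comes out close to x = 75, y = -5: a real, inverted image at half the object's size. Ray 3 leaves the lens parallel to the axis at that same height, which is why it is not needed to locate the image.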

Your diagram shows parallel light beams originating from infinity. Light entering the eye in the real world isn't all parallel.

In the real world, all the light from a single point that passes through the lens will hit the sensor (eye-cone, digital sensor, etc.) at a single point, assuming that point is in focus. However, light from different points will strike the sensor at different places, creating the image that you see.
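Here is a minimal paraxial (small-angle) sketch of that statement, assuming an ideal thin lens. Rays leave an object point at several different slopes, and the code computes where each one lands on a sensor placed at the in-focus distance. All names and values are illustrative.

```python
def trace_to_sensor(y0, slope, d_o, f, d_s):
    """Paraxial trace: object plane -> thin lens -> sensor plane.

    y0 is the height of the object point, slope the initial ray slope.
    Returns the height at which the ray hits the sensor.
    """
    y_lens = y0 + d_o * slope          # straight-line travel to the lens
    slope_after = slope - y_lens / f   # thin lens bends the ray in proportion to its height
    return y_lens + d_s * slope_after  # straight-line travel to the sensor

f, d_o = 50.0, 150.0
d_s = 1.0 / (1.0 / f - 1.0 / d_o)  # sensor at the in-focus distance for this object plane

for y0 in (0.0, 10.0):
    hits = [trace_to_sensor(y0, s, d_o, f, d_s) for s in (-0.05, 0.0, 0.05)]
    print(f"object point at y = {y0}: sensor hits {[round(h, 6) for h in hits]}")
```

Within rounding, all three rays from each object point arrive at one sensor height, and the two object points arrive at different heights. That one-to-one mapping is the image.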

In-focus points

If the point is not in focus, it will be spread out on the sensor in a circle. Photographers call this the circle of confusion.
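The size of that circle can be estimated with similar triangles, again assuming an ideal thin lens with a circular opening: the light from one point forms a cone that narrows from the aperture diameter at the lens to zero at the point's true image distance, and the sensor cuts that cone wherever it happens to sit. The function and parameter names below are hypothetical.

```python
def blur_circle_diameter(f, aperture, d_o, d_s):
    """Diameter of the blur circle on a sensor d_s behind an ideal thin lens.

    f: focal length, aperture: diameter of the lens opening,
    d_o: object distance (must exceed f so that a real image forms).
    """
    if d_o <= f:
        raise ValueError("this sketch only handles real images (d_o > f)")
    d_i = 1.0 / (1.0 / f - 1.0 / d_o)       # where this point actually comes to focus
    return aperture * abs(d_s - d_i) / d_i  # similar triangles on the converging cone

f, aperture = 50.0, 25.0               # mm, illustrative values
d_s = 1.0 / (1.0 / f - 1.0 / 2000.0)   # sensor focused on an object 2 m away
for d_o in (1000.0, 2000.0, 4000.0):
    blur = blur_circle_diameter(f, aperture, d_o, d_s)
    print(f"object at {d_o:6.0f} mm -> blur diameter {blur:.3f} mm")
```

The point at the focused distance gives zero blur in this idealized model, while nearer and farther points spread into discs. The diameter scales with the aperture, so stopping down shrinks the discs.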

Out-of-focus point

Playing around with this tool should make it pretty obvious. I took these images from my answer to the Photography.SE question How does aperture work without “cropping” the image hitting the sensor?