Parallel RAM without a large number of pins?

The standard solution is probably QSPI (also called QPI or SQI). It is essentially an extension of the SPI interface that uses four bidirectional data lines (quad, hence the Q in the acronym: IO0/IO1/IO2/IO3) instead of a single signal for each direction (MISO/MOSI).

So the chips are very small (typically SO-8), and the interface is very efficient: you need to send the address for each read or write command, but then you can read multiple bytes in a burst, four bits per clock cycle. Max clock speed is typically ~104MHz for flash. It can be made even faster using Double Data Rate signaling (four bits at each clock edge, both rising and falling, so eight bits per clock cycle; flash chips typically max out at 80MHz in this mode).

The chip datasheets will provide all details about the exact meaning/usage of each signal. To illustrate, here is a read command timing diagram (in single data rate mode, and taken from this datasheet):

[Figure: quad read command timing diagram, single data rate mode]

Here, you can see that you need 14 clock cycles to get the first byte (at 80MHz, that means a 175ns access time). But if you need more bytes, each one adds just 2 cycles (25ns). So for burst reads it ends up much faster than a typical 70ns or even 45ns parallel flash chip.
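To make the comparison concrete, here is a small sketch of the arithmetic. The cycle counts and the 80MHz clock come from the diagram discussed above; the function names are made up for illustration, and the parallel-chip model (one full access time per byte on an 8-bit bus, no page mode) is an assumption.

```python
def burst_read_time_ns(n_bytes, clock_hz=80e6, first_byte_cycles=14,
                       cycles_per_extra_byte=2):
    """Time to read n_bytes in a single QSPI burst, in nanoseconds."""
    if n_bytes < 1:
        raise ValueError("n_bytes must be at least 1")
    cycles = first_byte_cycles + cycles_per_extra_byte * (n_bytes - 1)
    return cycles * 1e9 / clock_hz


def parallel_read_time_ns(n_bytes, access_time_ns):
    """Assumed model: an 8-bit parallel chip pays its access time for every byte."""
    return n_bytes * access_time_ns


def break_even_bytes(access_time_ns, max_bytes=1024):
    """Smallest burst length where the QSPI burst is faster, or None if never."""
    for n in range(1, max_bytes + 1):
        if burst_read_time_ns(n) < parallel_read_time_ns(n, access_time_ns):
            return n
    return None


if __name__ == "__main__":
    for n in (1, 4, 16, 256):
        print(n, burst_read_time_ns(n),
              parallel_read_time_ns(n, 70), parallel_read_time_ns(n, 45))
    print(break_even_bytes(70), break_even_bytes(45))
```

Under these assumptions, the burst read wins from 4 bytes onward against the 70ns part and from 8 bytes onward against the 45ns part; a single random byte is still faster on the parallel chip.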

You can easily find NOR flash parts with this interface from many manufacturers. Note that their performance (max speed, dummy cycle count) and features (Quad I/O or just Dual I/O, DDR support) vary, so check the datasheet.

RAM is a bit harder to find, but still available, notably from Microchip (e.g. 23LC512), ON Semi (e.g. N01S818HA) and ISSI (e.g. IS62WVS2568GBLL-45). These parts are slower than flash, though. Still, the ISSI part suggested above goes up to 45MHz (single data rate), with a minimum read cycle that apparently needs 11 clocks for the first byte. Put another way: about 200ns of overhead plus about 45ns per byte (180Mbit/s peak throughput), which is not bad, and exceeds the GRAM speed you indicated.
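The 180Mbit/s figure is a peak rate; the fixed command/address overhead eats into it for short bursts. Here is a rough sketch of the effective throughput as a function of burst length, assuming the 45MHz clock, 11 clocks for the first byte and 2 clocks for each following byte (the function name is hypothetical, and the exact cycle counts should be checked against the datasheet).

```python
def effective_throughput_mbit_s(n_bytes, clock_hz=45e6, first_byte_cycles=11,
                                cycles_per_extra_byte=2):
    """Payload bits per second for one burst, command overhead included."""
    if n_bytes < 1:
        raise ValueError("n_bytes must be at least 1")
    cycles = first_byte_cycles + cycles_per_extra_byte * (n_bytes - 1)
    seconds = cycles / clock_hz
    return n_bytes * 8 / seconds / 1e6


if __name__ == "__main__":
    for n in (1, 4, 16, 64, 256):
        print(f"{n:4d} bytes: {effective_throughput_mbit_s(n):6.1f} Mbit/s")
```

A single-byte read gets only about 33Mbit/s, while a 256-byte burst gets close to 177Mbit/s, so it pays to read in long bursts whenever the access pattern allows it.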

Also, note that a lot of high-end MCUs (from NXP, ST, ...) support this interface in hardware.

I'm posting this as another answer because it is something totally different.

There is another, less common interface that also fits your description nicely: HyperBus, designed by Cypress (it's proprietary).

This one uses DDR at much higher clock speeds (up to 166MHz) and an 8-bit bus, so you can reach about 2666 Mbit/s (wow!), which leaves QSPI far behind. It is also designed for higher-density DRAM rather than SRAM, so you can find 8M x 8 chips (vs 256k x 8 for the ISSI QSPI SRAM mentioned in the other post). It uses only 12 signals (supply voltages excluded).
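For a quick side-by-side, peak rate is just clock x data lines x transfers per clock. The sketch below uses the clock figures quoted in these two answers; the function and the table names are illustrative only, and command/address overhead is ignored.

```python
def peak_rate_mbit_s(clock_mhz, data_lines, ddr=False):
    """Peak transfer rate in Mbit/s, ignoring command/address overhead."""
    transfers_per_clock = 2 if ddr else 1
    return clock_mhz * data_lines * transfers_per_clock


# (clock in MHz, data lines, DDR?) as quoted in the answers
INTERFACES = {
    "QSPI flash, SDR": (104, 4, False),
    "QSPI flash, DDR": (80, 4, True),
    "QSPI SRAM, SDR": (45, 4, False),
    "HyperBus, DDR": (166, 8, True),
}

if __name__ == "__main__":
    for name, (mhz, lines, ddr) in INTERFACES.items():
        print(f"{name:16s} {peak_rate_mbit_s(mhz, lines, ddr):5d} Mbit/s")
```

With a clock of exactly 166MHz this gives 2656Mbit/s for HyperBus; the slightly higher figure quoted above presumably corresponds to a 166.67MHz clock.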

Here is a HyperRAM product from ISSI: IS66WVH8M8ALL. HyperFlash products are also available.

But this is a different category of product: it is more expensive and harder to source, the chips are typically BGA, and the interface is a bit more complex (due to the high speed and DDR). Also, fewer MCUs support it.