Securing HSM-stored symmetric data encryption keys (AES) in memory on server

You've asked several questions here, so I'm going to provide several answers, and a couple of clarifications.

I will be storing the AES key (DEK) in an HSM-based key management service (i.e. Azure Key Vault / AWS KMS) and will retrieve the key to encrypt/decrypt data on my Node.js server.

This is not generally what you want to do. If you're using an HSM, the goal is for the key to stay inside the HSM and never leave it. One common way to accomplish this is to store a key-encryption key (KEK) in the HSM and keep the data-encryption keys (DEKs), encrypted under the KEK, somewhere else. In the case of Azure Key Vault, the encrypted DEKs could be stored as retrievable secrets; AWS KMS supports the same pattern, usually called envelope encryption. Then, when you need to use a DEK, you pass the encrypted DEK to the HSM, where it is decrypted using the KEK, and the plaintext DEK is returned to you.
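As a rough illustration of that KEK/DEK flow, here is a minimal Node.js sketch. The `FakeHsm` class, its `wrapKey` and `unwrapKey` methods, and the choice of AES-256-GCM for wrapping are all assumptions made up for this example; it is a local stand-in so the code runs on its own, not the API of Azure Key Vault or AWS KMS.

```javascript
const crypto = require("crypto");

// Hypothetical stand-in for an HSM-backed service. The KEK lives only inside
// this object; a real HSM would perform wrap/unwrap on its own hardware.
class FakeHsm {
  #kek = crypto.randomBytes(32);

  // Encrypt (wrap) a plaintext DEK under the KEK with AES-256-GCM.
  // Output layout (an assumption of this sketch): iv | auth tag | ciphertext.
  wrapKey(dek) {
    const iv = crypto.randomBytes(12);
    const cipher = crypto.createCipheriv("aes-256-gcm", this.#kek, iv);
    const body = Buffer.concat([cipher.update(dek), cipher.final()]);
    return Buffer.concat([iv, cipher.getAuthTag(), body]);
  }

  // Decrypt (unwrap) a wrapped DEK and return the plaintext key.
  unwrapKey(wrapped) {
    const iv = wrapped.subarray(0, 12);
    const tag = wrapped.subarray(12, 28);
    const body = wrapped.subarray(28);
    const decipher = crypto.createDecipheriv("aes-256-gcm", this.#kek, iv);
    decipher.setAuthTag(tag);
    return Buffer.concat([decipher.update(body), decipher.final()]);
  }
}

const hsm = new FakeHsm();
const dek = crypto.randomBytes(32);          // a fresh data-encryption key
const wrappedDek = hsm.wrapKey(dek);         // this is what you store
const recovered = hsm.unwrapKey(wrappedDek); // plaintext DEK, only when needed
```

The point of the design is that only `wrappedDek` is ever persisted, and the KEK never appears outside the HSM object.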

1) Should I create a different AES data encryption key for each user? If so, what is the reasoning behind this? Just another layer of complexity?

Yes. The reason is that if one key is compromised, only the data encrypted with that key (the data for one customer, in other words) is exposed, not all of the data. In addition, it prevents many application flaws (other than flaws in key retrieval itself) from letting one customer accidentally gain access to another customer's data.
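To make the isolation argument concrete, here is a small sketch, assuming one AES-256-GCM key per user. The `userKeys` map and the `encryptFor` / `decryptFor` helpers are hypothetical names for this example, and the keys are held in plaintext only so the code is self-contained.

```javascript
const crypto = require("crypto");

// Hypothetical in-memory map of user ID -> DEK, for illustration only.
// In practice each DEK would be stored wrapped by the KEK, not in plaintext.
const userKeys = new Map([
  ["alice", crypto.randomBytes(32)],
  ["bob", crypto.randomBytes(32)],
]);

// Encrypt a string under the given user's own key.
function encryptFor(userId, plaintext) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv("aes-256-gcm", userKeys.get(userId), iv);
  const body = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { iv, tag: cipher.getAuthTag(), body };
}

// Decrypt a record under the given user's key; throws if the key is wrong.
function decryptFor(userId, { iv, tag, body }) {
  const decipher = crypto.createDecipheriv("aes-256-gcm", userKeys.get(userId), iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(body), decipher.final()]).toString("utf8");
}

const record = encryptFor("alice", "alice's data");
const ownRead = decryptFor("alice", record);

let crossReadFailed = false;
try {
  decryptFor("bob", record); // e.g. an application bug that mixes up users
} catch (err) {
  crossReadFailed = true;    // GCM authentication fails under the wrong key
}
```

Even if a bug hands Bob's key to a request for Alice's record, the decryption fails instead of returning her data.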

2) How best to secure the key against attacks, like when the server is hacked and traced / monitored?

Secure your servers. And follow the rest of the advice here, like only exposing keys when needed.

3) Is it better practice to retrieve the key at startup of the server, then store the key in memory on the server, OR retrieve the key each time the server needs to encrypt data for the user?

It is best to expose the keys for the minimum amount of time required to perform the encryption or decryption operations they are needed for. Retrieving them at server startup and holding them in memory for the life of the process exposes them for far longer than necessary. Whether it makes the most sense for you to retrieve them, say, when a user logs in, or to wait until an actual crypto operation needs to be performed, is up to your judgement. Again, the smaller the window the better, within reason.
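One way to keep that window small is to scope the key to a single operation and overwrite it afterwards. The sketch below assumes a synchronous operation; `unwrapDek` and `withDek` are hypothetical names, and `unwrapDek` is a local stub standing in for the real call to your key service. Note that zeroing a buffer in Node.js is best effort, since the runtime or the crypto library may hold copies you cannot reach.

```javascript
const crypto = require("crypto");

// Hypothetical stand-in for the call that asks the HSM to unwrap a DEK.
// Here it just returns a copy of a locally generated key so the sketch runs.
const storedKey = crypto.randomBytes(32);
function unwrapDek() {
  return Buffer.from(storedKey);
}

// Fetch the key, run one operation with it, then overwrite the buffer.
function withDek(operation) {
  const dek = unwrapDek();
  try {
    return operation(dek);
  } finally {
    dek.fill(0); // best effort: other copies may exist outside our control
  }
}

let leaked; // kept only to show that the buffer is zeroed afterwards
const sealed = withDek((dek) => {
  leaked = dek;
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv("aes-256-gcm", dek, iv);
  const body = Buffer.concat([cipher.update("secret", "utf8"), cipher.final()]);
  return { iv, tag: cipher.getAuthTag(), body };
});
```

After `withDek` returns, the buffer the callback saw contains only zeros. An asynchronous operation would need a variant that awaits the callback before the `finally` block runs.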