Delete old Windows Azure Diagnostics data from table storage (performance counters, etc.)

Is there some way to delete old data in these tables without breaking anything?

You would need to do this manually: first query the entities that need to be deleted, then delete them. The PartitionKey of the entities stored in these tables represents a date/time value (in ticks, left-padded with zeroes so that every key is a string of equal length). So you would take the from and to date/time values, convert each to ticks, pad it with leading zeroes to a 19-character string, and query on that PartitionKey range. Once you have the data on the client side, you send delete requests back to table storage.
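
As a rough illustration of the key conversion described above, here is a minimal Python sketch. The function names `to_partition_key` and `build_filter` are made up for this example, and it assumes the input datetimes are timezone-aware. Ticks are 100-nanosecond intervals counted from 0001-01-01 UTC.

```python
from datetime import datetime, timezone

# .NET ticks count 100 ns intervals since 0001-01-01T00:00:00 UTC.
_TICKS_EPOCH = datetime(1, 1, 1, tzinfo=timezone.utc)


def to_partition_key(dt):
    """Convert a timezone-aware datetime to a 19-character, zero-padded tick string."""
    delta = dt - _TICKS_EPOCH
    # Use integer arithmetic only, to avoid floating point rounding.
    ticks = (delta.days * 86400 + delta.seconds) * 10_000_000 + delta.microseconds * 10
    return str(ticks).zfill(19)


def build_filter(start, end):
    """Build an OData filter selecting entities with start <= PartitionKey < end."""
    return (
        f"PartitionKey ge '{to_partition_key(start)}' "
        f"and PartitionKey lt '{to_partition_key(end)}'"
    )
```

For example, `build_filter(datetime(2012, 1, 1, tzinfo=timezone.utc), datetime(2012, 2, 1, tzinfo=timezone.utc))` would produce a filter string covering January 2012 that can be passed to whatever table storage client you use.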

To speed up the whole process, there are a few things you could do:

  • When you query the data, use query projection to return only the PartitionKey and RowKey attributes, as these are the only two attributes needed for deletion.
  • For deletion, you could use entity batch transactions. This could speed up the delete operation considerably.
  • For faster deletes, you can spin up a VM in the same region as your storage account. That way you also avoid paying data egress charges.
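
To make the batching idea a bit more concrete, here is a small Python sketch that groups projected (PartitionKey, RowKey) pairs into delete batches. It relies on two constraints of entity batch transactions in table storage: all entities in one batch must share the same PartitionKey, and a batch can hold at most 100 entities. The function name `delete_batches` is hypothetical, and actually sending each batch is left to your storage client.

```python
from itertools import groupby

# Table storage allows at most 100 entities per batch transaction.
MAX_BATCH_SIZE = 100


def delete_batches(keys, max_size=MAX_BATCH_SIZE):
    """Yield lists of (PartitionKey, RowKey) pairs.

    Every yielded list shares a single PartitionKey and contains
    at most max_size pairs, so it can be sent as one batch.
    """
    # Sorting puts pairs with the same PartitionKey next to each other.
    for _, group in groupby(sorted(keys), key=lambda pair: pair[0]):
        batch = []
        for pair in group:
            batch.append(pair)
            if len(batch) == max_size:
                yield batch
                batch = []
        if batch:
            yield batch
```

With the `azure-data-tables` package, for instance, the pairs could come from `TableClient.query_entities` with `select=["PartitionKey", "RowKey"]`, and each batch could be submitted as delete operations through `TableClient.submit_transaction`. Treat that as one possible wiring rather than the only way to do it.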

I wrote a blog post some time ago that you may find helpful: https://gauravmantri.com/2012/02/17/effective-way-of-fetching-diagnostics-data-from-windows-azure-diagnostics-table-hint-use-partitionkey/.

Or even better, is there some way to configure a retention policy, or set it up so that the data doesn't keep accumulating forever?

Unfortunately there isn't, at least as of today. There is a retention setting, but it applies only to blobs.