What are the requirements for an application health monitoring system?

  • Whether the application is running.
  • Unusual CPU/memory/network usage.
  • Report any unhandled exceptions.
  • Status of various modules (if applicable).
  • Status of external components (databases, web services, file servers, etc.).
  • Number of pending background tasks (if applicable).
  • Maybe track usage of the application and report statistics on the most and least used functionality, so you know where optimizations are most beneficial.
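
As a rough illustration of how several of these items could feed one status report, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption for illustration: the `HealthMonitor` class, its method names, and the component names "database" and "fileserver" are hypothetical, not part of any existing monitoring product. Resource usage (CPU/memory/network) is left out because it normally needs OS-specific tooling.

```python
import queue
import time
from typing import Callable, Dict


class HealthMonitor:
    """Hypothetical aggregator for component checks, gauges and exception counts."""

    def __init__(self) -> None:
        self._checks: Dict[str, Callable[[], bool]] = {}
        self._gauges: Dict[str, Callable[[], int]] = {}
        self._started = time.monotonic()
        self.unhandled_exceptions = 0
        self.last_exception = None

    def register(self, name: str, check: Callable[[], bool]) -> None:
        """Add a module or external-component check that returns True when healthy."""
        self._checks[name] = check

    def register_gauge(self, name: str, read: Callable[[], int]) -> None:
        """Add a numeric reading, such as the number of pending background tasks."""
        self._gauges[name] = read

    def record_exception(self, exc: BaseException) -> None:
        """Call this from the application's top-level exception handler."""
        self.unhandled_exceptions += 1
        self.last_exception = repr(exc)

    def report(self) -> dict:
        """Run every check once and return a plain dict that can be logged or served."""
        components = {}
        for name, check in self._checks.items():
            try:
                components[name] = "ok" if check() else "failing"
            except Exception as exc:  # a check that raises counts as unhealthy
                components[name] = f"error: {exc}"
        return {
            "running": True,  # producing a report at all implies the process is up
            "healthy": all(state == "ok" for state in components.values()),
            "uptime_seconds": time.monotonic() - self._started,
            "unhandled_exceptions": self.unhandled_exceptions,
            "components": components,
            "gauges": {name: read() for name, read in self._gauges.items()},
        }


def check_fileserver() -> bool:
    # Stand-in for a real probe; it always fails here to show the error path.
    raise ConnectionError("fileserver unreachable")


monitor = HealthMonitor()
monitor.register("database", lambda: True)  # stand-in for a real connectivity test
monitor.register("fileserver", check_fileserver)

tasks: "queue.Queue[str]" = queue.Queue()
tasks.put("job-1")
tasks.put("job-2")
monitor.register_gauge("pending_tasks", tasks.qsize)

monitor.record_exception(ValueError("boom"))
report = monitor.report()
print(report)
```

Returning a plain dict keeps the sketch independent of how the report is delivered: the same data could be written to a log, exposed over HTTP, or pushed to whatever alerting the operations staff already uses.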

The answer is 'it depends'. Why do you need to monitor? How large is your operations staff? Do you need reporting? What is the application environment? Who cares if the application fails? Who cares if an exception happens? Are any of the errors recoverable? I could ask questions like these for a long time.