Ask HN: How do you handle observability and alerting at tiny companies?
What's your usual solution when you want better visibility into problems than you get from going through unstructured text logs with grep and vim, but your dev team of 1-10 people is too small to operate a bunch of complicated cloud software like Kubernetes and the ELK stack?
Depending on the scale of your actual systems, you might be surprised how far you can get with open source tools like Prometheus and Grafana if you simply don't configure them for huge scale.
You could for example just run one VM for all of your observability stuff, and stick these tools on it and store data to disk.
Alternatively, if you've got some money and your systems are OK with outbound internet connections, SaaS monitoring solutions like New Relic, Dynatrace, etc. are much more plug-and-play.
This. I've begun to roll out Prometheus counters to some of our internal services to get some visibility into them. There is nothing fancy about our Prometheus config; it's the absolute bare minimum required, running in a docker container (one of 3 docker containers in the entire business) on the same box as the Grafana panel.
It's not going to win any awards, but it works well enough for our needs while requiring zero maintenance.
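To give an idea of what "rolling out counters" amounts to, here is a minimal sketch, assuming a Python service. It is not our actual code: the `Counter` class and the metric name `jobs_processed_total` are made up for illustration, and in a real service you would more likely use the official Prometheus client library rather than hand-rolling this. It only shows that a counter is a number with labels, rendered in the plain-text format Prometheus scrapes.

```python
import threading


class Counter:
    """Illustrative thread-safe counter rendered in the Prometheus text format."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._values = {}  # maps a sorted tuple of (label, value) pairs to a float
        self._lock = threading.Lock()

    def inc(self, amount=1.0, **labels):
        # Sort the labels so the same label set always maps to the same key.
        key = tuple(sorted((k, str(v)) for k, v in labels.items()))
        with self._lock:
            self._values[key] = self._values.get(key, 0.0) + amount

    def render(self):
        """Return the text a /metrics endpoint would serve for this counter."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        with self._lock:
            for key, value in sorted(self._values.items()):
                # Note: label values are not escaped in this sketch.
                label_str = ",".join(f'{k}="{v}"' for k, v in key)
                suffix = "{" + label_str + "}" if label_str else ""
                lines.append(f"{self.name}{suffix} {value}")
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    jobs = Counter("jobs_processed_total", "Jobs processed by the worker.")
    jobs.inc(status="ok")
    jobs.inc(status="error")
    print(jobs.render(), end="")
```

Serve the output of `render()` over HTTP on a path such as `/metrics`, point a Prometheus scrape job at it, and you have something to graph and alert on.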
It mostly depends on what kind of problems you have. If you need metrics, Prometheus and its ecosystem are as simple as it gets, on or off Kubernetes. There are also good-quality “packages” for most “infrastructure as code” tools, such as Ansible.
For logs, there’s Loki, which is a much saner choice than ELK in 2025.
To have proper troubleshooting abilities, you will need a bit more than tooling. You will also need to spend some time instrumenting your apps with metrics, because Prometheus exporters can only take you so far (e.g. node_exporter for host-level stats, or other technology-specific exporters). At the very least, ensure that your apps log in a structured way.
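As a rough illustration of "logging in a structured way", here is a minimal sketch, assuming a Python app and only the standard library. The names `JsonFormatter` and `make_logger`, and the convention of passing extra fields under a `fields` key, are my own choices for the example, not something a particular tool requires. Each log line becomes one JSON object, which Loki or similar tools can filter by field instead of by regex.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Format each log record as a single JSON object per line."""

    def format(self, record):
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        # Assumed convention: callers pass extra={"fields": {...}}.
        payload.update(getattr(record, "fields", {}))
        if record.exc_info:
            payload["exc"] = self.formatException(record.exc_info)
        return json.dumps(payload, default=str)


def make_logger(name, stream=None):
    """Return a logger that writes JSON lines to stream (stderr by default)."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = make_logger("orders")
    log.info("order placed", extra={"fields": {"order_id": 42, "total_cents": 1999}})
```

The point of the design is that `order_id` ends up as its own key rather than buried in a message string, so you can query for it directly later.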
Give https://github.com/coroot/coroot a try for full visibility in minutes with eBPF
disclaimer: I'm a co-founder
I like http://papertrail.com; it’s super easy to integrate with any backend, offers alerts, and much more.
I face this problem running a single website with an API and a few side projects. Think a dozen docker containers on three servers in total. It's really hard to get a simple "wake me up if something breaks" system running.
I do like BetterStack heartbeats, but there is nothing similar for "pushing" a failure alert. You just push all your logs then configure filters in BetterStack logs.
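One workaround I can imagine is to turn a failure into a missing heartbeat: wrap the job and only send the ping when it succeeds. Here is a minimal sketch of that idea, assuming Python. `HEARTBEAT_URL` is a placeholder rather than a real endpoint, and `run_with_heartbeat` and `http_ping` are names I made up for the example.

```python
import logging
import urllib.request

# Hypothetical URL: substitute the heartbeat endpoint your provider gives you.
HEARTBEAT_URL = "https://example.com/heartbeat/your-token"


def http_ping(url, timeout=10):
    """Send a GET request to url; the response body is ignored."""
    with urllib.request.urlopen(url, timeout=timeout):
        pass


def run_with_heartbeat(job, url=HEARTBEAT_URL, ping=http_ping):
    """Run job() and ping the heartbeat only if it succeeds.

    On failure no ping is sent, so the monitoring service sees a missed
    heartbeat and raises the alert. Returns True on success, False otherwise.
    """
    try:
        job()
    except Exception:
        logging.exception("job failed; skipping heartbeat")
        return False
    ping(url)
    return True
```

The obvious downside is latency: you only get woken up once the heartbeat's grace period expires, not at the moment the job fails.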