SpO₂ is a measure of blood oxygen saturation, used in medical patient monitoring.
At Meili, we needed a tool to monitor our pods. We already use Vigil, which health checks our front page and backend, but the number of those services is limited: we do not spin up new front or backend servers dynamically (for now). When we create a new search engine for a user, however, we instantiate a Kubernetes pod, and we need to monitor the health of that service. Adding each of those URLs by hand to the Vigil config file is not a solution.
So we decided that we needed a simple tool, one that accepts HTTP requests to register and unregister URLs to health check. We use the new async/await Rust syntax along with tide for the HTTP server; no big deal here.
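To make the register/unregister idea concrete, here is a minimal sketch of the state such endpoints might manipulate. It deliberately leaves out the HTTP layer and uses only the standard library; the `Registry` type and its method names are hypothetical and are not taken from the SpO₂ code base.

```rust
use std::collections::BTreeSet;

/// Hypothetical in-memory registry of the URLs to health check.
/// The name and API are assumptions made for this illustration only.
#[derive(Debug, Default)]
pub struct Registry {
    urls: BTreeSet<String>,
}

impl Registry {
    /// Registers a URL. Returns `false` if it was already registered.
    pub fn register(&mut self, url: &str) -> bool {
        self.urls.insert(url.to_string())
    }

    /// Unregisters a URL. Returns `false` if it was not registered.
    pub fn unregister(&mut self, url: &str) -> bool {
        self.urls.remove(url)
    }

    /// Lists the registered URLs in sorted order.
    pub fn urls(&self) -> Vec<&str> {
        self.urls.iter().map(String::as_str).collect()
    }
}
```

An HTTP handler for a registration request would then boil down to a call to `register`, and the health checker would loop over `urls()`. Returning a boolean lets the caller tell a fresh registration apart from a duplicate.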
Our current cloud provider is Digital Ocean; therefore, we cannot host our SpO₂ service there, since a monitor should not go down together with the infrastructure it watches. We chose Scaleway as it is way cheaper, and it works out of the box. We also need persistent storage for the health-checked URLs. What would you do if those were only stored in RAM? What if the server restarts? I had already worked on a disk-backed key-value store in Rust named Sled, so we chose to rely on it.
In the last release, we made some improvements to the Slack notification system. We now batch status change events in groups of 40: SpO₂ sends one message containing at most 40 events, which limits channel spam. Each message also displays the HTTP status for an unhealthy measurement and the error message for an unreachable one.
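The batching described above can be sketched with the standard library's `chunks` method on slices. This is only an illustration under assumptions: the `Status` and `StatusChange` types, the message format, and the `batch_messages` function are hypothetical, and the actual Slack call is omitted.

```rust
/// Hypothetical outcome of a health check (names are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub enum Status {
    Healthy,
    /// The server answered, but with a failing HTTP status code.
    Unhealthy { http_status: u16 },
    /// The server could not be reached at all.
    Unreachable { error: String },
}

/// Hypothetical status change event for one URL.
#[derive(Debug, Clone, PartialEq)]
pub struct StatusChange {
    pub url: String,
    pub status: Status,
}

/// Maximum number of events per message, as stated in the text.
pub const MAX_EVENTS_PER_MESSAGE: usize = 40;

/// Formats a single event as one line of text.
fn describe(change: &StatusChange) -> String {
    match &change.status {
        Status::Healthy => format!("{} is healthy", change.url),
        Status::Unhealthy { http_status } => {
            format!("{} is unhealthy (HTTP {})", change.url, http_status)
        }
        Status::Unreachable { error } => {
            format!("{} is unreachable ({})", change.url, error)
        }
    }
}

/// Groups events into message bodies of at most 40 lines each.
pub fn batch_messages(events: &[StatusChange]) -> Vec<String> {
    events
        .chunks(MAX_EVENTS_PER_MESSAGE)
        .map(|chunk| chunk.iter().map(describe).collect::<Vec<_>>().join("\n"))
        .collect()
}
```

With 95 pending events, for instance, `batch_messages` yields three message bodies of 40, 40, and 15 lines, so the channel receives three messages instead of 95.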
SpO₂ does not support SSL/TLS by itself, for either the HTTP or the WebSocket endpoints. We needed this kind of security, so we looked at NGINX, a tiny little obscure reverse proxy server, which we configured with basic authentication. It is not an easy task, and because we are cool, we wrote the documentation to help you do the same.
Do not hesitate to share or star this project; pull requests are welcome 😊
And just as a side note: we actually measure machines, not humans.