Resiliency | Multinode

Hardware failures are unavoidable in computing, and Multinode is no exception. Fortunately, they are also extremely rare.

Functions, jobs and scheduled tasks: retries

If a function, job or scheduled task is interrupted by a hardware failure, it will be retried on the same input. Functions and jobs should therefore be implemented idempotently: when the code is retried, the second attempt should run cleanly and produce the same outcome as if the first attempt had succeeded.
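
As an illustration of what idempotent code can look like, here is a minimal sketch in plain Python. It does not use any Multinode API. The function name `process_item`, its parameters, and the choice of a local directory as the output store are all hypothetical, picked only for the example. The idea is that the result is stored under a key derived from the input, and is written atomically, so a retry after an interruption either finds the finished result or redoes the work from scratch.

```python
import json
import os
import tempfile
from pathlib import Path


def process_item(item_id: str, payload: dict, out_dir: Path) -> Path:
    """Hypothetical job body: compute a result and store it under item_id."""
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"{item_id}.json"

    # A previous attempt may already have completed; if so, reuse its result.
    if target.exists():
        return target

    # The result depends only on the input, so every attempt computes the same value.
    result = {"item_id": item_id, "total": sum(payload["values"])}

    # Write to a temporary file first, then rename it into place.
    # An attempt interrupted before the rename never leaves a half-written
    # result at the target path.
    fd, tmp_name = tempfile.mkstemp(dir=out_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(result, f)
    os.replace(tmp_name, target)
    return target
```

Calling `process_item` twice with the same `item_id` and `payload` leaves exactly one result file with the same contents. Operations that are not naturally repeatable, such as appending to a log or incrementing a counter, would need a similar input-derived key to detect work that has already been done.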

Services and daemons: self-healing

For services and daemons, any process interrupted by a hardware failure is immediately replaced.