Knowledge Updates

Observations while developing web applications and creating great software.

  • System-Theoretic Accident Model and Processes to improve resilience in production systems ↗

    Tim Falzone and Ben Treynor Sloss at Google:

    In the face of increasing system complexity and emerging challenges, we at Google are always asking ourselves: what’s next? How can we continue to push the boundaries of reliability and safety?

    To address these challenges, Google SRE has embraced systems theory and control theory. We have adopted the STAMP (System-Theoretic Accident Model and Processes) framework, developed by Professor Nancy Leveson at MIT, which shifts the focus from preventing individual component failures to understanding and managing complex system interactions.

    System failures often have subjective root causes. Asking different questions leads to different outcomes:

    Instead of asking “What software service failed?” we ask “What interactions between parts of the system were inadequately controlled?” In complex systems, most accidents result from interactions between components that are all functioning as designed, but collectively produce an unsafe state.

    The concept of a system entering a hazard state is a good one.

    Hazard states are not system failures; they are unsafe conditions that can lead to failures. Automated and manual tooling that detects when a system has entered a hazard state can help prevent disasters.
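
    To make the distinction between a hazard state and a failure concrete, here is a minimal sketch. It is not Google's tooling and not part of STAMP itself; the `Snapshot` fields and the `max_queue_depth` threshold are hypothetical and chosen only for illustration. It shows a check that reports a hazard while every individual reading is still within what its own component would call normal operation.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    SAFE = "safe"
    HAZARD = "hazard"  # unsafe condition, nothing has failed yet
    FAILED = "failed"


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical readings from a queue-backed service (illustrative only)."""
    queue_depth: int
    consumers_running: int
    serving_errors: int


def classify(snapshot: Snapshot, max_queue_depth: int = 1000) -> State:
    """Classify a snapshot as safe, in a hazard state, or failed."""
    # Users are already seeing errors: this is a failure, not a hazard.
    if snapshot.serving_errors > 0:
        return State.FAILED
    # No errors yet, but the system can no longer keep itself safe:
    # either nothing is draining the queue or the backlog is too large.
    if snapshot.consumers_running == 0 or snapshot.queue_depth > max_queue_depth:
        return State.HAZARD
    return State.SAFE
```

    For example, `classify(Snapshot(queue_depth=50, consumers_running=0, serving_errors=0))` returns `State.HAZARD`: nothing has failed, but the queue will only grow until someone or something intervenes.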

  • Experimental middleware in React Router 7 ↗

    Expected on March 6:

    RSC is cooking as well. React Router is set for rapid improvements this year.

  • Shipping fast for perfection ↗

    Fast clock speed moves your work closer to perfection.

  • Writing an x86 operating system in Rust ↗

    Philipp Oppermann:

    This blog series creates a small operating system in the Rust programming language. Each post is a small tutorial and includes all needed code, so you can follow along if you like.

    See also: Build a RISC-V operating system in 1,000 lines by Seiya Nuta and Compiler Explorer by Matt Godbolt.

  • Just use SemVer ↗

    Changing public APIs is okay. Accept that you will not get it right the first time. I am very thankful that so many developers just use Semantic Versioning.

    From the Semantic Versioning specification, on avoiding dependency hell:

    As a solution to this problem, we propose a simple set of rules and requirements that dictate how version numbers are assigned and incremented. These rules are based on but not necessarily limited to pre-existing widespread common practices in use in both closed and open-source software. For this system to work, you first need to declare a public API. This may consist of documentation or be enforced by the code itself. Regardless, it is important that this API be clear and precise. Once you identify your public API, you communicate changes to it with specific increments to your version number. Consider a version format of X.Y.Z (Major.Minor.Patch). Bug fixes not affecting the API increment the patch version, backward compatible API additions/changes increment the minor version, and backward incompatible API changes increment the major version.

    We call this system “Semantic Versioning.” Under this scheme, version numbers and the way they change convey meaning about the underlying code and what has been modified from one version to the next.

    Ignore calls for vanity versioning.
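
    The increment rules quoted above fit in a few lines of code. The sketch below is one possible reading, not a reference implementation: the `Version` and `Change` names are made up for this example, pre-release and build metadata are ignored, and resetting the lower components to zero comes from the full specification rather than the passage quoted here.

```python
from enum import Enum
from typing import NamedTuple


class Change(Enum):
    """Kinds of change, named after the rules in the quoted passage."""
    BUG_FIX = "patch"
    COMPATIBLE_API_CHANGE = "minor"
    INCOMPATIBLE_API_CHANGE = "major"


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        """Parse a plain X.Y.Z string; anything else raises ValueError."""
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)

    def bump(self, change: Change) -> "Version":
        """Return the next version for the given kind of change."""
        if change is Change.INCOMPATIBLE_API_CHANGE:
            return Version(self.major + 1, 0, 0)
        if change is Change.COMPATIBLE_API_CHANGE:
            return Version(self.major, self.minor + 1, 0)
        return Version(self.major, self.minor, self.patch + 1)

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"
```

    With this, `str(Version.parse("1.4.2").bump(Change.COMPATIBLE_API_CHANGE))` gives `"1.5.0"`, and an incompatible change to the same version gives `"2.0.0"`.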