Interesting piece which challenges the shared-memory vs. message-passing dichotomy. Message passing does get rid of data races, but nothing more. Of course this is nice already, but that doesn't mean you can't have the other families of concurrency bugs creeping in.
Short explanation of why you want to make invalid states impossible to represent. This leads to nice properties in your code; the price to pay, of course, is introducing more types to encode the invariants.
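To make the idea concrete, here is a minimal sketch in Rust. The `LooseConnection` and `Connection` types are hypothetical names made up for illustration and are not taken from the linked article. The first type allows contradictory combinations of fields, while the second uses one enum variant per state so that the contradiction cannot be written down at all.

```rust
/// Error-prone version: nothing prevents `connected == true` together
/// with `session_id == None`, so every caller has to re-check it.
#[allow(dead_code)]
struct LooseConnection {
    connected: bool,
    session_id: Option<String>,
}

/// Stricter version: each variant carries only the data that is valid
/// in that state, so a connected value always has a session id.
#[derive(Debug, PartialEq)]
enum Connection {
    Disconnected,
    Connected { session_id: String },
}

#[allow(dead_code)]
impl Connection {
    /// Move from `Disconnected` to `Connected`.
    /// An already connected value keeps its existing session.
    fn connect(self, session_id: String) -> Connection {
        match self {
            Connection::Disconnected => Connection::Connected { session_id },
            already @ Connection::Connected { .. } => already,
        }
    }

    /// The session id is only reachable through the `Connected` variant.
    fn session_id(&self) -> Option<&str> {
        match self {
            Connection::Disconnected => None,
            Connection::Connected { session_id } => Some(session_id.as_str()),
        }
    }
}
```

The extra enum is exactly the price mentioned above: one more type, in exchange for an invariant the compiler checks instead of the reader.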
A good explainer on what metastable failures are and how to try to mitigate them.
This is something I've definitely seen indeed. There is clearly a threshold effect in the amount of code you have to manage. Solutions working at smaller amounts don't work anymore a couple of orders of magnitude higher, and vice versa.
Looks like a nice kit to add to your tool belt. Does some handy checks if you have a Postgres database to manage.
A bit of a shameless plug toward the end. That said, the explanations of why Cloudflare is banking on Rust so much and of how the recent downtime could have been avoided are spot on.
Error handling is not easy. Having simple rules to apply for complex systems is a good thing. Of course the difficulty is to apply them consistently.
Interesting point of view. Indeed, you probably want things to not be available 100% of the time. This forces you to see how resilient things really are.
Interesting approach for a project to collect such traps in its dependencies like this.
This is a good look at the reasons behind throttling. If you accept a less naive model than "preventing abuse", you can build a better throttling strategy.
If the funding dries up... we'll have another AI winter on our hands indeed.
I find the title misleading. Still, this is a good exploration of how to treat unwrap() and expect() in Rust code.
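As an illustration of the kind of distinction at stake, here is a small sketch of one common convention. It is not necessarily the position of the linked article, and the function names are hypothetical. The convention: propagate errors that depend on outside input, keep `expect()` for invariants the programmer can vouch for, and prefer an explicit fallback over a bare `unwrap()`.

```rust
use std::num::ParseIntError;

/// Input comes from outside the program, so failure is a normal outcome:
/// return the error to the caller instead of panicking.
#[allow(dead_code)]
fn parse_port(input: &str) -> Result<u16, ParseIntError> {
    input.trim().parse::<u16>()
}

/// The literal is under our control, so a failure here would be a bug.
/// `expect()` documents the invariant in the panic message.
#[allow(dead_code)]
fn default_port() -> u16 {
    "8080"
        .parse::<u16>()
        .expect("hard-coded default port must be a valid u16")
}

/// Explicit fallback: bad input degrades to the default rather than
/// crashing the process.
#[allow(dead_code)]
fn port_or_default(input: &str) -> u16 {
    parse_port(input).unwrap_or_else(|_| default_port())
}
```

With these definitions, `parse_port("not a port")` returns an `Err` for the caller to handle, while `port_or_default("99999")` falls back to 8080 because 99999 does not fit in a `u16`.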
Illustrated with the Clojure ecosystem, but there's nothing inherently specific to the language here. If you want to ensure stability for your users, you need to manage your APIs properly, and this article puts forward a couple of interesting ideas.
This is almost by definition. The post mortem needs to be wisely crafted to also look at previous incidents and the actions taken to mitigate them.
At some point the complexity is high enough that you indeed need more tools than only handcrafted tests to discover bugs.
Interesting rambling and exploration. What would a computer built to last a century look like?
Interesting research, looking forward to the follow-ups to see how it evolves over time. For sure the number of issues is still way too high to make trustworthy systems around search and news.
This highlights quite well the limits of the models used in LLMs.
This is an important trait to have for a developer. If you're content with things working without knowing why and how they work, you're in for a world of pain later.
Interesting point. You likely need to be careful with fallback modes especially in distributed systems. They might bring even more issues when the system is already under stress.