Interesting idea, trying to bridge the best of both UUID options.
Maybe it's time to stop obsessing about scale and distributed architectures? Hardware has improved quite a bit in the right places, especially storage.
This is definitely true. Keep all this in mind when dealing with performance questions: design properly for the task, profile and profile some more, focus on the hotspots, and keep things maintainable.
Of course I agree with this. We should fight undue complexity at every step. This beast tends to creep in very quickly, and we're all guilty of it at times. It's particularly visible in otherwise rich languages like C++ or Rust, which tend to push people to be clever and show off, often for "performance reasons". That's rarely a good idea; think twice.
I would have titled the "defer refactoring" section differently, though. "Defer abstracting" would probably have fit the bill better.
A few ideas to dig deeper into for better multi-threaded throughput.
Nice bag of tricks for better Rust performance at a low level. The compiler is indeed helping quite a bit here.
This can indeed be useful to explore concurrency issues, though it requires some work.
We really have nice facilities in the kernel to squeeze out some extra performance nowadays.
Interesting, it looks like index scans in your databases can have surprising performance characteristics on SSDs.
No good tricks to optimize your code here, but knowing the tooling knobs sometimes helps.
Indeed, CPU prefetchers are really good nowadays. Now you know what to do to keep your code fast.
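To make the point concrete, here is a minimal sketch (my own example, not from the linked article) of the two access patterns involved: a sequential walk the hardware prefetcher can stay ahead of, versus a data-dependent random walk it cannot predict.

```rust
// Sketch: the same sum computed two ways. The hardware prefetcher can
// predict the sequential walk, but not the data-dependent, effectively
// random walk through `order`.

fn sequential_sum(data: &[u64]) -> u64 {
    // Contiguous, predictable accesses: the prefetcher stays ahead of us.
    data.iter().sum()
}

fn random_order_sum(data: &[u64], order: &[usize]) -> u64 {
    // Each access depends on `order[i]`; the prefetcher can't guess the
    // next cache line, so most accesses pay full memory latency.
    order.iter().map(|&i| data[i]).sum()
}

fn main() {
    let n = 1 << 16;
    let data: Vec<u64> = (0..n as u64).collect();

    // Deterministic pseudo-random permutation (simple LCG, no crates).
    let mut order: Vec<usize> = (0..n).collect();
    let mut state: u64 = 0x2545F4914F6CDD1D;
    for i in (1..n).rev() {
        state = state.wrapping_mul(6364136223846793005).wrapping_add(1);
        let j = (state >> 33) as usize % (i + 1);
        order.swap(i, j);
    }

    // Same result either way; only the access pattern (and thus the
    // memory latency you actually observe) differs.
    assert_eq!(sequential_sum(&data), random_order_sum(&data, &order));
}
```

Benchmark both on a large enough array and the gap between the two is essentially the cost of defeating the prefetcher.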
SIMD instructions are indeed a must to get decent performance on current hardware.
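You often don't even need intrinsics to benefit; as a sketch (my example, not from the link), restructuring a float reduction into independent accumulators gives the compiler explicit permission to auto-vectorize, since floating-point addition isn't associative and a plain sequential sum must stay scalar:

```rust
// Sketch: a plain sequential f32 sum can't be reordered by the compiler
// (float addition isn't associative), so it stays scalar. Independent
// per-lane accumulators make the reduction SIMD-friendly.

fn simd_friendly_sum(data: &[f32]) -> f32 {
    const LANES: usize = 8;
    let mut acc = [0.0f32; LANES];
    let chunks = data.chunks_exact(LANES);
    let remainder = chunks.remainder();
    for chunk in chunks {
        for (a, &x) in acc.iter_mut().zip(chunk) {
            *a += x; // independent lanes -> vectorizable
        }
    }
    // Fold the lanes, then handle the leftover tail.
    acc.iter().sum::<f32>() + remainder.iter().sum::<f32>()
}

fn main() {
    let v: Vec<f32> = (1..=100).map(|i| i as f32).collect();
    assert_eq!(simd_friendly_sum(&v), 5050.0);
}
```

Note the result can differ from a strict left-to-right sum by rounding; that reordering freedom is exactly what buys the speedup.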
It's just hard to make Python fast. It can be improved, yes, but it'll stay cache-unfriendly without a redesign. Nobody wants a Python 4. :-)
Interesting tricks to optimize this function in V8.
Nice new tool to investigate code generated by macros in Rust. Indeed, you can quickly add lots of lines to the compiled code without even realizing it; in large code bases this is worth keeping in check.
Nice exploration which shows the many levers one can use to impact Rust compilation times. They all have their own tradeoffs, of course, so don't pull them just for the sake of reducing compile time.
A good reminder that "push it to the GPU and it'll be faster" isn't always true. If you move a workload to the GPU, you likely have to rethink quite a bit of how it's done.
We can expect this to be a game changer for the C++ ecosystem. A couple of examples are presented in this article.
What is premature optimization really? If you look at the full paper it might not be what you think. In any case we get back to: do the math and benchmark.
I'd like to see the equivalent for Europe. Clearly, Internet access in the US isn't always great: latency is likely higher than you think, and bandwidth lower.