Good explanation of how flame graphs are produced and how to read them. Gives a few tips on what to look for to optimize.
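To make the "how they are produced" part concrete, here is a minimal sketch of the aggregation step behind a flame graph, assuming the common approach of collapsing sampled call stacks into "folded" lines (the text format consumed by tools such as flamegraph.pl). The sample data and function names are hypothetical, and the linked article may rely on different tooling.

```python
from collections import Counter

def fold_stacks(samples):
    """Collapse sampled call stacks into folded lines: 'root;child;leaf count'."""
    counts = Counter(";".join(stack) for stack in samples)
    return [f"{stack} {count}" for stack, count in sorted(counts.items())]

def box_widths(samples):
    """Width of each box: number of samples whose stack starts with that prefix."""
    widths = Counter()
    for stack in samples:
        for depth in range(1, len(stack) + 1):
            widths[stack[:depth]] += 1
    return widths

# Hypothetical profiler output: one tuple per sample, outermost frame first.
samples = [
    ("main", "parse", "read_line"),
    ("main", "parse", "read_line"),
    ("main", "parse", "tokenize"),
    ("main", "render"),
]

folded = fold_stacks(samples)
widths = box_widths(samples)
for line in folded:
    print(line)
```

When reading the result, a wide box means the function was on the stack in many samples; wide boxes at the top of a stack are where the time is actually spent.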
Interesting exploration of SSD performance with regard to write patterns. Turns out sequential IO is still a thing, just for a different reason than with good old HDDs.
Very interesting to see that move to owned hardware... turns out that not only is the invoice smaller in their case, but the performance is much better as well.
Nice set of tips, I knew a few but not all of them. The discussion around CTEs is interesting.
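For readers who have not used them, a CTE (common table expression) is a named subquery introduced with WITH. The sketch below is only an illustration, using SQLite through Python's standard library with a made-up `orders` table; the linked article may target another database, and how a planner treats a CTE (inlined or materialized) varies between engines and versions, so measure on your own setup.

```python
import sqlite3

# In-memory database with a hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount INTEGER);
    INSERT INTO orders VALUES
        ('alice', 120), ('alice', 30), ('bob', 45), ('carol', 200), ('carol', 10);
""")

# The CTE 'totals' names an intermediate result that the outer query filters.
query = """
    WITH totals AS (
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
    )
    SELECT customer, total
    FROM totals
    WHERE total > 100
    ORDER BY total DESC
"""
rows = conn.execute(query).fetchall()
print(rows)
```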
Nice exploration of the impact of false sharing on performance in several hardware scenarios. A couple of surprises along the way.
Shouldn't come as a surprise if you paid attention to the evolution of C++ over the past 30 years. We're now reaping the fruits though, so it's really become easy to keep both options in sight when designing. This is especially important for performance-sensitive code.
Nothing really new here (apart from the "how easy it is these days!")... Still, we need to be reminded of it on a regular basis. :-)
Nice post explaining the common algorithms used for load balancing. Each has its own trade-offs of course. Well done, with tiny simulations.
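In the same spirit, here is a tiny simulation of three common policies (round robin, uniform random, least connections). It is a sketch under simplifying assumptions that are mine, not the post's: one request arrives per tick, a request simply holds a connection for its duration, and the workload is a made-up mix of short and slow requests.

```python
import random
from itertools import cycle

def make_workload(n_requests=2000, seed=42):
    """Hypothetical workload: mostly short requests, with an occasional slow one."""
    rng = random.Random(seed)
    return [rng.choice([1, 1, 1, 20]) for _ in range(n_requests)]

def simulate(pick, durations, n_servers=4):
    """Assign one request per tick; return the highest number of open
    connections any single server ever had."""
    open_until = [[] for _ in range(n_servers)]  # finish ticks of open requests
    peak = 0
    for now, duration in enumerate(durations):
        # Drop the requests that have finished by this tick.
        open_until = [[t for t in server if t > now] for server in open_until]
        loads = [len(server) for server in open_until]
        target = pick(loads)
        open_until[target].append(now + duration)
        peak = max(peak, len(open_until[target]))
    return peak

def round_robin(n_servers):
    order = cycle(range(n_servers))
    return lambda loads: next(order)  # ignores the current load entirely

def uniform_random(seed=7):
    rng = random.Random(seed)
    return lambda loads: rng.randrange(len(loads))

def least_connections(loads):
    return loads.index(min(loads))  # server with the fewest open requests

durations = make_workload()
results = {
    "round_robin": simulate(round_robin(4), durations),
    "random": simulate(uniform_random(), durations),
    "least_connections": simulate(least_connections, durations),
}
for name, peak in results.items():
    print(f"{name:>18}: peak open connections on one server = {peak}")
```

In this simplified model least connections can never end up with a higher peak than the other two, since it always picks the least loaded server; the price in a real system is that the balancer has to track the state of every backend.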
Polars really looks like a nice alternative to Pandas with a nice upgrade path from data exploration to production.
Nice walkthrough of using PyO3 to make some Python code much faster. Nice to see how useful py-spy turns out to be in such scenarios as well.
Very interesting conversation between Uncle Bob and one of the recent critics of his work regarding performance. I like how he admits some faults in the way he presents things and tries to improve for later rather than trying to be right. Some people should learn from that. There's clearly a tension between performance and what is described in Clean Code; it'd be pointless to deny it.
perf is now also available for Python programs. This can definitely be useful for proper profiling.
Interesting position from AMD regarding the race for the next supercomputers. They're all being caught up by energy efficiency, so it'll need to be addressed both at the processor architecture level and at the software architecture level. How we design our computing tasks will matter more and more.
A bit of a sarcastic tone but a few good points in there. Also shows interesting alternatives to C++ to squeeze every ounce of performance out of your code whatever the platform it runs on. Of the three options explored I knew only about Numba really.
Excellent analysis and explanation of the stutter problem people experience with game engines. It's an artifact of the graphics pipeline becoming more asynchronous with no way to know when something is really displayed. Extra graphics APIs will be needed to solve this for real.
Time to look a bit at the maze of WebAssembly runtimes. Good overview on how they currently perform and how well they are documented or easy to use.
Python is getting faster but is still far from what you can get with C++ of course. That said, for simulations you likely don't want everything in Python or in C++. Part of the challenge is to split the subsystems properly and use C++ where it matters.
Don't underestimate the performance of the generated code when a JIT is in the picture. Very good example with the JVM here.
Don't bank it all on faster hardware, make sure your software isn't slow first. Otherwise it'll bring quite a few hidden costs.
Simple little benchmark of WebAssembly performance for the most common languages found there. Watch out for the payload size though.
Definitely this, we have to stop blaming disk I/O so much for performance issues. It's just not really slow anymore. Obviously network is a different story.
Nice summary of the false sharing problem with caches and how it can impact your performance in multithreaded contexts.
Interesting deep dive on how sets and dicts are implemented in CPython. There are a couple of interesting tricks in there.
There are indeed a few architectural problems with the Fediverse as it is. Can this be solved? Hopefully yes.
Interesting take on how performance optimizations can sometimes yield even larger gains than you would expect.
Good reminder that "premature" doesn't mean "early". Poor Knuth is so often badly quoted in the context of optimization that it's really sad. The number of times I see "early pessimisation" under the pretense of avoiding "premature optimization"... Such a waste.
This is good news: it provides more avenues for improving performance in Python modules, besides switching to compiled Rust with something like PyO3. There's clearly a case to be made for not having to rewrite when the codebase is already mostly Python.
This holds some interesting promise in terms of performance with Python. Looks a bit like a CUDA for Python... to be seen how it fares in practice.
Let's put this quote back in its context, shall we?
One of the best developer tools around for analysis and profiling. I'm glad it exists; it saved me a few times.
Wow, this is a very good exploration of the performance of several common languages and runtimes. This is one of the most thorough I've seen. A good resource for deciding what to pick.
And this is why you likely need to optimize your data pipelines at some point. There are plenty of levers available.
Excellent deep dive into pipes. On the path to optimization we see how perf is used, how memory is managed by the kernel, etc. Very thorough.
Debatable "feature", bad implementation, dubious community handling... Clearly not a good example to follow from the Go space.
This looks like a very interesting tracing tool for debugging and profiling purposes.
That looks like a very interesting tool for larger Python-based projects. We definitely need a way to profile memory use in there.
Oh this is really neat! This is a good way to visualize how it evolved over time, I find the period starting in 2005 especially interesting.
Really cool optimizations for B-Trees. Once the layout is reworked this is a neat way to use SIMD as well.
Interesting use of WebAssembly for fast and very portable code. The care taken in the move to the new software architecture is especially interesting as well.
Interesting tips for potential bottlenecks in your queries.
Not necessarily unknown paths to squeeze more performance out of Python. Still it's nice to have those options measured and listed in the same post.
Good reminder that CORS can have an impact on the performance of your application.
Good reminder on how a shared atomic can become a huge bottleneck in multi-CPU setups.
Mostly about the general approach to profiling this kind of thing. Still, a couple of interesting pytest-specific tips in here.
Interesting exploration and workaround for the Postgres query planner.
This looks like an interesting full system profiler.
Interesting piece covering: how a memory allocator works, why it can be slow, how to use it the best way possible and how to pick an allocator for your project.
This is a very interesting deep dive into how branch predictors work. It also compares timing profiles between different families of CPUs.
Excellent reminder about where the limit is for the compiler to optimize things. Nowadays it's mostly about the memory accesses, which means that the design matters a lot. Object-oriented designs are far from optimal here. Data-oriented designs fare much better but are definitely harder for human brains to reason about.
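To illustrate the two layouts being contrasted, here is a minimal sketch with hypothetical `Particle` and `Particles` types: the object-oriented version scatters each particle in its own heap object, while the data-oriented version keeps each field in one contiguous array. It only shows the shape of the designs; in CPython the interpreter overhead hides most of the cache effect, which really shows up in compiled languages.

```python
from array import array
from dataclasses import dataclass

@dataclass
class Particle:
    """Object-oriented layout: one separate heap object per particle."""
    x: float
    y: float
    mass: float

class Particles:
    """Data-oriented layout: one contiguous array per field."""
    def __init__(self, xs, ys, masses):
        self.x = array("d", xs)
        self.y = array("d", ys)
        self.mass = array("d", masses)

def total_mass_objects(particles):
    # Touches every object, pulling in x and y even though only mass is needed.
    return sum(p.mass for p in particles)

def total_mass_columns(particles):
    # Streams through a single densely packed array of doubles.
    return sum(particles.mass)

n = 1000
objs = [Particle(float(i), float(2 * i), 1.5) for i in range(n)]
cols = Particles((p.x for p in objs), (p.y for p in objs), (p.mass for p in objs))

print(total_mass_objects(objs), total_mass_columns(cols))
```

The readability cost is visible even here: "a particle" no longer exists as a single thing in the second layout, only as an index shared across arrays.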
Very thorough analysis of the kind of web frontend performance most people can expect on mobile. Since we basically need to reduce the footprint of such frontends to make this sustainable again, this is a very welcome article.
Obviously I didn't read it all, but this is a very large repository of practices from many companies that one can draw inspiration from when working on Site Reliability Engineering. It is especially comprehensive since it's not only about technical tips but also deals with hiring, team building and culture (which is almost as important, if not more so).