Excellent work to improve Llama execution speed on CPU. It probably has all the tricks of the trade to accelerate this compute kernel.
Interesting take and theory about pair and mob programming. Indeed, finding the right path to optimize a piece of code is likely harder in such setups.
Nice set of tricks to optimize load and render time of webpages.
Want to better understand the JIT approach introduced in Python 3.13? This is a good little article. This JIT is a first step towards more optimizations.
Very nice collection of stories from the trenches of Firefox development. Lots of lessons learned to unpack about optimizing for the right thing, tooling, telemetry and so on.
Interesting technique to speed up generation in large language models.
Or why using a profiler is not as easy as it sounds. This requires quite some experience and the ability to tap into other information not present in the profile.
The Rust tooling makes it super easy to profile your programs. This is neat.
Not in full agreement with this, but having a rough idea of the different levers you can use for optimizations is worthwhile.
Nice to see the same optimizations as in a previous article play out in Python. Leveraging NumPy and Numba already goes a long way.
This gives a good list of things to try when optimizing (Rust code or otherwise).
Very thorough paper on optimization techniques when dealing with GPUs. Can be a useful reference or starting point to then dig deeper. Should also help to pick the right technique for your particular problem.
Another partial quote which led to a misunderstanding. One should indeed think about performance early on.
Obvious advice perhaps, but so easily forgotten somehow...
Not as detailed as I would like on the proposed solution. A good starting point for inspiration though. Should help with geospatial and fleet management problems.
Interesting optimization on this somewhat common data structure.
Interesting research turning to genetic algorithms to optimize bytecode handler dispatchers.
Good explanation of how flame graphs are produced and how to read them. Gives a few tips on what to look for to optimize.
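As a rough illustration of the idea (not the linked article's own code), here is a minimal sketch of the aggregation step behind a flame graph: sampled call stacks are collapsed into "folded" lines with a count, and the width of each frame is the number of samples in which it appears. The `SAMPLES` data and the function names `fold_stacks` and `frame_widths` are made up for this example.

```python
from collections import Counter

# Hypothetical stack samples, outermost frame first (invented for illustration).
SAMPLES = [
    ("main", "parse", "read"),
    ("main", "parse", "read"),
    ("main", "render"),
    ("main", "parse"),
]


def fold_stacks(samples):
    """Collapse identical stacks into 'frame;frame;frame count' lines."""
    counts = Counter(";".join(stack) for stack in samples)
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]


def frame_widths(samples):
    """Count the samples passing through each call path (the box widths)."""
    widths = Counter()
    for stack in samples:
        # Every prefix of the stack is a frame that was on the CPU's call path.
        for depth in range(1, len(stack) + 1):
            widths[";".join(stack[:depth])] += 1
    return dict(widths)


if __name__ == "__main__":
    for line in fold_stacks(SAMPLES):
        print(line)
    print(frame_widths(SAMPLES))
```

Wide boxes near the top of the graph correspond to call paths with a high count here, which is usually where to look first.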
Don't underestimate the performance of the generated code when a JIT is in the picture. Very good example with the JVM here.
Don't bank everything on faster hardware, make sure your software isn't slow first. Otherwise it'll bring quite some hidden costs.