Interesting dive into how join() and generators behave in CPython.
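A minimal sketch of the kind of comparison involved, assuming the point of interest is `str.join()` fed with a list versus a generator expression. The sample data and function names are hypothetical, and timings will vary with the machine and the CPython version, so no expected numbers are given.

```python
import timeit

# Hypothetical sample data, chosen only for illustration.
words = [str(i) for i in range(10_000)]


def join_with_list() -> str:
    # The list is built first, then handed to join() as a ready-made sequence.
    return ",".join([w for w in words])


def join_with_generator() -> str:
    # join() needs the total length before it can allocate the result, so
    # CPython first collects the generator's items into a temporary sequence.
    return ",".join(w for w in words)


def compare(repeat: int = 200) -> dict[str, float]:
    """Return the total time in seconds spent in each variant."""
    return {
        "list": timeit.timeit(join_with_list, number=repeat),
        "generator": timeit.timeit(join_with_generator, number=repeat),
    }


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name:>9}: {seconds:.4f}s")
```

Both variants produce the same string, only the way the items reach join() differs, which is why measuring is the only way to know which one wins on a given setup.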
The ordering used for matrix multiplications definitely matters.
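One way to see why, assuming "ordering" here means the order in which a chain of products is evaluated (the linked article may well be about something else, such as loop order and cache locality). The sketch below only counts scalar multiplications for hypothetical shapes; it does not multiply anything.

```python
def matmul_cost(rows: int, inner: int, cols: int) -> int:
    """Scalar multiplications for a (rows x inner) @ (inner x cols) product."""
    return rows * inner * cols


def cost_left_first(a, b, c) -> int:
    """Cost of (A @ B) @ C, each argument being a (rows, cols) shape."""
    assert a[1] == b[0] and b[1] == c[0], "shapes must be compatible"
    ab = matmul_cost(a[0], a[1], b[1])          # A @ B gives a[0] x b[1]
    return ab + matmul_cost(a[0], b[1], c[1])   # then (A @ B) @ C


def cost_right_first(a, b, c) -> int:
    """Cost of A @ (B @ C), each argument being a (rows, cols) shape."""
    assert a[1] == b[0] and b[1] == c[0], "shapes must be compatible"
    bc = matmul_cost(b[0], b[1], c[1])          # B @ C gives b[0] x c[1]
    return bc + matmul_cost(a[0], a[1], c[1])   # then A @ (B @ C)


# Hypothetical shapes, picked to make the gap obvious.
A, B, C = (10, 1000), (1000, 10), (10, 1000)

if __name__ == "__main__":
    print("(A @ B) @ C:", cost_left_first(A, B, C))    # 200000
    print("A @ (B @ C):", cost_right_first(A, B, C))   # 20000000
```

Same result mathematically, but a factor of 100 in the number of multiplications for these shapes.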
SIMD keeps providing interesting performance boosts for parsing workloads.
Another good example of how to speed up some Python code with nice gains.
Definitely something to keep in mind when using sampling profilers. They're useful and help to get a starting point in the analysis, but they're far from enough to find the root cause of a performance issue.
Interesting how much extra performance you can shave off the GPU by going back to how the hardware works.
Interesting quick comparison, this shows the design tradeoffs quite well.
Interesting use of database templates and memory disks to greatly speed up test executions.
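The same idea can be sketched with SQLite from Python's standard library. This is a swapped-in analogue and not the setup from the linked article: the schema is built once in a "template" database, then each test gets its own in-memory copy through `Connection.backup()`. The table and function names are hypothetical.

```python
import sqlite3


def build_template() -> sqlite3.Connection:
    """Create the schema and seed data once, in an in-memory database."""
    template = sqlite3.connect(":memory:")
    template.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        INSERT INTO users (name) VALUES ('alice'), ('bob');
        """
    )
    template.commit()
    return template


def fresh_database(template: sqlite3.Connection) -> sqlite3.Connection:
    """Clone the template into a new in-memory database for one test."""
    clone = sqlite3.connect(":memory:")
    template.backup(clone)  # copies the template's content into the clone
    return clone


if __name__ == "__main__":
    template = build_template()
    db = fresh_database(template)
    db.execute("DELETE FROM users")  # a test can freely mutate its own copy
    db.commit()
    remaining = template.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    print("rows left in template:", remaining)
```

Each test pays for a copy instead of replaying the schema and the fixtures, and nothing touches the disk.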
More work about eco-design of software. This is definitely welcome. I found this work a bit weak on the state of the art and the interview parts (10 people in the same company). But the field is so nascent that it's to be expected I guess, PhD students have to make do with what they have access to. Unsurprisingly this shows a great lack of proper tools to tackle the measurement problem. This thesis shows interesting prospects to reduce variations in measurements though, some of the proposed guidelines might help but cannot offset the hardware heterogeneity completely... The parts focusing on practical advice around Java use and deployment are interestingly easy to apply though. You need to take into account the context of your application to make the right choices of course.
Excellent work to improve Llama execution speed on CPU. It probably has all the tricks of the trade to accelerate this compute kernel.
With some tuning SQLite can go a long way, even for server-type workloads. There are still a few caveats but in some cases this can reduce complexity and cost quite a bit.
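A minimal sketch of what such tuning can look like from Python. Which settings the linked article actually recommends is not stated here, so the pragmas and values below are an assumption (they are commonly suggested ones), and `open_tuned` is a hypothetical helper.

```python
import os
import sqlite3
import tempfile


def open_tuned(path: str) -> sqlite3.Connection:
    """Open a SQLite database with settings often suggested for server use.

    The choice of pragmas and values is an assumption, not the article's list.
    """
    conn = sqlite3.connect(path)
    # Write-ahead logging lets readers proceed while a writer is active.
    conn.execute("PRAGMA journal_mode = WAL")
    # Fewer fsync calls than FULL; a common pairing with WAL.
    conn.execute("PRAGMA synchronous = NORMAL")
    # Wait up to 5 seconds on a locked database instead of failing at once.
    conn.execute("PRAGMA busy_timeout = 5000")
    return conn


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        conn = open_tuned(os.path.join(directory, "app.db"))
        print(conn.execute("PRAGMA journal_mode").fetchone()[0])
        conn.close()
```

WAL mode is stored in the database file, while synchronous and busy_timeout have to be set again on every new connection.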
Nice balanced view on some of Rust's characteristics. This is much less naive than some of the "Rust is great" posts out there.
As usual measure and don't just assume when you want to optimize something. This is an interesting case in Python using Numba.
Indeed this. It's not only about payload size, it's also about CPU consumption. Our profession is still assuming too much that users will get faster CPUs on a regular basis.
It's nice to see a new benchmark being published. This seems to follow real life scenarios. We can expect browser engine performance to increase.
A response to "The Hunt for the Missing Data Type" article. There are indeed potential solutions, but they're not really used/usable in the industry right now. Maybe tomorrow.
Indeed, graphs are peculiar beasts. When dealing with graph related problems there are so many choices to make that it's hard or impossible to come up with a generic solution.
Interesting take even though I'm not sure I buy it completely. This is an interesting plea for aiming at power efficiency and squeezing performance out of software.
Interesting library if you have to do a lot of heavy analysis work with strings.
This is indeed an odd situation... there is no good explanation for why it is this way.