Interesting dive into the experience of writing a small Vulkan engine (almost) from scratch.
Looks like an interesting benchmarking tool. One to keep an eye on.
Another cruel reminder that basic reasoning is not to be expected from LLMs. Here is a quote from the conclusion of the paper that makes it clear:
"We think that observations made in our study should serve as strong reminder that current SOTA
LLMs are not capable of sound, consistent reasoning, as shown here by their breakdown on even such a simple task as the presented AIW problem, and enabling such reasoning is still subject of basic research. This should be also a strong warning against overblown claims for such models beyond being basic research artifacts to serve as problem solvers in various real world settings, which are often made by different commercial entities in attempt to position their models as a strong mature product for end-users. [...] Observed breakdown of basic reasoning capabilities, coupled with such public claims (which are also based on standardized benchmarks), present an inherent safety problem. Models with insufficient basic reasoning are inherently unsafe, as they will produce wrong decisions in various important scenarios that do require intact reasoning."
The difficult path for Vulkan. The data is obviously biased since it includes games, and most of them still target Windows and hence DirectX. I'd be curious to see something similar excluding games (focusing instead on medical, industrial, etc.).
Interesting use of cryptography without a security concern. It's more about safety and ensuring something wasn't missed by mistake.
The more releases are out there, the more vulnerabilities are (and could be) discovered. Some action is needed to get things properly under control.
Interesting critique of this new platform... it's the beginning of the hype cycle, but it will probably go through the same "enshittification" phenomenon as other platforms.
Very nice explanation and metaphors on how CPU cache levels work.
A good reminder of everything that might go wrong when connectivity is bad. Most tools let you down in that situation.
Definitely a nice Python trick. Fairly elegant, I'll try to remember it.
Packed with useful information. There are clearly some things in there I'm eager to test.
A good demonstration of why you likely don't want to use GraphQL: the chances that you need the extra complexity and its caveats are very slim.
Another good example of how to speed up some Python code with nice gains.
Definitely something to keep in mind when using sampling profilers. They're useful and help provide a starting point for the analysis, but they're far from enough to find the root cause of a performance issue.
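To illustrate the point, here is a minimal toy sampling profiler (not the tool from the article, just a sketch of the general idea): it periodically inspects a worker thread's stack and counts which function is on top. The function names (`hot`, `workload`) are hypothetical stand-ins. Note how the output only tells you *where* time is spent, not *why* it's spent there.

```python
import collections
import sys
import threading
import time

def sampling_profile(target, interval=0.001):
    """Run `target` in a thread and periodically sample its stack.

    Returns a Counter mapping the name of the topmost function to the
    number of samples in which it appeared. This is roughly what a
    sampling profiler does: statistical attribution, no causal detail.
    """
    counts = collections.Counter()
    done = threading.Event()

    def worker():
        try:
            target()
        finally:
            done.set()

    t = threading.Thread(target=worker)
    t.start()
    while not done.is_set():
        # Snapshot of every thread's current frame, keyed by thread id.
        frame = sys._current_frames().get(t.ident)
        if frame is not None:
            counts[frame.f_code.co_name] += 1
        time.sleep(interval)
    t.join()
    return counts

def hot():
    # CPU-bound loop; should dominate the samples.
    s = 0
    for i in range(2_000_000):
        s += i * i
    return s

def workload():
    hot()

counts = sampling_profile(workload)
print(counts.most_common(3))
```

The profile will point at `hot` as the hotspot, but it says nothing about whether the fix is a better algorithm, caching, or moving the work elsewhere; that analysis still has to be done by hand.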
This definitely shows PyPy as a successful runtime.
Definitely this. In a world where LLMs were actually accurate and never spat out outright crappy code, programmers would still be needed. It would just mean spending less time writing code and more time investigating and debugging the produced code.
Interesting research! Is reading code a math and logic task? Is it a language task? Well... it might be its own thing.
The words we use indeed matter. This is definitely a domain where we should avoid ambiguities...
Or why you shouldn't simply let a domain expire: there's plenty of work to do before that.
Not necessarily my favorite governance model, but if you're on that scheme, those are good guiding principles.