An exciting new type of neural network. There are limits to using them at large scale for now, but they have very interesting properties, notably interpretability. They also tend to match the performance of traditional neural networks at a smaller size.
This ought to be easier; this should help a bit.
More discussion of model collapse. The provenance of data will become a crucial factor in our ability to train future models.
Still not perfect, but that's an interesting development.
Content creators are clearly annoyed at the lack of consent. The more technical ones are trying to take matters into their own hands.
I'm rarely on the side of a Goldman Sachs... Still, this paper seems to be spot on. The equation between the costs of generative AI (financial and ecological) and the value we get out of it isn't balanced at all. And since the field is stuck trying to improve mostly through model scale and volume of data, it is doomed to plateau in its current form.
Those brand-new models keep failing at surprisingly simple tasks.
A new era of spam is upon us... this is going to be annoying to filter out.
This arms race should be stopped... It is becoming an ecological disaster; so much wasted energy.
Makes a strong case for why LLMs are better described as "bullshit machines". In any case, this is a good introduction to bullshit as a philosophical concept. I guess that with our current relationship to truth, these are products well suited to their context...
Further clues that transformer models can't learn logic from data.
An interesting paper showing a promising path to reducing the memory footprint and workload of transformer models. This is much more interesting than the race toward gigantic sizes.
This ignores the energy consumption aspect. That said, it is spot on regarding the social and economic aspects of those transformer models: they have to be open and self-hostable.
OK, this is a rant about the state of the market and people drinking the Kool-Aid. A bit long, but I found it funny and, at times, well deserved.
The creative ways to exfiltrate data from chat systems built with LLMs...
Since there are ways to offset the plagiarism a bit, let's use them. Obviously it's not perfect, but it's a start.
Chatbots can be useful in some cases... but definitely not when people expect to connect with other humans.
Another cruel reminder that basic reasoning is not to be expected from LLMs. Here is a quote from the paper's conclusion that makes this clear:
"We think that observations made in our study should serve as strong reminder that current SOTA
LLMs are not capable of sound, consistent reasoning, as shown here by their breakdown on even such a simple task as the presented AIW problem, and enabling such reasoning is still subject of basic research. This should be also a strong warning against overblown claims for such models beyond being basic research artifacts to serve as problem solvers in various real world settings, which are often made by different commercial entities in attempt to position their models as a strong mature product for end-users. [...] Observed breakdown of basic reasoning capabilities, coupled with such public claims (which are also based on standardized benchmarks), present an inherent safety problem. Models with insufficient basic reasoning are inherently unsafe, as they will produce wrong decisions in various important scenarios that do require intact reasoning."
Definitely this; it's not the first time we've seen such a hype cycle around "AI". When it bursts, the technology that created it just isn't called "AI" anymore. I wonder how long this one will last, though.
No, your model won't get smarter just by throwing more training data at it... quite the contrary.