Let's hope security teams don't get saturated with low-quality security reports like this...
Another lawsuit making progress against OpenAI and their shady practices.
Nice vision model. It looks like it strikes an interesting balance between performance and memory consumption, and it seems doable to run cheaply and on premises.
More shady practices to try to save themselves. Let's hope it won't work.
The water problem is obviously hard to ignore. This piece does a good job illustrating how large the impact is.
Good reminder that models shouldn't be used as a service except maybe for prototyping. This has felt obvious to me since the beginning of this hype cycle... but here we are, people are still falling into the trap today.
More signs of the generative AI companies hitting a plateau...
It shouldn't be, but it is a big deal. Having such a training corpus openly available is one of the big missing pieces for building models.
This is an interesting and balanced view. Also nice to see that local inference is really getting closer. This is mostly a UI problem now.
Do you, like me, find the Open Source AI Definition weak on the training data information side? You'd be right, and there's a reason for it... it's probably hiding quite a bit of open washing for the larger models. This is a good explanation of the motives and consequences.
I definitely like the approach of having vectorisation in the RDBMS directly. It means one less moving part and less complexity at the application level to keep everything in sync. In this case it's a Postgres extension.
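The article's extension isn't named here, but as a rough sketch of what keeping vectors in the database itself can look like, here is a minimal example assuming the pgvector extension and the psycopg2 driver (both my assumptions, with a hypothetical local database):

```python
# Rough sketch: embeddings stored next to the rows they describe, queried in SQL.
# Assumptions not in the original comment: the pgvector extension, the psycopg2
# driver, and a hypothetical local database named "app".
import psycopg2

conn = psycopg2.connect("dbname=app")
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute("""
    CREATE TABLE IF NOT EXISTS documents (
        id bigserial PRIMARY KEY,
        body text NOT NULL,
        embedding vector(3)  -- toy dimension, real embeddings are much larger
    );
""")
cur.execute(
    "INSERT INTO documents (body, embedding) VALUES (%s, %s::vector);",
    ("hello world", "[0.10, 0.20, 0.30]"),
)

# Nearest-neighbour search runs in the same database as the data itself,
# so there is no separate vector store to keep in sync.
cur.execute(
    "SELECT body FROM documents ORDER BY embedding <-> %s::vector LIMIT 5;",
    ("[0.10, 0.20, 0.25]",),
)
print(cur.fetchall())

conn.commit()
cur.close()
conn.close()
```

The appeal is exactly what the comment says: the embeddings live next to the rows they describe, so the application doesn't have to keep a separate vector store consistent with the relational data.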
Nice initiative from the OSI. It is timely; such a definition was surely needed. The data information part seems fairly weak though... you could certainly make a system which doesn't respect the four freedoms that way.
This is what you get when you make bots spew text based on statistics without a proper knowledge base behind them.
More marketing announcement than real research paper. Still, it's nice to see smaller models being optimized to run on mobile devices. This will get interesting when it's all local-first and coupled with symbolic approaches.
This is still an important step with LLMs. Just because the models are huge doesn't mean tokenizers disappeared or that you no longer need to clean up your data.
Using the right metaphors will definitely help with the conversation in our industry around AI. This proposal is an interesting one.
More signs of the current bubble being about to burst?
Now this is an interesting paper. Neurosymbolic approaches are starting to go somewhere. This is definitely helped by the NLP abilities of LLMs (which should be used only for that). The natural language to Prolog idea makes sense; now it needs to become more reliable. I'd be curious to know how often the multiple-try path is exercised (the paper doesn't quite focus on that). More research is obviously required.
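Not the paper's pipeline, just a back-of-the-envelope sketch of the general natural-language-to-Prolog loop as I understand it, with a stubbed-out translation step standing in for the LLM and SWI-Prolog accessed through the pyswip package (all assumptions on my part):

```python
# Sketch of the general idea only: the language model does the language work
# (faked here with a lookup table), Prolog does the reasoning, and malformed
# or empty answers trigger another try. None of this is taken from the paper.
from pyswip import Prolog

def llm_translate(question: str, attempt: int) -> str:
    # Stand-in for the LLM; in the real pipeline this would be a model call
    # that emits a Prolog goal for the question.
    canned = {"Who is a parent of liam?": "parent(X, liam)"}
    return canned.get(question, "malformed(")  # unknown questions yield bad Prolog

def answer(question: str, max_tries: int = 3):
    prolog = Prolog()
    prolog.assertz("parent(noah, liam)")  # tiny knowledge base for the example
    for attempt in range(max_tries):
        goal = llm_translate(question, attempt)
        try:
            results = list(prolog.query(goal))
        except Exception:
            continue  # bad translation: this is the "multiple-try" path
        if results:
            return results
    return None

print(answer("Who is a parent of liam?"))
```

The part I'd like to see measured is precisely how often that retry branch gets taken in practice.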
Now the impact seems clear and this is mostly bad news. It reduces the production of public knowledge, so everyone loses. Ironically it also means less public knowledge available to train new models. At some point their only avenue to fine-tune their models will be user profiling, which will be private... I have a hard time seeing how we won't end up stuck with another surveillance apparatus providing access to models running on outdated knowledge. This will lock so many behaviors and decisions in place.
Finally a path forward for logic programming? An opportunity to evolve beyond Prolog and its variants? Good food for thought.