zogwarg@awful.systems to

TechTakes@awful.systemsEnglish · 3 years ago

AGI Sparklings proponents rejoice! Finding a literal map(*) means LLMs have a world model.

Source: nitter, twitter

Transcribed:

Max Tegmark (@tegmark):
No, LLM’s aren’t mere stochastic parrots: Llama-2 contains a detailed model of the world, quite literally! We even discover a “longitude neuron”

Wes Gurnee (@wesg52):
Do language models have an internal world model? A sense of time? At multiple spatiotemporal scales?
In a new paper with @tegmark we provide evidence that they do by finding a literal map of the world inside the activations of Llama-2! [image with colorful dots on a map]

With this dastardly, deliberate simplification of what it means to have a world model, we’ve been struck a mortal blow in our skepticism towards LLMs; surely we have no choice but to convert!

(*) Asterisk:
Not an actual literal map. What they really mean is that they’ve trained “linear probes” (each one its own mini-model) on the activations of the model’s layers, for a bunch of inputs, minimizing loss for latitude and longitude (and/or time, blah blah).

And yes, from the activations you can get a fuzzy distribution of lat,long on a map, and yes, they’ve been able to isolate individual “neurons” whose activation seems to correlate with latitude and longitude. (Frankly, not being able to find one would have been surprising to me. This doesn’t mean LLMs aren’t just big statistical machines, in this case trained on data containing literal lat,long tuples for cities in particular.)
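For anyone who hasn't met a linear probe before, here is a rough sketch of the idea in Python. Everything in it is an assumption for illustration: the "activations" are synthetic vectors that I build by linearly mixing lat,long into 64 dimensions plus noise, not anything pulled from Llama-2, and the function names and the closed-form ridge fit are my own choices rather than the paper's actual code.

```python
import numpy as np

def fit_linear_probe(activations, targets, l2=1e-3):
    """Fit a ridge-regression probe mapping activations (n, d) to targets (n, k)."""
    # Append a constant column so the probe can learn a bias term.
    X = np.hstack([activations, np.ones((activations.shape[0], 1))])
    # Closed-form ridge solution: (X^T X + l2 * I)^-1 X^T Y
    return np.linalg.solve(X.T @ X + l2 * np.eye(X.shape[1]), X.T @ targets)

def apply_probe(W, activations):
    """Predict targets from activations with a fitted probe."""
    X = np.hstack([activations, np.ones((activations.shape[0], 1))])
    return X @ W

def r_squared(y_true, y_pred):
    """Per-column coefficient of determination."""
    ss_res = ((y_true - y_pred) ** 2).sum(axis=0)
    ss_tot = ((y_true - y_true.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot

def make_synthetic_activations(n=500, d=64, noise=0.1, seed=0):
    """ASSUMPTION: stand-in 'activations' that linearly encode lat,long plus noise."""
    rng = np.random.default_rng(seed)
    latlon = np.column_stack([rng.uniform(-90, 90, n), rng.uniform(-180, 180, n)])
    mixing = rng.normal(size=(2, d))  # hypothetical encoding directions
    acts = (latlon / 180.0) @ mixing + noise * rng.normal(size=(n, d))
    return acts, latlon

acts, latlon = make_synthetic_activations()
train, test = slice(0, 400), slice(400, 500)
W = fit_linear_probe(acts[train], latlon[train])
scores = r_squared(latlon[test], apply_probe(W, acts[test]))
print("held-out R^2 (lat, lon):", scores)
```

The probe recovers the coordinates well here precisely because I put them there in a linearly decodable form. That is all a successful probe tells you: the information is linearly readable from the vectors, which is a much weaker claim than "has a world model".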

It’s a neat visualization and result, but it is sort of comically missing the point.

Bonus sneers from @emilymbender:

You know what’s most striking about this graphic? It’s not that mentions of people/cities/etc from different continents cluster together in terms of word co-occurrences. It’s just how sparse the data from the Global South are. – Also, no, that’s not what “world model” means if you’re talking about the relevance of world models to language understanding. (source)
“We can overlay it on a map” != “world model” (source)

Chat

self@awful.systems
3 years ago
as a large language model, I am incapable of feeling surprise that Tegmark is associated with neo-nazis

(also I really need to de-jank the stylesheet for the archive and get the rest of the data in it soon)
