Announcing ARC-AGI-3 - A benchmark that tests if AI can explore, learn, and adapt in unfamiliar situations. Humans score 100%. Frontier AI scores 0.26%.

brianpeiris@lemmy.ca · 3 months ago

Announcing ARC-AGI-3 - A benchmark that tests if AI can explore, learn, and adapt in unfamiliar situations. Humans score 100%. Frontier AI scores 0.26%.

Rozaŭtuno@lemmy.blahaj.zone · 3 months ago

Just don’t click where the mines are.

RustyShackleford@piefed.social · 3 months ago

As a psychiatrist, I have a theory about what’s missing in AI. First, it lacks childhood dependency and attachments. Second, it struggles to overcome repeated pain and suffering. Third, it lacks regular eating and restroom breaks. Fourth, it struggles to accept loss in everyday situations. Finally, it lacks the concept of our inevitable death. Without these nagging memories and concepts, machines will simply revert to the simpler concepts we use them for in our recent times, such as stealing cryptocurrency. After all, we live in a world run by capitalism, so it’s only logical. ¯\_(ツ)_/¯

CosmicTurtle0 [he/him]@lemmy.dbzer0.com · 3 months ago

As a technologist, I have to remind everyone that AI is not intelligence. It’s a word prediction/statistical machine. It’s guessing at a surprisingly good rate what words follow the words before it.

It’s math. All the way down.

We as humans have simply taken these words and have said that it is “intelligence”.

unpossum@sh.itjust.works · 3 months ago

As another technologist, I have to remind everyone that unless you subscribe to some rather fringe theories, humans are also based on standard physics.

Which is math. All the way down.

NewOldGuard@lemmy.ml · 3 months ago

As a mathematician, it should be noted that the mathematics of physics aren’t laws of the universe, they are models of the laws of the universe. They’re useful for understanding and predicting, but are purely descriptive, not prescriptive. And as they say, all models are wrong, but some are useful

SorteKanin@feddit.dk · 3 months ago

That’s true, but that doesn’t contradict the above comment. Unless you believe in something like a spirit or soul, you must concede that human intelligence ultimately arises from physical matter (whatever your model of physics is). From what we know of science right now, there are no direct reasons for thinking that true intelligence or even consciousness is limited to biological organisms based on carbon and could not arise in silicon.

Aceticon@lemmy.dbzer0.com · 3 months ago

As a random person on the Internet I don’t actually have anything to add but felt it would be nice to jump in.

HereIAm@lemmy.world · 3 months ago

I agree, the maths argument is not a good one. While a neural network is perhaps closer to what a brain is than just a CPU (or a clock, as it was compared to in the olden days), it would be a very big mistake to equate the two.

HaunchesTV@feddit.uk · 3 months ago

Grok Reasoning: 0%

Hilarious

Janx@piefed.social · 3 months ago

Grok isn’t designed to solve problems. It’s designed to create sexually explicit images of children for Republicans…

brsrklf@jlai.lu · 3 months ago

Reasoning is woke propaganda anyway.

SaraTonin@lemmy.world · 3 months ago

Tell me again how AGI is just around the corner, Sam

Vupware@lemmy.zip · 3 months ago

When Sammy fuck says “we’re so close to AGI, I can just feel it. Like a tingle on the tip of my shrimpdick it’s getting so close to blossoming into something guys”, just ignore him. He’s crazy man!

Kilgore Trout@feddit.it · 3 months ago

a tingle on the tip of my shrimpdick

mhh that’s erotic ASMR on Youtube

lath@lemmy.world · 3 months ago

Biased study. Take any average person off the streets and shove this thing in their face. That 100% notion will go down fast.

tomalley8342@lemmy.world · 3 months ago

They didn’t say “100% of humans can solve this benchmark”, they said “humans can solve 100% of this benchmark”.


Announcing ARC-AGI-3 | ARC Prize