The word “intelligence” is doing a lot of heavy lifting here. LLMs lack any mechanism for true logical reasoning, and by nature they always will. This is why they fail at simple questions like “the car wash test”. It’s also why agents are expensive: they just flail around in token-hungry “reasoning loops” until they happen to come across a correct solution. And it’s why Claude Opus 4.8 (High) only scores 1.5% on the ARC-AGI-3 benchmark at a cost of $10,000.
This kind of panic is just part of the hype. Wake me up when real intelligence arrives.