The "Problem" Problem (11.26.25)

My preferred framework for assessing competing AI research programs is still the Problem problem: the essential difficulty of defining an objective function for "intelligence" in an extraordinarily high-dimensional search space. This remains the Hamming question for AI research.

Our lack of success in generating a positive definition of intelligence across AI, neuroscience, psychology, and theory of mind should give us pause. LLMs offered a clever way to defer this question à la Potter Stewart: we know it when we see it - and conveniently, we see it strewn over the corpus of the internet. [1]

Yet while recent progress has been impressive and surprising, a contemporary version of Moravec's paradox persists: frontier models can sweep a math olympiad, but children still appear to outperform them in transfer learning. Expert opinion also appears to be gradually souring on the returns from scaling LLM pre-training alone. [2]

Sutton's recent critique of the language approach is that it is just an internet-scale instance of expert knowledge masquerading as the opposite. I think the framing of the bitter lesson is actually a bit misleading: perhaps a more appropriate lens is identifying the habitable zone of abstraction for computational intelligence.

With the existence proof of human intelligence generated via emergence, we can hypothesize - intractable as it may be - that running a physics simulation of a local region of the universe eventually yields what we're after. If we flatten this out onto an information-density line with progressive abstraction going to the right, there is a left-bounded interval of what we can computationally mine in the present and a right-bounded interval of intelligence-sustaining complexity. [3] There may or may not be an overlapping region right now, but the left-bounded interval shifts leftward as technology progresses.
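The 1D picture above can be made concrete with a toy sketch. The bounds and positions here are illustrative placeholders (there is no real metric for "abstraction level"); the point is only the geometry: a left-bounded tractability interval, a right-bounded complexity interval, and an overlap that appears as the compute bound migrates leftward.

```python
# Toy model of the "habitable zone of abstraction" described above.
# Positions on [0, 1] stand in for abstraction level: 0 = raw physics
# simulation, 1 = maximal abstraction. All numbers are hypothetical.

def habitable_zone(compute_bound: float, intelligence_bound: float):
    """Return the overlapping interval, or None if the zones are disjoint.

    compute_bound:      left bound of the computationally minable region
                        (everything to its right is tractable today).
    intelligence_bound: right bound of the intelligence-sustaining region
                        (everything to its left retains enough complexity).
    """
    if compute_bound <= intelligence_bound:
        return (compute_bound, intelligence_bound)
    return None

# Today the zones may be disjoint: tractability starts too far right.
print(habitable_zone(0.7, 0.5))  # None

# As technology progresses, the compute bound shifts leftward and an
# overlap - the habitable zone - can open up.
print(habitable_zone(0.4, 0.5))  # (0.4, 0.5)
```

In this framing, "applying the bitter lesson" is just choosing a point as close to the left edge of the overlap as possible.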

In this picture, the application of the bitter lesson is to stay as far to the left as possible while remaining in the universe of harnessable compute. It's easy to say "simulate the biosphere without priors," but I suspect that even Sutton's approach quietly embeds knowledge of neuroscience and biology. At the very least, these inform the animal models undergirding his intuition.

To solve the Problem problem we likely need further soft constraints on the intelligence search space (to borrow Karpathy's phrasing) to supplement or even substitute for LLMs.

Modeling the neuromorphology of the brain could be one viable research direction because we are continuing to accrue observations through neuroscience about the machine the ghost inhabits. We still can't adequately describe what comes off the intelligence factory line, but we can measure the factory itself against our reference implementation and perform gradient descent.

The emergent processes of gene expression and neurogenesis could provide some of the additional constraints we're looking for. The alternatives are frankly unclear.

---

[1] In broad terms, treat the internet as a noisy human imprint and bank on your universal approximator converging on the "intelligence" region of the search space as a necessary means of reducing error.

[2] Sutskever, Karpathy, Carmack, Sutton, and LeCun, among other recent voices.

[3] Of course, this 1D view is a simplification and we can treat this as a search for a habitable zone of overlap in model-space between regions of "intelligence" and regions that are computationally tenable.