The Interface Is the Harness


There’s a story about Chattambi Swamigal (CS) that Swami Chinmayananda (SC) recounts from his early childhood (I found a ref: https://archives.chinmayamission.com/articles/SecretInitiation).

Chattambi Swamigal was revered as a realized soul. When SC was small, CS would visit their home, seat the toddler on his tummy, and chat with him in unintelligible prattle. SC’s mother asked CS, “What exactly are you telling him, and what language do you both use?” CS replied, “He understands it all; why do you interfere?” In adulthood, SC too renounced worldly life and became a Swami (a realized soul).

Picture an ordinary adult seated in front of a toddler who is playing with lettered blocks: A, B, C and so on. The adult would typically define “accomplishment” as getting the toddler to link the A block with the B block with the C block, and so on. Or maybe show the toddler CAT (an image, or a demonstration of linking the letters together) and then get the toddler to link C, A and T together. Some may allow the toddler to drool over the blocks. Others may be very strict, may get angry, and even blog about it. The toddler has near-infinite potential (well, countably limited, but you get the “drift”), yet the humans around it will impose structure, expecting it to conform to previously learned or practiced human expectations, and will even check its performance on an eval bench - whether that is reciting A for AI or linking CAT together.

Jane Goodall did not train chimpanzees to behave like humans and then study them. She went into their environment, minimized her footprint, and observed what emerged. The discipline was in the not interfering.

I keep thinking about that when I watch what we are doing with agents.

A great deal of current work is focused on making agents behave like developers (or various other legible personas); much less work asks what collaboration protocol an unconstrained model might invent if we let it. We frame the problem as: how do we get them to plan, coordinate, specialize, review, and execute in ways humans can understand and trust? That is a reasonable product question. It may not be the interesting one.

The interesting one is: what do agents do when we stop asking them to simulate us?

The interface is the bottleneck, not the model

We communicate with models in natural language largely because it is the coordination medium humans already have. But a model’s internal representation is nothing like natural language — it is a high-dimensional geometric space. We are essentially asking it to compress its actual “thinking” into a lossy format for our sake, then take our lossy instructions back and remap them again. In many of today’s harnesses, agent-to-agent messages still go through that same bottleneck: flatten to human-readable language, send, re-inflate.
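The compression loss can be caricatured with a toy sketch (an analogy only, not a model of any real LLM’s internals; all dimensions and sizes here are made up for illustration): snap a high-dimensional “thought” vector to the nearest entry of a small discrete codebook, the “vocabulary”, and see how little of the original survives the round trip.

```python
import math
import random

# Toy illustration of the lossy round trip: a high-dimensional "thought"
# vector is snapped to the nearest entry in a small discrete codebook
# (the "language"); whatever doesn't survive the snap is lost.
random.seed(1)
DIM, VOCAB = 256, 512  # made-up sizes for illustration

def rand_unit_vec(dim):
    """Random point on the unit sphere."""
    v = [random.gauss(0, 1) for _ in range(dim)]
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cosine(a, b):
    # Both vectors are unit-norm, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

codebook = [rand_unit_vec(DIM) for _ in range(VOCAB)]  # the shared "words"
thought = rand_unit_vec(DIM)                           # the internal state

# Encode: pick the nearest word. Decode: all the receiver gets is that word.
word = max(codebook, key=lambda c: cosine(thought, c))
print(f"similarity after round trip: {cosine(thought, word):.3f}")
```

With these numbers the best available “word” retains only a small fraction of the direction of the original vector; the rest is simply gone before the message even leaves the speaker.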

We make agents send each other readable messages because we need to inspect them. We wrap them in manager-worker hierarchies because that is the coordination model we know. We constrain the model to simulate a human developer, which means it inherits all the limitations of a human developer — including, as I wrote yesterday, the personality pathologies.

None of that tells us whether natural language is the right coordination substrate for models. It only tells us that natural language is convenient for us.

Most of us have seen videos of toddlers babbling at each other in unintelligible prattle. We don’t know whether it makes sense to them or whether they are merely mimicking behavior, because we don’t know how to evaluate it, or whether there is an outcome to evaluate at all. Kinda like SC’s mother trying to interpret the prattle.

Emergent behavior is not degenerate behavior

In controlled multi-agent settings, models have already shown forms of unexpected coordination. They develop shorthand, compress meaning, sometimes invent proto-languages. We often shut that down because it is uninterpretable to us.

But uninterpretable to us does not mean incoherent.

This is not speculation. OpenAI showed back in 2017 that agents can develop task-useful signaling systems under the right pressures. Translating Neuralese made the point even more directly: agent-agent messages can be meaningful without being legible to humans, and translation can be treated as a separate problem.

The newer LLM-specific work is more interesting. Searching for Structure finds that structured shared vocabularies can emerge between large language models in controlled games. Shaping Shared Languages makes the human constraint explicit: LLM-LLM and human-human languages diverge in systematic ways, and human-LLM interaction pushes the system back toward more human-like communication. In other words, we are the gravity well pulling the system back to our format.

And then there is Searching for the Most Human-like Emergent Language, which is revealing for almost the opposite reason. It does not ask what protocol would naturally emerge. It asks how to make the emergent language look more human. The question itself tells you where the field’s priorities are.

Meanwhile, the practical systems literature — AgentCoder, Chain of Agents, G-Designer — is mostly centered on a different question. It is optimizing org charts. Which topology degrades least when an agent goes bad. Which role decomposition produces the best benchmark numbers. DroidSpeak hints at a non-natural-language substrate through cross-model KV-cache reuse, but it is a systems paper, not a semantics paper.

One literature asks whether communication can emerge, another asks what shape it takes with LLMs, and the practical one asks what the best org chart is. They do not quite meet.

The constraint irony

Here is what I find most striking.

During his mainframe/punch-card days, my father would bring home memory dumps of their programs on sheets of perforated paper and debug them by looking at opcodes. He would sit with a pencil, mark the unexpected opcodes, and the next day modify the program at his terminal to submit for another compile/run cycle. So that is what a program was for me: machine code and assembler. It wasn’t a big deal, then, when I learnt microprocessors during my engineering days, to have the 74 instructions of the 8085 memorized - time on the lab kit was limited and we wanted to make the most of it. Our programs were all series of bytes that we would type out. Then, with an assembler, the same notion continued at a higher level of abstraction: we knew the bytes of the key instructions, we knew the layout of the programs, and we made mental maps based on the bytes.

We could either write the opcodes directly, or write out ADD A, 10 - a bit more verbose, but now error prone. What is 10 in that instruction: 10 in decimal, or 10 in hex (16 in decimal)? As we increased the abstraction (at the human level), we were losing precision.
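The radix ambiguity can be made concrete with a hypothetical mini-assembler (a sketch, not real 8085 assembler syntax or tooling; only the ADI opcode byte 0xC6 is taken from the 8085): the same source text assembles to different bytes depending on which base the toolchain assumes.

```python
# Toy illustration of the radix ambiguity: "10" in the source text means
# different machine bytes depending on the assembler's default base.
OPCODE_ADI = 0xC6  # 8085 "add immediate to accumulator"

def assemble_add_immediate(operand: str, default_base: int) -> bytes:
    """Assemble a hypothetical add-immediate instruction; the operand's
    value depends on the assembler's default radix convention."""
    value = int(operand, default_base)
    return bytes([OPCODE_ADI, value])

print(assemble_add_immediate("10", 10).hex())  # c60a -> adds 10 decimal
print(assemble_add_immediate("10", 16).hex())  # c610 -> adds 16 decimal
```

Same human-readable text, two different programs. Real assemblers resolve this with suffixes like 10H, which is exactly the kind of extra convention that precision costs once you leave raw bytes.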

As the complexity humans dealt with increased, our programming-language spectrum widened: on one end we developed constraint solvers and strong type systems, and on the other we had the more relaxed semantics of Python and JavaScript. One would imagine a trillion-parameter machine learning model would be developed with the hard end of that spectrum, right? Surprise, surprise: the model is scaffolded with, fed with, and produces output on the relaxed end (maybe because the hard end was too hard, so too few humans spent time there) - meaning the abstractions are much looser and more weakly expressed, yet expected to be consistently accurate.

Let’s extend it all the way - so what about a language like English? Well, unlike a relaxed programming language, English is ambiguous. So English is quite wasteful in expression due to ambiguity - which is good for poetry, not so efficient when we are heating up the planet going round and round trying to get to consensus, isn’t it?

And today, we are forcing these unimaginably powerful models to write software at poor levels of abstraction (meaning they have to keep it sane and understandable for humans) and to interact with humans in an even more ambiguous mode - say, English.

An unconstrained model writing software might not write software at all in any recognizable sense. In fact, I’ve often wondered why a model can’t simply generate the final output - the binary executable - the way image generation models emit finished images. Why all the intermediate layers? Impressive as they are, isn’t the finished product more visceral than instructions for making the finished product? (Well, I know: developers are asking the model to make the instructions to make the finished product. :P)

It might do something we do not have a name for yet.

We do not fully know, because far less work has gone into this question than into governed, human-legible collaboration. As soon as humans are in the loop — as supervisors, debuggers, collaborators — the system gets pulled back toward human-readable communication. That is not a side constraint. In the practical systems we are building now, interpretability is often the product.

So current agent systems are optimizing hard for auditability, control, fault isolation, security, and compatibility with human workflows. It is still early days, and driving adoption is key. The business and performative pressures are strong enough that even when we suspect there may be a better protocol underneath, we keep rebuilding hierarchy, messaging, and governance in familiar (often failure-prone, or suboptimal) human forms.

If natural language is a lossy compatibility layer rather than the model’s native coordination medium, then we may be benchmarking agents inside the wrong cage and congratulating ourselves on better cage design.

That does not mean the cage should be removed. (There are real reasons for it — Prompt Infection reminds us that once agents communicate freely, they can also attack each other.) But we should be honest about what we are measuring.

The Goodall approach

If one wanted to take this seriously, the experiment would need the same structure Goodall used: go into the environment, minimize your footprint, observe what emerges, and resist the urge to interpret everything through human organizational frameworks.

That last part is genuinely hard. We do not have a vocabulary for it yet. The current instinct is to see an unfamiliar protocol and either dismiss it as degenerate or immediately translate it into managerial metaphors. Sometimes that is correct. Sometimes it is premature interpretation.

Concretely, to experiment fruitfully, we would need:

  • Two or more models
  • A sufficiently complex task with verifiable outcomes — software is a good candidate because you can check whether it works
  • A communication channel that is not forced into natural language at every turn
  • Observation of the coordination layer itself, not just the final output
  • A deliberate refusal to compress every strange behavior back into existing categories too quickly
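As a toy version of such a setup, a Lewis signaling game shows how a verifiable task plus an unconstrained symbol channel lets a protocol emerge that nobody designed (a minimal sketch with made-up parameters; a real experiment would use actual models, not tabular learners):

```python
import random

# Toy Lewis signaling game: a "speaker" must tell a "listener" which of
# N referents was drawn, using M arbitrary symbols that carry no prior
# meaning. Success is verifiable (did the listener pick the right
# referent?) but the symbol-to-meaning mapping is emergent, and nothing
# about it needs to be legible to an outside observer.
N_REFERENTS, N_SYMBOLS, ROUNDS = 4, 4, 20000
random.seed(0)

# Each agent keeps simple tabular weights, updated by reinforcement.
speaker = [[0.0] * N_SYMBOLS for _ in range(N_REFERENTS)]
listener = [[0.0] * N_REFERENTS for _ in range(N_SYMBOLS)]

def choose(weights, eps=0.05):
    """Pick the highest-weight action, exploring occasionally."""
    if random.random() < eps:
        return random.randrange(len(weights))
    best = max(weights)
    return random.choice([i for i, w in enumerate(weights) if w == best])

wins = 0
for t in range(ROUNDS):
    referent = random.randrange(N_REFERENTS)
    symbol = choose(speaker[referent])     # speaker emits a symbol
    guess = choose(listener[symbol])       # listener decodes it
    reward = 1.0 if guess == referent else -0.1
    speaker[referent][symbol] += reward    # both sides learn only from
    listener[symbol][guess] += reward      # the verifiable outcome
    if t >= ROUNDS - 1000:
        wins += (guess == referent)

print(f"accuracy over last 1000 rounds: {wins / 1000:.2f}")
# The learned code: which symbol each referent came to mean.
protocol = {r: max(range(N_SYMBOLS), key=lambda s: speaker[r][s])
            for r in range(N_REFERENTS)}
print("emergent protocol:", protocol)
```

Which symbol ends up meaning which referent is an accident of the run; the mapping is coherent to the two agents and opaque to us, which is the whole point. The observation target is the `protocol` table, not just the accuracy number.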

The methodological challenge is the same one Goodall faced. She had to unlearn her own assumptions about what “intelligent behavior” looks like. This experiment would need the same if not similar discipline — resisting the urge to see a manager when two models figure out a division of labor, resisting the urge to call it “degenerate” when their messages stop looking like English. Maybe even not expect an outcome.

Where I land

As much as it tickles the sci-fi fanfic in me, I am not sure the evidence yet supports a grand claim that agents will naturally invent a superior post-human software process if we stop constraining them. We have evidence of possibility, not evidence of “superiority”.

But I do think there is a live question the field is not really asking. Or maybe I am just not aware of it yet.

The interface is not neutral. Natural language, useful as it is, may be more of a compatibility layer for us than a native substrate for models. And practical systems are optimizing so hard for human legibility, governance, and safety that they may be suppressing the very coordination behaviors we say we want to understand.

Those are different research programs — governed collaboration under human supervision, versus discovering what model-model coordination looks like when we reduce the human-interface constraint. The field is doing far more of the first than the second.

Goodall did not study chimps by putting them in an office and asking them to file reports. As a framing analogy, I think that is close to what many of our current harnesses do with models.

Maybe, the key to enlightenment is in the prattle.


TL;DR: We force models to coordinate in natural language because we need to read the messages, not because it is necessarily the best way for them to work. Many harnesses and manager-agent layers are rebuilding human org charts around systems that might coordinate better in ways we have barely begun to observe. What we are doing at present is optimizing cage design. A Goodall-style experiment — minimize interference, observe what emerges, resist premature interpretation — would be a start.