Causal World Models, or Just Very Good Predictors?

A lot of recent AI work uses the language of causality more confidently than the evidence supports. Latent prediction, object masking, action-conditioned rollouts, and counterfactual-style benchmarks are all useful signals. But none of them, by themselves, show that a model has learned causal mechanisms.

That distinction matters.

A model can predict what usually happens next without knowing what makes it happen. It can perform well in a simulator without separating state from mechanism. It can answer a counterfactual-style benchmark by exploiting regularities in the data rather than by representing an invariant causal process.

This essay is about that gap: the difference between representations that are causally useful and representations that actually identify causality.

A model predicts future latent states. It handles masked objects. It does well on control. It passes some counterfactual-style benchmark. Then somewhere in the abstract or introduction, the word "causal" appears.

Sometimes the claim is careful: “causal inductive bias,” “counterfactual-like,” “action-conditioned world model.” Fair enough.

But the public interpretation often becomes stronger:

The model learned causality.

That is usually not what has been shown.

Most of these systems do not prove that they have learned causal mechanisms. They show that their representations are useful for prediction, control, or counterfactual-style evaluation. Those are important results. They are not the same thing as causal understanding.

Prediction is not causality

A predictive model learns what tends to happen.

A causal model learns what makes something happen.

That distinction sounds philosophical, but it is very practical.

A model can predict that a glass will fall after it is pushed. It may even predict the trajectory. But has it learned the mechanism of support, contact, force, mass, surface geometry, and gravity? Or has it learned a powerful statistical mapping from video context and action to next-frame latent states?

Both can produce good predictions. Only one gives you causal structure.

This is the old problem with correlation, but in modern form. Instead of correlating scalar variables, we now correlate huge latent embeddings. The scale changed. The epistemic problem did not.

A latent world model of the form:

z_{t+1} = f(z_t, a_t)

can be extremely useful. It can support planning. It can improve control. It can generalize in some regimes.

But that equation alone does not say that the model has separated state, action, mechanism, context, preconditions, effects, and invariants. It may have entangled all of them into a convenient predictive representation.

And if those things are entangled, calling the result “causal” is premature.
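To make the shape of the claim concrete, here is a minimal sketch of what such a model typically is in code. The names (LatentDynamics, latent_dim, action_dim) are illustrative placeholders, not taken from any particular paper; the point is only that f is one learned mapping.

```python
import torch
import torch.nn as nn

class LatentDynamics(nn.Module):
    """A generic action-conditioned predictor: z_{t+1} = f(z_t, a_t).

    Nothing in this module separates mechanism from state or context;
    it is a single learned mapping, however well it predicts.
    """

    def __init__(self, latent_dim: int, action_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + action_dim, hidden),
            nn.ELU(),
            nn.Linear(hidden, latent_dim),
        )

    def forward(self, z_t: torch.Tensor, a_t: torch.Tensor) -> torch.Tensor:
        # State and action enter as one concatenated vector; whatever
        # "causal" structure exists is entangled inside the weights.
        return self.net(torch.cat([z_t, a_t], dim=-1))
```

Trained with a prediction loss, this can be a very good predictor. Nothing about the objective forces it to factor the transition into separate mechanisms.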

JEPA-style objectives are representation learning, not causal identification

JEPA-style models are an important development. The basic idea is attractive: do not reconstruct pixels; predict missing or future representations in latent space. I-JEPA, for example, predicts target image-block embeddings from context embeddings and learns useful semantic representations without hand-crafted augmentations.
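As a rough sketch of the objective's shape, assuming placeholder modules (context_encoder, target_encoder, predictor) and two views of the same image, the loss lives entirely in representation space. This is not I-JEPA's exact recipe, only the general form of a latent-prediction loss.

```python
import torch
import torch.nn.functional as F

def jepa_style_loss(context_encoder, target_encoder, predictor,
                    context_view, target_view):
    """Predict the embedding of a hidden region from a visible context.

    This is a latent-prediction objective, not a causal one: nothing here
    asks what intervention would change the hidden content, only what its
    embedding usually is given the context.
    """
    with torch.no_grad():
        target = target_encoder(target_view)    # embeddings of the hidden region

    context = context_encoder(context_view)     # embeddings of the visible region
    predicted = predictor(context)              # guess the hidden embeddings

    return F.smooth_l1_loss(predicted, target)  # latent regression, no pixels
```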

That is a good self-supervised learning objective.

But it is not a causal-identification objective.

If a model predicts the representation of a missing region in an image, it may learn object structure, scene regularities, part-whole relations, and semantic priors. These are useful. They may be prerequisites for causal reasoning. But they do not prove causal reasoning.

A model can infer that a bicycle likely has two wheels without understanding the causal role of the wheels. It can infer that a hand near a cup often precedes motion of the cup without representing the mechanism of grasping. It can infer that one object occludes another without knowing what intervention would change the scene.

Latent prediction gives structure. But structure is not automatically causality.

World models are not automatically mechanism models

World models are often described as if they are close to causal models. Sometimes they are. But a lot depends on what is meant by “world model.”

A Dreamer-style model, for example, learns latent dynamics that are useful for imagined rollouts and control. DreamerV3 is impressive because it learns across many domains with a single algorithmic setup.

But its core object is still a learned dynamics model. It answers something like:

Given current latent state and action, what latent state comes next?

That is not the same as:

What mechanism produced this transition, where else does it apply, and how does it compose with other mechanisms?

The first is dynamics prediction.

The second is causal mechanism learning.

A model can be good at the first and weak at the second. In fact, most current models probably are.
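For concreteness, the first question is answered by a rollout loop over a learned dynamics model. The sketch below is deliberately generic rather than DreamerV3's actual implementation; dynamics and policy are placeholders for learned modules.

```python
def imagined_rollout(dynamics, policy, z0, horizon: int = 15):
    """Roll a learned dynamics model forward in latent space.

    Every step answers "what latent state comes next?"; no step answers
    "what mechanism produced this transition, and where else does it apply?"
    """
    z, trajectory = z0, []
    for _ in range(horizon):
        a = policy(z)          # pick an action from the current latent state
        z = dynamics(z, a)     # predict the next latent state
        trajectory.append((z, a))
    return trajectory
```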

This matters because “causal” is not just “temporally predictive.” If that were true, every sufficiently good video predictor would be a causal reasoner. But we know that is not enough. A video model can predict familiar physics while failing under interventions that break the training distribution. It can produce plausible futures without knowing the underlying causal variables.

Counterfactual-like is not counterfactual

Object-centric masking is a step in the right direction.

C-JEPA-style work argues that masking objects rather than random patches creates something closer to latent interventions. If one object is hidden, the model must infer its state from the rest of the scene. That can encourage interaction reasoning and counterfactual-style behavior.

This is a better bias than patch masking. A random patch is often just texture completion. An object mask asks a more structural question.
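The difference is visible in how the mask is built. The helper names below are invented for illustration; the only point is that one mask is defined by texture-scale geometry and the other by an object's segmentation.

```python
import torch

def random_patch_mask(h: int, w: int, patch: int = 16, drop_frac: float = 0.5):
    """Hide random patches: often answerable by texture completion alone."""
    grid = torch.rand(h // patch, w // patch) < drop_frac
    return grid.repeat_interleave(patch, 0).repeat_interleave(patch, 1)

def object_mask(segmentation: torch.Tensor, object_id: int):
    """Hide one whole object: the model must infer its state from the rest
    of the scene, which is closer to (but not the same as) an intervention."""
    return segmentation == object_id
```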

But “counterfactual-like” is doing a lot of work.

A model can succeed at a counterfactual-style benchmark by learning simulator regularities, object co-occurrence statistics, dataset biases, or common interaction templates. That does not prove it has recovered the actual causal mechanism.

A benchmark may ask:

What would the scene look like if this object were removed?

But a causal mechanism claim asks:

Did the model identify the invariant process that explains how this object participates in the scene, across interventions and contexts?

Those are different standards.

Counterfactual performance is evidence. It is not proof.

“Causal inductive bias” is a weaker claim

Some papers are careful and only claim to introduce a causal inductive bias. That is a reasonable phrase.

A causal inductive bias means the architecture or loss encourages the model to organize information in a causally useful way.

Examples:

  • object-level masking,
  • action conditioning,
  • temporal prediction,
  • interaction graphs,
  • slot-based representations,
  • intervention-like perturbations,
  • invariance across environments.

These are all good ideas.
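To make one of them concrete, here is a minimal sketch of the last item in the list, in the spirit of a V-REx-style penalty on the variance of per-environment risks. The function and its arguments are illustrative, not taken from any of the papers discussed.

```python
import torch

def invariance_penalty(model, env_batches, loss_fn, weight: float = 1.0):
    """Penalize risk that varies across environments (a V-REx-style bias).

    Low variance across environments nudges the model toward predictors
    that hold up when the environment changes; it does not certify that
    the learned predictor is a causal mechanism.
    """
    risks = torch.stack([
        loss_fn(model(x), y) for x, y in env_batches  # one mean risk per environment
    ])
    return risks.mean() + weight * risks.var()
```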

But an inductive bias is not an identification result.

You can bias a model toward causal structure and still have it learn a shortcut. This is not a minor concern. Deep networks are excellent shortcut machines. If there is an easier predictive path through the data, they often find it.

That is why “the model does well after we added a causal-looking bias” is not the same as “the model learned causality.”

The missing distinction: variables, mechanisms, and laws

A lot of the confusion comes from compressing several different problems into one word.

Causal learning has at least three layers:

  1. Causal variables
    What are the right high-level variables? Object, position, support, ownership, belief, permission, force, goal, etc.
  2. Causal mechanisms
    What process maps causes to effects? What changes what? Under what conditions?
  3. Compositional structure
    How do mechanisms combine? What happens when one intervention enables or disables another?

Many current models focus mostly on the first half of the first problem and some of the second. They learn useful latent variables and predictive dynamics. But they rarely demonstrate clean mechanism separation, let alone mechanism composition.

The causal representation learning literature is more careful about this. Schölkopf et al. frame the discovery of high-level causal variables from low-level observations as a central open problem, not as something solved by prediction alone.

That caution is often lost when world-model papers are summarized.

Where Lie algebra fits — and where it does not

Lie theory is one of the places where “causal-looking structure” can actually be mathematically clean.

If the transformations are smooth, continuous, reversible, and symmetry-like, Lie groups and Lie algebras are a natural language. Rotations, translations, scalings, pose changes, continuous dynamics, and some control problems can be described this way.

A Lie algebra gives infinitesimal generators. Exponentiating those generators gives transformations. This is a beautiful fit for many physical and geometric domains.

For example:

small rotation generator → exp → finite rotation

That is not just a latent vector. It is a structured transformation.
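A minimal numeric sketch, assuming NumPy and SciPy: the single generator of so(2), scaled and exponentiated, yields an ordinary rotation matrix.

```python
import numpy as np
from scipy.linalg import expm

# Generator of 2D rotations: the single basis element of the Lie algebra so(2).
G = np.array([[0.0, -1.0],
              [1.0,  0.0]])

theta = 0.3                # a small rotation "amount"
R = expm(theta * G)        # exponentiate the generator -> finite rotation

# R is the familiar rotation matrix [[cos t, -sin t], [sin t, cos t]].
assert np.allclose(R, [[np.cos(theta), -np.sin(theta)],
                       [np.sin(theta),  np.cos(theta)]])
```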

This is why equivariant neural networks and Lie-group-based architectures are interesting. They bake in the fact that certain transformations should behave consistently. A rotation of the input should correspond to a rotation of the representation or output. That is a real structural constraint.

But Lie algebra is not a general theory of causality.

Many causal mechanisms are not smooth, reversible group actions.

Consider:

  • unlock,
  • delete,
  • buy,
  • promise,
  • ask,
  • repair,
  • search,
  • prove,
  • approve,
  • deny,
  • transfer ownership,
  • revoke access.

These are often discrete, conditional, irreversible, resource-sensitive, institutional, linguistic, or partially observable. A “promise” is not a rotation in some high-dimensional vector space, unless the word “rotation” has been stretched so far that it no longer explains anything.

Lie theory is powerful when the world gives you symmetry and smooth transformation. It is not the right outer framework for all mechanisms.

So if a model learns good latent dynamics for continuous control, it may have learned something Lie-like. That is valuable. But it should not be confused with general causal understanding.

Category theory is closer to the right language, but not a magic wand

If Lie algebra is too narrow, category theory is almost the opposite: extremely general.

Category theory starts with objects and morphisms:

f:A→B

A morphism is a typed transformation from one thing to another. Morphisms compose:

g∘f:A→C

This is already much closer to the language people implicitly use when they talk about mechanisms.

An action, a program, a physical process, a proof step, a database migration, a probabilistic transition, a rewrite rule, or a neural module can all be viewed as a morphism in some category.
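A minimal sketch of that view in Python, with invented toy morphisms: arrows carry types, and composition is only defined when the types line up.

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")

def compose(g: Callable[[B], C], f: Callable[[A], B]) -> Callable[[A], C]:
    """g ∘ f: arrows compose only when the interfaces match (f: A -> B, g: B -> C)."""
    return lambda a: g(f(a))

# Toy morphisms with explicit interfaces:
def parse(text: str) -> int:       # parse: str -> int
    return int(text)

def double(n: int) -> int:         # double: int -> int
    return 2 * n

pipeline = compose(double, parse)  # pipeline: str -> int
assert pipeline("21") == 42
```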

That is why category theory shows up in compositional systems, programming language semantics, process theory, open systems, and probabilistic causality. Fong and Spivak’s Seven Sketches in Compositionality is a good entry point for this perspective.

Category theory gives a clean way to ask:

  • What are the types of things being transformed?
  • What are the allowed arrows?
  • When can arrows compose?
  • What structure is preserved?
  • What is the identity transformation?
  • What does parallel composition mean?

This sounds highly relevant to causal world models.

But here is the catch: most neural networks do not actually learn morphisms in this sense.

They learn tensor functions.

A transformer block is a differentiable map from arrays to arrays. You can describe it categorically if you want. That does not mean the trained model has discovered typed causal arrows in the world.

This is where some category-theory-for-AI discussion becomes misleading. It is easy to draw diagrams where boxes compose. It is much harder to make a neural network learn the right boxes, with the right interfaces, from data.

Category theory may be the right semantic language. It is not, by itself, an empirical result.

The category error in “causal world model”

The phrase “causal world model” often commits a category error.

It treats “good predictive latent state” as if it were the same type of thing as “causal mechanism.”

But these are different objects.

A latent state is a representation of what is.

A causal mechanism is a representation of what changes what.

A dynamics model maps one latent state to another.

A mechanism explains why that transition occurs, when it applies, what it preserves, what it changes, and how it behaves under intervention.

Those distinctions matter.

If a model’s latent space entangles all of this into one vector, it may still work well. It may even look surprisingly intelligent. But calling it causal hides the unresolved problem.

What would count as stronger evidence?

Without proposing a new architecture, we can at least say what the evidence should look like.

A convincing causal mechanism claim would need to show more than prediction, masking, or control.

It would need tests where:

  1. Shortcut prediction and mechanism learning diverge
    The dataset must contain cases where the easiest statistical predictor fails but the mechanism generalizes.
  2. The same mechanism transfers across contexts
    The model should identify the same underlying process when surface features change.
  3. Interventions break correlations
    The evaluation should include interventions that destroy training-set associations (a minimal sketch of this follows the list).
  4. The model separates changed and invariant factors
    Causality is partly about knowing what does not change.
  5. The representation supports compositional tests
    If one process followed by another produces a third effect, the internal structure should reflect that.
  6. Ablations show the causal variable is load-bearing
    Removing or swapping the supposed mechanism representation should change the outcome in the expected way.
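To make point 3 concrete, here is a toy sketch of what such an evaluation can look like. Everything below is invented for illustration (the variable names, the data-generating process, the strength values); it is not from any benchmark, only a minimal example of a spurious correlation that holds in training and is broken under intervention.

```python
import numpy as np

def make_env(n: int, spurious_strength: float, rng):
    """Toy data: y is caused by x_causal; x_spurious merely correlates with y
    in training (strength near 1) and is decorrelated under intervention (0.5)."""
    x_causal = rng.normal(size=n)
    y = (x_causal > 0).astype(float)
    flip = rng.random(n) < spurious_strength
    x_spurious = np.where(flip, y, 1 - y) + 0.1 * rng.normal(size=n)
    return np.stack([x_causal, x_spurious], axis=1), y

rng = np.random.default_rng(0)
X_train, y_train = make_env(10_000, spurious_strength=0.95, rng=rng)  # correlation intact
X_test,  y_test  = make_env(10_000, spurious_strength=0.5,  rng=rng)  # intervention breaks it

# A model that leans on x_spurious looks strong on held-out training-like data
# but collapses on the interventional split; a model that identified the
# mechanism (x_causal -> y) does not.
```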

Most current papers do not clear this bar. Some gesture toward parts of it. Very few make it the central test.

Why this matters

This is not pedantry.

If we overstate what current world models have learned, we will design the next generation around the wrong bottleneck.

If the bottleneck is “better prediction,” then more data, larger models, and more stable latent objectives are the obvious path.

If the bottleneck is “mechanism identification,” then the issue is different. It is about variables, interventions, invariances, compositionality, and the separation between state and process.

Those are not the same research program.

The current literature is full of progress on representation learning and predictive dynamics. That progress is real. But it should not be confused with having solved causality.

A more honest vocabulary

It would help if papers used more precise terms.

Instead of saying:

We learn a causal world model.

Say:

We learn an action-conditioned predictive world model.

Or:

We introduce a causal inductive bias through object-level masking.

Or:

We evaluate counterfactual-style generalization.

Or:

We learn representations useful for control under interventions.

Those are still strong claims. They are just more accurate.

The word “causal” should be reserved for cases where the model actually identifies mechanisms, not merely cases where it performs well on tasks that smell causal.

The bottom line

JEPA-style models, world models, object-centric predictors, and counterfactual masking methods are useful. Some are genuinely impressive.

But most of them do not prove causality.

They show that predictive representations can be made more abstract, more stable, more object-aware, more action-sensitive, or more useful for control.

That is not the same as learning causal mechanisms.

Lie algebra gives a beautiful mathematical structure for smooth symmetry-like transformations. Category theory gives a broader language for typed compositional processes. Both help clarify what real causal structure would look like.

And that clarification makes the current gap obvious:

Most so-called causal world models are still learning what tends to happen.

They are not yet showing that they know what makes things happen.

Disclosure: This post was drafted with AI assistance from research notes we are developing at Noumenal on causal world models and mechanism representation. The goal is not to claim that our approach solves causality, but to clarify why much of the current “causal world model” literature demonstrates causal usefulness rather than causal identification.
