A file format for video. You write JSON. An engine renders it. A grammar of cognitive moves makes any film composable. A contract keeps it from being slop.
Most video tools start with the canvas: a timeline, layers, keyframes. Docent starts with the moves any piece of thought can make. You declare them. The engine renders. The same grammar handles a code review, a brand quarterly, a poetry close reading, a sci-fi short, a quarterly earnings walkthrough, a documentary.
The format is the surface an LLM can author against without the output drifting into slop. Every scene declares its schema. Every scene declares its depth rules. A film that doesn't say anything doesn't ship.
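One way to picture the contract is a check that runs before anything renders. This is a minimal sketch, not Docent's actual validator: the field names "scenes", "schema", "depth", and "content" are assumptions made for illustration, and the real spec may name or structure them differently.

```python
from typing import Any

# Hypothetical field names, chosen for illustration only.
# The real Docent spec may use different keys.
REQUIRED_SCENE_KEYS = ("schema", "depth")


def contract_errors(film: dict[str, Any]) -> list[str]:
    """Return contract violations; an empty list means the film may ship."""
    errors: list[str] = []
    scenes = film.get("scenes") or []

    # A film with no scenes says nothing, so it fails outright.
    if not scenes:
        errors.append("film has no scenes, so it says nothing")

    for index, scene in enumerate(scenes):
        # Every scene must declare its schema and its depth rules.
        for key in REQUIRED_SCENE_KEYS:
            if key not in scene:
                errors.append(f"scene {index} does not declare '{key}'")
        # A scene with declarations but nothing to say also fails.
        if not scene.get("content"):
            errors.append(f"scene {index} has no content")

    return errors
```

An LLM authoring a spec could run a check like this in a loop and repair whatever it reports before handing the film to the engine.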
Connection. Time. Flow. Comparison. Categorization. Experience. Narrative. Seven clusters of cognition — enough to compose any film. Adding a thirtieth move is a major version bump. That restraint is the format.
A software architecture review. A security essay walked through. A research paper rendered. Docent reviewing itself. Different subjects, different scene compositions, one cascade.
The spec is the source. The render is the artifact. Same path whether the film is a PR review, a documentary, or a brand opener.
Three packages: the framework, the default implementation, the binary. Write a spec at films/<id>.json. Run docent build <id>. Watch.
Ship.
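Because the spec is plain JSON at a known path, any script can author one. The sketch below writes a spec to films/<id>.json, the location named above. The helper names spec_path and write_spec are hypothetical, and so is the shape of the spec passed in; only the path convention and the docent build <id> command come from the format itself.

```python
import json
from pathlib import Path
from typing import Any


def spec_path(film_id: str, root: Path = Path(".")) -> Path:
    """Resolve films/<id>.json under the given project root."""
    return root / "films" / f"{film_id}.json"


def write_spec(film_id: str, spec: dict[str, Any], root: Path = Path(".")) -> Path:
    """Write a spec as JSON where the build step expects to find it."""
    path = spec_path(film_id, root)
    # Create the films/ directory on first use.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(spec, indent=2) + "\n", encoding="utf-8")
    return path
```

After write_spec("demo", spec) succeeds, running docent build demo would be the next step.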