Sign of the future: GPT-5.5

Ethan Mollick had a problem that a lot of researchers quietly share. He had hundreds of anonymized data files sitting in folders, collected during his crowdfunding research in the early 2010s. Surveys, spreadsheets, Word documents, a mix of formats, all carefully gathered. And then life moved on, and the paper never got written.
For about a decade, those files just sat there.
Then he gave four prompts to OpenAI's new Codex, powered by GPT-5.5. He asked it to sort the data, generate an interesting hypothesis, test it using rigorous statistical methods, include a literature review, and format the result as an academic paper. No manual editing. No touching the text himself. Four instructions.
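The source does not reproduce Mollick's exact wording, but the workflow is easy to make concrete. A rough sketch, where the prompt text is an illustrative paraphrase rather than his actual prompts:

```python
# Illustrative paraphrase of the four instructions Mollick gave Codex.
# The exact wording here is an assumption, not his actual prompt text.
prompts = [
    "Sort and organize the anonymized data files in this folder "
    "by format and topic.",
    "Generate an interesting, testable hypothesis from the organized data.",
    "Test the hypothesis using rigorous statistical methods "
    "and report the results.",
    "Write up the findings as an academic paper, including a "
    "literature review, and format it for submission.",
]

# Each prompt builds on the output of the previous one; no human
# editing happens between steps.
for step, prompt in enumerate(prompts, start=1):
    print(f"Prompt {step}: {prompt}")
```

The point is less the wording than the shape: four sequential delegations, with the human contribution reduced to deciding what to ask for.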
The paper it produced could have passed for the work of a second-year PhD student. The literature review cited real papers. The statistics were sound. The formatting was clean.
The only critique Mollick had, as a genuine expert in the field, was that the hypothesis was not especially exciting, and there were some standard concerns about causation. The AI had done the technical work with sophistication. The judgment call was the one thing it still left on the table.
Why This Matters for You
That story is not about academia. It is about what happens when the tools you use every day quietly become capable of things that used to require specialists.
To understand what changed, it helps to think about AI in three layers. The first is the model itself, meaning the underlying intelligence, such as GPT-5.5, Claude Opus 4.7, or Gemini 3.1. The second is the app, meaning the product you use to talk to it, like chatgpt.com or claude.ai or desktop tools like Claude Code. The third is the harness, meaning the tools the AI can actually use: writing code, generating images, browsing the web, controlling your computer.
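One way to keep the three layers straight is to write them down as a structure. This is just a sketch of the framing above; the example entries come from the article, and the data structure itself is merely one way to make the framing concrete:

```python
from dataclasses import dataclass, field

# A minimal sketch of the three-layer framing: model, app, harness.
# The structure is illustrative; the example entries are the ones
# named in the article.

@dataclass
class StackLayer:
    name: str                       # which layer of the stack
    role: str                       # what that layer contributes
    examples: list = field(default_factory=list)

stack = [
    StackLayer("model", "the underlying intelligence",
               ["GPT-5.5", "Claude Opus 4.7", "Gemini 3.1"]),
    StackLayer("app", "the product you use to talk to it",
               ["chatgpt.com", "claude.ai", "Claude Code"]),
    StackLayer("harness", "the tools the AI can actually use",
               ["writing code", "generating images", "browsing the web",
                "controlling your computer"]),
]

for layer in stack:
    print(f"{layer.name}: {layer.role} ({', '.join(layer.examples)})")
```

The framing matters because improvements compound across layers: a better model makes the same harness more capable, and a better harness makes the same model more useful.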
When all three of these improve at the same time, which is what has happened in 2025 and into 2026, the gap between what an individual can produce and what used to require a team or a specialist starts to close.
That gap is your opportunity.
What the Tools Can Actually Do Now
Image generation reached a point this year where it can render readable, accurate text inside images. That might sound minor, but it means you can now generate a product mockup with legible labels, a polished slide with real copy, or a UI wireframe that looks like a finished design rather than a rough sketch. For freelancers doing client work in marketing, product, or content, this compresses timelines significantly.
On the coding side, Mollick ran a test where he asked several AI models to build a procedurally generated 3D simulation of a harbor town evolving across five thousand years of history. Only GPT-5.5 Pro built something that actually modeled change over time rather than just swapping out buildings, and it finished in twenty minutes; the previous model version took thirty-three.
As a second experiment, he asked Codex to create an entirely new tabletop roleplaying game, complete with rules, lore, tables, and playtesting simulations. The result was a formatted, illustrated, 101-page PDF. The setting was original, the rules were internally consistent, and the illustrations matched the world.
The content had real problems too. The fiction was flat. Every character spoke in the same tone. But as a proof of system, it was striking.
This is what people in AI research call the jagged frontier. The tools are extraordinarily good at some things and still clumsy at others. Long-form creative writing with genuine voice is still difficult. Novel hypotheses that require real-world wisdom are still weak. But structured, technical, repeatable work? That has moved fast.
The Practical Read for Freelancers and Builders
The researchers, the marketers, and the developers who are moving quickly right now are not chasing every new model announcement. They are thinking about which layer of the stack can make the biggest difference to the work they already do.
If you are a freelancer, the question is not whether AI will replace your service. The question is how much of the work you currently charge hourly for can now be done in minutes, and what you do with that time differential.
If you are building a product or side project, the practical thing Mollick's experiments show is that the bottleneck has moved. The technical execution of a structured deliverable (a research report, a rules document, a formatted PDF) is no longer the constraint. The constraint is now your taste, your judgment, and your ability to define what a good outcome actually looks like.
That is genuinely good news if you have domain expertise. The tools can do the mechanical parts. The part that requires knowing whether the hypothesis is interesting or whether the rules are actually fun to play? That still lives with you.
THREE THINGS WORTH TRYING THIS WEEK
Pick one deliverable you produce regularly for clients (a report, a content brief, a competitor analysis) and run the whole thing through a model with a detailed prompt. Compare your time to the output quality. You may surprise yourself.
If you have data sitting around that you never turned into anything, whether that is a survey you ran, customer feedback you collected, or analytics you exported, try feeding it to an AI and asking for a narrative summary with insights. Treat it as a first draft, not a final answer.
The image generation improvements are worth experimenting with for anyone doing client-facing work. Try generating a product mockup, a social media visual with text, or a simple slide using a detailed prompt. The text rendering alone changes what is possible.
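For the first suggestion, the detail in the prompt does most of the work. A hypothetical scaffold for briefing a model on a recurring deliverable, where the section names, function, and placeholder text are all assumptions rather than any prescribed format:

```python
# A hypothetical prompt scaffold for running a recurring client
# deliverable through a model. Everything here (function name, sections,
# placeholder content) is illustrative, not a prescribed format.
def build_deliverable_prompt(deliverable, audience, inputs, constraints):
    """Assemble a detailed prompt from the same pieces you would
    use to brief a junior colleague."""
    return (
        f"You are producing a {deliverable} for {audience}.\n\n"
        f"Source material:\n{inputs}\n\n"
        f"Constraints:\n{constraints}\n\n"
        "Deliver a complete first draft, flag any assumptions you had "
        "to make, and list the facts that need human verification."
    )

prompt = build_deliverable_prompt(
    deliverable="competitor analysis",
    audience="a B2B SaaS client's marketing lead",
    inputs="- pricing pages for three competitors\n"
           "- last quarter's win/loss notes",
    constraints="- two pages maximum\n- cite the source for every claim",
)
print(prompt)
```

Asking the model to flag its own assumptions is the useful habit here: it keeps the output a first draft you verify, not a final answer you forward.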
One honest note: the pace of improvement in these tools has been consistent for over three years. Every few months, something that required a specialist becomes something an individual can do. The practical edge right now is not in knowing that these tools exist. It is in building the habits to use them on real work, regularly, and in developing the judgment to know when the output is good enough.
The paper was not perfect. But it was done.
REFERENCED THIS ISSUE:
The AI-generated crowdfunding paper: Read on Google Drive
The 101-page AI-created RPG: View the PDF




