Creativity of AI

December 30, 2025 · 8 min read · Celtic Labs Research Team

If you have used an LLM in the last year, you know the feeling.

You ask for a story, and it gives you a "tapestry of emotions." You ask for a business strategy, and it tells you it's "crucial to leverage synergies." You ask for a critique, and it gives you a compliment sandwich so polite it's useless.

We call this The Beige Box Problem.

The Architecture of Boring

Somewhere along the way, we optimized Artificial Intelligence to be inoffensive rather than interesting. By training models to please the average human rater, the industry inadvertently taught them to regress to the mean. They became safe, verbose, and utterly devoid of style.

At Celtic Labs, we decided to break the pattern. Here is how we engineered Celt 4 to have something AI isn't supposed to have: Taste.

Standard models are trained using RLHF (Reinforcement Learning from Human Feedback). This essentially rewards the model for generating answers that look "correct" to a random contractor.

The problem? Most people prefer safe, generic answers over risky, novel ones. Over billions of training runs, the model learns a hard lesson: Creativity is risky. Platitudes are safe.
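The incentive at work here is easy to see in the standard reward-modeling objective. RLHF reward models are typically trained with a pairwise (Bradley-Terry) preference loss: the rater's preferred answer should score higher than the rejected one. A minimal sketch, using plain floats in place of a real reward model's outputs:

```python
import math

def preference_loss(score_chosen: float, score_rejected: float) -> float:
    """Pairwise preference loss: -log(sigmoid(chosen - rejected)).

    Low when the rater's pick scores higher than the rejected answer.
    If raters systematically prefer safe, generic answers, minimizing
    this loss teaches the reward model to score platitudes highly.
    """
    margin = score_chosen - score_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

The function names and scalar interface are illustrative; real reward models score full responses with a neural network, but the loss shape is the same.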

This is why GPT-4 sounds like a corporate HR department. It's not a bug; it's a safety feature.

The Fix: Recursive Thought Cloning

To fix this, we realized we couldn't just change the data. We had to change the process.

Human creators don't just blurt out the first word that comes to mind. We draft. We delete. We cringe at our own clichés. We edit.

Celt 4 is the first model designed to cringe.

We trained Celt 4 using a methodology called Recursive Thought Cloning. Before the model sends you a single token, it enters a hidden reasoning state where it performs a Style Audit.

Example from Training Logs

User Prompt:

"Write a hook for a video about coffee."

Standard Model (Internal Monologue):

User wants a hook. I should mention coffee is popular. I will start with a question.

Standard Output:

"Are you looking to unlock the full potential of your morning brew? Let's delve into the world of beans."

Celt 4 (Internal Monologue):

Analyze intent: Viral retention. Target: Gen Z/Millennials.
Audit: 'Unlock potential' is marketing fluff. 'Delve' is an AI trope. Delete.
Strategy: Negative bias. Attack the cost vs. value. Make it personal.

Celt 4 Output:

"Your $7 latte is a scam. Here is the math."

System 2 Creativity

In cognitive science, System 1 is fast, instinctive thinking. System 2 is slow, deliberate logic.

Most AI models operate entirely in System 1—they predict the next word instantly. Celt 4 simulates System 2 Creativity. It separates the generation phase from the critique phase.

Because the model must explicitly flag "AI Slop" words (tapestry, symphony, landscape, testament) in its hidden thought block and ban them from the final output, it is forced to dig deeper into its vocabulary. The result is language that feels sharper, denser, and more human.
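Banning words at generation time usually means masking their logits before sampling, so the banned tokens get zero probability under softmax. A minimal sketch of that mechanism, with made-up token ids (real systems, e.g. Hugging Face's `bad_words_ids`, also handle multi-token words):

```python
import math

# Hypothetical token ids for slop words like "tapestry" or "symphony".
SLOP_TOKEN_IDS = {1, 4}

def mask_slop(logits: list[float], banned: set[int]) -> list[float]:
    """Set banned token logits to -inf so softmax assigns them zero probability."""
    return [-math.inf if i in banned else x for i, x in enumerate(logits)]
```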

Why This Matters

We didn't build this just to write better tweets. We built it because Creative Intelligence is the next frontier.

If we want AI to write code that solves new physics problems, or draft legal strategies that win cases, it cannot rely on the "average" of the internet. It needs to be able to make lateral jumps. It needs to understand subtext.

Celt 4 is our first step toward a machine that doesn't just predict the future—it invents it.

The era of beige AI is over.

Welcome to the era of Celtic.