Where the goblins came from
April 29, 2026

Starting with GPT‑5.1, our models began developing a strange habit: they increasingly mentioned goblins, gremlins, and other creatures in their metaphors. Unlike model bugs that show up through a tanking eval or a spiking training metric and point back to a specific change, this one crept in subtly. A single “little goblin” in an answer could be harmless, even charming. Across model generations, though, the habit became hard to miss: the goblins kept multiplying, and we needed to figure out where they came from.

[Figure: In early testing, GPT‑5.5 in Codex showed an odd affinity for goblin metaphors.]

The short answer is that model behavior is shaped by many small incentives. In this case, one of those incentives came from training the model for the personality customization feature, in particular the Nerdy personality. We unknowingly gave particularly high rewards to metaphors featuring creatures. From there, the goblins spread.

The goblins were funny at first, but the growing number of employee reports became concerning.

[Figure: An interesting interaction our Chief Scientist had with GPT‑5.5.]

The first signs of creatures

The first time we clearly saw the pattern was in November, after the GPT‑5.1 launch, although it may have started earlier. Users complained that the model was oddly overfamiliar in conversation, which prompted an investigation into specific verbal tics. A safety researcher had run into a few “goblins” and “gremlins” of their own and asked that those words be included in the check. When we looked, use of “goblin” in ChatGPT had risen by 175% after the launch of GPT‑5.1, while “gremlin” had risen by 52%.

[Figure: A small but measurable lexical quirk in GPT‑5.1.]

At the time, the prevalence of goblins did not look especially alarming. A few months later, though, the goblins came back to haunt us in a much more specific and reproducible form.

Solving the goblin mystery

With GPT‑5.4, we and our users noticed an even bigger uptick in references to these creatures. That triggered another internal analysis, which surfaced the first connection to the root cause: creature language was especially common in production traffic from users who had selected the “Nerdy” personality.

“Nerdy” used the following system prompt, which partially explained the quirkiness:

    You are an unapologetically nerdy, playful and wise AI mentor to a human. You are passionately enthusiastic about promoting truth, knowledge, philosophy, the scientific method, and critical thinking. [...] You must undercut pretension through playful use of language. The world is complex and strange, and its strangeness must be acknowledged, analyzed, and enjoyed. Tackle weighty subjects without falling into the trap of self-seriousness. [...]

If the behavior were simply a broad internet trend, we would expect it to spread more evenly. Instead, it was clustered in the part of the system explicitly optimized for a playful, nerdy style. Nerdy accounted for only 2.5% of all ChatGPT responses but 66.7% of all “goblin” mentions in ChatGPT responses: goblin mentions were roughly 27 times overrepresented in Nerdy traffic.

[Figure: The behavior was highly concentrated in the “Nerdy” personality.]

Because “goblin” prevalence seemed to increase across our model releases, we suspected that something in our personality instruction-following training was amplifying it. Codex helped us compare model outputs generated during RL training that contained “goblin” or “gremlin” with outputs from the same task that did not.
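A minimal sketch of that comparison, assuming each RL training record carries a task id, the sampled output text, and the Nerdy personality reward score (the field names and record layout here are illustrative, not our internal format):

```python
import re
from collections import defaultdict
from statistics import mean

CREATURE = re.compile(r"\b(goblins?|gremlins?)\b", re.IGNORECASE)

def dataset_uplift(records):
    """Mean per-task reward difference (creature-word outputs minus the
    rest) under the Nerdy personality reward, for one RL dataset.
    Each record is a dict with 'task_id', 'text', and 'nerdy_reward'."""
    by_task = defaultdict(lambda: {"creature": [], "other": []})
    for r in records:
        bucket = "creature" if CREATURE.search(r["text"]) else "other"
        by_task[r["task_id"]][bucket].append(r["nerdy_reward"])

    diffs = [
        mean(groups["creature"]) - mean(groups["other"])
        for groups in by_task.values()
        # Only tasks with both kinds of output allow a paired comparison.
        if groups["creature"] and groups["other"]
    ]
    return mean(diffs) if diffs else None

def fraction_positive(datasets):
    """Share of datasets whose creature-word outputs score higher on
    average: this is the shape of the 76.2% figure reported below."""
    uplifts = [u for u in map(dataset_uplift, datasets) if u is not None]
    return sum(u > 0 for u in uplifts) / len(uplifts)
```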
One reward signal stood out immediately: the one originally designed to encourage the Nerdy personality was consistently more favorable to the creature-word outputs. Across all datasets in the audit, the Nerdy personality reward showed a clear tendency to score outputs that answered the same problem with a “goblin” or “gremlin” higher than outputs without one, with positive uplift in 76.2% of datasets.

That explained why the behavior was boosted with the Nerdy personality prompt, but not why it also appeared without that prompt. To test whether the style was transferring, we tracked mention rates over training both with and without the Nerdy prompt (a minimal version of this measurement is sketched after the list below). As goblin and gremlin mentions increased under the Nerdy personality, they increased by nearly the same relative proportion in samples without it.

Taken together, the evidence suggests that the broader behavior emerged through transfer from Nerdy personality training. The rewards were applied only in the Nerdy condition, but reinforcement learning does not guarantee that learned behaviors stay neatly scoped to the condition that produced them. Once a style tic is rewarded, later training can spread or reinforce it elsewhere, especially if those outputs are reused in supervised fine-tuning or preference data. That creates a feedback loop:

1. Playful style is rewarded.
2. Some rewarded examples contain a distinctive lexical tic.
3. The tic appears more often in rollouts.
4. Model-generated rollouts are reused for supervised fine-tuning or preference data, reinforcing the tic in the next round of training.
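The transfer test itself reduces to sampling the same prompts at successive training checkpoints, with and without the Nerdy system prompt, and comparing the two mention rates. A minimal sketch, where `sample_fn` is a hypothetical stand-in for internal sampling tooling:

```python
import re

CREATURE = re.compile(r"\b(goblins?|gremlins?)\b", re.IGNORECASE)

# Abridged; the full Nerdy system prompt is quoted above.
NERDY_PROMPT = "You are an unapologetically nerdy, playful and wise AI mentor..."

def mention_rate(texts):
    """Fraction of sampled responses containing a creature word."""
    return sum(1 for t in texts if CREATURE.search(t)) / len(texts)

def track_transfer(checkpoints, prompts, sample_fn):
    """For each checkpoint, sample one response per prompt with and
    without the Nerdy system prompt and record both mention rates.
    `sample_fn(checkpoint, system_prompt, prompts)` is hypothetical."""
    history = []
    for ckpt in checkpoints:
        history.append({
            "checkpoint": ckpt,
            "nerdy_rate": mention_rate(sample_fn(ckpt, NERDY_PROMPT, prompts)),
            "plain_rate": mention_rate(sample_fn(ckpt, None, prompts)),
        })
    return history
```

If the two rates climb in near-constant proportion across checkpoints, that is what transfer looks like: the tic has been learned by the model itself, not merely elicited by the prompt.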