The Curious Case of the Opera-Writing AI
In the first article in this series, I began with an anecdote about how Claude managed to generate an entire coherent opera script when simply asked to imagine "what an AI-written opera might look like." That small request gave Claude enough subtle contextual cues to generate an appropriately creative response. A few carefully chosen words provided enough of a framework for Claude to produce a multi-page, five-act musical epic!
In the previous article, we explored exactly how exposure to massive amounts of diverse data during training allows large language models like Claude to internalize regularities about how language varies across contexts. We also discussed important limitations that Claude still has when dealing with highly nuanced linguistic phenomena like sarcasm or ambiguity that rely heavily on contextual grounding.
This brings us to a crucial next question: how much does providing more context tangibly improve the accuracy and relevance of Claude's responses? Are there straightforward methods for priming Claude in a way that enhances the contextual signal? We'll investigate with real-world examples!
The power of context: case studies
Let's start by examining a case study that illustrates the profound impact of context. Suppose we simply asked Claude the innocuous question: "Who created you?" Without much context, we might expect a generic answer like:
"I was created by Anthropic, a San Francisco-based AI safety startup that develops constitutional AI assistants focused on being helpful, harmless, and honest."
Perfectly accurate, but pretty sterile. Now watch what happens when we provide a little more context, suggesting that we would prefer a more heartfelt response:
"Why, Claude, it's wonderful to make your acquaintance! Would you be so kind as to remind me of your origins and creators?"
Claude answers charmingly:
"Well, thank you for the warm welcome! I was created by the lovely folks at Anthropic, an AI safety company focused on developing assistants like me that aim to serve users in a helpful, harmless, and honest way. They carefully assembled me to be respectful and trustworthy. It's a sincere pleasure to meet you!"
Remarkable! This simple demonstration shows how even subtle contextual guidance can make Claude's responses significantly more appropriate and engaging. We have increased accuracy and relevance by making small adjustments that evoke the context of a warm introduction.
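If you'd like to reproduce this comparison yourself, here is a minimal sketch using Anthropic's Python SDK and its Messages API. The model name is an assumption on my part; substitute whichever Claude model you have access to.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def ask(prompt: str) -> str:
    """Send a single user message and return Claude's reply text."""
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",  # assumed model name; swap in your own
        max_tokens=300,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text

# Bare question: correct, but likely sterile.
print(ask("Who created you?"))

# The same question wrapped in a warm introduction: the extra context
# nudges the reply toward a friendlier, more engaging register.
print(ask(
    "Why, Claude, it's wonderful to make your acquaintance! "
    "Would you be so kind as to remind me of your origins and creators?"
))
```

Running both prompts side by side makes the effect of the framing easy to see, and easy to iterate on.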
As a second example, suppose we ask Claude to write a poem, but provide only the general context:
"Please write a 4-line poem."
Claude complies, but with rather primitive poetic skills:
"Here is a 4-line poem:
Roses are red, violets are blue, sugar is sweet, and so are you."
On the other hand, if we explicitly specify that we want Claude to channel the form and content conventions of a Shakespearean sonnet, the accuracy and relevance of the output increase tremendously:
"Please write a 4-line Shakespearean poem in iambic pentameter and rhyme scheme ABAB CDCD EFEF GG."
"Shall I compare thee to the bright flower of summer,
Whose beauty radiates joyous delight?
Your eyes outshine the stars in the darkest night,
With grace that sets my weary heart alight."
Voilà! Explicitly providing structural and stylistic context yielded an answer far closer to what we actually wanted: a poem that mimics the conventions of a Shakespearean sonnet. Relevance skyrockets as a result.
These examples illustrate the outsized impact that even sparse contextual cues have in allowing Claude to generate dramatically more targeted responses. When we anchor the framing, Claude can infer what content and format the context calls for. Specifying other factors such as length, emotional tone, and level of formality further reduces the degrees of freedom in interpretation. Guiding Claude with such contextual constraints simultaneously increases coherence, accuracy, and relevance.
Improving accuracy and relevance
Beyond case studies, researchers have also conducted controlled experiments that quantitatively demonstrate the indispensable role of context. In several studies, Claude and other LLMs answered significantly higher percentages of questions correctly when given relevant background passages that provided contextual support, compared to questions asked directly without any framing or priming.
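In practice, this kind of contextual priming usually amounts to nothing more than placing the supporting passage in the prompt ahead of the question. Here is a minimal sketch reusing the ask() helper from the earlier example; the passage and question are invented purely to illustrate the prompt structure.

```python
# Reusing the ask() helper defined earlier. The passage and question below
# are invented purely to illustrate how a primed prompt is assembled.
background = (
    "Background: The Transcontinental Railroad was completed in 1869, when the "
    "Central Pacific and Union Pacific lines met at Promontory Summit, Utah."
)
question = "In which state did the two railroad lines meet?"

# Without priming, the model relies entirely on what it happens to recall.
unprimed = ask(question)

# With priming, the relevant passage sits directly in the prompt, so the
# answer can be grounded in the supplied context.
primed = ask(f"{background}\n\nUsing only the passage above, answer: {question}")

print("Unprimed:", unprimed)
print("Primed:  ", primed)
```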
Other work shows that relevance-based prompting approaches also succeed in extracting more coherent and on-topic responses from language models. The likelihood of erratic or nonsensical output tends to increase the less contextual grounding LLMs receive around key facets such as:
- Domain or topic details
- Reference time period
- Perspective to be adopted (1st/2nd/3rd person)
- Emotional tone
- Degree of formality
In essence, establishing even basic contextual coordinates along these dimensions is analogous to narrowing an unfocused floodlight beam into a more focused spotlight. This focused beam of context illuminates the target response space, relegating improbable tangents to the shadows.
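One lightweight way to establish these coordinates is to template them into every prompt. The helper below is a hypothetical sketch (build_prompt is not part of any SDK); it simply prepends whichever dimensions you choose to pin down before the task itself.

```python
def build_prompt(task, domain=None, time_period=None, perspective=None,
                 tone=None, formality=None):
    """Assemble a prompt that pins down the contextual dimensions listed above.

    Any dimension left as None is omitted, so the function degrades gracefully
    to the bare task description.
    """
    constraints = [
        ("Domain", domain),
        ("Time period", time_period),
        ("Perspective", perspective),
        ("Tone", tone),
        ("Formality", formality),
    ]
    lines = [f"{label}: {value}" for label, value in constraints if value]
    lines.append(f"Task: {task}")
    return "\n".join(lines)

# Example: a creative-writing request with its contextual coordinates pinned down.
print(build_prompt(
    "Describe a morning at a busy harbor.",
    domain="maritime trade",
    time_period="early 19th century",
    perspective="first person",
    tone="wistful",
    formality="literary",
))
```

The resulting prompt reads like a short briefing rather than a bare request, which is exactly the spotlight-narrowing effect described above.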
And when it comes to particularly open-ended tasks like story generation or creative writing, restrictive context can surprisingly enhance rather than limit the quality of the result. Counterintuitively, paint-by-numbers plotting constraints often expand the imaginative possibility space. Think of the striking artwork that image generators like DALL-E produce from well-framed prompts, and of Claude's comparable creative abilities when properly framed.
Honing context to optimize relevance and accuracy requires some finesse and iteration in prompt formulation. But, as these examples demonstrate, the infusion of even sparse contextual cues pays considerable dividends in extracting coherent, high-fidelity responses from LLMs. Still, odd edge cases manage to confound. What explains these inevitable failures, even for state-of-the-art models? We unravel the perennial challenges of grounding next in Article 4!
Glossary
- Contextual cues: Aspects of a prompt or query that provide relevant framing details to guide the LLM's interpretation and response generation. Examples include domain specifics, requested tone, and length constraints.
- Contextual priming: When supplemental information is explicitly provided to the LLM prior to a prompt to establish useful background knowledge that improves relevance.
- Degrees of freedom: The space of possible interpretations or responses available to an LLM when a prompt lacks sufficient bounding context.
- Relevance prompting: Formulating prompts in ways that emphasize topical relevance using framing, examples, and details to keep LLM responses on-topic.
- Output coherence: When LLM-generated text maintains logical flow, correct grammar, and conceptual consistency rather than becoming disjointed or nonsensical.
- Accuracy: The proportion of factually correct information within LLM-generated content when assertions can be verified against known information.
- Edge cases: Prompts involving challenging phenomena like humor, implicit assumptions, or controversy that often confuse LLMs and lead to nonsensical outputs despite surrounding context.