The previous articles established strategies to help users ask better questions - clarifying terminology, constraints, and assumptions to prime language models for truthful answers. However, responsible AI use extends beyond the quality of inputs. Users also need to check outputs before acting on their suggestions.
Computer scientists caution against blindly trusting model-generated text without checking it against known facts. We cannot assume accuracy just because answers sound coherent or authoritative. Models often hallucinate pseudo-profound statements that have no basis in reality.
This risk means language models are better used as hypothesis-generating tools than as oracles deserving unquestioning faith. Responsible use involves scrutinizing results, requiring evidence and external confirmation from reputable sources, before relying on them for decision making.
The dangers of unverified content
In 2021, a financial services company deployed AI chatbots to provide investment advice to customers. However, the bots made unrealistic claims about guaranteed returns of up to 300%, which resembled fraud. Developers traced the problem to insufficient verification of model outputs before release: no one had checked whether the suggestions were consistent with financial expertise rather than merely sounding convincing while fabricating claims.
This case highlighted the dangers of assuming that linguistic plausibility indicates factual reliability. Without verification, dangerous misinformation can spread unchecked in sensitive areas such as finance, health, and law. Fact-checking provides the necessary oversight before unvetted model output is amplified or acted upon.
Principles of Output Verification
Rigorous verification of language model proposals includes:
- Corroborating key assertions with known facts
- Requiring supporting evidence for key claims
- Consulting subject matter experts to confirm alignment with established understanding
Evaluating truth value
Before amplifying or applying model guidance, evaluate its truth value against the principles above. Be wary of unsupported opinions, contentious superlatives, or questionable framing biases. Models often overreach, their weak claims masked by eloquent flair.
Without exhaustive content monitoring, certain false claims are likely to persist. But weighing proposals against common sense provides an initial guardrail. Look for faulty logic, glaring omissions, or inconsistencies that signal the need for greater scrutiny before adoption.
Seek Justified Evidence
In addition, challenge language models to provide evidentiary support for their conclusions, whether by citing data, precedent, or expert opinion.
This habit strengthens critical thinking muscles by forcing an evaluation of argument strength rather than acceptance of unsupported statements as true. Encouraging justification also improves model calibration by prompting models to communicate their degree of confidence or uncertainty when clear evidence is lacking.
External Expert Opinion
For high-stakes guidance, consult domain experts to validate language model proposals, not unlike second medical opinions that guard against error. Researchers find that models struggle to reliably cite factual sources for their claims. This underscores the importance of verifying accuracy yourself rather than assuming that any citations offered genuinely support the claims they accompany.
Combining these evidence-based checks limits the harm of spreading misinformation or unsound advice. Greater scrutiny instills accountability.
A checklist for responsible sourcing
When reviewing model results, be critical (a minimal sketch for applying this checklist appears after the list):
- Test key claims against verified facts
- Ask for supporting evidence
- Consult experts to validate claims
- Look for overly confident or ambiguous framing
- Note inconsistencies with prior knowledge
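For readers who want to make this checklist operational, the sketch below records each item against a single claim. The `ClaimReview` class, its field names, and the decision rule are hypothetical illustrations, not a standard schema.

```python
# Hypothetical helper for recording the review checklist against one claim.
# Field names and the decision rule are illustrative only.

from dataclasses import dataclass

@dataclass
class ClaimReview:
    claim: str
    checked_against_facts: bool = False           # tested against verified facts
    evidence_provided: bool = False               # model supplied supporting evidence
    expert_confirmed: bool = False                # a domain expert validated the claim
    overconfident_framing: bool = False           # superlatives or ambiguous framing
    conflicts_with_prior_knowledge: bool = False  # inconsistent with what is already known

    def safe_to_act_on(self) -> bool:
        """Conservative rule of thumb: act only on corroborated, well-framed claims."""
        return (
            self.checked_against_facts
            and self.evidence_provided
            and not self.overconfident_framing
            and not self.conflicts_with_prior_knowledge
        )

review = ClaimReview(
    claim="Fund X guarantees 300% annual returns.",
    checked_against_facts=True,
    evidence_provided=False,
    overconfident_framing=True,
)
print(review.safe_to_act_on())  # False -- escalate for expert review before acting
```

The point is not the particular fields but the habit: recording what was and was not verified makes it harder to act on a claim that merely sounds convincing.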
Of course, some speculative responses, such as hypotheticals or projections, defy hard validation. Still, insist on clearly labeled assumptions in these cases so that readers can distinguish opinion from established fact when evaluating plausibility, rather than accepting speculation without transparency. Trust but verify!
Exhaustive third-party fact-checking of every model-generated sentence remains infeasible given the sheer volume of output. But prioritizing validation for high-impact claims or advice provides an important guardrail that limits the spread of falsehoods carrying only superficial legitimacy.
Responsible use requires verifying then trusting language models, not the dangerous opposite. Ronald Reagan popularized the Russian adage "trust but verify," affirming the wisdom of combining trust in the good intentions of others with accountability that confirms aligned results. This maxim proves relevant when working with AI systems, whose ability to do great good or harm depends on human judgment that responsibly verifies results.
Cross-validation of language models
An increasingly popular technique for verifying the output of language models is to cross-validate suggestions against other state-of-the-art models before accepting guidance.
As discussed earlier, blind reliance on a single model risks propagating pseudo-profound misinformation if users assume that plausibility means factual reliability. Consulting multiple models mitigates this concern by checking whether their answers agree. Each model has strengths and limitations stemming from idiosyncrasies of training data, architecture, or narrow problem-solving specialization.
Differently built models manifest different blind spots: a claim that one model hallucinates may be flagged as dubious by another trained on different data. This diversity yields a collective intelligence that benefits users.
Cross-validation from multiple perspectives
Large language models such as Google's LaMDA, Anthropic's Claude, and Meta's Galactica bring different design priorities that shape their behavior. For example, Claude received additional safety training to resist generating harmful, illegal, or unethical content more aggressively than some other commercial models, which allow for more speculation. Such diverse engineering means that no single model is likely to be best for every use case. Seeking collective wisdom across options through cross-validation thus limits individual weaknesses, akin to getting a second medical opinion to guard against a single misdiagnosis.
This validation approach also surfaces a diversity of perspectives, highlighting areas of consensus and disagreement. Conflicting assertions prompt deeper scrutiny of the information's credibility, rather than hasty acceptance of singly sourced guidance that now stands exposed as an outlier view lacking confirmation elsewhere.
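To make this workflow concrete, here is a minimal sketch in Python. The `ask_model_*` helpers and the agreement check are hypothetical stand-ins rather than any particular vendor's API; real use would call actual client libraries and compare extracted claims, not raw strings.

```python
# Minimal cross-validation sketch. The ask_model_* helpers are hypothetical
# placeholders for whatever client libraries your chosen providers expose.

from typing import Callable, Dict

def ask_model_a(question: str) -> str:
    # Placeholder: replace with a real API call to your first model provider.
    return "Answer from model A"

def ask_model_b(question: str) -> str:
    # Placeholder: replace with a real API call to a differently trained model.
    return "Answer from model B"

def cross_validate(question: str, models: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Ask every model the same question and collect the answers for comparison."""
    return {name: ask(question) for name, ask in models.items()}

def models_disagree(answers: Dict[str, str]) -> bool:
    """Crude agreement check: identical normalized strings count as consensus.
    A real implementation would compare extracted factual claims, not raw text."""
    return len({a.strip().lower() for a in answers.values()}) > 1

if __name__ == "__main__":
    question = "Does this investment product guarantee a 300% annual return?"
    answers = cross_validate(question, {"model_a": ask_model_a, "model_b": ask_model_b})
    for name, answer in answers.items():
        print(f"{name}: {answer}")
    if models_disagree(answers):
        print("Models disagree: treat the claim as unverified and check a primary source.")
```

Disagreement between models is not proof of error, but it is a cheap signal that a claim deserves the deeper scrutiny described above.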
Of course, hallucination risks around unprecedented scenarios persist across all models, since they share similar training limitations regarding recent world events or cutting-edge developments after their training cutoffs. But minimizing this vulnerability through additional evidence gathering remains a critical best practice.
Efficient verification workflows
Rather than interrogating different models sequentially and independently, users can build verification into the interaction itself using preamble prompts that signal at the outset that subsequent responses must be externally supported. For example, prime the model by asking:
"Please provide citations from credible external sources to support any factual claims or statistics presented in your following response. I will check your statements against other language models to confirm consistency with established facts. Such overt expectation-setting trains models to provide ready evidence, increasing confidence in claims or clearly identifying speculation that otherwise lacks citations. It promotes responsible transparency.
Additional statements such as "Limit responses to areas within your confidence boundary based on available evidence" further signal a preference for corroborated guidance over freestanding musings whenever supporting sources exist.
Follow-up questions that specifically request supporting citations also underscore that users value reasoned analysis over eloquent but unsupported thought leadership when assessing model quality.
Cultivate healthy skepticism
User feedback that rewards truthfulness and transparency, while scrutinizing unsubstantiated opinions, fosters the collaborative benefits of the ecosystem over fluent but unsupported pontification. It incentivizes models to substantiate analysis with readily available evidence in anticipation of subsequent validation, even when it is not explicitly requested alongside each query.
This makes cross-validating model claims second nature rather than an afterthought, reinforcing information analysis skills that transfer to other web-based learning contexts. A vigilant spirit of healthy questioning serves a lifelong search for truth.
The modern abundance of accessible language models offers an invaluable opportunity to harvest collective intelligence on key questions, provided users avoid overreliance on any single source without external corroboration. Cross-validation across models has emerged as a critical best practice for hedging against the limitations of individual systems, honing the critical thinking muscles that sift substantiated, high-quality signals from the unsupported noise polluting the modern information ecosystem.
Conclusion
Blind deference to language models invites real dangers, however eloquent or confident the generated text. Prioritizing critical verification practices, including corroborating evidence, consulting experts, and weighing claims against common sense, offers prudent safeguards. These help conscientious users maximize the benefits of these promising but unreliable tools while screening out the harms of unchecked machine-generated misinformation. The next article will explore other facets of responsible language model use, including monitoring for troubling response patterns that require intervention.