caseway

ChatGPT vs. Caseway: Accuracy

Alistair Vigier

Published: 2/8/2026

In the age of artificial intelligence, software like ChatGPT have made it tantalizingly easy to get quick answers to complex questions. Legal professionals, academic researchers, and enterprise clients are understandably intrigued by the prospect of AI-driven legal research that saves time.

However, when it comes to law, what an artificial intelligence says is only as good as where the information comes from. Recent examples show that general artificial intelligence models can pull information from non-official sources (even crowd-sourced sites like Wikipedia) in sensitive legal contexts, raising red flags about accuracy and reliability.

This article analyzes those risks and contrasts them with Caseway’s approach (via its AI assistant “Casey”), which is a platform engineered to draw exclusively from verified legal databases and to uphold the highest standards of data quality, source verification, and legal compliance.

The Risk of Unofficial Sources in AI-Powered Legal Answers

Legal research demands authoritative sources. ChatGPT’s helpfulness can be a double-edged sword: the model may supply answers based on whatever information it finds or has in its training data, even if that means using unofficial or secondary sources. For instance, consider a scenario where a user asks ChatGPT about the doctrine of privity of contract and third-party obligations.

In one documented exchange, ChatGPT can be seen referencing a publicly editable online encyclopedia (Wikipedia) when formulating its answer on contract law principles. In that conversation, after failing to retrieve data from an official legal database due to access issues, the AI defaulted to using a Wikipedia article as its source of information. This example highlights the risk that a general AI might lean on convenience over authority, pulling explanations from a non-authoritative outlet in a legal context.

You can also see that it wasn’t able to access CanLII because it was blocked. This is something that Casey is working on with SFU.

It’s important to stress why this is problematic. Wikipedia, while a useful starting point for general knowledge, is not an official legal source and is not considered reliably authoritative in professional settings. Anyone can edit its content, and while errors are often corrected over time, the information isn’t guaranteed to be up-to-date or fully accurate on nuanced legal issues.

Observers have noted that ChatGPT often references Wikipedia for definitions of legal terms or overviews of legal topics. This tendency is inherently risky. Wikipedia itself explicitly warns that it “cannot be considered a reliable source” for formal research, and many academic institutions forbid citing it as such. In the legal world, where precision and precedent are critical, relying on a crowd-sourced summary is a precarious shortcut.

A misinterpretation from an unofficial source can cascade into a flawed legal argument or an oversight of controlling authority.

Hallucinated Cases and the Perils of Misinformation

The issues with general AI in legal research go beyond sourcing Wikipedia. A well-documented phenomenon with AI like ChatGPT is “hallucination”, the AI’s tendency to fabricate information or citations that sound plausible but are completely false. Unfortunately, this isn’t just a theoretical quirk; it has spilled into real courtrooms.

One high-profile cautionary tale was the New York “ChatGPT lawyer” incident in 2023. Pressed for time, an attorney used ChatGPT to help draft a brief. The AI confidently presented several case citations that perfectly supported the lawyer’s argument, but none of those cases actually existed. ChatGPT effectively invented cases out of thin air, complete with fake quotes and bogus citations. The lawyer, unaware of this hallucination, submitted the brief.

The result was a severe embarrassment once the judge and opposing counsel discovered the citations were fictitious. The court castigated the lawyer, who had to admit on the record that a chatbot duped him, and he faced sanctions and fines as a consequence. This case became an instant emblem of AI’s pitfalls in law, demonstrating how blind trust in an unverified answer can derail a legal proceeding.

The hallucination leaderboard

That New York episode is not an isolated case. Across jurisdictions, there is a growing “hallucination leaderboard” tracking AI-induced legal errors. U.S. courts have seen dozens of lawyers (and even self-represented litigants) submit materials citing non-existent precedents that an AI like ChatGPT confidently supplied. Similar incidents are recorded in Canada, Europe, and elsewhere. The consequences have ranged from professional embarrassment and case delays to monetary sanctions and mandatory remedial training.

In some instances, judges have contemplated referrals for disciplinary action when repeat offenders presented AI-generated falsehoods. The pattern is clear… Relying on a general-purpose AI for legal research can be professionally perilous.

Why do these smart AI models go off the rails? The underlying issue is that a model like ChatGPT doesn’t know facts the way a database or a human expert does, it generates responses by predicting plausible sequences of words based on patterns in its training data. If asked a question and the model doesn’t find a clear answer in its learned knowledge, it may improvise to avoid disappointing the user.

It may “bullshit” in a very confident manner, producing something that sounds right but isn’t. The AI isn’t malicious; it’s simply doing what it was designed to do, being eloquent and helpful, but “plausible” text is not good enough when a lawyer or court needs the truth.

A general artificial intelligence may also draw from the entire internet without distinction, mingling high-quality legal information with unreliable content, which can lead to confusion or errors. All of these factors underscore a key point that in legal research, accuracy and accountability are paramount. An answer that cannot be trusted to be correct and sourced is worse than no answer at all, given the stakes.

Here’s the problem…

Unverified Sources:

The AI may pull data from non-official, secondary sources (like Wikipedia or personal blogs) without the user realizing it, which can introduce errors or one-sided views.

Fabricated Citations (Hallucinations):

If it lacks information, the AI might generate fictitious cases or statutes to fill the gap, as seen in real-world incidents. These fabrications can mislead legal professionals and carry serious consequences.

Lack of Accountability:

ChatGPT and similar models typically do not provide citations by default. The burden falls on the user to fact-check every statement, which is a time-consuming task that negates the convenience of AI. If a mistake slips through, the responsibility (and liability) is on the lawyer or firm, not the AI provider.

Context and Compliance Concerns:

General AI might not be up to date on the latest law or might mix jurisdictions unknowingly. It also might not respect confidentiality or usage norms. For example, using proprietary data or providing advice that edges into unauthorized practice of law. All told, these software don’t come with guarantees that their output meets the legal industry’s regulatory and ethical standards.

Given these pitfalls, it is little wonder that many experts urge caution. A partner at a law firm or a general counsel at a company must ask: can an answer from a free-form AI be trusted without verification? In a field where a single erroneous citation or misquoted rule can undermine an argument, the margin for error is zero. This is where a different philosophy of legal AI becomes not just advantageous, but necessary.

Caseway’s Approach: Verified Sources

Caseway (and its AI assistant, “Casey”) was built as a direct answer to the uncertainties posed by general AI in the legal arena. Rather than scraping the entirety of the internet for information, Caseway’s technology pulls directly from official legal databases and primary source repositories.

This means that when Casey is asked a legal question, its knowledge is drawn from millions of real court decisions and statutory materials, the same primary sources a diligent attorney would trust. As the company proudly states, Casey is “intentionally grounded only in primary legal sources. No random internet content or secondary commentary”.

In other words, if a document wasn’t written by a judge or enacted by a legislature, Casey won’t rely on it. This design is a sharp contrast to ChatGPT’s “whole internet in one gulp” approach. By limiting the artificial intelligence’s diet to vetted case law and legislation, Caseway greatly reduces the risk of stray misinformation creeping into answers.

How does this focus on official sources translate into day-to-day usage? For one, the answers Casey provides come with citations to the underlying law every time, by design. Each response is accompanied by references, for example, the specific court decision or statute that supports the answer, so that users can immediately verify the information against the source material.

This built-in source verification is crucial in legal practice. It means a lawyer using Caseway can confidently follow the trail to the exact page of a reporter or the clause of a statute that the AI is referencing.

The AI’s role is not to replace legal reasoning or due diligence, but to supercharge it by doing the heavy lifting of finding and summarizing the right texts, while leaving the human in control of confirming and applying the law.

Caseway’s commitment to data quality and legal compliance goes beyond just where the data comes from. Because its corpus is composed of official publications (court opinions, statutes, regulations, etc.), the content is inherently compliant with citation standards and free of the kind of copyright or privacy issues that can arise with random web content. Caseway’s database is updated and maintained, ensuring that it reflects current law.

And importantly, if Casey cannot find a clear answer in the official sources, it is designed to respond more cautiously, possibly by indicating that it didn’t find a definitive answer, rather than fabricating one. This is a conscious trade-off... The platform would rather deliver a partial or negative result than risk a flamboyant falsehood.

As the Caseway team puts it, they have “sacrificed a bit of the ‘anything goes’ flexibility” of a general model in order to gain integrity and trustworthiness. The result is that it’s virtually impossible for Casey to conjure a fake case, and it can’t cite what isn’t in its library of real judgments. And if something isn’t there, Casey will let you know, just as a diligent associate might report that a thorough search yielded no direct precedent.

To illustrate the difference in approach, imagine posing the same complex legal query to ChatGPT and to Caseway. ChatGPT might give a smooth narrative answer drawing on a mix of its training data, which could include a Wikipedia summary, a law blog, some case law snippets, and its own interpolations. The user would get a nice-sounding answer but have no immediate way to verify the details, which is a risky proposition.

Casey, on the other hand, would parse the question, retrieve the most relevant judicial decisions or statutes, and provide an answer like a concise memo, complete with pinpoint citations. The user could click the citation or look up the case to see the exact wording of the judge’s holding. In corporate and academic settings, this level of transparency is not just comforting, it’s often required.

Compliance protocols

Enterprises and educational institutions have compliance protocols that demand knowing the source of information and ensuring it’s authoritative. By pulling from official databases, Caseway ensures that those using it for research or drafting are automatically working within a universe of vetted, legally compliant materials.

It’s also worth noting that a platform like Caseway is built with the understanding of jurisdictional nuance. Users can filter queries by jurisdiction (e.g., asking specifically about Canadian law vs. the federal law in the United States) to get answers that are relevant to their context.

This guards against the inadvertent mixing of laws from different regions, a subtlety that a generalized model might miss. All these measures contribute to an ecosystem where accuracy is maximized and any output is traceable back to an original, trusted document.

Caseway’s methodology tackles the very problems that make ChatGPT unreliable for legal work: it curates the input (official sources only), structures the output (with citations and jurisdictional awareness), and implements checks against hallucination (the AI cannot easily go outside its source bounds).

Accuracy, Accountability, and Trust

In a hands-on field like law, trust is everything. The introduction of AI into legal research doesn’t change that fundamental truth, if anything, it makes the principles of accuracy and accountability more crucial than ever. Lawyers, researchers, and clients must be able to trust that a tool isn’t leading them astray with incorrect information.

The contrasts between ChatGPT and Caseway underscore a broader insight: AI is a powerful aid, but without verified sources and careful guardrails, it can become a liability. The legal tech field has learned this the hard way through high-profile mishaps.

The wiser approach, as exemplified by Caseway, is to build artificial intelligence systems that respect the established norms of legal research, systems that act less like all-knowing oracles and more like ultra-efficient research assistants who always cite their sources and always know the limits of their knowledge.

Accuracy in legal AI means aligning every answer with the law as it actually exists in statutes and case reporters. Accountability means designing systems where every statement can be checked against an original source, and where the AI is engineered to avoid speculation outside those sources.

Trust flows naturally from these two: when users see that an artificial intelligence consistently provides solid references and never fabricates, confidence in the technology grows. Caseway’s absence from the AI “hallucination hall of shame” is not a coincidence but a result of these intentional choices.

The platform was purpose-built for organizations that literally “can’t afford errors” in legal analysis. By focusing on primary data and compliance, it assures its users, whether they are in a law firm, a courtroom, or a classroom, that the convenience of AI will not come at the cost of credibility.

The experience of using a verified, legally-grounded AI like Caseway is one of augmentation rather than automation. The lawyer or researcher remains in control, armed with relevant information faster than ever, but also with the means to verify and apply it responsibly.

This stands in stark relief to the scenario of using a generic AI tool and hoping for the best. As the legal industry navigates the AI revolution, the lesson is clear: insist on accuracy, demand accountability, and prioritize trust.

A Call to Action

The next time you or your organization considers leveraging AI for legal research or education, ask the critical question, does this tool provide verified, source-based answers that you can stake your reputation on? If the answer is anything less than an unqualified yes, you may want to rethink your approach. In an environment where mistakes carry high costs, relying on verified, legally-compliant platforms like Caseway isn’t just a convenience, it’s a necessity.

Embracing AI in law should never mean compromising on the core values of the profession. With purpose-built solutions that emphasize accuracy, accountability, and trust, we can harness the benefits of artificial intelligence while upholding the standards that legal work demands. The future of legal research will belong to those who choose their software wisely, and ensure that AI serves the law, not the other way around.

-Alistair Vigier