AI Evidence in Courts: Should Algorithms Be Treated as Expert Witnesses

Authored By: Khadija Bilal

Pakistan College of Law

ABSTRACT 

The increasing reliance on artificial intelligence (AI) and algorithmic tools in judicial processes  has raised complex questions regarding the admissibility, reliability, and accountability of such  evidence. Courts across jurisdictions are beginning to rely on algorithmic risk assessments,  predictive tools, and data-driven systems, particularly in criminal justice contexts such as bail,  sentencing, and forensic analysis. This article critically examines whether AI-generated outputs  should be treated as expert evidence within existing legal frameworks. It argues that while  algorithmic tools may enhance efficiency and consistency, their opaque functioning,  susceptibility to bias, and limited explainability pose serious challenges to fair trial rights,  judicial transparency, and accountability. The article concludes that AI should function strictly  as a decision-support mechanism and must not be equated with human expert witnesses unless  robust legal safeguards are established. 

  1. INTRODUCTION 

Technology has always shaped how justice is delivered. Courts are constantly adapting to new types of evidence, such as fingerprinting, DNA analysis, CCTV, and digital forensics. The current “AI turn” in legal systems is more disruptive than earlier changes because it does not merely collect information; it also processes it, draws conclusions, and sometimes makes recommendations that influence outcomes.1 Increasingly, tools labelled “risk assessment,” “predictive analytics,” or “decision support” are being used in the criminal justice system to assist with bail decisions, sentencing, and the allocation of investigative resources.2

This trend raises a challenging legal question: should algorithmic outputs be regarded as expert evidence? Expert testimony is admitted because it helps the court understand matters beyond ordinary knowledge. But experts must satisfy certain requirements: they must use a recognised method, explain their reasoning, and submit to cross-examination.3 AI systems, by contrast, work in very different ways. They may rely on confidential code and training data, and even the people who deploy them may be unable to explain how they reach their outputs.4

The concern is not simply that “technology is new”; it is that algorithmic evidence can undermine the justice system’s core commitments, such as the right of parties to test evidence, to understand why a decision is made, and to hold decision-makers accountable. These guarantees matter most in criminal cases, where the outcome can mean loss of liberty and lasting stigma. The right to a fair trial, protected by due process principles in domestic legal systems and by Article 6 of the European Convention on Human Rights (ECHR) in Europe, requires that people have a genuine opportunity to question the evidence and reasoning relied on by courts.5

This article argues that AI-generated outputs should not be treated as expert witnesses by default. The better view is that algorithmic tools may assist decision-making, but only within tightly defined limits. Where AI evidence is used, courts need safeguards that ensure disclosure, explainability, bias assessment, and contestability. Without them, algorithmic evidence risks becoming an “unanswerable expert,” damaging both fairness and public trust in the legal system.

  2. RESEARCH METHODOLOGY

This article employs a doctrinal and analytical approach. It draws on: (i) the principles governing expert evidence at common law; (ii) case law illustrating the risks and limits of unduly persuasive technical evidence; and (iii) human rights-based fair trial standards that emphasise openness and the right to challenge evidence. Comparative insight is used to show how courts have engaged with algorithmic tools in practice, notably through the case of State v Loomis in the United States.6 The aim is not to argue that AI evidence is always inadmissible, but to clarify the conditions that must be met before algorithmic outputs can safely be used in court.

  3. EXPERT EVIDENCE AND ITS LEGAL LIMITS

Expert evidence is an exception to the general rule that witnesses may testify only to facts within their own knowledge and may not offer opinions. Experts are permitted to assist the court on specialised issues. But because expert testimony can be highly persuasive, courts have always required safeguards to ensure that it does not displace the tribunal’s own judgment of the case.

The English Court of Appeal held in R v Turner that experts should not “usurp the function” of the judge or jury. If the jury can form its own view without assistance, expert opinion may be neither necessary nor appropriate.7 The point is constitutional in a small “c” sense: decision-making authority rests with the court, not with external specialists. Expert evidence is justified only when it genuinely assists and remains open to scrutiny.

R v Bonython, a well-known Australian case that is often cited for its clarity, also lists two  main questions about admissibility: (1) is the subject matter such that a lay tribunal needs help,  and (2) does the witness have enough expertise based on a recognised body of knowledge?8 The second limb is important because it connects admissibility to reliability. Expertise isn’t just  confidence or credentials; it’s disciplined knowledge that can be evaluated. 

Cross-examination is the central procedural safeguard. Experts can be asked to explain their reasoning, state their assumptions, concede the limits of their opinion, and respond to criticism. Cross-examination therefore helps courts distinguish solid expertise from persuasive guesswork. It also promotes fairness by giving the opposing party a genuine opportunity to challenge the “special” evidence relied upon.

The legal model of expert evidence thus assumes three elements: (i) a rational methodology;  (ii) transparency regarding the derivation of conclusions; and (iii) the capacity for contestation  via cross-examination and rebuttal. When the “expert” is an algorithm, these assumptions start  to break down. 

  4. WHAT MAKES AI EVIDENCE DIFFERENT

The AI systems commonly used in courts and other justice settings are not conscious reasoners; they are tools for finding patterns in data. Their outputs depend on the data they were trained on, the way the model is configured, and the choices made about its parameters. A tool that performs well on average may still perform poorly in particular situations, especially where the training data does not fairly represent the population to which it is applied.

  • The Black Box Problem 

Many contemporary machine learning systems, particularly those using complex models, generate outputs that are difficult to explain in human terms. The “black box” critique is not merely philosophical; it bears directly on how evidence can be tested. A party cannot effectively contest an algorithmic conclusion if it does not understand the basis for that conclusion.9 Nor can the court assign the evidence appropriate weight without knowing its limits.

In traditional evidence law, reliability is tested through reasoning, methodological scrutiny, and adversarial challenge. Algorithmic opacity disrupts each of these steps. AI evidence can present an “answer” without any visible “working,” which is the opposite of what courts normally demand from experts.

  • Bias and Historical Data 

AI tools that learn from past data can reproduce the biases already embedded in that data. If police have historically focused on certain neighbourhoods, arrest data may reflect bias in enforcement rather than actual crime rates. When risk tools are trained on such data, they may label people from heavily policed communities as “higher risk,” compounding the inequality.10

Asserting that “the tool is neutral” does not answer these concerns. The claim of neutrality overlooks the fact that design choices determine what counts as risk, which variables are included, and which outcomes are optimised. Those choices are value-laden, yet they are often hidden behind technical language.
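To make the mechanism concrete, the following is a minimal, purely illustrative sketch in Python, using synthetic data and hypothetical numbers that do not describe any deployed tool. It shows how a risk model trained on arrest records, rather than on offending itself, can report a heavily policed neighbourhood as markedly “higher risk” even when underlying offending rates are identical.

```python
# Illustrative sketch only: a risk model trained on arrest records (a proxy
# shaped by policing intensity) scores an over-policed group as "higher risk"
# even though the underlying offending rate is identical in both groups.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

# Two neighbourhoods with the same true offending rate (20%).
neighbourhood = rng.integers(0, 2, n)          # 0 = lightly policed, 1 = heavily policed
offended = rng.random(n) < 0.20

# Arrests depend on policing intensity: offences in the heavily policed
# neighbourhood are far more likely to enter the data as "arrests".
detection_rate = np.where(neighbourhood == 1, 0.9, 0.3)
arrested = offended & (rng.random(n) < detection_rate)

# A "risk tool" trained on arrest labels, not on offending itself.
model = LogisticRegression().fit(neighbourhood.reshape(-1, 1), arrested)

scores = model.predict_proba([[0], [1]])[:, 1]
print(f"predicted 'risk' in lightly policed area: {scores[0]:.2f}")
print(f"predicted 'risk' in heavily policed area: {scores[1]:.2f}")
# The tool reports roughly three times the "risk" for the heavily policed
# neighbourhood -- it has learned the enforcement pattern, not the crime.
```

Nothing in this sketch corresponds to any real system; it simply illustrates why “the data are objective” is not a sufficient answer to the bias objection.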

  • Automation Bias and Deference 

Even when a judge is not legally bound by an algorithm, algorithmic outputs can carry  psychological authority. Judges may defer to a numeric score or recommendation because it  appears objective and scientific. This “automation bias” can shift the real centre of gravity of  a decision away from judicial reasoning, even while formal discretion remains.11 The danger is not that judges stop deciding, but that their decisions become anchored by tools they cannot  fully interrogate. 

  5. ADMISSIBILITY, RELIABILITY, AND THE LESSONS OF TECHNICAL EVIDENCE

Courts already recognise that technical evidence can be dangerously persuasive. In R v Doheny, the Court of Appeal stressed the need for careful directions and caution when dealing with scientific evidence (there, DNA evidence) because juries may give it too much weight.12 The warning translates directly to the algorithmic context: “technical” does not mean “infallible,” and courts should not treat complex tools as though they are always right.

If algorithmic outputs are introduced as expert evidence, the court must ask: (i) what is the methodology; (ii) can it be scrutinised; (iii) what is the error rate; (iv) how was the tool validated; (v) does it perform differently across groups; and (vi) can the opposing party test it? These are not optional questions.

Yet AI vendors often assert that the code and data they use are proprietary. If commercial confidentiality prevents disclosure, the adversarial system’s capacity to assess reliability is diminished. This creates a structural unfairness: one party (usually the state) benefits from the tool’s authority, while the other has no access to its foundations.

This is why it’s dangerous to treat AI outputs as “expert evidence.” It can bring unclear  reasoning into court under the guise of expertise, without the protections that make expertise  legitimate. 

  6. FAIR TRIAL AND DUE PROCESS

Fair trial guarantees demand that individuals have a meaningful opportunity to challenge  evidence relied upon in determining their rights. Article 6 ECHR, for example, reflects  principles of procedural fairness and equality of arms.13 Even outside ECHR systems, similar  values appear through constitutional due process and common law fairness. 

A system that relies on algorithmic outputs that cannot be challenged risks violating these guarantees. If a defendant is told, “the algorithm says you are high risk,” but cannot see the basis, the model, or the assumptions, the ability to challenge becomes superficial.

The European Court of Human Rights has repeatedly emphasised the importance of procedural protections and reasoned decision-making. Osman v United Kingdom is not an “AI case,” but it illustrates the weight of fairness norms under Article 6.14 Courts must be able to explain their decisions in a way that people can understand, check, and appeal. AI systems that produce outputs without intelligible reasons cut against that goal.

There is also a problem of legitimacy. Courts derive their authority not merely from being right, but from being able to explain their decisions in terms the public can understand. Confidence in judicial decisions may erode when those decisions rest on systems that cannot be seen or examined. This is especially troubling in criminal justice, where people need to believe that outcomes are reached through fair processes, not hidden machinery.

  7. COMPARATIVE PERSPECTIVE: STATE V LOOMIS

State v Loomis is the best-known judicial encounter with algorithmic decision-making. In that case, a sentencing court relied on a proprietary risk assessment tool.15 The Wisconsin Supreme Court upheld its use, but expressed concern about transparency and held that the score should not be the determinative factor in sentencing.

Loomis captures the central tension: algorithmic tools may promote consistency, but their lack of transparency can produce unfairness. The case also matters because it shows judicial awareness that “admitting” algorithmic outputs is not the end of the analysis. The court must also specify how the outputs may be used, what cautions accompany them, and what their limits are.

But cautions alone are not enough if the defendant still cannot challenge the tool in any meaningful way. A warning that “this tool may be biased” does not help a defendant contest the particulars of how the score was produced. The case thus supports the thesis of this article: algorithmic tools may be used, but they should not be treated as conventional experts unless genuine contestability exists.

  8. SHOULD ALGORITHMS BE TREATED AS EXPERT WITNESSES

Treating AI as an “expert witness” is conceptually tempting because AI seems to do what  experts do: analyse complex information and generate conclusions. But legally, the analogy  breaks down. 

First, an “expert witness” is accountable. A human expert can be criticised, discredited, and disciplined within their profession. An algorithm cannot bear moral or professional responsibility. Accountability instead diffuses among developers, vendors, and users, and evidential law is not designed to allocate responsibility in that dispersed way.

Second, experts can be tested in cross-examination. Algorithms cannot answer questions. At best, a human representative can describe the system, but describing it is not the same as exposing its reasoning to scrutiny, especially where the available explanations are incomplete or constructed after the fact.

Third, experts are expected to reason. Many AI tools do not “reason” in a way courts can  recognise. They classify or predict based on correlations that may not be causally meaningful. 

For these reasons, elevating AI to expert status risks creating an “unchallengeable” authority.  The safer legal stance is: AI outputs should be treated as technical material that must be  supported by a human expert who can explain validation, limitations, bias testing, and  relevance, and whose testimony remains contestable. Even then, courts should be cautious  about how much weight is assigned. 

  9. WAY FORWARD: SAFEGUARDS AND REFORM

If courts are to use AI evidence responsibly, the law must develop safeguards that translate  fairness principles into operational rules. 

  • Transparency and Disclosure 

Courts should mandate the disclosure of training data characteristics (at a minimum, in  summary form), validation studies, established error rates, and the model’s intended application.  When proprietary claims hinder disclosure, courts should evaluate whether fairness  necessitates exclusion or diminished weight. 

  • Explainability Standards

Where an output affects liberty or other significant rights, courts should require a level of explainability adequate for substantive challenge. A bare “risk score” unaccompanied by any explanation should be treated with particular caution.
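As a purely hypothetical illustration of the difference this standard makes, the sketch below contrasts a bare score with a per-feature breakdown. The feature names, weights, and values are invented; the point is only that, for a simple additive model, each input’s contribution can be stated exactly and therefore disputed.

```python
# Hypothetical illustration: a bare risk score versus one a party could contest.
# Feature names, weights, and values are invented for this sketch; for a simple
# additive (linear) model, per-feature contributions are exact, not approximations.
features = {"prior_convictions": 3, "age": 22, "employment_gap_months": 14}
weights  = {"prior_convictions": 0.30, "age": -0.02, "employment_gap_months": 0.01}
baseline = 0.10

contributions = {name: weights[name] * value for name, value in features.items()}
score = baseline + sum(contributions.values())

print(f"risk score: {score:.2f}")                    # the unexplained number, on its own
for name, contribution in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"  {name:>24}: {contribution:+.2f}")      # each component a defendant can dispute
```

Many deployed tools are not additive in this way, which is precisely why opaque models call for either stronger explanation requirements or reduced evidential weight.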

  • Bias Audits and Equality 

Independent bias audits should take place before a tool is used. Courts should also consider whether an algorithm performs differently across protected groups and whether its use entrenches structural inequality.
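The following is a minimal sketch, on synthetic data, of one question such an audit might ask: whether the tool’s false positive rate differs between two groups. The threshold, base rates, and score distributions are hypothetical and do not describe any real tool.

```python
# Minimal sketch of one bias-audit check: comparing false positive rates across
# two groups. All data here are synthetic; the threshold and score distributions
# are hypothetical and do not describe any deployed tool.
import numpy as np

def false_positive_rate(scores, reoffended, threshold=0.5):
    """Share of people who did NOT reoffend but were still flagged 'high risk'."""
    flagged = scores >= threshold
    return flagged[~reoffended].mean()

rng = np.random.default_rng(1)
n = 5_000
group = rng.integers(0, 2, n)                      # 0 and 1: two protected groups
reoffended = rng.random(n) < 0.25                  # identical base rate in both groups
# Hypothetical tool that systematically scores group 1 higher than group 0.
scores = np.clip(rng.normal(0.40 + 0.15 * group, 0.20), 0.0, 1.0)

for g in (0, 1):
    mask = group == g
    fpr = false_positive_rate(scores[mask], reoffended[mask])
    print(f"group {g}: false positive rate = {fpr:.2f}")
# A meaningful audit would report gaps like this one (and analogous gaps in false
# negatives and calibration) before the tool is ever placed before a court.
```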

  • Judicial Training 

Judges don’t need to be engineers, but they should know how to ask simple questions about  bias, reliability, error rates, and validation. Without this, courts might rely on technical  authority without fully understanding it. 

  • Human Oversight and Limits on Use 

AI should remain strictly assistive. Courts should not allow algorithmic outputs to determine outcomes; they should insist on independent judicial reasoning and give clear reasons whenever algorithmic outputs are relied upon.

  10. CONCLUSION

AI tools can support courts, but they also risk importing opaque, biased, and unchallengeable reasoning into legal decisions. Traditional expert evidence is tolerated because it is transparent, contestable, and accountable. Many algorithmic tools fail those conditions. Accordingly, AI-generated outputs should not be treated as expert witnesses by default. Instead, courts must insist on disclosure, explainability, bias auditing, and the preservation of meaningful challenge. The guiding principle should be simple: technology must serve justice, not replace the conditions that make justice legitimate.

BIBLIOGRAPHY 

Cases 

  • Osman v United Kingdom (1998) 29 EHRR 245 
  • R v Bonython (1984) 38 SASR 45
  • R v Doheny [1997] 1 Cr App R 369
  • R v Turner [1975] QB 834
  • State v Loomis 881 NW 2d 749 (Wis 2016) 

Legislation and Materials 

  • European Convention on Human Rights (ECHR) art 6. 

Books and Reports 

  • Richard Susskind, Tomorrow’s Lawyers: An Introduction to Your Future (2nd edn, OUP  2017). 

Journal Articles 

  • Danielle Keats Citron, ‘Technological Due Process’ (2008) 85 Washington University Law  Review 1249. 
  • Frank Pasquale, ‘A Rule of Persons, Not Machines’ (2019) 87 George Washington Law  Review 1.

1 Richard Susskind, Tomorrow’s Lawyers: An Introduction to Your Future (2nd edn, OUP 2017)

2 State v Loomis 881 NW 2d 749 (Wis 2016) 

3 R v Bonython (1984) 38 SASR 45

4 Frank Pasquale, ‘A Rule of Persons, Not Machines’ (2019) 87 George Washington Law Review 1

5 European Convention on Human Rights (ECHR) art 6 

6 State v Loomis 881 NW 2d 749 (Wis 2016)

7 R v Turner [1975] QB 834 

8 R v Bonython (1984) 38 SASR 45

9 Danielle Keats Citron, ‘Technological Due Process’ (2008) 85 Washington University Law Review 1249

10 Citron (n 9) 

11 Pasquale (n 4)

12 R v Doheny [1997] 1 Cr App R 369 

13 ECHR art 6

14 Osman v United Kingdom (1998) 29 EHRR 245 

15 State v Loomis 881 NW 2d 749 (Wis 2016)
