Learning Library


Balancing AI and Human Judgment

Key Points

  • Deciding whether a human or an AI should make a particular decision depends on the task’s nature, with AI generally outperforming humans on many statistical decisions but humans excelling when nuanced judgment and context are needed.
  • In fraud detection, AI can filter the bulk of alerts by assigning confidence scores, achieving high accuracy on clearly high‑ or low‑confidence cases, while human analysts handle the ambiguous alerts whose confidence scores fall in the uncertain mid‑range.
  • Performance curves show AI’s success rate rises sharply with confidence, whereas humans maintain a flatter curve, often outperforming AI at the mid‑range (around 50% confidence) due to their ability to incorporate external information and flexible reasoning.
  • The optimal solution is a hybrid system that routes high‑confidence alerts to AI for efficiency and delegates uncertain or complex cases to skilled analysts, leveraging the strengths of both.
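The hybrid routing described in these key points can be sketched as a small function. The threshold values are hypothetical - the source gives no exact cutoffs - and exist only to illustrate the idea of sending confident cases to the AI and ambiguous ones to an analyst:

```python
def route_alert(confidence: float,
                low: float = 0.2, high: float = 0.8) -> str:
    """Route a fraud alert based on the AI's confidence score (0.0-1.0).

    `low` and `high` are illustrative thresholds, not values from the video.
    """
    if confidence <= low or confidence >= high:
        # The AI is confident the alert is clearly a false positive
        # (near 0) or clearly real (near 1), so let it decide.
        return "ai"
    # Mid-range confidence: the AI is effectively saying "I don't know",
    # so escalate to a skilled human analyst.
    return "human"

print(route_alert(0.95))  # "ai": confident this is a real alert
print(route_alert(0.05))  # "ai": confident this is a false positive
print(route_alert(0.50))  # "human": ambiguous case
```

In practice the two thresholds would be tuned against the observed performance curves, widening or narrowing the band of alerts that reach analysts.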

Full Transcript

# Balancing AI and Human Judgment

**Source:** [https://www.youtube.com/watch?v=8lo1s29ODj8](https://www.youtube.com/watch?v=8lo1s29ODj8)
**Duration:** 00:08:54

## Sections

- [00:00:00](https://www.youtube.com/watch?v=8lo1s29ODj8&t=0s) **Human‑AI Decision Allocation in Fraud Detection** - The speaker explains how to split fraud‑alert handling between analysts and an AI by using confidence‑score performance curves, routing high‑ and low‑confidence cases to the algorithm and ambiguous, mid‑confidence cases to human analysts.
- [00:03:09](https://www.youtube.com/watch?v=8lo1s29ODj8&t=189s) **AI Confidence vs Human Judgment** - The passage explains that AI outperforms humans when its confidence score is near either extreme, humans surpass AI in the uncertain mid‑range, and merging both via augmented intelligence creates a balanced, intermediate performance curve.
- [00:06:18](https://www.youtube.com/watch?v=8lo1s29ODj8&t=378s) **Optional AI Display Reduces Bias** - The speaker explains that showing AI fraud recommendations only when analysts request them mitigates automation bias and trust concerns by allowing a human first impression, while noting that displaying accuracy percentages can further diminish reliance on the AI.

## Full Transcript
[0:00] A decision needs to be made. But who should make it? Me, a human, ... or an artificial intelligence, an AI? We've discussed before that humans can outperform AI at some tasks, but that, statistically, AI will do a better job of deciding for other tasks. So for one single decision, who should decide? Well, the answer is a fascinating combination of holistic curves and human bias. Let's get into it.

[0:42] So, consider a fraud detection system. The system generates alerts of potentially fraudulent transactions. Financial analysts review each alert. Now, there are thousands of events generated each day, and the analysts are overwhelmed, with 90 percent of those alerts being false positives. An AI system could help alleviate the workload. But which alerts should the AI handle, and which should be processed by a skilled financial analyst?

[1:21] Well, let's draw a graph to answer the question, "Is this a real alert?" The Y axis tracks the success rate. So an alert comes in, we make a prediction as to whether it is real or not, and we track whether that prediction turned out to be right. Along the X axis is the confidence score. A confidence score of zero percent says a prediction thinks that this is definitely not a real alert; it's a false positive. A confidence score of 100 percent means that a prediction is certain that it is a real alert.

[2:31] Now, a typical AI performance curve will look something like this. Very low confidence scores (this is not a real alert) and very high confidence scores (this is a real alert) are correlated with a high success rate. When the AI is not sure about a given prediction, the success rate is lower. And so effectively the AI algorithm is saying, "I don't know".
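The U‑shaped AI curve described above can be sketched as a toy function. The shape (high success near 0% and 100% confidence, lowest around 50%) follows the video, but the numbers are invented for illustration and not fitted to any real system:

```python
def ai_success_rate(confidence: float) -> float:
    """Toy U-shaped AI performance curve.

    Success rate is high when the confidence score is near 0.0 or 1.0
    and lowest around 0.5, where the AI is effectively saying
    "I don't know". Values are illustrative only.
    """
    # Distance from the uncertain midpoint, rescaled to 0..1.
    certainty = abs(confidence - 0.5) * 2
    return 0.5 + 0.45 * certainty

for c in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"confidence {c:.2f}: expected success rate {ai_success_rate(c):.2f}")
```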
[3:15] Now, human performance curves are typically a little flatter than that. Often not quite as accurate as a very confident AI algorithm, but a little better at making the right decision when the AI is unsure. At a 50 percent confidence level, a human is likely to do a better job than an AI.

[3:45] Now why is that? Well, when an AI is certain of itself, it's highly performant and beats out humans, who can lose consistency, focus, and attention. AIs don't get distracted. But on the other hand, when an AI is unsure, often for cases that are complex or statistically rare, humans can outperform an AI prediction by bringing in additional information and context. They can look stuff up or ask a colleague, whereas the AI sticks to its same old decision logic and information.

[4:18] So when a new alert comes in, if the AI assigns a high or low confidence level, then chances are that, statistically speaking, it will do a better job of deciding whether that alert is real or a false positive than a given financial analyst. But this is not a zero-sum game. It doesn't have to be AI or human. We have one more option: augmented. Augmented intelligence combines both a human decision, aided by AI, and this performance curve falls somewhere between the two. And for somewhat low and somewhat high confidence scores, which make up a significant number of predictions, it's augmented intelligence that will have the highest success rate.

[5:19] Except ... for augmented intelligence to be most effective, we need to account for the messy business of human cognitive bias. We're not always great at doing what we're told. It turns out that how we present information from an AI algorithm to a human decision maker has a significant influence on how effectively that information is used.
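The three curves can be compared directly to pick the statistically best decision maker for a given confidence score. All three functions below are toy models: the numbers are invented purely to reproduce the qualitative shapes the video describes (AI best at the extremes, human best around 50 percent, augmented best in the somewhat-low and somewhat-high bands):

```python
def ai_curve(c: float) -> float:
    """Toy U-shaped AI success rate (illustrative numbers)."""
    return 0.5 + 0.45 * abs(c - 0.5) * 2

def human_curve(c: float) -> float:
    """Toy flatter human success rate: worse than a confident AI at the
    extremes, better around 50% confidence."""
    return 0.6 + 0.2 * abs(c - 0.5) * 2

def augmented_curve(c: float) -> float:
    """Toy augmented (human aided by AI) success rate: highest in the
    somewhat-low and somewhat-high confidence bands."""
    return 0.55 + 0.38 * abs(c - 0.5) * 2

def best_decision_maker(c: float) -> str:
    """Return whichever decision maker has the highest expected
    success rate at confidence score c."""
    curves = {"ai": ai_curve(c),
              "human": human_curve(c),
              "augmented": augmented_curve(c)}
    return max(curves, key=curves.get)

for c in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"confidence {c:.2f} -> {best_decision_maker(c)}")
```

With these particular numbers, the extremes route to the AI, 50 percent routes to the human, and the in-between bands route to augmented intelligence, matching the qualitative picture in the transcript.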
[5:52] So, to illustrate that, let's consider forced display vs. optional display. A forced display simultaneously displays an AI recommendation along with a given decision case. So, for every fraud alert that I need to make a decision about, I, as the analyst, also see the AI's recommendation. And this can lead to something called automation bias, which is the propensity for humans to favor suggestions from automated decision-making systems and to ignore contradictory information. Effectively, the human decision maker is saying the AI knows best and going with the AI prediction at the expense of their own judgment.

[6:49] Optional display means the AI recommendation is only shown to the human decision maker when they request it. So, a person sees a decision case and can then ask the AI to reveal its recommendation. This overcomes automation bias by giving a person time to consider the case for themselves before consulting an AI recommendation. The human is not overwhelmingly influenced by what the AI thinks because they've had a chance to make up their own first impression.

[7:25] And then there's the whole issue of trust, too. When an AI recommendation is accompanied by an accuracy percentage, which indicates how likely the prediction is to be correct, humans are less likely to incorporate the AI recommendation into their decision, regardless of the accuracy percentage being displayed. Basically, we don't like recommendations that openly tell us they might be wrong.

[7:55] So, we've seen that who should make a decision - a human, an AI, or a human assisted by an AI recommendation - is something that we can derive. We can move from subjective decisions to the quantifiable: for a given decision, who the most effective decision maker is likely to be.
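The optional-display pattern can be sketched as a small review workflow. The class name and method names are hypothetical - the video describes the interaction pattern, not an API - but the core idea is that the AI recommendation stays hidden until the analyst has committed a first impression:

```python
class FraudAlertReview:
    """Sketch of the 'optional display' pattern: the AI recommendation
    is hidden until the analyst asks for it, and only after recording
    their own first impression, mitigating automation bias."""

    def __init__(self, alert_id: str, ai_recommendation: str):
        self.alert_id = alert_id
        self._ai_recommendation = ai_recommendation  # hidden by default
        self.analyst_first_impression = None

    def record_first_impression(self, verdict: str) -> None:
        """Store the analyst's unaided judgment on the alert."""
        self.analyst_first_impression = verdict

    def reveal_ai_recommendation(self) -> str:
        """Show the AI's recommendation, but only on request and only
        after the analyst has formed a first impression."""
        if self.analyst_first_impression is None:
            raise RuntimeError("record a first impression before "
                               "consulting the AI recommendation")
        return self._ai_recommendation

review = FraudAlertReview("alert-001", "likely fraud")
review.record_first_impression("not fraud")
print(review.reveal_ai_recommendation())  # "likely fraud"
```

A forced-display system would instead show `ai_recommendation` immediately alongside the case, which is exactly what this guard is designed to avoid.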
[8:16] And when the most effective decision maker is a combination of AI and human - that's augmented intelligence - we must consider the presentation of that augmentation to minimize human cognitive bias in the decision-making process. Brought together, us humans and AI algorithms make a pretty powerful team. We can improve decision-making outcomes - if we just know who to ask.

[8:45] If you have any questions, please drop us a line below, and if you want to see more videos like this in the future, please like and subscribe. Thanks for watching.