# AI Black Friday Deal Showdown

**Source:** [https://www.youtube.com/watch?v=DcnTK7E1Ayc](https://www.youtube.com/watch?v=DcnTK7E1Ayc)
**Duration:** 00:11:01

## Summary

- The experiment compared five AI tools—ChatGPT 5.1, Claude Opus 4.5, Gemini 3, and the Atlas and Comet smart browsers—to see which could locate the best Black Friday discount on a specific item (a gray sectional couch).
- Clear, detailed intent in the prompt is crucial; vague instructions caused Comet to miss the color requirement and led to generic or incorrect results from the browsers.
- Model‑based AIs (ChatGPT, Claude, Gemini) can combine web‑search results with their own reasoning, whereas the smart browsers rely more on the raw page content, leading to markedly different performance.
- In the test, Atlas failed to surface any specific product deals (returning only generic articles), illustrating that without precise prompting the browsers struggle to deliver useful Black Friday recommendations.

## Sections

- [00:00:00](https://www.youtube.com/watch?v=DcnTK7E1Ayc&t=0s) **AI Black Friday Deal Test** - The speaker compares five AI tools (ChatGPT 5.1, Claude Opus 4.5, Gemini 3, Atlas, and Comet) in finding the best Black Friday deal on a sectional couch, emphasizing that clear, detailed prompts are essential for accurate results.
- [00:03:06](https://www.youtube.com/watch?v=DcnTK7E1Ayc&t=186s) **Evaluating Comet's Product Recommendations** - The speaker critiques Comet's shopping suggestions, highlighting its ability to surface similar products with detailed info but noting a Walmart bias, lack of color specificity, and less‑than‑optimal price selections.
- [00:06:32](https://www.youtube.com/watch?v=DcnTK7E1Ayc&t=392s) **Critique of LLM Product Recommendations** - The speaker evaluates ChatGPT, Claude, and Gemini's shopping suggestions, noting missing links, confusing category groupings, inconsistent rankings, and ultimately rating their overall usefulness.
- [00:10:03](https://www.youtube.com/watch?v=DcnTK7E1Ayc&t=603s) **Comparing AIs for Deal Hunting** - The speaker assesses several AI models—including GPT‑5.1, Claude, Gemini, and others—on their effectiveness at locating Black Friday and Cyber Monday deals, concluding that GPT‑5.1 outperforms the rest while each model shows distinct strengths and limitations.

## Full Transcript
I put five AIs to the test on Black
Friday and I want to share the results
right here with you. I'm going to pull
up the browsers and all of that in a
minute. You're going to see what we did.
First, what is the test? We are figuring
out which AI is able to help us find the
true best deal on Black Friday commonly
discounted items. I chose a sectional
couch for this, but you can absolutely
use this for TVs or anything else that
you're looking for. What are the
contestants? ChatGPT 5.1, Claude Opus
4.5, Gemini 3, and two smart browsers. I
went with Atlas and I went with Comet.
I am going to show you what each of them
did in a second. But first, I want to
give you the takeaways. What are the
overall learnings I had as I ran this
test? And this will help you regardless
of whether you're prompting for Black
Friday or anything else. So, number one,
the intent has to be very clear for the
model to come back and give you what you
want. I know that's not new, but it came
through. You'll see in Comet that Comet
did not figure out even though I was on
a gray sectional couch page that I
wanted a gray sectional couch. So, the
color didn't come through because I
didn't specify it. Anywhere you don't
have clear intent, the model is just
going to give you its best guess. That's
why I built a prompt that is actually
designed to let you specify as much as
you want, in as much detail as you want,
the thing you're looking for for Black
Friday, so the model can go and get it.
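The actual prompt is being shared on the Substack and isn't reproduced here, but a fill-in-the-details template along these lines captures the idea (the field names and wording below are illustrative, not the speaker's prompt):

```python
# Hypothetical sketch of a "specify as much detail as you want" deal-hunting
# prompt. Every field name here is an assumption for illustration only.
def build_deal_prompt(item, color="any", budget="average price sensitivity",
                      retailers="any"):
    return (
        f"Find the best current Black Friday deal on a {item}. "
        f"Color: {color}. Budget: {budget}. "
        f"Preferred retailers: {retailers}. "
        "For each result, give the product name, retailer, sale price, "
        "original price, percent off, and a direct product link."
    )

# Specifying the color up front avoids the missed-intent problem the
# video describes with Comet.
prompt = build_deal_prompt("sectional couch", color="gray")
```

The point of the template is simply that any slot you leave vague (like color) is a slot where the model will substitute its best guess.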
And you'll see what happens when you
start to get clearer and how much more
powerful the search gets. Overall, the
browsers have a very different approach
architecturally than the models do. The
models can use web search tools and then
their own inference or reasoning
abilities to figure out what is the
right answer to the prompt around a
Black Friday deal. The browsers are actually equipped with more information. They are able to look at
the web page itself along with whatever
prompt you give. Now, typically when we
are browsing the web, we are using more
casual language. And so, I opted for a
split test where we have a commonly used
phrase for Black Friday deals in the
browsers along with all of that rich
detailed information on the page to feed
the model and a more detailed prompt in
ChatGPT, Claude, and Gemini. What you
see is what you get here, guys. Let's
jump right in. Our first contestant is
the Atlas browser. I just pulled up a
love seat convertible couch. It is a sectional couch in the meaning of the word, but it's a very sort of casual couch. I
wanted to see if the model would pick up
on other options. What I got was really
really disappointing. I got Black Friday
sectional sofa deals in general. I got Crate and Barrel. I got articles. I
did not get any specific product. Right?
If you look through, you will see that
this was a couch, and the model just could not come up with specific deals for me at all. So, I grade that one as a failure.
I don't think I got any useful results.
And because of the way it wrote just
general articles, I'm confident that
whatever product I picked, I'm going to
get general articles about that deal.
It's not going to help me with the
economic activity of shopping. Let's see
what Comet had to say. Here's Comet. And
already we can see it's a much more
useful answer. Yes, I started with a
different one. It doesn't matter in this
case. It found specific products. That's
the key differentiator, right? It's
found products. And I will tell you, the
first one is a gray match. It's actually
a bed, though. So, it's finding products
that are within the same cognitive
family, but they're not exactly the
same. I love the generative interface
here where it shows you specific
options. It gives you a short text
description, it gives you a verified
link, and it tells you the retailer. One
thing you will notice is that this is a
very Walmart-heavy response. And I wonder
if that's the prompting. There's another
Walmart product. There's a Sam's Club
product. Uh Wayfair. Wayfair does make
an appearance. You can see it's not
being color specific. If I'd asked for
gray, it probably would have actually
leaned into the gray side. I didn't
specify my intent. And so that's what I
get. Back to Walmart. One of the other
things you will notice is that this is
not super price optimized. So this is an
$839 deal. This is a $1,200 couch. Uh, and it's claiming this is around $700 to $900 and bringing me a $1,200 couch.
That's a relevant price difference. And
so, as much as the presentation is good,
I found the actual substantive results
not super great. Like, if I gave Atlas a
failing grade, I would give Comet about
a C for this. Let's see what we get when
we move to other models. Okay, here's
ChatGPT. I have a much more complex
prompt here. If you're wondering, by the
way, is it a fair test? Is a more
complex prompt in a browser going to
work better? You will see the results of
that at the end of this video. Yes, I
will be sharing this prompt on the
Substack. I'm actually going to be
sharing multiple variants because
running this has made me realize that
you can skew this prompt to be more of a
deal hound or a deal hunter. You can skew
it to be more around the preferences.
You want the details right, the gray
couch, etc. So, I'll do a few variants,
but overall, the product results are
really clean. It gives you a visual. It
shows you already deals that are better.
I was not seeing deals this good. I
wasn't seeing consistent deals this
good. It has a really odd choice for a
Black Friday deal. This is a very fancy
$3,000 couch. I'm not quite sure why it
chose to include that in the summary. Uh
it is able to calculate value and so I'm
able to get I guess five couches under
600 bucks, four of which are under 500.
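The value math being described is easy to reproduce yourself when comparing lists like these (the listing names and prices below are made up for illustration, not the actual deals from the video):

```python
# Illustrative percent-off and under-budget math for a deal shortlist.
# All prices here are invented examples.
listings = [
    {"name": "Couch A", "sale": 270, "original": 900},
    {"name": "Couch B", "sale": 499, "original": 750},
    {"name": "Couch C", "sale": 839, "original": 1200},
]

for item in listings:
    # Percent off = how far the sale price falls below the original price.
    item["pct_off"] = round(100 * (1 - item["sale"] / item["original"]))

# A simple budget filter, like "couches under 600 bucks".
under_600 = [it["name"] for it in listings if it["sale"] < 600]
```

Here Couch A works out to 70% off and both A and B clear the $600 bar, which is the kind of "value density" comparison being made across the models.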
So overall, the value density is higher on ChatGPT. Now I still have to
hunt around for the text um and sort of
click on that to get what I want, but
it's pretty good. It gives me a nice
table. It gives me the percentage off.
I find that very handy. It does, uh,
claim to give me the link over here, but
it doesn't actually, which I think is a
mess. Let's see what Claude had to say.
Okay, here's Claude running the same
prompt. By the way, this prompt is quite
detailed. I opted to just write
sectional couch right here and tell it
to basically assume average for
everything else because I'm lazy. I love
that you can be lazy with prompts. Uh,
it goes through a complete search. It
comes through with a budget tier, which
is really widely ranging right from 270
bucks to 989. I'm not sure why those are
in the same category. I think that's
probably incorrect. If you look at the
links, they appear to be just links to
the overall website, not to the specific
product, which I think is a mess. Uh, it
gives me mid-range tier, uh, which it
defines as anything from $1,000 to $2,000, but
then this is over $2,000. It doesn't
give me features to help me understand.
So, I think this is a mess. Uh, it does
give me a ranking of overall best value,
but that doesn't really make sense. Like, the Albany Park one shows up again. Uh, it shows up again at a different price. I think this was over $3,000 when I was looking at ChatGPT.
And I'm glad it found the Wayfair George
Oliver modular. That's fine. I think overall this is less useful than the ChatGPT answer. I feel like the lack of specific links really hurts
and the categories and the way it
grouped it doesn't really make sense.
So, so far if you're looking at the
LLMs, I would say ChatGPT maybe a B, B minus. And I think we're looking at like a C plus maybe for Claude. There's some usefulness. There's some good value, but it feels like it's on par with Comet, but with less
graphics, except maybe a more in-depth
search. So, that kind of makes up for
it. What did Gemini do? Gemini dove
right in and gave you all of its
thoughts, came back with an overall
price point, very detailed, but there
were no links. And it gives me top
picks, but it doesn't give me a way to
get to it. It appears to lean into that
George Oliver modular again. It comes
back with the Honbay. You remember the Honbay was over on Comet. Um, it gives
me Deal Hunter advice that's pretty
generic. I would say this is in some
ways even worse than Claude because it
has fewer choices. It's less clear what
the categories mean to it. And there are
no links whatsoever, just like Claude,
but there's just even less choice. So,
if Claude was like trying to get to a
C plus and if we had like a C grade, uh,
because of the sort of weird issues with
pricing that Comet had, this might
actually be a C minus. And so, right
now, when I'm looking overall at where I
would trust for getting a reliable read on the best deals possible, I'm looking at ChatGPT 5.1 as a really good deal
hunter. And it really was. It's a
head-to-head test. But we have one more to test: what happens with Comet and a really detailed prompt. Okay, so I
pasted in my super detailed prompt. I
did exactly the same thing where I left
a bunch of it unfilled out and used
sectional couch and uh average price
sensitivity, which is what I did with
ChatGPT that got me good responses. Uh,
and it comes back with a much more
detailed answer here. It tells me what
it checked. It comes back with deals. It
has a whole table of them. It does not
have a visual picker, which I kind of
miss. Uh what's weird is if you look
through these deals, I actually don't
see that modular coming back. Uh and so
it feels like it went and found a bunch
of deals, but it wasn't necessarily able
to categorize them as well or
efficiently as ChatGPT was. Um, it
thinks that the five seat sleeper is the
best deal now. Uh, it's in the market, but it's not as much of a runaway steal as that $270 couch. Overall, what I am learning from
this is that, as ironic as it may be, the shorter prompt is probably more effective when
you are working with an agentic browser.
And the longer prompt, the more detailed
prompt I showed you, is going to be more
effective when you're working with, say,
a ChatGPT 5.1 and you want to get
absolutely the best deal. And so, I'm
always in the market for delivering
value. My goal here is going to be to
craft a series of really good Black
Friday prompts that you can use to
optimize for a deal, optimize for a
specific item you're looking for. Um,
and I'll pop those into the
Substack. But I felt like this was
really useful because one of the things
OpenAI has done is they talk about GDP,
right? This idea that AI is supposed to
do economic activity that's useful.
Well, everybody knows that Americans go
shopping on Black Friday and Cyber
Monday and everything else and when they
do, they look for deals. That is an
economic activity. This is really the
first year when I could plausibly put
five AIs up, including two browsers and the three LLMs, and I could see what
they were good at or not. And I would
say really there is a difference. ChatGPT 5.1 was the most useful. And the others were okay. I would say probably the second best overall was
Comet in terms of pleasantness of use.
It just wasn't as accurate or complete
as ChatGPT. I think Claude did okay, but probably not better than the others.
And Gemini didn't do super well. It
highlights to me how different these
models are because you'll recall that
I've done videos saying Opus 4.5 is
really good at long-running agentic
tasks. Gemini 3 is really good at
synthesizing complex documents. Well,
these are different activities. This is
an economic activity. This is finding
deals. It turns out, properly prompted, GPT 5.1 is the king at finding
deals. All right, you can get the
prompts over on the Substack.