Learning Library

Keyboard Control vs Screen Collaboration

Key Points

  • Two competing approaches are emerging: Anthropic’s Claude directly controls your keyboard and mouse, while OpenAI’s ChatGPT reads your screen and collaborates without taking control.
  • Claude’s “computer use” feature lets the LLM drive the UI, whereas ChatGPT’s updated desktop app, available to Plus and Enterprise users, only reads specific apps (initially coding environments) and offers feedback.
  • The speaker finds ChatGPT’s read‑only assistance stable and less risky than autonomous control, and suspects OpenAI will expand it to more applications within weeks.
  • The speaker sees this read‑only model as positioned to eventually displace dedicated AI‑assisted development environments, since it can offer insight and debugging help inside any editor, although it does not yet write code directly.
  • Developers are split: some prefer the hands‑off guidance ChatGPT offers, while others favor tools like Cursor that can generate code directly.
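
The read-versus-control distinction in the points above can be pictured as a difference in granted capabilities. The following is only an illustrative sketch: `Capability`, `Assistant`, and `tokens_per_hour` are hypothetical names invented here, not part of any Anthropic or OpenAI API. It also works out the hourly rate implied by the speaker's rough figure of a million tokens every 15 minutes for Claude's computer use.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Capability(Enum):
    """Hypothetical permissions an assistant might be granted."""
    READ_SCREEN = auto()    # observe app contents (the "pair of eyes")
    CONTROL_INPUT = auto()  # drive the keyboard and mouse


@dataclass
class Assistant:
    """Toy model of an assistant limited to its granted capabilities."""
    name: str
    capabilities: frozenset
    actions: list = field(default_factory=list)

    def read(self, app_text: str) -> str:
        # Reading never changes the user's system; it only records what was seen.
        self._require(Capability.READ_SCREEN)
        self.actions.append(("read", len(app_text)))
        return f"{self.name}: feedback on {len(app_text)} characters"

    def type_text(self, text: str) -> None:
        # Typing has side effects, so it needs the stronger capability.
        self._require(Capability.CONTROL_INPUT)
        self.actions.append(("type", text))

    def _require(self, capability: Capability) -> None:
        if capability not in self.capabilities:
            raise PermissionError(f"{self.name} lacks {capability.name}")


def tokens_per_hour(tokens: int, minutes: float) -> float:
    """Scale a token count observed over `minutes` to an hourly rate."""
    return tokens * 60 / minutes


# A read-only observer versus an assistant that can also drive the UI.
observer = Assistant("observer", frozenset({Capability.READ_SCREEN}))
driver = Assistant(
    "driver", frozenset({Capability.READ_SCREEN, Capability.CONTROL_INPUT})
)
```

In this sketch, `observer.type_text(...)` raises `PermissionError` while `driver.type_text(...)` succeeds, which is one way to see why a read-only design has a smaller risk surface. `tokens_per_hour(1_000_000, 15)` gives the hourly rate implied by the speaker's estimate.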

Full Transcript

**Source:** [https://www.youtube.com/watch?v=Cj1m2O-Tmow](https://www.youtube.com/watch?v=Cj1m2O-Tmow)
**Duration:** 00:03:21

## Sections

- [00:00:00](https://www.youtube.com/watch?v=Cj1m2O-Tmow&t=0s) **AI Keyboard Control vs Screen Collaboration** - The speaker compares Claude’s mouse‑driven automation with ChatGPT’s code‑reading assistance, highlighting both experimental approaches to AI‑augmented computer work.

## Full Transcript
0:00 All right, which would you rather have? Would you rather have the artificial intelligence control your keyboard and your mouse, or would you rather have the artificial intelligence see your screen and collaborate with you? Those are the two bets that are being made right now.

0:14 Claude made the first bet: Anthropic released Claude for computer use, and you can actually drive it around on the screen and use the mouse and all of that. It runs at about a million tokens every 15 minutes.

0:24 On the other hand, ChatGPT today released an update to their desktop app that allows you, if you are paying for their service, right, if you're a Plus or an Enterprise customer, to use ChatGPT to read specific apps on your computer. So it's designed to read coding apps initially, and they will expand it to other things eventually.

0:48 Basically, the bet is: would you rather collaborate with ChatGPT while you are working on these apps, and ChatGPT can just look directly at what you're doing, maybe at the code that you're writing, and give you ideas? Or do you think it's more helpful for Claude to directly drive the screen?

1:07 My sense is both of these are experimental and we are going to converge. I see these as modalities of operation. I think Claude has made the bolder bet, and the bet that is more likely to feel like a beta. I played with ChatGPT. It doesn't really feel like a beta, even though they're labeling it a beta. It reads your code just fine, and you can talk with it, and it gives you perspective. It feels really stable. It also feels like a smaller bet for that company to make.

1:40 I suspect that because it feels stable, they're going to be looking at expanding it in the next few weeks to cover other apps.

1:49 We will see, but my sense is giving the LLM the ability to read but not write is much, much less risky than trying to get it to fully use the system autonomously. And so that's a way for ChatGPT to expand the usage, to expand the surface of your work that it covers, and to do so relatively efficiently, right? This is not necessarily a difference in intelligence for them. It's just giving ChatGPT a pair of eyes that it didn't have before to read very specific programs.

2:25 So we will see. If you look at it this way, it looks like the kind of positioning that is designed long-term to displace AI-assisted development environments, because you could be in any development environment and be AI-assisted right there. Now, it doesn't actually write the code directly in the development environment yet. And so one of the key differences, for example, with Cursor is that Cursor literally will write the code using the large language model.

2:56 And I have heard varying responses from developers on this, and if you're a developer, I know you have opinions. Some developers are going to prefer the model that ChatGPT just released today, where it doesn't write the code but it gives you a perspective, it gives you an opinion, it helps you debug.

3:12 So I think there's a market for this. If you have downloaded the new ChatGPT, give it a try. Let me know what you think.