OpenAI Atlas: AI Browser Review
Key Points
- OpenAI introduced Atlas, an AI‑enabled web browser that adds a persistent chat assistant sidebar, mirroring the “smart‑browser” model popularized by tools like Perplexity’s comment browser.
- In a live demo, the assistant successfully generated and styled a PowerPoint slide deck—handling layout, color schemes, and content expansion—though it struggled with finer formatting details such as precise text‑color placement.
- The reviewer found Atlas to be the most practically useful AI‑browser utility they’ve seen in a while, noting its ability to execute commands (e.g., adjusting font size) while the user multitasks across other tabs.
- While Atlas isn’t yet powerful enough to dethrone Chrome, the reviewer is bullish on its development trajectory and the broader possibilities of an AI‑driven browser modality.
- The tool shows promise for automating repetitive web tasks (e.g., folder creation, content editing), potentially freeing up significant time for users.
Sections
- OpenAI's Atlas Browser Review - The speaker explains OpenAI’s AI‑enabled Atlas browser, showcases its sidebar chat assistant creating a PowerPoint presentation, and highlights its strong styling assistance alongside its formatting limitations.
- Leveraging LLMs for Linear Browser Tasks - The speaker advises assigning low‑ambiguity, linear tasks—like folder creation, writing assistance, and simple spreadsheet calculations—to browser‑based language models to ensure they perform effectively and reduce user frustration.
- AI Browsers and Prompt Injection Risks - The speaker warns that AI‑powered browsers, while useful for summarizing web content and interpreting media, are vulnerable to prompt injection attacks that can exploit any text on a page, raising unresolved safety concerns about their rapid, unsupervised use.
- Invitation to Share Opinions - The speaker expresses curiosity and encourages the other person to provide their perspective or information.
Full Transcript
# OpenAI Atlas: AI Browser Review **Source:** [https://www.youtube.com/watch?v=9ydIVzh7TBo](https://www.youtube.com/watch?v=9ydIVzh7TBo) **Duration:** 00:10:03 ## Summary - OpenAI introduced Atlas, an AI‑enabled web browser that adds a persistent chat assistant sidebar, mirroring the “smart‑browser” model popularized by tools like Perplexity’s comment browser. - In a live demo, the assistant successfully generated and styled a PowerPoint slide deck—handling layout, color schemes, and content expansion—though it struggled with finer formatting details such as precise text‑color placement. - The reviewer found Atlas to be the most practically useful AI‑browser utility they’ve seen in a while, noting its ability to execute commands (e.g., adjusting font size) while the user multitasks across other tabs. - While Atlas isn’t yet powerful enough to dethrone Chrome, the reviewer is bullish on its development trajectory and the broader possibilities of an AI‑driven browser modality. - The tool shows promise for automating repetitive web tasks (e.g., folder creation, content editing), potentially freeing up significant time for users. ## Sections - [00:00:00](https://www.youtube.com/watch?v=9ydIVzh7TBo&t=0s) **OpenAI's Atlas Browser Review** - The speaker explains OpenAI’s AI‑enabled Atlas browser, showcases its sidebar chat assistant creating a PowerPoint presentation, and highlights its strong styling assistance alongside its formatting limitations. - [00:03:11](https://www.youtube.com/watch?v=9ydIVzh7TBo&t=191s) **Leveraging LLMs for Linear Browser Tasks** - The speaker advises assigning low‑ambiguity, linear tasks—like folder creation, writing assistance, and simple spreadsheet calculations—to browser‑based language models to ensure they perform effectively and reduce user frustration. - [00:06:21](https://www.youtube.com/watch?v=9ydIVzh7TBo&t=381s) **AI Browsers and Prompt Injection Risks** - The speaker warns that AI‑powered browsers, while useful for summarizing web content and interpreting media, are vulnerable to prompt injection attacks that can exploit any text on a page, raising unresolved safety concerns about their rapid, unsupervised use. - [00:10:03](https://www.youtube.com/watch?v=9ydIVzh7TBo&t=603s) **Invitation to Share Opinions** - The speaker expresses curiosity and encourages the other person to provide their perspective or information. ## Full Transcript
OpenAI heated up the browser wars by
launching their browser that is AI
enabled called Atlas. I'm going to get
into what they launched, what the
promise was, and where I think it works
and where I think it doesn't. So, first
off, what did they launch? It is a
browser that looks a lot like Chrome or
any other browser you might use, except
it has a chat assistant in the side. In
this sense, it's very similar to other
smart browsers that have already
launched. The comment browser from
Perplexity comes to mind. Exactly the
same idea. You just launch it in the
sidebar and you do your task with the
chat and then you have the main browser
pane just the way it always is. I can
actually show you what that looks like.
So, here we are in the browser. I'm
actually working on a presentation about
the browser. I realize that's super
meta, but you can see I have a little
chat here, right? And you can kind of
interact with the chat. In this case,
what I had the agent do is to help me
create the PowerPoint presentation about
the agent. And just to be transparent,
it was able to lay out the styling. So,
it got this green and this blue
highlight uh really effectively. It was
able to lay out the title. It was able
to lay out the dark background look very
professional. It also was able to take
and expand the copy really effectively.
Where it didn't work as well was it
struggled with some of the details of
formatting. I know this doesn't look
like fancy formatting, but just to be
transparent, getting it to do white text
on background was not something it was
super good at. But that's a minor
nitpick, right? I don't want to go into
that. I want to talk about the larger
theme here. I wanted to try something
that was a real life task and just to
play with it and see how it worked. And
I got to say this is closer to useful
browser utility with AI than I've had in
a while. And that's what makes me
bullish on the trajectory here. I don't
know that this browser by itself today
is knock it out of the water.
Incredible. We're going to dethrone
Chrome. But I see the trajectory that
this team is shipping on and I'm super
interested in where they go next. And I
do think there's some really interesting
opportunities in the way this browser
modality works. So the way it interacts.
So for example, if I say
uh please adjust the font size on this
slide, it's going to take that challenge
as we chat and actually start to do
something. You can see it start to
organize and turn the sparkles on and
actually do something with my screen
while I go and do other work. And so if
I go and I look at other things, I can
come back later. If I go and look at my
LinkedIn, for example, I can then come
back to my slide deck and see that it
continues to work. Ironically, what the
slide deck is showing is one of the
things that I think is most useful here.
Think about the boring work that you do
on the web. Things like automating
folder creation. That takes a lot of
time. You can do that work much more
easily if you just open a tab and you
ask this browser to do it. Similarly,
you can think about the work you would
do as a writing coach. You can see it
has these issues, right? Like it
actually made it worse. And this is why
I don't want to overpromise it. This
thing does struggle with some of the
aesthetics and some of the direct tasks.
I also asked it to book a yoga class for
me. It eventually got through, but it
was about 10 times more painful than
booking it myself. And so I look at
these situations and I say, where are
their linear tasks that I can get into
that enable the browser to be at its
best, not at its worst. And so instead
of throwing it into something that has
fairly high ambiguity, like a
PowerPoint, how can I give it a task
where it's set up to succeed? In this
case, creating folders is super linear.
You can't screw it up. You create the
folder, you name the folder, and you're
done. Another one that I think is really
effective is being a second pair of eyes
on the screen. So, if you're writing
something, can it be a Grammarly or
writing coach for you on the side? I
realize that there are probably people
at Grammarly that will hear this and
sweat bullets, but like I think it's
fair. Like, it can literally look at
anything you're writing on the web and
give you a fairly thoughtful writing
critique. I think that's a great use
case for it. I think another great use
case is just letting the LLM do the
planning and thinking for you where you
have complexity. So, a great example is
look at this spreadsheet, perform some
simple calculations off the spreadsheet.
I don't really have time to and then
come back. As long as it's in the
browser and it can see it, it can do
that math for you. That can simplify
budgeting. That can simplify financial
tasks. it can simplify a lot of the
basic math and thinking that we do
around the web because a lot of it
happens in the browser anyway. A
creative one that I've come up with, I
don't know, you you tell me if you think
this is effective, but in theory, this
should work really well for time
tracking because you should be able to
investigate time spent per site or task
by interacting with the browser around
what you were actually doing. And that
brings up one of the most powerful
features about the Atlas browser, which
is it remembers more about you the more
you use Atlas. And so it will have an
Atlas specific memory set that is
private to you. They say where if you
are interacting with the browser more,
you get more value. The browser knows
you better. The browser knows your
previous chats. It knows the previous
places you visited and it understands
what you are trying to do because it has
seen your work already. So those are
some of the positives. If we turn around
and we look at some of the difficult
things, I think you saw some of them in
the little demo I gave you. Like there's
some challenges around ambiguous tasks
like PowerPoint deck creation. I think
there's also an unclear use case around
the value of the utility they're trying
to go for. So to unpack that, if it
books your yoga session for you in 20
minutes instead of you taking 2 minutes
to do it, is that really adding value
even if it is doing all the work? I have
questions. Another example, if it is
going to be able to shop for you, do you
lose the pleasure of shopping? Do you
lose the pleasure of planning the trip?
Planning the trip is one of the things
that is right on that browser's
suggested use case. So, we have to think
about what we want to do and whether we
want to delegate that task to an AI
browser. And that's going to become a
very real thing because this is not the
last update we'll see on this browser.
This is not the last browser that's
going to launch. AI browsers are a big
thing. And that brings me to my final
question, which is what about security?
Because right now, these browsers are
taking in the text from a website.
That's actually one of the great use
cases for them is they can summarize
text on a website. that you can use them
to look at YouTube videos and tell you
what's in the YouTube video. You could
probably do that for this video. But if
you use them to do that and you are on
the wrong kind of page, a prompt
injection attack is possible. If someone
has put text on that page that instructs
an LLM to do something malicious, the
LLM in the browser, it's not clear that
it can distinguish that. In fact, there
are known vulnerabilities in other AI
browsers that I expect would persist
here where the browser will treat every
piece of text it gets on the page, even
malicious text, as part of the prompt.
And then where are you? Because at that
point, it just follows the prompt and
the prompt injection attack succeeds.
And so, it seems like the LLM browser
builders expect us to just watch these
things browse around the web very
slowly. And that's how we protect
ourselves from prompt injection attacks.
But to actually have value, we need to
not have to watch them. They need to get
faster and we need to not have to watch
them. And it's not clear to me yet how
we can show and demonstrate safety. And
I know the teams at Perplexity building
the Comet browser, teams at OpenAI
building this browser, they care deeply
about security. So, I'm not suggesting
they don't, but I'd like to see the kind
of browser safety card development that
we've had with model safety cards where
we start to say, you know what, we know
this browser is safe in these ways
because we've tested it for these
vulnerabilities. This is the known risk
for this browser and this is what you
should use it for. because otherwise I
think there's an assumption that either
the browser is default safe which is
what Chrome has taught us and that's
dangerous here or there's an assumption
that it should never be used which I
think is also an overreaction. I do
think there's real value here for what
I'm going to call boring web work that
is low ambiguity. If I want to just set
it to do something very linear like
click around and triage my email for a
long time in the background and I'm
going to go off and do something else
and I don't care and it's fairly low
risk cuz maybe it's just making folders
for email. Fine. It can go do that.
Anything that is like that where it's
like you can't screw it up. You just
have to logically follow the task on the
web for a long time. It's going to be
great at that. And that's fine with me
because we humans don't love doing that
work. So if it wants to pick that up,
that's great. So overall, my grade for
this browser C plus B minus maybe it's
not really at a point where I think it's
going to overtake Chrome, but it is much
much better than the web browsing value
I've seen from OpenAI previously. So
agent mode, I didn't get a lot of value
from it. This is definitely better than
that. And so I see the trajectory of
this team and I could see in 6 months,
this is a really interesting browser. If
you want my quick take versus Comet, I
still prefer Comet a little bit because
I think it has some data inputs and
outputs on key sites that make it
useful. I use it for LinkedIn a lot
because I can see pending invitations
through the Comet browser and it's super
helpful. It also has great plugins to
calendar. I expect that will get fixed
with this browser, too. But it's not
there yet. And I find that the speed I
get from that direct data input output
is useful. I think we're going to get to
a two-speed web where we're going to
start to see those data inputs and
outputs becoming very useful for agentic
browsers where they're available and
we're going to see slower service sort
of off-road service where the browser
has to use the UI. That's my first
impression. What did you think of Atlas?
I'd be curious to hear.