There’s a brand new AI agent that can browse the web, fill out forms and more without you having to touch your mouse.

(Image credit: Hugging Face)
The Open Computer Agent, an open-source demo, can see what’s displayed on screen, click buttons, fill in forms, and perform tasks step by step like a person.
  • Hugging Face debuted a new AI tool to navigate the web for you.
  • Using a real browser, the Open Computer Agent can complete tasks such as getting directions or booking a ticket.
  • Both the agent and the demonstration are open-source.

    Hugging Face has released Open Computer Agent, a free (but limited) tool that acts as a personal assistant in your web browser.

    As part of the ongoing “smolagents” project, the Open Computer Agent can interact with websites and apps just like you, using an invisible keyboard and mouse to complete requests. The AI can open a browser, enter information into forms, click buttons, and much more. Ask it for directions, for example, and it will go to Google Maps, enter your origin and destination, and show you the route, like a faithful digital chauffeur. You can test it yourself using the live demo, though its popularity has created a backlog, so expect some delays and errors.

    We’re launching Computer Use in Smolagents. As vision models improve, they can power more complex agentic workflows. Qwen-VL models support built-in grounding, i.e. ability to locate any element in an image by its coordinates, thus to… pic.twitter.com/mI8MuWZkIS May 6, 2025
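
The grounding idea in the post above, finding an on-screen element by its coordinates and then acting on it, can be sketched roughly as follows. This is a minimal illustration and not Hugging Face's implementation: `Element`, `FakeScreen`, `locate`, `fill_field` and `get_directions` are hypothetical names, the screen only records actions instead of driving a real browser, and `locate` uses plain string matching where a real agent would query a vision model.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Element:
    """A labelled rectangle on screen (hypothetical stand-in for a UI element)."""
    label: str
    x: int
    y: int
    width: int
    height: int

    def center(self) -> tuple[int, int]:
        return (self.x + self.width // 2, self.y + self.height // 2)


@dataclass
class FakeScreen:
    """Stand-in for a browser window: it logs actions rather than performing them."""
    elements: list[Element]
    log: list[str] = field(default_factory=list)

    def click(self, x: int, y: int) -> None:
        self.log.append(f"click({x}, {y})")

    def type_text(self, text: str) -> None:
        self.log.append(f"type({text!r})")


def locate(screen: FakeScreen, description: str) -> tuple[int, int]:
    """Assumed grounding step: map a description to pixel coordinates.

    A real agent would send a screenshot to a vision model here; this
    sketch just matches the description against known element labels.
    """
    for element in screen.elements:
        if description.lower() in element.label.lower():
            return element.center()
    raise LookupError(f"no element matching {description!r}")


def fill_field(screen: FakeScreen, description: str, text: str) -> None:
    """Click the located field, then type into it."""
    x, y = locate(screen, description)
    screen.click(x, y)
    screen.type_text(text)


def get_directions(screen: FakeScreen, origin: str, destination: str) -> list[str]:
    """Step through a directions request the way the article describes."""
    fill_field(screen, "origin", origin)
    fill_field(screen, "destination", destination)
    screen.click(*locate(screen, "directions button"))
    return screen.log
```

Each step reduces to "locate, then click or type", which is why a model that can return coordinates for an element is enough to drive a form without any site-specific integration.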

    Agent you have

    Open Computer Agent shares a philosophy with similar tools such as OpenAI’s Operator and Opera’s Browser Operator. Like them, Hugging Face’s agent is meant to make AI an active participant rather than a passive source of information.

    Open Computer Agent, like Browser Use, is open-source. Anyone can see how it functions, build on it, or tweak it to fit niche use cases. The agent is a starting point for something more flexible, not a finished, polished product. The demo can make mistakes, and it can get stuck on CAPTCHA tests and logins.

    Many people would love to be able to book tickets, check store hours, run searches, look up directions, or click through menus with just a natural language prompt. It’s one thing to ask ChatGPT to find cheap flights. It’s quite another to watch the tool try to book a flight by going to a travel site, scrolling through listings and clicking “book now.”

    Eric Hal Schwartz is a freelance writer for TechRadar with more than 15 years of experience covering the intersection of technology and the world. He spent five years as head writer of Voicebot.ai, where he was at the forefront of reporting on large language models and generative AI. Since then, he has become an expert in generative AI products, including OpenAI’s ChatGPT, Anthropic’s Claude, Google Gemini and other synthetic media tools. His experience spans print, digital and broadcast media as well as live events. He continues to tell the stories people want and need to hear about the rapidly changing AI space and its impact on their lives. Eric is based in New York City.

www.aiobserver.co
