OpenAI’s Agent Has a Problem: Before It Does Anything Important, You Have to Double-Check It Hasn’t Screwed Up

Operator, OpenAI's brand new AI agent, doesn't quite deliver the hands-off experience some might hope it would.

Behold Operator, OpenAI's long-awaited agentic AI model that can use your computer and browse the web for you. 

It's supposed to work on your behalf, following the instructions it's given like your very own little employee. Or "your own secretary" might be more apt: OpenAI's marketing materials have focused on Operator performing tasks like booking tickets, making restaurant reservations, and creating shopping lists (though the company admits it still struggles with managing calendars, a major productivity task).

But if you think you can just walk away from the computer and let the AI do everything, think again: Operator will need to ask for confirmation before pulling the trigger on important tasks. That throws a wrench into the premise of an AI agent acting on your behalf, since the clear implication is that you need to make sure it's not screwing up before allowing it any real power.

"Before finalizing any significant action, such as submitting an order or sending an email, Operator should ask for approval," reads the safety section in OpenAI's announcement.

This measure highlights the tension between keeping stringent guardrails on AI models and allowing them to freely exercise their purportedly powerful capabilities. How do you put out an AI that can do anything without it doing anything stupid?

Right now, a limited preview of Operator is only available to subscribers of the ChatGPT Pro plan, which costs an eye-watering $200 per month. 

The agentic tool uses its own AI model, called Computer-Using Agent, to interact with its virtual environment using mouse and keyboard actions, guided by constant screenshots of your desktop.

The screenshots are interpreted by GPT-4o's image-processing capabilities, theoretically allowing Operator to use any software it's looking at, and not just ones designed to integrate with AI.

But in practice, it doesn't sound like the seamless experience you'd hope for (though to be fair, it's still in its early stages). When the AI gets stuck, as it still often does, it hands control back to the user to remedy the issue. It will also stop working to ask you for your usernames and passwords, entering a "takeover mode."
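To make that workflow concrete, here is a minimal sketch of how an agent loop with approval gates and a takeover step might be structured. It is purely illustrative and is not OpenAI's implementation or API: the `Action` type, the `run_agent` function, the action names, and the set of "significant" actions are all hypothetical, loosely based on the examples in OpenAI's announcement (submitting an order, sending an email).

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# Hypothetical labels: the announcement cites orders and emails as
# "significant" actions that should wait for user approval.
SIGNIFICANT = {"submit_order", "send_email"}

# Hypothetical labels for situations where the agent gives control back.
NEEDS_USER = {"stuck", "need_login"}


@dataclass
class Action:
    kind: str         # e.g. "click", "type", "submit_order", "need_login"
    detail: str = ""  # free-form description of the step


def run_agent(
    proposed: Iterable[Action],
    approve: Callable[[Action], bool],
    take_over: Callable[[Action], None],
) -> list[str]:
    """Walk through proposed actions, pausing wherever a human is needed.

    This sketch only records what would happen; it performs no real
    mouse, keyboard, or network activity.
    """
    log: list[str] = []
    for action in proposed:
        if action.kind in NEEDS_USER:
            # "Takeover mode": the user fixes the problem or types credentials.
            take_over(action)
            log.append(f"handed control to user: {action.kind}")
        elif action.kind in SIGNIFICANT:
            # Significant actions only go through with explicit approval.
            if approve(action):
                log.append(f"executed after approval: {action.kind}")
            else:
                log.append(f"skipped (not approved): {action.kind}")
        else:
            log.append(f"executed: {action.kind}")
    return log
```

Calling `run_agent` with a plan that clicks a search box, hits a login page, and then tries to submit an order would interrupt the user twice in three steps, which is the heart of the complaint: the more actions count as significant, the less hands-off the agent becomes.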

It's "simply too slow," wrote one user on the ChatGPTPro subreddit in a lengthy writeup, who said they were "shocked" by its sluggish pace. "It also bugged me when Operator didn't ask for help when it clearly needed to," the user added. In reality, you may have to sit there and watch the AI painstakingly try to navigate your computer, like supervising a grandparent trying their hand at Facebook and email.

Obviously, safety measures are good. But it's worth asking just how useful this tech is going to be if it can't be trusted to work reliably without being neutered.

And if safety and privacy are important to you, then you should already be uneasy with the idea of letting an AI model run rampant on your machine, especially one that relies on constantly screenshotting your desktop.

While you can opt out of having your data used to train the AI model, OpenAI says that it will store your chats and screenshots for up to 90 days on its servers, TechCrunch reported, even if you delete them.

Because Operator can browse the web, that means it will potentially be exposed to all kinds of danger, including attacks called prompt injections that could trick the model into defying its original instructions.

More on AI: Rumors Swirl That OpenAI Is About to Reveal a "PhD-Level" Human-Tier Intelligence

The post OpenAI's Agent Has a Problem: Before It Does Anything Important, You Have to Double-Check It Hasn't Screwed Up appeared first on Futurism.


Huge Study Finds Constellation of Health Benefits for Ozempic Beyond Weight Loss

In a ginormous new study, researchers have begun mapping the manifold health benefits of drugs like Ozempic and Wegovy beyond weight loss. 

Published in the journal Nature Medicine, this new study led by Ziyad Al-Aly of the Veterans Affairs health system in St. Louis tracked outcomes for millions of diabetes patients over a period of 3.5 years.

Of those, over 215,000 had been prescribed a glucagon-like peptide-1 (GLP-1) receptor agonist, the class of drugs that includes Ozempic, Wegovy, Mounjaro, Zepbound, and others, and 1.7 million were on another form of blood sugar-lowering medicine.

Looking at other disorders in the data ranging from Parkinson's disease and Alzheimer's to kidney disease and opiate addiction, Al-Aly and his team found that those who were on GLP-1 medications saw significant improvement across a staggering range of health concerns — and far beyond anything clearly linked to weight or blood sugar.

Though many studies have found that these blockbuster drugs seem to be beneficial for specific disorders, "no one had comprehensively investigated the effectiveness and risks of GLP-1 receptor agonists across all possible health outcomes," the physician-scientist told Nature.

In particular, Al-Aly said that the drugs' impact on addiction disorders "stood out" to him, with 13 percent of the GLP-1 cohort who had issues with addiction seeing improvement — a finding that dovetails with other studies about these drugs and their effect on addiction.

Other apparent benefits were even harder to make sense of. Al-Aly and his team also discovered that psychotic disorder risk was lowered by 18 percent for the GLP-1 cohort, and the Alzheimer's risk was cut by 12 percent.

"Interestingly, GLP-1RA drugs act on receptors that are expressed in brain areas involved in impulse control, reward and addiction, potentially explaining their effectiveness in curbing appetite and addiction disorders," Al-Aly said in a statement published by Washington University in St. Louis, which was also involved in the study. "These drugs also reduce inflammation in the brain and result in weight loss; both these factors may improve brain health and explain the reduced risk of conditions like Alzheimer’s disease and dementia."

While those findings are indeed incredible, the researchers also found that other issues seemed to be exacerbated by taking GLP-1s. Along with an 11 percent increase in arthritis risk, the team found a whopping 146 percent increase in cases of pancreatitis — another discovery that complements prior research into the drugs' dark side.

Though that figure is pretty jarring, Al-Aly seemed to take it in stride.

"Given the drugs’ newness and skyrocketing popularity, it is important to systematically examine their effects on all body systems, leaving no stone unturned, to understand what they do and what they don’t do," he said in the university's press release.

By looking so deeply into the drugs, these scientists are, as Al-Aly puts it, drawing a "comprehensive atlas mapping the associations" of GLP-1 drugs that looks into all of their effects on the body — an important quest as they continue to rise in popularity and usage.

More on GLP-1s: Woman Annoyed When She Gets on Wegovy and It Does Nothing

The post Huge Study Finds Constellation of Health Benefits for Ozempic Beyond Weight Loss appeared first on Futurism.


Mark Zuckerberg Shows Off Bizarre Video of Himself Leg Pressing Chicken Nuggets

AI might be worsening carbon emissions, but at least we have this fake video of Mark Zuckerberg leg-pressing chicken nuggets, we guess.

Combo Meal

Meta-formerly-Facebook announced a new suite of AI-powered video-creating and editing tools today, collectively called "Meta Movie Gen."

Longtime CEO Mark Zuckerberg showed off the new AI offering in his favorite way to promote anything: by showing off his love for fitness — albeit with some very strange, very AI twists.

In a bizarre Instagram video, Zuck can be seen doing leg presses in a series of increasingly strange AI-generated settings. In the first scene, he's pictured using the machine in a neon-lit gym; in the next, he's dressed like Caesar and pictured against a distinctly ancient Roman backdrop. At one point he's pressing dripping racks of gold.

Then, in perhaps the strangest scene of all, Zuck is suddenly pictured leg-pressing a large bucket of chicken nuggets whilst surrounded by a sea of french fries.

"Every day is leg day with Meta's new MovieGen AI model that can create and edit videos," Zuck captioned the video. "Coming to Instagram next year."

Sure! Why not. Generative AI might be guzzling energy and drastically worsening carbon emissions in the process, but we get... a fake billionaire nugget press. Will somebody please make it make sense?

Mixed Reactions

The top comments on the video were overwhelmingly positive.

"Whoa!" wrote one impressed Instagram user. "That's exciting!!"

But other Instagram users were more skeptical.

"Second richest man in the world spending his [research & development] money on this," commented one user, seemingly incredulous at Meta's resource allocation.

"How many artists did you steal from to train your AI?" asked another netizen. A fair question, given that Zuck recently drew criticism for declaring that "individual creators or publishers tend to overestimate the value of their specific content."

Looking Ahead

In a press release, Meta characterized Movie Gen as an "advanced and immersive storytelling suite of models" with "four capabilities: video generation, personalized video generation, precise video editing, and audio generation."

But the chicken nugget promo aside, there's no set release date for the tool.

"We aren't ready to release this as a product anytime soon," Meta's chief product officer Chris Cox wrote in a Threads post, "but we wanted to share where we are since the results are getting quite impressive."

Or, alternatively, Meta wants its shareholders to know that a competitor to OpenAI's Sora model is in the works — and that Zuck can leg press copious amounts of chicken nuggets.

More on Mark Zuckerberg: Zuckerberg Says It's Fine to Train AI on Your Data Because It Probably Has No Value Anyway

The post Mark Zuckerberg Shows Off Bizarre Video of Himself Leg Pressing Chicken Nuggets appeared first on Futurism.


NASA Engineers Were Disturbed by What Happened When They Tested Starliner’s Thrusters

Later this week, Boeing's plagued Starliner is set to attempt its return journey from the International Space Station.

But instead of ferrying NASA astronauts Butch Wilmore and Suni Williams back to the ground, it'll be undocking and reentering without any crew on board — after a software update, that is, because it was originally unable to fly without astronauts inside it.

Even before the ill-fated capsule launched in early June, engineers noticed several helium leaks. During Starliner's docking procedures, the leaks quickly turned into a real problem. The spacecraft missed its first attempt to dock with the space station.

Ever since, Boeing and NASA engineers have been struggling to identify the root cause of the problem.

At first, NASA remained adamant that it was simply a matter of routine procedure to investigate the mishap before imminently returning Wilmore and Williams on board Starliner. The agency repeatedly fought off reports that the two astronauts were "stranded" in space, arguing that engineers just needed a little more time to figure out the issue.

But it didn't take long for NASA to change its tune. While attempting to duplicate the issue at NASA’s White Sands Test Facility in New Mexico, engineers eventually found what appeared to be the smoking gun, as SpaceNews' Jeff Foust lays out in a detailed new breakdown of the timeline.

A Teflon seal in a valve known as a "poppet" expanded as it was being heated by the nearby thrusters, significantly constraining the flow of the oxidizer — a disturbing finding, because it greatly degraded the thrusters' performance.

Worse, without being able to perfectly replicate and analyze the issue in the near vacuum of space, engineers weren't entirely sure how the issue was actually playing out in orbit.

During a late August press conference announcing its decision to send Starliner back empty, NASA commercial crew program manager Steve Stich admitted that "there was just too much uncertainty in the prediction of the thrusters."

"People really want to understand the physics of what's going on relative to the physics of the Teflon, what's causing it to heat up and what's causing it to contract," he added. "That's really what the team is off trying to understand. I think the NASA community in general would like to understand a little bit more of the root cause."

While engineers found that the seals had returned to a more regular shape after the thrusters were fired in space, they were worried that similar deformations might take place during prolonged de-orbit firings.

A lot was on the line. Without perfect control over the thrusters, NASA became worried that the spacecraft could careen out of control.

"For me, one of the really important factors is that we just don’t know how much we can use the thrusters on the way back home before we encounter a problem," NASA associate administrator for space operations Ken Bowersox said, as quoted by SpaceNews.

"If we had a way to accurately predict what the thrusters would do all the way through the deorbit burn and through the separation sequence, I think we would have taken a different course of action," Stich said during last month's teleconference. "But when we looked at the data and looked at the potential for thruster failures with a crew on board... it was just too much risk."

That's a polite way of saying that NASA had very serious concerns. According to Foust's reporting, the saga evolved into "NASA’s biggest human spaceflight safety crisis since the shuttle Columbia accident more than two decades ago."

Earlier this week, NASA announced that Starliner's uncrewed undocking will take place as soon as Friday evening.

Wilmore and Williams will stay behind, presumably watching as their ride to space departs without them.

The two astronauts will have to be patient, as their ersatz shuttle, SpaceX's Crew-9 mission, won't arrive until September 24 at the earliest. Even then, the pair will have to wait until the Crew Dragon spacecraft returns to Earth in February, extending what was supposed to be an eight-day mission into an eight-month affair.

More on Starliner: Astronauts Hear Strange Sounds Coming From Boeing's Cursed Starliner

The post NASA Engineers Were Disturbed by What Happened When They Tested Starliner's Thrusters appeared first on Futurism.
