Bret Taylor and Clay Bavor talk AI startups, AGI, and job disruptions – Semafor

Posted: February 22, 2024 at 7:59 pm

Q: When you decided you wanted to start a company, did you know what you were going to do?

Bret: We knew we wanted to empower businesses with this technology. One of our principles was, it's hard, even in San Francisco, to follow the pace of progress. Every day, there's a new research paper, a new this, a new that. Imagine being a big consumer brand. How do you actually take advantage of this technology?

It's not like you can read research papers every week as the CEO of Weight Watchers. So we knew we wanted to enable businesses to consume this technology in a push-button way, a solution to bring this to every company in the world. We called on a lot of CIOs, CTOs, and CEOs of companies we worked with in the past. We talked to them about the problems they were facing. Through that, we got excited about one concept, which is the future of digital customer experiences, thinking of this asset, which is the AI agent, as being a really important new concept.

It wasn't just about customer service; it was something bigger than that. We always talked about people's websites and apps; that's their digital identity. In the future, every company will need an agent. "Can you update the agent with the new policies?" That sentence will come from a CEO's mouth at some point. What we loved about it: when you're starting a company, you want to imagine yourself doing it for 20-plus years. So it's a big commitment. We love that there was a short-term demand in customer service where we could improve something very expensive that no one likes much. So it's a great application of AI.

Clay: One of the things that we've been very focused on from the beginning is being intensely customer-led. Back to your question on how we started, it was through a series of conversations rather than taking this new technology and being the hammer trying to find the proverbial nails.

Q: Would you want to develop your own models?

Bret: We converged on a technical approach that we should not be pre-training models. Our area of AI research is around autonomous agents; it's really thriving in the open-source community. There's a lot of people making an AI that's answering all my emails, which is an energetic, open-source community that is fun to watch.

There are a couple of reasons why we really believe in it. One is customer benefits. Imagine it's Black Friday, Cyber Monday, and you want to extend your return period to past New Year's, which is a common thing to do. If you're building your own pre-trained model, you're like, "We can update the policy in like three weeks. And that's going to cost $100,000." Compare that with the agility that you get from using a constellation of models, retrieval-augmented generation, all the common techniques in agentic-style AI.

Similarly, it means we can serve a much broader range of customers because it's not like you have to build an expensive model for every customer. And like Reid Hoffman's characterization, there are frontier models, like GPT-4 and Gemini Ultra. And then there are foundation models, which is just this broad range of open source and others. The foundation models are sort of a commodity now. It makes a lot of sense to focus on fine-tuning and post-training and say, we can start with these great open-source models or other people's foundation models, and just add value, which is unique to our business. For example, we have a model that detects, are you giving medical advice? We have a model that detects, are you hallucinating? The pre-training part of that isn't particularly differentiated.

So our view is that there's a Moore's Law level of investment in these foundation models. We'd rather benefit from that rising tide lifting our boat, rather than burning our own capital, doing what is a relatively undifferentiated part of the AI supply chain, and really focus on what makes our platform unique. If you squint, it's like the cloud market. How many startups build their own data centers now? For some companies, it might make sense, but you should have a very specialized use case. Otherwise, licensing the server from Amazon or Azure probably makes a ton more sense. I think the same is true of these foundation models.

Q: Has the process of building an autonomous agent been more challenging than you thought it would be?

Clay: It's been incredibly fun exploring this territory because you can anthropomorphize how humans think, reason, and recall things, and those really apply to so many things in developing agents. For instance, what does it take to respond effectively to customers and solve problems? You have to plan. How should the agent go about planning?

So we have specialized models that are experts in planning and thinking through the next steps. How do you answer a factual question about the company? You recall a memory of something you read previously. So we've figured out how to give our agent access to, in essence, a reference library that it can read through in an instant, pull out the right bits, and use those to summarize and synthesize an answer. How do we make sure that answers are factual or the action that the agent is taking is correct?

We have another module within our agent architecture that we affectionately call the supervisor. And the supervisor, before a message is sent to a user or an action is taken, will basically review the agent's work, and say, "Actually, I think you need to make a little change here, try again and get back to me," and only after the initial process has revised that will the action be taken or the response sent.
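
As an illustration only, the review-then-revise loop Clay describes can be sketched in a few lines of Python. This is a minimal sketch, not Sierra's implementation: the `draft` and `review` callables, the `Review` type, the `max_rounds` limit, and the toy return-policy strings are all hypothetical stand-ins for real model calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Review:
    """Verdict from the (hypothetical) supervisor on one candidate reply."""
    approved: bool
    feedback: str = ""


def respond_with_supervision(
    draft: Callable[[str, Optional[str]], str],
    review: Callable[[str, str], Review],
    user_message: str,
    max_rounds: int = 3,
) -> Optional[str]:
    """Return a reply only after the supervisor approves it, else None."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        candidate = draft(user_message, feedback)   # agent proposes a reply
        verdict = review(user_message, candidate)   # supervisor checks it
        if verdict.approved:
            return candidate
        feedback = verdict.feedback                 # "try again and get back to me"
    return None  # nothing passed review; the caller can escalate instead


# Toy stand-ins for the two model calls (assumptions for illustration).
def toy_draft(message: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "You can return items any time."
    return "You can return items within 30 days."


def toy_review(message: str, candidate: str) -> Review:
    if "30 days" in candidate:
        return Review(approved=True)
    return Review(approved=False, feedback="State the 30-day return window.")


reply = respond_with_supervision(toy_draft, toy_review, "What is your return policy?")
```

Nothing is sent until a candidate passes review, and a bounded number of rounds keeps the loop from running indefinitely.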

On what's been hard, there are a number of really important challenges that, if you're going to put AI directly in front of your customers, you need to mitigate and overcome. For hallucinations, large language models can synthesize answers and facts that aren't, in fact, factual. So we built a layered approach to ensuring that we can mitigate hallucinations, and there's no guarantee because AI is non-deterministic. We're using supervisory layers, giving it access to knowledge provided by the company. We're providing audit and inspection tools, and quality assurance tools, so that our customers can review conversations and, in essence, coach the AI in the right direction through this feedback mechanism.

No matter how smart one of these frontier models is, it's not going to know, Reed, where your order is, or when I bought my shoes and whether or not they're eligible for returns. So you have to be able to integrate safely, securely, and reliably with the systems that you use to run your business. And we've built some really important protections there where all actions taken when you're interacting with customer or company data are completely deterministic. They use good old-fashioned if-then-else statements, and don't rely on LLMs and their unpredictability to manage things like access controls, security, and so on.
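
To make the "good old-fashioned if-then-else" point concrete, here is a minimal Python sketch of a deterministic guard that runs before any action touches customer data. It is an assumption-laden illustration, not Sierra's code: the `Order` type, the `can_start_return` function, and the 30-day `RETURN_WINDOW` are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Order:
    order_id: str
    customer_id: str
    purchased_on: date


RETURN_WINDOW = timedelta(days=30)  # assumed policy value, for illustration


def can_start_return(order: Order, authenticated_customer_id: str, today: date) -> bool:
    """Plain rule checks; no model output is consulted here."""
    if order.customer_id != authenticated_customer_id:
        return False  # access control: only the order's owner may act on it
    if today - order.purchased_on > RETURN_WINDOW:
        return False  # outside the return window
    return True


order = Order("A1", "cust-1", date(2024, 1, 1))
owner_in_window = can_start_return(order, "cust-1", date(2024, 1, 20))
other_customer = can_start_return(order, "cust-2", date(2024, 1, 20))
owner_too_late = can_start_return(order, "cust-1", date(2024, 3, 1))
```

The language model can decide that a return is what the customer wants, but whether the action is allowed is settled by ordinary code that gives the same answer every time for the same inputs.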

The last interesting challenge has been, of course you want an AI agent representing your company to be able to do stuff, to answer questions, to be able to solve problems. But you also want it to be a good ambassador of your brand and of your company. So one of the most interesting challenges has been, how do we imbue a company's AI agent with its values and its voice, its way of being?

One of our design partners, OluKai, is a Hawaiian-inspired retailer. They wanted to make sure that their AI agent interacts with what they call the Aloha experience. So we've imbued it with tone, language, some knowledge of the Hawaiian language. We've even had it throw the shaka emoji at a customer who was particularly friendly towards the end of an interaction.

One of our other customers has what they refer to as the language of luxury, a kind of a refined way of interacting with customers with really excellent manners. These are some of the challenges that we've had to overcome. They've been hard but really interesting.

Q: When people think of automated customer service, the thought is, how do I get to a real person in the quickest way? Are you seeing evidence that people might enjoy talking to a robot more than a person?

Bret: That's definitely our ambition. So Weight Watchers, the AI in their app is handling over 70% of conversations completely autonomously. And it's a 4.6 out of 5-star customer satisfaction score, which is remarkable. OluKai, over Black Friday, Cyber Monday, we handled over half their cases with a 4.5 out of 5 customer satisfaction score. The joke we all say is if you surveyed anybody, "Do you like talking to a customer support chatbot?", you could not find a person who says yes.

I think if you survey people about ChatGPT, you get the inverse. Everyone loves it, even with its flaws, and hallucinations. It's delightful. It's fun. That's why it's so popular. One of our big challenges will be to shift the perception of chatting with an AI. At our company, we don't use the word "bot," because we've found that consumers associate it with the old technology.

So our customers get to name their agent, but we usually refer to it as an AI or an agent or a virtual agent, to try to make sure that the brand association is hey, it's this new thing, it's this fun, delightful, empathetic thing, not that old, robotic thing. But it'll be an interesting challenge.

Our AI agents are always on, faster, more delightful than having to wait on hold, not because the agent on the other side is bad. But you don't have to wait on hold. It's instantaneous. It's faster. I hope that we end up where people are like, "Don't you have an AI I can talk to? Are you kidding me? I have to talk to a real person?" I don't think we're there, and I think there'll be a bit of a cultural shift. We've even talked about how do you actually know you're talking to one of the good ones versus the old bad ones? Because they kind of look the same. But you know it when you see it.

Q: There are some really heavy hitters in this space trying to do something similar. How do you differentiate yourselves?

Bret: We're really focused on driving real success with real scaled consumer brands like Sonos, Sirius XM, Weight Watchers, and OluKai. We really recognize that it's very easy to make a demo in this space, but to get something to work at scale, that's where the hard stuff is. When companies decide who they want to partner with, they'll look at who are the customers? Do I respect them?

We want to be focused on the enterprise. We believe that the needs of enterprise consumer brands are pretty distinct or higher scale. They have really strict regulatory requirements that smaller companies don't have. That produces a platform where we have a lot of enterprise features around protecting personally identifiable information, compliance, things that are an important category of enterprise software that I think will set us apart.

We also have a really great business model. We call it outcome-based pricing. Our customers only pay us when we fully resolve the issue. It means that they are only paying us when we're saving them money. It will be competitive and execution really matters. The company hasn't even existed for 11 months and we've got live paying customers.

Very few people remember AltaVista, but those of us at Google at the time do. Very few people remember Buy.com; they remember Amazon. We're aware that in these periods of technology innovation, execution matters a lot.

Q: Just to make sure I understand, if I'm a customer and I go to a human, then that company doesn't have to pay you because the agent did not resolve the issue.

Bret: That's right.

Q: You're a startup. You have no time to be distracted. But then you became chair of the OpenAI board. What was that like for you two then?

Bret: The reason why I agreed to join the board was a sense of the gravity and importance of OpenAI. I had this genuine fear that the OpenAI that had produced so much of the innovation that inspired Clay and I to quit our jobs might cease to exist in its current form. I was in a unique position to help facilitate an outcome where OpenAI could be preserved, and I felt a sense of obligation to do it.

When I talked to Clay, the conversation was like, is this going to take too much time? Is it going to be a distraction? Both of us were like, OpenAI is really important. You're not going to sit around 10 years from now and say, was it a bad use of time to help preserve the mission of ensuring Artificial General Intelligence benefits all of humanity? I've served on public boards before, including some high-profile ones. I've been pretty good at time management and work a lot. We've been able to manage it pretty well. At the end of the day, we're technologists.

It's funny. Now people ask, is it competitive? It's like asking, is the internet a market? I don't think it is. If I have to articulate the AI market, there's infrastructure, there's foundation model providers, there's tools, and then there's solutions. We're a solution. We're in a different part of the supply chain of AI.

Clay: As we do with everything, we talked it through. And I really felt, and I think Bret felt, that there was an element of civic duty. It's fair to say that Bret was in a literally unique position to make a difference, given his experience, given his great mind, and perhaps most importantly, given his values and judgment. For the impact on Sierra, Bret has done a remarkable job balancing everything and I'm really proud to, from a step removed, be a part of preserving this really important organization.

Q: I think every company in crisis now is going to call you to be on their board.

Bret: I'm trying to figure out the reputation I have now. Am I like Harvey Keitel from Pulp Fiction or something? I don't know.

Q: Is the drama over, by the way? I know there's an ongoing investigation.

Bret: Nothing to share at this point. But over the coming months, we'll be super transparent about all of that.

Q: Speaking of AGI, I know you're not developing it. But has this experience of trying to meld all these different models, and fine-tune them to build something more intelligent, made you think about the path to AGI any differently?

Bret: I'm not an expert in AGI so take this as a slightly outside, slightly inside perspective. I do think that composing different models to produce something greater is a really interesting technique. If you have a model that's wrong 10% of the time and right 90% of the time, and another model that can detect when it's wrong with the same level of accuracy, you can compose them and make something that's right 99% of the time. It's also slower and more expensive, though; you end up with a pipeline of intelligence. There's both time and cost limits to it. But it's really interesting architecturally.
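
The 99% figure follows from simple arithmetic under some assumptions that are not spelled out in the interview: the detector catches 90% of the first model's errors, its misses are independent of the first model's mistakes, and any flagged answer is withheld or retried rather than sent. A short Python sketch of that calculation (the function name and parameters are illustrative):

```python
def undetected_error_rate(model_error_rate: float, detector_catch_rate: float) -> float:
    """Fraction of all answers that are wrong AND slip past the detector.

    Assumes detector misses are independent of the model's errors.
    """
    return model_error_rate * (1.0 - detector_catch_rate)


# Bret's numbers: the model is wrong 10% of the time; the detector catches 90% of errors.
slipped_through = undetected_error_rate(0.10, 0.90)  # about 0.01
composed_reliability = 1.0 - slipped_through         # about 0.99
```

Only the errors the detector misses reach the customer, so 10% of 10%, or about 1%, slip through. The cost Bret mentions shows up as the extra model call on every answer, plus retries whenever the detector flags one.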

The biggest trend change that Clay and I have talked about is, I think three years ago (ancient history), AI was sort of the domain of machine learning. You meet a data scientist, their workflows are very different than engineers. It's like notebooks and lots of data. Source control is optional. It's very different culturally than traditional software engineering. Now, particularly with agent-oriented models, you can use models off the shelf, you can wire them together, and AI has moved to the domain of engineering.

You use it almost like you think of spinning up a database or something like, oh, yeah, we'll use this model for that and use this model for this. I'm not sure of its impact on AGI, which has a lot of connotation, but certainly as it relates to building an intelligence into all the products we use on a daily basis, I think it's been democratized.

LLMs just enable transfer learning. Essentially, when you train on all of human knowledge, it's very easy to get it to do something smart at the tail end of that, kind of reductively. As a consequence, that's so interesting, because now just everyday full-stack developers can incorporate next-generation intelligence into their product. You used to have to be Google.

Now it's like, everyday programmers have these at their disposal. And I still think we haven't seen the end of that. The first generation of iPhone apps were like a flashlight. I think the early AI applications were sort of thin wrappers on top of ChatGPT. We haven't gotten yet to the WhatsApps and the Ubers.

Q: I also wonder if there's also an element of the early internet here, where there's an infrastructure bottleneck. You can't use a frontier model for every part of this. It's too slow, too expensive. So, do you try to make your software efficient for today's models, or make it a little inefficient in anticipation of the infrastructure layer improving?

Bret: Our approach internally with research is to use overpowered models to prove out a concept and then specialize afterwards. And I really think that style of development is great. It's like vertical integration: you can get it working, prove it out, and then say, "Okay, can we build specialized models?" There's been a lot of research. Microsoft had, I can't remember the name of the research paper, but there's been a ton of research of using very large parameter models to make lower parameter calls that are really effective.

Q: Textbooks Are All You Need.

Bret: That was the paper. This area is fascinating. One of the things we've talked about is Sierra was the name of a game software company in the '90s that both of us played. I remember hearing stories of the game developers in the '90s, where they'd make a game for a computer that didn't exist yet. Moore's Law was at such a blistering pace at that point that making a game for the current generation just didn't make sense; you'd make it for the next one.

When we think about Sierra, we think about two forms of this, which is one you can build with lower parameter, cheaper models that make it faster and cheaper. Similarly, even the current generation of models will be cheaper and faster a year from now, even if you did nothing. So there's this interesting thing as you're building a business and you're thinking about your gross margins, which is talking about the present will be the past so quickly, it's almost incorrect. You really actually should be thinking about Moore's Law the way a '90s game developer thought about the PC.

It makes it very hard to form a business plan, by the way, because you almost have to bet on the outcome, but you don't have all the information. We know a multimodal model that supports x is going to exist by the end of this year, with like a 90% likelihood. What decision do you make as a technologist at this point to optimize for that? It's fun, but it's chaotic.

Q: It sounds like getting that exactly right might be the thing that makes you win.

Clay: Being able to read the trend lines and how quickly these new capabilities will come from being just over the horizon, to on the horizon, to available and usable for building new products with, that's part of the art here. We both often fall asleep reading research papers at night. So we're up to speed on the latest. Our hope is that we can read those research papers and hire the PhDs so that our customers don't have to, and we can enable every one of them to build this AI agent version of themselves.

Q: You've said that this will put some call center workers out of business, but it will also create new jobs. I agree, but do you have any ideas of what those new jobs will be?

Bret: One of our design partners, the customer experience team, they're in the operations part of the customer service team. They were doing quality assurance on the agent, including both before and after launch, reporting issues with live conversations. They refer to themselves now as the AI architects and their main job is actually shaping and changing the behavior of the AI. We've embraced that.

With our new customers, we talk about how you need to have some people adopt this AI architect role. The exciting part for me is what is the webmaster of AI? Not the computer science person who's making the hardcore HTTP server, but the person whose actual job it is to help a company get their stuff up and running, and maintain it.

We love this idea of an AI architect, but I think it requires technology companies to create tools that are accessible to people who are not technologists so that they can be a part of this. I actually was really inspired by the role of Salesforce administrator. It would surprise me if it weren't one of the top 10 jobs on Indeed still to this day. And the role of a Salesforce administrator is a low code, no code job to set up Salesforce for people.

If you talk to Salesforce administrators, 99% of them made a mid-career transition to that role. Everything from manicurists to accidental admin, like your boss says, "Hey, we have the Salesforce thing, you mind maintaining it?" Ten years later, they have a higher salary and they're part of this ecosystem.

It's important as technology companies, we're creating those opportunities to have on-ramps for people from operational roles around service to benefit from the rising tide of all the investment in this space. It will be disruptive, though. I don't know the history of the automated teller machine very well. I imagine there was a point where it was disruptive. And it's very easy to say now that bank employees didn't go down. What about the week you put it in? Was that moment disruptive? It probably was.

We shouldn't be insensitive to the fact that when you start answering 70% of conversations with an AI, there's probably a person on the other side that's getting less traffic. That's something we need to be accountable to and sensitive to. But the average tenure of a contact center agent is way less than two years. It's not a career people seek out. It's not necessarily the most pleasant work. If you see in a call center, people have eight chat windows open at the same time, with a requirement of how many conversations they can have per hour. It's a challenging job.

So I'm hopeful that the jobs that come out of this will be better and more fulfilling. But the transition could be awkward, and that's something we need to be sensitive to and it's something Clay and I talk a ton about.
