On September 12, 2023, Cohere presented a webinar with Stanford Professor Erik Brynjolfsson and MIT Principal Research Scientist Andrew McAfee about the impact of Generative AI on workforce productivity.
The full recording is available here:
Erik and Andrew are two of the most renowned academics in this field, and have conducted research over decades into the economic impact of new technologies. On September 12, they spent an hour with us discussing their research findings, what they’re learning from multiple conversations with C-level executives at the Fortune 500 companies they consult for, and where they believe Generative AI is poised to have impact soonest.
Cohere helps companies build state-of-the-art enterprise knowledge assistants. If you’re interested in accelerating productivity at your company, and ready to resource such a project, please get in touch with us at https://cohere.com/contact-sales.
Read the transcript here
Neil Shepherd (00:02:09):
Alright, let's make a start. It gives me great pleasure to introduce everyone today. I'm delighted to have so many people with us, and thank you for attending. We're talking here today about generative AI and its impact on workforce productivity. Generative AI is literally one of the most exciting and transformative technologies that we've seen emerge in the last 15 years, probably since the advent of the smartphone. A couple of challenges for us here: one is the technology's ubiquity. This is what economists call a transformative technology, a general purpose technology like electricity or many other things that we take for granted over time because they become so ubiquitous. And so with such an influential GPT, and apologies for the acronym obviously overlapping with general transformers, the question isn't where we should deploy this in our enterprise, but how quickly, to what extent, and in what sequence. So everything comes down, as usual, to the potential value at stake. How much value is available for the different use cases and tasks that we want to automate? How much risk is associated with them? We're very lucky to have with us two of the foremost experts from academia, Erik Brynjolfsson and Andrew McAfee. I'll introduce them now and let them do the speaking, because they're the ones who have conducted a lot of research on this and understand it very, very well. So first up, I'd like to introduce Erik Brynjolfsson. He's a co-founder of Workhelix. He's a professor at Stanford and the director of the Stanford Digital Economy Lab at the Institute for Human-Centered AI. He's one of the most widely cited researchers studying the digital economy, and the author of over a hundred academic articles, five patents and five books.
His research focuses on the effect of digital technologies on the economy. We also have with us Andrew McAfee, who is the co-director of the Initiative on the Digital Economy and a principal research scientist at the MIT Sloan School of Management. He is the author or co-author of more than a hundred articles, case studies and other materials for students and teachers of technology, including his latest book, The Geek Way, which will be out in November. He's also a co-founder of Workhelix. Both of our guests have collaborated for many years and co-authored multiple works together, including The Second Machine Age; Machine, Platform, Crowd; and others. And most recently, they've co-founded a company called Workhelix, which uses their research and a vast data set that they've created to assess a company's generative AI opportunity and create a tailored roadmap that actually gets it. This data is also the basis for some of what we're going to share today. So without further ado, I welcome Erik and Andrew. I'll pass the floor over to you, and I'm looking forward to a great conversation.
Andy McAfee (00:04:57):
Neil, thanks very much. Hi everybody. I'm Andy McAfee, and I have the great pleasure today of getting to talk with my friend, colleague, co-author, co-founder, all-around alpha geek and smart guy, Erik. Erik and I have been, as Neil says, working and talking together for solidly more than a decade. And here's the amazing part: I'm not tired of it yet. One of the main reasons I'm not tired of it is that I always learn from Erik, but I also always get a chance to trip him up or troll him when we talk, and I'm not going to pass that opportunity up here. A little while back, Erik left the East Coast and went west in search of wildfires and earthquakes in California. And so we're now doing this across the country, which is a particular thrill for me. So Erik, do you have anything you want to say before the grilling begins? I've got a bunch of questions for you.
Erik Brynjolfsson (00:05:56):
Okay, I'll be ready for them. I'm sure you'll have some new ones for me, but just remember I can give as well as I can get. So it'll be fun in both directions.
Andy McAfee (00:06:02):
I have ample experience with that. Alright, Erik, I want to start by referring to this terminology that Neil used. You economists are generally kind of a sober, level-headed lot. Your job is to understand the economy, and your branding, the branding for your own discipline, is the dismal science. But there's one thing that I've found that you all are not dismal about, that you kind of get giddy about. You talk about it in these almost mystical terms sometimes, and Neil identified it and abbreviated it. You economists seem to go crazy for these things called general purpose technologies, or GPTs. So can you start off by doing two things for us? Number one, define what these things are, what a general purpose technology is. And then can you explain why you and your colleagues are so over the moon about these things?
Erik Brynjolfsson (00:06:58):
Yeah, well, first off, we're kind of pissed that we lost the acronym. We were using it for a couple of decades, and now nobody thinks of it as a general purpose technology. But it is the one thing that economists do get excited about because, as you and I wrote about in The Second Machine Age, for most of human history not much changed for the average person. People were basically living a little above subsistence level. There were kings and queens and empires and pandemics and whatever, but living standards basically were the same until the first really important GPT came along. The first really important general purpose technology was, of course, the steam engine that ignited the industrial revolution. And since then, we're about 30 to 40 times wealthier than our ancestors were back then. But that wasn't the only one. Neil mentioned electricity, which was another very important general purpose technology. And today I think that generative AI, or AI more broadly, is a general purpose technology, possibly even more important than the earlier ones. And given the importance to living standards and how people live, that's a very big deal.
Andy McAfee (00:08:08):
I want to make sure I understand this. Are you saying that there's been this small handful of technologies that have lifted humanity out of the muck that they were dwelling in for these? I mean, is that about right?
Erik Brynjolfsson (00:08:21):
That is about right. And the history books are filled with all sorts of other stuff, but it really boils down to this handful of technologies, and they have three properties that set 'em apart from everything else.
Andy McAfee (00:08:32):
How do I recognize one when I see one?
Erik Brynjolfsson (00:08:34):
Yeah, so the criteria that Tim Bresnahan and Manuel Trajtenberg came up with are these. First off, they are ubiquitous: they affect almost all sectors of the economy and different tasks, they're not just in one little area. Secondly, they rapidly improve, and honestly nothing improves as rapidly as AI. So this makes the other ones look like pikers. And thirdly, and this is the most important one I think, they spawn complementary innovations, so that you can build things on top of them. So electricity is not just light bulbs, it's electric motors, refrigeration, air conditioning, lots of other things. And likewise, AI is catalyzing a whole set of other changes. By the way, they're not just physical technologies. Often the new technologies, in the way that economists use the term, can be new technologies for marketing or business process redesign, or new kinds of skills. So we use the term very broadly.
Andy McAfee (00:09:27):
Alright, and again, I want to make sure I've got this right. Historically, for all of human history, there is a small handful of these things. You economists, you don't run around promiscuously labeling everything as a GPT. It's a super high bar, right? So Erik, you said at least once that AI, and more specifically generative AI, deserves to be on that extraordinarily short list. Man, are you sure?
Erik Brynjolfsson (00:09:53):
Well, in fact, I wouldn't just put it on the list. I think we're going to look back and say it was at the top of the list, because it ticks all those boxes. It does affect almost every sector of the economy. The work that Daniel Rock did lays this out very clearly, all the tasks that are being affected. Maybe we'll talk about that later. No one's going to argue that it's not rapidly improving. And then there are these complementary innovations, and that's what Cohere and Workhelix are all seeking to document more. But the reason I think it's arguably the most G of all GPTs, the most general, is that it's going after intelligence. And what's more important than that? When I was visiting DeepMind in King's Cross in London a while back, they had this modest slogan on the wall that said, our goal is to solve intelligence and then use that to solve all the other problems in the world. And I was like, yeah, that's about right.
Andy McAfee (00:10:44):
Alright, so you believe and your career has been decades long. You've seen a few technologies come and go. This is not just recency bias. You think this one is probably at the top of the list.
Erik Brynjolfsson (00:10:59):
Think about it: what's more important than addressing intelligence? In our generation, with what we're doing right now, we're in the midst of doing that. The thing that worries me, though, is that most executives, most businesses, aren't taking it seriously enough. My technology friends, and I'm here at Stanford University in Silicon Valley, but really globally, are pushing the frontier on the technology like crazy. But most companies are way behind the curve and they aren't making the changes. There's a growing gap between what the technologies can do and what businesses are doing, and that's creating a lot of disruption. I think there are going to be companies going out of business as a result. A lot of occupations are in turmoil, and it's going to get a lot more disruptive in the next three to five years.
Andy McAfee (00:11:43):
You mentioned a guy named Daniel Rock, who you and I both know because you were his dissertation advisor and he is also a co-founder of Workhelix with us. You mentioned this fantastic piece of work that he did, based off work that you and he did a few years back, that tried to get at how G the GPT of generative AI is. Can you rattle off some of the findings from that work? We might get to Daniel later, but hit the high notes.
Erik Brynjolfsson (00:12:14):
Yeah, first off, the title was cool: "GPTs are GPTs." You guys can Google it. General pre-trained transformers are general purpose technologies. What they did was go through and use a methodology that he and I had initially developed back when we were both at MIT (he's at Wharton now and I'm at Stanford). You take any occupation and break it down into basic tasks. Take a radiologist. Everybody talks about radiologists reading medical images. It's true, they do that. They actually do 26 other tasks: caring for patients, sedation, coordinating care. You can go through all the other occupations, bus drivers, economists; they all do multiple different tasks. You can look at each individual task and evaluate whether or not generative AI could help with that particular task. Summarizing a memo, absolutely. Lifting a box onto a truck, not so much. And as you go through each of those different tasks, you end up getting a picture of not just what that occupation does. Once you've broken down the occupation into individual tasks, the cool thing is you can roll them back up to the level of a whole firm. And in that paper they go through all of the different tasks. And a really important finding is...
Andy McAfee (00:13:26):
Erik, how do you go through every task in an economy? That's like a big number.
Erik Brynjolfsson (00:13:31):
There's about 18,000 of them according to O*NET. So it is a big, big...
Andy McAfee (00:13:36):
And Daniel looked at every one of them with his own two eyeballs? What did he do?
Erik Brynjolfsson (00:13:40):
That's why we have smart people like Daniel and his team to go through it. Honestly, it is not something that I would recommend most people try to do on their own, but you can piggyback on that effort. They involved a small army of crowd workers to evaluate the tasks based on a set of rubrics. They said, here are the criteria you use to evaluate them, and then you have these thousands of people evaluate them. It also turns out you can, ironically, use GPT itself to evaluate them. And the spooky thing was that when you had the large language model do the evaluations, a lot of the answers came out very similar to what the humans gave. So it's sort of a little recursive effort there. But the end result was you get this picture of which occupations and which tasks are most affected. And one of the scary things for a lot of us was that it's coming after a lot of high-paid professional jobs that previously were immune. Andy, when you and I were writing our books, it was disproportionately lower-paid jobs that were being affected by automation. But what Daniel's work shows is that there are a lot of middle-skill and professional jobs, lawyers, doctors, marketing managers, teachers, that have big parts of their work affected by large language models and other kinds of generative AI.
Andy McAfee (00:14:57):
Do you remember about what percentage of the US workforce has, I don't know, at least 10% of all of their tasks amenable to today's generative AI?
Erik Brynjolfsson (00:15:08):
I'm scared now because Daniel's on the thing. I think I remember it was around 80%, but he can pop in and correct me as the student becomes the professor. But it was definitely a large majority of occupations that have at least some tasks that are being affected and are likely to be disrupted, and a pretty significant minority that had a majority of their tasks being affected. And to be clear, that's just with the current technology, GPT-4. We all know that there are bigger and better models coming out later this year and early next year. So the disruption fun has just begun.
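Editor's sketch: the task-based approach described above (break each occupation into tasks, label each task, then roll the labels back up to a workforce or a firm) can be illustrated in a few lines of Python. This is a minimal sketch under stated assumptions, not the method or data from the paper: the occupations, task lists, exposure labels, and headcounts are all invented for illustration, and the names `Task`, `exposure_share`, `share_of_workers_above`, and `firm_exposure` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    description: str
    exposed: bool  # hypothetical rubric label: could generative AI help with this task?


# Invented occupations and labels, for illustration only (not O*NET data).
OCCUPATIONS = {
    "radiologist": [
        Task("read medical images", True),
        Task("summarize findings in a report", True),
        Task("coordinate patient care", False),
        Task("administer sedation", False),
    ],
    "warehouse loader": [
        Task("lift boxes onto a truck", False),
        Task("summarize a shipping memo", True),
    ],
    "bus driver": [
        Task("drive the route", False),
        Task("help passengers board", False),
        Task("inspect the vehicle", False),
    ],
}


def exposure_share(tasks):
    """Fraction of an occupation's tasks labeled as exposed."""
    return sum(t.exposed for t in tasks) / len(tasks)


def share_of_workers_above(headcount, threshold):
    """Fraction of workers whose occupation has at least `threshold` of its tasks exposed."""
    total = sum(headcount.values())
    hit = sum(
        n for occ, n in headcount.items()
        if exposure_share(OCCUPATIONS[occ]) >= threshold
    )
    return hit / total


def firm_exposure(headcount):
    """Headcount-weighted average exposure: the roll-up from tasks to a whole firm."""
    total = sum(headcount.values())
    weighted = sum(n * exposure_share(OCCUPATIONS[occ]) for occ, n in headcount.items())
    return weighted / total


if __name__ == "__main__":
    # Hypothetical firm: number of employees in each occupation.
    firm = {"radiologist": 10, "warehouse loader": 30, "bus driver": 60}
    print(f"workers with >= 10% of tasks exposed: {share_of_workers_above(firm, 0.10):.0%}")
    print(f"headcount-weighted firm exposure: {firm_exposure(firm):.0%}")
```

Weighting every task equally and using a yes/no label are simplifications chosen to keep the sketch short. A real rubric would likely grade exposure on a finer scale and weight tasks by how much of the job they represent.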
Andy McAfee (00:15:48):
I want to talk about a piece of work that you did with a couple of colleagues recently that I learned a ton from, because it was a very different piece of research. You all went deep in one company, looking at one job and just a couple of tasks that were exposed to an LLM. And the thing I love about it is that everybody's been studying the impact of GPTs on coding, and it's pretty obvious that coders all over the world are very, very quickly adopting these technologies to help them with their work. Great. You went to a really different part of the company. You went to customer service, right?
Erik Brynjolfsson (00:16:26):
And I think the lessons we learned from that are widely applicable. So let me review them a bit. We looked at how call center and customer service reps were affected, and the important thing was that the companies were not trying to replace the workers but to augment them. And I think that's something we see more and more: using these tools to allow people to do their job more productively and expand their capabilities. In fact, in this case, there was about a 35% productivity improvement for the least skilled workers. So it was a big bump in just a matter of a few months. This is not one of these hypothetical things, where someday some big thing's going to happen. We saw this improvement in real time, within a few months. Interestingly, the more skilled workers did not get as big a bump. It was the less skilled workers who got the biggest bump. Also, the customers seemed to be happier. We looked at customer sentiment analysis. You can look at the words in these millions of transcripts: there are a lot more happy words and a lot fewer angry words, less typing in all caps in the messages.
Andy McAfee (00:17:24):
Literally, that was maybe my favorite part of the research: you had the idea to look at how often customers typed in all caps, and after the technology, that went down. That's how I know I'm angry, when I'm typing in all caps.
Erik Brynjolfsson (00:17:36):
No, absolutely. And we've all had those experiences. We get angry at the rep, and that happens too often, but it happens less often when the rep is getting help from an LLM and it's coaching them on the right answers to give. And the reps themselves seem to be happier. Who would've imagined that reps like it better when their customers are happy? And so there was less turnover, and they were sticking with the jobs. It was really one of these nice things where you get a win for the stockholders, a win for the clients, and a win for the workers, who weren't being squeezed. So the technology was having a very big effect very rapidly. And I think there's a lesson there for basically every company: if they roll these technologies out, they can augment people, not replace them, and have it lead to better satisfaction across the board.
Andy McAfee (00:18:22):
You've used the words augment and coach a couple of times in describing this piece of research. So was this technology, in this case, not getting people out of the loop? Was it not an automation technology?
Erik Brynjolfsson (00:18:37):
I should say, Andy, the CEO of the company said that he had read The Second Machine Age and was inspired by it, which was flattering, to try to find ways to augment people. And to be clear, it's not always the case. There are certainly places where you want to replace the workers, but I think it's underappreciated that you can augment the workers and allow them to do new things they couldn't do before, and handle more types of processes. And that's exactly what happened here.
Andy McAfee (00:19:02):
In this case, the technology just teed up possible things for the agent to say, and they could accept it or reject it or ignore it.
Erik Brynjolfsson (00:19:09):
Exactly. It was basically listening in on their conversations and it would say, hey, this might be a good time to mention this feature, or don't forget to upsell them, or maybe don't use the F word quite so much, basically coaching them along to give the right answers. And the human operators did not always agree. They didn't always go along with it, although in our data we found that the operators that went along with the LLM tended to do better on average. So maybe they should have been listening to it more often. But one of the reasons you want to keep a human in the loop is that we all know that these LLMs hallucinate at times. They make mistakes or confabulate, and they don't always have the answers. If there's not sufficient training data, they don't know what to say. So having a human in there can deal with that long tail of unusual cases a lot better.
Andy McAfee (00:19:57):
You and I have been around the world of technology intersecting with business and some of the holy grails of that intersection for a long time. And I think you're probably as tired as I am of the buzz phrase knowledge management. It's been this kind of shining star off in the distance that technology was actually going to let us harness and profit from all the knowledge of the organization. And to be super honest, it really has not happened. The story you're telling is kind of a story about successfully harnessing the knowledge of an organization and letting people, letting newer people, less experienced people have access to that knowledge on demand and make use of it. Am I painting too happy a picture?
Erik Brynjolfsson (00:20:42):
No, this is a real fundamental change. Some people call it software 2.0. Every organization has an enormous amount of tacit knowledge, things that have never been written down that we just kind of learn from our colleagues, from osmosis, from on the job experience. And that kind of knowledge has historically defied codification. I mean, how can you write code of something that you don't even know what it is? Machine learning has completely changed that. It is literally learning. The machines are literally learning. They observe these millions of transcripts and they figure out what are the right things to say at the right time. So this tacit knowledge now becomes codified through the machine learning system. It's a game changer because so much value in any corporation is in that tacit knowledge. And now finally we have a way to tap into it. But Andy, in fairness, you've been asking me, I have a bunch of questions for you.
Andy McAfee (00:21:37):
We'll get to them. I have a...
Erik Brynjolfsson (00:21:39):
You keep grilling me. Okay.
Andy McAfee (00:21:40):
I'm going to keep grilling you because it's just too much fun. I have a little question...
Erik Brynjolfsson (00:21:46):
I'll give you one or two more, and then I really want to go after you.
Andy McAfee (00:21:48):
Okay, fine. I got a little one and then a bigger one. The little one is about this paper, which, again, I love. I'm a fan of your work, or else I would not have been collaborating with you for a decade. I love this paper because you're describing this very lightweight intervention. It seemed like it wasn't crazy expensive. It didn't take a huge long time to set up, it happened fast, and the good things started happening kind of immediately, and they were substantial good things. I mean, is that about right?
Erik Brynjolfsson (00:22:18):
Yeah, that was something that was very unusual, because with a lot of other big enterprise systems, it can take years before you start getting the payoff. I've documented other cases, and I won't name some of the big companies, that roll out these $50 million enterprise projects that don't pay off for a long time, or sometimes at all. This was one where we saw the benefits within a few months. And by the way, I should thank my co-authors, Lindsey and Danielle at MIT, who did a lot of the hard work to document this. And people got up the learning curve. With call center operators, it's important to get up the learning curve quickly because there's a lot of turnover. A lot of 'em don't stay for a long time. So it's important you get this return quickly. I don't want to overgeneralize and say this is happening everywhere, but we do have a tool that can build on the existing infrastructure, roll out, and get benefits very quickly. And that's why I'm actually kind of optimistic about productivity in the coming years. We've seen some pretty dismal productivity the past decade, to my disappointment. But looking forward, I think we're going to have about twice as high a productivity growth rate, closer to 3% per year, rather than the 1.4% that we've seen in the past and, by the way, that the Congressional Budget Office is predicting.
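Editor's sketch: to make the gap between the two growth rates mentioned above concrete, the short Python example below compounds each rate over time. It is a back-of-the-envelope illustration only. The 1.4% and 3% figures come from the conversation; the ten-year horizon and the function names `cumulative_growth` and `doubling_time` are assumptions chosen for illustration.

```python
import math


def cumulative_growth(annual_rate, years):
    """Total growth after compounding `annual_rate` for `years` years (0.10 means +10%)."""
    return (1 + annual_rate) ** years - 1


def doubling_time(annual_rate):
    """Years needed for the level to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_rate)


if __name__ == "__main__":
    for rate in (0.014, 0.03):
        print(
            f"{rate:.1%} per year: "
            f"{cumulative_growth(rate, 10):.1%} after 10 years, "
            f"doubles in about {doubling_time(rate):.0f} years"
        )
```

Because growth compounds, doubling the annual rate more than doubles the cumulative gain over a decade, which is part of why the prediction is a bold one.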
Andy McAfee (00:23:36):
And everybody, for an economist to predict a doubling of productivity growth, which is probably the thing they care about the most in their lives, is not making a cautious prediction. Alright, last question, and then you can turn the tables if you want. We've talked about augmentation, we've talked about coaching, but there is going to be automation happening here. Do we need to worry about the job apocalypse, and is technological unemployment finally going to hit the economy in a big way?
Erik Brynjolfsson (00:24:08):
There is such an overemphasis on that that it is frustrating, and you and I keep reminding people that it hasn't happened for 200 years. I don't think this time is fundamentally different. There certainly will be places where there are job losses, but technology has always been destroying jobs, and it's always been creating jobs. And so the real issue is the turnover and the dynamism, and companies that are prepared are going to gain jobs. I think that the best thing you could do to prevent job loss in your organization is to invest in this technology, augment your workers, and make sure one of your competitors isn't putting you out of business. But I don't foresee any mass unemployment. I mean, as you know, we're close to record low unemployment right now, and with an aging workforce we may have a worker shortage more than a job shortage. So of the various concerns, that's not high on my list. Okay, so let me ask you a couple of questions, because you're at a B-school and you have this new book, The Geek Way, as we heard, coming out about how companies can win. So now there's this new amazing technology. Some technologies level the playing field, and some are differentiators that separate the winners from the losers. Which category would you put this in, or can you decide yet?
Andy McAfee (00:25:26):
I have a strong opinion and a prediction, and it's a little bit counterintuitive. Hush, you. It's a little bit counterintuitive because you've given examples of light lifts that lead to big improvements. And I believe those examples: coders all over the world are already using these technologies, and I think customer service departments in lots and lots of companies are going to go through a similar process to augment their people and share the knowledge of the organization. So you see all that and you think, oh man, this is a rising tide that floats all boats. To some extent, yes, but I believe its bigger effect, the bigger effect of generative AI, is going to be to sharpen the differences between the companies that are good at technology and the companies that are not good at technology. And maybe we'll have time to talk about what that phrase, good at technology, means. But Erik, at the same time that technology has been pervading the economy, it has been getting cheaper per unit. Think about what a unit of compute or a unit of software or a unit of bandwidth costs compared to where it was 10 or 20 years ago. While this has been happening, while the costs have been plummeting and companies all over the economy have been investing like crazy in this stuff, Erik, have the competitive differences among firms and industries been getting bigger or smaller?
Erik Brynjolfsson (00:26:48):
No, this was actually our first conversation, when I was visiting Harvard Business School, where you came into my office and we worked out on the blackboard how there was this growing gap between the leaders and the laggards even back then. And it's only increased now. So what distinguishes these superstars from everyone else? Do they have something in common?
Andy McAfee (00:27:08):
You and I have a team of really great colleagues who have been doing research on what they call superstar firms. And they find that not just in high tech industries and not just in the United States, but very broadly throughout the richest countries in the world, in industry after industry, there is a small group of superstar firms that get it and that are pulling away from the pack. And the question is, what are they doing that's so different? And the reason I think that's an incredibly important question is that I have a book about it coming out later this year. Yeah, amazing. Where I tried to dive in on that question and it's at least a book's worth of answer, probably a library's worth. But I want to concentrate on one thing that I observed over and over and over, and it's a striking difference between, I'm going to use a different phrase, I'm going to talk about geek firms that just run themselves differently than the companies of the industrial era. I'm not sure all superstars are geeks and all geek firms, not everybody that's full of geeks is a superstar firm. But man on that Venn diagram, there's a lot of overlap between those two circles.
Erik Brynjolfsson (00:28:15):
And we should be clear, coming from MIT, we both consider geek a high compliment. It's not like high school, where people didn't want to be geeks. We really love geeks.
Andy McAfee (00:28:24):
That's why you and I are much, much happier in our lives than we were in high school, because that bit has flipped, right? So, Erik, geek is a term of praise and admiration in this case. And I wanted to understand what the geeks do that's so different. We've all heard the phrase MVP, the minimum viable product, and that's great. At Workhelix we follow that kind of lean startup philosophy. We're trying to understand what the customer wants and build what they want. Yes, yes, yes. There's another MVP going on: it's minimum viable planning. And what I mean by that is that just willy-nilly starting to do stuff without a plan is a really bad idea, in generative AI and just about everything else. You need to get the team together and think about the opportunities and scope things out, and not just blindly do what was on the cover of InformationWeek that week. That was back when InformationWeek was a physical magazine. So there's a minimum viable plan that needs to happen. And Erik, as you well know, the mission of Workhelix is to help enterprises, to help companies, generate that plan. How do we think about prioritizing the opportunities that are out there? That minimum viable planning is critical; otherwise you're just randomly chasing things that sound attractive.
Erik Brynjolfsson (00:29:41):
So I just want to be clear here, I mean it's not like you're born a geek firm and you can never change or not. There's a way that you can transform an ordinary run of the mill firm into one of these geek firms and hopefully superstar firms.
Andy McAfee (00:29:55):
I think this is an absolutely huge opportunity, because the more I looked around, the more I became convinced that you don't need VC financing, you don't need 50% computer science PhDs, you don't need a Menlo Park headquarters. These are relatively simple practices that get you a long, long way. Planning less is, conceptually, a very, very simple thing to do. It's just hard for a lot of companies because we are subject to what Danny Kahneman called the planning fallacy. We love to plan. We sit around and analyze and think we're getting it right. And I'll say this one more time, because our company does this for a living: you have to do some planning, but some is the operative word there, and it can be surprising how much benefit you get quickly from the planning. And then it's time to start doing. And the geeks love...
Erik Brynjolfsson (00:30:42):
And speed is of the essence. This is the frustrating thing I see. As I said earlier, there's this amazing technological opportunity, but so many companies are like deer in the headlights. They're not adjusting to this, and they need a plan to take action. Every board that I've talked to is going to their CEOs, going to their senior executives, to say, what is our plan? How are you going to address this opportunity? And that's the right thing for them to be asking, because as Daniel's research and others' shows, almost every occupation is being transformed. And if they're not planning, they're going to be left behind. So yeah, go ahead.
Andy McAfee (00:31:20):
And once you've got that planning nailed and the team kind of understands what they're going to do, then it's time to start doing, particularly with a technology that is this new, that is changing this quickly, and that frankly is weird. Generative AI is weird, man. The only way you're going to get experience with it and figure out how it actually works and how to make it work for your circumstances is by trying stuff, and not giving up if your first attempt at prompt engineering doesn't work very well. I mean, we don't understand how this thing works. This is a crazy piece of technology. So you have to jump in and start doing stuff. Go back and revise, look at the plan, look at the state of the technology, course correct and orient. But the goal is to launch projects, start doing, start learning, and get your organization in generative AI shape, and that will pay massive dividends. We've talked about this knowledge management revolution. If you want to go seize that, sitting around and just planning for a long time and scoping out systems to the nth degree, I don't think, is going to get you there.
Erik Brynjolfsson (00:32:31):
Well, you talked about how some of these things are unexpected and one of the approaches that I've seen a lot of companies use with success, I really love it, is these hackathons where they'll ask the whole company to take a day off or a few days off and work with the technology. The amazing thing is that when people are playing with the LLMs or with the generative AI, the image generators, and you're asking 'em to use it in their applications, they come up with things they can do with it that the inventors of the technology didn't have in mind. In fact, one of the things you mentioned was all the success with coding. That was not something people intended the LLMs to be able to do. And right now it has been a
Andy McAfee (00:33:07):
While to even believe that. Is that true?
Erik Brynjolfsson (00:33:10):
It's true. It's true. It ingested a whole bunch of stuff off the web. It happened to see a bunch of Python code out there and it's like, oh, I get this. Now I'm a Python coder. But beyond that, these emergent properties, you're seeing that a company, somebody may be using it to do something in their law firm or their medical practice that no one at one of the leading vendors had anticipated. And that's what's so exciting about this, that there's so many downstream complementary innovations that are just waiting to be harvested.
Andy McAfee (00:33:43):
Amen. So I'll say it one more time. You need an MVP, you need a minimum viable plan. That's the starting point. Then go do stuff. I found out when I was researching The Geek Way that the Agile software movement is one of these very rare movements whose origin can be precisely traced. There was a group of 17 alpha geek coders who got together in the winter of 2001, 2002 in Utah, I believe it was at Snowbird, because they were so frustrated with how software was being written. It was this analysis-heavy, planning-heavy, document-all-the-requirements approach, and then give the binder to IT and they come back in 24 months and disappoint you. And it was just broken. This waterfall model was broken and these geeks got together with no other agenda than to try to come up with something better.
And you can go look at the Agile Manifesto right now and the principles behind the Agile Manifesto, and the top of the website says our highest priority is to satisfy the customer through early and continuous delivery of valuable software. And I mean, find the appropriate part of your body and have that phrase tattooed on it. Because once you get your MVP done, that's the way to go about it. Erik, you and I know this world is changing so quickly. The technology is improving so rapidly. You bring up that this GPT is pretty clearly improving faster than anything I think we human beings have seen before. That's how fast it is. You are not going to master this without trying to do that early and continuous delivery approach.
Erik Brynjolfsson (00:35:27):
So Andy, we have a bunch of questions here from the participants. Should we take some of those questions? We
Andy McAfee (00:35:34):
Didn't put everybody to sleep. Are you and I
Erik Brynjolfsson (00:35:36):
Judging by going toe to toe? Neil, you're the host here. How would you like to go next?
Neil Shepherd (00:35:43):
Yeah, totally. I'd love to get a few questions going for the two of you and do a little bit of a spiel. However, I'll do a little sort of recap before we do. I'd love to spend a little bit of time also just telling the audience what Cohere has found because we've been talking to an awful lot of customers and there are a few trends for a lot of people to be aware of. And then I will shift over to Q&A and then it's going to be over to you two again to answer some really hard questions. At least, I hope they're hard. That's the whole point. I hope so too. So there we go. So anyway, thank you so much. Really, really appreciate you being here, but I'm going to give you a few other things going on. I'm going to share a few observations that we've seen here at Cohere as well during our time.
And this market is shifting just incredibly quickly. So as you're mentioning general purpose technology, we've seen the evolution of the inbound inquiries, and what we're finding is the conversations we're having with customers are shifting very heavily away from point solutions, be it summarization or doing better search or something like that. The overwhelming direction just now is towards what we would call knowledge assistants, where you have basically multiple models working together to answer questions in human readable text, extracting information from corporate data stores that is real data that you can then use. So no more of this sort of, Hey, ask me to make a poem in Shakespeare or something. That's wonderful and all; it's just not useful. But we're seeing really people want to get access to their data and put them into these models and get the information in human readable form. That's huge. It's really been a big shift in the last few months I would say.
Second observation is everybody's looking for a secure environment. We're talking about the crown jewels here of an enterprise's data. It's unstructured data that somebody else could read if it ever got into the wild. Nobody, almost nobody, I would say is willing to take that risk. The models have to go to where the corporate data is, not the other way around. And then the last is we're seeing, I would call it a few approaches, and this is really a sort of emerging advice I would say, of how people are getting tripped up with their proofs of concept. This is an environment where the POCs can definitely get stuck in what's known as POC purgatory. It's just never quite good enough, et cetera. And we've got a few observations of how we see things working better. The first is that people are working on evaluating generative models and trying them with a bunch of prompts, et cetera.
Firstly, the prompts work very differently across different models. The second is we also think that approach is a little wrong. Really start with getting the corporate data. Your ability to retrieve corporate data that you're then going to feed into the model is by far the biggest determinant of success. This is all about this stuff called RAG, or retrieval augmented generation, that doubtless we're hearing a lot about. But we believe that that is the first step and the most interesting for making it likely that there'll be fewer hallucinations, more reality, and you're going to get the value that you want out of this. And then the last is academic benchmarks that we've seen. They're not always relevant for what you want to do, for the accuracy you care about. You have to just make it work for your users, not refer too much to those academic benchmarks. That's it. That's my little spiel of advice. No more for me. I ain't got any more, but you all do. So I'll ask a few questions here. And my first one is around productivity gains. So the NBER study had productivity gains averaging 14%, but really ranging from not much to 35% for the less skilled novice workers. Right. Can you give some more examples of what you're seeing out there of the typical productivity gains that you might see in different tasks?
Erik Brynjolfsson (00:39:22):
Yeah, it ranges a lot. I mean, I think one of the ones that's been most documented, we've heard a lot about, is in coding where you could have 50 to a hundred percent. One where it's harder to measure the benefits precisely, but there's clearly a big gain, is in summarizing documents and then creating work. So for instance, in a medical application, a doctor goes up, sees a patient, there's a stack of notes from all the previous doctors and nurses that saw that person. They need to know what exactly is relevant to them as, say, a kidney doctor, maybe a small subset of that. LLMs are terrific for pulling out the part that's relevant to them. And then in turn, when they need to dictate the note, there's a few bullet points that they want to say. And the LLM can give a candidate, okay, here's what you usually say in this situation.
And the doctor can review that and sign off on it. In both directions, it can be a doubling or more of productivity. Similarly in legal applications. Another one that I was surprised to see was I was talking to a CEO who had to prepare for his board meeting and had to come up with a set of KPIs for the coming quarter and was kind of having a brain block, asked his team to help, and then he asked the system, he put in a bunch of information about his company and asked the LLM to help him with that. It came up with a terrific set. They can actually be remarkably creative and look at these big questions, the things that Andy and I thought were not going to be suitable for AI anytime soon, but now we're seeing them happening. And that was a huge productivity gain as well.
Andy McAfee (00:40:56):
And Erik is bringing up something fundamental here. The gains that we're seeing are, A, large, and B, they're in very, very different categories. He mentions document summary, he mentions the coding benefits. There was a story in the New York Times a while back about how powerful generative AI is for doctors. And immediately you think, oh, but it shouldn't be giving diagnoses for patients. That's too risky. Maybe it is right now. This is not the benefit that the Times was talking about. It was talking about just transcribing and summarizing patient notes because apparently for a lot of doctors, that is two hours of work every day, and I'm very sure that it's among their least favorite two hours of work. The Times quoted one doctor who said, look, I retired because I couldn't type fast enough. This is an astonishing waste of resources. This is astonishing. And the Times, again, their job is not to hype up generative AI.
They said for some of the doctors studied that two hours went down to 15 minutes because it turns out that the technology was really good at turning speech into a transcript and then summarizing it, following the form of a patient note. So the doctor reviewed that and you think, well, wait a minute. Maybe the technology didn't do a very good job. Man, it turns out we humans do a terrible job of that. The research about how much physicians miss when they go down and try to write their patient notes at the end of the day, it's terrifying. We should not be asking people to try to do this. So I see benefits like that, and even though those are the most prosaic kinds of benefits, you're just transcribing and summarizing speech, man, that's a 600% productivity improvement for a physician on that particular task. Take that to the bank. That is a big deal. Now over on the weird side, Erik, I think you saw this paper too. Some team of people had the idea to see if an LLM would be a good HVAC controller. Did you see this?
Erik Brynjolfsson (00:42:55):
Andy McAfee (00:42:56):
They hooked this system, hopefully not up to anything that we care about, but to some building and had the LLM control the HVAC in the building, and it apparently did a pretty good job of that. So again, man, we are just starting to understand this toolkit and put it to work. We're going to get those prosaic benefits. We're going to get some science fiction, holy cow, weird benefits. And to underscore how important this moment is, I went back and re-watched Ex Machina just a couple of weeks ago, and that's a pretty cool movie. A lot of us nerds have seen it. It came out in I believe, 2015 or 2016. And the whole premise of the movie is that this astonishing technology had appeared and they identified this one poor geek to fly off into the middle of nowhere and talk to this newly sentient AI. Here we are. This is what this is. On the order of seven years later, everyone with an internet connection can have a more deep nuanced conversation than that guy in the fictional film was having with the AI he was interacting with. Science fiction is coming at us very, very quickly.
Neil Shepherd (00:44:05):
Wow, thank you. By the way, I have to wonder if that HVAC machine was actually managing the cooling for all the GPUs it was using to do that.
Andy McAfee (00:44:13):
We don't know if there was a net benefit or not, but it was a cool demo.
Erik Brynjolfsson (00:44:18):
Excellent. So Neil, I mean we can spend the next hour giving you lots more examples of particular cases, but I want to stress there's a systematic way of evaluating these cases. Earlier Andy asked me, there are about 18,000 distinct tasks that we've evaluated. I think Daniel Rock is on the line here. I think we should grill him and have him explain a little bit about the methodology for systematically going through not just a few different cases that we've each encountered, but systematically evaluating 18,000 tasks in such a way that you can prioritize which ones are more likely to be benefiting from these tools. Oh, there he is. Daniel, you want to explain how you do that?
Andy McAfee (00:44:56):
And for all of you listening in, professors never get tired of cold calling people. It's just one of the deep joys
Erik Brynjolfsson (00:45:02):
That we have. Go ahead, Daniel.
Daniel Rock (00:45:07):
Oh boy. Here we go. I get cold called by Erik yet again on never end
Andy McAfee (00:45:14):
Flashbacks. Is this triggering to you?
Daniel Rock (00:45:18):
Yeah, but now I do it to my students as well. So passing it on anyway. Yeah, so we evaluate things in a very simple way. We ask a super straightforward question and we evaluate that question with both human judgment and GPT-4. And it's kind of funny how much they agree. So that question is, could you double someone's productivity in a given task with no measurable drop in quality? And we look at all 20,000 O*NET tasks that the government says people do at work. And the answer can be one of three things. Either the answer is no, you can't; yes, you can with just large language models; or yes, but it kind of depends. And that "depends" is, do we need to build other systems, particularly software systems, around the generative AI technology? Now what that lets you do is test like, okay, we know things are improving quickly.
And as Erik said, it's certainly pervasive. And then that last bit, the difference between the yes and the yes-but answers, that tells us how much complementary innovation it's going to take to really unlock the gains from LLMs. And it turns out that's quite a lot. And Neil was alluding to some of the critical challenges there. You have to get your data in order. I mean imagine, I think there's one stat out there: 80% of the world's corporate data is actually in this unstructured text or image or audio format, and we're unlocking all of that with generative AI. And it's really a new type of software. You can think of the lossy compressions that you're doing to use generative AI as being a new way to relate different data points that we've not been able to use for decision-making or in studies or in other kinds of software. So to the question, and I think someone asked it in the Q&A, why does this take so long? Well, it's a totally new thing and we have to come up with good ways to do it. Best thing you can work on right now is getting all of your ducks in a row, making that data available, getting the talent you need and so on, setting up those complements so that you can innovate quickly.
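The three-way rubric Daniel describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the label names, the `Task` type, the `exposure_shares` helper, and the sample tasks are all hypothetical, and the labels themselves would come from human raters or a model, which this sketch does not attempt to reproduce.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Exposure(Enum):
    """Hypothetical names for the three answers in the rubric."""
    NO = "no"                      # productivity cannot be doubled without a quality drop
    YES_LLM_ONLY = "yes"           # an LLM alone could double productivity
    YES_WITH_SYSTEMS = "yes_but"   # needs extra software built around the LLM


@dataclass
class Task:
    description: str
    label: Exposure


def exposure_shares(tasks):
    """Return the fraction of tasks that received each label."""
    if not tasks:
        raise ValueError("need at least one labeled task")
    counts = Counter(task.label for task in tasks)
    return {label: counts[label] / len(tasks) for label in Exposure}


# Made-up example tasks and labels, for illustration only.
EXAMPLE_TASKS = [
    Task("Draft a summary of a customer call", Exposure.YES_LLM_ONLY),
    Task("Answer questions from internal policy documents", Exposure.YES_WITH_SYSTEMS),
    Task("Reconcile invoices against the ledger", Exposure.YES_WITH_SYSTEMS),
    Task("Repair a rooftop air conditioning unit", Exposure.NO),
]

shares = exposure_shares(EXAMPLE_TASKS)
# The "yes, but" share is the part that depends on complementary systems.
complementary_share = shares[Exposure.YES_WITH_SYSTEMS]
```

In this framing, the gap between the "yes" share and the "yes, but" share is a rough indicator of how much surrounding software and data work stands between the model and the productivity gain.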
Neil Shepherd (00:47:22):
Thank you. So I'm going to ask something provocative. Here's one of the questions from the audience and it's along the same lines. And for Erik and Daniel, do you have any thoughts on the NVIDIA study that speculates that gen AI will have more of an economic impact than electricity? It's quite provocative. I dunno if you agree.
Erik Brynjolfsson (00:47:44):
I think it depends how broadly you define electricity. Arguably NVIDIA is a product of electricity, and that's one of the cool things about these GPTs, that they have all these spin-off effects. One thing I can be confident of is it's happening a lot faster. In our book, we described how it took about 30 years for the payoff for electricity to come to America's factories between the 1880s and the 1910s, 1920s. That was how long it was before we saw a big gain. I already just described to you, we're already seeing massive gains right now. So it's a very compressed timescale. Ultimately, it's hard for me to imagine anything less impactful than intelligence. So I would have
Andy McAfee (00:48:28):
To More impactful. More impactful.
Erik Brynjolfsson (00:48:31):
Yeah, anything more impactful. Thank you, Andy. So yeah, I think broadly that's the right direction, but the bigger thing is how rapidly it's happening and how behind the curve a lot of companies are still today.
Andy McAfee (00:48:46):
Let me emphasize that a little bit because Erik's a fairly persuasive guy. And I think that when you look at we have this other form of intelligence out there that's going to augment ours, okay, that's not electricity, right? That's artificial intelligence. But one thing we know from that time period that Erik was talking about, from the end of the 19th century into about the teens or the twenties, this period when the American economy was electrifying, the data are quite clear that the companies on top at the beginning of that shift were not the ones on top at the end of that shift. And I taught for a long time at Harvard Business School with Clay Christensen, who is this wonderful guy and just a mentor of mine, and he popularized this idea of disruption. There's a whole lot of disruption coming. The companies that are underwhelmed by this or too hesitant or don't plan correctly and can't iterate fast enough, man, they're in trouble in this world that we're heading into.
Neil Shepherd (00:49:49):
That's good. Got another question about this. It's about reliability. So we see that generative AI can affect multiple industries, multiple tasks, et cetera, and it depends on how well it does it, how quickly it might spread. And the question I have here is that we've spent maybe centuries, arguably maybe less, fine-tuning things all the way down to Six Sigma. Will generative AI catch up to that level of reliability, that it's going to basically cover all industries and functions?
Andy McAfee (00:50:16):
Well, hold on. The number of things that are anywhere near a Six Sigma level of reliability is very, very, very low. Semiconductor fabs have Six Sigma going on with their yields. Civil aviation is a Six Sigma process. Our chances of winding up dead if we hop on an airplane are so vanishingly small that we don't need to worry about it. In my opinion, most things are not anywhere near Six Sigma. And the closer you look at how any company is run, the more amazed you are that anything gets out the door. Right? Man, we have a lot of processes that need to be improved. The opportunities are immense. They're not small, they're immense. And to think that we're anywhere near the threshold of how efficient or how productive we can be, I think it's a bad joke. And I'll give you one quick example of that.
SpaceX as a company is I believe a product of this century. It's 20 or 21 years old. In that time, in one generation, it has become the first company in the world to figure out how to make commercially viable reusable rockets. Nobody else in the history of the space race ever did that. SpaceX did it, and now they fly rockets and reuse them like crazy. They also are the only company that was able to deploy large numbers of rugged, high bandwidth, reliable internet terminals into a war zone after Russia invaded Ukraine. My question is, what have all the incumbents in the space industry been doing when this upstart shows up and just mops the floor with them? And I will tell you SpaceX are fanatic about this very geeky approach to tackling very, very big challenges. And space is a nice analogy to generative AI because you have to do some planning for space.
You can't just start building rockets however you want. You have to do your MVP, your minimum viable planning. After that, try stuff. The rockets are going to crash. If they're not crewed, that is not that big a deal. And SpaceX's kind of go-forward approach, this very, very geeky approach, is making the rest of the space industry look quite bad. So I don't think we're anywhere near the ceilings of what's possible. I think the geeks are going to show up in industry after industry and deliver crazy amounts of value to us, to all of us citizens and consumers, using gen AI and the rest of the toolkit that's available.
Erik Brynjolfsson (00:52:45):
Lemme say a little bit about how we might get to that path that Andy just described. For generative AI broadly and LLMs in particular, there's two paths, one of which I'm a little skeptical of and one of which I'm quite confident of. I mean, one is just making these LLMs better and better. There are scaling laws that as you get more data, more compute, more parameters, you can predictably improve the error rates quite a bit. And we have a few more orders of magnitude ahead of us in that, and that's good news and a lot of people are confident that we'll make some progress, but inherently these technologies are subject to some confabulation and some error. They're not designed the way other technologies are, where you can prove their output. That makes me more confident of a second approach that both you, Neil, and Daniel brought up, which is combining these with other kinds of tools.
You can have them call on databases, you can have 'em call on symbolic processors, you can have 'em call on calculators. And so even though an LLM may be very bad at arithmetic and you can maybe make it better as you get bigger and bigger, and now they could do three digit numbers, maybe four digit numbers, we all know that a much better way is to call on a calculator. And that I think is a kind of a path that will apply for a lot of different applications. And we're just in the process, I know your company is doing a lot of this, of connecting them to these other systems, and that's the way that I think we can get provably accurate answers. The LLMs are great for some kinds of applications. They can be very creative, they can be ingenious, they can figure out ways around problems that we hadn't seen before. But when it comes to something that's provably correct, we have another set of technologies that we can tap into.
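The "call on a calculator" idea can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: `calculate`, `answer`, and the `llm` callable are hypothetical names, not any vendor's API, and real tool-calling systems usually let the model itself decide when to invoke a tool rather than using a pattern match.

```python
import ast
import operator
import re

# Arithmetic operators the hypothetical calculator tool supports.
_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculate(expression):
    """Evaluate plain arithmetic exactly, without running arbitrary code."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
            return _OPERATORS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression.strip(), mode="eval"))


# Queries made only of digits, whitespace, and arithmetic symbols.
_ARITHMETIC = re.compile(r"[\d\s.+\-*/()]+")


def answer(query, llm):
    """Route arithmetic to the calculator; send everything else to the model.

    `llm` is any callable that takes a string and returns a string
    (a stand-in for a real model call).
    """
    if _ARITHMETIC.fullmatch(query):
        try:
            return str(calculate(query))
        except (SyntaxError, ValueError, ZeroDivisionError):
            pass  # not valid arithmetic after all; fall back to the model
    return llm(query)
```

With this sketch, `answer("1234 * 5678", llm)` is computed by the calculator rather than predicted by the model, while a request such as "Summarize this note" is passed through to `llm` unchanged.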
Andy McAfee (00:54:24):
I love that. And to think that the alpha geeks at Cohere and around the industry are not working on exactly the problems that Erik identified, that's to mistake where you are at a point in time for a broad trend. There's a broad trend going on here.
Neil Shepherd (00:54:43):
That's great. Fantastic. I have a question about employee retention and satisfaction. So there was a question that came in that talked about the call center study, an industry that suffers churn. We'd love to hear: was there a net positive impact on retention and satisfaction from that? And is that likely to apply to other jobs as well?
Erik Brynjolfsson (00:55:05):
This was one of the happiest things I saw. Sometimes you can squeeze productivity out of people by monitoring them and watching them very carefully and just making their lives miserable. That's not what happened here. In this case, what we saw was that the people, they basically broke it into two groups. Some people got access to the LLMs and some people didn't. The people who were working with the LLMs, they seemed to be significantly happier. And it wasn't just with the reporting. We saw there was less turnover, and call centers are rife with rapid turnover, but in this case, the turnover went down quite a bit. People stuck around longer. And it relates to the earlier point: we saw that the customer sentiment was better. And I've never been a call center operator, but I imagine that it's more fun working with happy customers than with angry customers.
So maybe those two were related. It also seemed to get them up the learning curve a lot faster. You could compare how long it took 'em to figure things out with and without the tool, and they just got better at the job. And again, I think people probably enjoy being competent at their job more than not being that good at it. So for all those reasons, we saw that the employees reported and acted like they were better off than before. How generalizable is this? I think it is actually quite generalizable because those underlying fundamentals that we saw in that case, they apply in most applications of these tools. So this is one of these things where it's not zero-sum, where you take some from one group and give it to the other group. This is making the pie bigger.
Andy McAfee (00:56:31):
Lemme say one thing about that. I think this is super important. Most people want to do their jobs well, right? This image of zombified workers just going through the motions, the data do not show that. Over and over, if we give people much more powerful tools for 'em to do their jobs, they will like their jobs better. This is a big benefit.
Neil Shepherd (00:56:52):
That's terrific. I got one more question and then we're going to wrap it up. It's quite a big question, so I'd just ask you to do your best to summarize as best you can. We can use an LLM if we want to cheat. That's okay.
Erik Brynjolfsson (00:57:03):
Oh, I've been doing it the whole time.
Neil Shepherd (00:57:05):
Oh great. So what's your opinion about what's holding back companies from getting on this journey? Is it concerns about data quality, skills, ethics, privacy, compliance? Is there a pattern here, or is it, no, get on with it?
Erik Brynjolfsson (00:57:21):
There's a lot of all those things. It's not the technology. The technology is quite capable. But I think a big part of it, honestly, is just knowing where to prioritize. People are overwhelmed with all these opportunities and they need a plan. That's why we started WorkHelix, to give them a plan and say, here's where you need to prioritize. There's some juicy low hanging fruit just waiting to be gone after, and maybe they have some intuitions, but if you could do that in a quantitative analytical way, I think it gives 'em the confidence to proceed.
Andy McAfee (00:57:49):
Yeah, right on. And then to proceed is important. Proceed means go do stuff. Erik and I were at the same party a few years ago where, by this weird series of events, we were sitting around at cocktail hour and Jeff Bezos was there, and Erik, you remember this. I'm like, I am not going to miss this opportunity, right? So we're making chitchat with Bezos and I said, Jeff, what is the most common mistake you see other people trying to run great big companies make? And he didn't hesitate. He said they become too risk averse. They just stop trying to do stuff. Their career incentives are wrong, something's not lined up. And they just become these kind of ossified status quo based organizations where they're not willing to go take a risk, fail at something, experiment, iterate, do those kinds of things. I thought it was an absolutely brilliant answer.
Erik Brynjolfsson (00:58:40):
James, you're talking to a lot of these executives. James is the CEO here at WorkHelix. What are you hearing that's holding them back and what are you hearing that unlocks that?
James Milin (00:58:50):
Unequivocally, the question we hear day in and day out is, look, we believe gen AI is an incredibly capable technology, but where should we get started? How do we prioritize the opportunities? And I think a lot of executives have seen in the past that a pioneering person across the organization will just start deploying it without a top-down view as to where this can benefit. So really what we like to do is help folks and say, here's exactly a quantitative analysis at the smallest unit of work, which is a task, and here's where the benefit is and here's where to start. And that's exactly why we love working with Cohere, where we can work together on some of these early use cases, get wins on the board for companies, and start the landslide into growth and productivity.
Neil Shepherd (00:59:33):
Bravo. Fantastic. Well you know what, we're at time. That was a lot of fun. I can't believe it went by so quickly. And I'm just going to finish up with a big thank you to everyone for attending. And a particular thank you to our panelists. So Erik, Andy, Daniel, James, really appreciate you all being here. So our final spiel here is obviously if you're looking for some information about how to understand what the journey might look like and where the opportunities are for your company and what a roadmap might look like, WorkHelix is obviously doing that. And please reach out to James on that. And for anything to do with LLMs, you should know by now who Cohere is. We work on secure, high performance LLMs that are highly customizable for enterprise use cases. Please come to us and you'll find us as well. Really appreciate everyone's time. You guys have all been fantastic. Hopefully we can do this again sometime. And in the meantime, have a wonderful week, everyone. Thank you.
Andy McAfee (01:00:27):
Neil, Cohere, thank you very much. We appreciate it.
Neil Shepherd (01:00:30):
Thank you so much.