FIR #466: Still Hallucinating After All These Years

Not only are AI chatbots still hallucinating; by some accounts, it’s getting worse. Moreover, despite abundant coverage of the tendency of LLMs to make stuff up, people are still not fact-checking, leading to some embarrassing consequences. Even the legal team from Anthropic (the company behind the Claude frontier LLM) got caught.
Also in this episode:
- Google has a new tool just for making AI videos with sound: what could possibly go wrong?
- Lack of strategic leadership and failure to communicate about AI’s ethical use are two findings from a new Global Alliance report
- People still matter. Some overly exuberant CEOs are walking back their AI-first proclamations
- Google AI Overviews lead to a dramatic reduction in click-throughs
- Google is teaching American adults how to be adults. Should they be finding your content?
In his tech report, Dan York looks at some services shutting down and others starting up.
Links from this episode:
- Veo 3 News Anchor Clips
- Google has a new tool just for making AI videos
- Chicago Sun-Times publishes made-up books and fake experts in AI debacle
- How an AI-generated summer reading list got published in major newspapers
- Anthropic’s lawyer was forced to apologize after Claude hallucinated a legal citation
- Chicago Sun-Times Faces Backlash After Promoting Fake Books In AI-Generated Summer Reading List
- Groundbreaking Report on AI in PR and Communication Management
- Comms failing to provide leadership for AI
- Perplexity Response to Query about Failure to Implement AI Strategically
- Google is Teaching American Adults How to Be Adults
- Google AI Overviews leads to dramatic reduction in clickthroughs for Mail Online
- Shocking 56% CTR drop: AI Overviews gut MailOnline’s search traffic
- Google AI Overviews decrease CTRs by 34.5%, per new study
- Company Regrets Replacing All Those Pesky Human Workers With AI, Just Wants Its Humans Back
- How Investors Feel About Corporate Actions and Causes
Links from Dan York’s Tech Report
- Skype shuts down for good on Monday: NPR
- Glitch is basically shutting down
- Investing in what moves the internet forward
- Bluesky: “We’re testing a new feature! Starting this week, select accounts can add a livestream link to sites like YouTube or Twitch, and their Bluesky profile will show they’re live now.”
- Bridgy Fed
- Fedi Forum
- Take It Down Act 2025 (USA)
- Mike Macgirvin
The next monthly, long-form episode of FIR will drop on Monday, June 23.
We host a Communicators Zoom Chat most Thursdays at 1 p.m. ET. To obtain the credentials needed to participate, contact Shel or Neville directly or request them in our Facebook group.
Special thanks to Jay Moonah for the opening and closing music.
You can find the stories from which Shel’s FIR content is selected at Shel’s Link Blog. Shel has started a metaverse-focused Flipboard magazine. You can catch up with both co-hosts on Neville’s blog and Shel’s blog.
Disclaimer: The opinions expressed in this podcast are Shel’s and Neville’s and do not reflect the views of their employers and/or clients.
Raw Transcript
Shel Holtz (00:01)
Hi everybody and welcome to episode number 466 of For Immediate Release. I’m Shel Holtz in Concord, California.
@nevillehobson (00:10)
and I’m Neville Hobson in the UK.
Shel Holtz (00:13)
And this is our monthly long form episode for May 2025. We have six reports to share with you. Five of them are directly related to the topic du jour of generative artificial intelligence. And we will get to those shortly. But first, Neville, why don’t you tell us what we talked about in our short form midweek episodes since, you know, my memory’s failing and I don’t remember.
@nevillehobson (00:44)
Yeah, some interesting topics. We’ve had a handful of short form episodes, 20 minutes more or less, since the last monthly, which we published on 28th of April. And I’ll start with that one because that takes us forward. That was an interesting one with a number of topics. The headline topic was cheaters never prosper, said, unless you pay for what you create.
And that was related to a university student who was expelled for developing an AI-driven tool to help applicants to software coding jobs cheat on the tests employers require them to take. And it had mixed views all around, with people thinking, hey, this is cool, and it’s not a big deal if people cheat, and others who abhor it as an abhorrent idea. I’m in that camp. I think it’s a dreadful idea that most people think it’s not a bad thing. It is. Cheating is not good. That’s my view.
There were a lot of other topics too in that as well. A handful of others that were really, really good. How communicators can use seven categories of AI agents and a few others worth a listen. That was 90 minutes, that one. That’s kind of hitting the target goal we had for the long form content. If it’s too long, hit the pause button and come back to it. Might apply to this episode too.
So that was 462 at the end of April. That was followed on May the 7th by 463, which talked about delivering value with generative AI’s endless right answers. This was a really quite intriguing one, quoting Google’s first chief decision scientist, who said that one of the biggest challenges of the gen AI age is leaders defining value for their organization.
And one of the considerations, she says, is a mindset shift in which there are endless right answers. So you create something that’s right, you repeat the prompt, for images, for example, and get a different one, and it’s also right. And so she posed a question: which one is right? It’s an interesting conundrum type thing. But that was a good one. We had 16 minutes, that one. And
Shel Holtz (03:01)
We had a comment on that one, too, from Dominique B., who said, sounds like it’s time for a truthiness meter.
@nevillehobson (03:02)
We have a comment? Yeah, we do.
Okay, what’s what are those?
Shel Holtz (03:13)
Stephen Colbert fans here in the US would understand truthiness. It’s a cultural reference.
@nevillehobson (03:18)
Okay.
Got it. Good. Noted. Then 464. This was truly interesting to me because it’s basically saying that, as we’ve talked about and others constantly talk about, you should disclose when you’re using AI in some way that illustrates your honesty and transparency. Unfortunately, research shows that the opposite is true: that if you disclose that you’ve used AI to create an output, you’re likely to find that your audiences will lose trust in you as soon as they see that you’ve disclosed this. That’s counterintuitive. You think disclosing and being transparent on this is good. It doesn’t play out according to the research.
It’s an interesting one. I think I’d err on the side of disclosure more than anything else. Maybe it depends on how you disclose. But it turns out that people trust AI more than they trust the humans using AI. And we spent 17 and a half minutes on that one. That was a good one. We got a comment too, I think, have we not?
Shel Holtz (04:31)
from Gail Gardner who says, that isn’t surprising given how inaccurate what AI generates is. If a brand discloses that they’re using AI to write content, they need to elaborate on what steps they take to ensure the editor fact checks and improves it, which I think is a good point.
@nevillehobson (04:48)
I wouldn’t disagree with that. Then 465 on May the 21st, the Trust News video podcast PR trifecta. That’s one of your headlines, Shel. I didn’t write that one. So it talks about unrelated trends, or seemingly unrelated trends, painting a clear picture for PR pros accustomed to achieving their goals through press release distribution and media pitching. The trends are that people trust each other less than ever, people define what news is based on its impact on them, becoming their own gatekeepers, and video podcasts have become so popular that media outlets are including them in their upfronts. So we looked at finding a common thread in our discussion among these trends and setting out how communicators can adjust their efforts to make sure the news is received and believed. That was a lengthier one than usual; it came in at 26 minutes. As always, there’s great stuff to consume.
So that brings us in fact to now, this episode 466, monthly. So we’re kicking off the wrap-up of May and heading into a new month in about a week or so.
Shel Holtz (05:59)
We also had an FIR interview dropped this month.
@nevillehobson (06:03)
We did. Thank you for the gentle nudge on mentioning that. That was our good friend Eric Schwartzman, who wrote an intriguing post, or article, I should say, in Fast Company about bot farms and how they’re invading social media to hijack popular sentiment. Lengthy piece, got a lot of reaction on LinkedIn, likes and so forth in the thousands, some hundreds of comments. So we were lucky to get him for a chat.
It’s a precursor to a book he’s writing based on that article that looks at bot farms. They now outnumber real users on social networks, according to Eric’s research, and how profits drive PR ethics. Why Meta, TikTok, X and even LinkedIn are complicit in enabling synthetic engagement at scale, says Eric. So lots to unpack in that. That was a 42 minute conversation with Eric.
His new book is called Invasion of the Bot Farms. He’s currently preparing for that. He’ll explore the escalating threat, he says, through insider stories and case studies. That was a good conversation with Eric, Shel. It’s an intriguing topic, and he really has done a lot of research on this.
Shel Holtz (07:16)
And we do have a comment on that interview from Alex Brownstein, who’s an executive vice president at a bioethics and emerging sciences organization who says, ChatGPT and certain other mainstream AIs are purportedly designed to seek out and prioritize credible, authoritative information to inform their answers, which may provide some degree of counterbalance.
And also since the last monthly episode, there has been an episode of Circle of Fellows. This is the monthly panel discussion featuring usually four IABC fellows. That’s the International Association of Business Communicators. I moderate most of these. I moderated this one. And it was about making the transition from being a communication professional to being a college or university professor teaching communication. And we had four panelists who have all made this move. Most of them have made it full-time and permanent. They are teachers and not working in communications anymore. One is still doing both. And they were John Clemens, Cindy Schmig, Mark Schuman and Jennifer Waugh. It was a great episode. It’s up on the FIR Podcast Network now.
The next Circle of Fellows is gonna be an interesting one. It is going to be done live. This is the very first time this will happen, episode 117. So we’ve done 116 of these as live streams and this one will be live streamed, but it’ll be live streamed from Vancouver, site of the 2025 IABC World Conference, and Circle of Fellows is going to be one of the sessions. So we’re gonna have a table up on the platform with the five members of the 2025 class of IABC fellows and me moderating. And in the audience, all the other fellows who are at conference will be out there among those who are attending the session and we’ll have the conversation. Brad Whitworth will have a microphone. He’ll be wandering through the audience to take questions. It’ll be fun. It’ll be interesting. It will be live streamed as our Circle of Fellows episode for June. So watch the FIR Podcast Network or LinkedIn for announcements about when to watch that episode. Should be fun.
@nevillehobson (09:54)
Okay, that does sound interesting. Shel, what date is it taking place? Do you know?
Shel Holtz (10:00)
It’s going to be Tuesday, June 10th at 10:30 a.m. Pacific time. It’s the last session before lunch. So even though IABC has only given us 45 minutes for what’s usually an hour-long discussion, we’re going to take our hour. People can, you know, if they’re really hungry, their blood sugar is dropping, they can leave. But we’ll be there for the full hour for this Circle of Fellows.
@nevillehobson (10:27)
I was just thinking, the last time I was in Vancouver was in 2006, and that was for the IABC conference in 2006. That’s nearly 20 years ago. I mean, where’s time gone, for goodness sake?
Shel Holtz (10:37)
I don’t know. I’ve been looking for it. So as I mentioned, we have six great reports for you and we will be back with those right after this.
@nevillehobson (10:40)
No, that was good.
At Google I/O last week, that’s Google’s developer conference, amongst many other things, the company unveiled a product called Veo 3, that’s V-E-O, Veo 3, its most advanced AI video generation model yet. It’s already sparking equal parts wonder and concern. Veo 3 isn’t just about photorealistic visuals. It marks the end of what TechRadar calls the silent era of AI video by combining realistic visuals with synchronized audio: dialogue, soundtracks and ambient noise, all generated from a simple text prompt. In short, it makes videos that feel real with few, if any, of the telltale glitches we’ve come to associate with synthetic media.
ZDNet and others included in a collection of links on Techmeme describe Veo 3 as a breakthrough in marrying video with audio, simulating physics, lip syncing with uncanny accuracy, and opening creative doors for filmmakers and content creators alike. But that’s only one side of the story. The realism Veo 3 achieves also raises alarms. Axios reports that many viewers can’t tell Veo 3 clips from those made by human actors. In fact, synthetic content is becoming so indistinguishable that the line between real and fake is beginning to dissolve.
Alarm is a point I made in a post on Bluesky earlier last week when I shared a series of amazing videos created by Alejandra Caraballo at Harvard Law’s Cyberlaw Clinic, portraying TV news readers reading out a breaking news story she created just from a simple text prompt. What comes immediately to mind, I said, is the disinformation uses of such a tool. What on earth will you be able to trust now? One of Alejandra’s comments in the long thread was, this is going to be used to manipulate people on a massive scale. Others in that thread noted how easily such clips can be repeated and recontextualized with no visual watermark to distinguish them from real broadcast footage.
I mean, one thing is for sure, Shel, if you’ve watched any of these, they’re now peppered all over LinkedIn and Bluesky and most social networks. You truly are going to have your jaw dropping when you see some of these things. It’s not hard to visualize just hearing an audio description, but they truly are quite extraordinary. This is a whole new level.
There’s also the question of cost and access. Veo 3 is priced at a premium, around $1,800 per hour for professional grade use, suggesting a divide between those who can afford powerful generative tools and those who can’t. So we’re not just talking about a creative leap. We’re staring at an ethical and societal challenge too.
Is Veo 3 one of the most consequential technologies Google has released in years, not just for creators, but for good and bad actors and society at large? How do you see it, Shel?
Shel Holtz (14:00)
First of all, it’s phenomenal technology. I’ve seen several of the videos that have been shared. I saw one where the prompt asked it to create a TV commercial for a ridiculous breakfast cereal product. It was Otter Crunch or something like that. And it had a kid eating Otter Crunch at the table and the mom holding the box and saying Otter Crunch is great or whatever it was that she said. And you couldn’t tell that this wasn’t shot in a studio. It was that good.
Alarm? I’m surprised that there is alarm because we have known for years that this was coming. And I don’t think it should be a surprise that it has arrived at this point, given the quality of the video services that we have seen from other providers. And this is a game of leapfrog, so you know that one of the other video providers is going to take what Google has done and take it to the next level, maybe allowing you to make longer videos, or there will be some bells and whistles that they’ll be able to add, and the prices will drop. This is a preliminary price. It’s a brand new thing. We see this with OpenAI all the time, where the first time they release something, you have to be in that $200 a month tier of customer in order to use it. But then within a couple of months, it’s available at the $20 a month level or at the free level. So this is going to become widely available from multiple services.
I think we need to look at the benefits this provides as well as the risk that it provides. This is going to make it easy for people who don’t have big budgets to do the kind of video that gets the kind of attention that leads to sales or whatever it is your communication objective was. For enhancing videos that you are producing with actual footage, in order to create openers or bridges or just to extend the scene, it’s going to be terrific. Even at $1,800 an hour, there are a lot of people who can’t get high quality video for $1,800 an hour. So this is going to be a boon to a lot of creators.
In terms of the risk, again, I think it’s education, it’s knowing what to look for, getting the word out to people about the kinds of scams that people are running with this so that they’re on their guard. It’s going to be the same scams that we’ve seen with less superior technology. It’s going to be, you know, the grandmother con, right? Where you get the call and it sounds like it’s your grandson’s voice. I’ve been kidnapped. They’re demanding this much money. Please send it. Sure sounds like him. So grandma sends the money.
So this is the kind of education that has to get out there because it’s just gonna get more realistic and easier to con people with the cons that frankly have been working well enough to keep them going up until now.
@nevillehobson (17:38)
Yeah, I think there is real cause for major alarm at a tool like this. You just set out many of the reasons why, but I think the risk comes less from examples like the grandmother call, you know, someone calling the grandmother saying, I’ve been kidnapped. I don’t know anyone that’s ever happened to. I’m not saying it doesn’t, but that doesn’t seem to me to be like a major daily thing. It might be more prosaic, more fundamental than that.
But some of the examples you can see, and the good one to mention is the one from Alejandra Caraballo, the videos she created, which were a collection of clips with the same prompt. They were all TV anchors, presenters on television, talking about breaking news that J.K. Rowling had drowned because a yacht sank after it was attacked by orcas in the Mediterranean off the coast of Turkey.
What jumped at me when I saw the first one was, my God, this was so real. It looked like it was a TV studio, all created from that simple prompt. But then came three more versions, all with different accented English, American English, US English, English as a second language for one of the presenters, that illustrate, from that one prompt, what you could do. And she said that the first video took literally a couple of seconds.
And within less than 10 minutes, after tweaking a couple of things after a number of attempts, she had a collection of five videos. So imagine that. There are benefits, unquestionably. And indeed, some of the links we’ve got really go through some significant detail of the benefits of this to creators. But right on the back of that comes this big alarm bell ringing. This is what the downside looks like.
And I think your point about it’s going to come down, competitors will emerge, undoubtedly, I totally agree with you. But that isn’t yet. In the meantime, this thing’s got serious first mover advantage and the talk-up that I’m seeing is across the tech landscape mostly. It hasn’t yet hit mainstream talk. I’m not sure how you kind of explain it in a way that excites people unless you see the videos.
But this is big alarm bell territory, in my opinion, and I think it’ll accelerate a number of things, one of which is more calls to regulate and control this if you can. You know, who knows what Trump’s going to do about this? Probably embrace it, I would imagine. I mean, you’ve seen what he’s doing already with the video and stuff that promotes him in his emperor’s clothes and all this stuff. So this is a major milestone, I think, in the development of these technologies.
It will be interesting to see who else comes out in a way that challenges Google. But if you read Google’s very technically focused description, this is not a casual development by six guys with a couple of computers. This has required, I would imagine, serious money and significant quantum computing power to get it to this stage in a way that enables anyone with a reasonably powered computer to use it and create something.
You’ve got that aspect to consider: should we be doing something like this that uses huge amounts of electricity and energy, and all the carbon emissions? We’ve got that side of the debate that’s beginning to come out a little bit. So it’s experimental time without doubt. And there are some terrific learnings we can get from this. I mean, I’d love to give it a go myself, but not at 1800 bucks. So if I had someone to do it for that I could charge for that, I’d be happy.
But I’m observing what others are doing and hearing what people are saying. And it’s picking up pace. Every time I look online, there’s something new about this. Someone else has done something and they’re sharing it. So great examples to see. So yes, let’s take a look at what the benefits are and let’s see what enterprises will make of this and what we can learn from it. But I’m keeping a close eye on what others are saying about the risks because, you talk about the education, all that stuff.
No one seems to have paid any attention to any of that over the years. So why are they going to pay attention to this now if we try and educate them?
Shel Holtz (22:06)
Well, that really depends on how you go about this. Who’s delivering the message? I mean, where I work, we communicate cybersecurity risk all the time. And we make the point this isn’t only a risk to our company. This is a risk to you and your family. You need to take these messages home and share them with your kids. And every time something new comes out, where there’s a new scam, where we are aware
@nevillehobson (22:10)
It does.
Sure.
Shel Holtz (22:34)
And we usually hear about this through our IT security folks, but where we are aware that in our industry, somebody was scammed effectively with something that was new, we get that out to everybody. We use multiple channels and we get feedback from people who are grateful for us telling them this. So it’s not that people won’t listen. You just have to get them in a way that resonates with them.
And you have to use multiple channels and you have to be repetitive with this stuff. You have to kind of drill it into their heads. You see organizations spending money on PSAs on TV alerting people to these scams. They’re all imposter scams, is what it comes down to. It’s pretending to be something that they aren’t.
You know, what troubles me about this, I think, is that we are talking a lot about erosion of trust. We talked about it on the last midweek episode, the fact that people trust each other less than they ever have. Only 34% of people say they trust other people, that other people are trustworthy. And we’re trying to rebuild trust at the same time we’re telling people, you can’t trust what you see. You can’t trust your own eyes anymore. So this is a challenging time.
@nevillehobson (23:54)
Right.
Shel Holtz (24:00)
without any question when you have to deal with both of these things at the same time. We need to build trust at the same time we’re telling people you can’t trust anything.
@nevillehobson (24:02)
It is.
Well, that is the challenge. You’re absolutely right, because people don’t actually need organizations to tell them that. They can see with their own eyes, but it’s then reinforced by what they’re hearing from governments. We’ve got an issue that I think is very germane to bring into this conversation, something in this country that is truly extraordinary. One of the biggest retailers here, Marks & Spencer, was the subject of a huge cyber attack a month ago, and it’s still not solved. Their websites, you still can’t do any buying online. You can’t do click and collect, none of those things. Today, they announced you can now again log on to the website and browse. You can’t buy anything. You can’t pay electronically. You can only do it in the stores. And no one seems to know precisely what exactly it is. There’s so much speculation, so much talk, most of which is uninformed, which is fueling the worry and alarm about this.
And the consequences for Marks & Spencer are potentially severe from a reputational point of view and brand trust, all those things. They haven’t solved this yet. That, people are saying, was likely caused by an insecure password login by someone who is a supplier of Marks & Spencer. But this is not like a little store down the road. This is a massive enterprise that has global operations. And the estimate at the moment is that the cost to them is likely to be around 300 million pounds. It’s serious money. They’re losing a million pounds a day. It’s serious. Oh, they won’t disclose it. It’s illegal to do that here in the UK to pay the ransom, if you disclose it. Government advice from the cyber security folks is don’t pay the ransom. Difficult thing to me is that you follow that advice and they’re still not solving the problem.
Shel Holtz (25:45)
And what was the ransom?
@nevillehobson (26:03)
The point I’m making is that this is just another example of forged trust, if I could say it that way, that it was likely, until information arrives telling exactly what it was, that someone persuaded someone to do something who they thought was someone else that they weren’t, that enabled that person to get access. Right. So this is going to be like that for some of the examples we’ve seen. But I think it’s likely as well to be
Shel Holtz (26:23)
Yeah, sure. It was phishing.
@nevillehobson (26:33)
kind of normal that you would almost find impossible to even imagine that it was a fake. So what’s going to happen when the J.K. Rowling example, like someone in a prominent position in society or whatever, is suddenly on a website somewhere that gets picked up and repeated everywhere before it’s, well, wait a minute, what’s the source of this? But it’s too late by then. And that’s likely what we’re going to see.
Shel Holtz (26:58)
We reported on a story like this many years ago. It was, if I remember correctly, a bank robbery in Texas. It was a story that got picked up by multiple news outlets. It was completely fake. The first outlet that picked it up just assumed that it was accurate because of their source, and all the other newspapers picked it up because they assumed that the first newspaper that picked it up had checked their facts, but it was a false story. This is nothing new. It’s just with this level of realistic video, it’s going to be that much easier to convince people that this is real and either share it or act on it.
@nevillehobson (27:40)
as it will.
And it won’t be waiting on the media to pick up and report on it. That’s too slow. It’ll be TikTokers, it’ll be YouTube. It’s anyone with a website that has some kind of audience that’s connected and it’ll be amplified big time like that. So it’ll be out of control probably within seconds of the first video appearing. That’s not to say, oh dear, you know, so what do we do? That is the landscape now. And honestly and truly, I can’t imagine how an example like a J.K. Rowling death at sea and all that stuff is on multiple TV screens, supposedly TV studios, and you don’t think when you’re watching, hang on, is this legit, this TV show? It might occur to you, but the other nine people out there watching along with you aren’t gonna ask themselves that. They’re gonna share it. And suddenly it’s out there. And before you know it. I don’t know.
If it’s, say, the CEO of a big company, and that’s happened at a time of some kind of merger or takeover going on, and then that person suddenly dropped dead, that’s the kind of thing I’m thinking about. So I can see the real need to have some kind of, I can’t even call it, Shel, regulation, I’m not sure, I don’t know, by government or someone alongside. You can’t just leave this to individual companies like yours who are doing a good job. Well, there are 50 others out there who aren’t doing this at all. So you can’t let it sit like that. Because the scale of this is breathtaking, frankly, what’s going to happen. And I think Alejandra Caraballo, and others I’ve seen saying the same thing, that, you know, this is going to be a tool used to manipulate people on a massive scale. We’re not talking about business employees necessarily. The public at large, this is going to manipulate people. And we’re already seeing that at small scale, based on the tech we have now. This tech’s up notches, in my view. You know, 1800 bucks, people are going to do this. To them, it’s like, you know, petty cash almost. Or someone’s going to come out with something, again, that isn’t going to be that, and it’s on a dark web somewhere and, you know.
So I mean, I’m now getting into areas that I have no idea what I’m going to be talking about. So I will stop that now. I don’t know how that’s going to work. This requires attention, in my opinion, to protect people and organizations from the bad actors, that euphemistic phrase, who are intent on causing disruption and chaos. And this is potentially what this will achieve alongside all that good stuff.
Shel Holtz (30:19)
It’ll be interesting to hear what Google plans to do to prevent people from using it for those purposes. I have access to…
@nevillehobson (30:26)
They have a bit of an FAQ, which talks a little bit about that. But hey, this is like draft still, I would say.
Shel Holtz (30:33)
I have access to Veo 2 on my $20 a month Gemini account, so I’ll just wait the six weeks until Veo 3 is available there.
@nevillehobson (30:44)
Well, things may have moved on to who knows what in six weeks, I would say. But nevertheless, this is an intriguing development technologically and what it lets people do in a good sense is the exciting part. The worrying part is what the bad guys are going to be doing.
Shel Holtz (31:03)
to say. So I need to make a time code note.
@nevillehobson (31:04)
Yeah.
Shel Holtz (31:18)
The fact that generative AI chatbots hallucinate isn’t a revelation, at least it shouldn’t be at this point, and yet AI hallucinations are causing real, consequential damage to organizations and individuals alike, including a lot of people who should know better. And contrary to logic and common sense, it’s actually getting worse.
Just this past week, we’ve seen two high-profile cases that illustrate the problem. First, the Chicago Sun-Times published what they called a summer reading list for 2025 that recommended 15 books. Ten of them didn’t exist. They were entirely fabricated by AI, complete with compelling descriptions of Isabelle Indy’s non-existent climate fiction novel Tidewater Dreams and Andy Weir’s imaginary thriller The Last Algorithm.
The newspaper’s response? Well, they blamed a freelancer from King Features, which is a company that syndicates content to newspapers across the country. It’s owned by Hearst. That freelancer used AI to generate the list without fact checking it. And the Sun-Times published it believing King Features content was accurate. And other publications shared it because the Chicago Sun-Times had done it.
Then there’s the even more embarrassing case of Anthropic. That’s the company behind the Claude AI chatbot, one of the really big international large language models, frontier models. Their own lawyers had to apologize to a federal judge after Claude hallucinated a legal citation in a court filing. The AI generated a fake title and fake authors for what should have been a real academic paper. Their manual citation checks
missed it entirely. Think about that for a moment. A company that makes AI couldn’t catch its own tool’s mistakes, even with human review. Now, here’s what’s particularly concerning for those of us in communications. This isn’t getting better with newer AI models. According to research from Vectara, even the most accurate AI models still hallucinate at least 0.7% of the time,
with some models producing false information in nearly one of every three responses. MIT research from January found that when AI models hallucinate, they actually use more confident language than when they’re producing accurate information. They’re 34% more likely to use phrases like definitely, certainly, and without doubt when they’re completely wrong. So what does this mean for PR and communications professionals? Three critical things. First,
we need to fundamentally rethink our relationship with AI tools. The Chicago Sun-Times incident happened just two months after the paper laid off 20% of its staff. Organizations under financial pressure are increasingly turning to AI to fill gaps, but without proper oversight, they’re creating massive reputation risks. When your summer reading list becomes a national embarrassment because you trusted AI without verification, you’ve got a crisis communication problem on your hands.
Shel Holtz (34:28)
Second, the trust issue goes deeper than individual mistakes. As we mentioned in a recent midweek episode, research shows that audiences lose trust as soon as they see AI disclosure labels, but finding out you used AI without disclosing it is even worse for trust. This creates what researchers call the transparency dilemma. Damned if you disclose, damned if you don’t. For communicators who rely on credibility and trust, this is a fundamental challenge we haven’t come to terms with.
Third, we’re seeing AI hallucinations spread into high-stakes environments where the consequences are severe. Beyond the legal filing errors we’ve seen multiple times now, from Anthropic to the Israeli prosecutors who cited non-existent laws, we’re seeing healthcare AI that hallucinates medical information 2.3% of the time, and legal AI tools that produce incorrect information in at least some percentage of cases that could affect real legal outcomes.
The bottom line for communication professionals is that AI can be a powerful tool, but it is not a replacement for human judgment and verification. I know we say this over and over and over again, and yet look at the number of companies that use it that way. The industry has invested $12.8 billion specifically to solve hallucination problems in the last three years, yet we’re still seeing high-profile failures from major organizations who should know better.
My recommendation, if you’re using AI in your communications work, and let’s be honest, most of us are, insist on rigorous verification processes. Don’t just spot check. Verify every factual claim, every citation, every piece of information that could damage your organization’s credibility if it’s wrong. And remember, the more confident AI sounds, the more suspicious you should be.
The Chicago Sun-Times called their incident a learning moment for all of journalism. I’d argue it’s a learning moment for all of us in communications. We can’t afford to let AI hallucinations become someone else’s crisis communications case study.
@nevillehobson (36:37)
Until the next one. Right. I mean, listen to what you say. You’re absolutely right. Yet, the humans are the problem. Arguably, and I’ve heard this, they’re not; it’s that the technology is not up to scratch. Fine. Right. In that case, you know that. So therefore, you’ve got to pay very close attention and do all the things that you outlined before that people are not doing. So this one is extraordinary.
Shel Holtz (36:39)
And it becomes a case study.
The humans are the solution.
@nevillehobson (37:05)
Snopes has a good analysis of it talking about this. King Features, I mean, their communication about it, they said, the company has a strict policy with our staff, cartoonists, columnists, and freelance writers against the use of AI to create content. And they said it will be ending its relationship with the guy who did this. Okay, throw him under the bus, basically. So you don’t have guidance in place properly, even though
you say you have a strict policy, that’s not the same thing, is it? So I think this was inevitable and we’re going to see it again, sure, we will and the consequences will be dire. I was reading a story this morning here in the UK of a lawyer who was an intern. That’s not her title, but she was a junior person, and she entered into evidence some research she’d done without checking and it was all fake, done by the AI. And the case
turns out, and again, this is precisely the concern, not the tech. It’s not her fault. She didn’t have proper supervision. She was pressured by people who didn’t help because she didn’t know enough. And so she didn’t know how to do something. And she was under tight parameters to complete this thing. So she did it. No one checked her work at all. So she apologized and all that stuff. And yes, the judge, from what I read, isn’t penalizing her. It’s her boss he should be penalizing.
You’re going to see that repeated. I’m sure it already exists in cases up and down businesses, organizations everywhere, where that is not an unusual setup structure: lack of support, lack of training, lack of encouragement, indeed, the whole bring it out, let’s get the policy set up guidance and not just publish it on the internet. We bring it to people’s attention. We embrace them. We encourage them.
We bring them on board to conversations constantly, brown bag lunches, all the informal ways of doing this too. And I’m certain that happens a lot. But this example and others we could bring up and mention show that it’s not in those particular organizations. So the time will come, I don’t believe it’s happened yet, where the most monumentally catastrophic clanger will be dropped sooner or later in an organization, whether it’s a government,
whether it’s a private company, whether it’s a medical system or whatever, that could have life or death consequences for people. I don’t believe that’s happened yet that we know of anyway, but the time is coming where it’s going to, I’d say.
Shel Holtz (39:36)
It will,
it undoubtedly will. And you’ll see medical decisions get made based on a hallucination that somebody didn’t check. What strikes me though is that we talk about AI as an adjunct, right? It is an enhancement to what humans do. It allows you to offload a lot of the drudgery so that you can focus your time on more
human-centric and more strategic endeavors, which is great, but you still have to make sure that the drudge work is done right. I mean, that work is being done for a reason. It may be drudgery to produce it, but it must have some value or the organization wouldn’t want it anymore. So it’s important to check those. And in organizations that are cutting headcount,
@nevillehobson (40:06)
Ahem.
Shel Holtz (40:29)
you know, what a lot of employees are doing is using AI in order to be able to get all their work done. That drudge work, having the AI do that and spend 15 minutes on it instead of three hours. It’s not like those three hours are available to them to fact-check. They’ve got other things that they need to do. Organizations that are cutting staff need to be cognizant of the fact that they may be cutting the ability to fact-check the output of the AI,
which could do something egregious enough to cost them a whole lot more than they saved by cutting that staff. And by the way, I saw research very recently, I almost added it as a report in today’s episode, that found that investors are not thrilled with all the layoffs that they’re seeing in favor of AI. They think it’s a bad idea. So if you’re looking for a way to…
get your leaders to temper their inclinations to trim their staff, you may want to point to the fact that they may lose investors over decisions like that. But we need the people to fact-check these things. And by the way, I have found an interesting way to fact-check, and it is not an exclusive approach to this.
But let me give you just this quick example. On our intranet every week, I share a construction term of the week that not every employee may know. And I have the description of that term written by one of the large language models. I don’t know what these things mean. I’m not a construction engineer.
So I get it written, and then the first thing I do is I copy it, and then I go to another one of the large language models and paste it in, and I say, review this for accuracy and give me a list of what you would change to make it more accurate. And most of the time it says, this is a really accurate write-up that you’ve got of this term. I would recommend to enhance the accuracy that you add these things.
So I’ll say, go ahead and do that, write it up and make those things. Then I’ll go to a third large language model and ask the same question. I’ll still go do a Google search and find something that describes all of this to make sure I’ve got it right. But I find playing the large language models against each other as accuracy checks works pretty well.
@nevillehobson (42:56)
Yeah, I do a similar thing too, not for everything. I mean, like, who’s got the time to do all that all the time? But it depends, I think, on what you’re doing. But it is something that we need to pay attention to. And in fact, this is quite a good segue to our next piece, our next story, where artificial intelligence plays a big role. This one talks about
a new report from the Global Alliance for Public Relations and Communication Management that offers a timely and global perspective on how our profession is adapting and in many cases struggling to keep pace as artificial intelligence continues its rapid integration into our daily work. As AI tools become embedded in the workflows of communication professionals around the world, a new survey from the Global Alliance offers a revealing snapshot
of where our profession currently stands and where it may be falling short. The report, titled Reimagining Tomorrow: AI in PR and Communication Management, draws on insights from nearly 500 PR and communication professionals. The findings paint a picture of a profession that’s enthusiastically embracing AI tools, particularly for content creation, but falling short when it comes to strategic leadership, ethical governance, and stakeholder communication. While adoption is high,
91% of respondents say they’re using AI. The report highlights a striking absence of strategic leadership. Only 8.2% of PR and communication teams are leading in AI governance or strategy, according to the report. Yet professionals rank governance and ethics as their top AI priorities at 33% and 27% respectively. Despite this, PR teams are mostly engaged in tactical tasks
such as content creation and tool support. This gap between strategic intent and practical involvement is critical. If PR professionals don’t position themselves as stewards of responsible AI use, other functions like IT or legal will define the narrative. This has implications not only for reputation management, but for organizational relevance in the comms function. Now, in a post on his blog last week, our friend Stuart Bruce
describes the findings as alarming, arguing that communicators are failing to lead on the very issues that matter most: ethics, transparency, stakeholder trust, and reputation. His critique is clear. If PR doesn’t step up to define the responsible use of AI, we risk becoming sidelined in decisions that affect not just our teams, but the wider organization and society. The Global Alliance’s report also shows that while AI is mostly being used for content creation,
very few are leveraging its potential for audience insights, crisis response, or strategic decision making. Many PR pros still don’t fully understand what AI can actually do, Stuart says, either tactically or strategically. Worse, some are operating under common myths, such as avoiding any use of AI with private data, regardless of whether they’re using secure enterprise tools or not. So where does this leave us? Well, it looks to me like somewhere between a promise and a missed opportunity.
How would you say it, Shel?
Shel Holtz (46:21)
It is a missed opportunity so far, as far as I am concerned. And I have seen research that basically breaks through the communications boundary into the larger world of business that says, yes, there’s great stuff going on in organizations in terms of the adoption of AI, but there is not really strategic leadership happening in most organizations. Employees are using it.
There are a growing number of policies, although most organizations still don’t have policies. Most organizations still don’t have ethics guidelines, although a growing number do. There are companies like mine that have AI committees, but the leadership needs to come from the very top down. And that’s what this research found isn’t happening. I was just scrolling through my bookmarks trying to find it. I’ll definitely turn that up before the…
show notes get published. If it’s not happening at the leadership levels of organizations, it’s not happening at the leadership levels of communication. I certainly can see that in the real world as I talk to people. It’s being used at a very tactical level, but nobody is really looking at the whole overall operation of communication in the organization, the role that it plays and how it goes about doing that
through that lens of AI and how we need to adapt and change and how we need to prepare ourselves to continue to adapt and change as things like Veo 3 are released on the market and suddenly you’re facing a potential new reputational threat.
@nevillehobson (48:07)
Lots to unpack there. It’s worth reading the report. It’s well worth the time.
Shel Holtz (48:12)
Hey, Dan, thank you for that great report. Yeah, I had to wipe a tear away as well over the passing of Skype. You’re right. It was amazing as the only tool that allowed you to do what it could do. And as we have mentioned here more than once in the past, it is the only reason that we were able to start this podcast in the first place without Skype. You were in Amsterdam at the time.
And for you and I to be able to talk together and record both sides of our conversation, Skype was the reason that we could do that. The only other option would have been what at the time was an expensive long distance phone call with really terrible audio. Who knew the double-ender back in those days? We could have done it. You realize we could have both recorded our own ends. It would have taken forever to send those files
@nevillehobson (49:02)
Yeah.
Shel Holtz (49:09)
back then because the speeds were.
@nevillehobson (49:11)
It would have been quicker
burning them to a CD and sending it by courier, I would say.
Shel Holtz (49:15)
Yeah,
no kidding. So bless Skype for enabling not just us, but pretty much any podcasters who were doing interviews or co-host arrangements. Skype made it possible, but Skype also enabled a lot of global business. There were a lot of meetings that didn’t have to happen in person. I mean, you look at Zoom today, Zoom is standing on the shoulders of Skype.
@nevillehobson (49:39)
Yeah, it actually did enable a lot. You’re absolutely right. I can remember, and you remember this, of course, back in those days when both of us, I think, were independent consultants. So, you know, pitching for business, securing contacts and following up and all that was key. We had what Skype called Skype out numbers that were regular phone numbers that people could use like a landline and that would get forwarded through to Skype. My
wife’s family in Costa Rica, she used Skype to make calls all the time, and that replaced sending faxes, which is how they used to communicate because that was cheaper than international phone calls at that time. Lots happened in that time. But in reality, it’s only 20 years ago. It sounds a lot. But all this has happened in a 20-year period. And Skype was the catalyst for much of this. They laid the foundation for
Teams that we see now, Zoom, Google Meet, all those services that we can use. So what happened to WebEx and the like? It seems to have largely vanished, from what I can see. So we’re used to all this stuff now. But it was a great starter for us. And Dan mentions.
Shel Holtz (50:55)
Yeah, I had a Skype out. My Skype
out number I got, it was my business number and I got a 415 area code because that’s San Francisco and nobody knew the 510 area code in the East Bay outside of the Bay Area. So it provided just that little extra bit of cachet. Oh, a San Francisco number. I mean, there was just so much good that came out of Skype. They kept coming up with great features and great tools even after Microsoft bought it.
@nevillehobson (51:17)
Yeah.
They did.
Yeah. And the price, the pricing structure was good. At that time I had business on the East Coast in the US and I had a New York number. So, yeah, it was super. But so good to have a reminisce there with Dan. That was great. I was intrigued by your element about Bridgy Fed, which I’ve been trying to use since it emerged
Shel Holtz (51:25)
So.
That’s great.
@nevillehobson (51:53)
with Bluesky, but also with Ghost, which has enabled a lot of this connectivity with other servers in the Fediverse. And so I’ve kind of got it all set up. But no matter what I do, it just does not connect. And I haven’t figured out why not yet. So you’ve prompted me to get this sorted out, because it’s important. I’ve got my social web address, and it was enabled by Ghost, that works on Mastodon,
and it enables Bluesky to connect with Mastodon too. It’s really quite cool, but Bridgy Fed’s key to much of that functionality. Maybe it’s just me. I haven’t figured it out yet. There could be. So this is definitely not in the mainstream readiness arena quite yet, but this is the direction of travel without any doubt. And I think it’s great that we eliminate these, you know, ActivityPub versus AT Protocol.
It just works. No one gives a damn about whether you’re on a different protocol or not. That’s where we’re aiming for. And that’s what we’re actually moving towards quite quickly. Not for me, though, until I get this working.
Shel Holtz (53:04)
One protocol will win over another at one point or another. It always does.
@nevillehobson (53:07)
It’s like, yeah,
Betamax and VHS, you know, look at that.
Shel Holtz (53:12)
Yep.
And that’s the power of marketing because Betamax was the higher quality format. Well, let’s explore a fascinating and entirely predictable phenomenon that’s emerging in the corporate world. Companies that enthusiastically laid off workers to replace them with AI are now quietly hiring humans back.
@nevillehobson (53:16)
Yes, right, right.
Shel Holtz (53:35)
This item ticks a lot of boxes, man. Organizational communication, brand trust, crisis management. Let’s start with the poster child for this phenomenon: Klarna, the buy now pay later company. CEO Sebastian Siemiatkowski became something of an AI evangelist, loudly declaring that his company had essentially stopped hiring a year ago, shrinking from 4,500 to 3,500 employees through what he called natural attrition.
He bragged that AI could already do all the jobs that humans do and even created an AI deepfake of himself to report quarterly earnings, supposedly proving that even CEOs can be replaced. How’d that work out for him? Just last week, Siemiatkowski announced that Klarna is now hiring human customer service agents again. Why? Because as he put it, from a brand perspective, a company perspective, I just think it’s so critical
that you are clear to your customer that there will always be a human if you want. The very CEO who said AI could replace everyone is now admitting that human connection is essential for brand trust. It isn’t an isolated case. We’re seeing this pattern repeat across industries, and it should serve as a wake-up call for communications professionals about the risk of overly aggressive AI adoption without considering the human element. Take Duolingo, which had been facing an absolute
firestorm on social media after CEO Luis von Ahn announced that the company was going AI-first. The backlash was so severe that Duolingo deleted all of its TikTok and Instagram posts, wiping out years of carefully crafted content from accounts with millions of followers. The company’s own social media team then posted a cryptic video. They were all wearing those Anonymous-style masks saying Duolingo was never funny.
We were. And what a stunning example of how your employees can become your biggest communication crisis when AI policies directly threaten their livelihoods. All this is particularly troubling from a communication perspective. These companies didn’t just lose employees, they lost institutional knowledge, creativity, and human insight that made their brands distinctive in the first place. A former Duolingo contractor told one journalist that the AI-generated content is very boring,
while Duolingo was always known for being fun and quirky. When you replace the humans who created your brand voice with AI, you risk losing the very thing that made your brand memorable. But here’s the broader pattern we need to understand. According to new research, just one in four AI investments actually delivers the ROI it promises. Meanwhile, companies are spending an average of $14,200 per employee per year just to catch and correct AI mistakes.
Knowledge workers are spending over four hours a week verifying AI output. These aren’t the efficiency gains that were promised. Now, I firmly believe those are still coming, those gains, and in a lot of cases, they’re actually here now. Some organizations are realizing them as we speak, but we’re not out of the woods yet. From a crisis communication standpoint, the AI layoff rehire cycle creates multiple reputation risks.
There’s the immediate backlash when you announce AI replacements. We saw this with Klarna and Duolingo and others. Employees and customers both react negatively to the idea that human workers are disposable. Then there’s the credibility hit when you quietly reverse course and start hiring people again. It signals that your AI strategy wasn’t as well thought out as you claimed. And that sort of trickles over into how much people trust your judgment and other things that you’re making decisions about.
For those of us working in communication, this trend highlights some critical lessons. First, stakeholder communication about AI needs to be honest about limitations, not just potential and benefits. Companies that overpromise on AI capability set themselves up for embarrassing reversals. Klarna’s CEO went from saying AI could do all human jobs to admitting that customer service quality suffered without human oversight.
Second, employee communications around AI adoption require extreme care. When you announce AI-first policies, you’re essentially telling your workforce they’re expendable. The Duolingo social media team’s rebellion shows what happens when you lose internal buy-in. Your employees become your critics, not your champions. And brand voice and customer experience are fundamentally human elements that can’t be easily automated.
Companies struggling most are those that tried to replace creative and customer-facing roles with AI. Meanwhile, companies succeeding with AI are using it to augment human capabilities, not replace them entirely. The irony here is pretty rich. At a time when trust in institutions is at historic lows, companies are discovering that human connection and authenticity matter more than ever. You can’t automate your way to trust. So,
what should communication professionals take away from this AI layoff rehire cycle? Be deeply skeptical of any AI strategy that eliminates human oversight in customer-facing roles. Push back on claims that AI can fully replace creative or strategic communications work. And remember that when AI initiatives go wrong, it becomes a communications problem that requires very human skills to solve.
The companies getting all this right are the ones that view it as a tool to enhance human capabilities, not replace them. The ones getting it wrong are learning an expensive lesson about the irreplaceable value of human judgment, creativity, and connection.
@nevillehobson (59:32)
Yeah, it got me thinking about the human bit that doesn’t get this, which is typically a leader in an organization, but actually not necessarily at the highest level. I’m thinking in particular of companies, I’ve had a need to go through this process recently, who replace people at the end of a phone line in customer support
with a chatbot typically as the first line of defense. And I use that phrase deliberately. It defends them from having to talk to a customer where they have a chatbot where it guides you through carefully controlled scripted scenarios that it does have a little bit of leeway in its intelligence to respond on the fly to a question that’s not in the script, as it were, but only marginally. And so you still have to go through a system
that is poor at best and downright dangerous at worst in terms of trust with customers. To your point, I agree totally, it kind of fosters a climate of mistrust entirely when you can’t get to a human and all you get is a chatbot, and sometimes a chatbot that can actually engage in conversation. There are some good ones around.
But my experience recently with an insurance company over a car accident I had in December: a guy drove into my car, it’s been repaired, and I’m chasing the other party to reclaim my excess. And boy, that’s an education in how not to implement something that engages with people. So, but I don’t see any sign of that changing anytime soon.
So one thing I take from this, Shel, everything you said, indeed what we discussed in this whole episode so far in this context: it’s a people issue, not a tech issue completely in terms of how these tools are deployed in organizations. The CEO at Klarna, I was reading about the CEO of Zoom who deployed an avatar to open his speech at an event recently.
I just wonder what were they thinking to do all these thing
05/24/25 | 0 Comments | FIR #466: Still Hallucinating After All These Years