Generative AI sans horse sense – unreliable narrators by design?

Gear up!

I like the way I can do a Google search using a photo – drag & drop it into the search box – to identify its content. I like the way I can extract text from a photo (provided the photo has sufficient resolution). And Amazon uses AI to summarize the gist of product reviews (to some degree).

But in the rush to embed AI for directly answering user questions – in a manner like “Ask Mr. Wizard” – this article notes that something is amiss. With large language models. Perhaps with no way out.

Not all movies can be saved in post. Not all software can be saved by updates. GIGO.

Rolling the dice is not a good basis for trustworthiness. “Mostly correct” still inflicts harm on hapless users. A form of quantum uncertainty: horse sense or horse pucky, eh.

Who vets fact-checking mechanisms? And if it comes to using low-wage human labor to fact-check, …

• Washington Post > Tech Brief > email news > “Google’s AI search problem may never be fully solved” by Will Oremus (May 29, 2024) – Last week, Google’s new “AI Overviews” stretched factuality.

“All large language models, by the very nature of their architecture, are inherently and irredeemably unreliable narrators,” said Grady Booch, a renowned computer scientist. At a basic level, they’re designed to generate answers that sound coherent — not answers that are true. “As such, they simply cannot be ‘fixed,’” he said, because making things up is “an inescapable property of how they work.”
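
Booch’s point shows up in the decoding loop itself. Here’s a minimal sketch – the prompt, the vocabulary, and the probabilities are invented for illustration, not taken from any real model – of how sampling rewards plausibility, with no step that checks truth:

```python
import random

# Toy next-token distribution for the prompt "The capital of Australia is".
# The probabilities are made up for illustration; a real model computes
# them from learned weights, but the decoding loop looks the same.
next_token_probs = {
    "Canberra": 0.55,    # true
    "Sydney": 0.35,      # false, but statistically plausible
    "Melbourne": 0.10,   # false, but statistically plausible
}

def sample_token(probs: dict[str, float]) -> str:
    """Pick a continuation by probability: plausibility, not truth."""
    tokens = list(probs)
    weights = list(probs.values())
    return random.choices(tokens, weights=weights, k=1)[0]

# Roughly 45% of completions here are fluent and wrong, and nothing in
# the sampling step can tell the difference.
print("The capital of Australia is", sample_token(next_token_probs))
```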

But that [citing and summarizing specific sources] can still go wrong in multiple ways, said Melanie Mitchell, a professor at the Santa Fe Institute who researches complex systems. One is that the system can’t always tell whether a given source provides a reliable answer to the question, perhaps because it fails to understand the context. Another is that even when it finds a good source, it may misinterpret what that source is saying.

Other AI tools … may not get the same answers wrong that Google does. But they will get others wrong that Google gets right. “The AI to do this in a much more trustworthy way just doesn’t exist yet,” Mitchell said.
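
To see where Mitchell’s two failure modes sit, here’s a toy sketch of that cite-and-summarize pattern (often called retrieval-augmented generation). The corpus, the keyword ranking, and the echo-style “summary” are stand-ins of my own, not any real search stack:

```python
# Toy corpus: two pages that both mention the question's keywords.
CORPUS = {
    "satire-site.example": "Scientists confirm the moon is made of cheese.",
    "encyclopedia.example": "The moon is composed mostly of silicate rock.",
}

def retrieve(question: str) -> list[tuple[str, str]]:
    """Rank pages by naive keyword overlap with the question.

    Failure mode 1: ranking rewards surface relevance. A satirical page
    that repeats the question's wording outranks a sober source; nothing
    here judges whether a source is reliable in context.
    """
    q_words = set(question.lower().split())
    return sorted(
        CORPUS.items(),
        key=lambda kv: -len(q_words & set(kv[1].lower().split())),
    )

def answer(question: str) -> str:
    """Answer from the top-ranked source.

    Failure mode 2: a real system paraphrases the source with a language
    model, which can misread or blend sources into a fluent but wrong
    summary. This sketch merely echoes the top hit.
    """
    top_url, top_text = retrieve(question)[0]
    return f"{top_text} (per {top_url})"

print(answer("What is the moon made of?"))
# -> Scientists confirm the moon is made of cheese. (per satire-site.example)
```

Swapping in a smarter ranker helps, but as Mitchell notes, both steps still involve judgment calls the AI can get wrong.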

2 comments

  1. Talking dog 1.0

    As a counterpoint to the bumpy ride of LLMs, Steven Levy reminds us of previous tech revolutions – in the context of recent AI announcements:

    • OpenAI’s GPT-4o homage to the movie ‘Her’ – “like the screenplay was a blueprint.”

    • Google’s rollout of a new version of its most powerful AI model, Gemini Pro … and a Project Astra tease.

    In his “Time Travel” commentary, Levy notes that “actually, reality [of 1995’s Internet] exceeded my hyperbole [at a time when most people in the United States had yet to log on, let alone net-surf].”

    So, while naysayers highlight LLMs’ quirky performances (noted in my post), he marvels at the “talking dog.” Something that heralds a future arc like the smartphone’s – once a marvel, now mundane – for better or worse (indeed).

    … like the story where someone takes a friend to a comedy club to see a talking dog. The canine comic does a short set with perfect diction. But the friend isn’t impressed – “The jokes bombed!”

    Like a sitcom about a talking horse, eh.

    • Wired > email Newsletter > Steven Levy > Plaintext > The Plain View > “It’s time to believe the AI hype” (May 17, 2024) – There’s universal agreement in the tech world that AI is the biggest thing since the internet, and maybe bigger.

    Folks, when dogs talk, we’re talking Biblical disruption. Do you think that future models will do worse on the law exams?

  2. AI from magic hat

    The hype over AI applications reminds me of the hype over quantum computing. Billions of dollars continue to be invested in both technologies, despite limitations and uncertain business models. And demos that disappoint.

    This article clarifies the context: using AI for summaries and wordsmithing vs. fetching factual information. Can ChatGPT really ace the bar exam? Or write code like an expert programmer?

    In my short story “First phone day” (from Tales of Tau’s World), here’s a vision of AI at a social occasion:

    Later in the day, “Time to party!” popped in Tau’s feed, tagged with Tak’s grinning avatar. Nearby, all his pals waved. “Okay, ready!” he replied. He joined the others moving toward the Ed Center.

    The Center’s gym had been transformed into a festive dance floor. Cool lighting effects, holograms in the air, etc. Music by a droid stream jockey.

    “Wow!” Comms started popping with reactions, digs, … where to start, who’d make the first move …

    Their AIs sensed the tension and suggested enabling chaperone mode. “Whoa, you can do that?” Tau looked around, “Go with it?” The answer was quick, as their feeds guided them into groups, polled them on music and dance options, and instructed the stream jockey, which cued the lighting as well.

    Tau got with the vibe and joined in the dancing. Mostly group dancing – line and quad dancing. He liked the way everyone seemed to sync while together. Epic! And his new personal AI offered tips on the moves.

    All too soon, things started winding down. The music faded. The lights brightened. A hospitality avatar popped in Tau’s feed, “You guys have been great! Congrats to you all and thanks for coming!”

    • Washington Post > “AI isn’t dumb, but it might be dumber than you think” by Shira Ovide (June 25, 2024) – It’s time to get real about what AI can and can’t do.

    … there is often a mismatch between the reality of AI and how companies encourage you to think of their AI as magical brains that know and do everything.

    The lesson is to get comfortable with what AI can and can’t do, so you’re not disappointed.

    And it helps to see the pattern of companies backtracking when AI doesn’t work nearly as well as they had promised. They know AI is not magic and you should, too. Here are some examples: [Amazon, Microsoft, Google, …].
