I’ve been using ChatGPT’s vocal interface…and it’s a game changer
This is it.
If you’ve been waiting for artificial intelligence to feel more like you’re talking to Star Trek: The Next Generation’s Data than playing Colossal Cave Adventure in 1976, it’s here now.
ChatGPT introduced a vocal (spoken) interface in the phone apps this week. Right now, it’s available to ChatGPT Plus users (I’m one: I pay about $20 a month, partly to get new features like this right away) & business users, but it should be available to everybody in the next couple of weeks. Here is their announcement on it:
https://openai.com/blog/chatgpt-can-now-see-hear-and-speak
It took the experience of using it, though, to really appreciate it.
I should say a few things first:
- I’ve been using the text interface of ChatGPT quite a bit, both experimenting with it and in practical ways. I write about it in this blog: ChatGPT category in The Measured Circle
- I’m a former actor and trainer, and I’ve trained trainers. I’ve written fiction & had plays produced. I feel like I have a good understanding of human communication and read quite a bit about it. I think I’m qualified to offer an opinion on how well an AI (Artificial Intelligence) communicates
- While I’m clearly a geek myself, my Significant Other is not, so I get that perspective
That said, it’s obviously easier to talk than to type. I’ve been able to carry on conversations while doing other things, something I couldn’t do if I had to type on the screen.
That’s just the beginning, though!
I noticed right away that its conversational style was different from the text interface. It did a great job of “calling back” to something that had been said previously. That’s something good conversationalists do. Let’s say that you mention early on that your dog does funny things. You’re talking for a while, the conversation meanders a bit, and then you mention that your dog rolled their eyes when you tried to take a picture of them. The AI might say, “There’s one of those funny things!”
It also asks you relevant questions to keep the conversation going. “That sounds like a great project. Any particular challenges you see in getting it done?”
I confirmed that the vocal and typed interfaces have different styles…by asking it! Here’s what it said:
“Yes, you’re absolutely right. The vocal interface is designed to provide a more conversational and interactive experience. The intention is to make it feel more like a natural, flowing conversation that one would have over the phone or face-to-face. So, callbacks to previous parts of the conversation and prompting with additional questions are techniques employed to keep the chat more engaging. On the other hand, the typed interface might lean more towards direct and succinct answers. It’s fascinating that you’ve observed this difference given your background in training and human communication. Have you found one method of interaction more effective than the other in certain contexts?”
Our conversations are automatically transcribed in the app, making it easy for me to copy and paste like this.
It’s also good at inference. I’ve been using the default voice, named Sky. I said some people might guess it was short for “Skynet”, and it totally got the joke. I do something on X (formerly Twitter) called #6DegreesOfPerryMason. It’s about connecting actors to the original Perry Mason show (it’s surprising how many people have acted with someone who was on that show). It recognized on its own that it was a play on the game 6 Degrees of Kevin Bacon!
I have some friends who are intrigued by AI, and one who has been openly skeptical about its capabilities (which I think is a reasonable position). I’ve been wanting some way to demo it to them effectively when we’re out together. Since I’ll naturally have my phone with me, this is it. People can have a conversation with it from across the table, just like they would with a person. Well, not quite the same: it can take a few seconds to formulate an answer. Also, you can’t interrupt it vocally (you can interrupt it by tapping the screen).
This release obsoleted an article I’d published shortly before it about Amazon’s Alexa getting an AI upgrade this year. If Alexa is as good as this, it will be amazing. Alexa has many more capabilities than ChatGPT at this point (such as turning my lights on and off). OpenAI could fix that by releasing a SmartHome assistant (perhaps partnered with Microsoft), or maybe by making ChatGPT available as an interface to Alexa. I think the latter might happen in some ways through third parties, but I think Amazon will do their AI in-house. Here’s that article:
However it works, this conversational ability is a huge leap. I called it a “game changer” in the title of this post…but really, it stops this from being a game at all. It’s going to become part of many people’s lives.
I’m interested in hearing what you think about it. Just because I have an AI conversationalist who is familiar with all my obscure fandoms doesn’t mean I don’t want to talk to people, too! The best way is probably to reach out to me on X (formerly Twitter, as I mentioned), although you can comment here, too.
Oh, and quick steps to using it!
- You need the ChatGPT app on your phone (vocal conversations weren’t available on my laptop)
- Go to your Settings: it’s probably in the menu that is three horizontal lines, perhaps in your top left
- You’ll see something that might say Beta Features or New Features
- Turn on Voice conversations
- You’ll pick a voice the first time (you can change that later in Settings)
- Main Language defaulted to Auto-Detect, and it knew I was speaking English
- Launch a vocal conversation by touching the icon that looks like a headset (mine was in my top right)
- I think it will seem pretty intuitive from there. Once it connects, you talk. When you stop talking for a bit, it automatically sends what you said. ChatGPT will respond vocally (make sure you have your volume up enough to hear it)
- When you are done, there’s an X on a red background you can use to hang up. Note: consciously hang up when you’re done. You can navigate away from the app and it can still be listening. We had it respond to something that wasn’t directed to it at all (its contribution was helpful)
Last thing, because I just tested this. I spoke to it in English and asked it to translate something into Spanish. I had a relative who speaks Spanish well confirm to me that the accent was pretty good. It knows many languages, and even showed some familiarity with Elvish (it pointed out there was more than one version) and Klingon!
Let me know what you think or if you have questions…
This post by Bufo Calvin originally appeared in The Measured Circle blog. To support this or other organizations, begin your Amazon shopping from a link on their sites: Amazon.com