Voice as the Vanguard: Embracing AI’s Audio Ascendancy

Brian Chen, a tech reporter at The New York Times, today presented a compelling case for the unparalleled influence of voice in AI-driven interfaces. His article points out the pivotal shift toward voice technology and emphasizes how AI is thriving in this medium, revolutionizing our interactions with the digital world.

I tend to agree, not simply because I feel Chen is correct, but because of my own observations and experiences as an early adopter and promoter of the technology that has shaped our lives today.

To set the stage: in 2005, I worked with SightSpeed to help propel video calling and conferencing, and with GrandCentral to extol the virtues of always being reachable with one number for life. Even earlier, there was Webley, the first voice assistant that really worked, and before all that, I was at the forefront of sports marketing from age 14 until my mid-30s. I coined the term “the Workcation” in 2015. In 2005, the Nokia Blogger Relations program signaled a dramatic shift in brand communications; it was the archetype and seminal version of what is called influencer relations today, though we did it without paying graft, bribes, or payola. And, of course, my blog, VoIPWatch, has been around since 2004. The list of experiences and early-adopter behavior goes on and on.

So here we are, once again in an era where technology continually redefines the boundaries of communication, and voice is starting to stand out as a potent and profoundly human medium.

Since last Friday, I’ve been using the Rabbit R1, a $199 first-wave “AI Companion.” Is it perfect? Hell no.

Does it call up memories of my first computer, an Osborne 1? Yes. But even more, much like when my first Apple Macintosh arrived and I started to use it, I saw a change in how I would work in the future. It was in 1992, though, that the first of my two Apple Newtons gave me the visceral sense that the world had just changed. That’s how I feel about the Rabbit R1.

Is the Rabbit R1 so advanced that it’s game-changing? Yes and no. Back when I used the Newton, I found myself relying on it constantly to stay connected whenever I was away from my home office, and remember, technology wasn’t nearly as advanced then as it is today. The Rabbit gives me that same vibe.

With my Newton, I could send and receive email and faxes when connected by an RJ-11 cable to a modem. I did this anywhere I could find an open RJ-11 jack, even on airplanes, where I plugged into the Airphone in the seatback in front of me. I ran apps that connected me to CompuServe, AOL, and MCI Mail. I even had a BellSouth pager card plugged into the Newton, providing me with two-way paging.

This was all long before the first BlackBerry, which meant I could stay in touch and keep up on things in an era when staying connected was becoming as important as it is today. That visceral feeling returned when RIM added a keyboard and two-way texting and email really went mobile. That’s how I feel about the Rabbit: not for what it does today, but for what it means for the future.

Let’s face it: voice technology before AI’s Netscape moment (when ChatGPT hit us all) was confined to rudimentary commands and responses. Today, it is on the brink of a revolution. The limitations of the early voice assistants we’ve been using, like Siri and Alexa, were stark: their vocal cords were tied to their programming, and they could only understand a specific set of instructions.

Today, as Chen discusses, the evolution toward AI-driven, large language model-based systems is heralding a new era of interaction. These advanced models learn and adapt from vast datasets, enabling a conversational depth that earlier technologies could scarcely imagine, though the technologists behind them could clearly envision it. And those LLMs are getting to the point where they are now learning from us.

The transformative potential of AI in voice technology is not just about its ability to understand and generate natural language. It’s about making technology accessible on a deeply personal level.

From where I sit, the day is coming when our AI voice assistants will not only respond to our queries but anticipate our needs, learn our preferences, and adapt to our unique life contexts, all through the most natural interface known to us: spoken language. This shift is significant and underscores a broader movement in technology toward systems that understand not just the command but the context, not just the question but the nuance of its phrasing. This “in-context” element is the crucial factor in AI voice.

The implications for interacting with devices, from smartphones to home automation systems to our cars, are profound. AI’s role in this transformation is not just supportive but central, driving the creation of interfaces that are more intuitive, more responsive, and, crucially, more human.

As we stand on the brink of this new frontier, it’s clear that voice is more than just a communication medium: it’s the future of human-computer interaction. And in this future, AI doesn’t just assist; it connects, understands, and engages with us in ways that are fundamentally reshaping our relationship with technology.

With its immediacy and ease, voice promises to lead this charge, making our interactions with technology more efficient and more inherently human. I agree with Chen. As someone who has been online since 1980, when text was the mode and medium, it’s evident to me that voice is not merely a tool but a transformative force in the AI epoch, an era where our voices bring technology to life.