How to Create a Voice Bot Without Google
Google is one of the most famous tech companies in the world. Its solutions are present in nearly every area of our lives, and software development is no exception. But what if something prevents us from using its tools? Take a look at our article about developing software without Google tools.
Or even better, how to create ANYTHING without Google? It sounds impossible, yet intriguing. What’s it like to develop software without using the popular development tools from one of the most powerful tech companies in the world?
Well, we have already figured that out.
Why not Google?
Google seems to be everywhere. The goods and services provided by this company seem truly limitless, from Google Search to development tools like Angular and Lighthouse. Google’s name has even become a verb for searching the Internet. What other company can brag about this level of achievement?
Besides data privacy concerns, an important factor is that Google doesn’t work in China because of the Great Firewall. China strictly regulates access to Western Internet services, including Google’s, which makes using its development tools there nearly impossible. If you want to build an app for China, Google tools will definitely cause problems.
So really, no Google?
Yes. Absolutely no Google.
If you are concerned about the privacy of your company’s data, here you’ll find alternative ways to implement your projects.
Let’s look at a customer service voice bot as an example of software that can be developed without Google from the beginning to the very end.
Disassembling a Voice Bot
An efficient customer service voice bot should perform five main functions:
- Receiving and redirecting calls
- Speech recognition
- Intent routing
- Functional bots
- SMS services
Each function needs its own development tool for implementation, so we’re going to disassemble a voice bot and see what non-Google tools can be used to put it together.
1. Receiving and redirecting calls
This is where the customer flow begins: the customer dials the contact center. The task of the voice bot here is to receive the call and start a conversation. At this point the bot can greet the user, determine whether the user’s phone number already exists in the company database and recognize the user’s language.
When it comes to calls, the only available Google product is Google Voice. This service provides call forwarding, voicemail and messaging services.
There’s only one problem here. Google Voice is a standalone product and cannot be integrated into a voice bot.
However, Google has already developed a workaround. They have a beta version of the Dialogflow phone gateway feature that enables Dialogflow to build conversational IVR (interactive voice response) solutions for chatbots. But the beta version has limited functionality. It only works when connected to a Google Voice number. So it currently can't be used for voice bot development.
To implement this part of our voice bot, we can use Amazon Connect or Twilio Programmable Voice. Amazon Connect is an AI-based cloud contact center that helps improve customer service. It is easy to deploy and use, and it lets users manage the company’s text chats and voice bot in one UI, which makes Amazon Connect one of the strongest tools in this industry.
The Twilio Programmable Voice API can make, manage and route calls to a browser, an app, a phone, or anywhere a call can be taken. Moreover, Twilio helps reduce call latency, making the conversation sound more natural.
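To make the call-receiving step more concrete, here is a minimal sketch of what a webhook could return when a call comes in. Twilio answers calls by reading TwiML, an XML document with verbs such as Say and Gather. The sketch only builds that XML with the Python standard library. The customer table, the phone numbers, the greeting texts and the `/route-intent` URL are all assumptions made for illustration, not part of any real project.

```python
import xml.etree.ElementTree as ET

# Hypothetical customer table keyed by phone number
# (a stand-in for the company database).
KNOWN_CALLERS = {
    "+15550100": {"name": "Alice", "language": "en-US"},
}


def build_greeting(caller_number, action_url="/route-intent"):
    """Return a TwiML string that greets the caller and gathers speech."""
    caller = KNOWN_CALLERS.get(caller_number)
    language = caller["language"] if caller else "en-US"
    if caller:
        greeting = f"Welcome back, {caller['name']}."
    else:
        greeting = "Welcome to our support line."

    response = ET.Element("Response")
    # Gather collects the caller's speech and posts the result to action_url
    # (a hypothetical endpoint of our own bot).
    gather = ET.SubElement(
        response, "Gather",
        input="speech", action=action_url, method="POST", language=language,
    )
    say = ET.SubElement(gather, "Say", language=language)
    say.text = f"{greeting} How can we help you today?"
    # Spoken only if the caller stays silent and Gather times out.
    fallback = ET.SubElement(response, "Say", language=language)
    fallback.text = "Sorry, we did not hear anything. Goodbye."
    return ET.tostring(response, encoding="unicode")


print(build_greeting("+15550100"))
```

In a real deployment this string would be the HTTP response of the webhook configured for the phone number. A web framework and error handling are left out to keep the sketch short.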
Google released the Dialogflow phone gateway, but it’s only a beta version with limited functionality. Non-Google services like Amazon Connect and the Twilio Programmable Voice API have been on the market for a long time. Their solutions are trusted, and they offer a much wider set of functions.
2. Speech Recognition
A Speech to Text (STT) feature is an important link in processing a customer’s call. Converting speech into text helps the bot identify keywords needed for further redirection or to determine the language of the speaker.
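As a rough illustration of the keyword step, the sketch below assumes that an STT service has already returned a transcript as plain text and shows how a bot might pick out the keywords it cares about. The topic names and keyword lists are hypothetical. A production bot would rely on the natural language features of the chosen service rather than on simple word matching.

```python
import re

# Hypothetical keyword lists per topic; a real contact center would tune these.
KEYWORDS = {
    "booking": {"book", "ticket", "reserve"},
    "flight_info": {"flight", "departure", "arrival", "delay"},
}


def spot_keywords(transcript):
    """Return the topics whose keywords occur in the transcript."""
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    found = {}
    for topic, vocabulary in KEYWORDS.items():
        hits = words & vocabulary
        if hits:
            found[topic] = hits
    return found


print(spot_keywords("I'd like to book a ticket, please"))
```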
Google has achieved a lot in speech recognition. Their services have proved to be among the most up-to-date and easy to use, and Cloud Speech-to-Text is a great example of their success. With this Google product, developers can convert audio to text with the help of a simple API and neural network models.
The number of features is huge. The service can automatically recognize 120 different languages, process prerecorded audio as well as real-time streams, separate speakers in transcripts, cancel noise, suggest phrases, add correct punctuation, define multiple channels, and do all this at very high speeds.
However, despite all these advantages, this tool sometimes lacks accuracy in converting speech to written text and we still have those privacy issues.
Google’s closest competitors, Amazon and Microsoft, have also released STT services: Amazon Transcribe and Microsoft Speech to Text (part of the Speech service). Amazon’s tool is fast, transcribes accurately, can recognize several speakers and suggests phrases, just like Google. Microsoft’s service doesn’t lag far behind. With its own neural network models and the ability to translate real-time streams, their service is also worth mentioning.
Moreover, a lot of other companies have created their own high-quality, multifunctional STT solutions. For example, Speechmatics, IBM Watson Speech to Text or the Twilio Speech Recognition API could also be good alternatives to Google. While all these products can do real-time transcription, each also has its own special features. Speechmatics specializes in English dialects, Watson can adapt to a specific domain (terminology, acronyms, names, jargon, acoustic environments) and the Twilio service understands natural language.
All the non-Google services we’ve mentioned are good, but they also have their weaknesses. Google can still process the most languages, with Twilio running a close second at 119. Amazon Transcribe, the next closest, can process only two. Google can also handle punctuation and multiple channels.
It’s hard to compete with Google’s Cloud Speech-to-Text in this domain, but if Google can’t be used, there is a huge number of high-quality alternatives, including Amazon Transcribe, Microsoft Speech to Text, Speechmatics, IBM Watson Speech to Text, Twilio Speech Recognition API and many more. With a list like that, the choice depends only on your project’s goals.
3. Intent routing
Once the bot has recognized all the keywords and determined the user’s intent, it should redirect the call to the appropriate functional bot to deal with the customer’s question.
The Google service that helps with bot development is Dialogflow – a huge development suite for creating chatbots powered by machine learning.
The overall functionality of this tool is very broad, and with regard to intents, its structure is detailed and simple. It helps the bot define what the end-user really wants and returns the right response based on values extracted from the user’s phrases (name, place, time, etc.).
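The idea of pairing an intent with values extracted from the user’s phrase can be sketched in a few lines. This is not how Dialogflow works internally. It is a toy illustration that uses regular expressions, and the intent name, the patterns and the reply texts are assumptions chosen for the example.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ParsedRequest:
    intent: str
    values: dict = field(default_factory=dict)


# Hypothetical patterns for two kinds of values: a time and a destination.
TIME_PATTERN = re.compile(r"\b(\d{1,2}(?::\d{2})?\s?(?:am|pm))\b", re.IGNORECASE)
PLACE_PATTERN = re.compile(r"\bto ([A-Z][a-z]+)")
BOOKING_PATTERN = re.compile(r"\b(book|ticket)\b", re.IGNORECASE)


def parse(utterance):
    """Guess the intent and extract the values the response will need."""
    intent = "book_ticket" if BOOKING_PATTERN.search(utterance) else "unknown"
    values = {}
    time_match = TIME_PATTERN.search(utterance)
    if time_match:
        values["time"] = time_match.group(1)
    place_match = PLACE_PATTERN.search(utterance)
    if place_match:
        values["place"] = place_match.group(1)
    return ParsedRequest(intent, values)


def respond(request):
    """Build a reply from the intent and whatever values were extracted."""
    if request.intent != "book_ticket":
        return "Sorry, could you rephrase that?"
    place = request.values.get("place", "your destination")
    time = request.values.get("time", "a time of your choice")
    return f"Looking for tickets to {place} at {time}."


print(respond(parse("Book a ticket to Berlin at 9:30 am")))
```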
If we imagine that Dialogflow had never been developed, here are some alternatives that can be used to set up intent routing.
Amazon Lex helps build high-quality solutions for chat and voice bots. Bots built with this service function mostly like the Google service. The bot is activated, it extracts all necessary values, it recognizes the user’s intent and then responds in the appropriate way. In addition, Amazon Lex has an Intent Chaining feature that simplifies conversations by dividing them into smaller parts. That can be useful if, for example, you want your bot to ask the user additional questions once the intent is determined.
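The follow-up-question idea can be illustrated with a small sketch. It does not use the Amazon Lex API and is not a model of its Intent Chaining feature. It only shows the general pattern of asking for whatever information is still missing once the intent is known. The intent name, the required fields and the prompts are hypothetical.

```python
# Hypothetical list of values each intent needs before it can be fulfilled.
REQUIRED_VALUES = {
    "book_ticket": ["destination", "date"],
}

# Hypothetical follow-up questions, one per missing value.
PROMPTS = {
    "destination": "Where would you like to fly?",
    "date": "On what date?",
}


def next_prompt(intent, collected):
    """Return the next follow-up question, or None when nothing is missing."""
    for name in REQUIRED_VALUES.get(intent, []):
        if name not in collected:
            return PROMPTS[name]
    return None


collected = {}
print(next_prompt("book_ticket", collected))
collected["destination"] = "Berlin"
print(next_prompt("book_ticket", collected))
```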
Another powerful service for intent routing is Houndify, or, to be precise, Houndify Custom Commands. With their help, it’s possible to set different parameters for intents and phrase templates so they can be identified more accurately. For example, you can use this tool to define different words that can be said or not said in the intent, or set the probability of the user saying particular words during the call. If we compare functionality, Houndify is more flexible and works more smoothly than Amazon Lex or Dialogflow.
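The sketch below illustrates the concept of words that must or must not appear in an intent, plus weights for words the user is likely to say. It is a simplified stand-in written for this article and not the Houndify Custom Commands API. The class, the templates and the weights are all hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class IntentTemplate:
    name: str
    required: set                                  # words that must be said
    forbidden: set = field(default_factory=set)    # words that must not be said
    weights: dict = field(default_factory=dict)    # likely words -> bonus

    def score(self, utterance):
        words = set(re.findall(r"[a-z']+", utterance.lower()))
        if not self.required <= words:
            return 0.0
        if words & self.forbidden:
            return 0.0
        bonus = sum(w for word, w in self.weights.items() if word in words)
        return 1.0 + bonus


TEMPLATES = [
    IntentTemplate("book_ticket", {"book"}, {"cancel"}, {"ticket": 0.5}),
    IntentTemplate("cancel_booking", {"cancel"}, set(), {"booking": 0.5, "ticket": 0.5}),
]


def best_intent(utterance, templates=TEMPLATES):
    """Return the name of the highest-scoring template, or None if none match."""
    score, name = max((t.score(utterance), t.name) for t in templates)
    return name if score > 0 else None


print(best_intent("Please book a ticket"))
```

Listing "cancel" as a forbidden word for the booking intent is what keeps a phrase like "cancel the ticket I wanted to book" from being routed to the booking bot.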
Intent routing can be implemented within Google’s Dialogflow, but since its privacy leaves much to be desired, Amazon Lex with its Intent Chaining or the more flexible Houndify Custom Commands would be a better choice.
4. Functional bots
The core of a voice bot is a set of smaller bots that perform different functions depending on the user’s intent. For example, for an airport contact center, one bot would handle ticket booking, another would take care of people who want flight information, while another bot would be busy processing customer ratings and so on.
And we are back to Dialogflow. The functionality of this tool is really vast, so using it to build your own chatbot from scratch isn’t too difficult. The agents – these very bots with multiple settings – control the conversations with end-users and collect the necessary information for further processing (which can be a problem if you need to keep your data safe). While the bot can be built for only a limited number of platforms, Google has provided open source integrations that allow anyone to add a needed platform to the main list (like Twilio, Twitter, and Viber).
Here Amazon Lex and Houndify can be heroes too! These services were created to build conversational interfaces for various products like websites and contact centers. Along with Dialogflow, both have an enormous range of functions for creating and supporting chatbots, both voice and text. Lex and Houndify can transcribe speech to text and understand natural language with a very high degree of accuracy.
But the most powerful solution is the combination of these two tools. Houndify accurately recognizes the intent, then triggers the correct Amazon Lex bot, and the dialogue follows the flow we set.
Google’s main tool for bot development is Dialogflow with its multiple functions, but if Google is not an option, Amazon Lex and Houndify can save the day. They are easily scalable, convenient to use, and as capable as Dialogflow.
Lifehack: A good voice bot can do a lot of things, but since some questions still need the expertise of a real person, one bot should have the ability to redirect the call to a live agent.
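The dispatch to functional bots, including the hand-off to a live agent, can be sketched as a simple lookup table. The bot functions and intent names below are hypothetical placeholders. In the setup described above, each handler would call a separate bot built with a service such as Amazon Lex.

```python
def booking_bot(utterance):
    return "Booking bot: let's find you a ticket."


def flight_info_bot(utterance):
    return "Flight info bot: checking the schedule."


def rating_bot(utterance):
    return "Rating bot: thank you for your feedback."


def live_agent(utterance):
    # Fallback for anything the functional bots cannot handle.
    return "Transferring you to a live agent."


# Hypothetical mapping from recognized intent to the bot that handles it.
FUNCTIONAL_BOTS = {
    "book_ticket": booking_bot,
    "flight_info": flight_info_bot,
    "rate_service": rating_bot,
}


def dispatch(intent, utterance):
    """Send the call to the matching bot, or to a person if there is none."""
    handler = FUNCTIONAL_BOTS.get(intent, live_agent)
    return handler(utterance)


print(dispatch("book_ticket", "I need a ticket to Berlin"))
print(dispatch("lost_luggage", "My suitcase never arrived"))
```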
5. SMS services
The role of SMS in the modern world is decreasing since these days nearly everyone has access to the Internet and uses chat. However, SMS can still play its part in communication between a customer and a company. In a voice bot, SMS can work both ways: a text message can be sent to the user to let them approve something (such as order details), or the user can send one to the bot (such as a rating of the service quality).
Here Google has nothing to brag about. As we already mentioned, Google Voice provides the opportunity to send text messages, but only as a standalone product that cannot be integrated into a voice bot.
If you need SMS messaging, Twilio Programmable SMS is the best choice. With this API you can build a stable, fast and smooth messaging system that does all the hard work and saves the company hours of manual SMS handling.
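As a sketch of what sending such a message involves, the code below prepares a request to Twilio’s Messages REST endpoint using only the Python standard library. It builds the request but does not send it. The account SID, auth token, phone numbers and message text are placeholders, and a real project would more likely use Twilio’s official helper library.

```python
import base64
import urllib.parse
import urllib.request


def build_sms_request(account_sid, auth_token, from_number, to_number, body):
    """Prepare (but do not send) a POST request that would send one SMS."""
    url = (
        "https://api.twilio.com/2010-04-01/Accounts/"
        f"{account_sid}/Messages.json"
    )
    form = urllib.parse.urlencode(
        {"To": to_number, "From": from_number, "Body": body}
    ).encode("utf-8")
    # HTTP Basic auth with the account SID as user and auth token as password.
    credentials = base64.b64encode(
        f"{account_sid}:{auth_token}".encode("utf-8")
    ).decode("ascii")
    request = urllib.request.Request(url, data=form, method="POST")
    request.add_header("Authorization", f"Basic {credentials}")
    return request


# Placeholder credentials and numbers, for illustration only.
request = build_sms_request(
    "ACxxxx", "secret-token", "+15550100", "+15550101",
    "Your order is confirmed.",
)
print(request.full_url)
# Sending would be: urllib.request.urlopen(request)
```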
Google has nothing for SMS except for Google Voice, which cannot be integrated into a voice bot. So Twilio Programmable SMS is a must-have tool for voice bot development.
To sum up
There’s no question that Google is everywhere around us: in our phones, computers, watches, houses and so on. It may still be hard to believe that you can create a high-quality software product without its services, but as the voice bot example shows, every part of it can be built with non-Google tools.