How Uniphore Leverages DataStax Astra DB for Real-Time, Multi-Modal Sentiment Analysis
Uniphore helps companies improve sales and customer service conversations with artificial intelligence (AI). The Palo Alto, Calif.-based company’s multi-modal emotion AI platform enables organizations to improve their representatives’ performance and customers’ engagement, reduce response times and costs with process automations, and boost revenue.
We spoke with Saurabh Saxena, Head of Technology and VP of Engineering at Uniphore, to learn more about the company and why it turned to DataStax for a managed database-as-a-service built on Apache Cassandra®.
1. Tell us about Uniphore and what makes it unique.
Uniphore started with the idea of introducing AI into relevant workflows in the contact center as it relates to agent interaction with customers. The main areas where AI can be applied involve understanding the intent of the caller, making sure all the action items are automatically captured without human intervention, and assisting the agent in making quicker and more accurate decisions while on the call with a customer.
If a workflow is to be kicked off, for example, if somebody wants a copy of their bill, then there are processes that you can automate through robotic process automation (RPA). Or you can implement a self-service bot to chat and answer common customer questions. These customer support areas, where AI is used behind the scenes, are where the company's mojo is right now.
We started exploring the possibilities for applying emotion AI in any professional human conversation. We believe that conversations are the greatest source of information to any company, more than any dataset. If you look at scientific research, however, 75 percent of communication is nonverbal. The tone, facial expressions, and body pose, when combined with the words used, complete a conversation.
Our technology for analyzing tone is based on the six basic emotions in the Ekman model: fear, anger, disgust, happiness, sadness, and surprise. For example, if a person gets excited about something, even a baritone voice will go up higher in pitch.
We can also do a similar behavioral model of emotion from facial expressions and the words a person uses. We combine all of these to help employees who are talking to customers, to understand if they are projecting correctly and using empathy, politeness, and confidence to connect with them at a human level.
Uniphore sets itself apart by analyzing multi-modal inputs. Most companies today that say they have emotion AI just look at words. We focus on three modes: tone, facial expressions, and words.
The first enterprise function we targeted is sales. We have plans for building more applications for different enterprise functions. Wherever there is a conversation happening between humans, and wherever the emotion involved is important, this technology can play a role in the apps we develop on our platform.
2. What are some of the key business challenges and IT-related data challenges your company is facing?
One challenge we have is that there are incumbents in this conversational intelligence market. They don’t use the emotion angle. It’s more rote functionality like recordings, transcripts, and notes. They are what I would call the 1.0 version. They tried to automate certain things about these conversations, but they didn’t use the emotion aspect in the conversations. Now the 2.0 wave is coming.
Another challenge that we have to deal with is the trust factor: our customers and our customers' customers need to trust that there is nothing unusual about emotion AI. For example, we never do facial recognition. We don't build profiles. We only tell you about the engagement and sentiment in individual meetings. That's good information for salespeople to have, because then they can start thinking about how to get the customer engaged in the conversation.
Companies are coming to us because they want to invest in understanding emotion in digital conversations. When you meet face-to-face, there's a lot of camaraderie that you can gather. Digital forums can make that more challenging. The more forward-thinking CROs and CMOs are starting to see the value of emotion.
As far as data challenges, to do this emotion analysis, we have to handle about 200 data points on a face every 1/24th of a second for every participant in a meeting. The app uses computer vision to read facial landmarks like frown lines on the forehead and between eyebrows to note if someone is frowning. It monitors eyes to know if participants are engaged, along with cheek, chin, and head postures.
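To make the scale concrete, here is a rough back-of-the-envelope sketch of the facial data volume implied by those figures. The 200 points and 24 samples per second come from the interview; the function names, the meeting size, and the meeting length are hypothetical, and audio and text signals are not counted.

```python
# Back-of-the-envelope estimate of raw facial-landmark volume.
# The 200 points and 24 samples per second come from the interview;
# the meeting size and duration used below are hypothetical.

LANDMARKS_PER_FACE = 200   # "about 200 data points on a face"
FRAMES_PER_SECOND = 24     # one sample every 1/24th of a second


def landmark_points_per_second(participants: int) -> int:
    """Facial data points produced per second for a whole meeting."""
    return LANDMARKS_PER_FACE * FRAMES_PER_SECOND * participants


def landmark_points_per_meeting(participants: int, minutes: int) -> int:
    """Facial data points produced over a meeting of the given length."""
    return landmark_points_per_second(participants) * minutes * 60


if __name__ == "__main__":
    # One participant for one second.
    print(landmark_points_per_second(1))
    # Hypothetical meeting: 5 participants, 30 minutes.
    print(landmark_points_per_meeting(5, 30))
```

Under these assumptions, a single participant accounts for 4,800 facial data points per second, before any audio is added.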
High frame rates are necessary to capture very quick facial muscle movements, plus the audio on top of that, which all exponentially increases our data volumes. That's a huge technical challenge for this product. We have nine or ten different AI models that need to run in real time and we have to handle this large volume of data. That’s where DataStax comes in.
3. What’s your past experience personally with open-source Apache Cassandra and how did Uniphore come to be a DataStax customer?
I have been using Cassandra for about 10 to 12 years. Now, of course, I don't get to code as much as I would want to. In the 2010s, I worked at an IoT company where we were also dealing with massive volumes of data. There were five million electric utility subscribers, with every meter emitting data every 15 minutes for history. 675 billion in one month, running on Cassandra.
I have used all of the technologies in the space. I started using Hadoop in the early 2010s, and I used MongoDB, ScyllaDB, and so on. These are alternatives to Cassandra. I'm not saying those technologies were bad, just that my comfort factor and confidence level in Cassandra have always been very high.
I first got in touch with DataStax in about 2014-2015, when my company at that time had a cloud-scale deployment on open-source Cassandra. At that time, the reason to look for help was because I needed more than one person to review data models for the most optimal implementation in Cassandra.
With Cassandra, how you build your data model can make a sea of difference between whether it performs well or gives you all sorts of problems. Because you're storing such large volumes of data, changes are not easy. You can't just at the drop of a hat change your data model, because then you have to worry about migration and redoing the dataset that you've been shoring up.
Initially, I looked up DataStax to get help making sure that the data was modeled right so that it could be queried very fast, even if the writes were slower, and that the configurations for the read and write replicas were set up right. I found the management of Cassandra can be very difficult if you do it yourself, so I was very interested in some of the admin functions that DataStax was providing.
One of the reasons Uniphore started with open-source Cassandra was because when you bootstrap development of a new product, if you're two or three engineers trying to do stuff, you try to minimize your outpoints. I didn't have the bandwidth to deal with a vendor and a partner and so on, so we started with open source.
Uniphore never touched production without DataStax. We wanted a more stable way of handling Cassandra instead of my people running Docker containers and worrying about backups and outages, the cost of running it, the constant worry about a worst-case scenario.
The large data volumes that we deal with and the elasticity that the computation needs would have required significant engineering effort on our end if we had to do it with open-source Cassandra. We also like the features that DataStax has added for its admin console. Those are very useful for us.
Right now, we can create these spaces in seconds instead of looking for an engineer to do it, so it accelerates our time to market, improves our costs, and delivers great stability value. We had to do a massive migration from our own organic Cassandra running on open source when we moved to DataStax Astra DB, but despite that complication, I thought it was worth it, and we've been proven right since.
4. How does Uniphore’s platform work and what kind of data volumes are involved that make Astra DB a particularly apt solution?
We use computer vision to analyze facial expressions, in addition to analyzing the audio for tone and, with natural language processing (NLP), for the words spoken. The massive volume of data that is generated every second has to be stored and processed, both in real time and post-call, in a very small amount of time. Essentially, we have Apache Kafka in the middle. We run nine to ten deep learning models or neural networks on the NVIDIA Triton and NeMo platforms, on GPUs.
These are all either TensorFlow or ONNX [Open Neural Network Exchange] deep learning models. They run and manufacture this data, what we call raw matrix, and it's placed into Kafka, and then we have a process that reads it from Kafka and unloads it into Astra DB. Then we have another process that's reading Astra DB in real time and doing the emotion fusion to reconcile the information from the three different signals in another deep learning model. We read the data from Astra DB, process a lot of it, then produce the sentiment engagement values that we deliver to our clients. They are pretty accurate, with 70 to 80 percent accuracy.
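The flow described here (models publish raw output, one process loads it into the database, another reads it back and fuses the three signals) can be sketched in a few lines. This is only an illustration under loud assumptions: a `deque` stands in for the Kafka topic, a `dict` stands in for the Astra DB table, the record fields are invented, and a plain average replaces the deep learning fusion model that Uniphore actually uses.

```python
from collections import deque
from statistics import mean

# Stand-ins: a deque plays the role of the Kafka topic and a dict plays the
# role of the Astra DB table. Neither reflects a real client API.
topic = deque()
table = {}  # (meeting_id, participant_id) -> list of raw model outputs


def publish(record: dict) -> None:
    """Model side: push one raw model output onto the 'topic'."""
    topic.append(record)


def drain_topic_into_table() -> int:
    """Loader side: move every queued record into the 'table'."""
    moved = 0
    while topic:
        record = topic.popleft()
        key = (record["meeting_id"], record["participant_id"])
        table.setdefault(key, []).append(record)
        moved += 1
    return moved


def fuse(meeting_id: str, participant_id: str) -> float:
    """Fusion side: average the latest score from each modality.

    A plain average is a placeholder for the deep learning fusion model.
    """
    latest = {}
    for record in table.get((meeting_id, participant_id), []):
        latest[record["modality"]] = record["score"]  # later records win
    return mean(list(latest.values()))
```

Keeping the loader and the fusion step as separate readers mirrors the description above, where writes into the database and real-time reads out of it are handled by different processes.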
We have these massive reads and massive writes that happen. In the last month, we probably averaged about 1.5 to 2 million reads per day, on still a very nascent product, with only a handful of smaller customers. We're about to go big, and we're already hitting 29 to 30 million writes per day, two to three million reads a day, and our day is very short, because all our early customers are in the Pacific Time Zone, so it's really a six-hour period where most of the key meetings are happening.
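Because the load is concentrated in a short working day, the per-second rate is higher than the daily totals suggest. The sketch below does that arithmetic, assuming for simplicity that all operations fall inside the six-hour window (the interview says only that most key meetings do). The function name is hypothetical.

```python
SECONDS_PER_HOUR = 3600


def average_ops_per_second(ops_per_day: int, busy_hours: float) -> float:
    """Average rate if a day's operations all land within `busy_hours`."""
    return ops_per_day / (busy_hours * SECONDS_PER_HOUR)


if __name__ == "__main__":
    # 29 to 30 million writes squeezed into a six-hour window.
    print(round(average_ops_per_second(29_000_000, 6)))
    print(round(average_ops_per_second(30_000_000, 6)))
    # The same 30 million writes spread evenly over 24 hours, for comparison.
    print(round(average_ops_per_second(30_000_000, 24)))
```

Under that assumption, the average comes to roughly 1,340 to 1,390 writes per second, about four times the rate an even 24-hour spread would give. Real peaks inside the window would be higher than this average.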
We have a real-time application that can run on the right-hand side of a Zoom meeting to show the sentiment data of every participant. Every three to five seconds it will refresh, so meeting hosts can see if they’re losing somebody. We call it Read the Room.
After a meeting, the hosts get an email or go into the application to see the full post-call analysis. They can see the emotion sentiment and engagement for each participant. The app can show how it dipped at a certain point in time, so they can drill down on that and see why. We use a deep learning model based on optical character recognition (OCR) to show which slide was being presented when the emotions dipped.
We also have this concept of what we call critical moments. We look at emotion dips and also when more than one person mirrors a particular emotion. We identify those critical moments.
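The two signals mentioned here, dips and mirrored emotions, can be illustrated with a simple rule-based sketch. This is not Uniphore's method: the 0 to 100 score scale, the drop threshold, the "neutral" label, and both function names are assumptions made for the example.

```python
from collections import Counter


def find_dips(scores, drop=20):
    """Indices where the score falls by at least `drop` from the previous sample.

    `scores` is a hypothetical sentiment series on a 0 to 100 scale.
    """
    return [i for i in range(1, len(scores)) if scores[i - 1] - scores[i] >= drop]


def find_mirrored(emotions, min_people=2):
    """Time steps where at least `min_people` share the same non-neutral emotion.

    `emotions` maps a participant name to a list of emotion labels per time step.
    """
    steps = min((len(labels) for labels in emotions.values()), default=0)
    moments = []
    for t in range(steps):
        counts = Counter(
            labels[t] for labels in emotions.values() if labels[t] != "neutral"
        )
        if any(n >= min_people for n in counts.values()):
            moments.append(t)
    return moments
```

For example, `find_dips([70, 72, 50, 55, 30])` flags the samples at positions 2 and 4, where the score drops by more than 20 points.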
Right now we are using Kafka and then Astra DB. Our streaming is happening on Kafka. I’ve been thinking about using DataStax Astra Streaming at some point. We use AWS, but did not see Amazon Keyspaces as the right solution for us. It’s not Cassandra. It's a CQL layer over a DynamoDB storage engine. With the kind of volume that we produce, we needed the storage optimization and partitioning that Cassandra offers in real time.
We currently beat the speed of light, processing 24 frames a second in less than a second for our real-time app. Our tonal data processing takes about 200 milliseconds, so five times a second, and our computer vision processing for facial expression analysis is 24 times a second. Any kind of latency that we add is going to have a negative effect on the real-time side. Keyspaces would add too much latency.
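The rates quoted here translate into per-item time budgets, which is why added latency matters. The sketch below assumes items are handled strictly one at a time with no batching or pipelining, which may not match the real system; the processing and added-latency figures in the usage lines are hypothetical.

```python
def per_item_budget_ms(items_per_second: float) -> float:
    """Time available per item, in milliseconds, at a given rate."""
    return 1000.0 / items_per_second


def remaining_budget_ms(
    items_per_second: float, processing_ms: float, added_latency_ms: float
) -> float:
    """Budget left after processing and extra latency; negative means falling behind."""
    return per_item_budget_ms(items_per_second) - processing_ms - added_latency_ms


if __name__ == "__main__":
    print(per_item_budget_ms(5))    # tonal analysis, five times a second
    print(per_item_budget_ms(24))   # facial analysis, 24 times a second
    # Hypothetical: 30 ms of processing plus 20 ms of extra database latency.
    print(remaining_budget_ms(24, 30.0, 20.0))
```

At 24 items per second the budget is under 42 milliseconds each, so in this simplified model even 20 milliseconds of extra latency consumes close to half of it.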
5. What are the benefits of working with DataStax that you’re noticing?
I appreciate the way DataStax customer success and account management functions work. I've never had an issue of delayed response, never, and we are going on six months now.
The platform has obviously performed phenomenally well for our volumes and at a fraction of the cost. We pay AWS a lot, so we appreciate that we can do all this data processing through DataStax at a much more competitive cost. The cost-benefit is beyond anything that we could do ourselves. I would say that there's about 2x to 3x benefit in cost going with Astra DB compared to running an equivalent cluster ourselves and paying for compute and storage.
As far as developer and IT team productivity, we don't have to worry about the elasticity, the stability, the backups, the recoveries, the security, any of that. Now our engineers can focus on our core business: the product. That's engineering strength that goes back into how to make our fusion models and our computer vision processing on NVIDIA better.
I do not worry about whether the engine is running or humming. Usually, about 50 percent of engineering effort goes into that alone. Not having to worry about that is a huge benefit, from not just the engineering cost perspective, but to be honest, the headache perspective. My stress levels are much lower knowing that this data is being managed by DataStax. That's eliminated. I can't put a number on it. To me, that’s priceless.
We ran into some technical challenges because we were handling massive amounts of data, so we worked with the DataStax professional services team. On the consulting team’s advice, we decided to modify how we structure our partition keys and how much of the data we retrieve at one time. That required transitioning the data model to a certain extent. They helped us understand the easiest and best way to do it. DataStax consulting services helped us transition from one data model to another in a seamless manner, very quickly, without false starts that might otherwise have cost us a week or two.
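The interview does not say what the revised partition keys look like. One common Cassandra pattern for high-volume time series is to add a time bucket to the partition key so that each partition stays bounded and each read touches a known number of partitions. The sketch below illustrates that general pattern only; the key fields, the 60-second bucket, and the function names are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def partition_key(meeting_id: str, participant_id: str, ts: datetime,
                  bucket_seconds: int = 60) -> tuple:
    """Hypothetical composite partition key with a time bucket.

    At 24 frames per second, a 60-second bucket holds at most
    24 * 60 = 1440 frames per participant.
    """
    bucket = int(ts.timestamp()) // bucket_seconds
    return (meeting_id, participant_id, bucket)


def buckets_for_range(start: datetime, end: datetime,
                      bucket_seconds: int = 60) -> list:
    """Bucket numbers a reader must query to cover the range [start, end]."""
    first = int(start.timestamp()) // bucket_seconds
    last = int(end.timestamp()) // bucket_seconds
    return list(range(first, last + 1))


if __name__ == "__main__":
    start = datetime(2023, 1, 1, tzinfo=timezone.utc)
    # A read covering 150 seconds touches three one-minute buckets.
    print(buckets_for_range(start, start + timedelta(seconds=150)))
```

The bucket size is the knob that trades partition size against the number of partitions per read, which is the kind of tuning the paragraph above alludes to.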
Read the full case study to learn more about Uniphore and why the company chose DataStax Astra DB to analyze sentiments with AI in real time.