Voice will be the next big development platform, according to Jeremy Liew of Lightspeed Venture Partners, speaking at CB Insights’ Innovation Summit.
While bots, AR/VR, and voice control all have the potential to be the next important UI platform, Liew put his money on voice, which “allows us a new mode of interaction.”
“I think what excites me most about voice is that you don’t need to be able to spell or type to be able to interact with the computer,” Liew said.
It’s not just young kids who can suddenly operate the speaker or adults who can control their thermostats verbally. Liew pointed out that billions of people in developing countries who still aren’t online could have a new way to adopt and interact with connected devices through voice, which creates a whole new audience segment.
Liew was joined by Arthur Johnson, vice president of strategy, corporate development & global partnerships at Twilio, in a panel discussion on new user interfaces led by Ari Levy of CNBC.
As voice takes off, the two big players right now are Amazon and Google. And both Johnson and Liew were bullish on the leg up Amazon already had in voice UI.
“I’m always amazed at how naturally I can interact with Alexa. I can talk fast, I can talk slow, I put a fake accent on sometimes to try to trick it,” Johnson told the audience.
Liew also pointed out that despite the Echo being a relatively new product, it’s already in about 6% of US households.
But on the question of whether Amazon would be able to hold onto its early lead as the primary voice development platform, Johnson noted that there is still very much a race under way. The advantage that Apple or Google has is that they have tied their voice assistants to the OS in your pocket.
What it may really come down to is which platform sparks developers to create new apps and use cases for this new user interface. Johnson noted that companies developing for these platforms shouldn’t go too wide with their products. “Go as narrow as you can, get the tech right, train the model, [and then] branch out,” he said.
“If you try to build a generalized solution, you need to have more data than incumbents. And that’s hard to imagine if your incumbents are Facebook, Apple, Google, [and other big tech companies].”
Johnson gave the example of healthcare, where a virtual assistant can gauge its own confidence in answering a patient's question, respond automatically when confidence is high, and redirect the patient to a human when confidence falls below a threshold.
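To make that mechanism concrete, here is a minimal Python sketch of a confidence-threshold handoff of the kind Johnson describes. The keyword classifier, the 95% threshold, and the queue names are illustrative assumptions, not Twilio's actual implementation.

```python
# Minimal sketch of confidence-based bot/human handoff.
# The classifier below is a stand-in for a real NLU model;
# topics, threshold, and queue names are invented for illustration.

CONFIDENCE_THRESHOLD = 0.95

def classify(question):
    """Stand-in classifier: returns (topic, confidence score)."""
    if "prescription" in question.lower():
        return "pharmacy", 0.97
    return "general", 0.60

def handle_patient_question(question):
    topic, confidence = classify(question)
    if confidence >= CONFIDENCE_THRESHOLD:
        # High confidence: answer automatically.
        return {"handled_by": "bot", "topic": topic}
    # Low confidence: escalate to a human (doctor, nurse practitioner,
    # technician) based on how the content was classified.
    return {"handled_by": "human", "queue": topic, "text": question}

print(handle_patient_question("When can I get my prescription filled?"))
# -> {'handled_by': 'bot', 'topic': 'pharmacy'}
```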
But with new platforms that run on AI alone, without human intervention, Liew said the focus would first have to be on low-risk sectors like shopping or entertainment, where "nobody dies" if the virtual assistant gets it wrong.
Transcript:
Ari, senior tech reporter, CNBC: Thank you. All right, well, we've got you on a slow news day. So thanks for being here. Thanks for showing up early. So we're talking about bots and voice communication and AI, user interfaces. Arthur, let's start with you, being at Twilio, a company that's right in the middle of working kind of behind the scenes on all of this, can you talk a little bit about the state of play of bots? We hear about them all the time. They're on all the platforms. But I remember communicating with my cable company and my phone company 10 years ago, and I would go online, and the thing that was responding to my inquiry was automated…it was a computer. So what's happened, where are we now, and why has this become such a big topic over the last couple of years?
Arthur Johnson, Corporate Development and Partnerships, Twilio: Thanks, Ari. So I think that the thing that's different now is that before, bots were more like automated decision trees, so very linear, very static. But now the technology has gotten a bit more intelligent, it's better able to respond to different inquiries from customers. And so I think two things have changed: one, the technology has gotten a lot better. There's more data now. The algorithms behind the bot technology are better. And the interface is different now too.
So a lot more people are comfortable kind of talking to companies via text or via chat as opposed to calling on the phone. It's easier to do these bots via that interface, versus the interface from before. I do think there's still a lot of discussion about bots, and there are a lot of companies that are experimenting publicly and privately. I think that the promise is very big, but right now it's still very early in this space. The one thing that you'll start to see is that there are some interactions that are probably better suited for bot technology today than others: limited-domain, very controlled environments, kind of customer service interactions, that's better suited today for bots. The more complicated interactions, when you're trying to troubleshoot, etc., may not be as well suited for bot technology today, but I think there's still a lot of promise in this area.
Ari: What are some of the applications that you can envision as the next iteration if the current iteration is text and kind of automated replies?
Arthur: I think you’re gonna see two things. One is gonna be, I think customer service is gonna be reimagined in the next two years, three years with this automated bot technology. So think about if you are a shopper shopping at a large retailer and they know your preferences, they know exactly what you want to buy, what size you are, you can get a text from them about, “Hey, we have these shoes and stuff right now in the inventory, do you want these shoes?” You can look at the shoes. You can visualize the shoes, and then transact the shoes all via the interface, via an SMS, for example.
You can also imagine mobile commerce being a lot better with bot technology. So I had to get my wife's car fixed last week, and as I got into the shop, they put my name in the computer, and every interaction after that was via text, some automated, some human assisted, so I think automated interaction but also kind of…
Ari: And so you’re actually at…you’re right next to the customer service rep that you’re communicating via text?
Arthur: Exactly. So some of that communication was automated, because they knew exactly what to say, and some was via the rep. And so it's interesting. And I was able to pay the invoice as well via text, and so it was really a great experience to have.
Ari: Like hybrid bot.
Arthur: Exactly. And I think the bots that are more human-assisted are the ones that will gain more traction in the interim, as opposed to trying to go for 100% automation.
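As a rough illustration of the automated half of that hybrid flow, the sketch below sends a service update over SMS with Twilio's Python helper library; a human rep can then pick up the same thread. The credentials, phone numbers, and repair-status message are placeholders, not details from the panel.

```python
# Sketch of an automated service-status text; assumes the Twilio
# Python helper library (pip install twilio). Credentials and
# numbers below are placeholders.
from twilio.rest import Client

client = Client("ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX", "your_auth_token")

def send_status_update(customer_number, status):
    """Send an automated update; a rep can reply on the same thread."""
    client.messages.create(
        body=f"Update on your vehicle: {status}. Reply here with questions.",
        from_="+15550001234",  # your Twilio number
        to=customer_number,
    )

send_status_update("+15555550199", "brake inspection complete")
```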
Ari: Jeremy, if you were to think of what the state-of-the-art example is today of bots in use, bots in action, what’s the best way to think about it?
Jeremy Liew, Partner, Lightspeed Venture Partners: I think there’s two separate ideas behind bots. There’s the concept behind using some sort of messaging platform as an interface, and then there’s the concept of the responses being completely automated. And I think that that…thinking about those two components separately is important.
Ari: Yeah. Can you unpack that a little?
Jeremy: Yeah. Facebook, you know, when they opened up the opportunity for bots inside Messenger, it came as kind of a big deal, and everyone was looking towards Tencent, because there's so much greater functionality happening through chat in WeChat than anywhere else, in any of the other messenger services. And I think that's created envy amongst everyone who has a messaging platform or chat platform in the West, so like, "Oh, I'd like to have something like that too."
One of the things to bear in mind is that a lot of that is not actually happening in chat. Chat is a platform for things that look a lot like mini apps or HTML cards that happen to be delivered through a chat interface. And so is it a bot or is it an interaction that's happening through chat? Those are slightly different from each other. And that's where, in the one case, we've seen a lot of success.
The other thing that's also important to bear in mind is that in China we saw WeChat being so successful because the web and apps were kind of crappy as alternatives. Here in the West you've got a much higher bar to clear if you're gonna try and create a chat-based platform, because the web and apps are actually pretty good for most use cases, and so the incumbent way that a user interacts with a customer service rep or whatever is actually better than what the alternatives were over there. And so if you're gonna drive chat, you have to be better.
Ari: As far as chat, text chat as a developer tool or a platform for developers, is that…are you bullish on that? Do you think there’s much opportunity for third-party innovation, Jeremy?
Jeremy: If I was gonna pick a candidate for most likely to develop as a platform that can generate a wave of new startups, and if the candidates were, you know, chat platforms, AR, VR, or some variation of voice control, whether it be Alexa or Google Home or something like that, I would probably put my bet on the last one.
Ari: Being Alexa?
Jeremy: Yeah. I think that that’s the one that in the West right now offers the greatest potential to be the new platform that can generate a new wave of innovation.
Ari: So why is there so much more promise in voice than text chat?
Jeremy: The difference is that it allows you a new modality of interaction. So by definition, a chatbot is operating on a phone, right? If you have your phone present and it's in your hand, if you're gonna be texting or messaging, then you also have the opportunity to use the web and you also have the opportunity to use an app. And in many instances the web or the app may be a better way to solve the problem that you have than messaging.
On the other hand, what Alexa and devices like it offer is a whole new modality to interact when you don't have a phone in your hand, whether you're in a hands-free environment, let's say you're in your kitchen and you're making dinner, or whether you're in your car and you're driving, or whether you're six years old and you don't know how to spell or type, right? These are new modalities that are now open to you, and it's those new modalities that are likely to generate new use cases rather than it just being a port of an existing use case.
So for instance, being able to call an Uber through my Alexa isn't a new use case, it's a new channel for an existing behavior for an existing company. But I would think about that the same way that I think about the first television shows being a guy in a suit reading the news into a microphone behind a desk, which looked like radio. And over time you actually get all this content that gets created that is customized for the video medium, and so you get sitcoms and you get late night talk shows and you get sports shows and so forth. But those things were not possible on radio, and so they were developed specifically for TV, and I think we'll see the same things developing for a voice-only environment.
Ari: So if you’re thinking of those platforms, is it Alexa and Google, is there a third…I mean are you as an investor, are you thinking about what are…you know, when I look at the market opportunities for potential third party developers, where is the distribution? Is it gonna be those two channels or are we still waiting to see how these would all…?
Jeremy: So it looks like…I mean, I think the data says that something like 40% of search requests on Microsoft OS phones are voice, and I think it's like 30% for Android device searches. Those are meaningful numbers, so I don't think it's just Amazon, but I will say that that is the lead candidate right now. I think something like 9 million Alexa devices were sold through the end of 2016. People were saying that before the holidays, there was a 5% penetration rate of Alexas into households. And this is a household-level device, it's not an individual-level device. So 5% in a year and change, that's the start of a curve that looks pretty interesting.
You know, the Alexa app was the top music app, and I think like top 10 across all apps, around the holidays, which suggests that it was a very popular Christmas gift. And I think that that new set of adoptions, you know, that's what drives the potential for new platforms. And because it's audio only, it's a lower fidelity device than we've had historically, and it has zero pixels, right? There's no screen, there's zero pixels. I think a lot of people will not see the opportunities because they're trying to fit an existing model into a new thing.
Ari: Arthur, how do you…as we’ve talked a little bit about text and voice, are those the two modalities that you would think about, or is there a third? Start with that first.
Arthur: To pick up on what Jeremy said, and then I'll answer the question: I think Alexa definitely has the advantage right now. The thing that they have to address, though, is having an Alexa in your pocket, right? I think that Alexa is becoming sort of the interface OS at home, but you still want to use that same syntax everywhere you go. And so when you don't have the ability to have Alexa in your pocket, that's where I think they have to address that. So they're doing that by enabling developers to build apps that have the Alexa interface, which is great. But the advantage that Apple and Google have is that they are tied into the OS in your pocket. And so I think it's gonna be a fight to figure out how you get the Alexa syntax into your pocket, away from Android and from iOS.
Ari: And then also, while you're on that subject, are we in for the same universe that we've been in with computing and with mobile, where you just have these different silos and developers have to develop for each one of them separately? Or do you think there will be more of a set of standards, where something developed for Alexa works across?
Arthur: I think unfortunately in the early stages of any kind of market paradigm shift, there are gonna be silos, and it's gonna make the developer's job that much harder. I think the company that can help in this interim period, making it easier for that developer to develop one app that can be applied to, you know, an iOS app, an Android app, an Alexa interface, a Google interface, a Siri interface when that opens up, I think that's gonna make life easier. But in this early stage, you're gonna have those silos, unfortunately. It's gonna be harder on the developer and also hard on the consumer, too, because as a consumer you have to understand the "Ok Google" syntax, you have to understand the Alexa syntax, and it's gonna be harder to try things out, so it's gonna be hard in the interim.
Ari: Yeah.
Arthur: I think eventually there will be some consolidation on one or two platforms for the innovations…
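One way to picture the "one app across silos" idea Johnson raises is a thin adapter layer: each platform's request payload is translated into a common shape that a single handler consumes. The sketch below is hypothetical; the class, intent, and slot names are invented, though the Alexa and Dialogflow payload fields follow their commonly documented request formats.

```python
# Hypothetical adapter layer: normalize per-platform voice requests
# into one shape so the app logic is written once.

class VoiceRequest:
    def __init__(self, intent, slots):
        self.intent = intent
        self.slots = slots

def from_alexa(event):
    # Alexa IntentRequest: request.intent.name and request.intent.slots.
    intent = event["request"]["intent"]
    slots = {k: v.get("value") for k, v in intent.get("slots", {}).items()}
    return VoiceRequest(intent["name"], slots)

def from_dialogflow(event):
    # Dialogflow webhook: queryResult.intent.displayName and parameters.
    result = event["queryResult"]
    return VoiceRequest(result["intent"]["displayName"],
                        result.get("parameters", {}))

def handle(req):
    # One handler shared by every platform adapter.
    if req.intent == "OrderStatus":
        return f"Checking order {req.slots.get('order_id', 'unknown')}."
    return "Sorry, I didn't get that."
```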
Ari: Do you use both of them?
Arthur: Yes. I have “Ok Google” in my pocket and I have Alexa at home, and my family they’re all Apples, they have Siri as well. So I have all those different syntaxes.
Ari: Does one of them understand you better than the other?
Arthur: I'm always amazed at how naturally I can interact with Alexa at home. I mean it's eerily scary. You know, we also tied it to our home automation system, so we change the temperature that way, and it's eerily scary how natural it is. I can talk fast, I can talk slow, I try to put on a fake accent sometimes to trick it, but it's eerily accurate. And that's the secret of Alexa, I think: the accuracy of the interface, the natural language understanding, but also the compute behind it as well. But again, I want it to remember my preferences, and that's why having it in my pocket matters too, because I'll say something to Alexa at home but Google is not gonna remember that preference. And so being able to tie a preference across all interfaces, all devices, is gonna be what makes a really special consumer experience.
Ari: Jeremy, have you done the A/B testing?
Jeremy: Historically what I think you find is that whoever takes the lead is not gonna open up to a sort of a federalized platform. And whoever is losing is gonna be much more open to doing that, the number two, the number three, number four players. So just to go back to your earlier question, I imagine that that federalization will occur when the winner emerges and the people who aren’t the winner will realize they’ve got to do something different to play. Because there are, you know, because…
Ari: The competitive advantage of being open?
Jeremy: Well, I mean the issue is that as long as you do need to do custom development, a new developer has to pick one to do first.
Arthur: Right.
Jeremy: And the one that they do first will be the one that has the most users. And so there is this sort of two-sided network effect that will cause the number two through N players to say, "We need to figure out a way to federalize so that we can have the developer mindshare," because at the end of the day platforms aren't useful without developer mindshare.
Ari: So yeah, so back to the question, other than voice and text, anything else we’re seeing?
Arthur: Yeah, I've seen some early, you know, biometrics, eye detection, etc., but I think that's still so early. We've seen companies that do emotional detection, gestures as well. Somebody mentioned to me the other day that in some of the new BMWs, they have physical gestures to control the devices. I've seen the early prototypes of physical gestures, eye gestures, but I think voice and text are probably gonna be the predominant ones in the interim.
Jeremy: The thing that excites me the most about voice is just that you don't need to be able to spell or type to be able to interact with a computer. So I have a five-year-old at home. He certainly can't spell. He can't type. He can't write. He can't do any of those things. But now he has the ability to interact with a computing device, and that's pretty amazing. But generalizing beyond my five-year-old, look to the developing world, where you have billions of people who are yet to own a smartphone, or who do own smartphones but may be illiterate and not have the ability to interact with the web or with apps in the way that you would just assume for someone who can read and write. So the ability to bring that next couple of billion people online through voice, I mean, those are the market sizes that get people pretty excited.
Ari: Yeah. What about the industries? Where do you think we’ll start to see the adoption happen more rapidly? And let’s leave tech aside, because that’s the obvious one. So outside of tech, you know, where are you starting to see some of these emerge?
Arthur: I think it depends on the interactions. The areas that I think are probably gonna be the most impactful are healthcare and education. So on the healthcare side, telemedicine is a big use case that we see at Twilio. Being able to have patients talk to doctors through different interfaces, having patients who are maybe incapacitated talk to doctors via voice, and communicate with computers by voice that way…
Ari: Can you give an example of how that’s worked?
Arthur: So for example, if you are a patient and you want to quickly get a response from your doctor on, "Hey, listen, I want to fill this prescription, when can I get it filled?" They can send you, based on your location, based on your GPS, the best, closest pharmacy to get the prescription filled. "Hey, doctor, I have this issue on my leg, here's a picture of it, can you analyze that for me?" You text back and forth, etc. "Where can we go to find this kind of brace for my back? Where is the right location?" So things like that to help them do that.
Ari: And that’s human-assisted?
Arthur: Human-assisted, exactly. So for some of the questions that are asked by the patients, the computer knows what the response should be with, say, a 95% confidence level. And you can say, listen, if the confidence level is above 95%, automate the response. If it's below that, then send it to a human to assist. And so it depends on what the confidence level in the app is.
Ari: And so the way that the doctor receives that is, here's the 5% of questions that you need to answer?
Arthur: Right. It could route to the doctor, it could route to a nurse practitioner, it could be routed to the x-ray technician, depending on how you classify the content of that interaction, so that it arrives at the right person, based on the confidence level as well.
Jeremy: If you’re talking about AI, if you’re talking about bots, then you’re really talking about no human in the loop type interactions.
Ari: No human or what…
Jeremy: No human in the loop type interactions; this is a human in the loop type interaction. If we're talking about genuine bot-only, AI-driven, no-human-in-the-loop interactions, then I'm not sure that healthcare will be an early one. I think you've got to rely on things where nobody dies if you don't get it right. And so I think it's probably gonna be more like things like, you know, shopping, where if you get sent a suggestion for the wrong red coat, you're not enraged by that, or, you know, content too, where you ask someone what's the news and you get it from NPR instead of CBS, again, nobody dies.
Ari: Yeah, it’s a good first hurdle. But I mean as you were saying that, I was thinking, you know, connected cars are an area where there’s obvious application for automation, for AI, and the companies that are spearheading all that technology are in that space. Given the regulatory hurdles of putting that technology into action and just the general safety concerns of the population, how do you test that stuff enough so that you can actually get it deployed in ways that are real? And California isn’t allowing Uber self-driving cars so they said they’re going to Arizona. Much safer in Arizona, I guess, to have autonomous vehicles than in California.
But are you able to get enough testing in these sorts of lower-friction environments to get real deployment in areas where you would be saving lives but the risk would be…yeah?
Jeremy: I think it's always crawl, walk, run there, right? Like, let's take a car environment. You know, if you were talking about having a voice-type interface, you'll probably start off with, you know, "play me a song by Aretha Franklin" and not "hit the brakes." Those are two very different sorts of interactions, where you don't necessarily want to rely on correct parsing of your voice to take action in a timely way. Again, you start off with things that are relatively low risk, and the least risky things of all tend to be information and entertainment type use cases.
Arthur: One example that's not in tech is politics. I know it's a kind of touchy subject, especially around this time, but there's a company called HelloVote that helps register voters, really through automation and bot technology: it identifies where you are, checks against a database whether you're already registered, checks your geography, and gets you registered to vote. And so it was a cool technology to do that at scale. And they were able to experiment and get it right initially, and then they launched it during the campaign season, and that was a good way to apply this to an important area like politics.
Ari: Yeah.
Jeremy: I was…there’s this, I forget the name of the company, but it writes these automated stories based on box scores?
Ari: Narrative Science?
Jeremy: Narrative Science, exactly.
Ari: Yeah, they’re gonna eliminate my job soon I think.
Arthur: You can be retrained.
Jeremy: This is gonna make your job easier, I think. Focus on the judgment part, eh? No, but they're writing these automated stories based on box scores for sporting games, and I think they're also doing some earnings report stuff as well. But I think this is an interesting area, where you can imagine that you could take the box score from a Little League game, which no one is gonna write a story for, right? And if you tie that to this sort of AI ability to write a story, then as I'm driving home I can ask my car, you know, "Hey, what happened in John's baseball game today?" And I can actually get a story that's been automatically generated from a box score and read to me while driving, right? That's a pretty amazing experience that is not possible today.
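For flavor, here is a toy template-based sketch of turning a box score into a sentence, in the spirit of (not the actual method of) Narrative Science. The box-score fields and phrasing templates are invented.

```python
# Toy natural-language generation from a box score. Field names
# and templates are invented for illustration.

def story_from_box_score(box):
    winner, loser = sorted(box["teams"], key=lambda t: -t["runs"])
    margin = winner["runs"] - loser["runs"]
    verb = "edged" if margin <= 2 else "cruised past"
    return (f"{winner['name']} {verb} {loser['name']} "
            f"{winner['runs']}-{loser['runs']} on {box['date']}, "
            f"led by {winner['star']}.")

little_league = {
    "date": "Saturday",
    "teams": [
        {"name": "the Tigers", "runs": 7, "star": "John"},
        {"name": "the Rockets", "runs": 5, "star": "Maya"},
    ],
}
print(story_from_box_score(little_league))
# -> the Tigers edged the Rockets 7-5 on Saturday, led by John.
```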
Ari: Yeah. Arthur, so your CEO, Jeff Lawson, and I thank CB Insights folks for sending this over, said in the poll that bots are overrated and not particularly compelling or productive in their current guise. Is there a sense that we’re overhyped at the moment?
Arthur: Well, I think that there's just been a lot of discussion about bots, and I think that there were a lot of bots that showed a lot of promise, and maybe there was a missed set of expectations. But again, I think that there are bots that are not gonna perform well because the domain is just too wide, there are too many potential interactions, too many potential responses in play. But if you narrow the domain to an area where you can contemplate some of the answers and responses and use more of a human-assist model, I think that these bots or chatbots or automated responses can do well.
I mean, for example, there's a company that went really narrow: they focused on appointment reminders for hair salons. That's how narrow they were, and they said, "Let me prove this out first and test this out," and they'd move from there to other interactions. And so although that approach seems very extreme, the idea was: let's focus as narrow as we can, get the technology right, train the model, and then kind of branch out from there. I think if you start too wide, you get these bots that just don't work, and that's where I think you have those missed expectations. So human-assisted, narrow domain, I think those are the bots that tend to work better; broader domain, it's hard.
Ari: Yeah, and what's more, CB Insights has counted over 40 companies that have built some type of chatbot, and so I wonder…it seems like an area where we should be able to rely on the big five tech companies to do that for us. Do companies need to be developing their own, or…you know, it seems like the amount of innovation that's been happening at Facebook and Amazon and Google should be enough, or am I not thinking about that right?
Arthur: The trouble is, I think, that the advantage they have is the data and the intelligence and the PhDs, right? And so as a small company or midsized company, you don't have the advantages that Facebook, Google, etc., have to develop these technologies. And they're making more and more of these tools available as open tools, which is fantastic. But I think there's gonna have to be a layer of customization that you're gonna have to have as a company to enable this within your strategy or within your innovation. So if you are a big retailer and you want to build out a customer service channel that now is more automated, you can leverage some of the open source tools from these big players, but you have to have the right interface that merges with voice and the other channels of communication. So you can leverage the tools from these big, big players, but I think you're still gonna have to have your own customization layer to make it fit into your strategy.
Ari: Yeah. Jeremy, what do you think? Do the platform players provide enough technology that everyone should be able to use it?
Jeremy: I think it's in their interest to do so. And so I think that they're gonna make it as easy as possible for as many developers as possible to build to their platforms, and that means abstracting out as much as possible of what's difficult, whether that be, you know, parsing the voice, the NLP, breaking it down to identification of particular meanings of particular words. You know, Alexa has already started to do this by creating certain constants, like what "stop" means and what "resume" means, and making those universal add-ins that anyone that's building a skill can then take as an input and use as a command that becomes standardized across.
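For concreteness, here is a plain-Python sketch of what that standardization buys a developer: Alexa's built-in intents, such as AMAZON.StopIntent and AMAZON.ResumeIntent, arrive under fixed names, so every skill can handle "stop" and "resume" the same way. The response envelope follows Alexa's documented JSON format, but the dispatch wiring is an illustration, not the Alexa Skills Kit SDK.

```python
# Sketch of dispatching on Alexa's standardized built-in intents.
# AMAZON.StopIntent / AMAZON.ResumeIntent are real built-in intent
# names; the handler wiring here is illustrative.

def speak(text):
    """Wrap text in the Alexa JSON response envelope."""
    return {"version": "1.0",
            "response": {"outputSpeech": {"type": "PlainText",
                                          "text": text}}}

def handle_request(event):
    intent = event["request"]["intent"]["name"]
    if intent == "AMAZON.StopIntent":
        return speak("Stopping.")
    if intent == "AMAZON.ResumeIntent":
        return speak("Resuming where you left off.")
    # Skill-specific intents would be handled here.
    return speak("Sorry, I can't do that yet.")
```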
Ari: Yeah, and so I want to go back a little bit to what I was talking about earlier with third parties and startups. I think that a few days ago in San Francisco, you talked about Satya Nadella's desire to commoditize all of these sort of AI layers, whether it's voice or text or vision, so that it's just part of the platform, whether you're using Cortana or you're using Azure. It's just another sort of building block. Given that, and given that it's probable the other platforms are gonna take some sort of similar approach, where are the opportunities for startups, and what kind of deals are you seeing coming in the door that look maybe interesting but just don't really, you know, have much business opportunity?
Jeremy: I don't think we really know yet. I think that we're at such an early stage, you know. Like, I think there was a million Alexas sold at the beginning of 2016 and it went to 10 million, so a million is like not enough to know anything from. It's not a big enough market for developers to target. Ten million is just starting to get big enough where I think we're gonna start to see more people thinking about this as a platform and not just as a distribution channel for something they already do. And so I think we're just at the beginning of seeing what that innovation can be, but it totally makes sense for him to want to do that. That's what leads to the greatest level of innovation, because you're now reducing the technical barrier to entry for building apps, right?
Ari: I guess it becomes just another piece of infrastructure.
Jeremy: Well, just the same way that from like 2000 through 2010, we saw the technical skill required to build a website dramatically plummet, or from 2006 through 2012 you saw the cost and time to build an app, a smartphone app, you know, dramatically come down. As that came down, you shifted the bottleneck from the technology skills to the customer insight, right? And so it's in the interest of any platform to drive that barrier to entry down, because there are more people that have the insight than have both the skills and the insight. And oftentimes those are two different people, and so, you know, getting that rare combination is really difficult when the technology is hard. As the technology gets abstracted away, you start to see more innovation happening outside of Silicon Valley, and that probably better reflects the opportunities for an American consumer or for an international consumer.
Ari: Is there any analog to what you're seeing now, as far as pitches and deal flow, to mobile seven years ago, or, you know, is there another analog as far as the ideas are raw, the innovation is there and you're just…
Jeremy: Yeah, I think we're seeing a lot of toys right now. A lot of toys. You know, the Poncho the Weather Cat bot, I think, is a great example of a toy, but you have to start with something like that, frankly, because it reduces the expectations of perfection. And unless you start building that knowledge, you don't know where you can go from there.
Ari: Similar to people calling Snapchat a toy a few years ago?
Jeremy: That's exactly right. And then I would say that the other angle is people taking something that is known and trying to port it. And so Paul English, who is the CTO and co-founder of Kayak, now has a travel bot. And, you know, there's an interesting model there, which is that historically people used to talk to travel agents to get their travel done, and so, you know, that Q&A back and forth is actually quite a natural idea for travel, and to your point, it's a specified domain, and so you actually have a lot more granularity around being able to understand what it is you're trying to do.
Ari: And as far as the deals you're finding, are you seeing any trends? Maybe talk a little bit about what the Twilio Fund is, within that. Any trends you're seeing?
Arthur: The Twilio Fund is a $50 million early-stage fund that we have at Twilio. We invest in companies building on our platform, companies building a meaningful business on this platform. We'd love to know what you're doing and maybe invest in your business to see it flourish. And we think that the developer's job is a great job to have. It's also very hard. Every new technology that comes up, they have to understand how to incorporate that into what they're building. And so we see a lot of companies trying to democratize the use of AI, because AI can be a very complicated thing. Oftentimes you have to have a PhD to understand it. It takes a lot to train the model. You have to buy a lot of GPUs and CPUs to train that model. And so these companies are trying to democratize that, to make it easier to leverage AI within your application. We've seen a few companies doing that. And also companies trying to extract meaning and intelligence from customer interactions. So you have this transport layer, and then you try to figure out what kind of meaning you can extract from that conversation to do an automated response. Extracting meaning from a conversation in an automated way is something we've seen a lot of companies trying to do with this new technology.
Ari: How much deal flow are you seeing?
Arthur: We've seen quite a bit. I think that a lot of companies are feeling pretty empowered to start these new kinds of AI and communication projects, and so we're seeing a ton of deals come in the door.
Ari: And as far as the investment you make, how do you view them and the potential, you know, economic return versus strategic versus just, you know, kind of a cool technology that looks like it’ll be useful at some point?
Arthur: Well, the point of the fund is to build the ecosystem for Twilio. So we're not in the business of financial returns with the Twilio Fund. It's another tool in our toolkit to build an ecosystem, and it helps us attract companies that are building something interesting. And if they win on our platform, we win as well. And so I think we're less focused on the financial returns; we're more focused on strategic value and building the ecosystem with these companies in the fund.
Ari: So I had written a few questions and CB Insights sent a few as well and there is one that sort of overlaps, but I wanted to talk a little bit about, I called it the creepiness factor. CB Insights called it, more eloquently, unintended consequences and ethical challenges. But thinking about this technology in action not just in our homes but in the office and where we stay and just in sort of everywhere we go, what are some of the things we could just be keeping in mind and maybe, you know, be potentially concerned about or at least cognizant of?
Arthur: Well, one thing that my wife noted was creepy when we got the Alexa was that the Alexa is always listening, right? And she's wondering, where does the information go? I guess there was a case in the news recently about how Alexa helped to uncover a crime based on what it heard. And so, I don't know if I'd call it creepy; I think it's something that you have to just understand better, where the information goes. That's the one thing that comes to mind when I hear about unintended consequences. But I think as long as you educate the users on what's happening with the data, what's happening with the device and what it does, I think it's gonna ease that perceived creepiness. But I think that it's a necessary component to make this technology work.
Jeremy: Yeah. If there’s one thing we’ve learned over the last 20 years, it’s that consumers will always sacrifice privacy for convenience, and they will always get upset about that and then do it again. And so I think that this path here is well-worn. Like we are going to become comfortable with the level of surveillance that we would never previously have been comfortable with because it’s gonna make it just a little bit easier to get it. I mean that’s how it started.
Ari: Yeah, and I have a sister who is about 12 years younger than I am. She grew up on Facebook and posting everything on Facebook and I remember asking her about the Alexa and she said, “No, that’s way too creepy for me.” Those generational things.
Jeremy: And I will bet you, in two years' time, she will have one in her bedroom monitoring her all the time. People always complain about it and they never take action.
Ari: No problem. So as these UIs become more simplistic and minimalistic, how is discovery going to work? Are these UIs only going to execute plans that you already know you want to do, or are they gonna become more proactive?
Jeremy: I think that's an excellent question, because this is an area where I think we disagree. I think that if you are going to have either a chat-based or voice-based interface, it is gonna drive you towards a pull versus push model of interactions, which is different than what we've been used to with notifications on our phone and from social media for the last sort of eight years or so. It goes back to the original way that we used the web, which is that we used to type in URLs, or we used to type stuff into Google, and everything was pull-based. It was initiated by the user. We sort of drifted towards more of a push-based model, whether it be driven by advertising or the feed or notifications or emails or whatever it was. But because these channels are the channels that we use for a lot of our interactions, we won't tolerate a lot of noise in amongst the signal, and it is just too hard to tell the difference between signal and noise for a particular individual and get it right most of the time. So I think people will be very, very judicious about opting in to messaging, about that message from the store saying we think we found something you might like. Once you open the door to that, the next month when you're 6% short of goal, you send two of those messages, and then the next month it's four and then it's eight, and all of a sudden you're gonna get shut off.
And so I think that’s bad for the platforms and bad for the companies, and so I think you’re gonna see that return to pull-based, which means that discovery becomes a completely unsolved problem in that environment. And therein lies the opportunity.
Arthur: I think, like on the Alexa for example, the discovery of new skills right now, that's the part that's a challenge. I mean, right now you have to go to the website. There's no way to discover new skills via that interface at all. And then when you enable the skills, there's no way to remember all the skills that are available on your Alexa. And so I think they have to figure out where that discovery is coming from, but also, once you have enabled a skill on the interface, how do you remember all the things the interface can do.
Jeremy: And that comes down to habituation.
Arthur: Yeah. The more you use it, the more you remember.
Jeremy: The more you use it, the more you remember. And it's the same reason why the average person uses 27 apps per month. They have many, many more than that on their phone, but they use 27, and there are 28 spots on the home screen of your iPhone, four across, seven down. It's probably not an accident those numbers are almost identical, because if you use something often enough, you bring it to the home screen, and if it's on the home screen, you're most likely to use it. So I think that it's gonna come down to habituation.
And then on the question around discovery, given that you have to go to a display device to survey new skills, and the whole point of a voice interface is that there's no display, I think that is actually gonna come down to genuine old-fashioned word of mouth, where a person uses their mouth to tell you about this amazing thing they've been able to use their device for. And in fact that's actually now become the primary viral growth mechanism for apps, right? It's not invites, it's not, you know…well, there's certainly a lot of paid acquisition, but the invite flow of a few years ago is completely ineffectual now. It is people telling each other about a great experience.
Cameron McCurdy, CB Insights, Host: One of the issues that many startups have is building inside of someone else’s sandbox. In this case, many companies are building services behind these new interfaces and giving up some of the control of the customer relationship. Is building behind someone else’s UI something that startups should be worried about?
Arthur: That's a great question. So as a developer, I think you have to sacrifice some things, right? In order to get the kind of distribution you need, in order to get the right apps to the customer, you have to rely on something, a platform, a company, etc., because, as Jeremy pointed out earlier, picking the right platform, picking the right kind of technology, is gonna help you get over some of those obstacles. But as a developer, again, you know, that job is so hard right now, and if you pick the wrong platform, you could set yourself back, you know, years. So I think you do have to sacrifice some of that to pick the right platform and get access to the right customers and the right interactions.
Jeremy: I think working with a platform that's growing and likely to be big helps you create your first $100 million of enterprise value much quicker and makes it much more likely, but it makes it more difficult to get from, you know, billion one to billion ten unless you have some direct relationship with your customers. So it's a tradeoff, and you might decide to make that tradeoff early and then develop a direct interaction with your customer later on.
Cameron: Great. I think this question is more directed towards Jeremy. What are your expectations for Snapchat as it continues to explore new UIs like Spectacles and AR? Do you think that it'll be more difficult as a public company with quarterly earnings?
Jeremy: I think the folks that are running Snapchat are doing it not towards quarterly earnings but, you know, with a view towards what’s gonna drive the most long-term value. And I don’t think that’s gonna change, whether it’s a private company or a public company.
Cameron: And last question, one of the UIs talked about in the presentation and you guys mentioned a little bit today is gesture control. We haven’t really seen a significant development in that area. Do you think it’s coming? Do you think we’ll ever see gesture become a widely used UI? Or do you think that it will almost entirely be overtaken by things like voice and chat?
Jeremy: I think gesture control…so gesture control only makes sense if you don't have a phone with you, because you've got a lot more fidelity doing this and doing this [gesturing] or something else. And I think that you might find that voice completely removes the need for gesture control.
Arthur: Yeah, I think it's a very hard problem to solve, and I struggle with the size of that opportunity for gesture control versus the other interfaces that we have at play. So that's one that may be good to keep an eye on, but it's hard for me to imagine the size of that opportunity being substantial.
Jeremy: It’ll be an accessory.
Arthur: Yeah.
Jeremy: It won’t be a platform.
Arthur: Yeah.
Cameron: Sorry, we have one more question from the back. You talked a little bit more about going niche and narrow and many startups are trying to compete with these larger platforms by tackling more specific use cases. Do you think that these are VC-sized businesses and is there a way for them to expand other than just becoming generalized bot platforms?
Arthur: What's funny is we've seen a lot of M&A activity pick up, because a startup gets an idea, builds on a platform, and oftentimes tries to build to be acquired by, you know, Facebook, Google, etc. So M&A has picked up in this area, and, you know, it's harder to become a big generalized AI company or chatbot company nowadays, because the platforms have this natural ceiling on what you can do, and the likely exit for you is gonna have to be to one of the big platforms. If you can do something that creates something that's platform-like, that could be the way to get to a large business, but I think that these platforms right now are kind of creating a natural ceiling on some of these opportunities in AI and chatbots.
Jeremy: I think the challenge with AI is that you've got the algorithms, which are becoming better understood. Yes, they're still difficult and there aren't that many people who can build them, but there's better understanding of that. Then you've got the data on which those algorithms work, and more data always beats better algorithms. Like, more data always wins. So if you're trying to build a generalized solution, you need to have more data than the incumbents, and that's hard to imagine if those incumbents are Facebook and Google and Amazon and Apple. They have more data for generalized interactions because they're collecting them in other ways, from their phones, you know, from Alexa, from whatever else it is.
So if you think that you can beat them, you have to do one of two things. First, you’ve got to go get a dataset that they don’t have, and that’s possible and that’s why you start to get very, you know, niched, right? Because they don’t necessarily have a dataset on diagnostics for various health conditions. And that’s, you know, offline data or data that’s not being collected by their everyday work. So yeah, there’s an opportunity to beat them there.
The second way that you can beat them is if you’re just doing something that’s not a top priority for them. You know, even Apple, even Microsoft, even Google, even Facebook, they can’t do more than five things well at a time. You put your A team on your top priority, you put your B team on your next priority, and by the time you get to your 10th priority you’ve got your J team on that. And if a startup can’t beat even Facebook’s J team, then they got a problem. So that’s the other way that you can compete. It’s just picking something that’s not a big priority for them.