Foursquare has had its share of challenges but now it believes it can position itself as a measurement of consumer trends in the offline world.
“We want to be the Nielsen of the real world,” said Foursquare CEO Jeff Glueck, speaking at the CB Insights Future of Fintech Conference in New York.
Just as Nielsen is known for its TV ratings and its tracking of consumer purchases for retail chains, major brands, and data partners, Foursquare would like its foot-traffic analyses and consumer preference data to drive strategic and tactical decisions for its data clients.
Foursquare was able to accurately predict Chipotle’s sales miss for the first quarter of this year based on foot traffic data aggregated from its consumer-facing mobile apps.
The company, which is backed by VCs including Union Square Ventures, Andreessen Horowitz, and O’Reilly AlphaTech Ventures, raised a $45M Series E in January 2015 but did so in a down round, at a valuation of $250M when the company had been valued at $400M previously. The pivot to data is part of Foursquare’s long evolution after pioneering the location-based check-in app (Foursquare raised seed financing in 2009).
The rise of alternative data providers has the potential to revolutionize the world of institutional investing, including hedge funds, investment banks, and mutual funds.
“We want to sell to a small, really good group of investors and charge a lot more instead of commoditizing,” added Glueck, pictured in the photo at the top of this post.
The panelists also included Matthew Granade of Point72 Asset Management, Bertrand Schmitt, CEO of App Annie, and Bina Kalola, managing director at Bank of America Merrill Lynch. The speakers also touched on compliance challenges surrounding data, doing due diligence on data vendors, the rise of IoT-generated data, and issues of data quality and commoditization.
Granade, of Point72, noted that due diligence is important not only in sifting the good data from the bad data, but also to avoid regulatory and legal risk.
“If you don’t know exactly where the source of the data is coming from, you open yourself to being exposed to insider trading laws,” he said.
On the point of data commoditization (whether the spread of alternative data would mean that it would become less valuable as every market participant gets access to the same dataset), Kalola of Bank of America Merrill Lynch argued that point had not been reached yet.
“We’re still a while away from commoditization, we’re still at adoption,” she said.
The panel also discussed whether consumer privacy concerns are overblown.
Granade noted that investors are not interested in individual data, only aggregate data.
“I find it interesting what a small minority of privacy advocates get worried about,” said Glueck of Foursquare. “People are signed up for grocery store loyalty cards and NGOs, which use similar tracking mechanisms but people only worry about it when it’s web cookies.”
Matt Wong, Senior Research Analyst, CB Insights: So today, we have a very exciting panel lined up this morning on the rise of alternative data in institutional investing. It’s a growing trend of a growing market amongst sophisticated investors for data that goes beyond traditional measurement like stock price history. So again, feel free to add your questions to the mobile app @cbinsights and the hashtag #futurefintech.
Let’s start with a quote from one of our panelists today, Matthew Granade at Point72. “Investing isn’t just about earnings estimates and 10-Ks. It’s about satellite imagery, sensors, mobile devices. The more you can process those things, the more edge you’ll have.”
And obviously, one of the biggest trends enabling this is just the rise of data overall due to low-cost sensors and mobile phones, with relevant data for institutional investors coming from IoT devices, social media posts, digital pictures, news media, and purchase transaction records, among others.
So all of this is creating sort of a growing sense of opportunity around generating Alpha from these alternative data sets. Here, you see a landscape of startups who are both generating some of these new data sets, as well as helping investors make sense of and analyze them. Companies across different spaces that we’ll get into a bit. Including credit card transactions, transactional data, and being able to understand consumer behavior through this data set and also merchant performance. You see one application of this applied to grocery store performance, data on repeat purchase behavior here from a startup in New York called Earnest Research. Recently laid out in a blog post by one of their investors, Greycroft Partners. You see satellite imagery enter the discussion in the investing space to help investors really understand what’s happening on the ground.
You see one use case here applied by Orbital Insight…a Sequoia-backed startup in Silicon Valley which is selling analysis of satellite images of cornfields. [inaudible 01:44:50] crops will shape up. And also using aerial images to identify, for example, the number of cars in store parking lots for insights into customer foot traffic and retail sales.
Social media mining is another area where investors are looking to gain real-time sentiment and event-driven insights about what’s happening in the world. You see a quote here from Paul Haan [SP], who’s a founder at a Cayman Islands-based hedge fund. “Our systems can analyze a tweet in less than a second from the moment a person tweets. Analyzing untapped and unstructured data sets such as Twitter gives the distinct advantage over other investment managers.”
You see here, one of the breaking news alerts from a startup here in New York called Dataminr. Specifically around Home Depot’s credit card breach. And you see how that reflects within the stock price after that report…that breach report went public and also when the alert happened.
And then, to the point of our panel today on foot traffic analysis, grateful to have Foursquare here to chat a little bit more about the work they’re doing. But one of the most public displays of this happening, we’re seeing Foursquare which recently blogged about prediction around Chipotle earnings using their foot traffic data, and then sort of the pretty close comparison to the actual earnings. And then you see what happened afterwards. Foursquare is doing this in other ways as well, including iPhone sales, and expect Jeff to talk a little bit more about what they plan to look at next.
So some questions that the panel will be looking at that are open. “When the rules change, how do you maintain the supply when the rules and access can change very quickly?” You see some of the recent media mentions there on Dataminr and Twitter and others. “Will there be sort of big venture-backed business whose sole goal around this is alternative data, and providing this new analysis for institutional investors?” And “What are the next data sets of the future that are waiting to be tapped, potentially generating returns for institutional investors?”
With that, we have a great panel moderated by Robin Wigglesworth, who’s the U.S. Markets editor at the Financial Times. Robin has been doing some great work around this area, including a story this week he published on a recent investment by Eric Schmidt at Google into a new hedge fund that is investing using sort of cargo shipping data. We also have Matthew Granade here from Point72 Asset Management, where he’s the Chief Market Intelligence Officer. Prior to that, co-founder of a startup called Domino Data Lab, and also a co-head of research for Bridgewater Associates. Jeff Glueck, CEO of Foursquare. Previous to Foursquare, spent seven years as Chief Marketing Officer at Travelocity. Bertrand Schmitt, CEO and co-founder of App Annie. Fifteen years of experience in mobile and analytics across U.S., Europe, and Asia. And finally, Bina Kalola, a Managing Director at Bank of America Merrill Lynch in global banking and markets financial technology.
So with that, I want to bring Robin and the rest of the panel up for a great discussion. Thanks.
Robin, Reporter, Financial Times: Well, thank you very much for being here. It’s a fantastic, exciting panel. I’m just sorry we can’t go on for three hours. But I thought we might as well jump right into it. Since the focus here is on institutional investors and how they can get a bit of extra performance out of alternative data and new data sets, I’d like to open up with Matthew. Maybe you start and say is this real? Can there be Alpha derived from some of these new data sets that are coming on the market?
Matthew Granade, Chief Market Intelligence Officer, Point72 Asset Management: It’s absolutely real. And it’s a real change, I think, from how investing used to work. So if you go back five or ten years, most of the conversation was between the investor, the company, and the brokers. And so, you had this conversation among those three groups. And with the invention of what I like to call “third party data,” as an investor, you can know as much or more about that company sometimes even than the people you’re talking to within the company know.
So to make that concrete for a second, if you’re wanting to understand what’s going on with McDonald’s, you’re going to look at credit card transactions. You’re going to want to look at geo-location data. You’re going to want to look at app downloads and a handful of other things. And suddenly, you’re going to have a very robust picture of how McDonald’s is doing, and you’re not going to have to talk to McDonald’s about that. And obviously, you’re going to want to as an additional part of that.
But I think it’s a very additive part of the investment process now. Very real.
Robin: Yeah. And Bina, what do you see on the sell side then? Do you see the same trends?
Bina Kalola, Global Banking & Markets Financial Technology Investments, Bank of America: I do. We have a lot of requests. We have very different clients on the “buy” side. Some have a shorter horizon, some a longer horizon. There are many different factors they take into consideration. And so, when I think about the new Alpha data sets and the alternative data sets, I break it into…I think it’s very focused on equities markets, number one. Number two, I think about it in terms of kind of an X-Y axis on the value of the information. Thinking about “Are you trading short-term in a few hours? Are you positioning for a year-plus?”
And then I think about the data in terms of “Is it immediately actionable? Is the action obvious? Or is it ‘That’s interesting, but what does this mean?'” And so, you have certain clients like Point72 who have the architecture to consume, and you have a lot of clients who do not. And so, that value extraction is really different right now, and I don’t see complete adoption yet, because that architecture isn’t there.
Robin: Bertrand, you’re from the provider side of things. How much interest are you seeing in the data you provide? And how much of that is actually coming from hedge funds, institutional investors? Are you seeing interest from traditional, boring mutual funds as well?
Bertrand Schmitt, CEO, Chairman & Co-founder, App Annie: Yeah. Us, we have probably 20% of our business with investors. Investors, we have VCs, that’s a small chunk. We have hedge funds; definitely, that’s probably the biggest chunk of our investor base. But we have more and more long funds, even pension funds. And they’ll start to buy all that up. So we have seen a trend over time.
Robin: How do they actually use it? How does a pension fund use this in their investment process?
Bertrand: How they will use this, we have multiple types of companies you can address, we saw that…focus on revenues, on downloads. Revenues, that’s great. If you want to track some gaming companies, for instance, where they derive most of their revenues from mobile. And you see that in the U.S., in Asia.
If you look at downloads, some companies like Fitbit…If you can track how many app installs you got, you can get actually a very good prediction in terms of units sold of equipment.
And now, the more we move with our usage product, where we take MAU, DAU, and actually, retention rate. Now, you see us being used a lot to track Facebook, Netflix, these type of companies. That will be the type of companies that we’d be trying to look for, to invest in.
And I will say…similar companies, from hedge funds to actually long funds and pension funds.
Robin: Jeff, hedge funds beating down your door, actually for your global full map?
Jeff Glueck, CEO, Foursquare: Yeah. After we called Apple better than Wall Street, Chipotle, more accurately, and McDonald’s Q4 better than the Wall Street consensus, we got a lot of phone calls, I can tell you. I think…what Bertrand was talking about, of course, is understanding the digital world through their tracking software installed in so many apps. And for us, we have a picture of real world foot traffic in 120 countries. That’s the biggest opt-in panel in the world. Foursquare runs a consumer service, so we are a city guide in over 120 countries. About 50 million users a month use it to find great places all round them. And also, we run Swarm, which is the check-in game that Foursquare is famous for. Where people check in and try to earn coins and badges, and become the mayor of any place.
It turns out that if you want to build the freshest map of the world and detect where phones, even passively as well as actively, are moving in and out of 105 million businesses, you create a game. And so, it turns out to be…the crowdsourcing, the location means that we know when businesses in over 100 countries move, open, close within 24 hours much faster than Google or any other source can know. Because we have millions of people editing our data in a crowd-sourced fashion.
We have about…this core opt-in panel of our 50 million users, 12 million people that have opted in to let us see every stop that they make on their phones. And so, if they walk into a Starbucks or a Ford dealership, or a McDonald’s, or a commercial real estate location, if the phone’s in their pocket or in their briefcase or in their backpack, we will be able to snap it to place using WiFi triangulation and Bluetooth beacons, and GPS, and time of day, and all that.
So yes, the data set is pretty interesting to reveal the over 90% of consumer spending that happens in the real world, less than 10% is still eCommerce. So yeah, I think it’s valuable.
Robin: Matthew, let’s put you on the spot. Is this data valuable?
Matthew: “Is the data valuable”? Well, we don’t talk about specific vendors, but geo-location, generally, yes, we think that’s valuable. You have to have a lot of infrastructure, and you have to have a lot of know-how to make it work. This is not a plug and play sort of thing. It’s very tricky, and I would say there’s still a lot of it that we’re trying to figure out. But generally, having multiple different third party data sources on consumer stocks and things like that…[inaudible 01:55:16] lot of stocks, is very valuable.
Jeff: And Matt can’t say who he’s using.
Robin: Exactly, no. There were a few strands I wanted to pick up on. First of all, how will you actually use the data, and how much data is usable? I’ve heard plenty of stories of people coming to sell snake oil…everybody thinks that every data set should be monetizable [SP]. That’s not actually true. And there’s also how you plug it in and actually use it. Maybe, Bina, are you seeing…do people come to you and just sell all sorts of odd things?
Bina: They do. What I would say is we’ve seen more predictive power information that’s relevant in the consumer space. But getting into the other sectors is much more difficult and sparse. The value of the information that comes through could be an insight once a week, it could be once a month. It doesn’t have that continuous daily, hourly, “Let me check it all of the time.” And so, sometimes, it’s really hard to change behavior in getting that adoption, because you’re not consistently getting value out of it.
We are seeing everything from satellite drones…there were a lot of issues, obviously, around drone information and privacy. The mobile geo-location data is, again, quite interesting, and making sure that people actually have rights to provide that data out there to the buy side and the sell side. The social listening platform is a bit noisy with a lot of primary research.
I think data is being created everywhere and everyone is trying to monetize it, and understanding which ones are going to be the most relevant in the totality of the investment decisioning process. Because remember, there’s a big expense, and everyone thinks that they’re the end-all, be-all on that trade decision, and that’s usually not the case.
So a lot in consumer. We’re seeing a lot of different types of data, but I think it’s still being sorted through on value.
Robin: Matt, what’s the most random data set somebody has tried to flog to you?
Matthew: People have tried to sell us all sorts of stuff. But I just want to emphasize one point she was making there, which is that the name of the game is being able to sort through these datasets, know which one’s good, which one’s bad. Part of why we don’t talk specifically about who we use and who we don’t use and why is that there’s a lot of vendors out there and to your point, a lot of them aren’t good. And a lot of the insight that we’ve had on this is that sorting out process, and being able to do that rapidly. If you’re a fund, that’s one of the key things you have to figure out how to do.
Robin: Yes. Jeff?
Jeff: I was going to say that the provenance, so to speak, legally and also the quality of the data that you’re buying, is a key thing for any buyer to really dig into. And I think the smart hedge funds and others really dig into this, but I think the general market isn’t yet educated.
People try to sell crazy things. I was saying that I had an entrepreneur ask me, “Hey, I hear lots of hedge funds are buying your data. So I’m a wash service in Manhattan, and we pick up and wash, dry, fold laundry. Do you think hedge funds want to know what brands of clothing we’re washing?” And I’m like, “I think the credit card data might be a little ahead of you on that one.”
But the provenance is really important. So in location, in geo-location data, there are vendors out there that don’t have true rights to the data. They don’t have opt-in from 12 million people very transparently with that. And then it’s not just the legal concern, you also get quality problems. So you have to ask your vendors in geo-location, “Where do you get your shapes?” For us, 9 billion times, people have said, “I’m at the Starbucks,” or “I’m at the Ford dealership.”
And it’s also about the quality. A lot of third party geo-location data is people listening to ad exchanges, for instance. So every time an ad is served in a game, it goes through a mediation, let’s say, through AppNexus or MoPub, and the location might be shared. But it turns out, because we know where millions of people really are, when we look at publishers on those exchanges, 80% are lying. Eighty percent of publishers are passing incorrect location data that isn’t where the phone really is. And we know because if an app says that this device is in Central Park and we know it’s in the Gramercy Park area, systematically, through machine learning, we blacklist those services.
But people who don’t have their own user base can’t tell who’s lying and who’s telling the truth. So you really have to get into the provenance of these different kinds of data sources.
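As a rough illustration of the check Glueck describes, comparing a publisher's reported location against a trusted first-party location and blacklisting systematic offenders, here is a minimal Python sketch. It is not Foursquare's method: Glueck mentions machine learning, whereas this uses a simple mismatch-rate threshold, and the function names, thresholds, publisher IDs, and coordinates are all hypothetical.

```python
import math
from collections import defaultdict


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def flag_publishers(observations, max_error_km=0.5,
                    max_mismatch_rate=0.5, min_samples=3):
    """Return publishers whose reported locations are systematically wrong.

    observations: iterable of (publisher, reported_latlon, trusted_latlon).
    All thresholds are illustrative assumptions, not real-world values.
    """
    totals = defaultdict(int)
    mismatches = defaultdict(int)
    for publisher, reported, trusted in observations:
        totals[publisher] += 1
        if haversine_km(*reported, *trusted) > max_error_km:
            mismatches[publisher] += 1
    return {
        publisher
        for publisher, n in totals.items()
        if n >= min_samples and mismatches[publisher] / n > max_mismatch_rate
    }


# Approximate coordinates, used only for illustration.
CENTRAL_PARK = (40.7829, -73.9654)
GRAMERCY_PARK = (40.7379, -73.9860)

observations = (
    [("pub_bad", CENTRAL_PARK, GRAMERCY_PARK)] * 3    # claims Central Park
    + [("pub_ok", GRAMERCY_PARK, GRAMERCY_PARK)] * 3  # matches trusted fix
)
flagged = flag_publishers(observations)
```

The sketch only works for a party that has the trusted location in the first place, which is the point Glueck makes about vendors without their own user base.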
Bertrand: I would like to pile on that. I have seen quite a few resellers of data, for instance, who’ve tried to sell similar sorts of that type of data that we sell. But when you dig in, they refuse to give you the source. You don’t know where it’s coming from.
Obviously, there is a problem because you cannot validate. You cannot know if it’s coming from proper ways, if the data was properly cleaned up, all of this stuff. Us, and to Foursquare, we have decided to own everything. We own the full process from A to Z. From making sure our terms of service are clean for the users, the clients, all the way up. Making sure [inaudible 02:00:30] treatment are properly applied.
We are very careful, and I think that that will be the big difference between people who do everything or by themselves [inaudible 02:00:39], and the guys who try to resell some stuff. You probably don’t want to know how they get there.
Matthew: You not only just have the quality problem, you also…if the chain of ownership of that data is not proper and properly documented, then as a fund, you expose yourself to insider trading law in the sense that somewhere along the way, someone stole that information and then passed it to you. We have three full-time lawyers who essentially just vet data vendors and establish at every step of the chain that the contract was properly done and that the ownership was properly passed. And also to deal with the issues around personally identifiable information.
There’s a massive compliance operation that has to go side by side with the investment operation if you’re going to do this and not get yourself into a tremendous amount of trouble.
Robin: Yeah. It’d be interesting, picking up on this strand. Even if you have…you are completely okay with the provenance of the data and it’s owned by the people selling to you, at what point should we start worrying about the privacy issues? At what point do you think people will start? Because between one or two or several data sets, you can put a very granular map of the individual people who eventually are going to hear how much people’s personal details are anonymized in this data as well. And whether it’s scrubbed properly and whether that’s a potential pitfall, that there’s only one or two scandals out there, and then suddenly, the regulators will start turning their beady eyes to the issue.
Bertrand: In our case, we are very careful not to touch that and to make sure our clients won’t touch that. We don’t use our user identifiable data. We sell aggregate data by the day by app, and nothing deeper, and we believe it’s a full force for clients. And it’s also a trade with our users. They access all that data for free. In exchange, we’re asking for some rights. But we are careful not to ask for too many rights.
Matthew: I think it’s a much bigger deal in the advertising and marketing space than in the finance space. And a lot of vendors are selling to both, right? So they’ll come in and they’ll be able to say, “I can tell you who shopped here,” and if you’re the McDonald’s across the street, you can push them coupons. For us, we like to see everything aggregated. It doesn’t really mean anything that this morning, you went to Starbucks…I like you Robin, but it’s just not that interesting.
Whereas what everyone in the whole country did is what we’re looking to know. But a lot of these vendors are also doing these marketing cases, and I think that’s where the challenge is. I don’t know if you have more of a perspective.
Jeff: Obviously, Foursquare thinks about privacy every day, and we have 50 million users, and their privacy is paramount for us. Everything we do is entirely aggregated and anonymous. And so…even when we help clients understand demographic aggregate patterns, we know that women 20 to 29, because they’re part of our community, are shopping at DSW instead of Macy’s lately and things like that. And so, we understand these broad aggregate demographics, but we are super protective of privacy; we don’t let any of the cohorts we do get small in any way. It’s a trust that we have with our users, and they let us see their location because there’s a value to them.
People tell me all the time, our users, “I leave the background location on for Foursquare. It’s the only app I do. Because if I go to a new restaurant or if I turn the corner in Barcelona, it’ll just ping me.” It just knows I’m new to this neighborhood, and here’s a great tapas place that you’re right around the corner from. So there’s a real exchange with consumers, but privacy and confidentiality is obviously core to that.
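One way to picture the safeguard Glueck mentions, never letting a reported cohort get small, is a minimum-cohort-size rule applied at aggregation time. The Python sketch below is an assumption-laden illustration, not Foursquare's implementation: the function name, the record layout, and the threshold value are hypothetical.

```python
from collections import defaultdict


def aggregate_visits(visits, min_cohort_size=50):
    """Count distinct visitors per (cohort, venue) and suppress small cells.

    visits: iterable of (user_id, cohort_label, venue) records.
    min_cohort_size: hypothetical threshold below which a cell is withheld.
    Only counts leave this function; user IDs are never returned.
    """
    visitors = defaultdict(set)
    for user_id, cohort, venue in visits:
        visitors[(cohort, venue)].add(user_id)  # repeat visits count once
    return {
        cell: len(users)
        for cell, users in visitors.items()
        if len(users) >= min_cohort_size
    }


# Tiny made-up sample; a low threshold is used so the effect is visible.
sample_visits = [
    ("u1", "women 20-29", "DSW"),
    ("u2", "women 20-29", "DSW"),
    ("u3", "women 20-29", "DSW"),
    ("u1", "women 20-29", "DSW"),     # repeat visit by the same user
    ("u4", "women 20-29", "Macy's"),  # cohort of one, gets suppressed
]
report = aggregate_visits(sample_visits, min_cohort_size=3)
```

With these inputs the single-visitor Macy's cell is dropped and only the three-person DSW cell is reported, which is the kind of aggregate-only view the panelists say investors want.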
Robin: Yeah, it’d be interesting…speaking, like you say, the value is actually in the aggregate data rather than individual data. But still, the kerfuffle that can happen. There was an example in the UK recently where, shock and horror, hedge funds were paying polling companies to do polls ahead of the Brexit vote. And I thought that was entirely unremarkable and uninteresting and kind of obvious.
But you still had a very senior Labour Party politician come out and say, “This is horrible. Hedge funds are stealing data. This is horrific.” And as much as people don’t necessarily like the advertising industry, I think they like the finance industry even less. The idea that you guys are hawking their information…however much the personal details are scrubbed, you worry about that, frankly…it’s like Bank of America is taking these people’s details and selling them to [inaudible 02:05:20] hedge funds.
Bina: You know what I would say, what’s interesting because you can buy a feed from Twitter and Facebook, and the individual doesn’t seem to have an issue. I’ve asked this question all the time. “How do you feel about that? You know that this is being sold to anyone and you’re still willing to put the information on?” And they’re like, “Yeah.”
And I think that this new generation is absolutely okay with that. I think there’s another generation probably that thinks about it differently. But the transparency of information is there and no one is going to be able to put that wall up. I think that what’s really critical…on our side, we look at making sure that we’re not getting individual data. That would…anyway, that information would be a needle in a haystack, not really that relevant. Because it doesn’t have the aggregate trend.
Some of the things that I would say are more important to us are thinking about history and comparison, right? So I don’t need to know that Matt went into a store. I need to know, generally, the totality of all the Matts that went in.
But on the privacy side, I don’t think that the individuals yet understand or have really thought about all of the information that’s out there and the rights that have been given up, and I’m not sure they care.
Robin: Is there a paradigm in terms of just a different generation that we don’t care as much about privacy as maybe we thought we did?
Bertrand: It’s a generation of Facebook, of Instagram, of Twitter. You shared so much crazy stuff by yourself, pictures, information about yourself publicly in the first place. Yeah, I think there is a big difference. And those are ones…being French and European, there is definitely a much bigger aversion in Europe to sharing data for sometimes, for us [inaudible 02:07:05] good, reasonable, historical reasons, that reason. I would say more and more for crazy reasons initially; politicians in Europe don’t seem to focus on making the economy work. But that’s another topic, I guess.
Jeff: I would just add, as an observer of this over the last 17 years in digital, I find it kind of bizarre what people…a small minority of privacy advocates get worried about. Because most people, especially millennials, are already pretty comfortable with the opt-in, transparent, fair rules. But how many people are a member of a grocery loyalty club? How many people here have a grocery card or a Walgreens card, or a CVS card, or something like that? How many people have ever donated to a nonprofit by the way here?
If you’re a member of any of these things, they are selling your name, address, and every individual SKU that you buy with your name and address. And if you donate to a nonprofit, do you ever notice that 20 other nonprofits have a letter in your mailbox within six days?
The degree to which information that’s tied to a name and address doesn’t alarm people, or they’re unaware of it. But a totally anonymous cookie that doesn’t have any PII attached to it freaks them out. It’s kind of a funny thing how human nature works in this situation.
Robin: Yeah, certainly, yeah. I think that’s…as a fellow European, I’ve noticed that coming from the U.S. If you sign up to anything, you know you’re opening yourself up to a deluge of letters and emails afterwards. It doesn’t happen quite as much in Europe yet. Interesting.
How do you actually present the data? Jeff and Bertrand, do people want it in a more understandable old world fashion? Do they want the API? How do you actually present it and sell it?
Bertrand: What we have seen is that we have different clients, different needs. So what we like to make sure…and even one client will have different needs, different users, with different capacity skills and training. So we [inaudible 02:09:08], with a web interface, that’s our main way to deliver our data. But then of course, someone wants to access it programmatically, API file reports [inaudible from 02:09:18 to 02:09:22]. So different ways to keep everyone happy.
Robin: Yeah, always a good thing.
Jeff: Yeah, like Bertrand, it depends on the client, right? So we have clients who can have the power to take huge data sets every night and get an overnight feed, and they want to create a mosaic of data interrelated to their other data sets as big hedge funds or something like that, big quant funds.
But most players aren’t like that. Even most long/short players don’t have a huge team of data scientists. So we also provide ticker-level analytics for certain custom reports. And then we have APIs for things like understanding cohorts. You want to understand Macy’s customers from last summer who went to certain stores and where they’re shopping now…nine months later, over a time series and doing this kind of cohort analysis and stuff, so we have APIs for that.
But we’re used to APIs. Foursquare, you may not know this, but we power location for Twitter globally, for Pinterest. We do a lot of Samsung studio technology. We power a lot of Bing’s geo-technology. Uber just announced they’re moving off Google to Foursquare’s global platform for drivers and riders.
And so, we have 100,000 companies using our API and data sets to understand the world, and we learn from all of that usage. So APIs is a big part of our history, but most financial clients don’t want to use the APIs.
Robin: Bina, how do you do it? How do you take in data and how do you then typically structure it for clients? Where’s the demand? Are you seeing any trends? Are people getting a bit more tech-savvy, or they still want old paper reports as it were saying X-Y-Zed?
Bina: We have both. The fundamental research report is still our core research product and that is still our clients’ biggest demand. And then there’s additive information that people want to see on the primary research side. And so, we’re adding to that.
Really, a lot of the…when I think about the U/Is and what a research analyst or somebody on the sales desk will use, it really varies with the company. We didn’t build a master U/I and taking all the feeds into the queries. So I would say that the U/I development for being able to query is probably the next generation for a lot of the new data providers, and making it much more than a hose.
And so, that is also part of the differentiation to make it more widely usable. And I think that’s really critical when you think about scaling your business as a data provider. So you guys have a fundamental business, and then in addition, you’re selling data that you get. Some people, their business is selling the data.
And making that something that could be…that is accessible and actionable, broadly, is pretty important and critical. But we will obviously build our own models and incorporate the information and try and find the value.
Robin: How do you deal with…in any piece of information, if you’re the only person that has it in the world, then it’s fantastically valuable. Where a million people have it, it becomes less valuable. As a banker, I imagine you’d get into trouble if you started selling a valuable piece of information to just a few select clients. Perish the thought that a bank would ever do that, of course. But how do you deal with the fact that as this information becomes more commoditized and more people use it, it just becomes less valuable?
Bina: You know, I think that we have to remember that information is interpreted. And it isn’t binary always to say that it leads to X in stock price movement. There is a proposition and a hop that has to be made, and that is fundamentally in the analysis that the human is going to do. Now, I won’t get into what the machine does with an algorithm and those models. But people take positions all of the time on taking a long position or a short position. That’s why our markets actually work. We have buys and sells every day in the equities markets on a single ticker.
So the commoditization, what I would say is I think there’s probably a running to be able on certain things; on breaking news, on trends that are very short term to have it first, and that’s really critical. And then, one still has to action it, so our clients still have to action that recommendation.
I think it’s still a while away from commoditization. I think we’re still in the adoption phase. So I’m not worried about the commoditization…
Robin: Wild West, the real words, I think. Matthew?
Matthew: But I think that one of the most…so I agree that insight is an important part of this, and we certainly see across our platform that some investors are able to understand and interpret and get insight from the data better than others. So there is a skill there.
But I think one of the most important decisions that companies that are selling data to asset managers have to decide about, and one of their most important strategic decisions is what’s their sales model going to be? And if they’re passing it out essentially like the Wall Street Journal, then the price is going to be…whatever a Wall Street Journal costs. It’s a dollar a copy or something. Whereas what they’re wanting to do is come in and say, “This is worth millions and millions of dollars.” Well, when it’s widely, widely distributed, it’s no longer then worth millions and millions of dollars.
And so, that’s…if you’re in that business, you have to figure out where you’re going to be on that spectrum and what model you’re going to use that makes sense.
Jeff: Yeah. And I’ve had debates with some hedge fund clients about this. I actually, mostly agree with what Matt just said. If you have a really unique data set, and we think we do on the geo-location side, then we’re approaching this as we want to work with all the top smartest hedge funds and quant funds, and a small set of them, and charge a lot for the data. And not…we’ve been approached by the Bloombergs of the world, “Hey, can you put your data in every terminal?” And we said, “No, our strategy is not to commoditize, but to have some deep partnerships.”
And that’s partly economics, but it’s also learning. This panel is called the New Alpha, and for us, we’re fundamentally a consumer company. We have this trust with 50 million users. And so, we’re learning about where our data is predictive, like it predicted on Chipotle and Apple and the like. And then there are obviously…Amazon is not about foot traffic. We’re learning…obviously between McDonald’s and Amazon, there’s a lot in the middle. And so, we’re learning where foot traffic is predictive of sales and stock markets, and where it isn’t.
And so, working deeply with really smart long/short analysts helps us understand how to best leverage our data for the long term, so it’s a two-way street by being more selective, I think, in the approach. And obviously, we want to charge a premium for that.
Robin: The FT does the same thing we call “smaller than the journal.” We have readers, but we charge more.
Matthew: But I think the hedge funds that are going to be really successful in using this data are going to figure out a model where they can work with the vendors to create economics that make this worthwhile for both parties. And I think…exactly what that model is is very unclear still, at least to me. But I think that’s where this has to go. That’s sort of the only way that both sides can get what they want and then not destroy the value of the product.
Bina: I want to add one thing to what Jeff said, and I think this is really critical. Right now, I think that iterative process between the vendor and the client in trying to extract that value is really key and critical, whether it’s the buy side or the sell side. And what we’re finding is that’s actually what a lot of these companies are asking for. “Can you help us figure out what that predictive power would be, where that value is? How you would want to actually use the information.” So it’s a really interesting time because you can create that partnership still pretty early right now with a lot of companies. And so, there is a bit of a first mover advantage, because you get to iterate the product.
Robin: Yeah. I wanted to ask one quick question to all of you before we ask this beautiful, intelligent audience…open up for some questions. If there’s a data set that you’d really like that doesn’t exist today, what would it be? It’s hard…but general terms. What would you really like to get your hands on?
Bina: You know, I think there’s sort of two that I see in development, and I think it’s going to be really interesting. Being able to truly extract sentiment and tone, number one. Not an encyclopedia or dictionary of words, but really be able to take that content and context, and a lot of people are doing it. But there’s more accuracy needed.
And then transcription on voice. Again, there’s probably a standard that’s more 80% accurate or so, maybe a little bit less. But those two technologies and what that could mean for the creation of data is hugely interesting to me.
Bertrand: That’s a wonder. We keep thinking about how we improve our data set. The way we think about this, actually, is we go back to focus to our core markets. Actually, our core markets is to help app publishers be the better [inaudible 02:18:59] business with that done in apps. And by extension, we’re happy to serve financial app stores and networks, and all these guys. But our core is app publishers. So we try to think about what data they need to run their business. And they need to understand how their apps get discovered, how they get downloaded, how they can engage with their users, and finally, how they monetize users.
So we try to align all our products from that on all of these steps. How is it that we can provide more visibility there? For instance, we are soon launching some marketing and teaching service where we plan more visibility of where ads are run, what channels…who is displaying what, that sort of stuff. That’s some new stuff that is coming.
Matthew: I think the area where I certainly have seen a lot less data sources than I would like to see is understanding industrial companies, more of the B2B world. There’s a lot of good data sources on the consumer side and on the TMT side. Also, a lot of good data sources on the healthcare side if you’re investing in those sectors.
But if you’re wanting to understand how trucks move across the country, that’s hard to do. Trains, hard to do. Boats, I think, are probably the exception. There is some pretty good data around boats. But in general, I keep thinking that some of the growing use of sensors and Internet of Things will light that up more. But I haven’t seen it yet.
Jeff: Well, obviously, Matt can speak to it…equities and analysts want and investors want. For us, a lot of what we’re thinking about is we want to be the Nielsen of the real world. But we’re thinking now about the intersection between the advertising world and the real world a lot. So we already measure and tell advertisers which ads drive people into the Olive Garden or the Jaguar dealership or whatever with our panel. But we’re thinking about how we mash up with set top box data, so that we could start to eventually tell TV advertisers, like a car company or a film company, that seeing this ad versus this different ad campaign drove effective lift with those demographics to walk into your storefronts across the country.
And so, that’s where I think we don’t yet have full access to the right set of set top box data with the right sample mash to do that. But that’s going to happen. It’s going to happen. And sort of our Nielsen of the real world ambition, I think, is in that direction.
Robin: Interesting. So I’m sure we have some questions from the audience here now. I think some might have even tweeted them. Does anybody want to stand up?
Man: Yeah, we have some questions from the audience. The first is for the investors. “What is the most unusual data set someone has tried to pitch you on?”
Robin: I think that’s for you if you can maybe…I know you don’t want to mention specific stuff that you’re not using. But maybe, Matthew, if you want to say something about…you must have come across some odd ones.
Matthew: Well, certainly one that I was very excited about that then crashed and burned and gets a little bit into the compliance thing we were talking about was someone told me that they had put a sensor on every oil well in America and could give me sort of oil well, exactly what was going on. The only problem with that was when our lawyers looked at it, he hadn’t really asked anyone in charge whether or not he could do that. So he was right that there was the data and it was good. As best as I could tell, he maybe passed out $100 bills at oil wells across the country. And so, that is not being used at all by us.
Robin: Well, we have Jeff’s dry cleaning example, I love that one. Bina, have you come across some fun ones?
Bina: One that maybe scared me a little bit more than it was odd is just geolocation just based on phones and working with a telco. And that was the first time I had really heard about that, and that was a little frightening.
Jeff: By the way…the telcos, when they talk about having location off cell tower triangulation, the average radius that they know where a phone is is 1,000 meters. One thousand meters is not…when I think about location, it’s two-meter accuracy or less. But when they’re talking about a 1,000-meter radius, that’s one of the reasons why smart people aren’t really buying the telco.
You can feel a little better, I guess, on that ground.
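To put the two accuracy figures side by side, here is a small back-of-the-envelope calculation. It assumes both figures can be treated as the radius of a circle of uncertainty around the phone, which is a simplification made for this sketch rather than something stated on the panel.

```python
import math


def uncertainty_area_m2(radius_m):
    """Area of a circle of uncertainty with the given radius, in square meters."""
    return math.pi * radius_m ** 2


# 1,000 m is the cell-tower figure quoted above; 2 m is the accuracy Jeff cites.
tower_area = uncertainty_area_m2(1000)
precise_area = uncertainty_area_m2(2)

# How many times larger the cell-tower search area is.
ratio = tower_area / precise_area
print(f"{tower_area:,.0f} m^2 vs {precise_area:.1f} m^2, ratio {ratio:,.0f}x")
```

Under that assumption the tower-based area is about 250,000 times larger, which covers whole neighborhoods rather than a single storefront.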
Robin: That’s good to know, at least. Maybe another question?
Man: This question is for Jeff. “It seems Foursquare and Swarm data self-selects to younger game players. How representative of the general population is that?”
Jeff: Yeah, it’s great. We have for institutional investors who want to dig into diligence on it, we have a U.S. census and global census on our data. And we tilt to…we over-index in the 20s and 30s and millennials. But we have enough people in their 40s and 50s and up that we can re-weight to census. Like every sample does, we re-weight…adjust to census, so that’s pretty easy to do. But we have enough people, and it’s very spread both globally and when I joined the company two years ago, I thought, “Oh, maybe it’d be very coastal or very urban,” but it really isn’t. It’s very spread across red state/blue state so to speak and very widely spread.
But yeah, we do do some normalization to census. The funny thing is, when I really get into it, I’m like, “I have to admit, we’re underrepresented in people in their late 60s and 70s.” And then people remind me like “Advertisers and brands usually only buy 18 to 49 anyway.” For anyone over 49…and I’m getting closer to that, you really don’t matter to Madison Avenue.
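Re-weighting a sample to census is commonly done by post-stratification: each age group gets a weight equal to its share of the population divided by its share of the sample. The sketch below shows that general technique with invented numbers; the age bands, counts, census shares, and visit rates are all hypothetical, and nothing here is confirmed as Foursquare's actual procedure.

```python
def poststratify_weights(sample_counts, population_shares):
    """Weight per group = population share / sample share."""
    total = sum(sample_counts.values())
    return {
        group: population_shares[group] / (count / total)
        for group, count in sample_counts.items()
    }


def weighted_mean(values, sample_counts, weights):
    """Mean of a per-group value, with each respondent carrying the group weight."""
    num = sum(values[g] * sample_counts[g] * weights[g] for g in sample_counts)
    den = sum(sample_counts[g] * weights[g] for g in sample_counts)
    return num / den


# Hypothetical panel that over-indexes on younger users.
sample_counts = {"18-39": 600, "40-59": 300, "60+": 100}
# Hypothetical census shares for the same age bands.
population_shares = {"18-39": 0.40, "40-59": 0.35, "60+": 0.25}
# Hypothetical average monthly store visits per person in each band.
visit_rate = {"18-39": 4.0, "40-59": 2.0, "60+": 1.0}

weights = poststratify_weights(sample_counts, population_shares)
raw = weighted_mean(visit_rate, sample_counts, {g: 1.0 for g in sample_counts})
adjusted = weighted_mean(visit_rate, sample_counts, weights)
print(round(raw, 2), round(adjusted, 2))
```

With these made-up inputs the unweighted average overstates visits because younger users dominate the panel, and the weighted figure pulls it back toward the census mix. The approach only works if every group has enough respondents to weight up, which is the point Jeff makes about the 40s, 50s, and older.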
Robin: Well, we have time for a couple more questions, I think.
Man: Another question is “How open are banks to sharing data with startups to test their models and algorithms?”
Robin: Bina, do you want to…
Bina: So I can’t speak about all banks, but I think banks are pretty concerned with privacy of our clients, both on the institutional and the consumer and the retail side. I would say that you’re not going to find too many that are receptive to doing that.
We will work with startups obviously in using our own data on our side of the firewall for us, but not necessarily selling it out to the public.
Robin: I remember talking to one bank, who even the internal team, they took an extremely dim view of sharing that internally, which was interesting.
Bina: That’s right. Between divisions, that’s right. Absolutely. Consumer data is absolutely protected.
Robin: Yeah, exactly. Maybe one more question before we try and keep the train on track on time?
Man: Yeah. So another question that I guess can be asked to the whole panel, but it’ll definitely focus a little bit more on App Annie is “What are the risks of building on a third party data set, or a third party exhaust?”
Bertrand: Yeah, we don’t use third party data, actually. We just use first party data. We control it. We have one million apps that are relying on us to track and understand what’s going on in terms of the [inaudible 02:26:27], revenues, time spent, MAUs, that sort of stuff. These one million apps probably represent 50% of all activities on the app stores.
On the other side, we have user panels, millions of users on them. And we fully control and publish these apps. Our strategy is to be completely…to be independent of third parties. To make sure that we have control, we know what we do, and we know how it’s done.
Robin: Jeff, do you want to [inaudible 02:26:58] as well?
Jeff: Yeah, I commented earlier that we have our own…like Bertrand was saying, vertically integrated, we build our own shapes and our…over two million people contribute…50 million users, but two million people last year actually helped edit the database, and then it goes through 40,000 volunteer data quality control volunteers around the world. So we really build our own stack and have our own opt-in.
When we extend and work with thousands of third party publishers, as I said, across the exchange, we have to discard 80% of it. Because often, they don’t have real-time location when they claim they do. You have to be very careful, and I think that’s why for both App Annie and Foursquare thinking about vertically integrating. A very clean, global data set, too. We want our data set to be global.
Robin: For Foursquare, though, the biggest danger must be that you just lose popularity. That people don’t use it as much. So you go to the Myspace direction as it were.
Jeff: Yeah. Obviously, we brought back a lot of the magic. We split our apps and there was a lot of user outcry. But over the last year, we tripled user engagement on Swarm. Which is kind of the core way we build the shapes. And so, we have more check-ins a day than in the history of the company. I feel good about that.
But we don’t need 100 million daily users to build our map of the world. If you think about our opt-in panel, it’s like thousands of times bigger than Nielsen already. But we don’t need to be Facebook to be a really representative sample of the real world economy.
Robin: Fantastic. Well, thank you so much. If you’ll give up your hands for the panel.
Bertrand: Thank you.