If there’s one technology that promises to change the world more than any other over the next several decades, it’s (arguably) machine learning.
By enabling computers to learn certain things more efficiently than humans, and discover certain things that humans cannot, machine learning promises to bring increasing intelligence to software everywhere and enable computers to develop new capabilities – from driving cars to diagnosing disease – that were previously thought to be impossible.
While most of the core algorithms that drive machine learning have been around for decades, what has magnified its promise so dramatically in recent years is the extraordinary growth of the two fuels that power these algorithms – data and computing power.
Both continue to grow at exponential rates, suggesting that machine learning is at the beginning of a very long and productive run.
As revolutionary as machine learning will be, its impact will be highly asymmetric. While most machine learning algorithms, libraries and tools are in the public domain and computing power is a widely available commodity, data ownership is highly concentrated.
This means that machine learning will likely have a barbell effect on the technology landscape. On one hand, it will democratize basic intelligence through the commoditization and diffusion of services such as image recognition and translation into software broadly. On the other, it will concentrate higher-order intelligence in the hands of a relatively small number of incumbents that control the lion’s share of their industry’s data.
For startups seeking to take advantage of the machine learning revolution, this barbell effect is a helpful lens through which to look for the biggest business opportunities. While machine learning will enable many new kinds of startups, the most promising will likely cluster around the incumbent end of the barbell.
Democratization of Basic Intelligence:
One of machine learning’s most lasting areas of impact will be to democratize basic intelligence through the commoditization of an increasingly sophisticated set of semantic and analytic services, most of which will be offered for free, enabling step-function changes in software capabilities. These services today include image recognition, translation and natural language processing and will ultimately include more advanced forms of interpretation and reasoning.
Software will become smarter, more anticipatory and more personalized, and we will increasingly be able to access it through whatever interface we prefer – chat, voice, mobile application, web, or others yet to be developed. Beneficiaries will include technology developers and users of all kinds.
This burst of new intelligent services will give rise to a boom in new startups that use them to create new products and services that weren’t previously cost effective or possible. Image recognition, for example, will enable new kinds of visual shopping applications. Facial recognition will enable new kinds of authentication and security applications. Analytic applications will grow ever more sophisticated in their ability to identify meaningful patterns and predict outcomes.
Startups that end up competing directly with this new set of intelligent services will be in a difficult spot. Competition in machine learning can be close to perfect, wiping out any potential margin, and it is unlikely many startups will be able to acquire data sets to match Google or other consumer platforms for the services they offer. Some of these startups may be bought for the asset values of their teams and technologies (which at the moment are quite high), but most will have to change tack in order to survive.
This end of the barbell effect is being accelerated by open source efforts such as OpenAI as well as by the decision of large consumer platforms, led by Google with TensorFlow, to open source their artificial intelligence software and offer machine learning-driven services for free, as a means of both selling additional products and acquiring additional data.
Concentration of Higher-Order Intelligence:
At the other end of the barbell, machine learning will have a deeply monopoly-inducing or monopoly-enhancing effect, enabling companies that have or have access to highly differentiated data sets to develop capabilities that are difficult or impossible for others to develop.
The primary beneficiaries at this end of the spectrum will be the same large consumer platforms, such as Google, that offer free services, as well as other enterprises in concentrated industries that have highly differentiated data sets.
Large consumer platforms already use machine learning to take advantage of their immense proprietary data to power core competencies in ways that others cannot replicate – Google with search, Facebook with its newsfeed, Netflix with recommendations and Amazon with pricing.
Incumbents with large proprietary data sets in more traditional industries are beginning to follow suit. Financial services firms, for example, are beginning to use machine learning to take advantage of their data to deepen core competencies in areas such as fraud detection, and ultimately they will seek to do so in underwriting as well. Retail companies will seek to use machine learning in areas such as segmentation, pricing and recommendations and healthcare providers in diagnosis.
Most large enterprises, however, will not be able to develop these machine learning-driven competencies on their own. This opens an interesting third set of beneficiaries at the incumbent end of the barbell: startups that develop machine learning-driven services in partnership with large incumbents based on these incumbents’ data.
Where the Biggest Startup Opportunities Are:
The most successful machine learning startups will likely result from creative partnerships and customer relationships at this end of the barbell.
The magic ingredient for creating revolutionary new machine learning services is extraordinarily large and rich data sets. Proprietary algorithms can help, but they are secondary in importance to the data sets themselves.
What’s critical to making these services highly defensible is privileged access to these data sets. If possession is nine tenths of the law, privileged access to dominant industry data sets is at least half the ballgame in developing the most valuable machine learning services.
The dramatic rise of Google provides a glimpse into what this kind of privileged access can enable.
What allowed Google to rapidly take over the search market was not primarily its PageRank algorithm or clean interface, but these factors in combination with its early access to the data sets of AOL and Yahoo, which enabled it to train PageRank on the best available data on the planet and become substantially better at determining search relevance than any other product.
Google ultimately chose to use this capability to compete directly with its partners, a playbook that is unlikely to be possible today since most consumer platforms have learned from this example and put legal barriers in place to prevent it from happening to them.
There are, however, a number of successful playbooks to create more durable data partnerships with incumbents.
In consumer industries dominated by large platform players, the winning playbook in recent years has been to partner with one or ideally multiple platforms to provide solutions for enterprise customers that the platforms were not planning (or, due to the cross-platform nature of the solutions, were not able) to provide on their own, as companies such as Sprinklr, Hootsuite and Dataminr have done.
The benefits to platforms in these partnerships include new revenue streams, new learning about their data capabilities and broader enterprise dependency on their data sets.
In concentrated industries dominated not by platforms but by a cluster of more traditional enterprises, the most successful playbook has been to offer data-intensive software or advertising solutions that provide access to incumbents’ customer data, as Palantir, IBM Watson, Fair Isaac, AppNexus and Intent Media have done. If a company gets access to the data of a significant share of incumbents, it will be able to create products and services that will be difficult for others to replicate.
New playbooks are continuing to emerge, including creating strategic products for incumbents or using exclusive data leases in exchange for the right to use incumbents’ data to develop non-competitive offerings.
Of course the best playbook of all, where possible, is for startups to grow fast enough and generate sufficiently large data sets in new markets to become incumbents themselves and forgo dependencies on others (as, for example, Tesla has done for the emerging field of autonomous driving).
This tends to be the exception rather than the rule, however, which means most machine learning startups need to look to partnerships or large customers to achieve defensibility and scale.
Machine learning startups should be particularly creative when it comes to exploring partnership structures as well as financial arrangements to govern them – including discounts, revenue shares, performance-based warrants and strategic investments. In a world where large data sets are becoming increasingly valuable to outside parties, it is likely that such structures and arrangements will continue to evolve rapidly.
Perhaps most importantly, startups seeking to take advantage of the machine learning revolution should move quickly, because many top technology entrepreneurs have woken up to the scale of the business opportunities this revolution creates, and there is a significant first-mover advantage in securing access to the most attractive data sets.
Slowly but surely, cyber security is evolving from the days of castles and moats into the modern era of software-driven business. In the 1990s, after several failed attempts to build secure operating systems, the predominant security model became the network-perimeter security model enforced by firewalls. The way it worked was clear: Machines on the inside of the firewall were trusted, and anything on the outside was untrusted. This castle-and-moat approach failed almost as quickly as it began, because holes in the wall had to be created to allow emerging internet services like news, email and web traffic through.
With a security wall that quickly became like Swiss cheese, machines on both sides were still vulnerable to infection and the antivirus industry emerged to protect them. The model for antivirus then and now is to capture an infection, create a signature, and then distribute it widely to “immunize” other machines from getting infected by the same malware. This worked for vaccines, so why not try for cyber security?
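The capture, sign and distribute cycle described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular antivirus product works: it assumes the "signature" is simply a SHA-256 hash of the whole sample (real products typically use richer byte patterns), and the names `make_signature` and `SignatureScanner`, as well as the payload bytes, are hypothetical.

```python
import hashlib


def make_signature(sample: bytes) -> str:
    """Create a signature for a captured sample (assumption: a plain SHA-256 hash)."""
    return hashlib.sha256(sample).hexdigest()


class SignatureScanner:
    """Hypothetical scanner that only recognizes samples it has a signature for."""

    def __init__(self):
        self.signatures = set()

    def add_signature(self, signature: str) -> None:
        # "Distribute" step: every protected machine receives the new signature.
        self.signatures.add(signature)

    def is_known_malware(self, data: bytes) -> bool:
        return make_signature(data) in self.signatures


# Capture an infection, create a signature, and distribute it.
captured = b"hypothetical malicious payload v1"
scanner = SignatureScanner()
scanner.add_signature(make_signature(captured))

# The exact same sample is now caught...
exact_copy_detected = scanner.is_known_malware(captured)

# ...but a variant that differs by a single byte has a different hash.
variant = captured + b"\x00"
variant_detected = scanner.is_known_malware(variant)

print(exact_copy_detected, variant_detected)
```

Under these assumptions the scanner immunizes machines only against samples that have already been captured somewhere, which is why malware that is never reused slips past it.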
Fast-forward to 2016, and the security industry hasn’t changed much. The large security companies still pitch the castle-and-moat model of security (firewalls and signature-based detection), even though employees work outside the perimeter as much as inside, and even though most attacks today use one-and-done exploit kits that never reuse the same malware. In other words, the modern work force coupled with modern threats has rendered traditional security techniques obsolete.
Software is eating security
While most enterprises today still employ these dated security techniques, a new model of security based on artificial intelligence (AI) is beginning to take root in organizations with advanced security programs. Necessity is the mother of invention, and the necessity for AI in security became obvious when three phenomena emerged: (1) the failure of signature-based techniques to stop current threats; (2) the sheer volume of security threat data; and (3) the difficulty of scaling teams of people to process that data.
“Software is eating the world,” the noted venture capitalist Marc Andreessen famously said in 2011 about such obvious examples as Amazon, Uber and Airbnb disrupting traditional retail and consumer businesses. The security industry is ripe for the same kind of disruption in the enterprise space, and ultimately in the consumer product space. Artificial intelligence will replace large teams of tier-1 SOC analysts who today stare at endless streams of threat alerts. Machines are far better than humans at processing vast amounts of data and finding the proverbial needle in the haystack.
Artificial intelligence is experiencing a resurgence in commercial interest because of breakthroughs with deep learning neural networks solving practical problems. We’ve all heard about IBM’s Watson winning at “Jeopardy,” or making difficult medical diagnoses by leveraging artificial intelligence. What is less well known is that Watson has recently undergone a major deep learning upgrade, allowing it to translate to and from many languages and to perform text-to-speech and speech-to-text operations flawlessly.
Many of us interact with deep learning algorithms unwittingly when we see TV show and movie recommendations on Netflix based on what we’ve viewed previously, when a Mac properly identifies everyone in a picture uploaded from a phone, or when we ask Alexa a question and Amazon Echo gives an intelligent response (likewise for Cortana and Siri). And one of the most hotly debated topics in machine learning these days is self-driving cars, like Tesla’s amazing Model S.
Deep learning allows a machine to think more like a human. For instance, a child can easily distinguish a dog from a cat. But to a machine, a dog is just a set of pixels and so is a cat, which makes the process of distinguishing them very hard for a machine. Deep learning algorithms can train on millions of pictures of cats and dogs so that when your in-house security camera sees the dog in your house, it will know that it was Rover, not Garfield, who knocked over the vase.
The power of deep learning becomes clear when you consider the vast speed and processing power of modern computers. For instance, it takes a child a few years to learn the difference between a house cat and a dog. And if that child grew up to be a cat “expert,” it would take Gladwell’s 10,000 hours to become a feline whisperer. The amount of time it takes to expose a human to all of the training data necessary to classify animals with near perfection is long. In contrast, a deep learning algorithm paired with elastic cloud computing resources can consume hundreds of millions of samples of training data in hours, to create a neural network classifier so accurate and so fast that it would outperform even the most highly trained human experts.
Even more fascinating than machines that think like a human are machines that act like one. Since the 1950s, we’ve been fascinated with the notion that robots might one day be able to think, act and interact with us as our equals. With advances in deep learning, we’re one giant step closer to that reality. Take the Google Brain Team’s DeepDream research, for instance, which shows that machines trained in deep learning can create beautiful pieces of art, in a bizarre form of psychedelic machine “dreaming.” For the first time, we see incredible creativity from machines because of deep learning, as well as the ability to make decisions with incredible accuracy.
Because of this ability to make classification decisions with incredible accuracy, deep learning is leading a renaissance in security technologies, where it is used to distinguish unknown malware from benign programs. Like the examples above, this is done by training deep learning neural networks on tens of millions of variants of malware, as well as on a representative sample of known benign programs.
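The idea of learning to separate malware from benign programs using labeled examples can be illustrated with a toy sketch. Everything here is an assumption made for illustration: a two-feature logistic regression stands in for a deep neural network, the "benign" and "malicious" samples are synthetic byte strings (plain text versus random, high-entropy bytes), and the helper names are hypothetical. Real systems train far larger models on millions of real samples.

```python
import math
import random


def features(data: bytes):
    """Two hand-picked features: printable-byte fraction and normalized byte entropy."""
    n = len(data)
    printable = sum(32 <= b < 127 for b in data) / n
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    entropy = -sum((c / n) * math.log2(c / n) for c in counts if c)
    return [printable, entropy / 8.0]  # 8 bits is the maximum entropy of a byte


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=500, lr=1.0):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    xs = [features(s) for s in samples]
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in zip(xs, labels):
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w = [w[i] - lr * gw[i] / len(xs) for i in range(2)]
        b -= lr * gb / len(xs)
    return w, b


def predict_malicious(model, data: bytes) -> bool:
    w, b = model
    x = features(data)
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b) > 0.5


# Synthetic stand-ins (assumption): text-like bytes are "benign",
# random bytes mimic a packed or encrypted "malicious" payload.
def fake_benign(rng, n=256):
    return bytes(rng.choice(b"abcdefghijklmnopqrstuvwxyz ") for _ in range(n))


def fake_malicious(rng, n=256):
    return bytes(rng.randrange(256) for _ in range(n))


rng = random.Random(0)
train_x = [fake_benign(rng) for _ in range(50)] + [fake_malicious(rng) for _ in range(50)]
train_y = [0] * 50 + [1] * 50
model = train(train_x, train_y)

# Evaluate on samples the model never saw during training.
test_rng = random.Random(1)
held_out = [(fake_benign(test_rng), False) for _ in range(20)]
held_out += [(fake_malicious(test_rng), True) for _ in range(20)]
accuracy = sum(predict_malicious(model, s) == y for s, y in held_out) / len(held_out)
print(f"held-out accuracy: {accuracy:.2f}")
```

The point of the sketch is that the model flags samples it has never seen before, because it learns general properties of the two classes rather than memorizing individual files.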
The results are industry-changing, because unlike legacy security products that provided protection either through prior knowledge of a threat (signature-based) or via segmentation and separation, today’s next-generation security products can identify and kill malware as fast as the bad guys can create it. Imagine a world where security technologies actually enable more sharing rather than less, and allow a more open approach to data access rather than a restrictive one. This is the direction deep learning is allowing us to go.
Are you ready?
Disruption is clearly coming to the security space. The market has been waiting for better technology that can keep pace with the fast-evolving adversarial threat. Breakthroughs in deep learning artificial neural networks are now stopping previously unseen attacks in real time, before they even have a chance to run. It’s time to get on board with a new generation of technology that is disrupting traditional castle-and-moat security models.