Building the Startup Predictor: AI-based Predictive Solution Predicts Viability of Startups with 75% Accuracy

Sonal Mane
7 min read · Jun 1, 2018

The fun ride

It’s like building a magic glass ball that predicts the likelihood of success of startups. Except that it’s not magical at all and has a predictive accuracy of 50% with a recent successful prediction of 75% accuracy!

The past two years have been a fun entrepreneurial ride with a small group of engineers, data scientists and program managers who came together to build out their passion project. We’re delighted to announce that this project will soon be commercialized and see its first real customer. As one of the three co-founders and CEO of this project, I’ve had the pleasure of working with a super talented team: Pushpraj Shukla, our ML guru and CTO; Zoey Zhou, the mastermind and our data scientist; and Jim Brisimitzis, our visionary, fearless leader and COO who led us along the way!

So, what does the Startup Predictor solve for?

Imagine yourself as a venture capitalist or a corporate innovation leader looking at thousands of startups being founded every single day, wondering if the ones you’re talking to and helping are truly the ones that will shine through and see an exit in the near future.

If you did your math right, and your startup has the right team, product-market fit and customers, your chances of success are still at 5–10%. If you’re a top-tier venture capital firm with a suite of MBA grads churning out financial models that rip at the heart of the business model and come up with a valuation that gives you a 10x return, even better. Regardless of your perch, you’re plagued with the phenomenon of startup failures. And even if your startup doesn’t fail, chances are it may get acqui-hired or merge, and you may not get the returns you were hoping for.

Let’s add to that the problem of human biases. As investors, having expertise in an industry vertical or market is often seen as a positive. It also means that you’re likely to have a human bias for startups in that industry. Add to that the sheer volume of deals that a venture capitalist will see in a year.

Photo by Markus Winkler on Unsplash

So here’s the pitch

The Startup Predictor solves for the qualitative biases around startups by creating an unbiased, data-driven, ML-based health score that helps investors and corporations make better decisions and save millions of dollars by making intelligent bets on startups. Unlike other solution providers in this space, the Startup Predictor combines proprietary Bing data with publicly available insights to come up with a health index that has a predictive accuracy of 50%.

“We built this AI to assist humans (and not to replace humans), be it tech scouts or VCs or executives, to see startups beyond their personal lens. We don’t make any assumptions on what makes for a good entrepreneur or a good VC or accelerator or a good or bad area for that matter. We try to model the startup ecosystem as broadly as possible, across hundreds of attributes, as objectively as possible. We let our AI models learn the complex interplay of Team, Industry, Consumers and Financing that goes into making some startups successful sooner than others,” says Pushpraj, sharing a bit about the nature of the model behind the predictor.

In Dec 2017, the predictor picked 44 startups across industries such as Retail, Pharma, Fintech, Education and Government. These became our ‘Companies to Watch’ for 2018, and we would consider every successful round raised or exit a validation of the prediction. It’s May 2018 and 75% of the startups on this list have gone on to raise funding, thereby validating the success rate of the Startup Predictor. A success rate of 75% is pretty high compared to the industry average of 10%. So if we were a venture fund investing in these 44 startups, we’d be off to the races already!
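To make the arithmetic behind that comparison concrete, here is a minimal sketch. The function names are hypothetical, and the count of 33 is simply what 75% of 44 works out to, not a separately reported figure.

```python
def hit_rate(validated: int, total: int) -> float:
    """Share of a watch list that reached a validating milestone."""
    if total <= 0:
        raise ValueError("total must be positive")
    return validated / total


def lift(rate: float, baseline: float) -> float:
    """How many times better a hit rate is than a baseline rate."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return rate / baseline


watch_list_size = 44   # from the Dec 2017 list
validated = 33         # assumption: 75% of 44, derived rather than reported
industry_baseline = 0.10

rate = hit_rate(validated, watch_list_size)
print(f"hit rate: {rate:.0%}, lift over baseline: {lift(rate, industry_baseline):.1f}x")
```

Keep in mind that the list is validated by funding rounds and exits, while the 10% baseline is a general startup success figure, so the lift is an indication rather than a like-for-like comparison.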

Where the magic happens

On the startup team, we started with the idea of leveraging machine learning to build a predictive AI solution. It turns out the Cloud & AI team at the time was building predictive models around the Academy Awards and sports events, and the data scientists and program managers on the team were pretty excited to help us bring this product to life. They had hacked on a similar idea already.

This was the perfect marriage of concept and business need. What followed was a journey of two years where we consistently met as a team and brainstormed how we’d measure success. We talked about the test data that we would need and the challenges around it. We also got very crisp about the use case we were trying to solve for. Data was the biggest hurdle and also the most fun challenge to solve for, from procuring the data to training and testing on it.

Predicting the next Unicorn or Dragon is a very ambiguous problem, so we whittled it down to predicting the next funding milestone. This is important because a funding milestone is typically a sign of things going in the right direction, assuming the post-money valuation was higher after the round.
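One way to picture that target is as a yes/no label per startup. The sketch below is only an illustration under stated assumptions: the `FundingRound` record, the one-year horizon and the handling of undisclosed valuations are all hypothetical choices, not a description of the actual model.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class FundingRound:
    closed_on: date
    post_money_usd: Optional[float]  # None when the valuation is undisclosed


def reached_next_milestone(
    rounds: List[FundingRound], as_of: date, horizon_days: int = 365
) -> bool:
    """True if a round closes within the horizon after `as_of` at a
    higher post-money valuation than the last round known at `as_of`."""
    ordered = sorted(rounds, key=lambda r: r.closed_on)
    past = [r for r in ordered if r.closed_on <= as_of]
    cutoff = as_of + timedelta(days=horizon_days)
    future = [r for r in ordered if as_of < r.closed_on <= cutoff]

    last_known = past[-1].post_money_usd if past else None
    for r in future:
        # Assumption: if either valuation is unknown, the round still counts.
        if last_known is None or r.post_money_usd is None:
            return True
        if r.post_money_usd > last_known:
            return True
    return False


up_round = [
    FundingRound(date(2017, 3, 1), 10e6),
    FundingRound(date(2018, 4, 1), 25e6),
]
print(reached_next_milestone(up_round, as_of=date(2017, 12, 31)))
```

A down round inside the horizon would not count as a milestone under this definition, which mirrors the caveat about the valuation being higher after the round.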

Behind the scenes

There are three key components to the Startup Predictor:

  1. Company data: this is information about the startup, team, social presence, news, web traffic etc.
  2. Industry data: this is information about the industry vertical of the startup and the market forces behind it. Depending on what’s hot, we evaluate AI vs. SaaS vs. VR startups differently.
  3. Financing data: this is information about the investors and the rounds of funding. Were the investors successful as a fund and what were their returns? How many investments succeeded over the past 12 months?

Bringing together these signals is fundamental to the model. We then overlay features such as social media reach, the founders’ education and several other nuances that go into the model. The training data comprises past successes and failures in the startup ecosystem, and we train the model on decades of startup records.
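As a rough sketch of what bringing these signals together could look like, the snippet below flattens the three groups into a single feature dictionary that a model could consume. Every field name here is a hypothetical placeholder chosen for illustration; the real predictor draws on hundreds of attributes that are not listed in this post.

```python
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass
class CompanySignals:          # hypothetical fields
    team_size: int
    monthly_web_visits: float
    news_mentions_90d: int
    social_followers: int


@dataclass
class IndustrySignals:         # hypothetical fields
    vertical: str
    vertical_funding_growth_yoy: float


@dataclass
class FinancingSignals:        # hypothetical fields
    rounds_raised: int
    investor_success_rate_12m: float


def build_features(
    company: CompanySignals, industry: IndustrySignals, financing: FinancingSignals
) -> Dict[str, float]:
    """Flatten the three signal groups into one prefixed feature dict."""
    features: Dict[str, float] = {}
    groups = (("company", company), ("industry", industry), ("financing", financing))
    for prefix, group in groups:
        for name, value in asdict(group).items():
            if isinstance(value, str):
                # Categorical values become one-hot style indicator features.
                features[f"{prefix}.{name}={value}"] = 1.0
            else:
                features[f"{prefix}.{name}"] = float(value)
    return features


features = build_features(
    CompanySignals(12, 40000.0, 7, 5300),
    IndustrySignals("fintech", 0.18),
    FinancingSignals(2, 0.25),
)
for key in sorted(features):
    print(key, features[key])
```

Prefixing each feature with its group keeps the provenance of every signal visible, which helps when inspecting which component is driving a score.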

Lastly, we backtest with portfolios of venture funds and measure how the predictor performs relative to human decisions.
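A backtest of this kind can be sketched as follows, assuming we have model scores for a pool of candidate startups, the subset a fund actually invested in, and the set that later succeeded. The helper names and the single-letter company names are placeholders, not real data.

```python
from typing import Dict, Set


def top_k(scores: Dict[str, float], k: int) -> Set[str]:
    """The k highest-scoring startups according to the model."""
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return set(ranked[:k])


def hit_rate(picks: Set[str], successes: Set[str]) -> float:
    """Fraction of picks that turned out to be successes."""
    return len(picks & successes) / len(picks) if picks else 0.0


def backtest(
    scores: Dict[str, float], fund_portfolio: Set[str], successes: Set[str]
) -> Dict[str, float]:
    """Compare the model's picks with a fund's picks on the same pool,
    giving the model the same number of picks the fund made."""
    model_picks = top_k(scores, k=len(fund_portfolio))
    return {
        "model": hit_rate(model_picks, successes),
        "fund": hit_rate(fund_portfolio, successes),
    }


# Placeholder data for illustration only.
scores = {"a": 0.9, "b": 0.8, "c": 0.3, "d": 0.1}
fund_portfolio = {"a", "c"}
successes = {"a", "b"}

result = backtest(scores, fund_portfolio, successes)
print(result)
```

Giving the model the same number of picks as the fund keeps the comparison fair, since a longer list would otherwise change the hit rate on its own.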

“The thing to note is that we don’t model this as a single monolithic AI problem. There are several layers of AI and machine learning models which solve different tasks, like understanding semantically if two companies are similar based on what they say vs. how people perceive them, modeling the lifecycle of startups, and modeling consumer and industry trends with Search, Social and Web data, among others,” says Pushpraj.

Moving forward

The applications of a predictor that consumes company data and predicts the likelihood of success are varied. From lead-generation to match-making between traditional enterprises and startups, there are many ways to use this health index and the underlying industry trends that surface.

One application in particular for our team has been to scout startups in the wild that never join an accelerator and thereby have very little validation around them. We then pick the top startups and work closely with them to provide resources from across the organization.

Another application that we’re piloting is an analysis of emerging trends based on the startups that we’re seeing. These trends serve as leading indicators for corporate innovation teams, rather than the lagging indicators that venture capital investments provide.

So, what did we learn?

1. Outcomes don’t always have to be extremes such as outright failure or an IPO

A large number of companies are doing exceedingly well over time without raising billions in venture capital. These are steadily growing startups that are likely revenue positive and not hitting the news radar, but are silently growing and eventually seeing successful liquidation events.

On the other hand, exits may mean mergers or acquisitions that don’t necessarily lead to a higher valuation. These companies are interesting because they indicate either that a sudden spike in valuation was inflated and later funding rounds didn’t match up, or that the company merged and created mutual value.

2. Success in the press and media doesn’t equate to success in real business

3. You can build out an idea from concept to commercialization with a frugal approach and a dedicated team of intrapreneurs

4. We tried to remove any personal biases we had about what counts as success or failure, such as exits and IPOs. Are B2B startups better than B2C? Different industries have different startup ecosystems, and success can look different in each. For example, the food industry’s startup ecosystem is more nascent than tech’s.

5. Being able to test our product internally at Microsoft. AI can be just as biased as humans if the data we feed it is biased. Being able to learn, test and remove biases internally at Microsoft is huge. It’s an opportunity not many companies offer. I think this was a perfect training ground for us to test our tech before we release it to other companies.

6. How startups lead and are at the forefront of driving trends in various industries. Trend forecasting is a related R&D area for us, so there is plenty to learn from this project for that as well.

When we started out as a team, this was simply a business application for internal use. We hacked our way through the early days, fundraising internally to build out the product, and even built an early prototype in Excel. Being intrapreneurs meant we all had our day jobs and initially had to rely on our belief that this project would work and on the unwavering support of our leadership team. With a dedicated team and lots of sweat equity, we were able to build out this solution and are now taking it to market. In my 12-year journey, this is the most fulfilling piece of work that I have been able to create, and it couldn’t have happened without the most stellar team.

So to wrap it up, the biggest learning is that you can truly tackle the hardest problems and even predict the next Dragon or Unicorn if you have a rockstar team walking with you every step of the way! No problem is too big or too technically complex to solve and with AI as a tool at your fingertips, it’s the perfect time to pick your passion problem and build your startup!

Microsoft Team:

Sonal Mane, Director of National Programs, Microsoft for Startups; Pushpraj Shukla, Director of Data Science, Cloud & AI; Jim Brisimitzis, Head of Microsoft for Startups North America; Zoey Zhou, Senior Data Scientist, Cloud & AI.
