A deep dive into a state-of-the-art, data-driven approach to sourcing, evaluating, and supporting investments

Signalfire is one of the world’s most ambitious takes on redesigning a venture capital firm.

A self-described “most quantitative fund in the world & the first VC with a demo”, Signalfire is the undisputed champion of the data-driven venture capital movement, which I covered extensively in my previous article.

At the heart of its success lies an end-to-end, real-time data platform called Beacon, a sort of Bloomberg terminal for the startup industry or, as Signalfire CEO Chris Farmer describes it, a “proprietary mini Google”.

In this article, I will try to showcase how Signalfire is using Beacon to create competitive advantage across the entire value chain of a venture.

Some of the things you will learn:

  • What are some of the 10 million data sources the firm uses to uncover future winners?
  • Which metrics does the firm use to spot break-out companies among the broader startup population?
  • What role does AI play in helping Signalfire’s portfolio companies recruit top talent?
  • How does Beacon help Signalfire secure allocation in the hottest deals?
  • What allows the firm to invest $10M / year in the platform despite charging industry-standard management fees?
  • Is Signalfire trying to replace human investors with a system capable of making autonomous investment decisions?
  • How does Beacon empower founders of Signalfire’s portfolio companies to make better strategic decisions?
  • What piece of software allows Signalfire to tap into the collective intelligence of its advisors and LPs for reviewing deals?
  • What is the role of data in increasing diversity and reducing bias?
  • What is the most technically difficult part of building & maintaining Beacon?

To make for easier analysis, I’ve divided the article into 6 sections:
I. Market intelligence
II. Network
III. Talent
IV. Technical
V. Cost
VI. Closing thoughts


I. Market intelligence

Beacon tracks the performance of more than 6 million companies in real-time.

To accomplish this astonishing feat, it draws on more than 10 million data sources, such as academic publications, patent registries, open-source contributions, regulatory filings, company webpages, sales data, App Store rankings, social networks, product crowdfunding sites (Indiegogo), tech communities (Product Hunt), angel group platforms (Gust, Proseeder), expert networks (GLG, Coleman Research Group, Guidepoint), highly gated professional networks (Voray), buyer review sites (Capterra, GetApp, G2Crowd, SoftwareAdvice, TrustRadius), technographics vendors (Datanyze, HG Data), family office coinvestor networks (Sharenett), crowdfunding sites (AngelList, FundersClub, OurCrowd, Republic), and even raw credit card data.

Finding the needle in a haystack

The exact qualities Signalfire looks for in companies are heavily dependent on the sector they operate in. As noted by Chris Farmer, Signalfire’s CEO:

"The sort of factors that influence success in a storage company versus a social network are radically different. One has an app and is ad-supported and the other doesn’t have an app and obviously has a totally different [model]. And then obviously there’s myriad companies in between. So you have to solve for each of those individually. And I would say there are some commonalities across all of those types of companies. And then there’s many dimensions, many more dimensions that are different"

How does the firm know what to track?

The firm starts from first principles about what it thinks matters most in a given market. Using a combination of primary research, market mapping exercises, and data provided by the platform, the firm builds its own perspective on individual markets: the secular headwinds or tailwinds in a given sector, the availability and maturity of core technologies, where the profit pool is likely to accumulate, and the likely characteristics of the winners in that equation. This approach, dubbed “The Prepared Mind”, is championed by a number of top-tier firms, including Accel Partners, Bessemer Venture Partners, and Lux Capital.

Once this baseline is established, the firm starts to proactively track market participants’ performance against the factors they think will influence success.

"What we’re looking at is the same sort of KPIs that you would look at if you were the management team of that company and try to dashboard your business."

That could be anything from the composition and quality of the team (founders’ previous successes, whether key employees have left, movements of talent between companies); through consumer behavior patterns around a given company’s platform (engagement, session duration, how often users come back); customer spending habits; customer lifetime value; financial flows to and from the company and the quality of those flows; to news sentiment around a given company and a myriad of other things.

Of course, the firm’s perspective on the markets and the anatomy of the possible winners is constantly evolving. While this statement is true for most venture firms, thanks to the massive amounts of data it analyses, Beacon takes this capability to another level.

“We always evolve our perspective and map. These are sort of live documents that are constantly learning and improving over time. […], we’re pretty heavily beyond what you could as a human, moving heavily into deep machine learning types of approaches on this and massive statistical studies that we can do as a result of the scale at which we’re operating. And the feedback loops are, you know, in the millions of, of companies and the trillions of data points. It’s impossible to even approach this scale even within orders of magnitude on a manual basis.”

Companies that are outperforming or doing something notable are flagged up on a dashboard, effectively allowing Signalfire to see deals earlier than traditional venture firms.
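One way to picture this kind of flagging is a per-sector outlier check. The sketch below is a hypothetical illustration, not Beacon's actual logic: the metric (monthly growth rate), the company names, and the threshold of 3.5 are all assumptions made up for the example. It scores each company with a modified z-score (based on the median and the median absolute deviation), which is less distorted by the very outliers it is trying to find than a mean-based score would be.

```python
from statistics import median


def flag_outperformers(metric_by_company, threshold=3.5):
    """Return (company, score) pairs whose metric is unusually high vs. peers.

    Uses the modified z-score: 0.6745 * (value - median) / MAD.
    The threshold is an assumed, commonly used rule of thumb.
    """
    values = list(metric_by_company.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)  # median absolute deviation
    if mad == 0:
        return []  # peers are indistinguishable; nothing stands out
    flagged = []
    for name, value in metric_by_company.items():
        score = 0.6745 * (value - med) / mad
        if score > threshold:
            flagged.append((name, score))
    # Strongest outperformers first
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Made-up monthly growth rates for six companies in one sector
sector = {"A": 0.04, "B": 0.05, "C": 0.03, "D": 0.06, "E": 0.05, "F": 0.40}
for name, score in flag_outperformers(sector):
    print(f"{name} flagged with score {score:.1f}")
```

In this toy sector only company F is flagged. A real system would presumably combine many such signals per sector rather than rely on a single metric.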

But it doesn’t end here.

Seeing the full picture

The whole point of pulling all this data is to gain the most comprehensive view of both the company itself and the competitive dynamics of the sector and broader market context in which the company operates. Beacon lets you fluidly zoom out to the broad market level and zoom in to an individual company, with the accompanying data available at every step, to judge whether an opportunity is worth a deeper dive.

To give an example: using Beacon, one can easily see how well a given company’s team stacks up against its main competitor’s team, what the total venture investment volume in the subsector was over the last 12 months, and what the recurring patterns in customer spend are in a given market segment. It can help with questions such as how much a certain product costs in different geographies, how much a special or discount deal would affect revenue growth and profit margins, and how a change to an offering would compare to competitors, as well as tasks like cohort analysis or competitive benchmarking to understand where a given company truly sits relative to its peer group.
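Competitive benchmarking of this sort can be reduced to a simple question: on each metric, what share of the peer group does the company match or beat? The following is a minimal sketch under stated assumptions. The metric names (`retention_90d`, `headcount_growth`) and all numbers are invented for illustration and say nothing about what Beacon actually computes.

```python
def percentile_rank(value, peer_values):
    """Percentage of peers (0-100) whose metric is at or below `value`."""
    if not peer_values:
        raise ValueError("peer group must not be empty")
    at_or_below = sum(1 for v in peer_values if v <= value)
    return 100.0 * at_or_below / len(peer_values)


def benchmark(company, peers, metrics):
    """Rank one company against its peer group on each metric."""
    return {
        metric: percentile_rank(company[metric], [p[metric] for p in peers])
        for metric in metrics
    }


# Made-up numbers: 90-day retention and yearly headcount growth
peers = [
    {"retention_90d": 0.20, "headcount_growth": 0.10},
    {"retention_90d": 0.35, "headcount_growth": 0.50},
    {"retention_90d": 0.25, "headcount_growth": 0.30},
    {"retention_90d": 0.40, "headcount_growth": 0.20},
]
target = {"retention_90d": 0.38, "headcount_growth": 0.15}

print(benchmark(target, peers, ["retention_90d", "headcount_growth"]))
# {'retention_90d': 75.0, 'headcount_growth': 25.0}
```

Read this way, the hypothetical target company retains users better than three quarters of its peers but is hiring more slowly than most of them.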

The man versus the machine

Unlike some of the other firms dabbling with data-driven venture investing, like Deep Knowledge Ventures, Signalfire isn’t trying to create a system capable of making autonomous investment decisions, thus leaving humans out of the equation.

For Signalfire it’s all about creating a symbiotic relationship between the man and the machine.

The quotes below from Chris Farmer are a good proxy for the firm’s thinking:

"We’re a hybrid system that still includes venture capitalists to make the final decisions and balance quantitative inputs (e.g. performance metrics) with qualitative ones (the vision or grit and determination of the founder/team). Great systems are not enough; you need top-quality investors and experts as well, and the combination of the two is optimal — augmented intelligence vs AI".


"Ultimately, spotting a unique investment opportunity is still a human talent — but the data analysis helps the best candidates stand out more. We’re using our technology to weaponize people with superhuman anomaly-detection powers"

Beacon: impact on deal origination and picking investments

So how exactly does Beacon empower humans? There are a few aspects to that.


As described earlier, Beacon allows Signalfire to see deals much earlier than competing funds, creating a major competitive advantage in today’s cutthroat venture landscape.

“A few days of advantage can mean the difference between winning and losing a deal,” says Farmer.


Beacon unearths companies that would never have appeared on the fund’s radar using traditional, network-driven deal sourcing techniques, effectively increasing the size of the very top layer of the firm’s investment funnel.

“We backed a company from Romania that we never would have seen otherwise,” says Farmer.


At any given moment, there are tens of thousands of companies that could, at first sight, be interesting to a VC.

Armed with the ability to constantly monitor and analyze their performance, Signalfire can filter through deals with greater pace and accuracy than traditional venture firms.

“While computers are far better than us at narrowing the scope of what we should look at, we’ve found that humans are much better at actually evaluating the 10–12 promising companies identified by the computer,” says Andrew Ng, partner.

Whereas most firms meet with hundreds of companies (most of which end up going nowhere), SignalFire is able to take far fewer total meetings with an equal if not greater number of fruitful ones. This allows the firm to take a more intimate look at the most promising companies and build stronger relationships with the founders behind them, leading to both better investment decisions and winning more deals.


Most of the old-school VCs ridicule the idea of data-driven investing at the early stage based on the perceived non-existence of conclusive data on both the companies themselves as well as on the nascent sectors they operate in.

While this might be true in the case of extremely deep, revolutionary technologies, according to Farmer it doesn’t apply to the majority of the startup activity.

“You know most things are not completely novel where there’s never been any variation of what they do. Though they are usually an interesting twist or an evolution or even a revolution of what’s been done previously. But, you know, sort of history tends to rhyme. And so the data can give a lot of context on the ecosystem in which a company is pursuing.”

Having the aforementioned ability to easily benchmark a company (or sometimes even an idea) against its peer group and incumbents, to gain deep insights into customer behavior patterns in its sector, and to run multiple what-if scenarios can be instrumental in building a bullish or bearish case and eventually the conviction to invest, especially when moving into uncharted territory.

This sentiment is echoed by Ali Partovi, investor/advisor of Signalfire & cofounder, who calls Beacon a “game-changer for assessing tech investments”. He says he’d “compare it to GPS navigation. While there’s no substitute for using your own sense of direction, it’s indispensable, especially when you venture into new areas.”


As noted by Francesco Corea in his phenomenal article, the venture capital investment process is very prone to bias:

“Many venture capitalists suffer indeed from common psychological biased such as overconfidence (Zacharakis and Shepherd, 2001); availability biases (over-weighting information that comes easily to mind because memorable while underweighting information that is less exciting); information overload (Zacharakis and Meyer, 2000), meaning that more information often leads only to greater confidence and not to greater accuracy; halo effect (how similar this company is to previous exits I had?); survivorship biases; representativeness, which means ignoring statistical information in favour of a narrative; confirmation bias (accepting information that support pre-existing beliefs); and similarity biases (meaning not simply that entrepreneurs with similar educational and professional path are preferred, but also that VCs with a history of working with startups tend to overlook the potential of entrepreneurs with a background in established firms, and vice versa — Franke et al., 2006) ” — wrote Corea.

Decision-making based on pattern recognition, established networks, and gut instinct is the likely cause of both the diversity issues faced today by the startup world as well as the subpar returns venture as a sector is delivering.

Using data to originate, screen, and evaluate investments based on quantifiable factors can play a massive role in reducing the prevalence of bias in decision-making, leading to both better financial returns and higher diversity across venture capital firms’ portfolios.

“It doesn’t entirely eliminate bias but it does make it more of a meritocracy. It makes you take a second look,” says Farmer.


Groupthink is quite possibly one of the worst diseases plaguing the VC population. A lovechild of FOMO, biases, short-term thinking, easy access to capital, and misaligned incentives, it accounts for many of the most spectacular flops in venture history.

The intelligence acquired by Beacon helps the fund to gain their own perspective on how bullish they really are on flavor-of-the-month sectors or companies, and often leads them instead towards less en-vogue/high-consensus areas of investments.


II. Network

Network is arguably the single most important thing a venture firm can offer an entrepreneur.

Given the above, it’s surprising to see that apart from a few notable examples (cc: FirstRound, A16z, Kima Ventures) most venture firms still operate in an extremely inefficient way on the network-enablement front.

Where one would expect a thriving many-to-many network in which all of its nodes (VCs, founders, LPs, advisors, third-party experts, etc.) can freely and instantly communicate with one another, share expertise, and provide intros, thus creating an extremely powerful network effect, what you usually get is a siloed, slow, and extremely opaque structure that requires tens of back-and-forth emails to obtain a single introduction.

Sounds like something ripe for disruption by software, doesn’t it?

Enter Signalfire.

Apart from being a market intelligence platform, Beacon also serves as the communication-enabling layer between the various stakeholders in the Signalfire ecosystem.

All of Signalfire’s 75 advisors (who are also LPs in the fund), founders of the portfolio companies, and the fund’s employees are wired into the system, creating the opportunity to fully tap into the resources (know-how, capital, contacts, etc.) held by the network’s participants.

To ensure maximum engagement on the platform, the firm went as far as creating different, tailor-made versions of the software for different interest groups: there’s a Beacon version for founders, one for advisors, a central system, etc.

This approach brings several indisputable benefits:


Founders Signalfire invests in gain an enormous advantage: the ability to pick the brains of Signalfire’s 75 hand-picked advisor-LPs (including, e.g., Stewart Butterfield, CEO of Slack), who were all granted open access to the platform in exchange for serving as on-demand advisors to Signalfire’s portfolio companies.

As noted by one of Signalfire’s partners, Wayne Hu:

“No matter how long we’ve been in the game, as venture capitalists we can only provide you with B+ advice. The ideal solution is to speak with a canonical expert, and SignalFire has hand-selected more than 75 of the top domain experts across all major industry verticals and functions. […] Our distributed network is in stark opposition to the traditional closed network venture firm, and means we can provide you access to A+ advice no matter the industry or nature of your question.”


The probability of getting a much-needed introduction is an order of magnitude higher when you can tap into the collective networks of 100+ people.


100+ experts all looking at the same, unified data stream regarding a given company, its competitors, and the market in which it operates, connected by a real time communication layer making it easy to share concerns, insights, and recommendations — if you ask me, it’s hard to imagine a better way to tap into collective intelligence to analyze potential deals.


All of the advisor-LPs and many of the portfolio founders are active angel investors. Having a centralized platform where they can easily share dealflow, keep track of new investment opportunities, and share insights on individual deals 1) increases the total dealflow of the firm and 2) makes them more likely to invest alongside the fund as angels in individual deals.

“We often bring them in as co-investors alongside us to really sort of focus their attention on particular companies of relevance,” says Farmer.


III. Talent

The last, but equally important, piece of the puzzle is Beacon Talent, an AI-based system for identifying and sourcing talent. This product directly addresses one of the biggest friction points in recruiting: research, which typically consumes half of the time it takes to conduct a search. The firm’s goal is to reduce this time-intensive effort while simultaneously expanding the scope, quality, and diversity of candidates.

Beacon Talent tracks and provides deep intelligence on nearly the entire talent ecosystem of the tech industry, including engineers, data scientists, product managers, designers, and business leaders. It ranks each person on dozens of quality dimensions, provides real-time predictions of how likely they are to switch jobs, and even proactively pushes new candidates as they become available to help Signalfire’s portfolio companies recruit rising stars.

In this effort, the firm monitors and analyzes the ocean of publicly available data about potential employees, such as their career moves and accomplishments, and distills it down to simplified insights, which can be automatically transferred to the portfolio companies’ native applicant tracking systems.

Signalfire claimed that within just 6 months of launch, Beacon Talent had helped place 55 candidates in its portfolio companies, a quarter of them at the executive level.

One notable example of the system at work came when Zume Pizza was looking for a seasoned executive to manage its operations and turned to Signalfire for help. Using its proprietary software, Signalfire was able to spot and eventually help close a perfect candidate: Susan Alban, who had spent the two previous years as UberEats’ first general manager for San Francisco.


IV. Technical

Ten years ago, Farmer says, the project would have been impossible. “These kinds of data storage and processing capabilities weren’t available. We needed the kind of computer power and storage that was only available in the bigger consumer internet companies.”

Now, however, even smaller companies can crunch data at a large scale, thanks to distributed data processing frameworks such as Apache Hadoop and Apache Spark, and the possibility of renting cheap server capacity from Amazon Web Services.

Theoretically, the technical process is straightforward. A combination of public and private data sources is first selected, then crawled, consolidated, and finally filtered by investment criteria.

In general terms, these AI processes normally entail data crawling modules (i.e. to map, monitor, and extract unstructured sources of data), identification modules (i.e. to homogenize and consolidate company references and understand relations within the start-up network), and clustering modules (i.e. to group and categorize similar players, industries or news).
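To make the clustering step more concrete, here is a deliberately tiny sketch of grouping similar players. It is an illustration only, under loud assumptions: the company names and tags are invented, similarity is plain Jaccard overlap between tag sets, and the greedy single-pass grouping stands in for whatever far more sophisticated methods a production system would use.

```python
def jaccard(a, b):
    """Overlap between two tag sets: |a & b| / |a | b| (0.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_tags(companies, threshold=0.5):
    """Greedy single-pass clustering.

    Each company joins the first cluster whose seed (first member) is similar
    enough; otherwise it starts a new cluster. The result depends on input
    order, which is acceptable for a sketch but not for production use.
    """
    clusters = []  # list of (seed_tags, [member names])
    for name, tags in companies.items():
        for seed_tags, members in clusters:
            if jaccard(tags, seed_tags) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((tags, [name]))
    return [members for _, members in clusters]


# Hypothetical companies with hypothetical descriptive tags
companies = {
    "PayFlow": {"payments", "fintech", "api"},
    "CoinGate": {"payments", "fintech", "crypto"},
    "MedScan": {"health", "imaging", "ai"},
    "RadAI": {"health", "imaging", "ai", "radiology"},
}

print(cluster_by_tags(companies))
# [['PayFlow', 'CoinGate'], ['MedScan', 'RadAI']]
```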

Because these are unstructured data sources, the challenge is to de-duplicate things, normalize them, separate out names that are exactly the same, connect handles to people, etc.
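A simplified sketch of that consolidation problem follows. Everything in it is an assumption for illustration: the record fields (`name`, `domain`, `source`), the list of legal suffixes, and the choice of website domain as the tie-breaker that keeps identically named companies apart. Real entity resolution typically needs fuzzy matching and many more signals.

```python
import re
from collections import defaultdict

# Assumed, non-exhaustive list of legal suffixes to ignore when matching names
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co", "gmbh"}


def normalize_name(raw):
    """Lowercase, strip punctuation, and drop trailing legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", raw.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def consolidate(records):
    """Merge records that share a (normalized name, domain) key.

    The domain is part of the key so that two different companies that happen
    to share a name are not merged into one entity.
    """
    merged = defaultdict(lambda: {"names": set(), "sources": set()})
    for rec in records:
        key = (normalize_name(rec["name"]), rec["domain"].lower())
        merged[key]["names"].add(rec["name"])
        merged[key]["sources"].add(rec["source"])
    return dict(merged)


# Made-up records from three hypothetical sources
records = [
    {"name": "Acme, Inc.", "domain": "acme.io", "source": "registry"},
    {"name": "ACME", "domain": "Acme.io", "source": "app_store"},
    {"name": "Acme Inc", "domain": "acme-robotics.com", "source": "news"},
]

entities = consolidate(records)
for key, info in entities.items():
    print(key, sorted(info["sources"]))
```

The first two records collapse into one entity, while the third, which has the same normalized name but a different domain, stays separate.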

“That’s where 90–95% of the work is, is taking those unstructured data sources, individually structure them to high degrees of accuracy, cleansing the data, data munging. And it’s just all this data janitorial work, and building pipelines that self heal as formats on web pages change. All of that infrastructure that frankly has been built by folks like Google but is a much deeper problem than the data issues at Facebook or some of the social networks, because they’re dealing in relatively structured data. It may be user-generated and so it may have its own flaws but, you know, they’re dealing with much more, much more structured data, whereas the Google’s of the world are out crawling the entirety of the public web and then bringing that back in a structured format. So we do something that’s much more similar to that, except for we really constrain the domain and the sources that we pursue and spend a lot more time and energy cleaning and processing those sources individually before we then combine them into sort of what you see on the front end. And that’s really what makes this so hard. Frankly much harder than I would have ever guessed,” says Farmer.
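What a pipeline that “self heals” might look like in miniature: try several extraction strategies in order, sanity-check the result, and report which strategy worked so that a silent layout change becomes visible. This is a hypothetical sketch, not a description of Beacon. The field (employee count), the HTML snippets, and the patterns are invented, and regular expressions are used only to keep the example short; a real crawler would more likely use a proper HTML parser.

```python
import re

# Hypothetical extractors for an "employee count" field, ordered from the
# current page layout to older or looser fallbacks.
EXTRACTORS = [
    ("data-attr", re.compile(r'data-employees="(\d+)"')),
    ("labelled-span", re.compile(r'<span class="employees">(\d+)</span>')),
    ("free-text", re.compile(r"(\d+)\s+employees", re.IGNORECASE)),
]


def extract_employee_count(html):
    """Try each extractor in turn and return (value, extractor_name).

    Returns (None, None) when every pattern fails, so the caller can raise an
    alert instead of silently storing a bad value.
    """
    for name, pattern in EXTRACTORS:
        match = pattern.search(html)
        if match:
            value = int(match.group(1))
            if 0 < value < 10_000_000:  # basic sanity check on the parsed value
                return value, name
    return None, None


old_layout = '<div data-employees="120">Team</div>'
new_layout = "<p>We are 135 employees strong</p>"

print(extract_employee_count(old_layout))   # (120, 'data-attr')
print(extract_employee_count(new_layout))   # (135, 'free-text')
print(extract_employee_count("<p>n/a</p>")) # (None, None)
```

Logging the name of the extractor that succeeded is the useful part: a sudden shift from the primary pattern to a fallback is a signal that the source changed its format and the primary extractor needs attention.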


V. Cost

It’s not a poor man’s game

Given the magnitude of work required to get to this level of precision, where the firm can actually source unstructured, messy data from more than 10 million sources and structure it in a way that’s streamlined, uniform, single-record, auditable, and usable for driving insights, it shouldn’t come as a surprise that Beacon comes with a hefty price tag.

As reported by Farmer, the firm spends more than ten million dollars a year on the platform.

Even given Signalfire’s considerable fund size, such expense couldn’t be financed from the management fee. So how does the firm actually manage to pull this off financially?

Actually, Signalfire employs a hybrid model: apart from venture investing, it also does advisory work for later-stage investors like Fidelity and for Fortune 100 corporations that are trying to gain a better understanding of the startup landscape.

“So we take the massive amounts of data that we collect with the systems and use that proprietarily for our own venture activity and then also use it to help paint the landscape for, for other constituents that are non-competitive with us that help to [amortize] the heavy expense of building up all this infrastructure. It’s a totally new approach to venture that’s much more of an operating company type approach than the traditional sort of money management structure of most VC funds,” says Farmer.


VI. Closing thoughts

Software is eating venture capital

I opened my previous article with a somewhat controversial thesis:

For an industry which is supposed to be at the forefront of innovation, it’s shocking how outdated, or rather broken, venture capital really is.
It’s almost 2020, and it’s still (mostly) the same biased guys investing in their nearby living friends solving imaginary problems, basing their investment decisions on crystal ball called intuition, bringing on average net negative value to the entrepreneurs & losing investors’ money in the process. Add a hefty dose of misaligned incentives, massive operational inefficiency, resistance to adapt tech, and you’ve got a fairly decent picture of VC AD2019.
This status quo, however, is finally starting to get questioned by a new breed of investors. Equipped with data, algorithms, and custom-made software, those rebel venture capitalists arrogant enough to question “the way things have always been done” are quickly gaining a foothold in the venture world, creating a new paradigm of meritocracy.

14 months and hundreds of conversations later, I am more convinced than ever that software is in fact eating venture capital, and those who do not adapt to the new reality will soon become obsolete.

As elegantly summarised by Kelvin Yu:

When Ilya and Chris (Farmer) were still exploring the idea, many VCs told Ilya that SignalFire’s strategy wouldn’t work. You can imagine that a successful GP might think, “My way has worked well in the past, so there’s no reason to change.”
However, that is precisely the same mentality that allowed so many venture-backed startups to destroy comfortable incumbents over the years. The whole business of VC is to be optimistic about innovation, to upend the status quo. VCs often question how scalable a startup’s business model is or how software can revolutionize an industry. It is a little ironic then, that by and large they have not looked at how software can improve their own business.

If you’re a VC, now is the time to ask yourself:

Am I the disruptor or am I getting disrupted?