This post was co-written with Richard Harmon Ph.D., Director, EMEA Financial Services at Cloudera.
Active management is fundamentally about uncovering opportunities before they are priced in by the broader market. Over the past five years, a growing number of asset management firms have been integrating alternative data sources with conventional market data, enriching trading strategies with additional insights to gain an informational advantage over the rest of the market.
The informational advantage that can be gained through alternative data will diminish as those sources become generally understood and adopted. In other words, alpha decays at an accelerated rate as more firms process and analyze similar alternative data sources (though if a competitor's implementation is poor, little ground is lost).
To avert alpha decay, it is critical to understand how alternative data will contribute to your strategy and make certain that it is applied for the right reasons (fit for purpose). It is also critical to ensure the integrity of the underlying data and to optimize the models that use it, thus accelerating an informed decision-making process.
Understand How Alternative Data Contributes to Your Bottom Line
An alternative data strategy could include non-financial data such as location/foot traffic, energy supply/demand, weather, IoT, patent filings, satellite imagery, or telecommunications-related information. It could also employ “traditional” financials from private company data if it is applied in an alternative way. Below are two examples of how data sets like these can contribute to a firm’s competitive edge.
The first is helping fund managers demonstrate their investment strategy to clients, ultimately increasing the total assets they manage.
In a recent Greenwich Associates research report, 88 percent of the buy-side firms surveyed said that they demonstrate trading strategies to investors; of those, 95 percent use alternative data to explain those strategies (see chart below).
The second is using alternative data to surface unique and timely signals about market trends. For example, in April 2016, Foursquare, a location intelligence provider, correctly predicted that Chipotle's first-quarter sales would be down nearly 30 percent, based on foot traffic data derived from its mobile apps. Two weeks later, the restaurant chain reported the same decline, and shares fell 6 percent the following day.
The Chipotle story exemplifies two concepts vital to drawing value out of alternative data: “fit for purpose” and “timeliness”:
- Fit for purpose: Alternative data are data sources that are both new and useful for a specific purpose. In the above scenario, foot traffic is used in a new and alternative way, but most important, it is useful data because customers walk in to order food. Compare this to automobile traffic: while also a novel approach, it would be a less exact proxy, since drivers passing a location are not necessarily customers.
- Timeliness: The purpose of alternative data is to provide additional, actionable insight to a strategy before it becomes widely understood and generally used. The Foursquare example took place more than two years ago, when the use of foot traffic data was ahead of its time; the technique is still novel, but it is becoming more widely used.
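The foot-traffic idea above can be sketched as a simple linear projection: fit the historical relationship between traffic changes and sales changes, then apply it to a newly observed traffic drop. The figures below are synthetic, illustrative numbers, not actual Foursquare or Chipotle data.

```python
import numpy as np

def estimate_sales_change(foot_traffic_change, slope, intercept=0.0):
    """Project a quarter-over-quarter sales change (%) from an observed
    foot-traffic change (%) using a previously fitted linear model."""
    return slope * foot_traffic_change + intercept

# Fit the relationship on illustrative historical pairs of
# (foot-traffic % change, same-store sales % change) -- synthetic data.
traffic = np.array([-25.0, -10.0, -2.0, 3.0, 8.0, 15.0])
sales = np.array([-28.0, -12.0, -1.5, 2.5, 9.0, 14.0])
slope, intercept = np.polyfit(traffic, sales, 1)

# A large observed drop in foot traffic implies a comparable sales decline.
projected = estimate_sales_change(-27.0, slope, intercept)
print(round(projected, 1))
```

A production signal would of course control for seasonality, store openings, and data coverage; the point is only that the mapping from traffic to sales is itself a model that must be validated.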
Ensure Fit for Purpose with a Team of Subject Matter Experts
According to Eagle Alpha, “Data is only as valuable as the questions asked of it. Investors who have unique angles in their fundamental analysis will seek answers to questions that other investors have not thought of asking.”
Asking the right questions is an iterative and creative process that is driven by subject matter experts (i.e., portfolio managers). They and their teams are the ones best equipped to derive meaning (and thus value) from new types of data because they understand the historical and business relevance of a topic and can quickly discern nuanced disruptions that could flag an opportunity. The same Greenwich Associates study noted earlier supports this observation. It found that two-thirds of the buy-side firms interviewed take a team-based approach to analyzing alternative data. The team collaborates to ask the questions, find the data, run the models, and revise the questions.
For instance, consider sentiment analysis, another type of alternative data source. Cloudera and Arcadia Data have joint customers that leverage advanced text analytics to extract insights from sentiment-based signals derived from semi-structured data sources, such as a company’s earnings announcements, independent analyst publications, and annual reports. The success of this approach is not purely algorithm-driven. The teams executing this strategy have the deep market, product, and analytic knowledge to assess not only that a signal adds relevant information to an investment decision but how best to execute on this information.
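As a minimal sketch of the sentiment-signal idea, the snippet below scores a passage of earnings-announcement text with a toy word lexicon. The word lists are illustrative assumptions only; a real deployment would use a trained model or a finance-specific dictionary rather than this hand-picked vocabulary, and this is not a description of the Cloudera/Arcadia Data implementation.

```python
import re

# Toy lexicon -- illustrative assumption, not a production dictionary.
POSITIVE = {"growth", "record", "exceeded", "strong", "improved"}
NEGATIVE = {"decline", "weak", "shortfall", "impairment", "missed"}

def sentiment_score(text: str) -> float:
    """Return a signal in [-1, 1]: net share of positive vs. negative words."""
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

announcement = ("Revenue exceeded guidance on record subscription growth, "
                "partly offset by a decline in hardware margins.")
print(sentiment_score(announcement))
```

Even at this scale, the limits of a pure algorithm are visible: only a reader with market knowledge can judge whether a mildly positive score on a mixed announcement is tradable information.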
Ensure Data Integrity by Using Data in Its Native Format
Beyond the subject matter expertise of the team, the caliber of the algorithmic models the team builds and uses depends on the integrity of the underlying data. Alternative data comes in many varieties and formats, so if you transform that data to fit a predefined, generalized structure, the meaning of the data can be lost in translation. A data lake architecture is designed to process, store, and analyze both structured and unstructured data in their native formats. Data integrity is preserved because the data has not been changed.
The value of running proprietary models (i.e., your firm’s unique perspective) against data in its native format is that the answers to your questions are not derived from transformed interpretations. The data lake is the ideal environment to capture unique signals and make informed decisions from a wide range of real-time and historical alternative and conventional data sources. To facilitate collaboration that maintains the integrity described, native visual analytics is needed.
Native visual analytics refers to building data visualization apps and dashboards from within the data lake, leveraging the native formats of both alternative and conventional data sets. In conjunction with an intuitive, self-service web interface, the subject matter experts and their teams are able to collaborate on the same big picture. Because the data has not been translated (i.e., summarized, aggregated, and moved to a separate visualization server), the team can have confidence in the integrity of the analysis. They also have direct access to the output of the operationalized models described below in granular detail, not abstract summarizations. Any other approach reduces the agility and speed required to seek insights quickly and effectively in a highly competitive environment.
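To make the "native format" point concrete, the sketch below queries raw, semi-structured alternative-data records as-is: each record is parsed on read, optional fields are tolerated, and the source is never reshaped into a fixed schema. The records, tickers, and field names are hypothetical, invented for illustration.

```python
import json

# Raw alternative-data records kept in their native (nested JSON) form.
# Ticker, dates, and field names are illustrative assumptions.
raw_feed = [
    '{"ticker": "XYZ", "week": "2016-04-04", '
    '"locations": {"visits": 8200, "dwell_min": 11.2}}',
    '{"ticker": "XYZ", "week": "2016-04-11", "locations": {"visits": 6100}}',
]

def weekly_visits(lines):
    """Read the feed in place: parse each record at query time rather
    than forcing the source into a predefined, generalized table."""
    for line in lines:
        rec = json.loads(line)
        # .get() tolerates records that omit optional fields.
        yield rec["ticker"], rec["week"], rec["locations"].get("visits")

for ticker, week, visits in weekly_visits(raw_feed):
    print(ticker, week, visits)
```

Because nothing is summarized or moved ahead of time, a new question (say, about dwell time) can be asked of the same raw records without re-ingesting the data.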
Ensure Timeliness by Operationalizing Analytics
The team’s proprietary models referred to earlier not only need to use data in its native form; they also need to be operationalized. Operationalizing models is key to lowering the total cost of executing a trade, thus improving the economics of the strategy and the odds of generating alpha. Costs include the allocated expense of the data, the right personnel, and the time and effort required to build the models.
Data science workbenches, for example, are tools that help ensure efficient collaboration across the team and streamline the processes required to rapidly make a model operational within a controlled environment, ensuring data governance and regulatory compliance. Data science workbenches allow the team to continuously monitor and measure the predictive power and behavior of their models. Tools such as these facilitate a repeatable, industrial-scale process for developing the models and a reliable architecture for deploying them into production systems.
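The continuous monitoring a workbench enables can be reduced to a small sketch: track a rolling hit-rate of a model's directional calls and flag the model for review when its predictive power decays below a floor. The class, window size, and threshold below are illustrative assumptions, not a specific product's API.

```python
from collections import deque

class ModelMonitor:
    """Minimal sketch of continuous model monitoring: keep a rolling
    hit-rate of directional predictions and flag decay past a floor."""

    def __init__(self, window: int = 50, floor: float = 0.55):
        self.outcomes = deque(maxlen=window)  # recent correct/incorrect calls
        self.floor = floor

    def record(self, predicted_up: bool, actual_up: bool) -> None:
        self.outcomes.append(predicted_up == actual_up)

    def hit_rate(self) -> float:
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 0.0

    def needs_review(self) -> bool:
        # Only alert once the rolling window is full of observations.
        return len(self.outcomes) == self.outcomes.maxlen and \
            self.hit_rate() < self.floor

monitor = ModelMonitor(window=4, floor=0.6)
for pred, actual in [(True, True), (True, False), (False, False), (True, False)]:
    monitor.record(pred, actual)
print(monitor.hit_rate(), monitor.needs_review())
```

A production setup would also monitor input distributions for drift and route alerts through the firm's governance process, but the feedback loop is the same: measure, compare to a benchmark, retrain or retire.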
A purposeful, team-based approach, assessing alternative and conventional data sets in their native form, and executing models that have been operationalized will facilitate well-thought-out investment objectives, while ensuring that the time, staffing, data selection, and technology commitments are worthwhile. This will maximize the shelf life of your alternative data strategy, even as others travel down a similar path.