QuantRocket logo

© Copyright Quantopian Inc.
© Modifications Copyright QuantRocket LLC
Licensed under the Creative Commons Attribution 4.0.

Disclaimer

Estimation of Covariance Matrices

By Christopher van Hoecke and Max Margenot

Volatility has long been a thorn in the side of investors in the market. Successfully measuring volatility would allow for more accurate modeling of the returns and more stable investments leading to greater returns, but forecasting volatility accurately is a difficult problem.

Measuring Volatility

Volatility needs to be forward-looking and predictive in order to make smart decisions. Unfortunately, simply taking the historical standard deviation of an individual asset's returns falls short when we take into account need for robustness to the future. When we scale the problem up to the point where we need to forecast the volatility for many assets, it gets even harder.

To model how a portfolio overall changes, it is important to look not only at the volatility of each asset in the portfolio, but also at the pairwise covariances of every asset involved. The relationship between two or more assets provides valuable insights and a path towards reduction of overall portfolio volatility. A large number of assets with low covariance would assure they decrease or increase independently of each other. Indepedent assets have less of an impact on our portfolio's volatility as they give us true diversity and help us avoid position concentration risk.

Covariance

In statistics and probability, the covariance is a measure of the joint variability of two random variables. When random variables exhibit similar behavior, there tends to be a high covariance between them. Mathematically, we express the covariance of X with respect to Y as:

$$ COV(X, Y) = E[(X - E[X])(Y - E[Y])]$$

Notice that if we take the covariance of $X$ with itself, we get:

$$ COV(X, X) = E[(X - E[X])(X - E[X])] = E[(X - E[X])^2] = VAR(X) $$

We can use covariance to quantify the similarities between different assets in much the same way. If two assets have a high covariance, they will generally behave the same way. Assets with particularly high covariance can essentially replace each other.

Covariance matrices form the backbone of Modern Portfolio theory (MPT). MPT focuses on maximizing return for a given level of risk, making essential the methods with which we estimate that risk. We use covariances to quantify the joint risk of assets, forming how we view the risk of an entire portfolio. What is key is that investing in assets that have high pairwise covariances provides little diversification because of how closely their fluctuations are related.

Let's take the covariance of two closely related variables, $X$ and $Y$. Say that $X$ is some randomly drawn set and that $Y = 5X + \epsilon$, where $\epsilon$ is some extra noise. We can compute the covariance using the formula above to get a clearer picture of how $X$ evolves with respect to asset $Y$.

We can also compute the covariance between $X$ and $Y$ with a single function.

This gives us the covariance matrix between $X$ and $Y$. The diagonals are their respective variances and the indices $(i, j)$ refer to the covariance between assets indexed $i$ and $j$.

In this case, we only have two assets so we only have indices $(0, 1)$ and $(1, 0)$. Covariance matrices are symmetric, since $COV(X, Y) = COV(Y, X)$, which is why the off-diagonals mirror each other.

We can intuitively think of this as how much $Y$ changes when $X$ changes and vice-versa. As such, our covariance value of about 5 could have been anticipated from the definition of the relationship between $X$ and $Y$.

Here is a scatterplot between $X$ and $Y$ with a line of best fit down the middle.

Between the covariance, the linear regression, and our knowledge of how $X$ and $Y$ are related, we can easily assess the relationship between our toy variables. With real data, there are two main complicating factors. The first is that we are examining significantly more relationships. The second is that we do not know any of their underlying relationships. These hindrances speak to the benefit of having accurate estimates of covariance matrices.

The Covariance Matrix

As the number of assets we are curious about increases, so too do the dimensions of the covariance matrix that describes their relationships. If we take the covariance between $N$ assets, we will get out a $N \times N$ covariance matrix. This allows us to efficiently express the relationships between many arrays at once. As with the simple $2\times 2$ case, the $i$-th diagonal is the variance of the $i$-th asset and the values at $(i, j)$ and $(j, i)$ refer to the covariance between asset $i$ and asset $j$. We display this with the following notation:

$$ \Sigma = \left[\begin{matrix} VAR(X_1) & COV(X_1, X_2) & \cdots & COV(X_1, X_N) \\ COV(X_2, X_0) & VAR(X_2) & \cdots & COV(X_2, X_N) \\ \vdots & \vdots & \ddots & \vdots \\ COV(X_N, X_1) & COV(X_N, X_2) & \cdots & VAR(X_N) \end{matrix}\right] $$

When trying to find the covariance of many assets, it quickly becomes apparent why the matrix notation is more favorable.

Why does all this matter?

We measure the covariance of the assets in our portfolio to make sure we have an accurate picture of the risks involved in holding those assets together. We want to apportion our capital amongst these assets in such a way as to minimize our exposure to the risks associated with each individual asset and to neutralize exposure to systematic risk. This is done through the process of portfolio optimization. Portfolio optimization routines go through exactly this process, finding the appropriate weights for each asset given its risks. Mean-variance optimization, a staple of MPT, does exactly this.

Estimating the covariance matrix becomes critical when using methods that rely on it, as we cannot know the true statistical relationships underlying our chosen assets. The stability and accuracy of these estimates are essential to getting stable weights that encapsulate our risks and intentions.

Unfortunately, the most obvious way to calculate a covariance matrix estimate, the sample covariance, is notoriously unstable. If we have fewer time observations of our assets than the number of assets ($T < N$), the estimate becomes especially unreliable. The extreme values react more strongly to changes, and as the extreme values of the covariance jump around, our optimizers are perturbed, giving us inconsistent weights. This is a problem when we are trying to make many independent bets on many assets to improve our risk exposures through diversification. Even if we have more time elements than assets that we are trading, we can run into issues, as the time component may span multiple regimes, giving us covariance matrices that are still inaccurate.

The solution in many cases is to use a robust formulation of the covariance matrix. If we can estimate a covariance matrix that still captures the relationships between assets and is simultaneously more stable, then we can have more faith in the output of our optimizers. A main way that we handle this is by using some form of a shrinkage estimator.

Shrinkage Estimators

The concept of shrinkage stems from the need for stable covariance matrices. The basic way we "shrink" a matrix is to reduce the extreme values of the sample covariance matrix by pulling them closer to the center. Practically, we take a linear combination of the sample covariance covariance matrix a constant array representing the center.

Given a sample covariance matrix, $\textbf{S}$, the mean variance, $\mu$, and the shrinkage constant $\delta$, the shrunk estimated covariance is mathematically defined as:

$$(1 - \delta)\textbf{S} + \delta\mu\textbf{1}$$

We restrict $\delta$ such that $0 \leq \delta \leq 1$ making this a weighted average between the sample covariance and the mean variance matrix. The optimal value of $\delta$ has been tackled several times. For our purposes, we will use the formulation by Ledoit and Wolf.

Ledoit-Wolf Estimator.

In their paper, Ledoit and Wolf proposed an optimal $\delta$:

$$\hat\delta^* \max\{0, \min\{\frac{\hat\kappa}{T},1\}\}$$

$\hat\kappa$ has a mathematical formulation that is beyond the scope of this lecture, but you can find its definition in the paper.

The Ledoit-Wolf Estimator is the robust covariance estimate that uses this optimal $\hat\delta^*$ to shrink the sample covariance matrix. We can draw an implementation of it directly from scikit-learn for easy use.

Here we calculate the in-sample Ledoit-Wolf estimator.

Calculating Errors

We can quantify the difference between the in and out-of-sample estimates by taking the absolute difference element-by-element for the two matrices. We represent this mathematically as:

$$ \frac{1}{n} \sum_{i=1}^{n} |a_i - b_i| $$

First, we calculate the out-of-sample estimate and then we compare.

Comparing to Sample Matrix

We can check how much of an improvement this is by comparing the errors with the erros of the sample covariance.

We can see that the improvement of Ledoit-Wolf over the sample covariance is pretty solid. This translates into decreased volatility and turnover rate in our portfolio, and thus increased returns when using the shrunk covariance matrix.

Adding More Assets

Now we bring this to more assets over a longer time period. Let's see how the errors change over a series of months.

Here we calculate our different covariance estimates.

Here we calculate the error for each time period.

And here we plot the errors over time!

We can see that the mean errors of Ledoit-Wolf are lower than those of the sample covariance matrix. This shows us that the sample sample covariance matrix is less robust. This example only used 50 assets, but as we add more, the Ledoit-Wolf estimator would likely perform even better as the number of assets outpaces the number of observations.


Next Lecture: Introduction to Volume, Slippage, and Liquidity

Back to Introduction


This presentation is for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation for any security; nor does it constitute an offer to provide investment advisory or other services by Quantopian, Inc. ("Quantopian") or QuantRocket LLC ("QuantRocket"). Nothing contained herein constitutes investment advice or offers any opinion with respect to the suitability of any security, and any views expressed herein should not be taken as advice to buy, sell, or hold any security or as an endorsement of any security or company. In preparing the information contained herein, neither Quantopian nor QuantRocket has taken into account the investment needs, objectives, and financial circumstances of any particular investor. Any views expressed and data illustrated herein were prepared based upon information believed to be reliable at the time of publication. Neither Quantopian nor QuantRocket makes any guarantees as to their accuracy or completeness. All information is subject to change and may quickly become unreliable for various reasons, including changes in market conditions or economic circumstances.