Masking

Sometimes we want to ignore certain assets when computing pipeline expressions. There are two common cases where ignoring assets is useful:

  1. We want to compute an expression that's computationally expensive, and we know we only care about results for certain assets. An example of such an expensive expression is a Factor computing the coefficients of a regression (RollingLinearRegressionOfReturns).
  2. We want to compute an expression that performs comparisons between assets, but we only want those comparisons to be performed against a subset of all assets. For example, we might want to use the Factor method top to compute the top 200 assets by earnings yield, ignoring assets that don't meet some liquidity constraint.

To support these two use-cases, all Factors and many Factor methods can accept a mask argument, which must be a Filter indicating which assets to consider when computing.
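
The second use case is easy to get wrong, because masking before a ranking is not the same as filtering after it. The following is a minimal plain-Python sketch of that difference; it does not use the Pipeline API, and the tickers, values, and the `top` helper are all hypothetical stand-ins chosen for illustration.

```python
# Hypothetical data: asset -> (earnings_yield, passes_liquidity_constraint)
assets = {
    "AAA": (0.12, False),
    "BBB": (0.10, True),
    "CCC": (0.09, False),
    "DDD": (0.07, True),
    "EEE": (0.05, True),
}


def top(values, n, mask=None):
    """Return the n highest-valued assets, ranking only the assets in mask.

    Illustrative helper only; it mimics the idea of a mask argument,
    not the actual implementation of Factor.top.
    """
    if mask is None:
        candidates = values
    else:
        candidates = {a: v for a, v in values.items() if a in mask}
    ranked = sorted(candidates, key=candidates.get, reverse=True)
    return set(ranked[:n])


earnings_yield = {a: ey for a, (ey, _) in assets.items()}
liquid = {a for a, (_, ok) in assets.items() if ok}

# Masked: illiquid assets never take part in the ranking.
masked = top(earnings_yield, 2, mask=liquid)

# Post-filtered: rank everything, then discard illiquid assets.
post_filtered = top(earnings_yield, 2) & liquid

print(sorted(masked))         # ['BBB', 'DDD']
print(sorted(post_filtered))  # ['BBB']
```

With the mask, the ranking still yields two assets; filtering afterwards leaves only one, because an illiquid asset occupied one of the top two slots.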

Masking Factors

Let's say we want our pipeline to output securities with a high or low percent difference, but we only want to consider securities with a dollar volume above $10,000,000. To do this, let's rearrange our make_pipeline function so that we first create the high_dollar_volume filter. We can then use this filter as a mask for the moving average factors by passing high_dollar_volume as the mask argument to SimpleMovingAverage.

Applying the mask to SimpleMovingAverage restricts the average close price factors to a computation over the ~2000 securities passing the high_dollar_volume filter, as opposed to ~8000 without a mask. When we combine mean_close_10 and mean_close_30 to form percent_difference, the computation is performed on the same ~2000 securities.
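
To make the effect of the mask concrete, here is a small plain-Python sketch of the same sequence of steps. It is an illustration of the semantics under simplifying assumptions, not Zipline code: the price histories and dollar volumes are made up, and `simple_moving_average` is a hypothetical helper that only borrows the factor's name.

```python
from statistics import fmean

# Hypothetical close-price histories (oldest first) and average dollar volumes.
closes = {
    "AAA": [10.0] * 20 + [12.0] * 10,
    "BBB": [50.0] * 20 + [45.0] * 10,
    "CCC": [5.0] * 30,
}
avg_dollar_volume = {"AAA": 25_000_000, "BBB": 40_000_000, "CCC": 2_000_000}

# Step 1: build the filter first, so it can be used as a mask below.
high_dollar_volume = {a for a, dv in avg_dollar_volume.items() if dv > 10_000_000}


def simple_moving_average(prices, window_length, mask):
    """Average the last window_length prices, only for assets in mask."""
    return {
        asset: fmean(history[-window_length:])
        for asset, history in prices.items()
        if asset in mask  # masked-out assets are skipped, not computed
    }


# Step 2: both averages are computed over the masked assets only.
mean_close_10 = simple_moving_average(closes, 10, high_dollar_volume)
mean_close_30 = simple_moving_average(closes, 30, high_dollar_volume)

# Step 3: the combined expression inherits the same restricted set of assets.
percent_difference = {
    asset: (mean_close_10[asset] - mean_close_30[asset]) / mean_close_30[asset]
    for asset in mean_close_10
}

print(sorted(percent_difference))  # ['AAA', 'BBB']
```

CCC fails the dollar volume threshold, so no average is ever computed for it and it has no percent_difference value.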

Masking Filters

Masks can also be applied to methods that return filters, like top, bottom, and percentile_between.

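Masks are most useful when we want to apply a filter in the earlier steps of a combined computation. For example, suppose we want to get the 50 securities with the highest open price that are also in the top 10% of dollar volume. Suppose that we then want the 90th-100th percentile of these securities by close price. We can do this by chaining masks: a filter for the top 10% of dollar volume is passed as the mask to top, called on the open price with a count of 50, and the resulting filter is in turn passed as the mask to percentile_between, called on the close price with bounds of 90 and 100, to produce the high_close_price filter.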
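
The layering can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the Pipeline API: the universe is 1,000 randomly generated hypothetical assets, `top` and `top_decile` are hypothetical helpers, the intermediate name `top_open_price` is chosen here for readability, and the top decile is approximated by rank rather than by computing percentile bounds.

```python
import random

random.seed(0)

# Hypothetical universe of 1,000 assets with random values for each field.
universe = {
    f"A{i:04d}": {
        "dollar_volume": random.uniform(1e5, 1e8),
        "open": random.uniform(1, 500),
        "close": random.uniform(1, 500),
    }
    for i in range(1000)
}


def top(field, n, mask):
    """Return the n assets in mask with the highest value of field."""
    ranked = sorted(mask, key=lambda asset: universe[asset][field], reverse=True)
    return set(ranked[:n])


def top_decile(field, mask):
    """Rank-based stand-in for a 90th-100th percentile filter over mask."""
    return top(field, len(mask) // 10, mask)


# Layer 1: top 10% of all assets by dollar volume.
high_dollar_volume = top_decile("dollar_volume", set(universe))

# Layer 2: 50 highest open prices, ranked only among layer 1.
top_open_price = top("open", 50, mask=high_dollar_volume)

# Layer 3: top 10% by close price, ranked only among layer 2.
high_close_price = top_decile("close", mask=top_open_price)

print(len(high_dollar_volume), len(top_open_price), len(high_close_price))
# 100 50 5
```

Each layer is a subset of the one before it, so the final filter can only contain assets that survived every earlier step.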

Let's put this into make_pipeline and output an empty pipeline (one with no columns) screened with our high_close_price filter.

Running this pipeline outputs 5 securities on May 5th, 2015.

Note that applying masks in layers as we did above can be thought of as an "asset funnel".


Next Lesson: Classifiers