Creating a Pipeline

In this lesson, we will take a look at creating an empty pipeline. First, let's import the Pipeline class:

In a new cell, let's define a function to create our pipeline. Wrapping our pipeline creation in a function sets up a structure for more complex pipelines that we will see later on. For now, this function simply returns an empty pipeline:

In a new cell, let's instantiate our pipeline by calling make_pipeline() and assigning the result to my_pipe.

Running a Pipeline

Now that we have a reference to an empty Pipeline, my_pipe, let's run it to see what it looks like. Before running our pipeline, we first need to import run_pipeline, a research-only function that allows us to run a pipeline over a specified time period.

Since we will be using the same data bundle throughout this tutorial, we can set it as the default bundle so that we don't have to pass the bundle name in each call to run_pipeline:

Let's run our pipeline for one day (2015-05-05) with run_pipeline and display it.

A call to run_pipeline returns a pandas DataFrame indexed by date and security. Let's see what the empty pipeline looks like:

The output of an empty pipeline is a DataFrame with no columns. In this example, the index contains one row for each of the 8000+ securities on May 5th, 2015 (truncated in the display), but there are no columns because we haven't added any to the pipeline yet.
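To make the idea of a DataFrame with an index but no columns concrete, here is a small standalone pandas sketch. It does not call Zipline; the three tickers and the index level names are hypothetical placeholders standing in for the real output.

```python
import pandas as pd

# Hypothetical two-level index: one date, three made-up securities.
index = pd.MultiIndex.from_product(
    [[pd.Timestamp("2015-05-05")], ["AAA", "BBB", "CCC"]],
    names=["date", "asset"],
)

# A DataFrame built from an index alone has rows but no columns.
toy_result = pd.DataFrame(index=index)

print(toy_result.shape)          # (3, 0): three rows, zero columns
print(len(toy_result.columns))   # 0
print(len(toy_result.index))     # 3
```

The same checks (shape, columns, index) can be applied to the DataFrame returned by run_pipeline to confirm that it has rows for each security but no columns.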

In the following lessons, we'll take a look at how to add columns to our pipeline output, and how to filter down to a subset of securities.


Next Lesson: Factors