
© Copyright Quantopian Inc.
© Modifications Copyright QuantRocket LLC
Licensed under the Creative Commons Attribution 4.0.


Introduction to Python

by Maxwell Margenot

Python is a good, jack-of-all-trades language to know. Here we will provide you with the basics so that you can feel confident going through our other lectures and understanding what is happening.

Code Comments

A comment is a note made by a programmer in the source code of a program. Its purpose is to clarify the source code and make it easier for people to follow along with what is happening. Anything in a comment is generally ignored when the code is actually run, making comments useful for including explanations and reasoning as well as removing specific lines of code that you may be unsure about. Comments in Python are created by using the pound symbol (# Insert Text Here). Including a # in a line of code will comment out anything that follows it.

You may hear text enclosed in triple quotes (""" Insert Text Here """) referred to as multi-line comments, but this is not entirely accurate. Triple quotes create an ordinary string (a data type we will cover) that may span several lines. When such a string is the first statement in a function, it is called a docstring and is used to explain the purpose of that function.
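As a minimal sketch of the difference between comments and docstrings (the variable `x` and the function `describe` are hypothetical names chosen for illustration):

```python
# This whole line is a comment and is ignored when the code runs.
x = 5  # Everything after the pound symbol is ignored as well.

def describe():
    """Return a short description (this string is the docstring)."""
    return "docstrings are ordinary strings"

# Unlike a comment, the docstring is stored on the function and can be inspected.
print(describe.__doc__)
```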

Make sure you read the comments within each code cell (if they are there). They will provide more real-time explanations of what is going on as you look at each line of code.

Variables

Variables provide names for values in programming. If you want to save a value for later or repeated use, you give the value a name, storing the contents in a variable. Variables in programming work in a fundamentally similar way to variables in algebra, but in Python they can take on various different data types.

The basic variable types that we will cover in this section are integers, floating point numbers, booleans, and strings.

An integer in programming is the same as in mathematics: a whole number with no digits after the decimal point. We use the built-in print function here to display the values of our variables as well as their types!

Variables, regardless of type, are assigned by using a single equals sign (=). Variable names are case-sensitive, so changing the capitalization of a name refers to a different variable entirely.
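A short sketch of assignment, printing a value with its type, and case sensitivity (the variable names are hypothetical):

```python
my_integer = 50
print(my_integer, type(my_integer))

# Names are case-sensitive: these are two separate variables.
one = 1
One = "one"
print(one, type(one))
print(One, type(One))
```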

A floating point number, or float, is a fancy name for a real number (again as in mathematics). To define a float, we need to either include a decimal point or specify that the value is a float.

A float keeps the fractional part of the number that you store in it, while an integer has none. This makes floats more suitable for mathematical calculations where you want more than just integers.

Note that just as we used the float() function to force a number to be considered a float, we can use the int() function to force a number to be considered an int.

The int() function will also truncate any digits that a number may have after the decimal point!
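The conversions described above might look like this (the variable names are hypothetical). Note that int() truncates toward zero instead of rounding:

```python
my_float = 1.0
forced_float = float(1)      # an integer converted to a float
truncated = int(3.14159)     # digits after the decimal point are dropped
negative = int(-2.7)         # truncation moves toward zero

print(my_float, type(my_float))
print(forced_float, type(forced_float))
print(truncated, type(truncated))
print(negative)
```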

Strings allow you to include text as a variable to operate on. They are defined using either single quotes ('') or double quotes ("").

Both are allowed so that we can include apostrophes or quotation marks in a string if we so choose.

Booleans, or bools, are binary variable types. A bool can only take on one of two values, True or False. There is much more to this idea of truth values when it comes to programming, which we cover later in the Logical Operators section of this notebook.

There are many more data types that you can assign as variables in Python, but these are the basic ones! We will cover a few more later as we move through this tutorial.

Basic Math

Python has a number of built-in math functions. These can be extended even further by importing the math package or by including any number of other calculation-based packages.

All of the basic arithmetic operations are supported: +, -, /, and *. You can create exponents by using ** and modular arithmetic is introduced with the mod operator, %.

If you are not familiar with the mod operator, it operates like a remainder function. If we type $15 \ \% \ 4$, it will return the remainder after dividing $15$ by $4$.
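A quick sketch of the operators listed above. The names `quotient` and `remainder` are hypothetical:

```python
print('Addition:', 2 + 2)
print('Subtraction:', 7 - 4)
print('Multiplication:', 2 * 3)
print('Exponentiation:', 3 ** 2)

quotient = 10 / 2       # in Python 3, / always returns a float
remainder = 15 % 4      # remainder after dividing 15 by 4
print('Division:', quotient)
print('Modulo:', remainder)
```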

Mathematical functions also work on variables!

Python has a few built-in math functions. The most notable of these are abs(), round(), max(), min(), and sum().

These functions all act as you would expect, given their names. Calling abs() on a number will return its absolute value. The round() function will round a number to a specified number of decimal places (the default is $0$). Calling max() or min() on a collection of numbers will return, respectively, the maximum or minimum value in the collection. Calling sum() on a collection of numbers will add them all up. If you're not familiar with how collections of values in Python work, don't worry! We will cover collections in-depth in the next section.
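For illustration, here are the functions named above applied to some made-up sample values:

```python
values = [4, -7, 2, 9]         # hypothetical sample data

absolute = abs(-7)             # distance from zero
rounded = round(3.14159, 2)    # keep two decimal places
largest = max(values)
smallest = min(values)
total = sum(values)

print(absolute, rounded, largest, smallest, total)
```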

Additional math functionality can be added in with the math package.

The math library adds a long list of new mathematical functions to Python. Feel free to check out the documentation for the full list and details. It includes some mathematical constants

As well as some commonly used math functions
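A small sample of what the math package offers, shown as a sketch (the variable names are hypothetical):

```python
import math

# Constants
print('Pi:', math.pi)
print("Euler's number:", math.e)

# Commonly used functions
root = math.sqrt(16)
cosine = math.cos(math.pi)
natural_log = math.log(math.e)
print(root, cosine, natural_log)
```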

Collections

Lists

A list in Python is an ordered collection of objects that can contain any data type. We define a list using brackets ([]).

We can access and index the list by using brackets as well. In order to select an individual element, simply type the list name followed by the index of the item you are looking for in brackets.

Indexing in Python starts from $0$. If you have a list of length $n$, the first element of the list is at index $0$, the second element is at index $1$, and so on and so forth. The final element of the list will be at index $n-1$. Be careful! Trying to access a non-existent index will cause an error.

We can see the number of elements in a list by calling the len() function.

We can update and change a list by accessing an index and assigning a new value.

This is fundamentally different from how strings are handled. A list is mutable, meaning that you can change a list's elements in place without creating a new list. Some data types, like strings, are immutable, meaning you cannot change them at all. Once a string or other immutable data type has been created, it cannot be directly modified without creating an entirely new object.
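The following sketch, using hypothetical names, shows indexing, len(), in-place assignment on a list, and the error raised when the same assignment is attempted on a string:

```python
my_list = [1, 2, 3]
print(my_list[0], my_list[2])   # first and last elements
print(len(my_list))             # number of elements

my_list[0] = 42                 # lists are mutable
print(my_list)

my_string = "Strings never change"
try:
    my_string[0] = "s"          # strings are immutable, so this raises
except TypeError as error:
    print("TypeError:", error)
```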

As we stated before, a list can contain any data type. Thus, lists can also contain strings.

Lists can also contain multiple different data types at once!

If you want to put two lists together, they can be combined with a + symbol.

In addition to accessing individual elements of a list, we can access groups of elements through slicing.

Slicing

We use the colon (:) to slice lists.

Using : we can select a group of elements in the list starting from the first element indicated and going up to (but not including) the last element indicated.

We can also select everything after a certain point

And everything before a certain point

Using negative numbers will count from the end of the indices instead of from the beginning. For example, an index of -1 indicates the last element of the list.

You can also add a third component to slicing. Instead of simply indicating the first and final parts of your slice, you can specify the step size that you want to take. So instead of taking every single element, you can take every other element.

Here we have selected the entire list (because 0:7 will yield elements 0 through 6) and we have selected a step size of 2. So this will spit out element 0, element 2, element 4, and so on through the selected portion of the list. We can skip indicating the beginning and end of our slice, only indicating the step, if we like.

Lists implicitly select the beginning and end of the list when not otherwise specified.

With a negative step size we can even reverse the list!
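The slicing forms described in this section can be sketched as follows. The seven-element list is sample data assumed for illustration:

```python
my_list = ['friends', 'romans', 'countrymen', 'lend', 'me', 'your', 'ears']

middle = my_list[2:4]          # index 2 up to, but not including, index 4
from_index = my_list[1:]       # everything from index 1 onward
up_to = my_list[:4]            # everything before index 4
last = my_list[-1]             # negative indices count from the end
every_other = my_list[0:7:2]   # step size of 2
same_step = my_list[::2]       # start and end are implied
reversed_list = my_list[::-1]  # negative step reverses the list

print(every_other)
print(reversed_list)
```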

Python does not have native matrices, but with lists we can produce a working facsimile. Other packages, such as numpy, add matrices as a separate data type, but in base Python the best way to create a matrix is to use a list of lists.

We can also use built-in functions to generate lists. In particular we will look at range() (because we will be using it later!). The range() function can take several different inputs and returns a range object, which produces its values lazily as they are needed. We can see the items by coercing it to a list.

Similar to our list-slicing methods from before, we can define both a start and an end for our range. This will return a list that includes the start and excludes the end, just like a slice.

We can also specify a step size. This again has the same behavior as a slice.
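The three ways of calling range() described above, sketched with hypothetical variable names:

```python
first_ten = list(range(10))         # one input: values start at 0
with_start = list(range(2, 7))      # start included, end excluded
with_step = list(range(2, 11, 2))   # every second value

print(first_ten)
print(with_start)
print(with_step)
```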

Tuples

A tuple is a data type similar to a list in that it can hold different kinds of data types. The key difference here is that a tuple is immutable. We define a tuple by separating the elements we want to include by commas. It is conventional to surround a tuple with parentheses.

As mentioned before, tuples are immutable. You can't change any part of them without defining a new tuple.

You can slice tuples the same way that you slice lists!

And concatenate them the way that you would with strings!

We can 'pack' values together, creating a tuple (as above), or we can 'unpack' values from a tuple, taking them out.

Unpacking assigns each value of the tuple in order to each variable on the left hand side of the equals sign. Some functions, including user-defined functions, may return tuples, so we can use this to directly unpack them and access the values that we want.
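Here is a sketch of the tuple behavior covered in this section. The sample values are made up, and the built-in divmod() stands in as an example of a function that returns a tuple:

```python
my_tuple = 'I', 'have', 30, 'cats'   # packing; the parentheses are optional
print(my_tuple)

try:
    my_tuple[3] = 'dogs'             # tuples are immutable, so this raises
except TypeError as error:
    print('TypeError:', error)

first_two = my_tuple[:2]                    # slicing works as with lists
longer = my_tuple + ('and', 7, 'dogs')      # concatenation builds a new tuple

quotient, remainder = divmod(17, 5)  # unpack the returned tuple directly
print(quotient, remainder)
```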

Sets

A set is a collection of unordered, unique elements. It works almost exactly as you would expect a normal set of things in mathematics to work and is defined using braces ({}).

Note how any extra instances of the same item are removed in the final set. We can also create a set from a list, using the set() function.

Calling len() on a set will tell you how many elements are in it.

Because a set is unordered, we can't access individual elements using an index. We can, however, easily check for membership (to see if something is contained in a set) and take the unions and intersections of sets by using the built-in set functions.

Here we checked to see whether the string 'cats' was contained within our animal_set and it returned True, telling us that it is indeed in our set.

We can connect sets by using typical mathematical set operators, namely |, for union, and &, for intersection. Using | or & will return exactly what you would expect if you are familiar with sets in mathematics.

Pairing two sets together with | combines the sets, removing any repetitions to make every set element unique.

Pairing two sets together with & will calculate the intersection of both sets, returning a set that only contains what they have in common.
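The set operations above can be sketched as follows. The name `animal_set` comes from the text; its contents and the other names are assumptions made for illustration:

```python
animal_set = {'cats', 'dogs', 'dogs', 'lizards'}   # the duplicate 'dogs' is dropped
from_list = set(['cats', 'bats', 'bats'])          # build a set from a list

print(len(animal_set))
print('cats' in animal_set)      # membership check

union = animal_set | from_list           # everything in either set
intersection = animal_set & from_list    # only what both sets share
print(union)
print(intersection)
```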

If you are interested in learning more about the built-in functions for sets, feel free to check out the documentation.

Dictionaries

Another essential data structure in Python is the dictionary. Dictionaries are defined with a combination of curly braces ({}) and colons (:). The braces define the beginning and end of a dictionary and the colons indicate key-value pairs. A dictionary is essentially a set of key-value pairs. The key of any entry must be an immutable data type. This makes both strings and tuples candidates. Keys can be both added and deleted.

In the following example, we have a dictionary composed of key-value pairs where the key is a genre of fiction (string) and the value is a list of books (list) within that genre. Since a collection is still considered a single entity, we can use one to collect multiple variables or values into one key-value pair.

After defining a dictionary, we can access any individual value by indicating its key in brackets.

We can also change the value associated with a given key

Adding a new key-value pair is as simple as defining it.
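As a sketch of these dictionary operations, with genres and titles that are placeholder sample data:

```python
my_dict = {
    'High Fantasy': ['Wheel of Time', 'Lord of the Rings'],
    'Sci-fi': ['Book of the New Sun', 'Neuromancer', 'Snow Crash'],
}
print(my_dict['Sci-fi'])                  # access a value by its key

my_dict['Sci-fi'] = "I can't read"        # change the value for an existing key
my_dict['Historical Fiction'] = ['Pillars of the Earth']   # add a new pair
del my_dict['High Fantasy']               # delete a key and its value
print(my_dict)
```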

String Shenanigans

We already know that strings are generally used for text. We can use built-in operations to combine, split, and format strings easily, depending on our needs.

The + symbol indicates concatenation when used with strings. It will combine two strings into a longer string.

Strings are also indexed much in the same way that lists are.

Built-in objects and classes often have special functions associated with them that are called methods. We access these methods by using a period ('.'). We will cover objects and their associated methods more in another lecture!

Using string methods we can count instances of a character or group of characters.

We can also find the first instance of a character or group of characters in a string.

As well as replace characters in a string.

There are also some methods that are unique to strings. The method upper() will convert all characters in a string to uppercase, while lower() will convert all characters in a string to lowercase!
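A brief sketch of the string operations in this section, using made-up sample strings:

```python
my_string = 'banana bread'

combined = 'Hello, ' + 'world'            # concatenation
print(my_string[0], my_string[1:5])       # indexing and slicing

count_a = my_string.count('a')            # number of occurrences
first_n = my_string.find('n')             # index of the first match, or -1 if absent
replaced = my_string.replace('bread', 'split')
shouted = my_string.upper()
whispered = 'HELLO'.lower()

print(count_a, first_n, replaced, shouted, whispered)
```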

String Formatting

Using the format() method we can add in variable values and generally format our strings.

We use braces ({}) to indicate parts of the string that will be filled in later and we use the arguments of the format() function to provide the values to substitute. The numbers within the braces indicate the index of the value in the format() arguments.

See the format() documentation for additional examples.

If you need some quick and dirty formatting, you can instead use the % symbol, called the string formatting operator.

The % symbol basically cues Python to create a placeholder. Whatever character follows the % (in the string) indicates what sort of type the value put into the placeholder will have. This character is called a conversion type. Once the string has been closed, we need another % that will be followed by the values to insert. In the case of one value, you can just put it there. If you are inserting more than one value, they must be enclosed in a tuple.

In these examples, the %s indicates that Python should convert the values into strings. There are multiple conversion types that you can use to get more specific with the formatting. See the string formatting documentation for additional examples and more complete details on use.
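Both formatting styles can be sketched side by side. The values `name` and `count` are hypothetical:

```python
name = 'Ada'
count = 3

# format(): the numbers in braces are indices into the arguments
with_format = '{0} has {1} cats, and {0} likes them'.format(name, count)

# % operator: a single value can follow directly, several go in a tuple
with_percent_one = 'Hello, %s' % name
with_percent_many = '%s has %s cats' % (name, count)

print(with_format)
print(with_percent_one)
print(with_percent_many)
```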

Logical Operators

Basic Logic

Logical operators deal with boolean values, as we briefly covered before. If you recall, a bool takes on one of two values, True or False (or $1$ or $0$). The basic logical statements that we can make are defined using the built-in comparators. These are == (equal), != (not equal), < (less than), > (greater than), <= (less than or equal to), and >= (greater than or equal to).

These comparators also work in conjunction with variables.

We can string these comparators together to make more complex logical statements using the logical operators or, and, and not.

The or operator performs a logical or calculation. This is an inclusive or, so if either component paired together by or is True, the whole statement will be True. The and statement only outputs True if all components that are anded together are True. Otherwise it will output False. The not statement simply inverts the truth value of whichever statement follows it. So a True statement will be evaluated as False when a not is placed in front of it. Similarly, a False statement will become True when a not is in front of it.

Say that we have two logical statements, or assertions, $P$ and $Q$. The truth table for the basic logical operators is as follows:

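| P | Q | not P | P and Q | P or Q |
|-------|-------|-------|---------|--------|
| True | True | False | True | True |
| False | True | True | False | True |
| True | False | False | False | True |
| False | False | True | False | False |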

We can string multiple logical statements together using the logical operators.

Logical statements can be as simple or complex as we like, depending on what we need to express. Evaluating the above logical statement step by step we see that we are evaluating (True and True) or (False and not False). This becomes True or (False and True), subsequently becoming True or False, ultimately being evaluated as True.
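The statement evaluated in that walkthrough is not reproduced in this text. One expression consistent with it, offered as an assumption and not as the original cell, is:

```python
# (True and True) or (False and not False)
result = ((2 < 3) and (3 > 0)) or ((5 > 6) and not (4 < 2))

# The same evaluation, one step at a time
left = (2 < 3) and (3 > 0)          # True and True
right = (5 > 6) and not (4 < 2)     # False and True
print(left, right, result)
```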

Truthiness

Data types in Python have a fun characteristic called truthiness. What this means is that most built-in types will evaluate as either True or False when a boolean value is needed (such as with an if-statement). As a general rule, containers like strings, tuples, dictionaries, lists, and sets, will return True if they contain anything at all and False if they contain nothing.

And so on, for the other collections and containers. None also evaluates as False. In a boolean context, the number 0 evaluates as False, while 1 (and any other nonzero number) evaluates as True.
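Truthiness can be checked directly with the built-in bool() function. A small sketch, with hypothetical names:

```python
print(bool(''), bool('text'))        # empty string vs. non-empty string
print(bool([]), bool([1, 2, 3]))     # empty list vs. non-empty list
print(bool(None), bool(0), bool(1))

my_list = []
if my_list:                          # an empty list counts as False
    message = 'the list has items'
else:
    message = 'the list is empty'
print(message)
```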

If-statements

We can create segments of code that only execute if a set of conditions is met. We use if-statements in conjunction with logical statements in order to create branches in our code.

An if block gets entered when the condition is considered to be True. If the condition is evaluated as False, the if block will simply be skipped unless there is an else block to accompany it. Conditions are made using either logical operators or by using the truthiness of values in Python. An if-statement is defined with a colon and a block of indented text.

Because in this example i = 4 and the if-statement is only looking for whether i is equal to 5, the print statement will never be executed. We can add in an else statement to create a contingency block of code in case the condition in the if-statement is not evaluated as True.

We can implement other branches off of the same if-statement by using elif, an abbreviation of "else if". We can include as many elifs as we like until we have exhausted all the logical branches of a condition.
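Putting if, elif, and else together might look like the sketch below. The value i = 4 follows the text, while the messages and the name `outcome` are made up:

```python
i = 4

if i == 5:
    outcome = 'i is five'
elif i > 5:
    outcome = 'i is greater than five'
else:
    outcome = 'i is less than five'   # runs because no earlier condition was True

print(outcome)
```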

You can also nest if-statements within if-statements to check for further conditions.

Remember that we can group multiple conditions together by using the logical operators!

You can use the logical comparators to compare strings!

As with other data types, == will check for whether the two things on either side of it have the same value. In this case, we compare whether the values of the strings are the same. Using > or < or any of the other comparators is not quite so intuitive, however, so we will stay away from using comparators with strings in this lecture. Comparators will examine the lexicographical order of the strings, which might be a bit more in-depth than you might like.

Some built-in functions return a boolean value, so they can be used as conditions in an if-statement. User-defined functions can also be constructed so that they return a boolean value. This will be covered later with function definition!

The in keyword is generally used to check membership of a value within another value. We can check membership in the context of an if-statement and use it to output a truth value.

Here we use in to check whether the variable my_string contains any particular letters. We will later use in to iterate through lists!

Loop Structures

Loop structures are one of the most important parts of programming. The for loop and the while loop provide a way to run a block of code repeatedly. A while loop will iterate as long as a certain condition holds. If that condition is no longer satisfied when it is checked at the start of an iteration, the loop terminates. A for loop will iterate over a sequence of values and terminate when the sequence has ended. You can also include conditions within the for loop to decide whether it should terminate early, or you can simply let it run its course.

With while loops we need to make sure that something actually changes from iteration to iteration so that the loop actually terminates. In this case, we use the shorthand i -= 1 (short for i = i - 1) so that the value of i gets smaller with each iteration. Eventually i will be reduced to 0, rendering the condition False and exiting the loop.
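A minimal while loop of the kind described, assuming a starting value of 5 and a hypothetical list `visited` to record each iteration:

```python
i = 5
visited = []

while i > 0:            # keep looping as long as the condition is True
    visited.append(i)
    i -= 1              # shorthand for i = i - 1

print(visited)
print(i)                # the condition i > 0 is now False
```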

A for loop iterates a set number of times, determined when you state the entry into the loop. In this case we are iterating over the values produced by range(). The for loop selects each value, in order, and temporarily assigns it to i so that operations can be performed with the value.

Note that in this for loop we use the in keyword. Use of the in keyword is not limited to checking for membership as in the if-statement example. You can iterate over any collection with a for loop by using the in keyword.

In this next example, we will iterate over a set because we want to check for containment and add to a new set.

There are two statements that are very helpful in dealing with both for and while loops. These are break and continue. If break is encountered at any point while a loop is executing, the loop will immediately end.

The continue statement will tell the loop to immediately end this iteration and continue onto the next iteration of the loop.

This loop skips printing the number $3$ because of the continue statement that executes when we enter the if-statement. The code never sees the command to print the number $3$ because it has already moved to the next iteration. The break and continue statements are further tools to help you control the flow of your loops and, as a result, your code.
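The effect of continue and break can be sketched like this. The ranges and the names `printed` and `stopped_at` are assumptions for illustration:

```python
printed = []
for i in range(1, 6):
    if i == 3:
        continue        # skip the rest of this iteration
    printed.append(i)

stopped_at = None
for i in range(1, 6):
    if i == 4:
        stopped_at = i
        break           # leave the loop entirely

print(printed)
print(stopped_at)
```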

The variable that we use to iterate over a loop will retain its value when the loop exits. Similarly, any variables defined within the context of the loop will continue to exist outside of it.

We can also iterate over a dictionary!

If we just iterate over a dictionary without doing anything else, we will only get the keys. We can either use the keys to get the values, like so:

Or we can use the items() method to get both key and value at the same time.

The items() method yields each key-value pair as a tuple, and the for loop unpacks that tuple into key, value on each separate execution of the loop!
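Both ways of iterating over a dictionary, sketched with made-up sample data:

```python
my_dict = {'firstname': 'Inigo', 'lastname': 'Montoya'}

# Iterating over the dictionary itself yields only the keys
for key in my_dict:
    print(key, my_dict[key])

# items() yields (key, value) tuples, unpacked here into two names
pairs = []
for key, value in my_dict.items():
    pairs.append((key, value))

print(pairs)
```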

Functions

A function is a reusable block of code that you can call repeatedly to make calculations, output data, or really do anything that you want. This is one of the key aspects of using a programming language. To add to the built-in functions in Python, you can define your own!

Functions are defined with def, a function name, a list of parameters, and a colon. Everything indented below the colon will be included in the definition of the function.

We can have our functions do anything that you can do with a normal block of code. For example, our hello_world() function prints a string every time it is called. If we want to keep a value that a function calculates, we can define the function so that it will return the value we want. This is a very important feature of functions, as any variable defined purely within a function will not exist outside of it.

The scope of a variable is the part of a block of code where that variable is tied to a particular value. Functions in Python have an enclosed scope, making it so that variables defined within them can only be accessed directly within them. If we pass those values to a return statement we can get them out of the function. This makes it so that the function call returns values so that you can store them in variables that have a greater scope.

In this case specifically, including a return statement allows us to keep the string value that we define in the function.
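A sketch of defining functions and using return to get a value out of a function's scope. The name hello_world() comes from the text, while `see_the_scope` and the strings are hypothetical:

```python
def hello_world():
    """Print a greeting; nothing is returned."""
    print('Hello, world!')

def see_the_scope():
    in_function_string = "I'm stuck in here!"   # exists only inside the function
    return in_function_string                    # hand the value back to the caller

hello_world()
kept_string = see_the_scope()   # store the returned value in a wider scope
print(kept_string)
```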

Just as we can get values out of a function, we can also put values into a function. We do this by defining our function with parameters.

In this example we only had one parameter for our function, x. We can easily add more parameters, separating everything with a comma.

If we want to, we can define a function so that it takes an arbitrary number of parameters. We tell Python that we want this by using an asterisk (*).

The time to use *args as a parameter for your function is when you do not know how many values may be passed to it, as in the case of our sum function. The asterisk in this case is the syntax that tells Python that you are going to pass an arbitrary number of parameters into your function. These parameters are stored in the form of a tuple.

We can put as many elements into the args tuple as we want to when we call the function. However, because args is a tuple, we cannot modify it after it has been created.

The name args is purely a convention. You could just as easily name your parameter *vars or *things. You can treat the args tuple like you would any other tuple, easily accessing its values and iterating over it, as in the above sum_values(*args) function.
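The sum_values(*args) function referred to above is not reproduced in this text. A plausible sketch of such a function, written under that assumption, is:

```python
def sum_values(*args):
    """Add up however many numbers are passed in."""
    total = 0
    for value in args:      # args is a tuple of the positional arguments
        total += value
    return total

print(sum_values(1, 2, 3))
print(sum_values(10, 20, 30, 40, 50))
print(sum_values())         # no arguments gives an empty tuple
```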

Our functions can return any data type. This makes it easy for us to create functions that check for conditions that we might want to monitor.

Here we define a function that returns a boolean value. We can easily use this in conjunction with if-statements and other situations that require a boolean.

The function above returns an ordered pair of the input parameters, stored as a tuple.

And that one calculates the slope between two points!
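The functions discussed in these last paragraphs are not shown in this text. The sketch below illustrates the same three ideas (returning a boolean, returning a tuple, and computing a slope) with hypothetical names and signatures:

```python
def is_within_range(x, low, high):
    """Return True when low <= x <= high."""
    return low <= x <= high

def make_point(x, y):
    """Return the inputs as an ordered pair stored in a tuple."""
    return (x, y)

def calculate_slope(point_a, point_b):
    """Return the slope of the line through two (x, y) points."""
    x_a, y_a = point_a
    x_b, y_b = point_b
    return (y_b - y_a) / (x_b - x_a)   # raises ZeroDivisionError for a vertical line

a = make_point(0, 0)
b = make_point(2, 4)
slope = calculate_slope(a, b)

if is_within_range(slope, 0, 5):
    print('The slope', slope, 'is between 0 and 5')
```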

With the proper syntax, you can define functions to do whatever calculations you want. This makes them an indispensable part of programming in any language.

Next Steps

This was a lot of material and there is still even more to cover! Make sure you play around with the cells in each notebook to accustom yourself to the syntax featured here and to figure out any limitations. If you want to delve even deeper into the material, the documentation for Python is all available online.




This presentation is for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation for any security; nor does it constitute an offer to provide investment advisory or other services by Quantopian, Inc. ("Quantopian") or QuantRocket LLC ("QuantRocket"). Nothing contained herein constitutes investment advice or offers any opinion with respect to the suitability of any security, and any views expressed herein should not be taken as advice to buy, sell, or hold any security or as an endorsement of any security or company. In preparing the information contained herein, neither Quantopian nor QuantRocket has taken into account the investment needs, objectives, and financial circumstances of any particular investor. Any views expressed and data illustrated herein were prepared based upon information believed to be reliable at the time of publication. Neither Quantopian nor QuantRocket makes any guarantees as to their accuracy or completeness. All information is subject to change and may quickly become unreliable for various reasons, including changes in market conditions or economic circumstances.