9. Modules, I: Expand functionality

We have already seen that many basic mathematical operators are directly available in Python (that is, they are built-in). But we would also like to be able to do further things, like:

  • calculate sinusoids, exponentials and logarithms;

  • generate random numbers from many different distributions;

  • perform image processing;

  • plot results;

... and much more. While it is true we could write our own functions to do these things (and sometimes we might), in general we can avoid "reinventing the wheel" and spend our time addressing other questions, by utilizing modules. Modules are chunks of already-written code that contain sets of functions, constants and other objects, generally organized around a given topic, such as plotting or statistics or data science.

To use a module, we import it, a step which makes that module's tools available within a Python program or session. Once the module has been imported, it is just as if its code had been included at that point in the main code.

We can think of importing a module like going to a library and picking up a book on statistics for probability distributions, a book on algebra for matrix operations, etc. and then making use of the book's contents to enhance our own work. In fact, sometimes modules are referred to as libraries. We can check books out as often as we want, when we need them; similarly, we import a module into any program in which it would be useful. Just as always carrying around and searching through a huge number of books would be cumbersome, we try to import only the (typically few) modules we need at a given time.

As with using books, we like to cite where things come from for easy reference, for knowing where we might find more related content, and for having better understanding by both ourselves and others. We will similarly find it useful to explicitly keep track of what we use from different modules. Additionally, sometimes there are overlaps of functionality among modules, and we want to keep clear precisely what is being used. These factors inform the way in which we import modules.

9.1. Importing a module

There are different ways to import a module (or even just pieces of a module). In general, we strongly prefer the following method (with a couple of minor variations) and will typically use it here:

import <module name> as <abbreviation>

The abbreviation is chosen to be unique for each module in a program, and it is included every time we use a feature from the module. In this way, it acts as a reference citation both for ourselves and for others. It also helps avoid naming conflicts if, for example, two modules contain a function of the same name but perhaps different syntax; the distinct abbreviation makes clear which function we actually use.
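As a minimal sketch of such a conflict, the lines below use math together with cmath, a standard module for complex numbers that is brought in here only for illustration. Both define a function called sqrt(), and the prefix is what tells them apart. The last part shows an alternative import style (pulling bare names out of a module), in which the second import quietly replaces the first.

```python
import math
import cmath   # complex-number counterpart of math (used here only as an illustration)

# Both modules define sqrt(); the prefix shows which one is being called.
print(math.sqrt(16))     # 4.0
print(cmath.sqrt(-16))   # 4j

# Importing bare names instead lets the later import silently replace the earlier one.
from math import sqrt
from cmath import sqrt   # the name 'sqrt' now refers to cmath.sqrt
print(sqrt(16))          # (4+0j)
```

With the prefixed form, a reader can always tell at a glance which module a function came from.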

Note

For some very popular modules, there are commonly used abbreviations that appear in most texts, help documentation, and shared code. We will typically use these, because doing so makes code easier to understand, integrate and translate.

For example, one of the most common modules we will make use of is NumPy (or simply "numpy"). It contains many useful numerical computing tools, such as: arrays (analogous to vectors/tensors in math), linear algebra operations, trigonometric and hyperbolic functions, random number generators, and much more. The standard importation and abbreviation syntax is:

import numpy as np

Once this is done, to use any function or attribute within numpy, we type the module's abbreviation (here, np), then ., then the name of the function or constant. For example, the following shows ways to use various sinusoid, exponential and logarithmic functions, as well as the famous constant \pi:

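np.sin(10)                 # sine of 10 (the input is in radians)
x = np.exp(-3.5)           # e raised to the power -3.5
y = np.log(x)              # natural logarithm (base e), so y is approximately -3.5
np.log(10**7)              # natural logarithm of ten million
print("The value of PI is:", np.pi)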

Some large modules even contain submodules for specific topics. We can directly import the submodule in just the same manner as the full module, providing it its own abbreviation. For example, we will often use the "pyplot" submodule of Matplotlib for making plots and graphs:

import matplotlib.pyplot as plt

Again, functions within the (sub)module are just used in the same manner as before, referencing the abbreviation:

plt.plot([1, 2, 3], [6, 2, 6])
plt.show()

Some modules have very short names to start with, so we just use their full name as a reference (no extra abbreviation). Or sometimes the name isn't that short, but the full name is just used because of historical convention. For example, sys contains system functions (such as exiting a program); os contains operating system (OS) routines, such as getting file paths; and subprocess can run and/or check commands in the system shell. These are traditionally just imported as:

import sys
import os
import subprocess

which means that they or their submodules are referenced by their full name:

sys.version                # show current Python version number
os.path.abspath('.')       # show present working directory on computer (via 'path' submodule)
subprocess.run('ls')       # run the terminal command "ls" (Linux/macOS), showing directory contents
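A slightly fuller sketch of the same three modules follows. The particular calls are our own choice of illustration: sys.version_info gives the version as separate numbers, and the subprocess example runs the Python interpreter itself (whose location is stored in sys.executable) rather than ls, so that it should also work on systems without an ls command.

```python
import os
import subprocess
import sys

# Version details as separate numbers, easier to compare than the sys.version string.
major, minor = sys.version_info.major, sys.version_info.minor
print("Running Python", major, minor)

# Absolute path of the present working directory.
here = os.path.abspath('.')
print("Working directory:", here)

# Run a command and capture its text output, instead of letting it print directly.
# sys.executable is the path of the Python interpreter that is currently running.
result = subprocess.run([sys.executable, "--version"], capture_output=True, text=True)
print("Exit code:", result.returncode)       # 0 conventionally means success
print("Output:", result.stdout.strip())
```

Passing the command as a list of strings (program name first, then its arguments) avoids having to think about how the shell would split the text.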

9.2. A basic set of modules

In this course we will make use of the following modules. Again, this is a tiny subset of the ever-growing collection of modules available within the Python universe. (They should be installed on your systems already; if they aren't, and you receive an error message at import, please inquire.)

| Module/link | Standard import | Examples of functionality (partial!) |
|-------------|-----------------|--------------------------------------|
| Math | import math | basic mathematical functions (sinusoids, logs, exp, floor, ceiling, abs val, etc.) |
| NumPy | import numpy as np | basic math, linear algebra, arrays, matrix operations, random numbers, index operations, polynomials, Fourier transform, array-wise calculations |
| SciPy | import scipy.integrate as integrate | interpolation, integration, linear algebra, dealing with sparse matrices, image processing, optimization |
|  | import scipy.special as special |  |
| Matplotlib | import matplotlib as mpl | plotting (2D and 3D), histograms, graphs, charts, image processing, interactive plots |
|  | import matplotlib.pyplot as plt |  |
| Random | import random | random number generation of various distributions and styles |
| Pandas | import pandas as pd | data science, using dataframes |
| sys | import sys | system functions, reading in arguments, controlled exiting |
| os | import os | interfacing with the operating system (OS), getting file paths, seeing directory structure |
| copy | import copy | copying structures (in a way to duplicate the structure, not just the label) |
| time | import time | getting clock time, date-time, calendar information |
| turtle | import turtle | graphical interface for demonstrating how to implement loops, conditions, etc. |

We will often include the import statement in examples in these notes. However, there will also be cases where we do not, and from this point onward, we will expect that you recognize the module abbreviations noted here, and know to import the necessary library. Thus, if you see:

math.sin(math.pi / 4)

... you should understand that your code or Python session must include import math prior to that line. If there is any doubt or question, you can check the import statements at the top of the file, and see where a given module (or submodule) has come from. For example, this check can be useful if you don't recognize a module's abbreviation.

The math and numpy modules have a large amount of overlapping functionality, with the former's basically a subset of the latter's. math is distributed with Python (so no need to install it separately, but it still needs to be imported), and we will find it more convenient to use at the start, due to its simpler structure. Later on, we will migrate to using numpy, because it has a broader and more powerful range of applications, in particular for calculations with arrays and matrices.

Note

The term package technically applies to a set of modules, while a module is contained in a single file somewhere on the computer. However, there really isn't any distinction between importing a module or package, so we might fall into the (technically bad) habit of referring to them interchangeably, even if it is slightly inaccurate.
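If we are ever curious which of the two we have imported, one way to check (a small sketch, using the standard json and random modules purely as examples) is to look for the __path__ attribute, which packages have and single-file modules do not:

```python
import json     # a package: a directory containing several modules
import random   # a module: a single file, random.py

# Packages carry a __path__ attribute listing their directory; plain modules do not.
print(hasattr(json, "__path__"))     # True
print(hasattr(random, "__path__"))   # False

# In both cases, __file__ shows where the code lives on this computer.
print(json.__file__)
print(random.__file__)
```

Note that the import statements themselves look identical in the two cases.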

9.3. Finding out more about modules

How can we tell what functionality is available in a module?
We can:
  • read the online documentation, such as provided in the above table listing modules,

  • use TAB-autocompletion: after importing a module, type its abbreviation and . (e.g., math.) and then hit TAB; a browseable list of the module's contents will appear, and you can start navigating or typing for specific functionality,

  • guess a name and try typing it after math. (including the dot). Many module function names are the same as known mathematical functions (e.g., sin(), cos(), atan(), tanh(), etc.; note that numpy spells the arctangent as arctan()) or variations (e.g., log(), log10(), log2(), etc.), and TAB-autocompletion might help show options when we have typed part of a known name,

  • type dir(math) to see a list of all objects in the module,

  • search online for the name of the function, such as with:
    python module plotting.
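Because dir() returns an ordinary list of strings, we can also filter it with code instead of scanning it by eye. The following is a minimal sketch; the prefix "c" is an arbitrary choice, and the exact contents of the list depend on the Python version in use.

```python
import math

# dir() returns the names defined in the module, as a list of strings.
names = dir(math)

# Names starting with an underscore are internal bookkeeping; skip them.
public_names = [name for name in names if not name.startswith("_")]
print(len(public_names), "public names in math")

# Narrow the list down to names beginning with a chosen prefix.
prefix = "c"
print([name for name in public_names if name.startswith(prefix)])
```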
How can we find what module contains some particular function?
We can:
  • know from experience as we program (that is why practicing matters!),

  • note the abbreviation prefixing the function's name in a code, and check the import statement,

  • search online, such as:
    python module binomial distribution
    or more generally,
    python module <function description>
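When a function object is already at hand in a Python session, its __module__ attribute offers one more hint about where it was defined. This is only a sketch, and the reported name is not always the one we imported: for instance, os.path.join reports an internal module name (posixpath or ntpath, depending on the operating system).

```python
import math
import random

# Each function records the name of the module in which it was defined.
print(math.sqrt.__module__)        # math
print(random.randint.__module__)   # random
```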
How do we know the syntax and usage of a function in a module?
We can:
  • read the function's internal help, as discussed in the next section. All built-in objects in Python and those in distributed modules come with help documentation. When we write our own modules, we can also include our own help documentation with them.

  • ... and we can also look online for examples and help documentation, some of which has been provided above in the main module table.

Q: How many functions and attributes within the math module start with the letter "p"? What are they?

9.4. Looking ahead: making our own modules

Additionally, we can write our own modules to help us organize our work. A module can be as simple as a text file with function definitions, parameter assignments and other objects of interest. We would import such a module in basically the same way as the other modules, shown above (and we would still use an abbreviation for each one).

Making our own modules allows us to write code and keep it separate from other code, in a different file. This style of programming (having distinct pieces for specific jobs, which are then later combined) is one example of modular programming. It is a very efficient model for being able to read, test, verify and add to smaller pieces of code, which are then combined into the larger product.

We can use the import statement to include the code of another file, either in the current directory or in Python's search path. For example, if we have some functions defined in a file called "lib_astro_functions.py" (and it is either in the present working directory or in Python's search path), then we can put the following in our main code:

import lib_astro_functions as laf

and then use functions from it by prefixing them with laf., such as laf.solar_radius, laf.calc_gravity_strength(), laf.convert_hubble_to_SI(), or any other hypothetical constants and functions. Note that the .py of the library text file's name is left out when importing it; this is just part of the import statement's syntax. (In the case of self-written modules, we likely have to make up our own abbreviation, and the above was just chosen from the initials of the file name.)
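To make this concrete, here is a self-contained sketch of the whole process. Everything inside the module is hypothetical: the values, the extra name grav_const, and the inputs of calc_gravity_strength() are assumptions made only for illustration. So that the example can run as one piece, the code writes the module file into a temporary directory and adds that directory to Python's search path; in practice, we would simply save the file next to our main program with a text editor.

```python
import os
import sys
import tempfile

# Hypothetical contents of "lib_astro_functions.py"; names and values are illustrative.
module_source = '''
"""Small collection of astronomy helpers (illustrative only)."""

solar_radius = 6.957e8     # nominal solar radius, in meters
grav_const = 6.674e-11     # gravitational constant, in m^3 kg^-1 s^-2

def calc_gravity_strength(mass, radius):
    """Return surface gravity (m/s^2) for a body of given mass (kg) and radius (m)."""
    return grav_const * mass / radius**2
'''

# Write the file into a temporary directory (normally we would just save it by hand).
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "lib_astro_functions.py"), "w") as f:
    f.write(module_source)

# Add that directory to Python's search path, so that the import can find the file.
sys.path.insert(0, module_dir)

import lib_astro_functions as laf   # note: no ".py" in the import statement

solar_mass = 1.989e30               # approximate solar mass, in kg
surface_gravity = laf.calc_gravity_strength(solar_mass, laf.solar_radius)
print("Solar radius (m):", laf.solar_radius)
print("Surface gravity of the Sun (m/s^2):", surface_gravity)
```

The main program never needs to see how calc_gravity_strength() is written; it only needs the module's abbreviation and the function's name.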

Rather than have one really big book of all of mathematics and mathematical sciences, it is useful to keep different areas compartmentalized for ease of reference and use. The same is true of programming: we don't want a single, huge file that is hard to search through or to check individual pieces of. Modularization makes programming more manageable (and if you don't believe us now, you will later!).

9.5. Practice

NB: When using any function, one has to know things like the number of inputs needed, the expected units of inputs (e.g., radians or degrees, for trigonometric functions), and other usage notes. Knowing these important points requires reading the help description of a function, which we cover in the next section. In programming, we never want to have to guess that something is correct, but for these few questions, assume that each function takes a single input and that all units are "correct".

  1. How many functions in the math module start with "log"?

  2. Write the Python statement to evaluate and print y = \tan(\pi^2). Note that you need to import the correct module.

  3. The mathematical operation of "ceiling" a value v is written as \left\lceil{v}\right\rceil, and that of "flooring" is \left\lfloor{v}\right\rfloor. Find the Python functions for each of these operations, and see how each behaves on positive and negative float values.

  4. Import the math module, ask the user for a number a, and then print the evaluation of:

    \dfrac{\cos(a)\,\log_{10}(1+a^2)}{1+e^{-a}}

  5. Import the math module, ask the user for a float x, and then print the evaluation of:

    y = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}

    for \sigma = 1 and \mu = 0.