5. Assignment, variables and in-place operators

The calculator-like functionality described in the previous section is certainly useful. But it is not fully programming: we cannot combine results of different calculations and build on previous results, since values aren't saved. Here, we take another large step toward real programming by introducing variables and the ability to store quantities using the assignment operator.

In "calculator mode", each line's operations were separate, and no line's evaluation had any impact on any other. Now that we will start to save values and use them later, the order in which lines are evaluated matters. This is called the program flow. By default, the Python interpreter starts at the top of a script or program and reads downwards line by line, evaluating each line in turn until the end. In later sections we will see how we can adapt the program flow in useful ways.

5.1. The assignment operator

Python's assignment operator is =, and it allows us to store values for later use. While this looks like the "equals sign" in mathematics and shares some similarities with it, we will see that it is importantly distinct and deserves its own name.

In Python, = is a binary operator, always taking two operands. The righthand side (RHS) of the assignment operator provides the value to be stored, and the lefthand side (LHS) contains the name of the variable that will store it (so-called because its contents can change throughout a program, as we will see below).

After executing the following:

x = 4 + 8

the result of the RHS expression, 12, is stored, and we can access it any time later by referring to the variable x. We can view its value directly by entering x at the prompt and hitting enter. In the following, we store values in two variables x and y; in the third line we use them to create a new variable z, and in the fourth line we display it:

x = 4 + 8
y = (9**2 - 1) / 8.
z = x*x - 1.5*y
z

OK, so this looks like how we might use "=" in an ordinary set of mathematical expressions. Indeed, in the previous section, we saw that programming directly borrowed the + and - symbols from mathematics, so why would we expect anything different for the recognizable "="?

However, and perhaps surprisingly, the assignment operator = does not perform the same job as a mathematical equals sign "=". (There is something that does the latter in Python, which we will discuss in a later section; that operator is ==.) We have to be clear in distinguishing between mathematical equality and computational assignment.

In mathematics:

The equals sign "=" is a passive statement about what appears on its LHS and RHS; it is either true or false. When the equality is true, then we are allowed to use algebra to rearrange the sides of the equation very freely. For example, if we have a mathematical statement that x = 4 + 8, then the following expressions are all valid and equivalent:

\begin{align*}
& x - 4 = 8 \\
& x - 4 - 8 = 0 \\
& 4 = x - 8 \\
& 2x = 8 + 16 \\
& {\rm etc.}
\end{align*}

In programming:

The assignment operator = is an action, causing the interpreter to store a value on the computer so that we can use it later. The operator's LHS and RHS are separate and distinct, and cannot be mixed or swapped around with each other. If we have a computational statement that x = 4 + 8, then we cannot move parts from the RHS to the LHS, such as:

x - 4 = 8

Try running that in Python---it will produce an error, because we have broken the basic syntax of = (the LHS is only for naming what is being stored, not for performing operations like subtraction).
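To see this error without halting a script, one option is to pass the line as text to Python's built-in compile function and catch the resulting SyntaxError. This is only a sketch: the names bad_line and is_valid are made up for illustration, the try/except syntax is covered properly later, and the exact wording of the error message depends on the Python version.

```python
# The invalid statement is kept as text, so that this script itself remains valid Python.
bad_line = "x - 4 = 8"

try:
    compile(bad_line, "<example>", "exec")   # parse the line without running it
    is_valid = True
except SyntaxError as err:
    is_valid = False
    print("SyntaxError:", err.msg)

print(is_valid)
```

The point is that the line is rejected before anything is evaluated: the problem is the form of the statement, not the values involved.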

The following is how a Python interpreter sees an = (and therefore you should, too):

  1. Evaluate the expression on the RHS of =.

    • If any part of the RHS is undefined, say "Error" and stop.

  2. Store that RHS result at some location on the computer.

  3. Assign the variable name on the LHS of = as a label for that specific location on the computer, so that if the variable appears again in the program, the stored value can be fetched.

Note that this is a multi-step process, distinct from the mathematical form. In line with this concept, one can read the computational x = ... like: "x is defined to be ...", or "x is set to ...", or "Let x be ...", or "Assign ... to x", etc., as well as simply "x is ..." We can also casually say "x equals ...", but just keep in mind the computational aspect.
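The steps above can be followed in a short sketch. The variable names a and b are chosen here only for illustration; the comments trace what the interpreter does on each line.

```python
a = 4 + 8      # the RHS is evaluated to 12, the 12 is stored, and a labels it
b = a          # the RHS is evaluated by fetching the 12; b now labels that value too
a = 0          # a is attached to a new value; this does not affect b

print(a)       # 0
print(b)       # 12
```

Because assignment is an action performed at one moment in the program flow, b keeps the value it was given even after a is reassigned.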

Note

Some other programming languages use a different symbol for the assignment operator, making it easier to mentally separate it from mathematical equality. For example, R uses the symbol <-, because it looks like an arrow taking the content from the RHS and putting it into the name on the LHS, which fits nicely with the behavior of assignment.

As we start creating variables, we note that each quantity being used in an expression must already be known and have a well-defined value; that is, a variable must be assigned before it is used anywhere. In the program flow through a script or code, lines are evaluated in order from the top of the file downwards. Therefore, if c has not previously been assigned a value, then executing:

d = c*c - 10
c = 4 + 8

will produce an error when evaluating the RHS of the first line (try it on your system: does the error message make sense?). However, if we interchange these two lines:

c = 4 + 8
d = c*c - 10

... then each quantity in the program flow is well-defined, and no error would be produced.
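Both orderings can be tried in one script with the following sketch. It uses the hypothetical names c_demo and d_demo (so that it does not depend on whether c already exists in your session) and a try/except block, which is introduced properly later, to report the error instead of stopping.

```python
try:
    d_demo = c_demo * c_demo - 10   # c_demo has no value yet, so the RHS cannot be evaluated
except NameError as err:
    print("NameError:", err)

c_demo = 4 + 8                      # now c_demo is defined ...
d_demo = c_demo * c_demo - 10       # ... so the same line works
print(d_demo)                       # 134
```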

One more caveat about how we might be tempted to view assignment and variable definition. Consider the computational expression:

y = 3*x + 4.5

This might look like the formula for a straight line from mathematics, but it is not a general relation within the program. This is a single instance, and a specific value for x must have been defined previously to be used here; executing this line would lead to a new, specific value assigned to y. (The way to write a general relation like the equation of a line is to define a function, whose syntax is discussed later).
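A small sketch may make this concrete. The specific values of x are chosen only for illustration; note that changing x afterwards does not change y until the assignment is executed again.

```python
x = 2
y = 3*x + 4.5    # evaluated once, with the current value of x
print(y)         # 10.5

x = 10           # reassigning x does not update y
print(y)         # 10.5

y = 3*x + 4.5    # executing the assignment again uses the new x
print(y)         # 34.5
```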

Q: Which of the following are valid or invalid expressions? Check by evaluating each line individually:

var + 3 = 10 + 15
var = 10 + 15 - 3
10 + 15 - 3 = var

5.2. Variables

The variable on the LHS of the assignment operator is, well, variable, and its value (and/or type, in Python) can be changed simply by assigning a new value to it at some point. So, the following sequence of lines is fine to run in Python:

x = 10
x = -11.4
x = False

y = 0
y = 20*15
y = -48

... and at the end of it, there is still only one variable x, which has type bool and value False, and one variable y, which has type int and value -48.
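One way to watch this happen is to display the type and value after each assignment, as in the sketch below. It assumes only the built-in type and print functions.

```python
x = 10
print(type(x), x)    # <class 'int'> 10

x = -11.4
print(type(x), x)    # <class 'float'> -11.4

x = False
print(type(x), x)    # <class 'bool'> False
```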

Every time a variable is defined in Python, it will have both a type and a value. It is not possible to say, "Let x be an unknown int that we will define later." It must be given at least some initial value, which can be changed later; we say that the variable has been initialized with a value. A common choice for a default int value is zero; if you later forget to give the variable its real value, checking its value and seeing zero might help point out that mistake.

We note that in some programming languages (such as C), a variable's type cannot be changed once it is declared, though its value could be changed. In Python, you can change both properties for a variable. However, in general it could lead to confusion to see x as a boolean in one part of the code and then have it be a float in another part and a complex value in another... There are useful cases of changing what is stored in a variable, but to reduce possible confusion, changing just the value (like in y above) is much more common than changing its type (as in x above)---if you want something to store a different type, probably just make a new variable.

Note

It's good to keep in mind these type considerations and flexibilities when debugging code. Sometimes an error message might seem totally unrelated to the intended quantities at hand, but one discovers that there is really a different-than-expected type present. When debugging, it is always helpful to verify that both the type and values of the involved quantities are correct (or at least within expected ranges).
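As a small, hypothetical illustration of a different-than-expected type: a quantity that we think of as a whole number can quietly become a float if it is produced by ordinary division. The variable names below are made up for this sketch.

```python
n_items = 10 / 2             # ordinary division always produces a float
print(n_items)               # 5.0
print(type(n_items))         # <class 'float'>

n_items_int = 10 // 2        # integer division of two ints produces an int
print(n_items_int)           # 5
print(type(n_items_int))     # <class 'int'>
```

Checking with type in this way is often quicker than trying to guess the type from an error message.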

The IPython command whos is a useful way to see what variables have been defined in the current environment. (It is an IPython "magic" command, so it is available in IPython and Jupyter Notebook but not in the plain Python interpreter.) It provides a list of the variable names, their types, and their values or other summary information. For example, to see the variables that we have defined during the current IPython or Jupyter Notebook session, you can type:

whos

... where the output table of information might look something like:

Variable   Type     Data/Info
-----------------------------
c          int      12
d          int      134
x          bool     False
y          int      -48
z          float    129.0

(NB: exact output will depend on specific commands you have run in the given Python session.)

5.3. In-place math operators

One way that we can change a variable's value, and one that should also help emphasize that = is very different than mathematical equality, is the following valid code:

my_var = 10             # Line 1
my_var = my_var + 5     # Line 2

What is the value of my_var at the end of this? Well, let's follow along what happens using the rules of program flow. In Line 1, my_var gets assigned a value of 10 (and type int). If Line 2 were (incorrectly) read with mathematical equality "=", it would look like an impossible expression, or something that was just false. But we know better than that, and will read Line 2 computationally. The program flow within a line will lead to the RHS of = being evaluated first---since my_var has a value 10 from Line 1, we read the RHS as 10 + 5. The resulting 15 gets stored on the computer and the variable on the LHS becomes its name, which in this case just happens to be my_var again. We have switched where the label my_var points on the computer, reassigning it.

Let's think about what happened in terms of quantities. In Line 1 we gave a variable a value, and in Line 2 we effectively just added to it. This kind of accumulating procedure comes up so often in programming that there is a shorthand for it:

my_var2  = 10
my_var2 += 5

... where we have introduced a new operator +=, which performs in-place addition. It is called "in-place" because we don't introduce any intermediate quantities, but instead appear to just directly "add to" the variable on the LHS from the evaluation of the RHS. Check the final value of my_var2: it should also be 15. We can only use += on a variable that has been initialized.
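The two forms can be compared side by side. In this sketch, long_form and short_form are illustrative names only.

```python
long_form = 10
long_form = long_form + 5    # explicit reassignment

short_form = 10
short_form += 5              # in-place shorthand for the line above

print(long_form, short_form)   # 15 15
```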

There are similar operators for in-place subtraction -=, multiplication *=, division /=, integer division //=, remainders %= and exponentiation **=. In each case, the RHS of the operator is evaluated first, and then the in-place operation occurs. Check that each of the following final values makes sense (spacing around the operators doesn't matter; I just like vertically aligning the =s):

test1  = 20
test1 *= 3

test2  = 20
test2 /= 4 + 2

test3   = 20
test3 //= 6

test4  = 20
test4 %= 18 / 3

test5  = -100
test5 -= 21 + 4
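The statement that the RHS is evaluated first can be checked with a short sketch. The names p, q and r are chosen only for illustration: p uses the in-place operator, q writes out the equivalent long form with parentheses, and r shows what would happen if the RHS were not grouped.

```python
p = 20
p /= 4 + 2           # RHS evaluated first: 20 / 6

q = 20
q = q / (4 + 2)      # equivalent long form

r = 20
r = r / 4 + 2        # not equivalent: (20 / 4) + 2

print(p, q, r)
print(r)             # 7.0
```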

Note that we can only use an in-place operator on a variable that has been initialized. The following:

not_yet_defined_var += 10

... produces an error:

<ipython-input-70-202f07c7ca71> in <module>
----> 1 not_yet_defined_var += 10

NameError: name 'not_yet_defined_var' is not defined

... since there is no starting "place" for the in-place operator.

Q: What is the final value of abc here:

abc = 10
abc = 15
abc+= 5

Q: What is the final value of efg here:

efg = 10
efg = 15 + efg
efg+= 5

5.4. Variable names

There are some rules for variable names to keep in mind:

  • They must be made up only of alphanumeric characters (a-z, A-Z, 0-9) and the underscore character _.

  • A variable name cannot start with a digit, but digits may appear anywhere else: var1 is OK, but 1var is not.

  • Unusable characters: Special characters such as \, $, %, !, etc. cannot be used, nor can commas ,, periods . or dashes -, as those all have special other meanings. Spaces cannot be used, as those separate different words; one can use an underscore _ instead of a space.

  • Variables are case sensitive (indeed, all words in Python are). That is, Python distinguishes between upper and lower case, so that Age, age and aGe are three different variables.

  • A variable name cannot be a reserved word, or keyword, in Python such as for, if, def, True, etc. It also should not be a word with a special built-in meaning (int, type, print, etc.): Python would allow the assignment, but the original functionality of that word would then be hidden.

Note

You can always get the list of keywords in your current version of Python by typing:

import keyword
print(keyword.kwlist)

in a Python environment. Again, avoid these words for variable names. We will be using many of these later in our programming adventure.
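Some of the naming rules above can be checked from within Python itself. The following is a minimal sketch, assuming the string method isidentifier and the function keyword.iskeyword from the standard library; the candidate names are written in quotes because they are being examined as text rather than used as variables.

```python
import keyword

# isidentifier() checks only the form of a name: allowed characters and first character.
print("var1".isidentifier())       # True
print("1var".isidentifier())       # False: starts with a digit
print("my-var".isidentifier())     # False: contains a dash
print("my_var".isidentifier())     # True

# A keyword has a valid form but is still reserved, so it is checked separately.
print("for".isidentifier())        # True
print(keyword.iskeyword("for"))    # True, so for cannot be used as a variable name
```

Neither check warns about built-in names such as int or type, so that rule still has to be kept in mind by the programmer.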

Please do choose meaningful variable names. In programming we are often translating mathematics, physics, biology, etc. from paper calculations to programs. In such cases, you should use the same variable names and notation in the program that you have written down on paper. For example, if writing the formula "distance equals velocity times time", you should probably make use of the standard variables:

d = v * t

or perhaps more interpretable, but still succinct, names:

dist = vel * time

Half (or more!) of the art of programming is to make the code itself as readable and understandable as possible. This makes the program simpler to check by eye in the present and more directly editable in the future. It also tends to reduce the number of bugs or errors in the code.
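As a small runnable sketch of the distance formula with readable names, where the numerical values and units are invented purely for illustration:

```python
vel  = 3.5            # velocity, in m/s (illustrative value)
time = 20.0           # elapsed time, in s (illustrative value)
dist = vel * time     # distance, in m

print(dist)           # 70.0
```

A brief comment giving each quantity's meaning and units is a cheap way to make such lines easier to check later.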

5.5. Practice

  1. What is the name of this operator in Python: =?

  2. Which of the following are invalid Python expressions, and should produce errors?

    xy = True
    2nd_score = 9.5/10
    int_ = 45
    var-1 = 205
    a5 = 0.22222
    num@int = int(5.5)
    
  3. Rewrite the following math equations as computational assignment operations, so that the variable(s) could be used later.

    1. (45 + x) / 10 = 39

    2. 16x - 4y = 3x + 2, for the case where y=3

  4. What are the results (type and value) of the following? (And does each make sense?)

    1. x = 15
      y = x // 4
      z = 3. * y
      
    2. cval = 5 + 4j
      val1 = 5 + 4j * 5 + 4j
      val2 = cval * cval
      
  5. Finding and fixing problems is part of programming. Run the following, read the error message, and fix the problematic assignment operation:

    5squared = 5**2
    
  6. What is the final value of each of the following variables (work through it in your head before checking with Python)?

    1. value1 = 6
      value1+= 10
      value1/= 5 - 1
      value1+= 5
      
    2. value2 = 99
      value2+= 45
      value2/= 12
      value2 = 5
      
    3. value3   = 0.25
      value3 **= -2
      value3  /= value3 / 4
      value3  *= 4
      
    4. value4  = 45
      value4 %= 5
      value4 *= 100
      value4 -= 1
      
    5. value5 += 3
      value5 *= 5
      value5 /= 100
      value5 -= 1