20. Lists, I: storing more stuff

In this section we get to know the list type. It is useful for storing sequences of multiple objects, even if they are not of the same type. Lists are also convenient when collecting an unknown number of items, as there are methods for inserting (and removing) elements in various ways. There are built-in operations for finding and sorting elements, too.

For strictly mathematical operations, we will still typically prefer to use arrays. But for many other kinds of work, the flexibility and large number of methods in lists will be helpful.

20.1. What is a `list`?

A list is another example of an ordered collection of elements in Python. These share some similarities with arrays, which we have introduced previously, but lists are a bit more flexible and less purely mathematical.

Both arrays and lists store items in indexed sequences, and we can change elements in each after we have created them (they are both mutable). But while an array can only contain a single type of element (e.g., just floats or only ints), a given list can store elements with different types: int, float, string, and much more (including other lists). Additionally, the methods and operations associated with lists differ from those with arrays.

To define and initialize a list, all you need to do is put your items within square brackets [ ... ], separated by commas. For example:

my_list_1 = [5, 9, 3]                        # a list of ints
my_list_2 = [1, "Hi!", 3.4j, "Hello again"]  # a list of mixed element types
my_list_3 = []                               # an empty list (but will be useful!)

There is also a built-in list() function for converting other collections to a list, which is discussed in more detail below.

A lot of the basic behaviors with lists and elements are similar to what we've seen with arrays:

You can check the length of each with the len() function, and for a length N the allowed non-negative indices lie in the range $[0, N)$, i.e., from 0 through N-1.
The elements of each list are selected using indices in the same way as with arrays, e.g., my_list_1[0] is 5 and my_list_1[-1] is 3.
Lists also have slicing capabilities similar to arrays. So, my_list_2[2:] evaluates to [3.4j, 'Hello again'].
Again, we should remember to distinguish between the type of the ordered collection object (type(my_list_2) is list) and the type of each of its elements (type(my_list_2[1]) is str). Though unlike with arrays, knowing one list element's type doesn't necessarily inform you about the others because they can differ.
The values in elements can be reassigned, just like in arrays. But unlike arrays, the type of a list element can be changed, as well; for example:
```
my_list_1[2] = 'banana'
print(my_list_1)
```
... produces: [5, 9, 'banana'].
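The behaviors listed above can be collected into one short sketch. The list `demo` is a hypothetical example introduced only for illustration; it is not one of the lists defined earlier.

```python
# Hypothetical example list (not one of the lists defined above)
demo = [10, "pear", 2.5, True]

print(len(demo))        # number of elements: 4
print(demo[0])          # first element: 10
print(demo[-1])         # last element: True
print(demo[1:3])        # slice holding indices 1 and 2: ['pear', 2.5]
print(type(demo))       # type of the collection: <class 'list'>
print(type(demo[1]))    # type of one element: <class 'str'>

demo[0] = "ten"         # reassign an element, changing its type from int to str
print(demo)             # ['ten', 'pear', 2.5, True]
```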

20.1.1. Nested lists

A list can even contain another list as an element. Consider:

my_list_4 = [ -3, "airplane", my_list_2 ]
print(my_list_4)

What is the length of this new list? One might guess either 3 or 6, but using len() tells us the answer is 3. Displaying the last element in this list my_list_4[2] verifies that it does contain the entire other list (as does type(my_list_4[2])).

... but since my_list_4[2] itself is a collection, we should be able to select an element out of that with indexing, right? So, what happens if we do the following:

print(my_list_4[2][1])

? We get the index [1] element of the list stored in the index [2] list in my_list_4, which is Hi!. Having a collection stored inside another collection is called a nested structure. But we can just use our normal indexing rules at each "layer" of the structure to select out elements. We can just keep appending index selection (as well as slicing, etc.) at each nesting layer.

Note that even the value stored in my_list_4[2][1] is another collection structure---a string---which uses indices to select elements.
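As a minimal sketch of indexing layer by layer, consider the following. The names `inner` and `outer` are hypothetical and chosen only for illustration, so that the question below stays open.

```python
# Hypothetical nested list, used only for illustration
inner = ["red", "green", "blue"]
outer = [1, 2, inner]

print(len(outer))        # 3: the inner list counts as a single element
print(outer[2])          # ['red', 'green', 'blue']
print(outer[2][0])       # red   (index 0 of the inner list)
print(outer[2][0][1])    # e     (index 1 of the string 'red')
print(outer[2][1:])      # ['green', 'blue']: slicing works at any layer
```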

Q: How would you then print the ! element from my_list_4? And how would you print the substring again from my_list_4?


20.2. Some useful list methods

As with any type or class in Python, there are a number of built-in methods to use with it. Most of the "list" class methods involve adding, deleting and finding elements in the list---but there are some other fun (and useful) functionalities, too. The full set of methods can be seen with help(list), but here are a few particularly common ones:

append(self, object, /)
    Append object to the end of the list.

extend(self, iterable, /)
    Extend list by appending elements from the iterable.

insert(self, index, object, /)
    Insert object before index.

pop(self, index=-1, /)
    Remove and return item at index (default last).

remove(self, value, /)
    Remove first occurrence of value.

reverse(self, /)
    Reverse *IN PLACE*.

sort(self, /, *, key=None, reverse=False)
    Stable sort *IN PLACE*.

Note that all of these actually operate in place, meaning that the specific list from which we operate gets changed by the method; some descriptions note this specifically, whereas for others it is implied. Let's investigate some of these and note the differing behaviors between ones that might seem similar. For initial example lists, let's use these, each of which has a length of four to start:

AA = ["Khoekhoe", "Bambara", "Oromo", "Egede"]
BB = ["Rokel", "Kagera", "Juba", "Oum Er-Rbia"]
CC = ["Apala", "Kwaito", "Gnawa", "Benga"]

Consider the append and extend methods. Each takes an input besides self: for extend, it is specifically an "iterable", while for append it is more generically any "object". Let's see what the following produces:

AA.append(CC)
BB.extend(CC)

print(len(AA), AA)
print(len(BB), BB)
print(len(CC), CC)

... which is (and note that we are printing the length of the updated list, as well as the list itself, because we suspect that will be useful information):

5 ['Khoekhoe', 'Bambara', 'Oromo', 'Egede', ['Apala', 'Kwaito', 'Gnawa', 'Benga']]
8 ['Rokel', 'Kagera', 'Juba', 'Oum Er-Rbia', 'Apala', 'Kwaito', 'Gnawa', 'Benga']
4 ['Apala', 'Kwaito', 'Gnawa', 'Benga']

In the case of append, the length increases by only one, and we see that the list CC is slotted into AA as a single element. For extend, the list CC is unpacked, as it were, and each element is added to BB separately, so the final length is the sum of the lengths of both lists. This is why the input to extend must be an iterable: each element gets pulled out (in order) and becomes a new element in the "base" list (so, what would happen if the input iterable were a string?). The append method does not require the input to be an iterable: if the input is one, then we will get a nested list, introduced briefly above. Note that in neither case does the input CC change.
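To make the string case concrete, here is a small sketch. The lists `letters_a` and `letters_b` are hypothetical examples, not ones used elsewhere in this section.

```python
# Two identical hypothetical starting lists
letters_a = ["x", "y"]
letters_b = ["x", "y"]

letters_a.append("hi")   # the whole string becomes one new element
letters_b.extend("hi")   # the string is iterated over, character by character

print(len(letters_a), letters_a)   # 3 ['x', 'y', 'hi']
print(len(letters_b), letters_b)   # 4 ['x', 'y', 'h', 'i']

# A non-iterable input is fine for append, but not for extend
letters_a.append(5)
try:
    letters_b.extend(5)
except TypeError as err:
    print("extend failed:", err)
```

So a string passed to extend is treated like any other iterable: it is split into its elements, which are single characters.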

The insert method puts a new element somewhere in the list, specified by the index location, pushing whatever is there at present to the right. You have to get the intended order of inputs correct for the index and object arguments, and negative indices can be used. Does it make sense what this:

CC.insert(2, -1)
CC.insert(-1, 4)
print(CC)

... produces:

['Apala', 'Kwaito', -1, 'Gnawa', 4, 'Benga']

?

There are a couple ways to remove unwanted elements: specifying the unwanted element by its index (pop) or by its value (remove). Each removes just one element. Note that another difference of these methods is that pop will output the value of the purged element, which could be assigned to a new variable if desired. Consider running this after the previous insertions:

unwanted = CC.pop(2)
print("after pop at index 2:", CC)
print("... with unwanted element:", unwanted)
CC.remove(4)
print("after removing value 4", CC)

... to get the following progression:

after pop at index 2: ['Apala', 'Kwaito', 'Gnawa', 4, 'Benga']
... with unwanted element: -1
after removing value 4 ['Apala', 'Kwaito', 'Gnawa', 'Benga']
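Two details from the method descriptions above are worth a quick check: pop uses the last index by default, and remove deletes only the first occurrence of a value. A minimal sketch, using a hypothetical list `nums`:

```python
# Hypothetical list with a repeated value
nums = [7, 3, 7, 9]

last = nums.pop()        # no index given, so the last element is removed
print(last, nums)        # 9 [7, 3, 7]

nums.remove(7)           # only the first 7 is removed
print(nums)              # [3, 7]
```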

And note that trying to remove an element that doesn't exist makes Python unhappy:

CC.remove("me singing")

... leads to:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-55-6f2e958daad3> in <module>
----> 1 CC.remove("me singing")

ValueError: list.remove(x): x not in list

(though that might also be a comment on my musical stylings). Feel free to test some of the list methods listed above on your own.
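If you are not sure whether a value is present, one way to avoid the error is to check first with the `in` operator, or to catch the ValueError. The sketch below uses a hypothetical list `genres` with the same values as CC; it is one possible approach, not the only one.

```python
# Hypothetical list with the same values as CC above
genres = ["Apala", "Kwaito", "Gnawa", "Benga"]
target = "me singing"

# Option 1: test for membership before removing
if target in genres:
    genres.remove(target)
else:
    print(target, "is not in the list; nothing removed")

# Option 2: attempt the removal and handle the error if it occurs
try:
    genres.remove(target)
except ValueError:
    print("remove failed, list unchanged:", genres)
```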

Q: From reading the above method usages (how many arguments does each require and/or optionally use?), what would each of the following now produce?

print("start :", CC)

CC.reverse()
print("rev   :", CC)

CC.sort()
print("sort1 :", CC)

CC.sort(reverse=True)
print("sort2 :", CC)


Note how sorting a list can make sense in some contexts, such as if all the elements are strings or numbers, but might not be so meaningful in others---such as if an element is another list or a mixed bag of types. In fact, Python will even refuse to sort in some cases:

list_mixed = [ "tree", -3.1, ["spam", "bacon", "eggs"], 2+1j]
list_mixed.sort()

... leads to an error:

TypeError                                 Traceback (most recent call last)
<ipython-input-76-4aad9954b2d9> in <module>
----> 1 list_mixed.sort()

TypeError: '<' not supported between instances of 'float' and 'str'

So even with all the flexibility lists offer, it will often be useful to construct them carefully in order to have full use of their functionality.
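The sort signature shown earlier also has a key parameter, which takes a function that is applied to each element before comparing. A brief sketch of how it might be used, with a hypothetical list `words`:

```python
# Hypothetical list of strings with mixed capitalization
words = ["kiwi", "Banana", "fig", "apple"]

words.sort()                  # default: uppercase letters sort before lowercase
print(words)                  # ['Banana', 'apple', 'fig', 'kiwi']

words.sort(key=str.lower)     # compare lowercased copies of each string
print(words)                  # ['apple', 'Banana', 'fig', 'kiwi']

words.sort(key=len)           # compare by string length
print(words)                  # ['fig', 'kiwi', 'apple', 'Banana']
```

In each case the list elements themselves are unchanged; the key function only affects how they are compared.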

20.3. Converting ordered collections to/from lists

There are times when it will be useful to convert an array to a list, or vice versa; or to glue together a list of strings into a single string, or to separate a string into a list of characters; or ... you get the idea. Here we discuss some of these conversions to/from lists and other types of ordered collections. In general, converting to a list is pretty straightforward, but converting from a list has additional considerations (because many non-list collections only store elements of a single type at one time).

20.3.1. Array to/from list

To convert an array to a list, we can use the built-in list() function:

import numpy as np                  # NumPy provides the array type

array_01 = np.linspace(0, 5, 11)    # create array
list_01  = list(array_01)           # convert type to list

print(array_01)
print(list_01)

You can verify that the collection lengths are the same in both cases. The values and type of each element also match. (On a minor note, the style with which they are printed differs.) Any array can be converted to a list like this.

To convert a list to an array, we can use the NumPy function np.array(), but we have to remember that arrays have the restriction of containing a single element type. Therefore, consider the conversion of the following lists:

list_02  = [1, 3, 10]
list_03  = [1, 3.5, 10]
list_04  = [1, 3.5, 'mango']

array_02 = np.array(list_02)
array_03 = np.array(list_03)
array_04 = np.array(list_04)

print(array_02)
print(array_03)
print(array_04)

... which produces:

[ 1  3 10]
[ 1.   3.5 10. ]
['1' '3.5' 'mango']

We can inspect the three printed outputs and observe that the types in each do appear to be uniform: first ints, then floats, and finally strings. In each case, np.array() performed implicit type casting to pick a datatype that accommodates the most demanding element: for array_03, this means converting all elements to float (rather than truncating 3.5); for array_04, this means making everything into a string (rather than zeroing 'mango' or something).

While we see that it is possible to convert lists with non-numeric types to arrays, it probably isn't advisable: usually, array functionality is best suited to more strictly mathematical objects. If you have a list of strings or mixed elements such as list_04, it is probably better to leave it as a list. "Upgrading" ints (or bools) to floats, however, can be fine, as long as that is what is desired. As we often stress, type does matter, such as if you want to use values for conditional testing. One can also specify the dtype of output array elements explicitly:

array_03b = np.array(list_03, dtype=int)
array_03c = np.array(list_03, dtype=bool)
array_03d = np.array(list_03, dtype=complex)
print(array_03b)
print(array_03c)
print(array_03d)

... leading to:

[ 1  3 10]
[ True  True  True]
[ 1. +0.j  3.5+0.j 10. +0.j]

Note that requesting dtype=int truncated 3.5 to 3 without any warning, and that with dtype=bool every nonzero value became True.

20.3.2. String to/from list

To convert a string to a list, we might have a couple choices, depending on what we want the outcome to be. Consider this string and the application of the list conversion function:

str_05  = "Strings fall apart"
list_05 = list(str_05)

print("before :", str_05)
print("after  :", list_05)

... which produces the following:

before : Strings fall apart
after  : ['S', 't', 'r', 'i', 'n', 'g', 's', ' ', 'f', 'a', 'l', 'l', ' ', 'a', 'p', 'a', 'r', 't']

As word-reading humans, we might have expected the conversion to produce this list of three items ['Strings', 'fall', 'apart'], but that is a different kind of functionality (one that we discussed earlier, using the string method "split"). In the present case we are mapping all the elements in an ordered collection to a new kind of collection, and what are the elements of a string? They are characters, not words, so that is what gets converted. The spaces are just another character here, not receiving any special consideration---they are merely represented in the list of strings along with every other element.

So, in summary: to map a string to a list, decide first whether you want to map it in units of words using the string method split(), or in units of characters (which are just strings of length 1) using the list() function.
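The two choices can be compared side by side on the same string. This is a minimal sketch; the variable names are hypothetical.

```python
phrase = "Strings fall apart"

words = phrase.split()   # units of words (split on whitespace)
chars = list(phrase)     # units of characters, spaces included

print(len(words), words)   # 3 ['Strings', 'fall', 'apart']
print(len(chars))          # 18
print(chars[:8])           # ['S', 't', 'r', 'i', 'n', 'g', 's', ' ']
```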

To convert a list to a string, we have to consider what type of elements are in the list. Let us first consider a list of strings.

We might first think of using the str() function. However, we can see from this example:

list_06 = ['S', 't', 'r', 'i', 'n', 'g', ' ', 'b', 'r', 'e', 'e', 'd']
str_06  = str(list_06)

print(str_06)

... which produces:

['S', 't', 'r', 'i', 'n', 'g', ' ', 'b', 'r', 'e', 'e', 'd']

Hmmm, that is odd and probably unexpected: it appears that nothing has happened. We can doublecheck that we really put str_06 in the print function and not list_06: true! So, what is happening? It looks like str_06 is still a list.

However, we can verify that type(str_06) == str (and it does). If we check the len(str_06), we find it is 60 (!!), which seems surprisingly long for the number of elements in list_06 (12). It might clear things up to just display the variable like this:

str_06

... which produces:

"['S', 't', 'r', 'i', 'n', 'g', ' ', 'b', 'r', 'e', 'e', 'd']"

Oh. Now we see that Python has literally taken the list printout and converted that into a string, formatting with apostrophes around each element. Wow. That is probably not what we wanted. So we wonder, how can we convert a list of strings into a string, more akin to inverting list(STRING)?

The way to do this in Python is to use built-in string functionality, a method called join(). We can look at its docstring with str.join? (in IPython or Jupyter; help(str.join) gives similar information):

Signature: str.join(self, iterable, /)
Docstring:
Concatenate any number of strings.

The string whose method is called is inserted in between each given string.
The result is returned as a new string.

Example: '.'.join(['ab', 'pq', 'rs']) -> 'ab.pq.rs'
Type:      method_descriptor

The join method takes a given "base" string and uses it as glue between each element in an iterable (here, a list of strings) to produce a new string. The docstring's example in Line 8 shows how the base string '.' is used to glue together the string elements in the input list (again, note that we use this method as if "self" weren't in the parameter list).

So, back to our example above, we now just need to choose what character we want to use to glue together our list of strings. Consider the following:

str_06b  = " ".join(list_06)          # space
str_06c  = "".join(list_06)           # 'null' str
str_06d  = "'".join(list_06)          # single apostrophe
str_06e  = "\t".join(list_06)         # Tab
str_06f  = "Ab12".join(list_06)       # several characters

print(str_06b)
print(str_06c)
print(str_06d)
print(str_06e)
print(str_06f)

... which produces:

S t r i n g   b r e e d
String breed
S't'r'i'n'g' 'b'r'e'e'd
S    t       r       i       n       g               b       r       e       e       d
SAb12tAb12rAb12iAb12nAb12gAb12 Ab12bAb12rAb12eAb12eAb12d

Any of these run without error. Using "".join(LIST), as for str_06c, is probably the case that will most often be desired, but in different situations the other ones might certainly be useful, too. (Note that str_06e, printed in Line 4, might appear differently in your output, depending on your interface's tab size.)

When using the string method join, note that if any elements of the input list are not already strings, then Python will produce an error.

Q: Try running the following:

"-".join([2, "be", "continued"])

... and both see that the error makes sense and try a solution.
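One possible approach to this kind of error, sketched here on a different (hypothetical) list so that the exercise above remains yours to solve, is to convert each element to a string before joining:

```python
# Hypothetical list containing a non-string element
items = [3, "little", "pigs"]

# Joining directly fails because 3 is an int, not a str
try:
    " ".join(items)
except TypeError as err:
    print("join failed:", err)

# Build a new list in which every element has been converted with str()
as_strings = []
for item in items:
    as_strings.append(str(item))

joined = " ".join(as_strings)
print(joined)            # 3 little pigs
```

The original list is left untouched, which is useful if the int value is still needed later.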


20.4. List Operators

We have seen how Python employs traditional mathematical operators with various types beyond int, float, etc. For example, we saw this with strings in earlier sections. We now look briefly at similar operations on lists.

Analogous to strings, the + operator concatenates two lists:

print(['a', 'b', 'c'] + [1, 2, 3])

... outputs: ['a', 'b', 'c', 1, 2, 3].

Q: What list method from above could you apply to these two lists to obtain the same output?


The * can operate on a list and an int, producing a new list that is that integer number of copies of the original list, all concatenated together:

print([True, False] * 3)

... outputs: [True, False, True, False, True, False].

In either case, the result could be assigned to a variable, saving the output to a new list.
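As a small sketch of saving the results, and of the fact that neither operator changes the lists it acts on (the variable names here are hypothetical):

```python
first = ["a", "b"]
second = [1, 2]

combined = first + second    # a new list; first and second are untouched
repeated = second * 2        # a new list with two copies of second

print(combined)              # ['a', 'b', 1, 2]
print(repeated)              # [1, 2, 1, 2]
print(first, second)         # ['a', 'b'] [1, 2]
```

This differs from the list methods seen earlier, which modify the list in place.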

Q: Earlier, we found it useful to initialize arrays with a certain length and type of zeros (before looping through them). How could you use * to make a list of 10 float zeros? A list of 7 boolean "zeros"?


20.5. Lists and for loops

We have already seen how arrays can be conveniently combined with for-loops, whether populating or walking through elements. Lists can also be combined with for-loops in a similar way.

Consider populating a list with elements according to the following rule:

$L_n = 2^{n}, {\rm ~for~} n = 0, .., 5$    (1)

For starters, we could exactly emulate our strategy with arrays: populate a full list of zeros (of the right type!), then walk through and fill it in. This might look like:

N = 6                    # number of elements
L = [0] * N              # a list of int zeros
for n in range(N):
    L[n] = 2**n
print("L:", L)

... and that should work fine to produce: L: [1, 2, 4, 8, 16, 32].

As another approach, we could recall that lists come with built-in methods to add elements to a list. In particular, append takes any object and sticks it onto the end of the list. So, we could start with an empty list, and then build it up element-by-element:

N = 6                    # number of elements
L = []                   # an empty list
for n in range(N):
    L.append(2**n)
print("L:", L)

This produces the same result as above. Note the pieces of the calculation that changed: the initialization of L in Line 2, and the way we store each 2**n in Line 4: assignment in the first case, and a list-method in the second.

Is one of these inherently better than the other? Probably not, at least for these kinds of calculations (later, we will look at computational time, which can matter for larger collections of numbers). The first one is a bit closer to the original mathematical formulation, which is nice; the second takes a bit less planning ahead, which might be appealing (but in programming, we do want to always be planning ahead...). Note that in the second case, one doesn't need to know the index location to store the value, because it just gets stored at the end of what exists already; in cases that are less math-driven, this can be particularly convenient.
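The append approach is especially handy when we do not know in advance how many values will be kept. As a hedged illustration (the task and names below are hypothetical, not from the examples above), here we collect only those numbers that pass a test:

```python
# Collect the multiples of 7 between 1 and 49, without knowing
# beforehand how many there will be
multiples = []                   # start with an empty list
for n in range(1, 50):
    if n % 7 == 0:               # keep only values that pass the test
        multiples.append(n)

print(len(multiples), multiples) # 7 [7, 14, 21, 28, 35, 42, 49]
```

With the pre-allocated approach, we would have needed to work out the final length before the loop started.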

20.5.1. List Comprehension

There is another syntax that Python has for combining for-loops and lists, which is called list comprehension. List comprehensions are simply a way to compress the building of a list using a for-loop into a single short and readable line. Let's apply it to the same example we used above:

$L_n = 2^{n}, {\rm ~for~} n = 0, .., 5$

... which would look like:

L = [2**n for n in range(6)]
print("L:", L)

Wow, everything really did get calculated in one line! (We cheated a bit by not defining N=6 first, but that is a trivial change.) How does this work?

Well, note that the RHS expression is in square brackets, denoting that a list is being created. The major change is that the iteration takes place inside these brackets. But we should be able to recognize most of the same pieces that we have seen in our "classical" for loops above: for n in range(6) is still there, as is the value that gets calculated for each iteration 2**n. We just see them in a different order than usual: but one that happens to match the math order nicely.

Comparing the mathematical and computational expressions, we see that we have the following translations of each part of the list comprehension:

Math expression	$\rightarrow$	Comp expression
$L_n = ...$	$\rightarrow$	`L = [...]`
$2^n$	$\rightarrow$	`2**n`
${\rm ~for~} n = 0, .., 5$	$\rightarrow$	`for n in range(6)`

In general, the basic syntax of list comprehension is:

[ EXPRESSION for ITEM in ITERABLE ]

which is analogous to the standard for-loop:

for ITEM in ITERABLE:
    EXPRESSION

Though note that with the for-loop, one still has to consider the list initialization and assignment of elements. In list comprehension, both of those aspects are handled implicitly and one just assigns the whole list to a variable name (one of the reasons people like this structure).
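To see the correspondence concretely, the following sketch builds the same list both ways and checks that the results agree. The names `squares_loop` and `squares_comp` are hypothetical.

```python
# Standard for-loop: explicit initialization and append
squares_loop = []
for k in range(5):
    squares_loop.append(k**2)

# List comprehension: initialization and storage are implicit
squares_comp = [k**2 for k in range(5)]

print(squares_loop)                   # [0, 1, 4, 9, 16]
print(squares_comp)                   # [0, 1, 4, 9, 16]
print(squares_loop == squares_comp)   # True
```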

Overall, list comprehension can be a nice, compact way to store a set of values. For simple looping, it can be convenient (we will see that even things like conditions can be combined with it). For more complicated expressions, however, the standard for-loop might be preferable or required.
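As a brief preview of combining a condition with list comprehension, an `if` clause can be placed after the iterable so that only some items are used. This is a minimal sketch with a hypothetical variable name:

```python
# Square only the even values of k from 0 through 9
evens_squared = [k**2 for k in range(10) if k % 2 == 0]
print(evens_squared)     # [0, 4, 16, 36, 64]
```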

In summary:

Lists provide a convenient object for storing multiple objects, particularly when we want to store objects of varied or initially-unknown type. (For many mathematical operations, we still might like to use arrays.)
Lists can even store other lists, which we refer to as "nesting".
Lists are ordered (so we use indices to select elements) and mutable (we can change elements).
Many useful methods exist for managing lists, such as sort, index, and count.
We can inject new elements into the list, either specifically at the end (e.g., with the append and extend methods) or generally anywhere (insert method).
We can also remove existing elements (with remove or pop).
Many list methods operate in-place, meaning that the base ("self") list itself gets updated, rather than creating a new object to assign to a new variable.
List comprehension provides a compact way to calculate over items in an iterable object, which can be convenient in code.

20.6. Practice

Use list comprehension to store the first 10 terms of the sequence defined by: $x_n=2n+3, n \ge 0$ .
Use list comprehension to extract the letters of the word extraction into a list.
What is the largest number you can make from rearranging the digits in 9275827179421? How could you find this computationally, say, using some clever type conversion(s)? And what is the smallest number you could make from this (found in a similar way)?
Using the append() method, create a list containing the first 20 terms of the sequence $a_0 = 3, a_n = 2a_{n-1} - 2$ . Note that this is a recursive sequence, so consider what that might mean for any initialization...
Let w = np.linspace(1, 2, 101). Use list comprehension to calculate $(w_i)^{-1}\,\sqrt{1+w_i}$ and to store the result in a list output. Plot w vs output (NB: plotting works the exact same for arrays and lists).
