20. Lists, I: storing more stuff
In this section we get to know the list type. It is useful for storing sequences of multiple objects, even if they are not of the same type. Lists are also convenient when collecting an unknown number of items, as there are methods for inserting (and removing) elements in various ways. There are built-in operations for finding and sorting elements, too.
For strictly mathematical operations, we will still typically prefer to use arrays. But for various other expressions and work, the flexibility and large number of methods in lists will be helpful.
20.1. What is a list?
A list is another example of an ordered collection of elements in Python. Lists share some similarities with arrays, which we have introduced previously, but lists are a bit more flexible and less purely mathematical.
Both arrays and lists store items in indexed sequences, and we can change elements in each after we have created them (they are both mutable). But while an array can only contain a single type of element (e.g., just floats or only ints), a given list can store elements with different types: int, float, string, and much more (including other lists). Additionally, the methods and operations associated with lists differ from those with arrays.
To define and initialize a list, all you need to do is put your items within square brackets [ ... ], separated by commas. For example:
my_list_1 = [5, 9, 3] # a list of ints
my_list_2 = [1, "Hi!", 3.4j, "Hello again"] # a list of mixed element types
my_list_3 = [] # an empty list (but will be useful!)
There is also a built-in list() function for converting other collections to a list, which is discussed in more detail below.
A lot of the basic behaviors of lists and their elements are similar to what we've seen with arrays:
You can check the length of each with the len() function, and for a length N the allowed indices range from 0 to N-1 (or from -N to -1 when counting from the end).
The elements of each list are selected using indices in the same way as with arrays, e.g., my_list_1[0] is 5 and my_list_1[-1] is 3.
Lists also have slicing capabilities similar to arrays. So, my_list_2[2:] evaluates to [3.4j, 'Hello again'].
Again, we should remember to distinguish between the type of the ordered collection object (type(my_list_2) is list) and the type of each of its elements (type(my_list_2[1]) is str). Though unlike with arrays, knowing one list element's type doesn't necessarily inform you about the others because they can differ.
The values in elements can be reassigned, just like in arrays. But unlike arrays, the type of a list element can be changed, as well; for example:
my_list_1[2] = 'banana'
print(my_list_1)
... produces: [5, 9, 'banana'].
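As a quick check of the behaviors listed above, here is a minimal sketch. The list name demo and its contents are made up for this illustration and are not one of the lists defined earlier.

```python
# 'demo' is a hypothetical list used only for this illustration
demo = [5, "Hi!", 3.4, "Hello again"]

print(len(demo))        # number of elements in the list
print(demo[0])          # first element
print(demo[-1])         # last element
print(demo[1:3])        # slice holding the elements at indices 1 and 2
print(type(demo))       # type of the collection itself
print(type(demo[1]))    # type of one particular element

demo[2] = "banana"      # reassign an element; its type changes from float to str
print(demo)
```

Each printed line can be compared against the corresponding item in the list of behaviors above.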
20.1.1. Nested lists
A list can even contain another list as an element. Consider:
my_list_4 = [ -3, "airplane", my_list_2 ]
print(my_list_4)
What is the length of this new list? One might guess either 3 or 6, but using len() tells us the answer is 3. Displaying the last element in this list, my_list_4[2], verifies that it does contain the entire other list (as does type(my_list_4[2])).
... but since my_list_4[2] itself is a collection, we should be able to select an element out of that with indexing, right? So, what happens if we do the following:
print(my_list_4[2][1])
? We get the index [1] element of the list stored at index [2] of my_list_4, which is Hi!. Having a collection stored inside another collection is called a nested structure. But we can just use our normal indexing rules at each "layer" of the structure to select out elements. We can just keep appending index selection (as well as slicing, etc.) at each nesting layer.
Note that even the value stored in my_list_4[2][1] is another collection structure (a string), which uses indices to select elements.
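The layer-by-layer indexing described above can be sketched with a separate pair of lists. The names inner and outer are hypothetical and chosen only for this example.

```python
# 'inner' and 'outer' are hypothetical names for this illustration
inner = [10, "kiwi", 2.5]
outer = [-3, "airplane", inner]

print(len(outer))         # the inner list counts as a single element
print(outer[2])           # the whole inner list
print(outer[2][0])        # first element of the inner list
print(outer[2][1][0])     # first character of the string stored in the inner list
print(outer[2][1][1:])    # a slice of that same string
```

Each extra pair of square brackets steps one layer deeper into the nested structure.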
Q: How would you then print the ! element from my_list_4? And how would you print the string again from my_list_4?
20.2. Some useful list methods
As with any type or class in Python, there are a number of built-in methods to use with it. Most of the "list" class methods involve adding, deleting and finding elements in the list, but there are some other fun (and useful) functionalities, too. The full set of methods can be seen with help(list), but here are a few particularly common ones:
append(self, object, /)
Append object to the end of the list.
extend(self, iterable, /)
Extend list by appending elements from the iterable.
insert(self, index, object, /)
Insert object before index.
pop(self, index=-1, /)
Remove and return item at index (default last).
remove(self, value, /)
Remove first occurrence of value.
reverse(self, /)
Reverse *IN PLACE*.
sort(self, /, *, key=None, reverse=False)
Stable sort *IN PLACE*.
Note that all of these actually operate in place, meaning that the specific list from which we operate gets changed by the method; some descriptions note this specifically, whereas for others it is implied. Let's investigate some of these and note the differing behaviors between ones that might seem similar. For initial example lists, let's use these, each of which has a length of four to start:
AA = ["Khoekhoe", "Bambara", "Oromo", "Egede"]
BB = ["Rokel", "Kagera", "Juba", "Oum Er-Rbia"]
CC = ["Apala", "Kwaito", "Gnawa", "Benga"]
Consider the append and extend methods. Each takes an input besides self: for extend, it is specifically an "iterable", while for append it is more generically any "object". Let's see what the following produces:
AA.append(CC)
BB.extend(CC)
print(len(AA), AA)
print(len(BB), BB)
print(len(CC), CC)
... which is (and note that we are printing the length of the updated list, as well as the list itself, because we suspect that will be useful information):
5 ['Khoekhoe', 'Bambara', 'Oromo', 'Egede', ['Apala', 'Kwaito', 'Gnawa', 'Benga']]
8 ['Rokel', 'Kagera', 'Juba', 'Oum Er-Rbia', 'Apala', 'Kwaito', 'Gnawa', 'Benga']
4 ['Apala', 'Kwaito', 'Gnawa', 'Benga']
In the case of append, the length increases only by one, and we see that the list CC is slotted in as a single element. For extend, the list CC is unpacked, as it were, and each element is added in separately, so the final length is the sum of both lists.
This is why the input to extend must be an iterable: each element gets pulled out (in order) and becomes a new element in the "base" list (so, what would happen if the input iterable were a string?). The append method does not require the input to be an iterable: if the input is one, then we will get a nested list, introduced briefly above. Note that in neither case does the input CC change.
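To see the same contrast with an iterable that is not a list, here is a small sketch using a tuple. The names base_a, base_b and extra are hypothetical and separate from AA, BB and CC above.

```python
# hypothetical example data, separate from AA, BB and CC above
base_a = [1, 2]
base_b = [1, 2]
extra = (3, 4)              # a tuple is also an iterable

base_a.append(extra)        # the tuple becomes one new element
base_b.extend(extra)        # each item of the tuple becomes a new element

print(len(base_a), base_a)
print(len(base_b), base_b)

result = base_b.append(5)   # in-place methods change the list and return None
print(result)
print(base_b)
```

The last lines are a reminder that assigning the output of an in-place method to a variable stores None, not the updated list.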
The insert method puts a new element somewhere in the list, specified by the index location, pushing whatever is there at present to the right. You have to get the intended order of inputs correct for the index and object arguments, and negative indices can be used. Does it make sense that this:
CC.insert(2, -1)
CC.insert(-1, 4)
print(CC)
... produces:
['Apala', 'Kwaito', -1, 'Gnawa', 4, 'Benga']
?
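A few more index choices for insert can be explored with the sketch below. The list letters is a hypothetical example, kept separate so that CC is not altered.

```python
letters = ["a", "b", "c"]    # hypothetical example list

letters.insert(0, "start")   # index 0: becomes the new first element
letters.insert(-1, "x")      # index -1: placed just before the last element
letters.insert(len(letters), "end")   # index len(list): same effect as append
letters.insert(100, "far")   # an index past the end is treated like the end

print(letters)
```

Note that insert never raises an error for an out-of-range index; it simply places the element at the nearest end.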
There are a couple of ways to remove unwanted elements: specifying the unwanted element by its index (pop) or by its value (remove). Each removes just one element. Another difference between these methods is that pop returns the value of the purged element, which could be assigned to a new variable if desired. Consider running this after the previous insertions:
unwanted = CC.pop(2)
print("after pop 2nd element:", CC)
print("... with unwanted element:", unwanted)
CC.remove(4)
print("after removing value 4", CC)
... to get the following progression:
after pop 2nd element: ['Apala', 'Kwaito', 'Gnawa', 4, 'Benga']
... with unwanted element: -1
after removing value 4 ['Apala', 'Kwaito', 'Gnawa', 'Benga']
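Since the index argument of pop is optional (it defaults to -1), it can also be called with no argument at all. A minimal sketch, using a hypothetical list named stack:

```python
stack = ["Apala", "Kwaito", "Gnawa"]   # hypothetical example list

last = stack.pop()      # no index given: removes and returns the last element
first = stack.pop(0)    # index 0: removes and returns the first element

print(last)
print(first)
print(stack)
```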
And note that trying to remove an element that doesn't exist makes Python unhappy:
CC.remove("me singing")
... leads to:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-55-6f2e958daad3> in <module>
----> 1 CC.remove("me singing")
ValueError: list.remove(x): x not in list
(though that might also be a comment on my musical stylings). Feel free to test some of the list methods listed above on your own.
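If it is not known ahead of time whether a value is present, the error can be avoided in a couple of ways. The sketch below shows two possible approaches; the names genres and target are hypothetical, and which option is preferable depends on the situation.

```python
genres = ["Apala", "Kwaito", "Gnawa", "Benga"]   # hypothetical example list
target = "me singing"

# Option 1: check membership first with the `in` operator
if target in genres:
    genres.remove(target)
else:
    print(target, "is not in the list; nothing removed")

# Option 2: attempt the removal and handle the error if it occurs
try:
    genres.remove(target)
except ValueError:
    print("remove() raised ValueError for:", target)

print(genres)
```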
Q: From reading the above method usages (how many arguments does each require and/or optionally use?), what would each of the following now produce?
print("start :", CC)

CC.reverse()
print("rev :", CC)

CC.sort()
print("sort1 :", CC)

CC.sort(reverse=True)
print("sort2 :", CC)
Note how sorting a list can make sense in some contexts, such as if all the elements are strings or numbers, but might not be so meaningful in others---such as if an element is another list or a mixed bag of types. In fact, Python will even refuse to sort in some cases:
list_mixed = [ "tree", -3.1, ["spam", "bacon", "eggs"], 2+1j]
list_mixed.sort()
... leads to an error:
TypeError Traceback (most recent call last)
<ipython-input-76-4aad9954b2d9> in <module>
----> 1 list_mixed.sort()
TypeError: '<' not supported between instances of 'float' and 'str'
So even with all the flexibility lists offer, it will often be useful to construct them carefully in order to have full use of their functionality.
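The sort signature shown earlier also includes a key parameter, which takes a function that is applied to each element before comparing. The sketch below is one hedged illustration of how that might be used; the lists words and mixed are hypothetical, and sorting mixed types by their string form is just one possible workaround, not necessarily a meaningful ordering.

```python
words = ["tree", "Fig", "apple", "kiwi"]   # hypothetical example list

words.sort()                 # default: uppercase letters sort before lowercase
print(words)

words.sort(key=str.lower)    # compare lowercase versions (case-insensitive)
print(words)

words.sort(key=len)          # compare lengths; ties keep their current order
print(words)

mixed = ["tree", -3.1, 10]   # hypothetical mixed-type list
mixed.sort(key=str)          # compare the string form of each element
print(mixed)
```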
20.3. Converting ordered collections to/from lists
There are times where it will be useful to convert an array to a list, or vice versa; or to glue together a list of strings into a single string, or to separate a string into a list of characters; or ... you get the idea. Here we discuss some of these conversions to/from lists and other types of ordered collections. In general, converting to a list is pretty straightforward, but converting from a list has additional considerations (because many non-list collections only store elements of a single type at one time).
20.3.1. Array to/from list
To convert an array to a list, we can use the built-in list() function:
import numpy as np # needed for the array functions in this section

array_01 = np.linspace(0, 5, 11) # create array
list_01 = list(array_01) # convert type to list
print(array_01)
print(list_01)
You can verify that the collection lengths are the same in both cases. The values and type of each element also match. (On a minor note, the style with which they are printed differs.) Any array can be converted to a list like this.
To convert a list to an array, we can use the NumPy function np.array(), but we have to remember that arrays have the restriction of containing a single element type. Therefore, consider the conversion of the following lists:
list_02 = [1, 3, 10]
list_03 = [1, 3.5, 10]
list_04 = [1, 3.5, 'mango']
array_02 = np.array(list_02)
array_03 = np.array(list_03)
array_04 = np.array(list_04)
print(array_02)
print(array_03)
print(array_04)
... which produces:
[ 1 3 10]
[ 1. 3.5 10. ]
['1' '3.5' 'mango']
We can inspect the three printed outputs and observe that the types in each do appear to be uniform: first ints, then floats, and finally strings. In each case, the NumPy function performed implicit type casting to pick a datatype that accommodates the most demanding element: for array_03, this means converting all elements to float (rather than truncating 3.5); for array_04, this means making everything into a string (rather than zeroing 'mango' or something).
While we see that it is possible to convert lists with non-numeric types to arrays, it probably isn't advisable: usually, array functionality is best suited to more strictly mathematical objects. If you have a list of strings or mixed elements such as list_04, it is probably better to leave it as a list. "Upgrading" ints (or bools) to floats, however, can be fine, as long as that is what is desired; as we often stress, type does matter, such as if you want to use values for conditional testing. One can also specify the dtype of output array elements explicitly:
array_03b = np.array(list_03, dtype=int)
array_03c = np.array(list_03, dtype=bool)
array_03d = np.array(list_03, dtype=complex)
print(array_03b)
print(array_03c)
print(array_03d)
... leading to:
[ 1 3 10]
[ True True True]
[ 1. +0.j 3.5+0.j 10. +0.j]
20.3.2. String to/from list
To convert a string to a list, we might have a couple choices, depending on what we want the outcome to be. Consider this string and the application of the list conversion function:
str_05 = "Strings fall apart"
list_05 = list(str_05)
print("before :", str_05)
print("after :", list_05)
... which produces the following:
before : Strings fall apart
after : ['S', 't', 'r', 'i', 'n', 'g', 's', ' ', 'f', 'a', 'l', 'l', ' ', 'a', 'p', 'a', 'r', 't']
As word-reading humans, we might have expected the conversion to produce this list of three items, ['Strings', 'fall', 'apart'], but that is a different kind of functionality (one that we discussed earlier, using the string method "split"). In the present case we are mapping all the elements in an ordered collection to a new kind of collection, and what are the elements of a string? They are characters, not words, so that is what gets converted. The spaces are just another character here, not receiving any special consideration: they are merely represented in the list of strings along with every other element.
So, in summary: to map a string to a list, decide first whether you want to map it in units of words using the string method split(), or in units of characters (which are just strings of length 1) using the list() function.
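The two choices can be compared side by side in a short sketch. It reuses the same phrase as above, but the variable names phrase, chars and words are introduced here only for illustration.

```python
phrase = "Strings fall apart"

chars = list(phrase)     # one element per character, spaces included
words = phrase.split()   # one element per whitespace-separated word

print(len(chars), chars[:7])
print(len(words), words)
```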
To convert a list to a string, we have to consider what type of elements are in the list. Let us first consider a list of strings.
We might first think of using the str() function. However, consider this example:
list_06 = ['S', 't', 'r', 'i', 'n', 'g', ' ', 'b', 'r', 'e', 'e', 'd']
str_06 = str(list_06)
print(str_06)
... which produces:
['S', 't', 'r', 'i', 'n', 'g', ' ', 'b', 'r', 'e', 'e', 'd']
Hmmm, that is odd and probably unexpected: it appears that nothing has happened. We can doublecheck that we really put str_06 in the print function and not list_06: true! So, what is happening? It looks like str_06 is still a list.
However, we can check whether type(str_06) == str (and it does). If we check len(str_06), we find it is 60 (!!), which seems surprisingly long for the number of elements in list_06 (12). It might clear things up to just display the variable like this:
str_06
... which produces:
"['S', 't', 'r', 'i', 'n', 'g', ' ', 'b', 'r', 'e', 'e', 'd']"
Oh. Now we see that Python has literally taken the list printout and converted that into a string, formatting with apostrophes around each element. Wow. That is probably not what we wanted. So we wonder, how can we convert a list of strings into a string, more akin to inverting list(STRING)?
The way to do this in Python is to use built-in string functionality, a method called join(). We can look at its docstring with str.join? (in IPython or Jupyter; help(str.join) shows similar information):
1 Signature: str.join(self, iterable, /)
2 Docstring:
3 Concatenate any number of strings.
4
5 The string whose method is called is inserted in between each given string.
6 The result is returned as a new string.
7
8 Example: '.'.join(['ab', 'pq', 'rs']) -> 'ab.pq.rs'
9 Type: method_descriptor
The join method takes a given "base" string and uses it as glue between each element in an iterable (here, a list of strings) to produce a new string. The docstring's example in Line 8 shows how the base string '.' is used to glue together the string elements in the input list (again, note that we use this method as if "self" weren't in the parameter list).
So, back to our example above, we now just need to choose what character we want to use to glue together our list of strings. Consider the following:
str_06b = " ".join(list_06) # space
str_06c = "".join(list_06) # 'null' str
str_06d = "'".join(list_06) # single apostrophe
str_06e = "\t".join(list_06) # Tab
str_06f = "Ab12".join(list_06) # several characters
print(str_06b)
print(str_06c)
print(str_06d)
print(str_06e)
print(str_06f)
... which produces:
1 S t r i n g   b r e e d
2 String breed
3 S't'r'i'n'g' 'b'r'e'e'd
4 S       t       r       i       n       g               b       r       e       e       d
5 SAb12tAb12rAb12iAb12nAb12gAb12 Ab12bAb12rAb12eAb12eAb12d
Any of these run without error. In all likelihood, using "".join(LIST), as for str_06c, is the case that will most often be desired. But in different situations, other ones might certainly be useful, too. (Note that str_06e, printed in Line 4, might appear differently in your output, depending on your interface's tab size.)
When using the string method join, note that if any elements of the input list are not already strings, then Python will produce an error.
Q: Try running the following:
"-".join([2, "be", "continued"])
... and both see that the error makes sense and try a solution.
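One possible way to handle non-string elements, sketched here under the assumption that the plain string form of each element is what is wanted, is to convert every element with str() before joining. The list items and the other names below are hypothetical.

```python
items = [3, "blind", "mice", 2.5]   # hypothetical list of mixed types

# convert each element to a string first, collecting the results in a new list
as_strings = []
for item in items:
    as_strings.append(str(item))

joined = "-".join(as_strings)       # now every element is a str, so join works
print(joined)
```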
20.4. List Operators
We have seen how Python employs traditional mathematical operators with various types beyond int, float, etc. For example, we saw this with strings in earlier sections. We now look briefly at similar operations on lists.
Analogous to strings, the + operator concatenates two lists:
print(['a', 'b', 'c'] + [1, 2, 3])
... outputs: ['a', 'b', 'c', 1, 2, 3].
Q: Which list method from above could you apply to these two lists to obtain the same output?
The * operator can operate on a list and an int, producing a new list that is that integer number of copies of the original list, all concatenated together:
print([True, False] * 3)
... outputs: [True, False, True, False, True, False].
In either case, the result could be assigned to a variable, saving the output to a new list.
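Unlike the in-place methods discussed earlier, these operators build a new list and leave their operands untouched. A minimal sketch, with hypothetical names left, right, combined and repeated:

```python
left = ["a", "b"]          # hypothetical example lists
right = [1, 2]

combined = left + right    # a new list made from both operands
repeated = left * 2        # a new list holding two copies of left

print(combined)
print(repeated)
print(left, right)         # the original lists are unchanged
```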
Q: Earlier, we found it useful to initialize arrays with a certain length and type of zeros (before looping through them). How could you use * to make a list of 10 float zeros? A list of 7 boolean "zeros"?
20.5. Lists and for loops
We have already seen how arrays can be conveniently combined with for-loops, whether populating or walking through elements. Lists can also be combined with for-loops in a similar way.
Consider populating a list with elements according to the following rule:
$L_n = 2^n, \quad n = 0, 1, \ldots, N-1$ (1)
For starters, we could exactly emulate our strategy with arrays: populate a full list of zeros (of the right type!), then walk through and fill it in. This might look like:
1 N = 6 # number of elements
2 L = [0] * N # a list of int zeros
3 for n in range(N):
4     L[n] = 2**n
5 print("L:", L)
... and that should work fine to produce: L: [1, 2, 4, 8, 16, 32].
As another approach, we could recall that lists come with built-in methods to add elements to a list. In particular, append takes any object and sticks it onto the end of the list. So, we could start with an empty list, and then build it up element-by-element:
1 N = 6 # number of elements
2 L = [] # an empty list
3 for n in range(N):
4     L.append(2**n)
5 print("L:", L)
This produces the same result as above. Note the pieces of the calculation that changed: the initialization of L in Line 2, and the way we store each 2**n in Line 4: assignment in the first case, and a list method in the second.
Is one of these inherently better than the other? Probably not, at least for these kinds of calculations (later, we will look at computational time, which can matter for larger collections of numbers). The first one is a bit closer to the original mathematical formulation, which is nice; the second takes a bit less planning ahead, which might be appealing (but in programming, we do want to always be planning ahead...). Note that in the second case, one doesn't need to know the index location to store the value, because it just gets stored at the end of what exists already. In cases that are less math-driven, this can be particularly convenient.
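The append approach is especially handy when the final number of elements is not known before the loop runs. Here is a small sketch of that situation, with made-up data; the names readings and positives are hypothetical.

```python
readings = [0.5, -1.2, 3.3, -0.1, 2.0]   # hypothetical example values

positives = []              # start empty; the final length is not known yet
for value in readings:
    if value > 0:           # keep only the values that pass the test
        positives.append(value)

print(len(positives), positives)
```

With a preallocated list of zeros, we would have needed to know how many values pass the test before storing any of them.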
20.5.1. List Comprehension
There is another syntax that Python has for combining for-loops and lists, which is called list comprehension. List comprehensions are simply a way to compress the building of a list using a for-loop into a single short and readable line. Let's apply it to the same example we used above (storing 2**n for n from 0 to 5), which would look like:
L = [2**n for n in range(6)]
print("L:", L)
Wow, everything really did get calculated in one line! (We cheated a bit by not defining N=6 first, but that is a trivial change.) How does this work?
Well, note that the RHS expression is in square brackets, denoting that a list is being created. The major change is that the iteration takes place inside these brackets. But we should be able to recognize most of the same pieces that we have seen in our "classical" for loops above: for n in range(6) is still there, as is the value that gets calculated for each iteration, 2**n. We just see them in a different order than usual, but one that happens to match the math order nicely.
Comparing the mathematical and computational expressions, we see that we have the following translations of each part of the list comprehension:
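Math expression | Comp expression | Role
---|---|---
$L$ | L = [ ... ] | the list being created
$2^n$ | 2**n | value calculated for each element
$n = 0, 1, \ldots, 5$ | for n in range(6) | values iterated over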
In general, the basic syntax of list comprehension is:
[ EXPRESSION for ITEM in ITERABLE ]
which is analogous to the standard for-loop:
for ITEM in ITERABLE:
    EXPRESSION
Though note that with the for-loop, one still has to consider the list initialization and assignment of elements. In list comprehension, both of those aspects are handled implicitly and one just assigns the whole list to a variable name (one of the reasons people like this structure).
Overall, list comprehension can be a nice, compact way to build a list of values. For simple looping, it can be convenient (we will see that even things like conditions can be combined with it). For more complicated expressions, however, the standard for-loop might be preferable or required.
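To make the correspondence between the two forms concrete, the following sketch builds the same list both ways. The list words and the result names are hypothetical, chosen only for this comparison.

```python
words = ["Rokel", "Kagera", "Juba"]   # hypothetical example list

# standard for-loop: initialize an empty list, then append inside the loop
lengths_loop = []
for word in words:
    lengths_loop.append(len(word))

# list comprehension: the same calculation in one line
lengths_comp = [len(word) for word in words]

print(lengths_loop)
print(lengths_comp)
print(lengths_loop == lengths_comp)   # check that both approaches agree
```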
In summary:
Lists provide a convenient object for storing multiple objects, particularly when we want to store objects of varied or initially-unknown type. (For many mathematical operations, we still might like to use arrays.)
Lists can even store other lists, which we refer to as "nesting".
Lists are ordered (so we use indices to select elements) and mutable (we can change elements).
Many useful methods exist for managing lists, such as sort, index, and count.
We can inject new elements into the list, either specifically at the end (e.g., with the append and extend methods) or generally anywhere (insert method).
We can also remove existing elements (with remove or pop).
Many list methods operate in-place, meaning that the base ("self") list itself gets updated, rather than creating a new object to assign to a new variable.
List comprehension provides a compact way to calculate over items in an iterable object, which might be convenient in codes.
20.6. Practice
Use list comprehension to store the first 10 terms of the sequence defined by: .
Use list comprehension to extract the letters of the word extraction into a list.
What is the largest number you can make from rearranging the digits in 9275827179421? How could you find this computationally, say, using some clever type conversion(s)? And what is the smallest number you could make from this (found in a similar way)?
Using the append() method, create a list containing the first 20 terms of the sequence . Note that this is a recursive sequence, so consider what that might mean for any initialization...
Let w = np.linspace(1, 2, 101). Use list comprehension to calculate and to store the result in a list output. Plot w vs output (NB: plotting works the exact same for arrays and lists).