Formatting, indexing and slicing strings.¶

Placing data into strings ¶

Often we want to display the results of our work at several points in a program: in the middle of calculations, to check the code's progress and to see that intermediate results are reasonable, and at the end, to display the final results. Additionally, we might want to make labels for plots or save data to an output file for further use. To do this, we have to understand fully what we are outputting (the type, the length or shape of the object, etc.). We then have to decide how many decimal places to display, how to include text describing what each number is, how to align columns of results, and so on. This all comes under the category of string formatting: inserting the quantities of interest into strings and specifying what display properties they should have.

Up until this point, we have been printing strings and other types very simply to display data (as described here), separating each item with commas:

print("Finished.")
print("x =", x)
print("Avec =", A, " and Bvec =", B)

etc. We now look at more interesting ways to insert data into strings and format the results.

There are different methods and styles for performing this kind of operation in Python, but we will primarily use the modern "string format method".

The help file for strings contains a .format() method with both positional and keyword arguments. Therefore for a given string S, we can apply .format() as follows to obtain a new string:

format(...)
  S.format(*args, **kwargs) -> str

  Return a formatted version of S, using substitutions from args and kwargs.
  The substitutions are identified by braces ('{' and '}').

The basic approach is to write a string with placeholders for values, marked by curly brackets { }, and then to provide the values themselves as arguments to .format(). The values can be either variables or expressions to be evaluated. For example:

import numpy as np

x = 5
print("x = {}".format(x))

y = -15.5
print("if y = {}, then 5-y = {}".format(y, 5-y))

Avec = np.arange(3)
Bvec = np.ones(2, dtype=bool)
print("Avec = {} and Bvec = {}".format(Avec, Bvec))

produces:

x = 5
if y = -15.5, then 5-y = 20.5
Avec = [0 1 2] and Bvec = [ True  True]

In general, if we have N values to insert, we reserve N places in the string with curly brackets { }.
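The help text quoted above mentions keyword arguments as well as positional ones. As a minimal sketch (the variable names and values here are illustrative assumptions, not part of the original examples), a placeholder can refer to a keyword argument by name:

```python
# Keyword arguments let each placeholder refer to a value by name.
label = "temperature"   # hypothetical example values
value = 21.5
by_name = "{name} = {val}".format(name=label, val=value)
print(by_name)

# Positional and keyword arguments can be mixed in a single call.
mixed = "{} and {extra}".format("first", extra="second")
print(mixed)
```

Naming the placeholders can make long format strings easier to read, since the string itself says which value goes where.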

Formatting data in strings ¶

Above, we have seen how to place values directly into a string. We now discuss how to format them by adding content inside the curly brackets { }, controlling things like spacing, alignment, number of decimal places and even ordering.

Ordering of variables ¶

By default, the values inserted into the string are placed by order of position. If we want to, it is possible to specify indices of the argument positions inside the curly brackets, in order to change around the order of placement in the string or even to repeat values. Consider:

xval, yval = 45.80000, -99
print("first = {0}, last = {1}".format(xval, yval))
print("first = {1}, last = {0}".format(xval, yval))
print("first = {0}, again = {0}, more (??) = {0}, last = {1}".format(xval, yval))

which produces

first = 45.8, last = -99
first = -99, last = 45.8
first = 45.8, again = 45.8, more (??) = 45.8, last = -99

Notice how the order is specified in each case and how it affects the output. We can also see that even though xval was entered as a float with 5 decimal places, Python has displayed only one. The next section shows how to control that.

Control characters ¶

We can control several aspects of spacing and decimal values using control characters. These are also specified in the curly brackets, but follow a colon :. Consider:

import numpy as np

print("PI is approx: {}".format(np.pi))
print("PI is approx: {:.3f}".format(np.pi))
print("PI is approx: {:.7f}".format(np.pi))
print("PI is approx: {:.25f}".format(np.pi))

which produces

PI is approx: 3.141592653589793
PI is approx: 3.142
PI is approx: 3.1415927
PI is approx: 3.1415926535897931159979635

Thus, the :f specifies that the value is to be treated as a float, and one can also specify the number of decimal places, such as 7 with :0.7f or :.7f. Note that the output is rounded to that number of places, not just truncated. As further examples, consider:

xval, yval = 45.80000, -99
print("first = {0}, last = {1}".format(xval, yval))
print("first = {0:f}, last = {1:f}".format(xval, yval))
print("first = {0:0.8f}, last = {1:f}".format(xval, yval))
print("first = {0:0.8f}, last = {1:0.8f}".format(xval, yval))
print("first = {0:e}, last = {1:e}".format(xval, yval))
print("first = {0:0.8e}, last = {1:0.8e}".format(xval, yval))

which produces

first = 45.8, last = -99
first = 45.800000, last = -99.000000
first = 45.80000000, last = -99.000000
first = 45.80000000, last = -99.00000000
first = 4.580000e+01, last = -9.900000e+01
first = 4.58000000e+01, last = -9.90000000e+01

Here, too, :f formats each value as a float (even the integer yval is displayed with decimal places), and a precision such as :0.8f or :.8f sets the number of decimal places.

The :e specifies scientific (exponent) notation, and also accepts a precision giving the number of decimal places to include.

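The number to the left of the decimal point in the format specification sets the minimum total width of the field, counting the sign, the digits and the decimal point itself. Values shorter than this width are padded with spaces, by default on the left for numbers. One can use this to align numbers at the decimal point. For example, compare the two outputs below, without and with this width: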

C = np.array([-18.5, 300.1234, 0.1, 99.9999999])
N = len(C)
print("Without 'left' spacing:")
for i in range(N):
    print("val [{0}] --> {1}".format(i, C[i]))

print("\nWith 'left' spacing")
for i in range(N):
    print("val [{0}] --> {1:15.8f}".format(i, C[i]))

which produces

Without 'left' spacing:
val [0] --> -18.5
val [1] --> 300.1234
val [2] --> 0.1
val [3] --> 99.9999999

With 'left' spacing
val [0] -->    -18.50000000
val [1] -->    300.12340000
val [2] -->      0.10000000
val [3] -->     99.99999990
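Alignment within a field can also be chosen explicitly. The following is a small sketch, with made-up labels and values, that uses the alignment flags `<` (left), `>` (right) and `^` (centered) in front of the width to build aligned columns:

```python
names = ["alpha", "beta", "gamma"]   # hypothetical labels
vals = [-18.5, 300.1234, 0.1]

rows = []
for name, val in zip(names, vals):
    # '<8' left-aligns the label in 8 characters;
    # '>12.3f' right-aligns the float in 12 characters with 3 decimal places.
    rows.append("{:<8}|{:>12.3f}|".format(name, val))

for row in rows:
    print(row)

# '^9' centers the text in a field of 9 characters.
centered = "{:^9}".format("mid")
print("[" + centered + "]")
```

Every row has the same total length, so the separators line up regardless of how long each label or number is.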

table to be filled in

Control character	description
f	floating point number
e	scientific notation
d	integer
s	string
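The d and s codes in the table are not demonstrated above, so here is a brief sketch of how they might be used (the values are illustrative assumptions). A width can be combined with them in the same way as with f:

```python
count = 42       # hypothetical example values
name = "run"

plain_int = "{:d} items".format(count)   # 'd' formats an integer
padded_int = "{:5d}|".format(count)      # minimum width 5; numbers are right-aligned by default
zero_int = "{:05d}".format(count)        # pad with leading zeros instead of spaces
plain_str = "{:s}!".format(name)         # 's' formats a string
padded_str = "{:>6s}|".format(name)      # strings are left-aligned by default; '>' right-aligns

for line in (plain_int, padded_int, zero_int, plain_str, padded_str):
    print(line)
```

Note that the d code expects an integer: passing a float to it raises a ValueError rather than silently dropping the decimals.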

Whitespace and escape characters ¶

Spacing can be controlled in several ways. The following are all examples of white space:

print("Whitespace example with all spaces inserted")
print("Whitespace example with 2 \t\t tabs inserted")
print("Whitespace example with a \n newline char inserted")

which produces

Whitespace example with all spaces inserted
Whitespace example with 2           tabs inserted
Whitespace example with a
 newline char inserted

Note that \t and \n are actually treated as a single character. You can see this by checking the length of a string:

print(len("abc d"))
print(len("abc\td"))
print(len("abc\nd"))

which prints 5 in each case. The backslash \ in this (and most) contexts is an escape character that alters the usual interpretation of the character following it. Thus, abc\td has a different interpretation than abctd; we say that \t is an escape sequence (typically just the escape character and the character following it).

Sometimes the escape character \ is used to make a "normal" character signify something else (such as \t), and sometimes it is used to escape the behavior of a "special" character. As an example of the latter, consider what the following prints:

print("The backslash looks like: \")

It actually leads to a syntax error, because Python interprets the backslash as escaping whatever follows it, so the closing quotation mark " is escaped and no longer closes the string. One can use the escape character itself to escape the escape character's escaping behavior:

print("The backslash looks like: \\")

Escapade successful.
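One way to see how Python stores these escape sequences is to compare repr(), which shows a string as it would be typed in code, with len(), which counts the stored characters. This is only an illustrative sketch using the strings from the examples above plus one containing an escaped backslash:

```python
samples = ["abc d", "abc\td", "abc\nd", "abc\\d"]

for s in samples:
    # repr() displays escape sequences in their typed form;
    # len() counts each escape sequence as a single character.
    print(repr(s), len(s))

# An escaped backslash is stored as one character.
backslash = "\\"
print(len(backslash))
```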

Whitespace character	description
' '	space
'\t'	tab character
'\n'	newline character

Indexing and slicing strings ¶

As we have seen, a string is a sequence made up of one or more characters which can be letters, numbers, symbols, and white spaces. In this section we discuss how to access individual characters in a string and how we can extract a substring from a given string.

String indexing ¶

Just like for arrays, string characters can be indexed and each character corresponds to an index number starting from 0. For the string Next Einstein, the possible index numbers are 0 for the first letter N through to 12 for the last letter n.

Character	N	e	x	t	(space)	E	i	n	s	t	e	i	n
Index	0	1	2	3	4	5	6	7	8	9	10	11	12

This particular string has 13 characters (including the white space character), which is the length of the string. The len() function can be used to determine the length of a string:

len('Next Einstein')

The answer is 13.

Given a string S, the command S[i] will return the character at index position i. For instance:

print('Next Einstein'[0])
print('Next Einstein'[9])
print('Next Einstein'[12])

will return:

N
t
n

Characters can also be accessed by negative indices, which count backwards from -1 for the last character, as follows.

Character	N	e	x	t	(space)	E	i	n	s	t	e	i	n
Index	-13	-12	-11	-10	-9	-8	-7	-6	-5	-4	-3	-2	-1

By using a negative index, we can extract the last character of a string S with the command S[-1], which is handier than the command S[len(S) - 1], which does the same job.
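The relationship between positive and negative indices can be checked directly. In this short sketch, each character of the example string is reached once by its positive index i and once by the negative index i - len(S):

```python
S = "Next Einstein"

# The last character, reached in two equivalent ways.
print(S[len(S) - 1], S[-1])

# Positive index i and negative index i - len(S) name the same character.
pairs_match = all(S[i] == S[i - len(S)] for i in range(len(S)))
print(pairs_match)

# Indices outside the range -len(S) .. len(S) - 1 raise an IndexError.
try:
    S[13]
    out_of_range = False
except IndexError:
    out_of_range = True
print(out_of_range)
```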

String slicing ¶

Sometimes, we want to extract a substring or a range of characters from a string S. For instance, extracting the first 4 characters of the string Next Einstein returns Next:

Character	N	e	x	t	(space)	E	i	n	s	t	e	i	n
Index	0	1	2	3

and extracting every second character from the fourth character (index 3) up to, but not including, the eleventh character (index 10) yields tEnt:

Character	N	e	x	t	(space)	E	i	n	s	t	e	i	n
Index				3		5		7		9

Translating into Python commands, we get:

'Next Einstein'[:4]
'Next Einstein'[3:10:2]

which return Next and tEnt, respectively.

In general the Python string slicing syntax is given by:

string_to_slice[start_pos:end_pos:step]

The slicing begins at the start_pos index (included) and stops at the end_pos index (excluded). The step parameter specifies the stride taken from the start index towards the end index. If the step is not specified, the default step of 1 is applied. Hence the command:

string_to_slice[start_pos:end_pos]

will return the substring between the start_pos index (included) and the end_pos index (excluded). For instance:

'Next Einstein'[6:11]

will yield inste

If the start_pos index is not specified, the default start_pos index 0 is applied. For instance:

'Next Einstein'[ :10:2]

will yield Nx is. That is, it takes every second character (step = 2) from the beginning of the string (the default start_pos index is 0) up to end_pos 10 (10 excluded).

On the other hand, if the end_pos index is not specified, the default is the end of the string. For instance:

'Next Einstein'[5 : : 3]

will yield Esi. That is, it takes every third character (step = 3) starting at start_pos index 5 (included) and continuing to the end of the string (the default end_pos).
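One way to picture the slicing rules above is to note that, for non-negative indices within the string and a positive step, a slice visits the same indices as range(start_pos, end_pos, step). The helper below is a hypothetical illustration of that idea, not how Python implements slicing:

```python
S = "Next Einstein"

def slice_by_hand(text, start, stop, step):
    """Hypothetical helper: build text[start:stop:step] one index at a time.

    Assumes 0 <= start, stop <= len(text) and step > 0.
    """
    return "".join(text[i] for i in range(start, stop, step))

# Each pair prints the same substring twice.
print(slice_by_hand(S, 3, 10, 2), S[3:10:2])
print(slice_by_hand(S, 0, 10, 2), S[:10:2])
print(slice_by_hand(S, 5, len(S), 3), S[5::3])
```

The omitted start_pos and end_pos in the last two slices correspond to 0 and len(S) in the explicit calls.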

Exercise: What will be the output of the following commands?

'Next Einstein'[ : ]
'Next Einstein'[ : : ]

Try and explain your answer.

A negative step means that the string is traversed from right to left, so by default the slice starts at the end of the string. For instance, we can reverse a string using slicing by providing the step value as -1:

'Next Einstein'[ : : -1 ]

The output is nietsniE txeN
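A negative step can also be combined with explicit start and end positions. In the sketch below (the particular index choices are only illustrative), the start index must lie to the right of the end index, otherwise the result is an empty string:

```python
S = "Next Einstein"

reversed_all = S[::-1]     # the whole string, right to left
backwards = S[9:2:-2]      # indices 9, 7, 5, 3: every second character going left
empty = S[3:10:-1]         # start is left of end with a negative step: nothing is selected

print(reversed_all)
print(backwards)
print(repr(empty))
```

Here S[9:2:-2] visits the same characters as S[3:10:2] from the earlier example, but in the opposite order.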

Practice ¶

Write a program that inputs an email address and returns the username and the domain name.

What is the output of the following lines of code:

print('Days of the week'[4:12:3])
print('Days of the week'[9])
print('Days of the week'[-7])
print('I love python programming'[:8])
print('I love python programming'[4:])
print('you love python programming'[::3])
print('You love python programming'[-8:-2])
print('I love python programming'[:-5])

Give the command to extract the word python from the string I love python programming.
Give the command to extract the substring I love python from the string I love python programming.
Give the command to extract every fifth character from the string I love python programming, starting at the third character.
