:tocdepth: 2

.. _float_rep:

***************************************
Be careful: Working with floats
***************************************

.. contents::
   :local:

.. highlight:: python

:ref:`In a previous section <basic_ops_exact>`, we saw that there were
several reasons that we should *not* expect floating point evaluations
and representations to be exact: finite decimal precision, rounding
and truncation.  For example, the result of evaluating ``2**0.5`` can
only ever be an approximation to the true mathematical value, which
has an infinite number of decimal places; the Python result of
``1.4142135623730951`` is pretty detailed and generally a useful
approximation, but it is not exact.

Those issues were essentially observable when results had a lot of
decimal places to represent, and approximations were needed.  But what
about evaluations of floating point numbers that only require one or
two decimal places, like 0.1, 0.25, -8.5, etc.: *they look safe to
treat as exact, but are they?*

.. _float_rep_bin:

Binary vs decimal representation of floats
============================================

As humans, we typically work with and write numbers in base-10 (like
we did just above), which is literally what "decimal" means.  However,
the underlying code interpreters work in base-2, or what is called
operating in **binary**.  The base-10 quantities we type into the
Python environment are translated into "machine speak", and the actual
calculations are done in binary using **bits** (which only take the
values 1 and 0) and **bytes** (a basic grouping of eight bits).  So
let's look at floating point values in each representation.

In general, to finitely represent a number in a particular base *B*,
we have to be able to write it as a fraction where:

* | the numerator is a finite sum of integer coefficients times powers of *B*, i.e.:
  | :math:`x_0 \times B^0 + x_1 \times B^1 + x_2 \times B^2 + \dots + x_M \times B^M`, for finite *M*;

* the denominator is a single integer power of *B*, i.e.:
  :math:`B^N`, for finite *N*.

*Sidenote:* The numerator's coefficients represent the digits of a
number.  So, in base-10, :math:`x_0` is the ones digit, :math:`x_1`
the tens, etc.  Therefore, the sums in any base are often written in
order of decreasing exponent, so the coefficient order matches how the
digits slot into the number they represent.

**Case A.**  Consider 7.0, which we can express as the fraction
:math:`7/1`.  In base-10, the numerator would be just the single term
:math:`7\times10^0`, and the denominator is made unity by raising the
base to zero, :math:`10^0`.  In base-2, the numerator has more
components, :math:`1\times2^2+1\times2^1+1\times2^0`, but the
denominator is made unity in the same way, :math:`2^0`.  So, we have
done the job of showing that a finite, "exact" representation is
possible for this number in either base-10 or base-2.

**Case B.**  Consider 0.5.  For the decimal representation of a
fraction, we have to find a numerator that is a finite sum of integer
multiples of powers of ten, which can be divided by some single power
of 10 to produce our value of interest.  This is solvable with a
finite number of terms in the numerator, basically reading from the
decimal representation:

:math:`0.5 = \dfrac{5}{10} = \dfrac{5 \times 10^{0}}{10^1}\, \rightarrow{\rm ~job~done}.`

Thus, this number has a finite decimal representation, as expected.
In terms of a binary representation, a similar rule applies with
powers of 2, which is also solvable (simplify the fraction first):

:math:`0.5 = \dfrac{5}{10} = \dfrac{1}{2} = \dfrac{1 \times 2^{0}}{2^1}\, \rightarrow{\rm ~job~done}.`

So, both the decimal and binary representations are finite for this
number.

**Case C.**  Now consider the humble ``0.1``, which initially looks to
be a repetition of Case B, above.  First, the decimal representation
is:

:math:`0.1 = \dfrac{1}{10} = \dfrac{1 \times 10^{0}}{10^1}\, \rightarrow{\rm ~job~done}.`

However, in binary, we have a problem finding a denominator that will
fit with any representation we try---they never seem to be an integer
power of 2:

:math:`0.1 = \dfrac{1}{10} \neq \dfrac{1 \times 2^{0}}{2^3} = 0.125\, \rightarrow{\rm ~try~again},`

:math:`0.1 = \dfrac{2}{20} \neq \dfrac{1 \times 2^{1}}{2^4} = 0.125\, \rightarrow{\rm ~try~again},`

:math:`0.1 = \dfrac{3}{30} \neq \dfrac{1 \times 2^{1} + 1 \times 2^{0}}{2^5} = 0.09375\, \rightarrow{\rm ~try~again},`

:math:`\dots`

It turns out that we can *never* find a satisfactory denominator (the
10 contains a factor of 5, which no power of 2 can cancel), and there
is *no* exact, finite representation of 0.1 in binary: its binary
expansion, :math:`0.000110011001100\dots`, repeats forever.
Therefore, Python internally uses just a fractional approximation (to
be precise, ``3602879701896397 / 2**55``).  Thus, **computers can
introduce rounding or truncation error even when representing finite
decimal numbers.**  We can only approximate 0.1 (and other decimals
with similar properties) in Python.
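We don't have to take that fraction on faith.  As a quick check (a
sketch using only the standard library), the built-in
``float.as_integer_ratio()`` method reports the exact ratio that
Python stores for a float, and ``decimal.Decimal`` can display that
stored value in full::

    from decimal import Decimal

    # The exact fraction stored in place of 0.1 ...
    (0.1).as_integer_ratio()    # (3602879701896397, 36028797018963968)

    # ... whose denominator is indeed 2**55:
    36028797018963968 == 2**55  # True

    # The stored value written out fully: close to 0.1, but not equal.
    Decimal(0.1)  # Decimal('0.1000000000000000055511151231257827021181583404541015625')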
Consequences of binary approximation
=======================================

We can see the effects of the unavoidable, internal binary
approximations with a few examples.  First, note that the expression
``5.1 % 1`` evaluates to ``0.09999999999999964``, instead of to
``0.1``.  And we might have expected each of the following to evaluate
to ``True``, but not all do::

    5.5 % 1 == 0.5     # True
    5.4 % 1 == 0.4     # False
    5.3 % 1 == 0.3     # False
    5.2 % 1 == 0.2     # False
    5.1 % 1 == 0.1     # False

    0.1 * 1 == 0.1     # True
    0.1 * 2 == 0.2     # True
    0.1 * 3 == 0.3     # False
    0.1 * 4 == 0.4     # True
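To see why some of these comparisons fail, it can help to print both
sides and their difference.  The mismatches are tiny, appearing many
decimal places out, but they are real, as in this brief sketch::

    print(0.1 * 3)          # 0.30000000000000004
    print(0.3)              # 0.3
    print(0.1 * 3 - 0.3)    # 5.551115123125783e-17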
As a consequence, we see that even some values that we might think are
safe to consider absolutely "precise" on paper are really not so exact
within the computer.  This does not mean we should avoid using such
numbers---that is really not feasible.  But it does mean that we
should adjust our assumptions as to the exactness of evaluations of
them, and we should avoid using them in certain kinds of expressions
where the approximative nature would be unstable or otherwise
problematic.  In particular, as we noted before, **we should typically
not use floating point evaluations within expressions of exact
equality or inequality, because results will be unreliable.**

.. note:: In truth, calculation "precision" is properly defined in
          terms of bytes, not decimal places, though we will often
          speak of the latter.  In general, we won't have to think
          about base-2 representations of numbers as we work: this is
          just another point to emphasize that *we can't ask for
          exactness with floating point numbers.*

.. container:: qpractice

   | **Q:** Following on from just above, evaluate each of:
   | ``5.6 % 1 == 0.6``, ``5.7 % 1 == 0.7``, ``5.8 % 1 == 0.8`` and ``5.9 % 1 == 0.9``
   | in Python.  Which evaluate to ``True``?

   .. hidden-code-block:: python
      :linenos:
      :label: + show/hide code

      # None do.

   | **Q:** Following on from just above, evaluate each of:
   | ``0.1 * 5 == 0.5``, ``0.1 * 6 == 0.6``, \.\.\., ``0.1 * 13 == 1.3``
   | Which evaluate to ``True``?

   .. hidden-code-block:: python
      :linenos:
      :label: + show/hide code

      # The expressions that evaluate to True in that list are:
      0.1 * 5  == 0.5
      0.1 * 8  == 0.8
      0.1 * 9  == 0.9
      0.1 * 10 == 1.0
      0.1 * 11 == 1.1
      0.1 * 13 == 1.3

.. note for later section, after methods have been introduced:
   link to the following:
   https://docs.python.org/3/tutorial/floatingpoint.html

Final comment
===============

There is a saying that, "The exception proves the rule."  Above, we
have tried strongly to make the case that floating point values should
not be thought of as exact, and therefore should not be used in
comparisons with strict equality/inequality.  The one exception might
be when directly using a floating point *integer*, one which has *not*
been produced by some operation.  That is, ``3.0 == 3`` could be a
reasonable comparison, or more generally checking ``3.0 == var``, as
long as ``var`` is an int.  However, we would still stay away from,
say, ``9.0**0.5 == var``.

Practice
==========

#. Does Python produce the correct answer for the following:
   ``0.1**2``?  Why or why not?  (Hint: "correct" mathematically might
   differ from "correct" computationally.)

#. Consider having some float type variable ``x``.  As noted above, we
   should avoid using either one of the following equivalent
   expressions for exact equality::

     x % 1 == 0.1
     x % 1 - 0.1 == 0

   But is there an alternative way we could try to evaluate whether
   ``x % 1`` matches with ``0.1`` using a different comparison, which
   would still allow us to tell the difference between cases where,
   say, ``x = 5.1`` and ``x = 5.9``?  That is, even if we cannot judge
   exact equality, what might be the next best thing we could test?

.. nts: Include this later!

Errors and debugging (and more on spacing)
===========================================

We mentioned above that spacing *within* a line does not matter.  But
the spacing at the *start* of a line does matter in Python.  Above,
each command fit within a single line.  However, we will later see
more complicated, multi-line structures (such as conditions, loops and
functions).  Python uses the relative spacing at the start of a line
to denote which commands are part of these structures, where they
start and stop, and where independent lines of commands are (such as
above).

The spacing at the start of a line is called **indentation**, and
adding it between individual single-line commands, such as here:

.. code-block:: python
   :linenos:

   3**2
   -3**2
       3 ** -2

\.\.\. will cause Python to give us an **error message** when it gets
to that spot:

.. code-block:: none

     File "<ipython-input-27-c8b0dd836076>", line 3
       3 ** -2
       ^
   IndentationError: unexpected indent

Note how Python helpfully tries to tell us where the problem is: it
specifies "line 3", and then it uses the ``^`` symbol like an arrow to
point to an even more specific part of that line.  Then it also
communicates the nature of its complaint: it classifies the error as
an ``IndentationError``, so we should now know that this means
something about spacing at the beginning of a line ("indentation"),
and it further states that it found indentation where it didn't think
there should be any.
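As a final check on that example, here is a sketch of the repaired
snippet: removing the stray leading spaces puts all three commands
back at the start of their lines, and each one then evaluates without
complaint::

    3**2       # 9
    -3**2      # -9
    3 ** -2    # 0.1111111111111111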