:tocdepth: 2

.. _float_rep:

***************************************
Be careful: Working with floats
***************************************

.. contents:: :local:

.. highlight:: python

:ref:`In a previous section <basic_ops_exact>`, we saw several
reasons that we should *not* expect floating point evaluations and
representations to be exact: finite decimal precision, rounding and
truncation.
For example, the result of evaluating ``2**0.5`` can only ever be an
approximation to the true mathematical value, which has an infinite
number of decimal places; the Python result of ``1.4142135623730951``
is pretty detailed and generally a useful approximation, but it is not
exact.
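
We can check this inexactness directly.  Squaring the approximate
square root does not land us exactly back on 2 (results here assume
the usual 64-bit floats)::

  2**0.5              # 1.4142135623730951
  (2**0.5)**2         # 2.0000000000000004
  (2**0.5)**2 == 2    # False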

Those issues were most apparent when results required many decimal
places to represent, so that approximation was clearly needed.  But
what about floating point numbers that require only one or two
decimal places, like 0.1, 0.25, -8.5, etc.: *they look safe to treat
as exact, but are they?*

.. _float_rep_bin:

Binary vs decimal representation of floats
============================================

As humans, we typically work with and write numbers in base-10 (like
we did just up above), which is literally what "decimal" means.
However, the underlying code interpreters work in base-2, or what is
called operating in **binary**.  The base-10 quantities we type into
the Python environment are translated into "machine speak", and the
actual calculations are done in binary using **bits** (which only take
values 1 and 0) and **bytes** (a basic grouping of eight bits).  So
let's look at floating point values in each representation.

In general, to finitely represent a number in a particular base *B*,
we have to be able to write it as a fraction where:

* | the numerator is a finite sum of integer coefficients times powers
    of *B*, i.e.:
  | :math:`x_0 \times B^0 + x_1 \times B^1 + x_2 \times
    B^2 + \dots+ x_M \times B^M`, for finite *M*.
* the denominator is a single integer power of *B*, i.e.: :math:`B^N`,
  for finite *N*.
  
*Sidenote:* The numerator's coefficients represent the digits of a
number: in base-10, :math:`x_0` is the ones digit, :math:`x_1` the
tens digit, and so on.  For that reason, such sums are often written
in order of decreasing exponent, so that the coefficients appear in
the same order as the digits of the number they represent.
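
To make the sidenote concrete, here is a small sketch that peels off
the base-*B* coefficients of an integer by repeated division (the
``digits`` helper is purely illustrative, not a built-in function)::

  def digits(n, base):
      """Return the coefficients x_0, x_1, ... of n in the given base."""
      coefs = []
      while n:
          n, r = divmod(n, base)
          coefs.append(r)
      return coefs or [0]

  digits(325, 10)     # [5, 2, 3]:  5*10**0 + 2*10**1 + 3*10**2
  digits(7, 2)        # [1, 1, 1]:  1*2**0 + 1*2**1 + 1*2**2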

**Case A.** Consider 7.0, which we can express as the fraction
:math:`7/1`.  In base-10, the numerator would be just the single term
:math:`7\times10^0`, and the denominator is made unity by raising the
base to zero, :math:`10^0`.  In base-2, the numerator has more
components :math:`1\times2^2+1\times2^1+1\times2^0`, but the
denominator is made unity in the same way, :math:`2^0`.  So, we have
done the job of showing that a finite, "exact" representation is
possible for this number in either base-10 or base-2.
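
Python can display the base-2 digits for us, confirming the three
numerator terms above::

  bin(7)    # '0b111', i.e. 1*2**2 + 1*2**1 + 1*2**0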

**Case B.** Consider 0.5.  For a decimal representation, we need a
numerator that is a finite sum of integer multiples of powers of 10,
divided by a single integer power of 10, to produce our value of
interest.  This is solvable with a finite number of terms in the
numerator, essentially by reading off the decimal digits:

:math:`0.5 = \dfrac{5}{10} = \dfrac{5 \times 10^{0}}{10^1}\,
\rightarrow{\rm ~job~done}.`

Thus, this number has a finite decimal representation, as expected.
In terms of a binary representation, a similar rule applies with
powers of 2, and it is also solvable (as we can see by simplifying
the fraction):

:math:`0.5 = \dfrac{5}{10} = \dfrac{1}{2} = \dfrac{1 \times
2^{0}}{2^1}\, \rightarrow{\rm ~job~done}.`

So, both the decimal and binary representations are finite for this
number.
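
We can even ask Python for the exact fraction it stores internally:
the float method ``as_integer_ratio`` shows that 0.5 really is held
as :math:`1/2`::

  (0.5).as_integer_ratio()    # (1, 2)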

**Case C.** Now consider the humble ``0.1``, which initially looks to
be a repetition of Case B, above.  First, the decimal representation
is:

:math:`0.1 = \dfrac{1}{10} = \dfrac{1 \times 10^{0}}{10^1}\,
\rightarrow{\rm ~job~done}.`

However, in binary we run into a problem: no matter how we rewrite
the fraction, we can never pair a finite numerator with a denominator
that is an integer power of 2 and reproduce the value exactly:

:math:`0.1 = \dfrac{1}{10} \neq \dfrac{1 \times 2^{0}}{2^3} = 0.125\,
\rightarrow{\rm ~try~again},`

:math:`0.1 = \dfrac{2}{20} \neq \dfrac{1 \times 2^{1}}{2^4} = 0.125\,
\rightarrow{\rm ~try~again},`

:math:`0.1 = \dfrac{3}{30} \neq \dfrac{1 \times 2^{1} + 1 \times
2^{0}}{2^5} = 0.09375\, \rightarrow{\rm ~try~again},`

:math:`\dots`

It turns out that we can *never* find a satisfactory denominator, and
there is *no* exact, finite representation of 0.1 in
binary. Therefore, Python internally uses just a fractional
approximation (to be precise, ``3602879701896397 / 2**55``). Thus,
**computers can introduce rounding or truncation error even when
representing finite decimal numbers.** We can only approximate 0.1
(and other decimals with similar properties) in Python.
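
Asking Python for the stored fraction confirms this: the float method
``as_integer_ratio`` returns the exact internal numerator and
denominator (and the denominator is indeed :math:`2^{55}`)::

  (0.1).as_integer_ratio()      # (3602879701896397, 36028797018963968)
  36028797018963968 == 2**55    # True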

Consequences of binary approximation
=======================================

We can see the effects of the unavoidable, internal binary
approximations with a few examples.  

First, note that the expression ``5.1 % 1`` evaluates to
``0.09999999999999964``, instead of to ``0.1``.  And we might have
expected each of the following to evaluate to ``True``, but not all
do::

  5.5 % 1 == 0.5            # True
  5.4 % 1 == 0.4            # False
  5.3 % 1 == 0.3            # False
  5.2 % 1 == 0.2            # False
  5.1 % 1 == 0.1            # False

  0.1 * 1 == 0.1            # True
  0.1 * 2 == 0.2            # True
  0.1 * 3 == 0.3            # False
  0.1 * 4 == 0.4            # True
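
Printing the evaluated values shows why the ``False`` cases fail: the
stored binary approximations differ from the "expected" results in
the last digit or two::

  print(0.1 * 3)        # 0.30000000000000004
  print(5.1 % 1)        # 0.09999999999999964
  print(f"{0.1:.20f}")  # 0.10000000000000000555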

As a consequence, we see that even some values that we might think are
safe to consider absolutely "precise" on paper are really not so exact
within the computer.  This does not mean we should avoid using such
numbers---that is really not feasible.  But it does mean that we
should adjust our assumptions as to the exactness of evaluations of
them, and we should avoid using them in certain kinds of expressions
where the approximate nature would be unstable or otherwise
problematic.  In particular, as we noted before, **we should typically
not use floating point evaluations within expressions of exact
equality or inequality, because results will be unreliable.**
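
If we do need to compare floats, one common alternative (a sketch of
the general idea, not the only approach) is a tolerance-based test,
for example with the standard library's ``math.isclose``::

  import math

  0.1 * 3 == 0.3                # False
  math.isclose(0.1 * 3, 0.3)    # True, within the default tolerance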

.. note:: In truth, calculation "precision" is properly defined in
          terms of binary digits (bits), not decimal places, though
          we will often speak of the latter.  In general, we won't
          have to think about base-2 representations of numbers as we
          work: this is just another point to emphasize that *we
          can't ask for exactness with floating point numbers.*


.. container:: qpractice


   | **Q:** Following on from just above, evaluate each of:
   | ``5.6 % 1 == 0.6``, ``5.7 % 1 == 0.7``, ``5.8 % 1 == 0.8`` and
     ``5.9 % 1 == 0.9`` 
   | in Python.  Which evaluate to ``True``?

      .. hidden-code-block:: python
         :linenos:
         :label: + show/hide code

         # None do.

   | **Q:** Following on from just above, evaluate each of:
   | ``0.1 * 5 == 0.5``, ``0.1 * 6 == 0.6``, \.\.\., ``0.1 * 13 ==
     1.3``
   | in Python.  Which evaluate to ``True``?

      .. hidden-code-block:: python
         :linenos:
         :label: + show/hide code

         # The expressions that evaluate to True in that list are:
         0.1 * 5 == 0.5
         0.1 * 8 == 0.8
         0.1 * 9 == 0.9
         0.1 * 10 == 1.0
         0.1 * 11 == 1.1 
         0.1 * 13 == 1.3 

.. note for later section, after methods have been introduced:  

   link to the following:
   https://docs.python.org/3/tutorial/floatingpoint.html

.. 
    Final comment
    ===============

    There is a saying that, "The exception proves the rule."  Above, we
    have strongly tried to make the case that floating point values should
    not be thought of as exact, and therefore not be used in comparisons
    with strict equality/inequality.

    The one exception might be when directly using a floating point
    *integer*, which has *not* been produced by some operation. That is,
    the following could be a reasonable comparison ``3.0 == 3``, or more
    generally checking of ``3.0 == var``, as long as ``var`` is an int.
    However, we would still stay away from ``9.0**0.5 == var``, say.


Practice 
==========

#. Does Python produce the correct answer for the following:
   ``0.1**2``?  Why or why not?  (Hint: "correct" mathematically might
   differ from "correct" computationally.)

#. Consider having some float type variable ``x``.  As noted above, we
   should avoid using either one of the following equivalent
   expressions for exact equality::

     x % 1 == 0.1
     x % 1 - 0.1 == 0

   But is there an alternative way we could try to evaluate whether
   ``x % 1`` matches with ``0.1`` using a different comparison, which
   would still allow us to tell the difference between cases where,
   say, ``x = 5.1`` and ``x = 5.9``?  That is, even if we cannot judge
   exact equality, what might be the next best thing we could test?



.. nts:  Include this later!  

    Errors and debugging (and more on spacing)
    ===========================================

    We mentioned above that spacing *within* a line does not matter.  But
    the spacing at the *start* of a line does matter in Python.  Above,
    each command fit within a single line.  However, we will later see
    more complicated, multi-line structures (such as conditions, loops and
    functions).

    Python uses the relative spacing at the start of a line to denote what
    commands are part of these structures, where they start and stop, and
    where independent lines of commands are (such as above).  The spacing
    at the start of a line is called **indentation**, and adding it
    between the individual single-line commands, such as here:

    .. code-block:: python
       :linenos:

       3**2
       -3**2
           3 ** -2

    \.\.\. will cause Python to give us an **error message** when it gets
    to that spot:

    .. code-block::

         File "<ipython-input-27-c8b0dd836076>", line 3
           3 ** -2
           ^
       IndentationError: unexpected indent

    Note how Python helpfully tries to tell us where the problem is: it
    specifies "line 3", and then it uses the ``^`` symbol like an arrow to
    point to an even more specific part of that line.  Then it also
    communicates the nature of its complaint: it classifies it as an
    ``IndentationError``, so we should now know that that means something
    about spacing at the beginning of a line ("indentation"), and then
    further states that it found indentation where it didn't think there
    should be some.