Computations of basic sums
==========================

For theoretical details on basic sums, see the :ref:`Theory overview ` section of the online documentation and the papers cited therein.

Every application of basic sums starts with a specification of the considered :ref:`cell ` and of the maximal order of sums being used (i.e. the :ref:`symbolic precision `). Consider the square lattice and the symbolic precision :math:`q_\max=5`:

.. code:: ipython3

    import numpy as np

    w1, w2 = 1, 1j  # periods of the square lattice
    qmax = 5        # symbolic precision

Then we create an instance of the :py:class:`Cell` class, which computes and stores the functions :math:`\wp` and :math:`\wp'` for given periods :math:`\omega_1`, :math:`\omega_2`, as well as the corresponding Eisenstein functions :math:`E_n` (:math:`n=2, 3, 4,\ldots,q_\max`). For more details, refer to formula :eq:`eSum` and section :ref:`sect-wp_En`.

.. code:: ipython3

    from basicsums.basic_sums import Cell

    cell = Cell(w1, w2, qmax)

In applications, it is common to work with sets of disks of identical or different radii. Let us investigate both cases separately (an approach to the analysis of point distributions is described :ref:`here `).

Basic sums for identical disks
------------------------------

Let us import a sample configuration of identical disks:

.. code:: ipython3

    from basicsums.examples import load_example01

    A, r = load_example01()

where ``A`` is a NumPy array of complex numbers specifying the disk centres and ``r`` is a NumPy array of floats corresponding to the disk radii. One can apply the :py:func:`show_disks` function to visualize the sample:

.. code:: ipython3

    from basicsums.show_disks import show_disks

.. code:: ipython3

    show_disks(A, r, w1, w2)
    show_disks(A, r, w1, w2, neighbs=False)

.. image:: tutorial_basic_sums_files/tutorial_basic_sums_12_0.png
.. image:: tutorial_basic_sums_files/tutorial_basic_sums_12_1.png

Then, for this specific configuration of disks, we create an instance of the :py:class:`BasicSums` class, which computes all the required matrices related to the functions :math:`E_n` (:math:`n=2, 3, 4,\ldots,q_\max`) and stores them as attributes in order to reuse them in computations of basic sums (for more details on the underlying algorithms, see [#WN2017BS]_).

.. code:: ipython3

    from basicsums.basic_sums import BasicSums

.. code:: ipython3

    bso = BasicSums(A, cell)

The ``bso`` object provides two main methods: :py:func:`esum` and :py:func:`esums`. The former computes the value of a single basic sum. Let us compute :math:`e_2`, :math:`e_{2, 2, 5}`, :math:`e_{3, 3, 2}` and :math:`e_{4, 4}` separately:

.. code:: ipython3

    bso.esum(2), bso.esum(2, 2, 5), bso.esum(3, 3, 2), bso.esum(4, 4)

.. parsed-literal::

    ((3.15438582262448-0.09180159581592136j),
     (-0.024658672489985198+1.2585946256954157j),
     (-10.80733726705118+1.833254246831019j),
     (7.9877213504124365-2.86102294921875e-16j))

The latter, the :py:func:`esums` method, calculates values of basic sums given by a list of tuples representing multi-indexes. The method returns a ``numpy`` array of the corresponding results:

.. code:: ipython3

    sums = [(2,), (2, 2, 5), (3, 3, 2), (4, 4)]
    bso.esums(sums)

.. parsed-literal::

    array([  3.15438582-9.18015958e-02j,  -0.02465867+1.25859463e+00j,
           -10.80733727+1.83325425e+00j,   7.98772135-2.86102295e-16j])

We will describe the difference between the two approaches a bit later.

An instance of the :py:class:`BasicSums` class automatically computes matrices of values of the Eisenstein functions :math:`E_n` (:math:`n=2, 3, 4,\ldots,q_\max`) at the time of creation. On the other hand, one can optimize this process by passing a list of only those indexes of the functions :math:`E_n(z)` which will actually be used in computations:

.. code:: ipython3

    bso = BasicSums(A, cell, eisenstein_indexes=[2, 5])

.. code:: ipython3

    bso.esum(2, 5, 5)
.. parsed-literal::

    (-54.5563109397971+1.4082941022526196j)

One can check which Eisenstein functions are used by a given object:

.. code:: ipython3

    bso.eis.keys()

.. parsed-literal::

    dict_keys([2, 5])

Disks of different radii
------------------------

In the case of disks of different radii (i.e. the polydispersed case), the corresponding :py:class:`BasicSums` object needs some more information about the system in the form of the normalising factors :math:`\nu_j` (:math:`j = 1, 2, 3, \ldots, N`; see :eq:`nuk_const`). These constants are automatically derived by the object based on the radii of the disks. Let us import a sample with normally distributed radii:

.. code:: ipython3

    from basicsums.examples import load_example02

    A, r = load_example02()

.. code:: ipython3

    show_disks(A, r, w1, w2)

.. image:: tutorial_basic_sums_files/tutorial_basic_sums_29_0.png

Let us create an instance of the :py:class:`BasicSums` class:

.. code:: ipython3

    bso = BasicSums(A, cell, r)

.. code:: ipython3

    bso.esum(2), bso.esum(2, 3, 3)

.. parsed-literal::

    ((3.1318248855050785-0.06693636374418609j),
     (-10.346554736396879+0.9422742346896005j))

Computations of sequences of basic sums
---------------------------------------

Assume that we want to compute all basic sums from a given set, say the one defined by :math:`G'_q` (see :ref:`this section` for more details). In such a case, one can use the :py:mod:`multi_indexes` module implemented in the package to get the list of required multi-indexes. Assume that we use :math:`G'_6` in the following considerations:

.. code:: ipython3

    from basicsums.multi_indexes import sums_in_Gq_prime

    qmax = 5
    sums = sums_in_Gq_prime(qmax)
    sums

.. parsed-literal::

    [(2,),
     (2, 2),
     (2, 2, 2),
     (3, 3),
     (2, 2, 2, 2),
     (3, 3, 2),
     (4, 4),
     (2, 2, 2, 2, 2),
     (2, 3, 3, 2),
     (3, 3, 2, 2),
     (3, 4, 3),
     (4, 4, 2),
     (5, 5)]

Let us import a random sample and create the corresponding instances of the :py:class:`Cell` and :py:class:`BasicSums` classes:

.. code:: ipython3

    A, r = load_example01()
.. code:: ipython3

    show_disks(A, r, w1, w2)

.. image:: tutorial_basic_sums_files/tutorial_basic_sums_38_0.png

.. code:: ipython3

    cell = Cell(w1, w2, qmax)
    bso = BasicSums(A, cell)

One can set the ``dict_output`` parameter to ``True`` in order to force the :py:func:`esums` method to return the results as a dictionary (``dict``) instance:

.. code:: ipython3

    sums_res = bso.esums(sums, dict_output=True)

Such an approach is very handy:

.. code:: ipython3

    sums_res[3, 3, 2]

.. parsed-literal::

    (-10.80733726705118+1.833254246831019j)

Moreover, it allows the transfer of the results to ``Pandas`` [#PANDAS]_ ``Series`` or ``DataFrame`` objects:

.. code:: ipython3

    import pandas as pd

    pd.Series(sums_res)

.. parsed-literal::

    2      (3.15438582262448-0.09180159581592136j)
    2    (11.757827037864901+5.82076609134674e-17j)
    2      (43.07907709266533-1.7681300908296955j)
    3    (-3.557295937476818-7.450580596923828e-17j)
    2      (173.12898896402592-6.103515625e-15j)
    3      (-10.80733726705118+1.833254246831019j)
    4    (7.9877213504124365-2.86102294921875e-16j)
    2      (670.9285779378052-33.06492019934968j)
    2      (-50.034946200507406-2.9296875e-15j)
    3      (-42.96183376801067+6.393431885893068j)
    3      (1.0336374840805849-1.3287290405279937j)
    4      (25.69722017834029-1.0060002202097529j)
    5      (-16.913965412216037+2.44140625e-16j)
    dtype: complex128

.. _sect-memory:

Memoization and memory management
---------------------------------

Memoization of intermediate results
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

What is the difference between the :py:func:`esum` and :py:func:`esums` methods? Instead of computing every sum separately, as :py:func:`esum` does, :py:func:`esums` is implemented as a recursive function that uses the memoization technique to store *vector representations* (``numpy`` arrays of :math:`N` complex numbers) of the intermediate sums computed during its call (for more details on the algorithms and on vector representations of basic sums, see [#WN2017BS]_).
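The idea behind such a memoized recursion can be illustrated with a minimal, self-contained sketch. Here ``make_esums`` and ``matrices`` are hypothetical names and the recursion is greatly simplified, not the package's actual implementation: each multi-index is evaluated from the vector of its tail by one matrix product, and intermediate vectors are kept in a cache that is local to the call.

```python
import numpy as np

def make_esums(matrices, N):
    """Evaluate sums for many multi-indexes, memoizing the
    intermediate vectors shared by their common tails."""
    ones = np.ones(N)

    def esums(multi_indexes):
        cache = {}                      # local cache; vanishes after the call

        def vector(m):                  # vector representation of multi-index m
            if m not in cache:
                tail = m[1:]
                v = ones if not tail else vector(tail)
                cache[m] = matrices[m[0]] @ v
            return cache[m]

        return np.array([ones @ vector(tuple(m)) / N for m in multi_indexes])

    return esums
```

Since many multi-indexes in a set such as :math:`G'_q` extend shorter ones, each intermediate vector is computed once instead of once per sum, which is the source of the speed-up measured below.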
Briefly speaking, it iterates over the list of multi-indexes and computes each sum recursively, memoizing intermediate results in order to reuse them in computations of the remaining sums. By default, once the intermediate vector representation of a sum is computed, it is cached in an ``OrderedDict`` data structure, so that it can be reused in the forthcoming computations. Note that the cache structure is a local variable of :py:func:`esums`, hence all the results vanish when the function call ends.

To see this, let us measure the time of computing :math:`G'_{16}`, containing 263167 sums:

.. code:: ipython3

    qmax = 20
    sums = sums_in_Gq_prime(qmax)
    len(sums)

.. parsed-literal::

    263167

Let us prepare the required objects:

.. code:: ipython3

    cell = Cell(w1, w2, qmax)
    bso = BasicSums(A, cell)

Now we can compare the running times of both approaches using the ``timeit`` magic command:

.. code:: ipython3

    %timeit -n 1 -r 1 np.array([bso.esum(*m) for m in sums])

.. parsed-literal::

    1min 42s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

.. code:: ipython3

    %timeit -n 1 -r 1 bso.esums(sums)

.. parsed-literal::

    21.9 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

Both approaches give the same result:

.. code:: ipython3

    res1 = np.array([bso.esum(*m) for m in sums])
    res2 = bso.esums(sums)

    np.all(np.isclose(res1, res2))

.. parsed-literal::

    True

Intermediate vector representations are cached in an ``OrderedDict`` data structure which is local to the :py:func:`esums` method. Hence, another run of :py:func:`esums` recomputes all the values. However, one can tell :py:func:`esums` to store all intermediate results in the object's attribute:

.. code:: ipython3

    %timeit -n 1 -r 1 bso.esums(sums, nonlocal_cache=True)

.. parsed-literal::

    22 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

Then, another run only assembles the final sums from the cached vector representations:

.. code:: ipython3

    %timeit -n 1 -r 1 bso.esums(sums)

.. parsed-literal::

    1.98 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
One can check how many vector representations were stored, compared to the number of considered basic sums:

.. code:: ipython3

    len(sums), len(bso._cache)

.. parsed-literal::

    (263167, 481364)

Let us clear the cache:

.. code:: ipython3

    bso.clear_cache()

The nonlocal cache can be used in scenarios where basic sums are computed at different steps of the program. Let us first compute half of the considered sums:

.. code:: ipython3

    half_index = len(sums)//2

.. code:: ipython3

    %timeit -n 1 -r 1 bso.esums(sums[:half_index], nonlocal_cache=True)
    len(bso._cache)

.. parsed-literal::

    11.1 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

.. parsed-literal::

    240681

And then, the other half. Once the nonlocal cache has been established with the ``nonlocal_cache`` parameter, it is reused and updated during the forthcoming calculations:

.. code:: ipython3

    %timeit -n 1 -r 1 bso.esums(sums[half_index:])
    len(bso._cache)

.. parsed-literal::

    10.8 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

.. parsed-literal::

    481364

.. code:: ipython3

    bso.clear_cache()

As we can see, the two running times add up to the time of computing all the sums from ``sums`` at once.

Memory load management
~~~~~~~~~~~~~~~~~~~~~~

Note that the memory load of the cache grows as the order of :math:`G'_q` increases. By default, the :py:func:`esums` method remembers the representations of all intermediate sums. One can limit the number of cached arrays via the ``cache_only`` and ``maxsize`` parameters; however, this may result in an increased running time of the computations.

The ``cache_only`` parameter indicates the multi-indexes of the sums which should be cached:

.. code:: ipython3

    %timeit -n 1 -r 1 bso.esums(sums, cache_only=sums, nonlocal_cache=True)

.. parsed-literal::

    39.9 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

The cache is much smaller now:

.. code:: ipython3

    len(sums), len(bso._cache)

.. parsed-literal::

    (263167, 263167)

Let us try another example:
.. code:: ipython3

    bso.clear_cache()

.. code:: ipython3

    %timeit -n 1 -r 1 bso.esums(sums, cache_only=sums[:5000])

.. parsed-literal::

    1min 18s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

As we mentioned earlier, :py:func:`esums` uses an ``OrderedDict`` for the cache. The ``maxsize`` parameter indicates that the cache remembers only the ``maxsize`` most recently stored vector representations:

.. code:: ipython3

    %timeit -n 1 -r 1 bso.esums(sums, maxsize=1000, nonlocal_cache=True)

.. parsed-literal::

    45.2 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

One can see that the cache does not exceed 1000 elements:

.. code:: ipython3

    len(sums), len(bso._cache)

.. parsed-literal::

    (263167, 1000)

.. rubric:: References

.. [#WN2017BS] W. Nawalaniec, *Efficient computation of basic sums for random polydispersed composites*, Comput. Appl. Math. (2017). https://doi.org/10.1007/s40314-017-0449-6

.. [#PANDAS] Wes McKinney, *Data Structures for Statistical Computing in Python*, Proceedings of the 9th Python in Science Conference, 51-56 (2010)
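As an aside on the ``maxsize`` parameter discussed above, the eviction behaviour of a bounded "most recently stored" cache can be sketched with the standard library's ``OrderedDict``. The helper ``cache_put`` below is a hypothetical illustration of the general technique, not the package's internal code:

```python
from collections import OrderedDict

def cache_put(cache, key, value, maxsize=None):
    """Store `value` under `key`, keeping at most `maxsize` most
    recently stored items; the oldest entries are evicted first."""
    cache[key] = value
    cache.move_to_end(key)          # mark as most recently stored
    if maxsize is not None and len(cache) > maxsize:
        cache.popitem(last=False)   # drop the oldest entry

cache = OrderedDict()
for n in range(5):
    cache_put(cache, (n,), n ** 2, maxsize=3)

list(cache)  # → [(2,), (3,), (4,)]  (only the three most recent keys remain)
```

This mirrors the observed behaviour that, after a run with ``maxsize=1000``, ``len(bso._cache)`` never exceeds 1000.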