Nested Runs

In this tutorial, we explore the nestedRuns module of the postopus package, which lets you load and analyze several simulation runs at once. This is useful when working with a large number of runs, because the data from all of them can be accessed and processed together.

[1]:
from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd

from postopus import nestedRuns

The input file is already defined in the folder (see the GitLab repo); otherwise we recommend defining it in the notebook:

[2]:
cd ../octopus_data/nested_runs/
/builds/octopus-code/postopus/docs/octopus_data/nested_runs

For this example, we need to trigger Octopus from a Python script.

[3]:
!python3 create_runs.py
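
The script create_runs.py itself is not reproduced in this tutorial. As a rough idea of what such a driver script could look like, the sketch below creates one folder per grid spacing and writes an input file into each. The template contents, the function name create_run_folders, and the assumption that an octopus executable is available on the PATH are all illustrative and not taken from the actual script.

```python
import subprocess
import tempfile
from pathlib import Path

# Placeholder template, NOT a complete Octopus input file.
# The real input file lives in the GitLab repo.
INP_TEMPLATE = "CalculationMode = gs\nSpacing = {spacing}\n"


def create_run_folders(base, spacings, run_octopus=False):
    """Create one folder per spacing and write an `inp` file into it."""
    folders = []
    for spacing in spacings:
        folder = Path(base) / f"deltax_{spacing}"
        folder.mkdir(parents=True, exist_ok=True)
        (folder / "inp").write_text(INP_TEMPLATE.format(spacing=spacing))
        if run_octopus:
            # Assumption: an `octopus` executable can be found on the PATH.
            subprocess.run(["octopus"], cwd=folder, check=True)
        folders.append(folder)
    return folders


# Demonstrate the folder layout in a temporary directory without running Octopus.
with tempfile.TemporaryDirectory() as tmp:
    created = create_run_folders(tmp, [0.4, 0.5, 0.6])
    names = sorted(folder.name for folder in created)
    inp_text = (created[0] / "inp").read_text()

print(names)
```

Naming the folders after the varied parameter, as the tutorial data does, makes it possible to recover the parameter value from the folder name later on.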

Initializing a nestedRuns object is very similar to initializing a Run object. As an argument, we pass a pathlib.Path pointing to the folder that contains the multiple runs.

[4]:
n = nestedRuns(Path("."))

The nestedRuns object contains nested dictionaries. These reflect the path traversed from the current working directory to the folder with the data (in this case the nested_runs folder). At the data-folder level we also see the nestedObjects object, which holds the initialized Run objects used to retrieve the data. To keep the structure of the nestedRuns object as simple as possible, initialize it in the same folder where your data is stored. This makes data access much easier, as we will see below.

[5]:
n
[5]:
nestedRuns(postopus.octopus_nested_runs.nestedObjects,
           {'deltax_0.6': Run('/builds/octopus-code/postopus/docs/octopus_data/nested_runs/deltax_0.6'):
            Found systems:
                'default'
            Found calculation modes:
                'scf',
            'deltax_0.4': Run('/builds/octopus-code/postopus/docs/octopus_data/nested_runs/deltax_0.4'):
            Found systems:
                'default'
            Found calculation modes:
                'scf',
            'deltax_0.5': Run('/builds/octopus-code/postopus/docs/octopus_data/nested_runs/deltax_0.5'):
            Found systems:
                'default'
            Found calculation modes:
                'scf'})
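
To illustrate how a folder layout can translate into nested dictionaries, here is a minimal standard-library sketch. It is not the postopus implementation: the helper build_nested_dict is hypothetical, and it assumes that a run folder can be recognized by the presence of an inp file. It shows why starting in the data folder gives a flatter structure than starting one level above it.

```python
import tempfile
from pathlib import Path


def build_nested_dict(root):
    """Map every folder below `root` that holds an `inp` file into a nested dict."""
    root = Path(root)
    tree = {}
    for inp in sorted(root.rglob("inp")):
        parts = inp.parent.relative_to(root).parts
        # Skip directories named "inp" and an input file directly in `root`.
        if not inp.is_file() or not parts:
            continue
        node = tree
        for part in parts[:-1]:
            # One dictionary level per intermediate folder.
            node = node.setdefault(part, {})
        node[parts[-1]] = inp.parent
    return tree


with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "nested_runs"
    for name in ["deltax_0.4", "deltax_0.5", "deltax_0.6"]:
        (data / name).mkdir(parents=True)
        (data / name / "inp").touch()
    from_parent = build_nested_dict(tmp)   # started one level above the data
    from_data = build_nested_dict(data)    # started in the data folder

top_level_from_parent = list(from_parent)
top_level_from_data = sorted(from_data)
```

Started from the parent folder, the run folders sit one level down, under the key "nested_runs". Started from the data folder, they are the top-level keys.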

To access the data, we need to traverse the nested dictionary tree until we reach the sub-tree at the data level. We can use dot notation and tab completion for traversing, except when a path component contains a dot: such a name is not a valid Python attribute name, so we need the usual dictionary-access syntax [""] instead. Since we are already in the correct folder, we don't need to traverse further.

[6]:
nruns = n
[7]:
nruns
[7]:
nestedRuns(postopus.octopus_nested_runs.nestedObjects,
           {'deltax_0.6': Run('/builds/octopus-code/postopus/docs/octopus_data/nested_runs/deltax_0.6'):
            Found systems:
                'default'
            Found calculation modes:
                'scf',
            'deltax_0.4': Run('/builds/octopus-code/postopus/docs/octopus_data/nested_runs/deltax_0.4'):
            Found systems:
                'default'
            Found calculation modes:
                'scf',
            'deltax_0.5': Run('/builds/octopus-code/postopus/docs/octopus_data/nested_runs/deltax_0.5'):
            Found systems:
                'default'
            Found calculation modes:
                'scf'})
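
The restriction on dot notation comes from Python's syntax rather than from postopus. The following sketch uses a hypothetical AttrDict class (a stand-in, not the postopus class) to show that attribute access works for keys that are valid identifiers, while a key containing a dot has to be accessed with square brackets.

```python
class AttrDict(dict):
    """Dictionary whose keys can also be read as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError as err:
            raise AttributeError(name) from err


runs = AttrDict({"scf_run": "run A", "deltax_0.6": "run B"})

via_attribute = runs.scf_run        # valid identifier: dot notation works
via_brackets = runs["deltax_0.6"]   # key contains a dot: brackets are required

# `runs.deltax_0.6` cannot even be parsed: Python reads it as the attribute
# access `runs.deltax_0` followed by a stray number `.6`.
try:
    compile("runs.deltax_0.6", "<example>", "eval")
    dotted_key_parses = True
except SyntaxError:
    dotted_key_parses = False
```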

We can extract individual Run objects from the original nestedRuns object and perform all the operations known from the other tutorials. Note that we need the square-bracket syntax here because the individual run folders have a dot in their names (e.g. deltax_0.6).

[8]:
run = nruns["deltax_0.6"]
[9]:
run
[9]:
Run('/builds/octopus-code/postopus/docs/octopus_data/nested_runs/deltax_0.6'):
Found systems:
    'default'
Found calculation modes:
    'scf'
[10]:
run.default.scf.density().plot()
[10]:
(array([2.178e+03, 1.200e+01, 6.000e+00, 0.000e+00, 0.000e+00, 0.000e+00,
        0.000e+00, 0.000e+00, 0.000e+00, 1.000e+00]),
 array([0.        , 0.24196353, 0.48392707, 0.7258906 , 0.96785413,
        1.20981767, 1.4517812 , 1.69374473, 1.93570827, 2.1776718 ,
        2.41963533]),
 <BarContainer object of 10 artists>)
[Figure: histogram of the density values of the deltax_0.6 run]

We can also use the nestedObjects.apply method to retrieve data from each of the Run objects at the same time. For example, here we collect the convergence data of three different Octopus runs in one MultiIndex DataFrame.

[11]:
convergence = pd.concat(nruns.apply(lambda run: run.default.scf.convergence()))
[12]:
convergence
[12]:
                        energy  energy_diff      abs_dens      rel_dens     abs_ev        rel_ev
            #iter
deltax_0.6      1  -133.934140   133.934000  1.896780e+00  4.741960e-01  14.293000  3.618910e-01
                2  -134.169699     0.235559  1.716880e-01  4.292190e-02   2.207580  5.293580e-02
                3  -135.797654     1.627950  1.316720e-02  3.291800e-03   1.555710  3.596290e-02
                4  -142.132650     6.335000  1.744600e-01  4.361510e-02   4.863450  1.010650e-01
                5  -141.053350     1.079300  2.380870e-02  5.952180e-03   0.867123  1.834980e-02
                6  -140.709536     0.343814  9.588510e-03  2.397130e-03   0.173801  3.691500e-03
                7  -140.690361     0.019175  7.851220e-04  1.962810e-04   0.005895  1.252350e-04
                8  -140.696562     0.006201  5.715200e-04  1.428800e-04   0.002500  5.310550e-05
                9  -140.710721     0.014160  3.709220e-04  9.273050e-05   0.007319  1.554480e-04
               10  -140.707937     0.002784  6.884600e-05  1.721150e-05   0.001926  4.091320e-05
               11  -140.706670     0.001267  2.553390e-05  6.383480e-06   0.001063  2.257620e-05
               12  -140.706718     0.000048  5.086120e-06  1.271530e-06   0.000082  1.739630e-06
               13  -140.706741     0.000023  2.090530e-06  5.226330e-07   0.000016  3.443350e-07
               14  -140.706752     0.000011  1.736110e-06  4.340270e-07   0.000012  2.553640e-07
deltax_0.4      1  -123.459317   123.459000  9.384990e-01  2.346250e-01   8.132140  2.699640e-01
                2  -124.835211     1.375890  3.045150e-01  7.612880e-02   4.112090  1.201130e-01
                3  -122.839925     1.995290  9.264640e-02  2.316160e-02   0.637417  1.897200e-02
                4  -135.330219    12.490300  3.260060e-01  8.150160e-02   7.704570  1.865410e-01
                5  -134.118451     1.211770  4.315490e-02  1.078870e-02   1.007920  2.501390e-02
                6  -134.952459     0.834008  2.187240e-02  5.468090e-03   0.486333  1.192560e-02
                7  -135.454239     0.501780  1.114400e-02  2.786000e-03   0.330593  8.041400e-03
                8  -136.266915     0.812676  1.498140e-02  3.745360e-03   0.588851  1.412110e-02
                9  -136.453710     0.186795  6.152740e-03  1.538190e-03   0.095393  2.282360e-03
               10  -136.347947     0.105763  1.493420e-03  3.733560e-04   0.087948  2.108690e-03
               11  -136.325958     0.021989  6.014180e-04  1.503540e-04   0.012930  3.101020e-04
               12  -136.342873     0.016916  3.373360e-04  8.433390e-05   0.011981  2.872720e-04
               13  -136.335253     0.007620  1.173030e-04  2.932580e-05   0.006007  1.440420e-04
               14  -136.329943     0.005310  1.223250e-04  3.058130e-05   0.003452  8.278030e-05
               15  -136.330457     0.000514  1.027060e-05  2.567650e-06   0.000509  1.219900e-05
               16  -136.331310     0.000853  1.547230e-05  3.868070e-06   0.000625  1.498240e-05
               17  -136.331103     0.000207  2.410580e-06  6.026450e-07   0.000181  4.349410e-06
               18  -136.331195     0.000093  1.659190e-06  4.147990e-07   0.000068  1.641650e-06
deltax_0.5      1  -120.434496   120.434000  1.342400e+00  3.356000e-01  10.151300  3.242690e-01
                2  -124.150961     3.716470  3.038930e-01  7.597310e-02   4.768300  1.321820e-01
                3  -129.164039     5.013080  8.902340e-02  2.225590e-02   3.643850  9.174420e-02
                4  -136.099112     6.935070  1.511840e-01  3.779590e-02   4.533900  1.024580e-01
                5  -135.054754     1.044360  2.432960e-02  6.082410e-03   0.762322  1.752900e-02
                6  -135.616984     0.562230  8.394770e-03  2.098690e-03   0.429385  9.776870e-03
                7  -135.840250     0.223266  5.982680e-03  1.495670e-03   0.123754  2.809900e-03
                8  -135.764751     0.075499  1.094700e-03  2.736740e-04   0.060248  1.369830e-03
                9  -135.678934     0.085817  1.802370e-03  4.505910e-04   0.057218  1.302640e-03
               10  -135.671259     0.007674  2.600480e-04  6.501200e-05   0.003825  8.707770e-05
               11  -135.665580     0.005680  1.333160e-04  3.332900e-05   0.003561  8.107860e-05
               12  -135.670800     0.005220  8.961400e-05  2.240350e-05   0.003825  8.708480e-05
               13  -135.669618     0.001182  1.857110e-05  4.642790e-06   0.000976  2.223190e-05
               14  -135.669205     0.000413  9.229340e-06  2.307340e-06   0.000269  6.115940e-06
               15  -135.669281     0.000076  1.064830e-06  2.662080e-07   0.000062  1.411160e-06
               16  -135.669323     0.000042  8.150980e-07  2.037750e-07   0.000029  6.547890e-07
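
The two-level index in the table above comes from pandas itself: when pd.concat receives a mapping, it uses the mapping keys as the outer index level. The sketch below reproduces this with two made-up tables. The values are invented for illustration, and it assumes that the object returned by apply behaves like a mapping from folder name to DataFrame, as the output above suggests.

```python
import pandas as pd

# Two made-up per-run tables, indexed by iteration number.
run_a = pd.DataFrame(
    {"energy": [-1.0, -1.5]},
    index=pd.Index([1, 2], name="#iter"),
)
run_b = pd.DataFrame(
    {"energy": [-2.0, -2.2, -2.3]},
    index=pd.Index([1, 2, 3], name="#iter"),
)

# Passing a mapping makes the keys the outer level of a MultiIndex.
combined = pd.concat({"deltax_0.4": run_a, "deltax_0.5": run_b})

# Select a whole run by its key, or one value by (key, iteration).
only_b = combined.loc["deltax_0.5"]
last_energy_b = combined.loc[("deltax_0.5", 3), "energy"]
```

Because the folder name is part of the index, a single run can still be selected from the combined table with .loc.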

With this combined DataFrame, we can produce more complex plots with a few lines of code:

[13]:
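def get_parameter_from_path(path):
    # Extract the grid spacing from a folder name such as "deltax_0.6".
    # This relies on the "<name>_<value>" naming scheme of the run folders.
    return float(path.rsplit("_", 1)[-1])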
[14]:
def get_converged_data(convergence):
    # get only the information from the last iteration for each run
    converged = convergence.groupby(level=0).tail(1).droplevel(1)
    i = converged.index
    combined = converged.set_index(i.map(get_parameter_from_path)).sort_index()
    return combined
[15]:
converged = get_converged_data(convergence)
[16]:
converged
[16]:
          energy  energy_diff      abs_dens      rel_dens    abs_ev        rel_ev
0.4  -136.331195     0.000093  1.659190e-06  4.147990e-07  0.000068  1.641650e-06
0.5  -135.669323     0.000042  8.150980e-07  2.037750e-07  0.000029  6.547890e-07
0.6  -140.706752     0.000011  1.736110e-06  4.340270e-07  0.000012  2.553640e-07
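
To make the individual steps of get_converged_data easier to follow, the sketch below runs the same chain of pandas operations on a small made-up table and keeps every intermediate result. The numbers are invented for illustration only.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [
        ("deltax_0.5", 1),
        ("deltax_0.5", 2),
        ("deltax_0.4", 1),
        ("deltax_0.4", 2),
        ("deltax_0.4", 3),
    ],
    names=[None, "#iter"],
)
toy = pd.DataFrame({"energy": [-2.0, -2.2, -1.0, -1.5, -1.6]}, index=index)

# 1. Keep the last row (final iteration) of each run.
last_rows = toy.groupby(level=0).tail(1)
# 2. Drop the iteration level so that only the folder name remains.
per_run = last_rows.droplevel(1)
# 3. Turn the folder names into numeric spacings and sort by them.
spacings = per_run.index.map(lambda name: float(name.rsplit("_", 1)[-1]))
result = per_run.set_index(spacings).sort_index()

print(result)
```

Sorting by the numeric index matters for the line plot of the energy: without it, the points would be connected in folder order rather than in order of increasing spacing.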
[17]:
width = 5
f, ax = plt.subplots(1, 1, figsize=(width, width * 0.6), sharex=True)
ax.plot(converged.index, converged.energy)
ax.set_ylabel("Total energy [eV]")
ax.set_xlabel(r"Spacing [$\AA$]")
f.tight_layout()
f.savefig("convergence.png")

f, ax = plt.subplots(1, 1, figsize=(width, width * 0.6), sharex=True)
for k, group in convergence.groupby(level=0):
    ax.semilogy(
        group.loc[k].index,
        group.rel_dens,
        label=rf"Spacing {get_parameter_from_path(k)} $\AA$",
    )
ax.legend()
ax.set_ylabel("Relative density change")
ax.set_xlabel(r"Iteration number")
f.tight_layout()
[Figure: converged total energy versus grid spacing]
[Figure: relative density change versus iteration number for each spacing, logarithmic y-axis]

We can also retrieve field data in an analogous manner.

[18]:
nruns.apply(lambda run: run.default.scf.density())
[18]:
nestedObjects(postopus.octopus_nested_runs.nestedObjects,
              {'deltax_0.6': <xarray.DataArray 'density' (step: 1, x: 13, y: 13, z: 13)> Size: 18kB
               [2197 values with dtype=float64]
               Coordinates:
                 * step     (step) int64 8B 14
                 * x        (x) float32 52B -3.6 -3.0 -2.4 -1.8 -1.2 ... 1.2 1.8 2.4 3.0 3.6
                 * y        (y) float32 52B -3.6 -3.0 -2.4 -1.8 -1.2 ... 1.2 1.8 2.4 3.0 3.6
                 * z        (z) float32 52B -3.6 -3.0 -2.4 -1.8 -1.2 ... 1.2 1.8 2.4 3.0 3.6
               Attributes:
                   units:    eV_Ångstrom,
               'deltax_0.4': <xarray.DataArray 'density' (step: 1, x: 21, y: 21, z: 21)> Size: 74kB
               [9261 values with dtype=float64]
               Coordinates:
                 * step     (step) int64 8B 18
                 * x        (x) float32 84B -4.0 -3.6 -3.2 -2.8 -2.4 ... 2.4 2.8 3.2 3.6 4.0
                 * y        (y) float32 84B -4.0 -3.6 -3.2 -2.8 -2.4 ... 2.4 2.8 3.2 3.6 4.0
                 * z        (z) float32 84B -4.0 -3.6 -3.2 -2.8 -2.4 ... 2.4 2.8 3.2 3.6 4.0
               Attributes:
                   units:    eV_Ångstrom,
               'deltax_0.5': <xarray.DataArray 'density' (step: 1, x: 17, y: 17, z: 17)> Size: 39kB
               [4913 values with dtype=float64]
               Coordinates:
                 * step     (step) int64 8B 16
                 * x        (x) float32 68B -4.0 -3.5 -3.0 -2.5 -2.0 ... 2.0 2.5 3.0 3.5 4.0
                 * y        (y) float32 68B -4.0 -3.5 -3.0 -2.5 -2.0 ... 2.0 2.5 3.0 3.5 4.0
                 * z        (z) float32 68B -4.0 -3.5 -3.0 -2.5 -2.0 ... 2.0 2.5 3.0 3.5 4.0
               Attributes:
                   units:    eV_Ångstrom})
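
Conceptually, apply maps a function over every run while preserving the dictionary structure. The sketch below shows this idea with plain dictionaries. It is an illustration under stated assumptions, not the postopus implementation: apply_to_leaves and FakeRun are hypothetical names, and FakeRun merely stands in for a Run object.

```python
from dataclasses import dataclass


@dataclass
class FakeRun:
    """Stand-in for a Run object; stores the number of grid points per axis."""

    points_per_axis: int

    def n_grid_points(self):
        return self.points_per_axis**3


def apply_to_leaves(tree, func):
    """Return a new nested dict with `func` applied to every non-dict value."""
    return {
        key: apply_to_leaves(value, func) if isinstance(value, dict) else func(value)
        for key, value in tree.items()
    }


tree = {
    "deltax_0.6": FakeRun(13),
    "deltax_0.4": FakeRun(21),
    "deltax_0.5": FakeRun(17),
}
sizes = apply_to_leaves(tree, lambda run: run.n_grid_points())

# The structure is preserved when the runs sit one level deeper.
nested_sizes = apply_to_leaves({"nested_runs": tree}, lambda run: run.n_grid_points())
```

The grid sizes 13, 21 and 17 are taken from the density output above, so the resulting counts match the number of values reported for each DataArray.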