Basic fitting in EasyDynamics¶

We here show how to use EasyDynamics to fit a simple synthetic data set. We start with the simplest possible example; later tutorials will be more involved.

The general procedure is to create an Experiment object to hold the data, a SampleModel to describe the model, and Analysis to fit the model to the data.

In [1]:

Copied!





# Imports
import pooch

import easydynamics as edyn
import easydynamics.sample_model as sm
from easydynamics.analysis.analysis import Analysis

# Make the plots interactive
%matplotlib widget
# Imports
import pooch

import easydynamics as edyn
import easydynamics.sample_model as sm
from easydynamics.analysis.analysis import Analysis

# Make the plots interactive
%matplotlib widget

We first create an Experiment object to contain the data. The data must either be a hdf5 file or a scipp.DataArray; in both cases it must have coordinates Q and energy. We here use Pooch to download an example data set.

💡 Tip

If you have one or more data files that you want to fit using EasyDynamics, you can reach out to us e.g. at henrik.jacobsen@ess.eu for help.

We give the Experiment a display_name which is used as the title when plotting the data.

In [2]:

Copied!





# Load the data
experiment = edyn.Experiment(display_name='Tutorial')

file_path = pooch.retrieve(
    url='https://github.com/easyscience/dynamics-lib/raw/refs/heads/tutorial0/docs/docs/tutorials/data/fake_simple_data.hdf5',
    known_hash='b49944c4447e69be4d30d1bed935173c4a1727c25a347285cbb156edc76ee261',
)

experiment.load_hdf5(filename=file_path)
# Load the data
experiment = edyn.Experiment(display_name='Tutorial')

file_path = pooch.retrieve(
    url='https://github.com/easyscience/dynamics-lib/raw/refs/heads/tutorial0/docs/docs/tutorials/data/fake_simple_data.hdf5',
    known_hash='b49944c4447e69be4d30d1bed935173c4a1727c25a347285cbb156edc76ee261',
)

experiment.load_hdf5(filename=file_path)

Downloading data from 'https://github.com/easyscience/dynamics-lib/raw/refs/heads/tutorial0/docs/docs/tutorials/data/fake_simple_data.hdf5' to file '/home/runner/.cache/pooch/21fc99c50611838d312c8e9c054e0399-fake_simple_data.hdf5'.

We can visualize the data in multiple ways, relying on plopp: https://scipp.github.io/plopp/

For now, let us plot the data using a slicer showing intensity as function of energy for various Q. You can dragg the slider to choose which $Q$ is displayed. Notice that the title is the display_name that we gave the Experiment.

💡 Tip

You can change the title by running `experiment.display_name="new title"`

In [3]:

Copied!

experiment.plot_data(slicer=True)
experiment.plot_data(slicer=True)

Out[3]:

The data looks somewhat noisy. To improve statistics, we can rebin it. The method takes a dictionary of coordinates and the new bins. The new bins can be given as an integer number of bins or as a scipp.Variable (https://scipp.github.io/generated/classes/scipp.Variable.html#scipp.Variable) of bin-edges. For now, we just use integers, choosing 16 bins in $Q$ and 256 bins in energy.

In [4]:

Copied!

experiment.rebin({'Q': 16, 'energy': 256})
experiment.plot_data(slicer=True)
experiment.rebin({'Q': 16, 'energy': 256})
experiment.plot_data(slicer=True)

Out[4]:

💡 Tip

You may have noted that the y axis is automatically scaled to include all data for each Q. If you do not want this, use `autoscale=False` in the plot command.

In this data we see a single Gaussian shaped peak and a background that seems to be zero on average. We now want to fit this data, e.g. to determine how the Gaussian changes with $Q$. We define a Gaussian like this:

In [5]:

Copied!

gaussian = sm.Gaussian(display_name='Gaussian', area=1, width=0.05)
gaussian = sm.Gaussian(display_name='Gaussian', area=1, width=0.05)

💡 Tip

You can access and change the values of the parameters of the Gaussian like this: `gaussian.area=0.5`

💡 Tip

The following functions are implemented: Gaussian, Lorentzian, Voigt, Delta Function, Damped Harmonic Oscillator, Polynomial. If you need a different model, you can define your own using ExpressionComponent

We would like to fit this model to our data. However, it would be tedious to copy/paste the model for 16 values of $Q$, especially when the model grows more complicated. Instead, EasyDynamics handles this for us; we simply have to create a SampleModel and pass it our component.

In [6]:

Copied!

model = sm.SampleModel(components=gaussian)
model = sm.SampleModel(components=gaussian)

Our SampleModel does not yet have any $Q$ values. We could tell it what $Q$ is by using the Q attribute.

Q_values=sc.linspace(start=0.2,stop=2.1,num=16,unit='1/angstrom', dim='Q')

model.Q=Q_values

This would generate a copy of the model for each of the values of $Q$.

However, since we want the $Q$ values to match those of the experiment, it is much easier to let EasyDynamics handle it. To do this, we make an Analysis object and give it out experiment and sample model:

In [7]:

Copied!





analysis = Analysis(
    experiment=experiment,
    sample_model=model,
)
analysis = Analysis(
    experiment=experiment,
    sample_model=model,
)

A lot happens under the hood in this step. It automatically extracts the $Q$ values from the experiment and passes them to our SampleModel, which in turn generates a copy of the Gaussian for each $Q$. We furthermore create an Analysis1d object for each $Q$. You will usually not need to think too hard about any of this, since you will mostly interact with your data and model through the Analysis object, but it can be good to know what is going on.

We can access a list of the the Analysis1d objects like this:

In [8]:

Copied!

analysis_list = analysis.analysis_list
analysis_list = analysis.analysis_list

Let us plot our data and model using the plopp slicer:

In [9]:

Copied!

analysis.plot_data_and_model()
analysis.plot_data_and_model()

Out[9]:

The model is of course not yet particularly good, since we didn't try very hard to guess what the parameters should be. However, we can now start fitting it using the fit method.

Let us first fit a single $Q$ index and plot the data and model to see how it looks. We choose an arbitrary $Q$ and plot only that one

In [10]:

Copied!

fit_result_independent_single_Q = analysis.fit(Q_index=5)
analysis.plot_data_and_model(Q_index=5)
fit_result_independent_single_Q = analysis.fit(Q_index=5)
analysis.plot_data_and_model(Q_index=5)

Out[10]:

The fit looks very good. We can get a list of the parameters for this fit by accesing the corresponding Analysis1d object:

In [11]:

Copied!

analysis.analysis_list[5].get_all_parameters()
analysis.analysis_list[5].get_all_parameters()

Out[11]:

[<Parameter 'Gaussian area': 3.5814 ± 0.0570 meV, bounds=[0.0:inf]>,
 <Parameter 'Gaussian center': 0.0000 meV (fixed), bounds=[-inf:inf]>,
 <Parameter 'Gaussian width': 0.0985 ± 0.0018 meV, bounds=[1e-10:inf]>,
 <Parameter 'energy_offset': -0.0002 ± 0.0017 meV, bounds=[-inf:inf]>]

Note

You may note that the center of the Gaussian is fixed, and that we fitted an `energy_offset`. We discuss this in the following tutorial.

Since the fit looked good, we can now fit all $Q$. We also plot the result, again using the slicer.

In [12]:

Copied!

fit_result_all_Q = analysis.fit()
analysis.plot_data_and_model()
fit_result_all_Q = analysis.fit()
analysis.plot_data_and_model()

Out[12]:

Information about the fit is stored in the output. It is stored as a list of EasyScience FitResults. Here we show a few of the relevant properties:

In [13]:

Copied!

print(f'The reduced chi-squared value for Q_index=5 is: {fit_result_all_Q[5].reduced_chi}')

print(f'The minimizer engine is: {fit_result_all_Q[5].minimizer_engine}')
print(f'The reduced chi-squared value for Q_index=5 is: {fit_result_all_Q[5].reduced_chi}')

print(f'The minimizer engine is: {fit_result_all_Q[5].minimizer_engine}')

The reduced chi-squared value for Q_index=5 is: 0.9807375513271296
The minimizer engine is: <class 'easyscience.fitting.minimizers.minimizer_lmfit.LMFit'>

It can be nice to inspect the fit parameters to look for trends, and sometimes continue working with them. To do this, we convert the parameters to a scipp Dataset (https://scipp.github.io/generated/classes/scipp.Dataset.html#scipp.Dataset)

In [14]:

Copied!

analysis.parameters_to_dataset()
analysis.parameters_to_dataset()

Out[14]:

scipp.Dataset (4.05 KB)

Dimensions:
- Q: 16

Coordinates: (1)

(Q)

float64

1/Å

0.259, 0.378, ..., 1.922, 2.041

Values:
array([0.259375, 0.378125, 0.496875, 0.615625, 0.734375, 0.853125,
       0.971875, 1.090625, 1.209375, 1.328125, 1.446875, 1.565625,
       1.684375, 1.803125, 1.921875, 2.040625])

Data: (4)

Gaussian area

(Q)

float64

meV

3.668, 3.725, ..., 3.411, 3.262

σ = 0.063, 0.057, ..., 0.065, 0.065

Values:
array([3.66804026, 3.72542713, 3.67179924, 3.62230979, 3.55407715,
       3.58137489, 3.52028175, 3.53925636, 3.497424  , 3.45978058,
       3.45318365, 3.51750804, 3.4371155 , 3.47612963, 3.41124826,
       3.26164046])

Variances (σ²):
array([0.00395623, 0.00323745, 0.00310019, 0.00348887, 0.00304316,
       0.00325033, 0.00329041, 0.00443113, 0.00251451, 0.00272246,
       0.00302258, 0.00466298, 0.00418327, 0.00458332, 0.00421608,
       0.00418692])

Gaussian center

(Q)

float64

meV

0.0, 0.0, ..., 0.0, 0.0

σ = 0.0, 0.0, ..., 0.0, 0.0

Values:
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

Variances (σ²):
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

Gaussian width

(Q)

float64

meV

0.099, 0.098, ..., 0.101, 0.099

σ = 0.002, 0.002, ..., 0.002, 0.002

Values:
array([0.0992222 , 0.09825193, 0.09799274, 0.09683621, 0.0973105 ,
       0.09847969, 0.09789783, 0.09966887, 0.09919424, 0.09757216,
       0.09825065, 0.10108336, 0.10085033, 0.10186455, 0.10092218,
       0.09935962])

Variances (σ²):
array([3.86335653e-06, 2.37053045e-06, 3.21198395e-06, 3.20177060e-06,
       3.07556350e-06, 3.39158580e-06, 3.70282110e-06, 4.00010581e-06,
       2.64571042e-06, 3.19461582e-06, 2.25942250e-06, 3.87450474e-06,
       4.99342370e-06, 5.27613814e-06, 4.88824724e-06, 5.21159474e-06])

energy_offset

(Q)

float64

meV

-0.000, -0.000, ..., -0.004, 7.406e-05

σ = 0.002, 0.002, ..., 0.002, 0.002

Values:
array([-3.33302487e-04, -4.18205057e-04, -3.82264445e-04, -2.54040748e-03,
       -1.29933233e-04, -2.22715154e-04, -1.13577443e-04, -4.50183223e-03,
        2.50929987e-04,  2.44849330e-03, -1.97167770e-03,  3.09142330e-03,
        1.73187181e-03,  2.28486773e-03, -3.53259545e-03,  7.40603941e-05])

Variances (σ²):
array([3.83888797e-06, 3.63991826e-06, 2.84162216e-06, 3.58814208e-06,
       2.91288914e-06, 2.90404282e-06, 2.64424921e-06, 5.19461927e-06,
       2.69758793e-06, 2.63655524e-06, 3.63463210e-06, 6.53080162e-06,
       4.71156639e-06, 5.23464620e-06, 4.74489173e-06, 5.08537997e-06])

We can also plot the parameters as a function of Q using the plot_parameters method.

In [15]:

Copied!

analysis.plot_parameters(names=['Gaussian area'])
analysis.plot_parameters(names=['Gaussian area'])

Out[15]:

It will soon be possible to use EasyDynamics to fit these parameters to e.g. a polynomial.