Basic fitting in EasyDynamics¶
We here show how to use EasyDynamics to fit a simple synthetic data set. We start with the simplest possible example; later tutorials will be more involved.
The general procedure is to create an Experiment object to hold the data, a SampleModel to describe the model, and Analysis to fit the model to the data.
# Imports
import pooch
import easydynamics as edyn
import easydynamics.sample_model as sm
from easydynamics.analysis.analysis import Analysis
# Make the plots interactive
%matplotlib widget
We first create an Experiment object to contain the data. The data must either be a hdf5 file or a scipp.DataArray; in both cases it must have coordinates Q and energy. We here use Pooch to download an example data set.
💡 Tip
We give the Experiment a display_name which is used as the title when plotting the data.
# Load the data
experiment = edyn.Experiment(display_name='Tutorial')
file_path = pooch.retrieve(
url='https://github.com/easyscience/dynamics-lib/raw/refs/heads/tutorial0/docs/docs/tutorials/data/fake_simple_data.hdf5',
known_hash='b49944c4447e69be4d30d1bed935173c4a1727c25a347285cbb156edc76ee261',
)
experiment.load_hdf5(filename=file_path)
Downloading data from 'https://github.com/easyscience/dynamics-lib/raw/refs/heads/tutorial0/docs/docs/tutorials/data/fake_simple_data.hdf5' to file '/home/runner/.cache/pooch/21fc99c50611838d312c8e9c054e0399-fake_simple_data.hdf5'.
We can visualize the data in multiple ways, relying on plopp: https://scipp.github.io/plopp/
For now, let us plot the data using a slicer showing intensity as function of energy for various Q. You can dragg the slider to choose which $Q$ is displayed. Notice that the title is the display_name that we gave the Experiment.
💡 Tip
experiment.plot_data(slicer=True)
The data looks somewhat noisy. To improve statistics, we can rebin it. The method takes a dictionary of coordinates and the new bins. The new bins can be given as an integer number of bins or as a scipp.Variable (https://scipp.github.io/generated/classes/scipp.Variable.html#scipp.Variable) of bin-edges. For now, we just use integers, choosing 16 bins in $Q$ and 256 bins in energy.
experiment.rebin({'Q': 16, 'energy': 256})
experiment.plot_data(slicer=True)
💡 Tip
In this data we see a single Gaussian shaped peak and a background that seems to be zero on average. We now want to fit this data, e.g. to determine how the Gaussian changes with $Q$. We define a Gaussian like this:
gaussian = sm.Gaussian(display_name='Gaussian', area=1, width=0.05)
💡 Tip
💡 Tip
We would like to fit this model to our data. However, it would be tedious to copy/paste the model for 16 values of $Q$, especially when the model grows more complicated. Instead, EasyDynamics handles this for us; we simply have to create a SampleModel and pass it our component.
model = sm.SampleModel(components=gaussian)
Our SampleModel does not yet have any $Q$ values. We could tell it what $Q$ is by using the Q attribute.
Q_values=sc.linspace(start=0.2,stop=2.1,num=16,unit='1/angstrom', dim='Q')
model.Q=Q_values
This would generate a copy of the model for each of the values of $Q$.
However, since we want the $Q$ values to match those of the experiment, it is much easier to let EasyDynamics handle it. To do this, we make an Analysis object and give it out experiment and sample model:
analysis = Analysis(
experiment=experiment,
sample_model=model,
)
A lot happens under the hood in this step. It automatically extracts the $Q$ values from the experiment and passes them to our SampleModel, which in turn generates a copy of the Gaussian for each $Q$. We furthermore create an Analysis1d object for each $Q$. You will usually not need to think too hard about any of this, since you will mostly interact with your data and model through the Analysis object, but it can be good to know what is going on.
We can access a list of the the Analysis1d objects like this:
analysis_list = analysis.analysis_list
Let us plot our data and model using the plopp slicer:
analysis.plot_data_and_model()
The model is of course not yet particularly good, since we didn't try very hard to guess what the parameters should be. However, we can now start fitting it using the fit method.
Let us first fit a single $Q$ index and plot the data and model to see how it looks. We choose an arbitrary $Q$ and plot only that one
fit_result_independent_single_Q = analysis.fit(Q_index=5)
analysis.plot_data_and_model(Q_index=5)
The fit looks very good. We can get a list of the parameters for this fit by accesing the corresponding Analysis1d object:
analysis.analysis_list[5].get_all_parameters()
[<Parameter 'Gaussian area': 3.5814 ± 0.0570 meV, bounds=[0.0:inf]>, <Parameter 'Gaussian center': 0.0000 meV (fixed), bounds=[-inf:inf]>, <Parameter 'Gaussian width': 0.0985 ± 0.0018 meV, bounds=[1e-10:inf]>, <Parameter 'energy_offset': -0.0002 ± 0.0017 meV, bounds=[-inf:inf]>]
Note
Since the fit looked good, we can now fit all $Q$. We also plot the result, again using the slicer.
fit_result_all_Q = analysis.fit()
analysis.plot_data_and_model()
Information about the fit is stored in the output. It is stored as a list of EasyScience FitResults. Here we show a few of the relevant properties:
print(f'The reduced chi-squared value for Q_index=5 is: {fit_result_all_Q[5].reduced_chi}')
print(f'The minimizer engine is: {fit_result_all_Q[5].minimizer_engine}')
The reduced chi-squared value for Q_index=5 is: 0.9807375513271296 The minimizer engine is: <class 'easyscience.fitting.minimizers.minimizer_lmfit.LMFit'>
It can be nice to inspect the fit parameters to look for trends, and sometimes continue working with them. To do this, we convert the parameters to a scipp Dataset (https://scipp.github.io/generated/classes/scipp.Dataset.html#scipp.Dataset)
analysis.parameters_to_dataset()
- Q: 16
- Q(Q)float641/Å0.259, 0.378, ..., 1.922, 2.041
Values:
array([0.259375, 0.378125, 0.496875, 0.615625, 0.734375, 0.853125, 0.971875, 1.090625, 1.209375, 1.328125, 1.446875, 1.565625, 1.684375, 1.803125, 1.921875, 2.040625])
- Gaussian area(Q)float64meV3.668, 3.725, ..., 3.411, 3.262σ = 0.063, 0.057, ..., 0.065, 0.065
Values:
array([3.66804026, 3.72542713, 3.67179924, 3.62230979, 3.55407715, 3.58137489, 3.52028175, 3.53925636, 3.497424 , 3.45978058, 3.45318365, 3.51750804, 3.4371155 , 3.47612963, 3.41124826, 3.26164046])
Variances (σ²):
array([0.00395623, 0.00323745, 0.00310019, 0.00348887, 0.00304316, 0.00325033, 0.00329041, 0.00443113, 0.00251451, 0.00272246, 0.00302258, 0.00466298, 0.00418327, 0.00458332, 0.00421608, 0.00418692]) - Gaussian center(Q)float64meV0.0, 0.0, ..., 0.0, 0.0σ = 0.0, 0.0, ..., 0.0, 0.0
Values:
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
Variances (σ²):
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]) - Gaussian width(Q)float64meV0.099, 0.098, ..., 0.101, 0.099σ = 0.002, 0.002, ..., 0.002, 0.002
Values:
array([0.0992222 , 0.09825193, 0.09799274, 0.09683621, 0.0973105 , 0.09847969, 0.09789783, 0.09966887, 0.09919424, 0.09757216, 0.09825065, 0.10108336, 0.10085033, 0.10186455, 0.10092218, 0.09935962])
Variances (σ²):
array([3.86335653e-06, 2.37053045e-06, 3.21198395e-06, 3.20177060e-06, 3.07556350e-06, 3.39158580e-06, 3.70282110e-06, 4.00010581e-06, 2.64571042e-06, 3.19461582e-06, 2.25942250e-06, 3.87450474e-06, 4.99342370e-06, 5.27613814e-06, 4.88824724e-06, 5.21159474e-06]) - energy_offset(Q)float64meV-0.000, -0.000, ..., -0.004, 7.406e-05σ = 0.002, 0.002, ..., 0.002, 0.002
Values:
array([-3.33302487e-04, -4.18205057e-04, -3.82264445e-04, -2.54040748e-03, -1.29933233e-04, -2.22715154e-04, -1.13577443e-04, -4.50183223e-03, 2.50929987e-04, 2.44849330e-03, -1.97167770e-03, 3.09142330e-03, 1.73187181e-03, 2.28486773e-03, -3.53259545e-03, 7.40603941e-05])
Variances (σ²):
array([3.83888797e-06, 3.63991826e-06, 2.84162216e-06, 3.58814208e-06, 2.91288914e-06, 2.90404282e-06, 2.64424921e-06, 5.19461927e-06, 2.69758793e-06, 2.63655524e-06, 3.63463210e-06, 6.53080162e-06, 4.71156639e-06, 5.23464620e-06, 4.74489173e-06, 5.08537997e-06])
We can also plot the parameters as a function of Q using the plot_parameters method.
analysis.plot_parameters(names=['Gaussian area'])
It will soon be possible to use EasyDynamics to fit these parameters to e.g. a polynomial.