Monitoring fits with progress_callback¶
Every easyscience minimizer (LMFit, Bumps, DFO) accepts an optional
`progress_callback` argument on its `fit()` call. The callback is invoked
during the optimization with a payload describing the current state, so a
user interface (or just a notebook) can show progress bars, live plots, or
running parameter values.
The payload is a plain `dict` with the following keys:

| key | type | meaning |
| --- | --- | --- |
| `iteration` | `int` | Iteration / optimizer-step index (backend-specific) |
| `chi2` | `float` | Current $\chi^2$ |
| `reduced_chi2` | `float` | Current reduced $\chi^2$ |
| `parameter_values` | `dict[str, float]` | Snapshot of the varied parameters at this iteration |
| `refresh_plots` | `bool` | Hint that the caller may want to refresh plots |
| `finished` | `bool` | Always `False` during the fit (reserved for terminal payloads) |
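As a minimal illustration of consuming this payload, a callback might log every tenth update and honour the `refresh_plots` hint. The payload dicts below are constructed by hand for illustration (a real fit supplies equivalent ones), and the `log` list and `redraw_requests` counter are ad-hoc names for this sketch, not part of the easyscience API:

```python
log = []             # lines a UI might display
redraw_requests = 0  # how often a GUI would have redrawn its plots


def on_progress(payload: dict) -> None:
    global redraw_requests
    if payload['iteration'] % 10 == 0:   # throttle: log every 10th update
        log.append(
            f"iter {payload['iteration']:4d}  "
            f"reduced chi2 = {payload['reduced_chi2']:.4f}"
        )
    if payload['refresh_plots']:
        redraw_requests += 1             # a GUI would redraw its plots here


# Simulate a few optimizer updates by hand.
for i in range(1, 31):
    on_progress({
        'iteration': i,
        'chi2': 100.0 / i,
        'reduced_chi2': 100.0 / i / 388,
        'parameter_values': {'a0': 0.1 * i},
        'refresh_plots': i % 5 == 0,
        'finished': False,
    })

print(len(log), redraw_requests)  # → 3 6
```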
To make the fit non-trivial we use a model that takes a few hundred objective evaluations to converge: a sum of three Gaussian peaks plus a quadratic background, fitted to noisy synthetic data with starting values that are deliberately off.
```python
import matplotlib.pyplot as plt
import numpy as np

from easyscience import AvailableMinimizers
from easyscience import Fitter
from easyscience import ObjBase
from easyscience import Parameter
```
A multi-peak model¶
$$ f(x) = a_0 + a_1 x + a_2 x^2 + \sum_{k=1}^{3} A_k \, \exp\!\left(-\frac{(x - \mu_k)^2}{2 \sigma_k^2}\right) $$
Twelve free parameters give the optimizers something to chew on while still converging in well under a second.
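Before wiring the formula into an easyscience object, it can be sanity-checked with plain NumPy. The `gaussian` and `f` helpers below are ad-hoc names for this sketch, not part of easyscience:

```python
import numpy as np


def gaussian(x, A, mu, sigma):
    """One Gaussian peak: A * exp(-(x - mu)^2 / (2 sigma^2))."""
    return A * np.exp(-0.5 * ((x - mu) / sigma) ** 2)


def f(x, a0, a1, a2, peaks):
    """Quadratic background plus a sum of Gaussian peaks."""
    y = a0 + a1 * x + a2 * x**2
    for A, mu, sigma in peaks:
        y = y + gaussian(x, A, mu, sigma)
    return y


# At a peak centre the peak contributes its full amplitude:
x = np.array([0.0])
y = f(x, a0=1.0, a1=0.0, a2=0.0, peaks=[(2.0, 0.0, 1.0)])
print(y)  # background 1.0 plus peak amplitude 2.0 → [3.]
```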
```python
class MultiPeak(ObjBase):
    a0: Parameter
    a1: Parameter
    a2: Parameter
    A1: Parameter
    mu1: Parameter
    sigma1: Parameter
    A2: Parameter
    mu2: Parameter
    sigma2: Parameter
    A3: Parameter
    mu3: Parameter
    sigma3: Parameter

    def __init__(self, **values):
        params = {name: Parameter(name, value) for name, value in values.items()}
        super().__init__('multipeak', **params)

    def __call__(self, x):
        bg = self.a0.value + self.a1.value * x + self.a2.value * x**2
        peaks = 0.0
        for A, mu, sigma in (
            (self.A1.value, self.mu1.value, self.sigma1.value),
            (self.A2.value, self.mu2.value, self.sigma2.value),
            (self.A3.value, self.mu3.value, self.sigma3.value),
        ):
            peaks = peaks + A * np.exp(-0.5 * ((x - mu) / sigma) ** 2)
        return bg + peaks
```
Generate synthetic data¶
We build a "truth" instance, sample it on a fine grid, and add Gaussian noise to build a realistic dataset.
```python
rng = np.random.default_rng(seed=20260426)

truth = MultiPeak(
    a0=0.5,
    a1=-0.05,
    a2=0.01,
    A1=2.0,
    mu1=-3.0,
    sigma1=0.6,
    A2=3.5,
    mu2=0.5,
    sigma2=0.4,
    A3=1.8,
    mu3=4.0,
    sigma3=1.0,
)

x = np.linspace(-6.0, 7.0, 400)
noise_sigma = 0.05
y_clean = truth(x)
y = y_clean + rng.normal(scale=noise_sigma, size=x.size)
weights = np.full_like(x, 1.0 / noise_sigma)

fig, ax = plt.subplots(figsize=(8, 3))
ax.plot(x, y, '.', ms=3, label='data')
ax.plot(x, y_clean, 'k-', lw=1, label='truth')
ax.set(xlabel='x', ylabel='y')
ax.legend()
plt.show()
```
A starting model with off-by-a-bit parameters¶
```python
def make_initial_model() -> MultiPeak:
    model = MultiPeak(
        a0=0.0,
        a1=0.0,
        a2=0.0,
        A1=1.0,
        mu1=-2.0,
        sigma1=1.0,
        A2=2.0,
        mu2=1.0,
        sigma2=0.8,
        A3=1.0,
        mu3=3.0,
        sigma3=1.2,
    )
    for p in model.get_parameters():
        p.fixed = False
    return model
```
A simple recording callback¶
The callback receives the payload dict. Here we just append every payload to a list so we can plot
the convergence after the fact.
```python
def make_recorder():
    history = []

    def callback(payload: dict) -> None:
        # Take a shallow copy so subsequent mutations by the backend don't bite us.
        history.append({
            'iteration': payload['iteration'],
            'chi2': payload['chi2'],
            'reduced_chi2': payload['reduced_chi2'],
            'parameter_values': dict(payload['parameter_values']),
        })

    return callback, history
```
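The recorder can be exercised in isolation by feeding it hand-made payloads. The factory is restated below so the snippet stands alone; the `shared` dict mimics a backend that reuses one parameter dict between calls, which is exactly why the callback copies `parameter_values`:

```python
def make_recorder():
    history = []

    def callback(payload: dict) -> None:
        # Copy the fields we care about; the backend may reuse or mutate its dict.
        history.append({
            'iteration': payload['iteration'],
            'chi2': payload['chi2'],
            'reduced_chi2': payload['reduced_chi2'],
            'parameter_values': dict(payload['parameter_values']),
        })

    return callback, history


callback, history = make_recorder()
shared = {'a0': 0.0}  # a backend might mutate this in place between calls
for i in range(3):
    shared['a0'] = float(i)
    callback({
        'iteration': i,
        'chi2': 10.0 - i,
        'reduced_chi2': (10.0 - i) / 388,
        'parameter_values': shared,
        'refresh_plots': False,
        'finished': False,
    })

# Because the callback copied parameter_values, each snapshot survives:
print([h['parameter_values']['a0'] for h in history])  # → [0.0, 1.0, 2.0]
```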
Fit with each backend, recording progress¶
```python
def fit_with(backend: AvailableMinimizers):
    model = make_initial_model()
    fitter = Fitter(model, model)
    fitter.switch_minimizer(backend)
    fitter.max_evaluations = 5000
    callback, history = make_recorder()
    result = fitter.fit(x=x, y=y, weights=weights, progress_callback=callback)
    return result, history


results = {}
histories = {}
for backend in (AvailableMinimizers.LMFit, AvailableMinimizers.Bumps, AvailableMinimizers.DFO):
    res, hist = fit_with(backend)
    results[backend.name] = res
    histories[backend.name] = hist
    print(
        f'{backend.name:>6s}: callback fired {len(hist):4d} times, '
        f'final reduced chi2 = {res.reduced_chi2:.4f}, '
        f'n_evaluations = {res.n_evaluations}'
    )
```
```
 LMFit: callback fired  111 times, final reduced chi2 = 1.0379, n_evaluations = 108
 Bumps: callback fired 2831 times, final reduced chi2 = 1.0379, n_evaluations = 3917
   DFO: callback fired  141 times, final reduced chi2 = 1.0379, n_evaluations = 141

/home/runner/work/core/core/src/easyscience/fitting/minimizers/minimizer_base.py:273: RuntimeWarning: invalid value encountered in sqrt
  error_matrix = z * np.sqrt(error_matrix)
```
Convergence of $\chi^2_\nu$ versus iteration¶
Note that the meaning of `iteration` differs slightly between backends:

- LMFit counts every objective-function evaluation.
- BUMPS counts optimizer steps. A single step typically triggers several objective evaluations, so the `n_evaluations` field on `FitResults` (the total number of objective calls, kept consistent with LMFit's `nfev` and DFO's `nf`) is generally larger than the BUMPS step count reported by `iteration` here, and can legitimately exceed `max_evaluations` (which is forwarded to BUMPS as its `steps` budget). The budget-exhaustion check is performed against the consumed step count, not against `n_evaluations`.
- DFO counts objective evaluations.
All three backends nevertheless deliver a payload after every meaningful update, so a UI can render a smooth-looking progress bar regardless.
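For reference, the `chi2` and `reduced_chi2` values in the payload follow the usual weighted least-squares definitions (whether each backend applies exactly these conventions internally may vary). A plain-NumPy sketch with a hypothetical two-parameter toy fit, not the fit above:

```python
import numpy as np

# chi2 is the sum of squared, weighted residuals; reduced chi2 divides by
# the degrees of freedom (number of points minus number of free parameters).
y_obs = np.array([1.0, 2.0, 3.0, 4.0])
y_calc = np.array([1.1, 1.9, 3.2, 3.8])
weights = np.full(4, 10.0)  # 1 / sigma with sigma = 0.1
n_free = 2                  # free parameters in this toy model

chi2 = float(np.sum(((y_obs - y_calc) * weights) ** 2))
reduced_chi2 = chi2 / (y_obs.size - n_free)
print(round(chi2, 6), round(reduced_chi2, 6))  # → 10.0 5.0
```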
```python
fig, ax = plt.subplots(figsize=(8, 4))
for name, hist in histories.items():
    its = [h['iteration'] for h in hist]
    rc2 = [h['reduced_chi2'] for h in hist]
    ax.semilogy(its, rc2, '.-', label=name, ms=4, lw=1)
ax.set(xlabel='iteration (callback index)', ylabel=r'reduced $\chi^2$')
ax.legend()
plt.show()
```
Real-time progress bar¶
A more typical use of `progress_callback` is updating a UI element as the
fit runs. Below we drive a simple `tqdm` progress bar from inside the
callback. Skip this cell if `tqdm` is not installed.
```python
import time

try:
    from tqdm.auto import tqdm
except ImportError:  # pragma: no cover - tqdm is an optional convenience
    print('tqdm not installed - skipping the progress-bar demo.')
else:
    model = make_initial_model()
    fitter = Fitter(model, model)
    fitter.switch_minimizer(AvailableMinimizers.LMFit)
    fitter.max_evaluations = 108

    pbar = tqdm(total=fitter.max_evaluations, desc='LMFit', unit='eval')

    def progress(payload: dict) -> None:
        n = payload['iteration'] - pbar.n
        if n > 0:
            pbar.update(n)
            # wait 0.01 seconds per update to make the bar more visible in the demo
            time.sleep(0.01)
        pbar.set_postfix(reduced_chi2=f'{payload["reduced_chi2"]:.3g}')

    result = fitter.fit(x=x, y=y, weights=weights, progress_callback=progress)
    pbar.close()
    print(f'final reduced chi2 = {result.reduced_chi2:.4f}')
```
```
tqdm not installed - skipping the progress-bar demo.
```
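When `tqdm` is unavailable, a dependency-free fallback can render a crude text bar from the same payload. The `render_bar` helper below is an ad-hoc name for this sketch, and the loop drives it with hand-made payloads rather than a live fit:

```python
import sys


def render_bar(iteration: int, reduced_chi2: float,
               total: int, width: int = 30) -> str:
    """Render one line of a text progress bar for the given payload fields."""
    done = min(iteration, total)
    filled = int(width * done / total)
    bar = '#' * filled + '-' * (width - filled)
    return f"[{bar}] {done}/{total}  reduced chi2 = {reduced_chi2:.3g}"


def progress(payload: dict) -> None:
    # '\r' returns to the start of the line so the bar redraws in place.
    sys.stdout.write('\r' + render_bar(payload['iteration'],
                                       payload['reduced_chi2'], total=100))
    sys.stdout.flush()


# Drive the bar with hand-made payloads instead of a live fit.
for i in (1, 25, 50, 100):
    progress({'iteration': i, 'reduced_chi2': 10.0 / i})
print()
```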
Final fit overlay¶
Finally, plot each backend's best fit on top of the data.
```python
fig, ax = plt.subplots(figsize=(8, 3.5))
ax.plot(x, y, '.', ms=2, color='0.6', label='data')
for name, res in results.items():
    ax.plot(x, res.y_calc, lw=1.2, label=f'{name} fit')
ax.set(xlabel='x', ylabel='y')
ax.legend()
plt.show()
```