Tutorial: Creating a Master Curve

This tutorial explores the fundamentals of using the mastercurves package to create a master curve from synthetic data describing the one-dimensional diffusion of an instantaneous point source. It is based on the example in the introduction of "A Data-Driven Method for Automated Data Superposition with Applications in Soft Matter Science", and on this issue in the package repository.

Import Packages

Using the mastercurves package will always require importing numpy and matplotlib.pyplot. It also requires importing the mastercurves package itself or some modules from the package. We’ll import the essential modules for this tutorial: mastercurves.MasterCurve and mastercurves.transforms.Multiply.

import numpy as np
import matplotlib.pyplot as plt
from mastercurves import MasterCurve
from mastercurves.transforms import Multiply

Creating Synthetic Data

Now, let’s generate some synthetic data from the diffusion equation. We’ll ultimately work with the logarithm of this data, so let’s first define an array of positive \(x\) coordinates:

x = np.linspace(0.1, 2)  # 50 evenly spaced points (the NumPy default)

We’ll also sample from a few positive values of the time:

t_states = [1, 2, 3, 4]

We also need a function to compute the concentration at different \((x,t)\) from the diffusion equation:

def diffusion_eq(x, t, M, D):
    return (M/np.sqrt(4*np.pi*D*t))*np.exp(-(x**2)/(4*D*t))
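For reference, the function above evaluates the standard solution of the one-dimensional diffusion equation for an instantaneous point source. Written out, and assuming the source of strength \(M\) is released at \(x = 0\) and \(t = 0\) in a medium with diffusivity \(D\), it reads:

\[
c(x,t) = \frac{M}{\sqrt{4 \pi D t}} \exp\left(-\frac{x^2}{4 D t}\right)
\]

Here \(c\) is the concentration, which is why the arrays built from this function are named c_data in the rest of the tutorial.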

Now, we can create our synthetic data. We’ll just assume that \(M = 1\) and \(D = 1\) in dimensionless units for now:

x_data = [x for t in t_states]
c_data = [diffusion_eq(x, t, 1, 1) for t in t_states]

Lastly, we’ll take the logarithm of our data. The mastercurves package can work with the raw data itself for certain cases, but performance is much better (and the package is more flexible) when working with the logarithm of data that will be shifted by Multiply transforms (check out the Method section of the associated paper to learn why). So, we should almost always take the logarithm of the data before developing the master curve.

x_data = [np.log(xi) for xi in x_data]
c_data = [np.log(ci) for ci in c_data]
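As a small illustration of what the logarithm does to a multiplicative shift, the sketch below checks that multiplying a coordinate by a constant factor is the same as adding a constant offset to its logarithm. The shift factors a and b and the sample curve are hypothetical values chosen only for this check, and the snippet does not describe the internals of the mastercurves package.

```python
import numpy as np

# Hypothetical data and shift factors, for illustration only
x = np.linspace(0.1, 2)
c = np.exp(-x**2)
a = 0.5   # horizontal multiplicative shift
b = 2.0   # vertical multiplicative shift

# Shifting the raw data multiplies each coordinate ...
x_shifted = a * x
c_shifted = b * c

# ... which adds a constant offset to the logarithm of each coordinate
log_x_offset = np.log(x_shifted) - np.log(x)
log_c_offset = np.log(c_shifted) - np.log(c)

print(np.allclose(log_x_offset, np.log(a)))
print(np.allclose(log_c_offset, np.log(b)))
```

In logarithmic coordinates, a Multiply-style shift therefore moves a whole curve rigidly along an axis without changing its shape.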

Creating the Master Curve

We’re now ready to create a master curve and superpose the data. The first step is to initialize a MasterCurve object. Because our synthetic data is noiseless, we’ll create a MasterCurve with no fixed noise:

mc = MasterCurve(fixed_noise = 0)

The next step is to add the data to the MasterCurve:

mc.add_data(x_data, c_data, t_states)

In the diffusion equation, there is both a dynamic concentration scale and a dynamic length scale. This means that we’ll need to shift both the horizontal and vertical axes by a time-dependent multiplicative shift factor to superpose the data, so we add one Multiply transform to each axis. We could pass the argument scale="log" to these transforms to indicate that we’re working with the logarithm of our data, but this is not necessary because "log" is the default scale:

mc.add_htransform(Multiply())
mc.add_vtransform(Multiply())
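To see why both shifts are needed, the point-source solution encoded in diffusion_eq can be rearranged into a similarity form. This is a short derivation from that formula, not something computed by the package, and the symbol \(\xi\) is introduced here only for convenience:

\[
\sqrt{t}\, c(x,t) = \frac{M}{\sqrt{4 \pi D}} \exp\left(-\frac{\xi^2}{4 D}\right), \qquad \xi = \frac{x}{\sqrt{t}}
\]

The right-hand side depends on \(x\) and \(t\) only through \(\xi\). Multiplying \(x\) by a factor proportional to \(t^{-1/2}\) and the concentration by a factor proportional to \(t^{1/2}\) should therefore map the curves from every time onto a single curve.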

Finally, we’ll superpose the data using these transforms:

mc.superpose()

Plotting the Master Curve

We can use the built-in plot method to graphically display the data, Gaussian process fits, and master curve:

fig1, ax1, fig2, ax2, fig3, ax3 = mc.plot(colormap = lambda i: plt.cm.Blues_r(i/1.5))

We’ve passed a colormap argument to this method to define a custom colormap to more closely match the figures in the paper. The value of this argument can be any colormap from matplotlib.pyplot.cm.

By default, the plot method will display the data on a logarithmic scale. Here, we’ll adjust to a linear scale to more closely mimic the figures in the paper (using the ax.set_xscale and ax.set_yscale methods). You can see the results below, which show the raw data (left), data with Gaussian process fits (center), and master curve (right).

Analyzing the Shift Factors

An important feature of the mastercurves package is that we may analyze the shift factors used to superpose the data. These shift factors are stored as attributes of the MasterCurve object. We can grab them directly from the object:

a = mc.hparams[0]
b = mc.vparams[0]

Note that we take the zeroth (0) element of the hparams and vparams attributes. These attributes store the shift factors for each transformation added to the MasterCurve, and there may be more than one transformation. The shift factors are stored as a list, with one element per transform containing the inferred shift parameters for that transform. We only have one horizontal and one vertical transform here, so mc.hparams and mc.vparams each have only one element. For mc.hparams, that element is the list of horizontal shift factors for each state; for mc.vparams, it is the list of vertical shift factors.
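The nesting described above can be pictured with plain Python lists. The names hparams_mock and vparams_mock below are hypothetical stand-ins, not objects produced by the package, and their values are placeholders filled in from the expected square-root scaling rather than inferred results.

```python
# Mock stand-ins for mc.hparams and mc.vparams (NOT produced by the package):
# an outer list with one entry per transform, each holding one value per state.
t_states = [1, 2, 3, 4]
hparams_mock = [[t**-0.5 for t in t_states]]  # one horizontal transform
vparams_mock = [[t**0.5 for t in t_states]]   # one vertical transform

# Index 0 selects the first (and here only) transform on each axis
a = hparams_mock[0]
b = vparams_mock[0]

# Each selected list lines up element by element with the states
for t, a_t, b_t in zip(t_states, a, b):
    print(f"t = {t}: a = {a_t:.3f}, b = {b_t:.3f}")
```

If a second horizontal transform were added, its parameters would be expected at index 1 of the outer list, following the same pattern.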

We can plot these shift factors against the state coordinate, \(t\):

fig, ax = plt.subplots(1,1)
ax.plot(t_states, a, 'ro')
ax.plot(t_states, b, 'bo')

Based on the diffusion equation, we expect that these shift factors should follow specific trends with \(t\), namely that they should vary with the inverse square root and square root of time, respectively. We’ll check this by plotting those relationships:
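A minimal sketch of those expected trends is given below. It assumes that the first state (\(t = 1\)) acts as the unshifted reference, so that both expected shift factors equal one there; the names t_ref, a_expected, and b_expected are introduced for this illustration only. The loop verifies numerically that these factors collapse the analytical curves onto the reference curve.

```python
import numpy as np

# Same analytical solution used to generate the synthetic data
def diffusion_eq(x, t, M, D):
    return (M / np.sqrt(4 * np.pi * D * t)) * np.exp(-(x**2) / (4 * D * t))

t_states = [1, 2, 3, 4]
t_ref = t_states[0]  # assumption: the first state is the unshifted reference

# Expected shift factors relative to the reference state
a_expected = np.array([np.sqrt(t_ref / t) for t in t_states])  # ~ t**(-1/2)
b_expected = np.array([np.sqrt(t / t_ref) for t in t_states])  # ~ t**(+1/2)

# Check: shifting each curve by these factors reproduces the reference curve
xi = np.linspace(0.1, 1, 20)  # shifted coordinate a*x, shared by all states
c_ref = diffusion_eq(xi, t_ref, 1, 1)
collapsed = []
for t, a_t, b_t in zip(t_states, a_expected, b_expected):
    x = xi / a_t  # original coordinate that maps onto xi after shifting
    c_shifted = b_t * diffusion_eq(x, t, 1, 1)
    collapsed.append(np.allclose(c_shifted, c_ref))

print(collapsed)
```

These arrays can be overlaid on the shift factor plot, for example with ax.plot(t_states, a_expected, 'r-') and ax.plot(t_states, b_expected, 'b-'), to compare them with the inferred values.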

As we see from the plot below, the inferred shift factors indeed closely match the expected behavior!