Soil incubation experiments are used to estimate parameters of physical models of carbon flow in soil under different conditions. In the laboratory, external conditions such as temperature and moisture can be controlled, as can inputs such as leaf litter or microbial content and diversity. Measurements of carbon emissions are taken over time, perhaps along with other information such as microbial enzyme activity.
Most soil-carbon physical models approximate soil heterogeneity with discrete pools of carbon. For example, a two-compartment model posits a labile pool of carbon with readily available carbon and a recalcitrant pool from which carbon evolves more slowly. Systems may also posit additional pools for microbes, enzymes, and other physical features of soil ecology. Some of these pools may be measured. System dynamics are modeled with differential equations describing flows among the compartments.
Statistical models are overlayed on the deterministic physical models to represent measurement error and unmodeled variation among replicated experiments. Bayesian statistical inference can be applied to estimate parameters based on combining experimental measurements with prior scientific knowledge of soil carbon dynamics or the experimental mechanism. Poststratification may then be used to predict behavior on regional or global scales based on information about biogeochemical conditions, which are typically estimated for geographical grids on the 10–100 kilometer square scale.
Stan provides an expressive modeling language for statistical models including a full linear algebra and statistical library along with ordinary differential equation solvers. Stan also provides an efficient and scalable, fully Bayesian, inference mechanism based on adaptive Hamiltonian Monte Carlo sampling. Stan is integrated into the R statistical programming environment through the RStan interface.
The rest of this document describes
Sierra and Müller (2014) performed a soil incubation experiment in which they took a series of 25 measurements over time of evolved CO2. At each time point, multiple readings were taken from replicates and aggregated into a mean and standard deviation.
The data can be read in through Sierra and Müller’s (2014) SoilR package.
library(SoilR,quietly=TRUE);
totalC_t0 <- 7.7; # not included in data, so hard code here
t0 <- 0;
N_t <- 25; # calculated by inspection
ts <- eCO2[1:25,2];
eCO2mean <- eCO2[1:25,3];
eCO2sd <- eCO2[1:25,4];
library(ggplot2,quietly=TRUE);
df <- data.frame(list(ts=ts,eCO2mean=eCO2mean,eCO2sd=eCO2sd));
interval_95pct <- aes(ymin = eCO2mean + 1.96 * eCO2sd,
ymax = eCO2mean - 1.96 * eCO2sd);
plot_data <-
ggplot(df, aes(x=ts, y=eCO2mean)) +
geom_point() +
geom_errorbar(interval_95pct, width=0, colour="blue") +
scale_x_continuous(name="time (days)") +
scale_y_continuous(name="evolved CO2 (mgC g-1 soil)") +
ggtitle("Evolved CO2 Measurements (with 95% intervals)");
plot(plot_data);
Sierra and Müller (2014) present a two-pool model with decomposition rates for each pool as well as transfer rates from one pool to another.
Let \(C(t) = (C_1(t), C_2(t))\) be the concentration of carbon in pools 1 and 2. The model assumes pool-specific decomposition rates \(\kappa_1, \kappa_2\) and transfer rates \(\alpha_{2,1}\) and \(\alpha_{1,2}\) from pool 2 to pool 1 and vice-versa. The system dynamics are modeled through the equations \[ \frac{d C_1(t)}{dt} = - \kappa_1 \, C_1(t) + \alpha_{2,1} \, \kappa_2 \, C_2(t) \] and \[ \frac{d C_2(t)}{dt} = - \kappa_2 \, C_2(t) + \alpha_{1,2} \, \kappa_1 \, C_1(t). \]
Given initial carbon concentrations \(C(0)\) at \(t = 0\) and values for parameters \(\kappa, \alpha\), these equations may be solved for \(C(t_n)\) at measurement times \(t_n > 0\).
Because the two pools are theoretical constructs to account for soil heterogeneity, they are not measured directly. Instead, only the total initial carbon \(C_1(0) + C_2(0)\) is measured, not how it is partitioned between the two pools.
For each measurement time \(t_n\), the total carbon evolved from the system since the initial time \(t = 0\) is measured and recorded as \(\mbox{eCO}_2(t_n)\). This measurement is of the quantity predicted by the system of differential equations to be the total initial carbon minus what is left in the pools at time \(t_n\), \[ \left[ C_1(0) + C_2(0) \right] - \left[ C_1(t_n) + C_2(t_n) \right]. \] The SoilR package provides the \(\mbox{eCO2}_2(t_n)\) measurements.
The statistical model assumes the following parameters.
Param | Range | Description |
---|---|---|
\(\gamma\) | \((0,1)\) | proportion of initial carbon in pool 1 |
\(\kappa_1\) | \((0,\infty)\) | decomposition rate (pool 1) |
\(\kappa_2\) | \((0,\infty)\) | decomposition rate (pool 2) |
\(\alpha_{2,1}\) | \((0,\infty)\) | transfer rate (pool 2 to 1) |
\(\alpha_{1,2}\) | \((0,\infty)\) | transfer rate (pool 1 to 2) |
\(\sigma\) | \((0,\infty)\) | residual error scale |
Weakly informative priors are supplied for all parameters, whose rough values are known from previous experiments.
\[\gamma \sim \mbox{Beta}(10,1)\]
\[\kappa_1, \kappa_2, \alpha_{1,2}, \alpha_{2,1} \sim \mbox{Normal}(0,1)\]
\[\sigma \sim \mbox{Cauchy}(0,1)\]
The sampling distribution is a standard predictive error model assuming normal error of predictions,
\[\mbox{eCO}_2(t) \sim \mbox{Normal}(\hat{\mbox{eCO}_2}(t), \sigma),\]
where \(\mbox{eCO}_2(t)\) is the measured evolved carbon and \(\hat{\mbox{eCO}_2}(t)\) is the estimate of evolved carbon based on the solutions \(C(t)\) to the system equations given initial condition \(C(0)\) and model parameters \((\gamma,\kappa,\alpha,\sigma)\).
With this model, the noise scale \(\sigma\) is accounting both for measurement error in \(\mbox{eCO}_2(t)\) and the misprediction of the model with estimated parameters.
By Bayes’s rule, the posterior distribution of the parameters given the observations is \[ p(\gamma,\kappa,\alpha,\sigma \, | \, \mbox{eCO}_2,C(0)) \ \propto \ p(\mbox{eCO}_2 \, | \, \gamma,\kappa,\alpha,\sigma,C(0)) \ p(\gamma,\kappa,\alpha,\sigma). \] Here, \(\mbox{eCO}_2\) is the full sequence of evolved carbon measurements.
The following is a dump of a file containing the Stan program to implement the 2-pool model with feedback as described above; the model code is fully documented, and is explained at more length below.
file_path <- "soil_incubation.stan";
lines <- readLines(file_path, encoding="ASCII");
for (n in 1:length(lines)) cat(lines[n],'\n');
functions {
/**
* ODE system for two pool model with feedback and no inputs.
*
* This is the version that does not deal with measurement error.
*
* System State C is two dimensional with C[1] and C[2]
* being carbon in pools 1 and 2.
*
* The system has parameters
*
* theta = (k1, k2, alpha21, alpha12)
*
* where
*
* k1: pool 1 decomposition rate
* k2: pool 2 decomposition rate
* alpha21: transfer coefficient from pool 2 to pool 1
* alpha12: transfer coefficient from pool 1 to pool 2
*
* The system time derivatives are
*
* d.C[1] / d.t = -k1 * C[1] + alpha12 * k2 * C[2]
*
* d.C[2] / d.t = alpha21 * k1 * C[1] - k2 * C[2]
*
* @param t time at which derivatives are evaluated.
* @param C system state at which derivatives are evaluated.
* @param theta parameters for system.
* @param x_r real constants for system (empty).
* @param x_i integer constants for system (empty).
*/
real[] two_pool_feedback(real t, real[] C, real[] theta,
real[] x_r, int[] x_i) {
real k1;
real k2;
real alpha21;
real alpha12;
real dC_dt[2];
k1 <- theta[1];
k2 <- theta[2];
alpha21 <- theta[3];
alpha12 <- theta[4];
dC_dt[1] <- -k1 * C[1] + alpha12 * k2 * C[2];
dC_dt[2] <- - k2 * C[2] + alpha21 * k1 * C[1] ;
return dC_dt;
}
/**
* Compute total evolved CO2 from the system given the specified
* parameters and times. This is done by simulating the system
* defined by the ODE function two_pool_feedback and then
* subtracting the sum of the CO2 estimated in each pool from the
* initial CO2.
*
* @param T number of times.
* @param t0 initial time.
* @param ts observation times.
* @param gamma partitioning coefficient.
* @param k1 decomposition rate for pool 1
* @param k2 decomposition rate for pool 2
* @param alpha21 transfer coefficient from pool 2 to 1
* @param alpha12 transfer coefficient from pool 1 to 2
* @param x_r real data (empty)
* @param x_i integer data (empty)
* @return evolved CO2 for times ts
*/
real[] evolved_CO2(int N_t, real t0, real[] ts,
real gamma, real totalC_t0,
real k1, real k2,
real alpha21, real alpha12,
real[] x_r, int[] x_i) {
real C_t0[2]; // initial state
real theta[4]; // ODE parameters
real C_hat[N_t,2]; // predicted pool content
real eCO2_hat[N_t];
C_t0[1] <- gamma * totalC_t0;
C_t0[2] <- (1 - gamma) * totalC_t0;
theta[1] <- k1;
theta[2] <- k2;
theta[3] <- alpha21;
theta[4] <- alpha12;
C_hat <- integrate_ode(two_pool_feedback,
C_t0, t0, ts, theta, x_r, x_i);
for (t in 1:N_t)
eCO2_hat[t] <- totalC_t0 - sum(C_hat[t]);
return eCO2_hat;
}
}
data {
real<lower=0> totalC_t0; // initial total carbon
real t0; // initial time
int<lower=0> N_t; // number of measurement times
real<lower=t0> ts[N_t]; // measurement times
real<lower=0> eCO2mean[N_t]; // measured cumulative evolved carbon
}
transformed data {
real x_r[0]; // no real data for ODE system
int x_i[0]; // no integer data for ODE system
}
parameters {
real<lower=0> k1; // pool 1 decomposition rate
real<lower=0> k2; // pool 2 decomposition rate
real<lower=0> alpha21; // transfer coeff from pool 2 to 1
real<lower=0> alpha12; // transfer coeff from pool 1 to 2
real<lower=0,upper=1> gamma; // partitioning coefficient
real<lower=0> sigma; // observation std dev
}
transformed parameters {
real eCO2_hat[N_t];
eCO2_hat <- evolved_CO2(N_t, t0, ts, gamma, totalC_t0,
k1, k2, alpha21, alpha12, x_r, x_i);
}
model {
// priors
gamma ~ beta(10,1); // identifies pools
k1 ~ normal(0,1); // weakly informative
k2 ~ normal(0,1);
alpha21 ~ normal(0,1);
alpha12 ~ normal(0,1);
sigma ~ cauchy(0,1);
// likelihood
for (t in 1:N_t)
eCO2mean[t] ~ normal(eCO2_hat[t], sigma); // normal error
}
The Stan program begins by defining two convenience functions in the functions block.
two_pool_feedback()
computes the solutions to the system of differential equations given the model parameters, initial conditions, and requested solution times in an array. The second.
evolved_CO2()
computes the cumulative evolved CO2 given the compartment concentrations computed by the first function.
The data block declares the data that is read in, including the initial total carbon, initial time, measurement times, and measured total evolved carbon. The parameter block declares parameters with constraints as defined above. The transformed parameter block is used to compute the evolved CO2 estimate given the system equations and parameter values. Finally, the model block just repeats the sampling distributions for all of the parameters.
library(rstan);
fit <- stan("soil_incubation.stan",
data=c("totalC_t0", "t0", "N_t", "ts", "eCO2mean"),
control=list(adapt_delta=0.90,
stepsize=0.005),
chains=2, iter=200, seed=1234);
##
## TRANSLATING MODEL 'soil_incubation' FROM Stan CODE TO C++ CODE NOW.
## COMPILING THE C++ CODE FOR MODEL 'soil_incubation' NOW.
##
## SAMPLING FOR MODEL 'soil_incubation' NOW (CHAIN 1).
##
## Iteration: 1 / 200 [ 0%] (Warmup)
## Iteration: 20 / 200 [ 10%] (Warmup)
## Iteration: 40 / 200 [ 20%] (Warmup)
## Iteration: 60 / 200 [ 30%] (Warmup)
## Iteration: 80 / 200 [ 40%] (Warmup)
## Iteration: 100 / 200 [ 50%] (Warmup)
## Iteration: 101 / 200 [ 50%] (Sampling)
## Iteration: 120 / 200 [ 60%] (Sampling)
## Iteration: 140 / 200 [ 70%] (Sampling)
## Iteration: 160 / 200 [ 80%] (Sampling)
## Iteration: 180 / 200 [ 90%] (Sampling)
## Iteration: 200 / 200 [100%] (Sampling)
## # Elapsed Time: 10.4934 seconds (Warm-up)
## # 5.7446 seconds (Sampling)
## # 16.238 seconds (Total)
##
##
## SAMPLING FOR MODEL 'soil_incubation' NOW (CHAIN 2).
##
## Iteration: 1 / 200 [ 0%] (Warmup)
## Iteration: 20 / 200 [ 10%] (Warmup)
## Iteration: 40 / 200 [ 20%] (Warmup)
## Iteration: 60 / 200 [ 30%] (Warmup)
## Iteration: 80 / 200 [ 40%] (Warmup)
## Iteration: 100 / 200 [ 50%] (Warmup)
## Iteration: 101 / 200 [ 50%] (Sampling)
## Iteration: 120 / 200 [ 60%] (Sampling)
## Iteration: 140 / 200 [ 70%] (Sampling)
## Iteration: 160 / 200 [ 80%] (Sampling)
## Iteration: 180 / 200 [ 90%] (Sampling)
## Iteration: 200 / 200 [100%] (Sampling)
## # Elapsed Time: 8.25357 seconds (Warm-up)
## # 14.8398 seconds (Sampling)
## # 23.0934 seconds (Total)
The following R code will print the summary of a model fit; explanations below.
options(width="100");
print(fit,digits=2);
## Inference for Stan model: soil_incubation.
## 2 chains, each with iter=200; warmup=100; thin=1;
## post-warmup draws per chain=100, total post-warmup draws=200.
##
## mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
## k1 0.27 0.11 0.35 0.07 0.09 0.12 0.21 1.42 10 1.18
## k2 0.49 0.21 0.42 0.10 0.19 0.34 0.61 1.73 4 1.14
## alpha21 0.64 0.28 0.42 0.02 0.21 0.69 0.99 1.45 2 1.60
## alpha12 0.79 0.52 0.68 0.03 0.26 0.54 1.43 2.28 2 1.86
## gamma 0.90 0.04 0.09 0.64 0.86 0.93 0.98 1.00 5 1.26
## sigma 0.36 0.00 0.06 0.27 0.32 0.34 0.39 0.49 138 1.00
## eCO2_hat[1] 0.28 0.02 0.15 -0.04 0.18 0.30 0.40 0.50 55 1.01
## eCO2_hat[2] 0.61 0.02 0.17 0.25 0.49 0.62 0.75 0.89 98 1.00
## eCO2_hat[3] 1.81 0.01 0.16 1.48 1.70 1.82 1.92 2.09 200 1.00
## eCO2_hat[4] 2.22 0.01 0.15 1.92 2.11 2.22 2.31 2.46 200 1.01
## eCO2_hat[5] 2.59 0.01 0.14 2.32 2.50 2.60 2.69 2.83 200 1.01
## eCO2_hat[6] 2.93 0.01 0.13 2.68 2.85 2.94 3.01 3.16 145 1.01
## eCO2_hat[7] 3.26 0.01 0.12 3.03 3.18 3.27 3.35 3.47 142 1.01
## eCO2_hat[8] 4.13 0.01 0.10 3.93 4.06 4.13 4.21 4.32 115 1.00
## eCO2_hat[9] 4.39 0.01 0.10 4.19 4.33 4.39 4.45 4.58 116 0.99
## eCO2_hat[10] 4.60 0.01 0.10 4.42 4.55 4.60 4.66 4.78 115 0.99
## eCO2_hat[11] 4.82 0.01 0.10 4.63 4.76 4.82 4.88 5.00 112 1.00
## eCO2_hat[12] 5.02 0.01 0.10 4.83 4.96 5.02 5.08 5.20 108 1.00
## eCO2_hat[13] 5.56 0.01 0.10 5.35 5.50 5.56 5.63 5.75 86 1.01
## eCO2_hat[14] 5.71 0.01 0.10 5.50 5.65 5.71 5.77 5.91 80 1.02
## eCO2_hat[15] 5.85 0.01 0.10 5.63 5.79 5.85 5.91 6.06 76 1.02
## eCO2_hat[16] 5.98 0.01 0.10 5.77 5.92 5.98 6.04 6.19 71 1.03
## eCO2_hat[17] 6.10 0.01 0.10 5.88 6.04 6.10 6.16 6.30 68 1.03
## eCO2_hat[18] 6.42 0.01 0.10 6.22 6.37 6.42 6.48 6.63 52 1.04
## eCO2_hat[19] 6.52 0.01 0.10 6.32 6.47 6.52 6.58 6.72 50 1.04
## eCO2_hat[20] 6.59 0.01 0.10 6.39 6.54 6.59 6.65 6.79 49 1.05
## eCO2_hat[21] 6.67 0.01 0.10 6.48 6.62 6.67 6.73 6.86 47 1.05
## eCO2_hat[22] 6.74 0.01 0.09 6.56 6.70 6.74 6.80 6.93 46 1.05
## eCO2_hat[23] 7.24 0.02 0.07 7.11 7.20 7.24 7.29 7.38 19 1.06
## eCO2_hat[24] 7.28 0.02 0.07 7.15 7.24 7.27 7.32 7.40 19 1.07
## eCO2_hat[25] 7.31 0.01 0.06 7.18 7.27 7.30 7.34 7.43 19 1.07
## lp__ 3.02 0.32 1.86 -1.11 2.04 3.22 4.25 6.10 33 1.09
##
## Samples were drawn using NUTS(diag_e) at Thu Nov 27 15:06:02 2014.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at
## convergence, Rhat=1).
There is one row per parameter, with the following columns, with statistics computed over all draws making up the sample.
In addition to the parameters, the row lp__
indicates the log density evaluated by the model; note that this quantity is equal to the log posterior up to an additive constant.
Column | Description |
---|---|
mean |
mean |
sd |
standard deviation |
N% |
percentiles (N/100 quantile) |
50% |
median |
n_eff |
number of effective samples |
R_hat |
potential scale reduction statistic (\(\hat{R}\)) |
se_mean |
standard error of posterior mean estimate (sd / sqrt(n_eff) ) |
Inspecting the output, the Stan implementation provided comparable posterior estimates to those estimated with SoilR by Sierra and Müller (2014). This is a nice sanity check on both implementations.
traceplot(fit,c("k1","k2","alpha21","alpha12","gamma","sigma"));
The warmup draws are highlighted with grey background; rather than sampling, Stan is adapting the sampling parameters (discretization interval and mass matrix). The traceplots during the sampling stage indicate Stan is mixing well.
If Stan does not mix well, try lowering the initial step size, increasing the target acceptance rate, and increasing the number of warmup iterations.
The pairs()
function in RStan displays a grid of plots. The diagonal contains a histogram of the posterior samples for each parameter; the above and below diagonals provide scatterplots for pairs of parameters. The above and below plots are based on draws from the two different Markov chains.
pairs(fit, pars=c("k1","k2","alpha21","alpha12","gamma","sigma"));
## KernSmooth 2.23 loaded
## Copyright M. P. Wand 1997-2009
The following figure, based on a figure in Sierra and Müller (2014), overlays the earlier plot of data and measurement error (blue) with the posterior median prediction for evolved CO2 (red) with 95% intervals (yellow) calculated from the median of the residual noise (\(\sigma\)) in the posterior.
library(ggplot2);
fit_ss <- extract(fit);
sigma_hat <- median(fit_ss$sigma);
eCO2_hat <- rep(NA,N_t);
for (t in 1:N_t) {
eCO2_hat[t] <- median(fit_ss$eCO2_hat[,t]);
}
df_post <- data.frame(list(ts = ts,
eCO2meas = eCO2mean,
eCO2_hat = eCO2_hat));
library(ggplot2);
interval_95pct <- aes(ymin = eCO2mean + 1.96 * eCO2sd,
ymax = eCO2mean - 1.96 * eCO2sd);
ggplot(df_post, aes(x = ts)) +
geom_ribbon(aes(ymin = eCO2_hat - 1.96 * sigma_hat,
ymax = eCO2_hat + 1.96 * sigma_hat),
fill="lightyellow") +
geom_line(aes(y=eCO2_hat),colour="darkred") +
geom_point(aes(y=eCO2meas),colour="darkblue") +
geom_errorbar(interval_95pct, width=0, colour="blue") +
labs(x="time (days)",
y="evolved CO2 (mgC g-1 soil)") +
ggtitle("Soil Incubation: 2 pools, feedback\nposterior median (red), predictive 95% (yellow)");
It is straightforward to add a measurement error component to the model.
file_path <- "soil_incubation_measurement_err.stan";
lines <- readLines(file_path, encoding="ASCII");
for (n in 1:length(lines)) cat(lines[n],'\n');
functions {
real[] two_pool_feedback(real t, real[] C, real[] theta,
real[] x_r, int[] x_i) {
real k1;
real k2;
real alpha21;
real alpha12;
real dC_dt[2];
k1 <- theta[1];
k2 <- theta[2];
alpha21 <- theta[3];
alpha12 <- theta[4];
dC_dt[1] <- -k1 * C[1] + alpha12 * k2 * C[2];
dC_dt[2] <- alpha21 * k1 * C[1] - k2 * C[2];
return dC_dt;
}
real[] evolved_CO2(int N_t, real t0, real[] ts,
real gamma, real totalC_t0,
real k1, real k2,
real alpha21, real alpha12,
real[] x_r, int[] x_i) {
real C_t0[2]; // initial state
real theta[4]; // ODE parameters
real C_hat[N_t,2]; // predicted pool content
real eCO2_hat[N_t];
C_t0[1] <- gamma * totalC_t0;
C_t0[2] <- (1 - gamma) * totalC_t0;
theta[1] <- k1;
theta[2] <- k2;
theta[3] <- alpha21;
theta[4] <- alpha12;
C_hat <- integrate_ode(two_pool_feedback,
C_t0, t0, ts, theta, x_r, x_i);
for (t in 1:N_t)
eCO2_hat[t] <- totalC_t0 - sum(C_hat[t]);
return eCO2_hat;
}
}
data {
real<lower=0> totalC_t0; // initial total carbon
real t0; // initial time
int<lower=0> N_t; // number of measurement times
real<lower=t0> ts[N_t]; // measurement times
vector<lower=0>[N_t] eCO2mean; // measured cumulative evolved carbon
real<lower=0> eCO2sd[N_t]; // measured cumulative evolved carbon
}
transformed data {
real x_r[0]; // no real data for ODE system
int x_i[0]; // no integer data for ODE system
}
parameters {
real<lower=0> k1; // pool 1 decomposition rate
real<lower=0> k2; // pool 2 decomposition rate
real<lower=0> alpha21; // transfer coeff from pool 2 to 1
real<lower=0> alpha12; // transfer coeff from pool 1 to 2
real<lower=0,upper=1> gamma; // partitioning coefficient
real<lower=0> sigma; // observation std dev
vector<lower=0>[N_t] eCO2; // evolved CO2
}
transformed parameters {
real eCO2_hat[N_t];
eCO2_hat <- evolved_CO2(N_t, t0, ts, gamma, totalC_t0,
k1, k2, alpha21, alpha12, x_r, x_i);
}
model {
gamma ~ beta(10,1); // identifies pools
k1 ~ normal(0,1); // weakly informative
k2 ~ normal(0,1);
alpha21 ~ normal(0,1);
alpha12 ~ normal(0,1);
sigma ~ normal(0,1);
// likelihood is INTEGRAL_(0,inf) p(eCO2,eCO2mean|...) d.eCO2
eCO2 ~ normal(eCO2_hat, sigma);
eCO2mean ~ normal(eCO2, eCO2sd); // measurement error
}
This can be fit as before; note the additional data variable, eCO2sd
.
fit_me <- stan("soil_incubation_measurement_err.stan",
data=c("totalC_t0", "t0", "N_t", "ts", "eCO2mean", "eCO2sd"),
control=list(adapt_delta=0.90,
stepsize=0.005),
chains=2, iter=100, seed=1234);
##
## TRANSLATING MODEL 'soil_incubation_measurement_err' FROM Stan CODE TO C++ CODE NOW.
## COMPILING THE C++ CODE FOR MODEL 'soil_incubation_measurement_err' NOW.
##
## SAMPLING FOR MODEL 'soil_incubation_measurement_err' NOW (CHAIN 1).
##
## Iteration: 1 / 100 [ 1%] (Warmup)
## Iteration: 10 / 100 [ 10%] (Warmup)
## Iteration: 20 / 100 [ 20%] (Warmup)
## Iteration: 30 / 100 [ 30%] (Warmup)
## Iteration: 40 / 100 [ 40%] (Warmup)
## Iteration: 50 / 100 [ 50%] (Warmup)
## Iteration: 51 / 100 [ 51%] (Sampling)
## Iteration: 60 / 100 [ 60%] (Sampling)
## Iteration: 70 / 100 [ 70%] (Sampling)
## Iteration: 80 / 100 [ 80%] (Sampling)
## Iteration: 90 / 100 [ 90%] (Sampling)
## Iteration: 100 / 100 [100%] (Sampling)
## # Elapsed Time: 1.52226 seconds (Warm-up)
## # 1.10213 seconds (Sampling)
## # 2.62439 seconds (Total)
##
##
## SAMPLING FOR MODEL 'soil_incubation_measurement_err' NOW (CHAIN 2).
##
## Iteration: 1 / 100 [ 1%] (Warmup)
## Iteration: 10 / 100 [ 10%] (Warmup)
## Iteration: 20 / 100 [ 20%] (Warmup)
## Iteration: 30 / 100 [ 30%] (Warmup)
## Iteration: 40 / 100 [ 40%] (Warmup)
## Iteration: 50 / 100 [ 50%] (Warmup)
## Iteration: 51 / 100 [ 51%] (Sampling)
## Iteration: 60 / 100 [ 60%] (Sampling)
## Iteration: 70 / 100 [ 70%] (Sampling)
## Iteration: 80 / 100 [ 80%] (Sampling)
## Iteration: 90 / 100 [ 90%] (Sampling)
## Iteration: 100 / 100 [100%] (Sampling)
## # Elapsed Time: 18.2091 seconds (Warm-up)
## # 3.06753 seconds (Sampling)
## # 21.2766 seconds (Total)
Here is the summary of the posterior samples from the measurement error model.
options(width="100");
print(fit_me,digits=2);
## Inference for Stan model: soil_incubation_measurement_err.
## 4 chains, each with iter=1000; warmup=500; thin=1;
## post-warmup draws per chain=500, total post-warmup draws=2000.
##
## mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
## k1 0.32 0.02 0.35 0.08 0.12 0.19 0.36 1.36 404 1.01
## k2 0.47 0.02 0.41 0.09 0.18 0.33 0.63 1.59 564 1.00
## alpha21 1.02 0.02 0.32 0.30 0.88 1.02 1.14 1.77 239 1.01
## alpha12 0.40 0.04 0.32 0.02 0.19 0.36 0.53 1.36 78 1.06
## gamma 0.92 0.00 0.08 0.71 0.89 0.94 0.98 1.00 636 1.01
## sigma 0.29 0.00 0.07 0.18 0.24 0.28 0.33 0.44 618 1.00
## eCO2[1] 0.03 0.00 0.02 0.00 0.02 0.03 0.05 0.08 1119 1.00
## eCO2[2] 0.71 0.00 0.06 0.60 0.67 0.71 0.75 0.83 1859 1.00
## eCO2[3] 1.90 0.01 0.21 1.50 1.76 1.90 2.05 2.33 1291 1.00
## eCO2[4] 1.89 0.00 0.12 1.65 1.81 1.90 1.98 2.13 1197 1.00
## eCO2[5] 2.77 0.00 0.08 2.62 2.72 2.77 2.82 2.93 1530 1.00
## eCO2[6] 2.68 0.00 0.06 2.55 2.64 2.68 2.72 2.80 2000 1.00
## eCO2[7] 3.25 0.00 0.11 3.04 3.17 3.25 3.32 3.46 1296 1.00
## eCO2[8] 3.90 0.00 0.08 3.74 3.85 3.90 3.96 4.06 986 1.00
## eCO2[9] 4.30 0.01 0.22 3.86 4.15 4.30 4.44 4.73 1372 1.00
## eCO2[10] 4.82 0.00 0.06 4.70 4.78 4.82 4.86 4.93 1389 1.00
## eCO2[11] 4.72 0.01 0.22 4.28 4.58 4.73 4.88 5.14 1445 1.00
## eCO2[12] 5.09 0.01 0.20 4.71 4.96 5.09 5.22 5.50 1484 1.00
## eCO2[13] 5.81 0.00 0.18 5.45 5.69 5.81 5.93 6.17 1661 1.00
## eCO2[14] 5.77 0.01 0.27 5.24 5.59 5.76 5.94 6.31 1211 1.00
## eCO2[15] 5.92 0.01 0.25 5.43 5.76 5.91 6.08 6.41 1161 1.00
## eCO2[16] 5.81 0.00 0.02 5.77 5.80 5.81 5.83 5.85 2000 1.00
## eCO2[17] 5.94 0.01 0.21 5.52 5.81 5.95 6.08 6.33 1226 1.01
## eCO2[18] 6.31 0.01 0.24 5.82 6.15 6.31 6.46 6.76 1342 1.00
## eCO2[19] 6.63 0.00 0.20 6.25 6.49 6.63 6.77 7.01 1859 1.00
## eCO2[20] 6.61 0.01 0.23 6.20 6.45 6.60 6.75 7.06 1256 1.00
## eCO2[21] 6.58 0.01 0.26 6.05 6.41 6.58 6.76 7.08 1503 1.00
## eCO2[22] 6.80 0.00 0.13 6.54 6.71 6.80 6.89 7.06 1529 1.00
## eCO2[23] 7.85 0.01 0.20 7.47 7.72 7.85 7.99 8.24 1109 1.00
## eCO2[24] 7.61 0.00 0.16 7.31 7.50 7.60 7.71 7.92 1603 1.00
## eCO2[25] 7.63 0.00 0.05 7.52 7.59 7.63 7.66 7.73 2000 1.00
## eCO2_hat[1] 0.22 0.01 0.16 -0.12 0.13 0.24 0.33 0.48 798 1.00
## eCO2_hat[2] 0.54 0.01 0.17 0.21 0.43 0.54 0.65 0.85 1011 1.00
## eCO2_hat[3] 1.75 0.00 0.15 1.48 1.66 1.75 1.85 2.05 1193 1.00
## eCO2_hat[4] 2.17 0.00 0.14 1.92 2.08 2.17 2.26 2.45 1178 1.00
## eCO2_hat[5] 2.56 0.00 0.13 2.32 2.48 2.56 2.64 2.82 1196 1.00
## eCO2_hat[6] 2.91 0.00 0.12 2.69 2.83 2.91 2.98 3.14 1248 1.00
## eCO2_hat[7] 3.25 0.00 0.11 3.04 3.18 3.25 3.32 3.47 1341 1.00
## eCO2_hat[8] 4.13 0.00 0.10 3.95 4.07 4.13 4.20 4.33 1588 1.00
## eCO2_hat[9] 4.40 0.00 0.10 4.22 4.33 4.40 4.46 4.59 1595 1.00
## eCO2_hat[10] 4.62 0.00 0.10 4.44 4.55 4.62 4.68 4.81 1577 1.00
## eCO2_hat[11] 4.84 0.00 0.10 4.66 4.77 4.83 4.90 5.03 1539 1.00
## eCO2_hat[12] 5.04 0.00 0.10 4.86 4.98 5.04 5.11 5.23 1491 1.00
## eCO2_hat[13] 5.59 0.00 0.10 5.39 5.52 5.59 5.66 5.78 1338 1.00
## eCO2_hat[14] 5.74 0.00 0.10 5.54 5.67 5.73 5.81 5.93 1286 1.01
## eCO2_hat[15] 5.88 0.00 0.10 5.68 5.81 5.88 5.95 6.07 1180 1.01
## eCO2_hat[16] 6.01 0.00 0.10 5.81 5.94 6.01 6.08 6.21 1132 1.01
## eCO2_hat[17] 6.13 0.00 0.10 5.93 6.06 6.13 6.20 6.33 1095 1.01
## eCO2_hat[18] 6.46 0.00 0.10 6.26 6.39 6.46 6.53 6.65 909 1.01
## eCO2_hat[19] 6.55 0.00 0.10 6.35 6.48 6.55 6.62 6.74 890 1.01
## eCO2_hat[20] 6.62 0.00 0.10 6.43 6.56 6.62 6.69 6.81 876 1.01
## eCO2_hat[21] 6.70 0.00 0.10 6.51 6.64 6.70 6.77 6.89 862 1.01
## eCO2_hat[22] 6.78 0.00 0.09 6.59 6.71 6.78 6.84 6.95 849 1.01
## eCO2_hat[23] 7.27 0.00 0.07 7.13 7.22 7.27 7.31 7.40 781 1.01
## eCO2_hat[24] 7.30 0.00 0.07 7.16 7.25 7.30 7.34 7.42 778 1.01
## eCO2_hat[25] 7.33 0.00 0.06 7.20 7.29 7.33 7.37 7.45 774 1.01
## lp__ 29.72 0.23 4.68 19.71 26.79 30.03 33.07 37.77 433 1.00
##
## Samples were drawn using NUTS(diag_e) at Thu Nov 27 16:04:56 2014.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at
## convergence, Rhat=1).
And the traceplots.
traceplot(fit_me, c("k1","k2","alpha21","alpha12","gamma","sigma"));
There are several next steps which may depend on each other.
Move noise to log scale to respect positivity constraint. This will make it multiplicative instead of additive; verify this with exploratory data analysis comparing residual to value.
Implement a measurement error model for each aggregated observation. This will presuppose an underlying true carbon concentration which is measured with error on several samples to produce the reported mean and standard deviations.
Work directly with the individual measurements in a hierarchical model.
These are models that involve an enzyme pool and biomass data and data from two carbon pools (or maybe just the aggregate).
Get the data into usable format.
Write out math for forward and reverse models.
Write out statistical models.
Implement forward Michaelis-Menten kinetics.
Implement reverse Michaelis-Menten kinetics.
Evaluate mixture model at each observation level, which will require running the integrator to solve the diff eqs just between observation times. Each new integration will have to take up where the last solver left off.
Evaluate change point model going from forward to reverse kinetics. This will require evaluating the integrator over intial and final intervals of time points.
Schädel, C., Y. Luo, R. David Evans, S. Fei, and S. Schaeffer. 2013. Separating soil CO2 efflux into C-pool-specific decay rates via inverse analysis of soil incubation data. Oecologia 171:721–732.
Sierra, Carlos A. and Markus Müller. 2014. Parameter estimation of compartment models in SoilR using classical and Bayesian methods. CRAN Vignette. [pdf]
All output, including plots, was generated directly from the code with the knitr R package using style Cerulean from within the RStudio integrated development environment.
This document is distributed under the CC-BY-3 license. The code is distrbuted under the new BSD license.