An adapted Weibull function for agricultural applications

W. Daniel Reynolds; Craig F. Drury; Lori A. Phillips; Xueming Yang; Ikechukwu V. Agomoh

doi:10.1139/cjss-2021-0046

25 June 2021

Canadian J. of Soil Science, 101(4):680-702 (2021). https://doi.org/10.1139/cjss-2021-0046

Abstract

The Weibull function is applied extensively in the life sciences and engineering but underused in agriculture. The function was consequently adapted to include parameters and metrics that increase its utility for characterizing agricultural processes. The parameters included initial and final dependent variables (Y₀ and Y_F, respectively), initial independent variable (x₀), a scale constant (k), and a shape constant (c). The primary metrics included mode, integral average, domain, skewness, and kurtosis. Nested within the Weibull function are the Mitscherlich and Rayleigh functions where c is fixed at 1 and 2, respectively. At least one of the three models provided an excellent fit to six example agricultural datasets, as evidenced by large adjusted coefficient of determination (R_A² ≥ 0.9266), small normalized mean bias error (MBE_N ≤ 1.49%), and small normalized standard error of regression (SER_N ≤ 8.08%). The Mitscherlich function provided the most probable (P_X) representation of corn (Zea mays L.) yield (P_M = 87.2%); Rayleigh was most probable for soil organic carbon depth profile (P_R = 96.4%); and Weibull was most probable for corn seedling emergence (P_W = 100%), nitrous oxide emissions (P_W = 100%), nitrogen mineralization (P_W = 58.4%), and soil water desorption (P_W = 100%). The Weibull fit to the desorption data was also equivalent to those of the well-established van Genuchten and Groenevelt–Grant desorption models. It was concluded that the adapted Weibull function has good potential for widespread and informative application to agricultural data and processes.

1. Introduction

Agricultural processes are amongst the most complex known, owing to their myriad of interacting chemical, physical, and biological factors. As a result, description and characterization of agricultural processes rely largely on fitted polynomial splines and empirical functions, rather than on theoretical models derived from first principles. Although this may be less than ideal, fitted splines and empirical functions can still be very useful. For example, splines and functions can often simulate the highly complex process of crop seedling emergence and can thereby provide objective estimates of important seed performance indicators such as time of first emergence, duration of emergence, maximum emergence rate, and average emergence rate. Fitted splines and empirical functions can also model strongly non-linear agricultural processes, such as nitrogen dynamics and soil water retention/release, which then serve as essential regulating functions in numerical crop growth and soil gas/water/solute balance models.

However, splines and empirical functions frequently do not perform equally well when applied to agricultural data. Splines often behave unrealistically when applied to data that are irregularly spaced, change rapidly in value, contain outliers, or carry appreciable experimental and (or) random error, all of which are “hallmark traits” of agricultural datasets. Empirical functions, in contrast, are not unduly perturbed by these traits, and when fitted by regression they are particularly adept at separating the underlying data signal from random noise. As a result, empirical functions are often preferred over splines for describing and characterizing agricultural datasets.

One well-established empirical function that has potentially widespread agricultural applications is the so-called Weibull function. The function was first promoted in a publication by W. Weibull (Weibull 1951) and it has increased in popularity ever since, especially for characterizing extreme event processes. Some common life science and engineering applications include characterizing biological growth, ocean currents, particle size distributions, and failure of electrical insulation; determining breaking strength of natural fibers and building materials; assessing yield force, fatigue, and brittle failure of metals; estimating likelihood of infrequent events such as earthquakes, floods, gales, and tsunamis; and failure analysis of equipment such as lightbulbs, automotive parts, electronics, and industrial machinery (Weibull 1951; Monahan 2006; Abernethy 2010; Chu 2013; Brown and Mayer 1988; Mahanta and Borah 2014). Although the Weibull function is quintessentially empirical (Weibull 1951), it is nonetheless closely related to the power law distribution of flaw sizes, classical fracture mechanics, and weakest link theory (Zok 2017; Quinn and Quinn 2010).

The main popularizing features of the Weibull model are its relative simplicity, extensive versatility, and ability to perform well with small datasets and extreme data values (e.g., Abernethy 2010; Chu and Ke 2012; NCSS, Weibull 2020). More specifically, the standard Weibull function has only three fitting parameters, but it can still exhibit varying degrees of left skew, symmetry, right skew, leptokurtosis, mesokurtosis, and platykurtosis; and it can accommodate a variable inflection point, a variable y-axis intercept, and a variable x-axis threshold. In addition, Chu and Ke (2012) found that estimates of Weibull parameters can remain useably accurate even with datasets as small as 5–10 points. None of the other commonly used empirical functions (e.g., Gompertz, Logistic, Richards, Beta, Gumbel, Normal, Lognormal, etc.) can match the Weibull combination of simplicity, flexibility, and range of application.

Despite the above advantages, Weibull applications to agricultural processes appear limited to a relatively small number of studies using specialized (and restrictive) equation forms for characterizing crop yield and biomass production (e.g., Karadavut and Tozluca 2005; Karadavut et al. 2010), seed germination (e.g., Brown and Mayer 1988; Gardarin et al. 2011), and seedling emergence (e.g., Aboutalebian et al. 2017; Izquierdo et al. 2013; Navarro et al. 2013; Gan et al. 1996). The objectives of this study were consequently to (i) derive an adapted Weibull function that is applicable to a wide range of agricultural processes; (ii) describe Weibull function characteristics, parameters, and metrics that may be useful in agricultural applications; and (iii) present six illustrative Weibull characterizations of agricultural datasets.

We start by deriving an adapted Weibull function (Section 2.1.), then illustrate the primary Weibull characteristics (Section 2.2.), describe potentially useful function parameters and metrics (Section 2.3.), identify nested special case and related functions (Section 2.4.), describe a convenient but rigorous method for determining Weibull parameters (Section 2.5.), and finally, present six example applications to diverse agricultural datasets (Section 3.). Although the Weibull model may have theoretical justification for some processes, it is applied here as a purely empirical function for parameterizing and characterizing agricultural processes and not for making time/space/quantity predictions or extrapolations beyond fitted datasets.

2. Materials and Methods

2.1. Derivation of an adapted Weibull function

The Weibull relationship includes a “derivative function” and a “cumulative function”. The derivative function can be written as a “diminishing response” relationship:

(1)

where Y is the dependent variable, x is the independent variable, c is a shape parameter affecting function symmetry, α is a scale or proportionality parameter affecting function domain, x₀ is an initial/threshold/minimum allowable x-value, and Y_F is the final (end-point) Y-value as x becomes large. By “diminishing response”, we mean that dY/dx → 0 as Y → Y_F, regardless of other parameter values.

Given eq. 1, an adapted Weibull cumulative function is obtained by separating variables and integrating between (x₀,Y₀) and (x,Y):

(2)

which leads to

(3)

and ultimately simplifies to

(4)

where Y = Y₀ at x = x₀, and k = k_W^c. Equation 4 is effectively a five-parameter Weibull cumulative function, as it contains five independent parameters (Y_F, Y₀, x₀, k, and c) that can be (in any combination) specified, independently measured, or determined by curve fitting to Y vs. x measurements (elaborated later). In addition, eq. 4 accommodates both Y_F > Y₀ and Y_F < Y₀ (Section 2.2.). The c parameter in eq. 4 is occasionally referred to as the “Weibull slope” (Abernethy 2010) because in traditional engineering analysis, Y vs. x data were plotted as ln[−ln(Y)] vs. ln(x), cf.

(5)

which yields a straight line with slope = c and Y-axis intercept = ln(k) = cln(k_W) if the data follow a Weibull function. The k_W parameter is sometimes referred to as the “Weibull scale constant”, or the “Weibull rate constant” if the x-axis represents time.
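For readers who want to experiment numerically, the following is a minimal sketch that assumes eq. 4 takes the form Y = Y_F − (Y_F − Y₀)·exp[−k(x − x₀)^c] for x ≥ x₀. This form is not reproduced in the text above; it is inferred from the boundary condition Y = Y₀ at x = x₀, the limit Y → Y_F, and the straight-line relationship described for eq. 5. The function names are illustrative only, and the parameter values are those quoted in the Fig. 2 caption.

```python
import math

def weibull_cumulative(x, y0, yf, x0, k, c):
    """Assumed form of eq. 4: Y = Y_F - (Y_F - Y_0) * exp(-k * (x - x0)**c)."""
    if x <= x0:
        return y0  # assumption: Y stays at Y_0 at and below the threshold x0
    return yf - (yf - y0) * math.exp(-k * (x - x0) ** c)

def weibull_derivative(x, y0, yf, x0, k, c):
    """dY/dx of the assumed eq. 4; intended for x > x0."""
    u = x - x0
    return (yf - y0) * k * c * u ** (c - 1) * math.exp(-k * u ** c)

# Parameter values from the Fig. 2 caption (k = 0.5, c = 1.6, x0 = 2, Y0 = 20, YF = 100).
params = dict(y0=20.0, yf=100.0, x0=2.0, k=0.5, c=1.6)
for x in (2.0, 3.0, 5.0, 8.2):
    print(x, round(weibull_cumulative(x, **params), 3))
```

Swapping the values of `y0` and `yf` gives the decreasing case (Y_F < Y₀) without any other change.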

2.2. Essential characteristics of the five-parameter Weibull function

The essential characteristics of the five-parameter Weibull derivative and cumulative functions are illustrated in Fig. 1. The derivative function is negative monotone and unbounded (dY/dx → ∞ as x → x₀) when c < 1; negative monotone and bounded (dY/dx finite at x = x₀) when c = 1; bell shaped and right skewed when 1 < c < 3.6; bell shaped and near symmetrical when c = 3.6; and bell shaped and left skewed when c > 3.6 (Fig. 1a). The corresponding cumulative function has an inflection when c > 1 and no inflection when c ≤ 1, and decreasing k increases the function domain but decreases the average slope (Fig. 1b). Note also that holding k, x₀, Y_F, and Y₀ constant causes all Y vs. x relationships to intersect at a common point regardless of c value, and that Y vs. x is increasing when Y_F > Y₀ but decreasing when Y_F < Y₀ (Fig. 1b).
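The common intersection point can be checked numerically. The sketch below again assumes that eq. 4 has the form Y = Y_F − (Y_F − Y₀)·exp[−k(x − x₀)^c]. Under that assumption, (x − x₀)^c equals 1 for every c when x = x₀ + 1 (in whatever units x is expressed), so all curves sharing k, x₀, Y₀, and Y_F pass through Y = Y_F − (Y_F − Y₀)·e^(−k) at that x. The parameter values are illustrative.

```python
import math

def weibull_cumulative(x, y0, yf, x0, k, c):
    # Assumed form of eq. 4 (the equation itself is not reproduced in the text).
    return yf - (yf - y0) * math.exp(-k * max(x - x0, 0.0) ** c)

y0, yf, x0, k = 2.0, 10.0, 1.0, 1.0
x_common = x0 + 1.0  # (x - x0)**c equals 1 here for every c

values = [weibull_cumulative(x_common, y0, yf, x0, k, c)
          for c in (0.5, 1.0, 2.0, 3.6, 5.0)]
expected = yf - (yf - y0) * math.exp(-k)
print(values, expected)
```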

Fig. 1.

Essential characteristics of the five-parameter Weibull function: (a) effects of shape parameter, c, and independent variable offset, x₀, on derivative function, dY/dx vs. x (eq. 1, k = 1.0); (b) effects of initial and final dependent variables, Y₀ and Y_F, respectively, independent variable offset, x₀, shape parameter, c, and scale parameter, k, on cumulative function, Y vs. x (eq. 4). For the two curves labelled c = 2.0, k = 1.0 in panel (b), the dash-dot line applies for Y₀ = 2, Y_F = 10, and the short-dash line applies for Y₀ = 10, Y_F = 2; both curves produce the derivative function labelled c = 2.0 in panel (a). [Colour online.]

An interesting feature of the Weibull function is its ability to indicate the likelihood (or chance) of an event or condition happening at any particular x value, given that the event or condition has not already happened (known in engineering as the Weibull “hazard” function). In processes where c ≠ 1, the likelihood of an event or condition occurring changes with x (these processes are said to have “memory”), whereas in processes with c = 1, the likelihood does not change with x (these processes are said to be “memoryless”) (Glen 2020a). Furthermore, when c < 1, likelihood of the event or condition happening decreases at a decreasing rate with increasing x; when c = 1, likelihood is constant (independent of x); when 1 < c < 2, the likelihood increases at a decreasing rate with increasing x; when c = 2, likelihood increases linearly with increasing x; and when c > 2, likelihood increases at an increasing rate with increasing x (see e.g., figs. 2–8 in Abernethy 2010). Seedling emergence is an example of an agricultural process which may or may not exhibit memory. If a Weibull fit to measured seedling emergence (Y) vs. time (x) yielded c > 1, the likelihood of emergence increased as time increased, presumably due to viable seed and favourable seedbed conditions (e.g., amenable soil temperature, moisture, density, fertility, etc.). If the fitted function yielded c < 1, likelihood of emergence decreased with increasing time, suggesting low viability seed and (or) poor seedbed conditions. If c = 1 was obtained, likelihood of seedling emergence was constant through time and not affected by seed quality or seedbed conditions. The memory property is illustrated further in Section 3.
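The dependence of this likelihood on c can be sketched numerically. The example assumes the cumulative form Y = Y_F − (Y_F − Y₀)·exp[−k(x − x₀)^c], for which the hazard, taken here as (dY/dx)/(Y_F − Y), reduces to k·c·(x − x₀)^(c − 1). The helper names `weibull_hazard` and `describe` are hypothetical, and the classification simply compares the hazard at three equally spaced x values.

```python
def weibull_hazard(x, x0, k, c):
    """Hazard under the assumed eq. 4: (dY/dx) / (Y_F - Y) = k*c*(x - x0)**(c - 1)."""
    return k * c * (x - x0) ** (c - 1)

def describe(c, tol=1e-12):
    """Classify how the hazard changes with x for a given shape parameter c."""
    h1, h2, h3 = (weibull_hazard(x, 0.0, 1.0, c) for x in (1.0, 2.0, 3.0))
    d1, d2 = h2 - h1, h3 - h2  # successive changes in hazard
    if abs(d1) < tol:
        return "constant"
    direction = "increasing" if d1 > 0 else "decreasing"
    if abs(d2 - d1) < tol:
        pace = "linearly"
    elif abs(d2) < abs(d1):
        pace = "at a decreasing rate"
    else:
        pace = "at an increasing rate"
    return f"{direction} {pace}"

for c in (0.5, 1.0, 1.5, 2.0, 3.0):
    print(c, describe(c))
```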

2.3. Parameters and metrics obtainable from the adapted Weibull function

As mentioned above, the parameters of the adapted (five-parameter) Weibull derivative and cumulative functions (i.e., Y_F, Y₀, x₀, k, and c) may be obtained by any combination of specification, independent measurement, and estimation by curve fitting to Y vs. x data. Potentially useful derivative function metrics include mode (located at x_M, where the derivative function reaches its maximum, R_M), median, skewness (S_F), kurtosis (K_F), and integral average (A_I). Cumulative function metrics of potential use include mean, inflection, quantiles, and domain (x_D). These metrics are briefly derived and discussed in Appendix A, summarized in Table 1, and illustrated in Fig. 2. This study focuses on the Y_F, Y₀, x₀, and c parameters and on the x_D, x_M, R_M, A_I, S_F, and K_F metrics.

Table 1.

Selected metrics of an adapted five-parameter Weibull function (see also Appendix A, Table 2, and Fig. 2).

Fig. 2.

Some metrics of the five-parameter Weibull function (k = 0.5, c = 1.6): (a) derivative function, dY/dx vs. x; (b) cumulative function, Y vs. x. Metrics listed include initial and final independent variables (x₀ = 2.0 and x_F = 8.2, respectively), initial and final dependent variables (Y₀ = 20 and Y_F = 100, respectively), mode (Mode), median (Median), mean (Mean), inflection (Inflection), integral average of derivative function (A_I), maximum value of derivative function (R_M), median quantile (0.5 Quantile), and domain (x_D) (see Appendix A for details). In panel (a), the area under the rectangle bounded by the dashed lines equals the area under the derivative function curve between x₀ and x_F. The derivative function is moderately right skewed (S_F = 0.9620) and slightly leptokurtic (K_F = 1.0440) (see Table 2). [Colour online.]
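As a hedged illustration of the shape metrics, the sketch below assumes that S_F and K_F are the standard moment-based Fisher skewness and excess kurtosis of a Weibull density, which depend only on the shape parameter c and can be written with gamma functions. Appendix A is not reproduced here, so the authors' exact definitions may differ; under this assumption, c = 1.6 should give values close to those quoted in the caption above.

```python
import math

def weibull_shape_metrics(c):
    """Fisher skewness and excess kurtosis of a Weibull density with shape c."""
    g1, g2, g3, g4 = (math.gamma(1.0 + n / c) for n in (1, 2, 3, 4))
    var = g2 - g1 ** 2  # variance for unit scale
    skew = (g3 - 3.0 * g1 * g2 + 2.0 * g1 ** 3) / var ** 1.5
    kurt = (g4 - 4.0 * g1 * g3 + 6.0 * g1 ** 2 * g2 - 3.0 * g1 ** 4) / var ** 2 - 3.0
    return skew, kurt

for c in (1.0, 1.6, 2.0, 3.6):
    s_f, k_f = weibull_shape_metrics(c)
    print(f"c = {c}: S_F = {s_f:.4f}, K_F = {k_f:.4f}")
```

Because neither metric depends on k, x₀, Y₀, or Y_F, fixing c (as in the Mitscherlich and Rayleigh special cases) also fixes the skewness and kurtosis.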

2.4. Special cases of the Weibull function and related functions

The five-parameter Weibull function collapses to four-parameter Mitscherlich and Rayleigh functions when c = 1 and 2, respectively (e.g., Lai and Xie 2006). The four-parameter Mitscherlich functions (which are effectively exponential functions) are given by

(6)

(7)

where

is the Mitscherlich scale constant, and the shape constant is fixed at c = 1. Distinguishing features of the Mitscherlich functions include no inflection, mode located at x = x₀, and fixed Fisher skewness and kurtosis values of S_F = 2 and K_F = 6, respectively (Weisstein 2004). Some traditional agricultural applications of Mitscherlich functions include description of nitrogen mineralization (e.g., Yang et al. 2020), characterization of nitrous oxide and carbon dioxide emissions from soil (e.g., Hoben et al. 2011; Gillis and Price 2011), depiction of crop and weed seed germination and (or) emergence (e.g., Brown and Mayer 1988; Gill et al. 1996), modeling crop growth and (or) yield responses to fertilizer rates (e.g., Srivastava et al. 2006), and simulating annual rainfall (e.g., Harmsen 2000) and water infiltration rate (e.g., Horton 1940).

It should also be noted that the four-parameter Mitscherlich cumulative function (eq. 7) is a generalization of a single-pool, first-order reaction kinetics model developed by Beauchamp et al. (1986) for describing mineralization of soil nitrogen (N) in laboratory incubation studies:

(8)

where

(mg N·kg⁻¹ soil) is the amount of nitrogen mineralized at time t (d),

(mg N·kg⁻¹) is the total amount of nitrogen available to mineralize (pool of mineralizable soil nitrogen), β (d⁻¹) is the mineralization rate constant, and

(mg N·kg⁻¹) is either the amount of mineralized N already present in the soil at t = 0, or the amount of “easily mineralizable” N that mineralized rapidly in the early stages of the incubation before measurements started. As eq. 8 assumes implicitly that t₀ = 0 (i.e., x₀ = 0 in eq. 7), then plots of

vs. t intersect the y-axis at

(see e.g., Figure 4 in Beauchamp et al. 1986). Note also that the mineralization “half time” or “half life”, t_1/2 (i.e., the time required for half of

to be mineralized), is a special case of the median x (

, eq. A6) or the 0.5 quantile x (

, eq. A23):

(9)

since x₀ = 0, c = 1 and β = k.
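The half-time relationship can be illustrated with a short sketch. It assumes that eq. 8 has the single-pool first-order form N(t) = N_initial + N_pool·[1 − exp(−βt)], so that the half time is ln(2)/β. The variable names `n_pool` and `n_initial` are hypothetical stand-ins for symbols that are not reproduced in the text above, and the numerical values are made up for illustration.

```python
import math

def mineralized_n(t, n_pool, beta, n_initial):
    """Single-pool first-order model (assumed form of eq. 8)."""
    return n_initial + n_pool * (1.0 - math.exp(-beta * t))

def half_time(beta):
    """Time for half of the mineralizable pool to mineralize: ln(2) / beta."""
    return math.log(2.0) / beta

# Illustrative (made-up) values, not data from the study.
n_pool, beta, n_initial = 120.0, 0.05, 8.0  # mg N/kg, 1/d, mg N/kg
t_half = half_time(beta)
print(t_half, mineralized_n(t_half, n_pool, beta, n_initial))
```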

The four-parameter Rayleigh functions are

(10)

(11)

where

is the Rayleigh scale constant, and the shape constant is fixed at c = 2. The Rayleigh functions have variable mode, median, mean, and inflection (as with Weibull) but fixed Fisher skewness and kurtosis of S_F ≈ 0.6311 and K_F ≈ 0.2451, respectively (Weisstein 2020). To the authors’ knowledge, Rayleigh functions have not been applied to agricultural processes but are used extensively for characterizing communication signals, wind speed, wave height, sound and (or) light radiation, electronics longevity, and magnetic resonance images (e.g., Glen 2020b).

In addition to the above special cases, Weibull functions provide close approximations to the Lognormal distribution when c = 2.5 (NCSS, Weibull 2020), the Normal distribution when c = 3.6023494 (Cousineau 2011; Lai and Xie 2006; NCSS, Weibull 2020; Fig. 1a), and the Gumbel distribution when c is large (Cousineau 2011). These approximations further extend the flexibility and range of application of Weibull functions.

2.5. Determining Weibull parameters

2.5.1. Parameter estimation methods

The main approaches for estimating Weibull parameters from Y vs. x data include method of moments, maximum likelihood estimation, and model-data curve fitting using Weibull probability plots (eq. 5) or iterative non-linear least-squares regression (Cousineau 2009; Abernethy 2010; Lai and Xie 2006; Quinn and Quinn 2010; Evans et al. 2019; NCSS, Weibull 2020). Of these, maximum likelihood and curve fitting via regression provide objective (and often the most accurate) parameter estimates (NCSS, Weibull 2020). However, regression may provide more accurate estimates of some parameters than maximum likelihood (NCSS, Weibull 2020), as well as more accurate estimates of all Weibull parameters when the dataset is small (Chu and Ke 2012). Notwithstanding that commercial software packages (e.g., Relyence, Weibull 2020; ReliaSoft, Weibull 7.0 2020; Reliability Workbench, Weibull Analysis 2020; NCSS, Weibull 2020) provide a variety of methods for estimating Weibull parameters, model-data curve fitting via regression may be the most widely accessible because it resides as a built-in application in most computer spreadsheets.

Iterative non-linear least-squares regression was used here by applying the Solver® algorithm in the Excel® spreadsheet to minimize the sum of squared errors between the adapted Weibull cumulative function (eq. 4) and Y vs. x data. One or more of Y_F, Y₀, x₀, k, and c were treated as curve-fitting parameters, and the Weibull scale parameter, k_W, was separated from the shape parameter, c, to improve parameter identifiability (i.e., k was used instead of k_W^c, Banks and Joyner 2017). Initial guess values for the curve-fitting parameters were obtained by manual “trial-and-error” matching of model prediction to Y vs. x data and by plotting the data according to eq. 5 (when possible) to estimate c from slope and k from intercept. Whenever possible, the number of parameters used for curve fitting was ≤ half the number of data points to increase the likelihood of obtaining unique parameter values. The options used in the Solver fitting algorithm included generalized reduced gradient non-linear numerical iteration, slope calculation using central derivatives, automatic scaling, and a 10⁻¹⁰ convergence criterion.
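The initial-guess step based on eq. 5 can be sketched as follows. This is not the Solver workflow used in the study. It is a minimal illustration that assumes eq. 5 has the form ln{−ln[(Y_F − Y)/(Y_F − Y₀)]} = c·ln(x − x₀) + ln(k), and that Y₀, Y_F, and x₀ are already known or guessed. The function name `initial_guesses` and the synthetic, noise-free data are hypothetical.

```python
import math

def initial_guesses(xs, ys, y0, yf, x0):
    """Estimate k and c from the slope and intercept of the assumed eq. 5 line."""
    pts = []
    for x, y in zip(xs, ys):
        frac = (yf - y) / (yf - y0)      # fraction of the total change not yet reached
        if x > x0 and 0.0 < frac < 1.0:  # both logarithms must be defined
            pts.append((math.log(x - x0), math.log(-math.log(frac))))
    n = len(pts)
    mean_u = sum(u for u, _ in pts) / n
    mean_v = sum(v for _, v in pts) / n
    sxx = sum((u - mean_u) ** 2 for u, _ in pts)
    sxy = sum((u - mean_u) * (v - mean_v) for u, v in pts)
    c = sxy / sxx                        # slope of the straight line
    k = math.exp(mean_v - c * mean_u)    # intercept is ln(k)
    return k, c

# Synthetic data generated from the assumed eq. 4 with known k and c.
y0, yf, x0, k_true, c_true = 20.0, 100.0, 2.0, 0.5, 1.6
xs = [2.0 + 0.5 * i for i in range(1, 11)]
ys = [yf - (yf - y0) * math.exp(-k_true * (x - x0) ** c_true) for x in xs]
k_est, c_est = initial_guesses(xs, ys, y0, yf, x0)
print(k_est, c_est)
```

With real, noisy data these values would serve only as starting points for an iterative non-linear least-squares fit.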

2.5.2. Assessing model-data fits

The accuracy and validity of model-data fits were assessed using various “goodness of fit” indicators, including adjusted coefficient of determination, normalized mean bias error, normalized standard error of regression (e.g., Archontoulis and Miguez 2015), and regression of measured Y (y-axis) against model-predicted Y (x-axis) (Piñeiro et al. 2008).

Adjusted coefficient of determination for non-linear least-squares regression (R_A²) is given by

(12)

normalized mean bias error (MBE_N) by

(13)

and normalized standard error of regression (SER_N) by

(14.1)

where

is the regression sum of squared errors,

is the regression total sum of squares,

and

are, respectively, the model-predicted and measured Y values at each x-value,

is the mean of the measured Y values, B is the number of data points (measurements), and L is the number of model fitting parameters (excluding intercept). The R_A² metric indicates degree to which the fitted model predicts (or explains) systematic variation in the data, and accounts for the number of model fitting parameters (L) (e.g., Archontoulis and Miguez 2015). The MBE_N (also known as “normalized mean prediction error”, MPE_N) indicates degree of systematic model bias, with positive values indicating net overestimate of the data by the fitted model, and negative values indicating net underestimate (e.g., Archontoulis and Miguez 2015; Schjønning et al. 2017). The SER_N indicates the “predictive power” of a fitted model with L fitting parameters, where SER_N = 0 indicates perfect prediction (i.e., no discrepancies between predicted and measured values). Given that SER_N is functionally related to normalized root mean square error, RMSE_N, i.e.,

(14.2)

the RMSE_N predictive power categories of Jamieson et al. (1991) can be used as conservative SER_N categories, i.e., 0 ≤ SER_N < 10% indicates excellent model prediction of the data; 10% ≤ SER_N ≤ 20% signifies good prediction; 20% < SER_N ≤ 30% indicates fair prediction; and SER_N > 30% signifies poor prediction.
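The three indicators can be sketched in a few lines. Because eqs. 12–14 are not reproduced above, the formulas below are assumptions based on common definitions: adjusted R² uses B − L − 1 residual degrees of freedom, MBE_N is the mean of (predicted − measured) as a percentage of the mean measured value, and SER_N is the standard error of regression (same degrees of freedom) as a percentage of the mean measured value. The function names and data are illustrative only.

```python
import math

def fit_metrics(measured, predicted, n_params):
    """Return (adjusted R^2, MBE_N in %, SER_N in %) under the stated assumptions."""
    b = len(measured)
    mean_m = sum(measured) / b
    sse = sum((p - m) ** 2 for p, m in zip(predicted, measured))
    sst = sum((m - mean_m) ** 2 for m in measured)
    dof = b - n_params - 1  # assumed residual degrees of freedom
    r2_adj = 1.0 - (sse / dof) / (sst / (b - 1))
    mbe_n = 100.0 * sum(p - m for p, m in zip(predicted, measured)) / (b * mean_m)
    ser_n = 100.0 * math.sqrt(sse / dof) / mean_m
    return r2_adj, mbe_n, ser_n

def ser_category(ser_n):
    """Predictive-power categories quoted in the text (after Jamieson et al. 1991)."""
    if ser_n < 10.0:
        return "excellent"
    if ser_n <= 20.0:
        return "good"
    if ser_n <= 30.0:
        return "fair"
    return "poor"

measured = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]   # illustrative values only
predicted = [11.0, 19.0, 31.0, 41.0, 49.0, 61.0]
r2_adj, mbe_n, ser_n = fit_metrics(measured, predicted, n_params=2)
print(r2_adj, mbe_n, ser_n, ser_category(ser_n))
```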

According to the analysis of Piñeiro et al. (2008), model representations of data are viable only when the slope and intercept of measured Y (Y_i^M) regressed against model-predicted Y (Y_i^P) are not significantly different from unity and zero, respectively. When this occurs, the 95% confidence limits on slope fall on either side of unity, and the 95% confidence limits on intercept fall on either side of zero (at P < 0.05 significance level). If the regression slope is significantly different from unity (i.e., 95% confidence interval excludes unity), the model may be compromised due to “inconsistency” (with the data); and if the regression intercept is significantly different from zero (i.e., 95% confidence interval excludes zero), the model may be compromised due to “bias”. Generally speaking, the confidence limit “band” (i.e., upper confidence limit minus lower limit) becomes narrower and more symmetrical about the Y_i^M vs. Y_i^P regression line as fit between model and data improves.

2.5.3. Selecting the most suitable function or model

As the Mitscherlich and Rayleigh models are simpler special cases of the Weibull model (i.e., they are “nested” within the Weibull model because they have fixed c values), it is advisable to determine which of the three models provides the best balance between goodness of fit to the data and model simplicity, or “parsimony” (e.g., Johnson and Omland 2004; Vandekerckhove et al. 2015). This was achieved here using the two-model partial F test and the corrected Akaike information criterion (AIC_C).

The two-model partial F test (F_P) may be given as (e.g., Archontoulis and Miguez 2015):

(15)

(16)

where SSE_W and K are, respectively, the sum of squared errors and number of fitting parameters for the Weibull model, B is the number of Y vs. x measurements, F_W-X is the F test significance level, F.DIST.RT is the right-tailed F probability distribution (as represented in Excel®), and SSE_X and J are, respectively, the sum of squared errors and number of fitting parameters for the Mitscherlich or Rayleigh models. If F_W-M ≤ 0.05, the Weibull fit is significantly better (at P < 0.05) than the Mitscherlich fit; and if F_W-R ≤ 0.05, the Weibull fit is significantly better (at P < 0.05) than the Rayleigh fit. Note that the F_W-X test cannot compare models with an equal number of fitting parameters (K = J).
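A minimal sketch of the partial F statistic follows. Since eqs. 15 and 16 are not reproduced above, it assumes the common nested-model form F_P = [(SSE_X − SSE_W)/(K − J)] / [SSE_W/(B − K)], with K − J and B − K degrees of freedom. The function name and the SSE values are hypothetical.

```python
def partial_f(sse_reduced, sse_full, j_reduced, k_full, b_points):
    """Partial F statistic for a reduced model nested in a fuller one (assumed form)."""
    if k_full <= j_reduced:
        raise ValueError("the fuller model needs more fitting parameters (K > J)")
    df_num = k_full - j_reduced
    df_den = b_points - k_full
    f_stat = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f_stat, df_num, df_den

# Illustrative (made-up) sums of squared errors for B = 21 data points.
f_stat, df_num, df_den = partial_f(sse_reduced=50.0, sse_full=10.0,
                                   j_reduced=4, k_full=5, b_points=21)
print(f_stat, df_num, df_den)
```

The right-tailed probability would then come from an F distribution with these degrees of freedom, for example F.DIST.RT in Excel as described above, or `scipy.stats.f.sf(f_stat, df_num, df_den)` if SciPy is available.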

For ordinary least-squares regression, the AIC_C is given by (Banks and Joyner 2017)

(17)

where SSE_X is the regression sum of squared errors for the fitted model (i.e., Weibull, Mitscherlich, or Rayleigh), B is the number of data points, and L is the number of model parameters used as least-squares fitting variables (as c is fixed in Mitscherlich and Rayleigh, these models will usually have one less fitting variable than the Weibull model). The first term on the right of eq. 17 represents the model’s “goodness of fit”, and the last two terms represent a “parsimony penalty” (Banks and Joyner 2017; Vandekerckhove et al. 2015). The parsimony penalty increases with increasing number of model fitting variables (increasing model complexity), and it is designed to compensate for model over-fitting; i.e., spurious improvements in model-data fit caused by adding fitting variables that fit largely to the data’s random noise, rather than to the data’s underlying signal (Johnson and Omland 2004; Vandekerckhove et al. 2015). Generally speaking, the model providing the lowest AIC_C value is considered the “best estimator” of the data (Banks and Joyner 2017; Vandekerckhove et al. 2015). Note also that AIC_C applies for both nested and un-nested models, and that two or more models can be compared simultaneously (see e.g., Banks and Joyner 2017).

The lowest AIC_C alone does not indicate how much more probable the “best estimator” model is relative to the other models, especially if differences among AIC_C values are small. To alleviate this deficiency, likelihood of being the “best estimator” model can be assessed using normalized probabilities, P_X, derived from differences among AIC_C weights (Banks and Joyner 2017; Vandekerckhove et al. 2015), i.e.,

(18.1)

(18.2)

(18.3)

where P_W, P_M, and P_R are normalized probabilities (%), and

are the corresponding AIC_C weights for the Weibull, Mitscherlich, and Rayleigh models, respectively. The AIC_C weights are in turn given by (Banks and Joyner 2017)

(19.1)

(19.2)

(19.3)

where

, and

(20.1)

(20.2)

(20.3)

where

,

, and

are the AIC_C values of the Weibull, Mitscherlich, and Rayleigh models, respectively, and

is the smallest AIC_C value of the three models. The relative magnitudes among P_X values indicate if the AIC_C values are sufficiently different to select any one model as being better than the others for estimating or representing a particular Y vs. x dataset (Banks and Joyner 2017).
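The AIC_C comparison can be sketched as below. Equations 17–20 are not reproduced above, so this example assumes the usual least-squares form AIC_C = B·ln(SSE/B) + 2p + 2p(p + 1)/(B − p − 1), with p = L + 1 to count the error variance as an estimated parameter, and standard Akaike weights exp(−Δ/2) normalized to percentages. The function names and SSE values are hypothetical.

```python
import math

def aicc(sse, b, n_fit):
    """Small-sample corrected AIC for least squares (assumed three-term form)."""
    p = n_fit + 1  # assumption: the error variance counts as one extra parameter
    return b * math.log(sse / b) + 2.0 * p + 2.0 * p * (p + 1.0) / (b - p - 1.0)

def model_probabilities(aicc_values):
    """Normalized probabilities (%) from Akaike weights."""
    best = min(aicc_values.values())
    weights = {name: math.exp(-0.5 * (value - best))
               for name, value in aicc_values.items()}
    total = sum(weights.values())
    return {name: 100.0 * w / total for name, w in weights.items()}

# Illustrative (made-up) SSE values for B = 21 data points.
b = 21
scores = {
    "Weibull": aicc(10.0, b, 5),
    "Mitscherlich": aicc(50.0, b, 4),
    "Rayleigh": aicc(12.0, b, 4),
}
probs = model_probabilities(scores)
print(scores)
print(probs)
```

Subtracting the smallest AIC_C before exponentiating leaves the normalized probabilities unchanged and avoids numerical underflow when AIC_C values are large.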

Model suitability can also be assessed by comparing a plot of the model-predicted derivative function to a finite difference estimate of the actual derivative function, where

(21)

and B is the number of measured data pairs. The comparison can be informal (e.g., visual) or quantified using goodness of fit criteria (e.g., eqs. 12–14, 18–20). Although such comparisons are absent (or rare) in the literature, they can be critically important as good correlation between the model-predicted and estimated derivative function implies model compatibility with underlying biological–physical–chemical processes. To be useful, this approach obviously requires relatively low random variability in the measurements.
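One simple way to form such an estimate is sketched below. Equation 21 is not reproduced above, so this is an assumption rather than the authors' scheme: slopes are computed between consecutive measurements and assigned to the interval midpoints, giving B − 1 rate estimates from B data pairs. The function name and example data are illustrative.

```python
def finite_difference_rates(xs, ys):
    """Slope between consecutive measurements, reported at interval midpoints."""
    mids, rates = [], []
    for (x_a, y_a), (x_b, y_b) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        mids.append(0.5 * (x_a + x_b))
        rates.append((y_b - y_a) / (x_b - x_a))
    return mids, rates

# Illustrative cumulative data (made up): irregular spacing is handled naturally.
xs = [0.0, 1.0, 3.0, 4.0]
ys = [0.0, 2.0, 8.0, 9.0]
mids, rates = finite_difference_rates(xs, ys)
print(list(zip(mids, rates)))
```

The resulting (midpoint, rate) pairs can be plotted against the fitted derivative function, or compared with it using the goodness of fit criteria described earlier.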

3. Example Applications

3.1. Seedling emergence

Seedling emergence typically occurs several days after planting of crop seeds or plow-down of weed seeds. Cumulative seedling emergence versus time graphs usually have a sigmoid or concave shape (e.g., Haj Seyed Hadi and Gonzales-Andujar 2009; Forcella et al. 2000), which is determined largely by seed viability and (or) dormancy and seedbed conditions. Both the time delay and emergence versus time patterns of seedling emergence are amenable to characterization using the Weibull–Mitscherlich–Rayleigh models and are illustrated here for emergence of grain corn (Zea mays L.) seedlings. Figure 3 shows corn emergence data and fitted models, where Y₀ and x₀ were specified constants, Y_F and k were fitted for the Mitscherlich and Rayleigh models, and Y_F, k, and c were fitted for the Weibull model. Table 3 gives 95% confidence limits for slope and intercept of Y_i^M vs. Y_i^P; Table 4 gives parameter values and fit metrics.

Fig. 3.

Weibull, Mitscherlich, and Rayleigh functions fitted to corn seedling emergence data: (a) cumulative seedling emergence, Y vs. t; (b) seedling emergence rate, dY/dt vs. t. Circles are measured cumulative seedling emergence expressed as percentage of seed planting rate; triangles are finite difference (FD) estimates of emergence rate based on measured cumulative seedling emergence (eq. 21); predicted end point (triangle) is time when complete emergence was reached (x_F,Y_F). Vertical “T bars” are standard error (n = 4). Data from Agomoh et al. (2021). [Colour online.]

The Weibull and Rayleigh models produced better visual fits than Mitscherlich to both cumulative emergence (Fig. 3a) and estimated emergence rate (Fig. 3b). Note in particular that Weibull and Rayleigh tracked the estimated emergence rate data quite well, whereas Mitscherlich did not (Fig. 3b), suggesting that the emergence mechanism was not a Mitscherlich-type first-order exponential. The fit statistics for Weibull were excellent (R_A² = 1.0000, MBE_N = −3.56 × 10⁻⁵%, SER_N = 0.05%), whereas those for Rayleigh were somewhat poorer (R_A² = 0.9988, MBE_N = 0.14%, SER_N = 2.17%), and those for Mitscherlich were substantially poorer (R_A² = 0.9617, MBE_N = 0.59%, SER_N = 12.4%) (Table 4). In addition, the 95% confidence limits for slope and intercept of Y_i^M vs. Y_i^P were very narrow for Weibull, whereas those for Mitscherlich and Rayleigh were much wider (Table 3). Evidently, Weibull’s excellent model-data fit was sufficient to over-ride its parsimony penalty (due to having one more fitting parameter), as the Weibull fit was highly significant (F_W-M < 0.001, F_W-R < 0.001), and Weibull was by far the most probable of the three models (P_W ≈ 100%, P_R ≈ 0%, P_M ≈ 0%). The fitted Weibull model indicated a final seedling emergence (Y_F) of 104%, a final emergence time (x_F) of 8.3 d after planting, a maximum emergence rate (R_M) of 77.3% d⁻¹ at day 6 after planting (x_M), and an integral average emergence rate (A_I) of 31.1% d⁻¹ over the 3.3 d (x_D) duration of seedling emergence (Table 4). Note also that since c = 2.3 was obtained from the Weibull fit (Table 4), likelihood of corn seedling emergence increased with time at an increasing rate, suggesting viable non-dormant seed and favourable seedbed conditions. The Weibull-fitted emergence rate (derivative) function (Fig. 3b) was moderately right skewed (S_F = 0.4536), but only slightly less tailed (K_F = −0.0353) than a normal distribution (Table 2).

Table 2.

Proposed skewness (S_F) and kurtosis (K_F) categories based on the inclusive graphic classifications of Folk (1980).

3.2. Nitrous oxide emissions

Nitrous oxide (N₂O) emissions from agricultural soil occur primarily as a result of microbial nitrification and denitrification of applied nitrogen sources, such as fertilizers and organic amendments (e.g., Signor and Cerri 2013). Although cumulative N₂O emission versus time curves can assume a wide variety of shapes, by far the most common are concave and sigmoid forms, which are readily characterized using the Weibull–Mitscherlich–Rayleigh models. Figure 4 shows cumulative N₂O emissions and fitted models for corn production on a clay loam soil that had received starter fertilizer (8–32–16) on Julian day 136 and side-dress nitrogen fertilizer (28% UAN) on Julian day 170 (see Drury et al. 2021 for details). Because there were more than twice as many data points (21) as fitting parameters (4–5), Y₀, Y_F, x₀, and k were fitted for the Mitscherlich and Rayleigh models, and Y₀, Y_F, x₀, k, and c were fitted for Weibull. Tables 3 and 5 give the parameter values and fit metrics.

Fig. 4.

Weibull, Mitscherlich, and Rayleigh functions fitted to nitrous oxide emissions from soil: (a) nitrous oxide (N₂O) emitted vs. time (t); (b) N₂O emission rate, dN/dt vs. t. Circles are measured cumulative N₂O emitted; diamonds are finite difference (FD) estimates of dN/dt based on measured cumulative N₂O (eq. 21); predicted end point (triangle) is estimated time when emissions ceased (x_F,Y_F). Vertical “T bars” are standard error (n = 4). Data from Drury et al. (2021). [Colour online.]

Table 3.

Confidence limits (CL, 95%) on slope and y-axis intercept of measured dependent variable (Y_i^M) regressed against model-predicted dependent variable (Y_i^P) for the Weibull, Mitscherlich, Rayleigh, van Genuchten, and Groenevelt–Grant models.

The measured N₂O emissions formed a sigmoid cumulative curve (Fig. 4a) and a bell-shaped derivative, or emission rate, curve (Fig. 4b). The Weibull and Rayleigh models produced visually good model-data fits, but Mitscherlich was clearly not competitive. The 95% confidence limits for slope and intercept of Y_i^M vs. Y_i^P (Table 3), and the R_A², MBE_N, SER_N, F_W-M, and F_W-R values (Table 5), further showed that the Weibull fit was excellent and significantly better than Rayleigh and Mitscherlich. As a result, Weibull was highly probable (P_W = 100%) despite its parsimony penalty, while Rayleigh and Mitscherlich were highly improbable (P_R ≈ 0%, P_M ≈ 0%) (Table 5). The fitted Weibull model indicated maximum cumulative N₂O emission (Y_F) of 16 769 g N₂O-N·ha⁻¹ at Julian day 231 (x_F), a maximum emission rate (R_M) of 424 g N₂O-N·ha⁻¹·d⁻¹ at Julian day 178 (x_M), and an average emission rate (A_I) of 170 g N₂O-N·ha⁻¹·d⁻¹ which occurred over a 98 d emission period (x_D) (Table 5). Note as well that since c = 3.2669 was obtained (Table 5), likelihood of emission increased with time, and emissions were almost normally distributed (S_F = 0.0872, K_F = −0.2884, Table 2).

3.3. Yield change

Corn is highly sensitive to the soil environment, and annual yields decline when root zone fertility and biophysical health start to deteriorate (e.g., Bennett et al. 2012). Figure 5 shows annual grain yields and fitted models for unfertilized monoculture corn grown under no-tillage on a sandy loam soil after a one-time application (in 2012) of yard waste compost (experimental details given in Reynolds et al. 2020). Parameters Y₀ and x₀ were specified constants, whereas Y_F and k were fitted for Mitscherlich and Rayleigh, and Y_F, k, and c were fitted for Weibull. Parameter values and fit metrics appear in Tables 3 and 6.

Fig. 5.

Weibull, Mitscherlich, and Rayleigh functions fitted to decline in grain yield of unfertilized monoculture corn: (a) yield (Y) vs. time (t) after a one-time (spring 2012) application of compost; (b) rate of yield decline, −dY/dt vs. t. Circles are measured corn grain yield at 15.5 wt. % moisture content; diamonds are finite difference (FD) estimates of −dY/dt based on measured cumulative yield (eq. 21); predicted end point (triangle) is estimated time when stable yield is reached (x_F,Y_F). Vertical “T bars” are standard error (n = 5). [Colour online.]

The measured yields declined with time (Fig. 5a), which presumably reflects deteriorating soil fertility and health as the oxidizing compost loses its ability to supply nutrients and maintain a good biophysical environment. The data-estimated rate of yield decline (eq. 21, Fig. 5b) was too scattered to be useful, and it may reflect annual variation in weather. The fitted Weibull and Mitscherlich models predicted yield (Fig. 5a) and yield change (Fig. 5b) curves which were similar and convex in shape. The fitted Rayleigh model, in contrast, produced a sigmoid curve for yield (Fig. 5a) and a right-skewed bell curve for yield change (Fig. 5b), which was expected as this model’s shape parameter is fixed at c = 2 (eqs. 10 and 11; Figs. 1a, 1b). The R_A² and SER_N fit metrics (Table 6) and confidence limits for Y_i^M vs. Y_i^P (Table 3) were similar between Weibull and Mitscherlich, and better than those for Rayleigh. However, the Weibull fit was not significantly better than the others (F_W-M = 0.7576, F_W-R = 0.1946), and this, combined with the Weibull parsimony penalty, caused Mitscherlich to be most probable (P_M = 87.2%), with Rayleigh second (P_R = 10.1%), and Weibull third (P_W = 2.7%). The fitted Mitscherlich model indicated that unfertilized grain yield declined over a period of about 9.8 yr (x_D) and stabilized at about 7.5 t·ha⁻¹ (Y_F). The model also indicated a maximum rate of yield decline of 6.9 t·ha⁻¹·yr⁻¹ (R_M) during the first year after compost addition (x_M = 2012), and an average rate of yield decline of 0.75 t·ha⁻¹·yr⁻¹ (A_I) over the 9.8 yr yield stabilization period (x_D). Since c = 1 for Mitscherlich (Table 6), yield decline was predicted to be memoryless (i.e., likelihood of yield decline was constant and independent of time). As expected, the Mitscherlich rate of yield change (derivative) function (Fig. 5b) was strongly right skewed (S_F = 2) and substantially more tailed (K_F = 6) than a normal distribution (Table 2).

3.4. Nitrogen mineralization

Potential supply and release dynamics of crop-available nitrogen are often determined using laboratory incubations, wherein nitrogen release vs. time is measured for soil amended with fertilizers or organic materials (e.g., Yang et al. 2020). Nitrogen release curves can have a variety of shapes, but they are most often sigmoid or monotone, and thereby amenable to characterization using the Weibull–Mitscherlich–Rayleigh family of models. Figure 6 shows example mineralized nitrogen (N) release data (from Yang et al. 2020) and model fits for soil amended with crimson clover shoots. Because there were more than twice as many data points (11) as fitting parameters (4–5), Y₀, Y_F, x₀, and k were fitted for Mitscherlich and Rayleigh, and Y₀, Y_F, x₀, k, and c were fitted for Weibull. Tables 3 and 7 give parameter values and fit metrics.

Fig. 6.

Weibull, Mitscherlich, and Rayleigh functions fitted to laboratory-incubated nitrogen mineralization data: (a) inorganic nitrogen mineralized (N) vs. time (t); (b) N mineralization rate, dN/dt vs. t. Circles are measured cumulative inorganic N; diamonds are finite difference (FD) estimates of dN/dt based on measured cumulative N (eq. 21); predicted end point (triangle) is estimated time when net mineralization ceased (x_F,Y_F). Vertical “T bars” are standard error (n = 4). Data from Yang et al. (2020). [Colour online.]

All three models generated visually reasonable fits to cumulative N released vs. time data (Fig. 6a); however, only Weibull and Mitscherlich produced plausible estimates of mineralization rate (dN/dt) vs. time (Fig. 6b). Hence, Rayleigh was deemed non-competitive and not considered further. The Weibull fit was the most probable of the three models (P_W = 58.37%, P_M = 41.63%, P_R = 0.0%) (Table 7), and its 95% confidence limits for slope and intercept of Y_i^M vs. Y_i^P were narrowest (Table 3). The Weibull fit also had excellent metrics (R_A² = 0.9988, MBE_N = 5.92 × 10⁻⁸ %, SER_N = 1.49%) which were better than Mitscherlich (R_A² = 0.9979, MBE_N = 0.097%, SER_N = 1.96%) (Table 7). The two models may actually be equivalent, however, as Weibull’s probability, fit metrics, and confidence limits were only slightly better than Mitscherlich, and the partial F-test for Weibull vs. Mitscherlich was not significant (F_W-M = 0.0669). In addition, Weibull and Mitscherlich gave similar values for Y₀, Y_F, k, c, R_M, and A_I (Table 7), and both indicated that mineralization was effectively memoryless (i.e., c ≈ 1, Table 7), strongly right skewed (S_F = 1.8–2, Tables 2 and 7), and moderately to strongly leptokurtic (K_F = 4.5–6, Tables 2 and 7). Note also that although both models track N mineralization rate (dN/dt vs. t) equally well (Fig. 6b), Weibull indicates that dN/dt = 0 at t = 0 (since c = 1.09 > 1), whereas Mitscherlich indicates that dN/dt = R_M = 2.45% d⁻¹ at t = 0 (since c = 1) (Table 7). Hence, the better model of the two may actually be the one that gives the most physically plausible representation of N mineralization rate at zero and near-zero time, given that most of the other parameters and metrics were similar.
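The contrast between the two models at t = 0 follows from the shape constant alone. As a hedged illustration, the Python sketch below uses a textbook shifted Weibull form, Y = Y_F + (Y₀ − Y_F)·exp{−[k(x − x₀)]^c}. This form is an assumption and may differ from the paper's adapted equation in how k and x₀ enter; the function name and parameter values are hypothetical and are not those of Table 7.

```python
from math import exp

def weibull_rate(x, y0, yf, x0, k, c):
    """dY/dx for the ASSUMED form Y = yf + (y0 - yf) * exp(-(k*(x - x0))**c).

    Intended for x >= x0 and c >= 1 (c < 1 is singular at x = x0 in this form).
    """
    u = k * (x - x0)
    # At u = 0: u**(c - 1) is 0.0 when c > 1, and Python evaluates 0.0**0.0 as 1.0,
    # so c = 1 returns the finite initial rate (yf - y0) * k.
    return (yf - y0) * c * k * u ** (c - 1.0) * exp(-(u ** c))

# Hypothetical parameters, chosen only to show the behaviour at the origin.
rate_weibull_t0 = weibull_rate(0.0, y0=0.0, yf=100.0, x0=0.0, k=0.05, c=1.09)
rate_mitscherlich_t0 = weibull_rate(0.0, y0=0.0, yf=100.0, x0=0.0, k=0.05, c=1.0)
print(rate_weibull_t0, rate_mitscherlich_t0)
```

Under this assumed form, any c even slightly above 1 forces a zero initial rate, whereas c = 1 places the maximum rate at the origin, which is the distinction the paragraph above weighs on physical grounds.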

3.5. Soil organic carbon depth profile

Soil organic carbon content (SOC) is perhaps the single most important soil attribute, as it affects crop productivity, most soil properties, and most soil quality and health indicators (e.g., Gregorich et al. 1997). Accurate and detailed characterization of SOC depth profiles can, therefore, be critically important for analytical and numerical models describing carbon sequestration, water transmission and storage, solute transport, crop growth, and environmental impact. Figure 7 gives an example SOC profile and corresponding Weibull–Mitscherlich–Rayleigh fits for a clay loam soil under a long-term corn–oat–alfalfa–alfalfa rotation. Parameters Y₀ and x₀ were specified constants, whereas Y_F and k were fitted for Mitscherlich and Rayleigh, and Y_F, k, and c were fitted for Weibull. The corresponding parameter values and fit metrics are in Tables 3 and 8.

Fig. 7.

Weibull, Mitscherlich, and Rayleigh functions fitted to soil organic carbon versus depth data: (a) soil organic carbon content (SOC) vs. depth (z); (b) rate of SOC change, −d(SOC)/dz vs. z. Circles are measured SOC content; diamonds are finite difference (FD) estimates of −d(SOC)/dz based on measured SOC (eq. 21); predicted end point (triangle) is estimated depth where minimum SOC occurs (x_F,Y_F). Vertical “T bars” are standard error (n = 4). Data from Reynolds et al. (2014). [Colour online.]

The SOC data profile is clearly sigmoidal (Fig. 7a) with a bell-shaped derivative function (Fig. 7b) — hence, well fitted by the Weibull and Rayleigh models but poorly fitted by Mitscherlich. The model-data fit metrics were excellent for Weibull (R_A² = 0.9950, MBE_N = −0.30%, SER_N = 4.36%), whereas Rayleigh was a close second (R_A² = 0.9890, MBE_N = −1.17%, SER_N = 6.45%), and Mitscherlich was a distant third (R_A² = 0.7827, MBE_N = −2.44%, SER_N = 28.69%) (Table 8). However, the Weibull fit was not significantly better (F_W-M = 0.0685, F_W-R = 0.3171); and when parsimony is taken into account, Rayleigh was most probable by far (P_R = 96.41%), with Weibull a distant second (P_W = 3.53%), and Mitscherlich a very distant third (P_M = 0.06%) (Table 8). Although this clearly indicates that Rayleigh has the best balance of fit and parsimony, there may be other factors to consider. For example, Weibull not only produced better fit metrics than Rayleigh but also narrower confidence limits for slope and intercept of Y_i^M vs. Y_i^P (Table 3), and a better visual fit to the −d(SOC)/dz vs. z data (Fig. 7b). Hence, choosing the most appropriate model may require subjective reasoning (e.g., visual model-data fits) as well as objective criteria (fit metrics). In any case, the Rayleigh (most probable) model indicated a minimum SOC (Y_F) of 0.21 wt. % at 83 cm depth (x_F), maximum rate of SOC change of 0.08 wt. %·cm⁻¹ at 23 cm depth (x_M), and an average rate of SOC change (A_I) of 0.03 wt. %·cm⁻¹ over a 78.4 cm depth range (x_D) (Table 8). Since c = 2 for Rayleigh, its derivative function for SOC (Fig. 7b) was moderately right skewed (S_F = 0.6311) and slightly more tailed (K_F = 0.2451) than a normal distribution (Table 2).
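The S_F and K_F values quoted for the fixed-shape models (c = 1: S_F = 2, K_F = 6; c = 2: S_F = 0.6311, K_F = 0.2451) agree with the skewness and excess kurtosis of a textbook Weibull distribution, which depend only on c. The following is a minimal Python sketch under that assumption; the function name is hypothetical, and the paper's own expressions for S_F and K_F are not reproduced here.

```python
from math import gamma

def weibull_shape_metrics(c):
    """Return (skewness, excess kurtosis) of a textbook Weibull distribution with shape c."""
    # Raw moments of the unit-scale Weibull: E[X^n] = Gamma(1 + n/c).
    g1, g2, g3, g4 = (gamma(1.0 + n / c) for n in (1, 2, 3, 4))
    var = g2 - g1 ** 2
    skew = (g3 - 3.0 * g1 * g2 + 2.0 * g1 ** 3) / var ** 1.5
    kurt = (g4 - 4.0 * g1 * g3 + 6.0 * g1 ** 2 * g2 - 3.0 * g1 ** 4) / var ** 2 - 3.0
    return skew, kurt

for name, c in (("Mitscherlich", 1.0), ("Rayleigh", 2.0)):
    s_f, k_f = weibull_shape_metrics(c)
    print(f"{name}: c = {c}, S_F = {s_f:.4f}, K_F = {k_f:.4f}")
```

Because the scale constant cancels out of both ratios, the same function can be applied to any fitted c to see where a derivative curve would fall in the Table 2 categories.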

3.6. Soil water desorption curve

The soil water desorption curve, θ(h), describes the decrease in soil volumetric water content, θ [L³·L⁻³], with increasing soil water tension head, h [L]. It is a fundamental hydraulic property that underpins and regulates virtually all mechanistic descriptions of transmission and storage of water and gases in soils and other geologic porous media. Accurate representations of the θ(h) curve are consequently essential for valid and realistic characterizations of a wide range of agriculturally relevant processes such as infiltration, drainage, irrigation scheduling, tillage, trafficking, and depth profiles of root zone temperature, aeration, and moisture. Due to the extreme complexity of soil–water dynamics, θ(h) is typically represented by one of several empirical or semi-empirical equations; but to the authors’ knowledge, Weibull–Mitscherlich–Rayleigh expressions have never been applied. Figure 8 gives an example of Weibull, Mitscherlich, and Rayleigh model fits to a θ(h) curve from an agricultural sandy loam soil, with (x₀,Y₀) = (h_S,θ_S) = (0.1,0.52), where θ_S is measured water content at saturation (0.52 m³·m⁻³), and h_S is estimated tension head at saturation (0.1 cm). Fitting parameters included Y_F and k for Mitscherlich and Rayleigh, and Y_F, k, and c for Weibull. The resulting parameter values and fit metrics appear in Tables 3 and 9.

Fig. 8.

Weibull, Mitscherlich, and Rayleigh functions fitted to soil water desorption data: (a) water content (θ) vs. tension head (h); (b) rate of θ change, −dθ/dh vs. h. Circles are measured θ; diamonds are finite difference (FD) estimates of the differential water capacity relationship obtained using −Δθ/Δh vs. geometric mean h (eq. 21); predicted end point (triangle) is estimated tension head where minimum θ occurs (x_F,Y_F). Vertical “T bars” are standard error (n = 10). [Colour online.]

As is common for soil, the θ vs. h data produced a sigmoid, diminishing response curve (Fig. 8a, h on log₁₀ scale), whereas −dθ/dh vs. h produced a concave curve (Fig. 8b, both axes on log₁₀ scales). Although all three models could produce the same basic shapes as the data, it is clear that only Weibull was flexible enough (because of its variable c parameter) to achieve accurate and realistic model-data fits. Specifically, the Weibull line went through, or close to, every data point, whereas Mitscherlich and Rayleigh deviated systematically and substantially (Figs. 8a, 8b). Weibull consequently had much more favourable Y_i^M vs. Y_i^P confidence limits (Table 3) and fit metrics (Table 9) than Mitscherlich and Rayleigh; and as a result, P_W was 100% and both F_W-M and F_W-R were <0.001 (Table 9). Hence, Weibull was highly probable and produced accurate fits to the θ(h) and −dθ/dh data, whereas Mitscherlich and Rayleigh were both improbable and inaccurate. The Weibull fit indicated a minimum water content (Y_F) of 0.12 m³·m⁻³ at a tension head of 17 558 cm (x_F), a maximum desorption rate (R_M) of 2.23·cm⁻¹ at 0.1 cm tension head (x_M), and an integral average desorption rate (A_I) of 2.28 × 10⁻⁵·cm⁻¹ over a 17 557.9 cm desorption range (x_D) (Table 9). Since c = 0.4720 was obtained (Table 9), the desorption curve was extremely right skewed (S_F = 7.5) and extremely leptokurtic (K_F = 113.1) (Tables 2 and 9).
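The reported integral average desorption rate can be checked by simple arithmetic. Assuming A_I is the mean of the rate over the domain x_D, it reduces to the total change in Y divided by x_D for a monotonic curve. The short Python sketch below applies this to the rounded values quoted above; the function name is hypothetical.

```python
def integral_average_rate(y0, yf, x_d):
    """Mean magnitude of dY/dx over a domain of width x_d for a monotonic curve."""
    return abs(yf - y0) / x_d

# Rounded values from the text: theta_S = 0.52, Y_F = 0.12 (m3/m3), x_D = 17 557.9 cm.
a_i = integral_average_rate(y0=0.52, yf=0.12, x_d=17557.9)
print(f"{a_i:.3g}")  # prints 2.28e-05 (per cm)
```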

Given its apparent success, Weibull was compared with two of the most popular and versatile functions for describing the θ(h) relationship — the “van Genuchten” equations (van Genuchten 1980) and the “Groenevelt-Grant” equations (Groenevelt and Grant 2004). The empirical van Genuchten θ(h) and dθ/dh equations are given by (van Genuchten 1980)

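$$\theta(h) = \theta_R + (\theta_S - \theta_R)\left[1 + (\alpha h)^{n}\right]^{-m} \tag{22.1}$$

$$\frac{d\theta}{dh} = -\alpha\, m\, n\, (\theta_S - \theta_R)\, (\alpha h)^{n-1} \left[1 + (\alpha h)^{n}\right]^{-(m+1)} \tag{22.2}$$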

where α [L⁻¹] and n [—] are empirical curve-fitting parameters, m = 1 − (1/n) [—], θ_R [L³·L⁻³] is residual soil water content (treated as a fitting parameter), and θ_S [L³·L⁻³] is saturated soil water content (usually specified or measured). The empirical Groenevelt–Grant θ(h) and dθ/dh equations can be written as (Groenevelt and Grant 2004)

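$$\theta(h) = \theta_A + k_1\left\{\exp\left[-\left(\frac{k_0}{h_A}\right)^{p}\right] - \exp\left[-\left(\frac{k_0}{h}\right)^{p}\right]\right\} \tag{23.1}$$

$$\frac{d\theta}{dh} = -\frac{k_1\, p\, k_0^{p}}{h^{p+1}} \exp\left[-\left(\frac{k_0}{h}\right)^{p}\right] \tag{23.2}$$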

where k₀ [L], k₁ [L³·L⁻³], and p [—] are empirical curve-fitting parameters, and (h_A,θ_A) is a specified water content-tension head coordinate. Equations 22.1 and 23.1 were fitted to the θ(h) data in Fig. 8a using the protocol in Section 2.5.1., with θ_S = θ_A = 0.52 m³·m⁻³, and h_A = 0.1 cm. It should also be noted that eqs. 22.2 and 23.2 are traditionally referred to as the “differential water capacity” or “specific moisture capacity” relationship.
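As a hedged illustration of the van Genuchten relationships, the Python sketch below codes the standard van Genuchten (1980) θ(h) function with m = 1 − (1/n) and its analytic derivative, then compares the derivative with a finite-difference estimate, −Δθ/Δh, placed at the geometric mean of two tension heads (the approach described for the data-based estimates in Fig. 8b). The function names and parameter values are hypothetical and are not the fitted values of Table 10.

```python
from math import sqrt

def vg_theta(h, theta_r, theta_s, alpha, n):
    """van Genuchten (1980) water content at tension head h, with m = 1 - 1/n."""
    m = 1.0 - 1.0 / n
    return theta_r + (theta_s - theta_r) * (1.0 + (alpha * h) ** n) ** (-m)

def vg_capacity(h, theta_r, theta_s, alpha, n):
    """Analytic d(theta)/dh (negative, since theta decreases as h increases)."""
    m = 1.0 - 1.0 / n
    ah = alpha * h
    return (-alpha * m * n * (theta_s - theta_r)
            * ah ** (n - 1.0) * (1.0 + ah ** n) ** (-(m + 1.0)))

# Hypothetical parameter set for demonstration only.
params = dict(theta_r=0.10, theta_s=0.52, alpha=0.02, n=1.5)

# Finite-difference estimate between two heads, assigned to their geometric mean.
h1, h2 = 100.0, 101.0
h_gm = sqrt(h1 * h2)
fd_capacity = (vg_theta(h2, **params) - vg_theta(h1, **params)) / (h2 - h1)
analytic_capacity = vg_capacity(h_gm, **params)
print(fd_capacity, analytic_capacity)
```

For closely spaced heads the two estimates nearly coincide; with the wider head spacing of typical desorption measurements, the finite-difference points are only approximations to the true derivative.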

The van Genuchten and Groenevelt–Grant functions produced nearly identical visual fits to θ(h) and −dθ/dh (Figs. 9a, 9b), as well as similar fit metrics (Tables 3 and 10). The fits were not as good as Weibull, however, as their fit metrics and confidence limits for Y_i^M vs. Y_i^P were less favourable (Tables 3 and 10), and the models did not “track” the wet and dry ends of the −dθ/dh vs. h relationship as well as Weibull (Fig. 9b). As a result, Weibull was by far the most probable of the three models with P_W = 99.72%, whereas van Genuchten and Groenevelt–Grant were P_vG = 0.04% and P_G-G = 0.24%, respectively (Table 10). Better tracking of −dθ/dh vs. h by Weibull (at least for this example) is particularly interesting, as it implies potentially better representations of water transmission and storage processes than the other two models, as well as more accurate estimates of the differential water capacity term (dθ/dh) in the physically based Richards equation for soil moisture flow (Richards 1931).

Fig. 9.

Comparison of Weibull, van Genuchten, and Groenevelt–Grant fits to soil water desorption data: (a) water content (θ) vs. tension head (h) (eqs. 4, 22.1, 23.1); (b) rate of θ change, −dθ/dh vs. h (eqs. 1, 22.2, 23.2). All three models were anchored to maximum water content and pressure head, i.e., (θ,h) = (0.52,0.1); Weibull fitting parameters were Y_F, k, and c; van Genuchten fitting parameters were θ_R, α, and n, with m = 1 − (1/n); Groenevelt–Grant fitting parameters were k₀, k₁, and p. Circles are measured θ; diamonds are finite difference (FD) estimates of the differential water capacity relationship obtained using −Δθ/Δh vs. geometric mean h (eq. 21). Vertical “T bars” are standard error (n = 10). [Colour online.]

4. Discussion

Although the above agricultural applications are limited in scope, they nonetheless provide a good demonstration of the potential and utility of the adapted Weibull–Mitscherlich–Rayleigh family of functions. In every case, at least one of the three models provided a highly plausible model-data fit (Figs. 3–9) with excellent fit metrics (R_A² ≥ 0.9266, |MBE_N| ≤ 1.49%, SER_N ≤ 8.08%; Tables 4–10), and 95% confidence intervals that were narrow and near-symmetric about the Y_i^M vs. Y_i^P regression line (Table 3). In addition, the partial F test (F_P) showed when Weibull produced a significantly better fit (P < 0.05) than the simpler Mitscherlich and Rayleigh functions; and the normalized suitability metric (P_X) provided clear probability rankings among the fitted models (Tables 4–10). However, as has been noted by others (e.g., Archontoulis and Miguez 2015), there is currently no single metric (or suite of metrics) for definitive determination of the “best fitting”, “most appropriate” or “most probable” model. It also needs to be remembered that P_X is a relative measure only, and therefore, cannot determine the best model in an absolute sense. In other words, some other model (e.g., Gompertz, Logistic, Gumbel, etc.) may provide better representations of the datasets than the Weibull–Mitscherlich–Rayleigh family. At present, estimating the best absolute representation of a dataset requires fitting a wide range of diverse (but plausible) models, and then applying the P_X metric as above (e.g., Banks and Joyner 2017).
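The relative nature of P_X can be illustrated with a generic normalized model-probability calculation. The Python sketch below computes small-sample corrected Akaike information criterion (AICc) values for least-squares fits and converts them to Akaike weights that sum to one. This is an assumed stand-in for illustration: the paper's own P_X formula is defined in its methods section and may differ, and the function names and sums of squared errors below are hypothetical.

```python
from math import exp, log

def aicc_least_squares(sse, n_obs, n_params):
    """AICc for a least-squares fit.

    n_params is the number of fitted model parameters; one more is added
    for the estimated error variance.
    """
    k = n_params + 1
    aic = n_obs * log(sse / n_obs) + 2 * k
    return aic + (2 * k * (k + 1)) / (n_obs - k - 1)

def akaike_weights(scores):
    """Convert information-criterion scores into relative probabilities summing to 1."""
    best = min(scores.values())
    rel = {name: exp(-0.5 * (s - best)) for name, s in scores.items()}
    total = sum(rel.values())
    return {name: r / total for name, r in rel.items()}

# Hypothetical (SSE, number of fitted parameters) for three candidate models.
fits = {"Weibull": (1.2, 5), "Mitscherlich": (9.5, 4), "Rayleigh": (2.0, 4)}
n_obs = 21
scores = {name: aicc_least_squares(sse, n_obs, p) for name, (sse, p) in fits.items()}
weights = akaike_weights(scores)
for name, w in weights.items():
    print(f"{name}: {100.0 * w:.2f}%")
```

Because the weights are normalized over the candidate set, adding or removing a model changes every probability, which is why such a metric ranks only the models actually fitted.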

Table 4.

Selected parameter values and associated metrics for Weibull, Mitscherlich, and Rayleigh models fitted to corn emergence data (Figs. 3a, 3b).

Table 5.

Selected parameter values and associated metrics for Weibull, Mitscherlich, and Rayleigh models fitted to nitrous oxide emissions from a clay loam soil (Figs. 4a, 4b).

Table 6.

Parameter values and associated metrics for Weibull, Mitscherlich, and Rayleigh models fitted to decline in grain yield of unfertilized monoculture corn after a one-time application of compost (Figs. 5a, 5b).

Table 7.

Parameter values and associated metrics for Weibull, Mitscherlich, and Rayleigh models fitted to laboratory-incubated nitrogen mineralization data (Figs. 6a, 6b).

Table 8.

Parameter values and associated metrics for Weibull, Mitscherlich, and Rayleigh models fitted to soil organic carbon content (SOC) profile data (Figs. 7a, 7b).

Table 9.

Parameter values and associated metrics for Weibull, Mitscherlich, and Rayleigh models fitted to soil water desorption curve, θ(h), data (Figs. 8a, 8b).

Table 10.

Parameter values and associated metrics for the Weibull, van Genuchten, and Groenevelt–Grant models fitted to soil water desorption curve, θ(h), data (Figs. 9a, 9b).

Of the six example datasets, Mitscherlich was deemed most suitable for change in corn grain yield (Fig. 5, P_M = 87.2%), Rayleigh was best for SOC distribution (Fig. 7, P_R = 96.4%), and Weibull was best for corn seedling emergence (Fig. 3, P_W = 100%), nitrous oxide emissions (Fig. 4, P_W = 100%), nitrogen mineralization (Fig. 6, P_W = 58.4%), and soil water desorption (Fig. 8, P_W = 100% and Fig. 9, P_W = 99.7%). The greater success rate for Weibull is perhaps not surprising, as it has one more fitting parameter (c) and thereby much greater flexibility. However, goodness of fit and parsimony metrics may not necessarily be the only criteria for deciding model suitability. Perhaps consideration should also be given to how well the fitted model tracks the data derivative function (dY/dx), or to the model that yields more physically realistic end points for the derivative function. For example, the derivative function for SOC distribution (Fig. 7b) was not tracked as well by the best-fit Rayleigh model (P_R = 96.4%) as it was by the second-best Weibull model (P_W = 3.5%). Also, zero initial nitrogen mineralization rate indicated by the best-fit Weibull model (P_W = 58.4%, Fig. 6b) may not be as physically realistic as maximum initial mineralization rate indicated by second-best Mitscherlich (P_M = 41.6%, Fig. 6b). It may even be appropriate in some cases to select a model primarily on its ability to track the data derivative function, or to provide physically plausible derivative function end points, with goodness of fit and parsimony metrics reduced to secondary consideration. This possibility warrants further investigation.

The results of this study suggest that the time/space/quantity behaviour of a wide range of agricultural processes is amenable to characterization and parameterization using the Weibull–Mitscherlich–Rayleigh family of functions. Some likely (or already partially demonstrated) candidates include seed germination and seedling emergence versus time or thermal time (e.g., Fig. 3); greenhouse gas production and emissions from soil versus time, soil moisture, or fertilizer source/rate/timing/placement (e.g., Fig. 4); crop yield and biomass production versus time, soil fertility, rainfall, temperature, or soil biophysical condition (e.g., Fig. 5); nitrogen mineralization dynamics versus time, temperature, or substrate composition (e.g., Fig. 6); SOC sequestration versus time, management practice, or depth (e.g., Fig. 7); transmission and storage of soil air and water (e.g., Fig. 8); and soil microbial population dynamics versus time or depth below surface. Further investigation of the usability of Weibull–Mitscherlich–Rayleigh functions for characterizing agricultural data and processes appears well justified.

5. Conclusions

The Weibull–Mitscherlich–Rayleigh family of functions was adapted to increase their utility for characterizing agricultural processes (Figs. 1 and 2; Tables 1 and 2). Using non-linear least-squares curve fitting, goodness of model-data fit metrics, and model selection metrics, it was demonstrated that the adapted functions were capable of producing accurate and unbiased fits to a wide range of agricultural data including corn seedling emergence, nitrous oxide emissions, change in corn grain yield, nitrogen mineralization, SOC distribution, and soil water desorption (Figs. 3–9). The fitted functions in turn provided several parameters (e.g., Y₀, Y_F, x₀, k, and c) and metrics (e.g., x_F, x_D, x_M, R_M, A_I, S_F, and K_F) that were useful or potentially useful for characterizing agricultural data (Appendix A, Tables 1 and 2). It was therefore concluded that the adapted Weibull–Mitscherlich–Rayleigh family of functions has substantial potential for informative application to a wide range of agricultural data and processes.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as an actual or potential conflict of interest.

Author Contributions

Author D.R. conceived the research and conducted the mathematical and computer analyses, while authors D.R., C.D., X.Y., and I.A. supervised data collection. All authors contributed to data interpretation and writing of the manuscript.

Funding

Funding for this work was provided by the Science and Technology Branch of Agriculture and Agri-Food Canada, A-Base Study 2380.

Acknowledgements

We gratefully acknowledge W. Calder, J. Gignac, J. Huffman, M. Reeb, and various summer students for data collection; Essex-Windsor Solid Waste Authority, Windsor, Ontario for provision of yard waste compost (Section 3.3.); and the AAFC-Harrow farm crew for ongoing operation and maintenance of the field sites.

References

1.

Abernethy, R.B. 2010. The new Weibull handbook, 5th ed. Robert B. Abernethy, North Palm Beach, FL, USA.

2.

Aboutalebian, M.A., Nazari, S., and Gonzales-Andujar, J.L. 2017. Evaluation of a model for predicting Avena fatua and Descurainia sophia seed emergence in winter rapeseed. Span. J. Agric. Res. 15(2): 1–7. https://doi.org/10.5424/sjar/2017152-10572.

3.

Agomoh, I.V., Drury, C.F., Reynolds, W.D., Woodley, A., Yang, X., Phillips, L.A., and Rehmann, L. 2021. Stover harvest and tillage effects on corn seedling emergence. Agronomy J. 1–9. https://doi.org/10.1002/agj2.20738.

4.

Archontoulis, S.V., and Miguez, F.E. 2015. Nonlinear regression models and applications in agricultural research. Agron. J. 107: 786–798. https://doi.org/10.2134/agronj2012.0506.

5.

Banks, H.T., and Joyner, M.L. 2017. AIC under the framework of least squares estimation. Appl. Math. Lett. 74: 33–45. https://doi.org/10.1016/j.aml.2017.05.005.

6.

Beauchamp, E.G., Reynolds, W.D., Brasche-Villeneuve, D., and Kirby, K. 1986. Nitrogen mineralization kinetics with different soil pretreatments and cropping histories. Soil Sci. Soc. Am. J. 50: 1478–1483.

7.

Bennett, A.J., Bending, G.D., Chandler, D., Hilton, S., and Mills, P. 2012. Meeting the demand for crop production: the challenge of yield decline in crops grown in short rotations. Biol. Rev. 87: 52–71. https://doi.org/10.1111/j.1469-185x.2011.00184.x.

8.

Brown, R.F., and Mayer, D.G. 1988. Representing cumulative germination. 2. The use of the Weibull function and other empirically derived curves. Ann. Bot. 61: 127–138.

9.

Chu, P.C. 2013. Weibull statistics in ocean analysis and prediction. OCEANS-San Diego, San Diego, CA, USA. pp. 1–4. https://doi.org/10.23919/oceans.2013.6741376.

10.

Chu, Y.-K., and Ke, J.-C. 2012. Approaches for parameter estimation of Weibull distribution. Math. Comput. Appl. 17(1): 39–47.

11.

Cousineau, D. 2009. Fitting the three-parameter Weibull distribution: review and evaluation of existing and new methods. IEEE Trans. Dielectr. Electr. Insul. 16(1): 281–288. https://doi.org/10.1109/tdei.2009.4784578.

12.

Cousineau, D. 2011. The fallacy of large shape parameters when using the two-parameter Weibull distribution. IEEE Trans. Dielectr. Electr. Insul. 18(6): 2095–2102.

13.

Drury, C.F., Reynolds, W.D., Yang, X.M., McLaughlin, N.B., Calder, W.C., and Phillips, L.A. 2021. Diverse rotations impact microbial processes, seasonality and overall N₂O emissions from soils. Soil Sci. Soc. Am. J. https://doi.org/10.1002/saj2.20298.

14.

Evans, J.W., Kretschmann, D.E., and Green, D.W. 2019. Procedures for estimation of Weibull parameters. General Technical Report FPL-GTR-264. U.S. Department of Agriculture, Forest Service, Forest Products Laboratory, Madison, WI, USA. 17 p.

15.

Folk, R.L. 1980. Petrology of sedimentary rocks. Hemphill Publishing Co., Austin, TX, USA.

16.

Forcella, F., Benech Arnold, R.L., Sanchez, R., and Ghersa, C.M. 2000. Modeling seedling emergence. Field Crops Res. 67: 123–139.

17.

Gan, Y., Stobbe, E.H., and Njue, C. 1996. Evaluation of selected nonlinear regression models in quantifying seedling emergence rate of spring wheat. Crop Sci. 36: 165–168.

18.

Gardarin, A., Durr, C., and Colbach, N. 2011. Prediction of germination rates of weed species: Relationships between germination speed parameters and species traits. Ecol. Model. 222: 626–636. https://doi.org/10.1016/j.ecolmodel.2010.10.005.

19.

Gill, G.S., Cousens, R.D., and Allan, M.R. 1996. Germination, growth and development of herbicide resistant and susceptible populations of rigid ryegrass (Lolium rigidum). Weed Sci. 44: 252–256.

20.

Gillis, J.D., and Price, G.W. 2011. Comparison of a novel model to three conventional models describing carbon mineralization from soil amended with organic residues. Geoderma, 160: 304–310. https://doi.org/10.1016/j.geoderma.2010.09.025.

21.

Glen, S. 2020a. Memoryless property. From StatisticsHowTo.com https://www.statisticshowto.com/memoryless-property/ [accessed 5 Sept. 2020].

22.

Glen, S. 2020b. Rayleigh distribution: definition, uses, mean, variance. From StatisticsHowTo.com https://www.statisticshowto.com/rayleigh-distribution/ [accessed 5 Sept. 2020].

23.

Glen, S. 2020c. Expected value in statistics: definition and calculating it. From StatisticsHowTo.com https://www.statisticshowto.com/probability-and-statistics/expected-value/ [accessed 5 Sept. 2020].

24.

Gregorich, E.G., Carter, M.R., Doran, J.W., Pankhurst, C.E., and Dwyer, L.M. 1997. Biological attributes of soil quality. Pages 81–114 in E.G. Gregorich and M.R. Carter, eds. Soil quality for crop production and ecosystem health. Developments in soil science. Vol. 25. Elsevier, New York, NY, USA.

25.

Groenevelt, P.H., and Grant, C.D. 2004. A new model for the soil-water retention curve that solves the problem of residual water contents. Eur. J. Soil Sci. 55: 479–485. https://doi.org/10.1111/j.1365-2389.2004.00617.x.

26.

Haj Seyed Hadi, M.R., and Gonzales-Andujar, J.L. 2009. Comparison of fitting weed seedling emergence models with nonlinear regression and genetic algorithm. Comput. Electr. Agric. 65: 19–25. https://doi.org/10.1016/j.compag.2008.07.005.

27.

Harmsen, K. 2000. A modified Mitscherlich equation for rainfed crop production in semi-arid areas: 1. Theory. Neth. J. Agric. Sci. 48: 237–250.

28.

Hoben, J.P., Gehl, R.J., Millar, N., Grace, P.R., and Robertson, G.P. 2011. Nonlinear nitrous oxide (N₂O) response to nitrogen fertilizer in on-farm corn crops of the US Midwest. Global Change Biol. 17: 1140–1152. https://doi.org/10.1111/j.1365-2486.2010.02349.x.

29.

Horton, R.E. 1940. An approach towards a physical interpretation of infiltration capacity. Proc. Soil Sci. Soc. Am. 5: 399–417.

30.

Izquierdo, J., Bastida, F., Lezaun, J.M., Sánchez del Arco, M.J., and Gonzales-Andujar, J.L. 2013. Development and evaluation of a model for predicting Lolium rigidum emergence in winter cereal crops in the Mediterranean area. Weed Res. 53: 269–278. https://doi.org/10.1111/wre.12023.

31.

Jamieson, P.D., Porter, J.R., and Wilson, D.R. 1991. A test of the computer simulation model ARCWHEAT1 on wheat crops grown in New Zealand. Field Crops Res. 27: 337–350.

32.

Johnson, J.B., and Omland, K.S. 2004. Model selection in ecology and evolution (review). Trends Ecol. Evol. 19(2): 101–108. https://doi.org/10.1016/j.tree.2003.10.013.

33.

Karadavut, U., and Tozluca, A. 2005. Growth analysis some characters in rye (Secale cereale L.): growth of root and upper ground parts. J. Crop Res. 2: 1–10.

34.

Karadavut, U., Kokten, K., and Kavurmaci, Z. 2010. Comparison of relative growth rates in silage corn cultivars. Asian J. Anim. Vet. Adv. 5(3): 223–228.

35.

Lai, C.-D., and Xie, M. 2006. Chapter 5: Weibull related distributions. Pages 139–166 in Stochastic ageing and dependence for reliability. Springer, New York, NY, USA. https://doi.org/10.1007/0-387-34232-x5.

36.

Mahanta, D.J., and Borah, M. 2014. Parameter estimation of Weibull growth models in forestry. Int. J. Math. Technol. 8(3): 157–163.

37.

Monahan, A.H. 2006. The probability distribution of sea surface wind speeds. Part 1: theory and sea winds observations. J. Clim. 19: 497–520.

38.

National Institute of Standards and Technology/SEMATECH. 2013. e-Handbook of statistical methods. http://www.itl.nist.gov/div898/handbook/.

39.

Navarro, M., Febles, G., Torres, V., Mesa, A.R., and Jay, Y.O. 2013. Use of the Weibull function to evaluate the emergence of Albizia lebbeck (L.) Benth seedlings. Pastos y Forrajes, 36(2): 222–226.

40.

NCSS, Weibull. 2020. NCSS statistical software, Weibull module. NCSS, LLC, Kaysville, UT, USA.

41.

Piñeiro, G., Perelman, S., Guerschman, J.P., and Paruelo, J.M. 2008. How to evaluate models: observed vs. predicted or predicted vs. observed? Ecol. Model. 216: 316–322. https://doi.org/10.1016/j.ecolmodel.2008.05.006.

42.

Quinn, J.B., and Quinn, G.D. 2010. A practical and systematic review of Weibull statistics for reporting strengths of dental materials. Dental Mater. 26: 135–147. https://doi.org/10.1016/j.dental.2009.09.006.

43.

Reliability Workbench, Weibull Analysis. 2020. Isograph Inc., Alpine, UT, USA.

44.

ReliaSoft, Weibull 7.0. 2020. HBM Prenscia Inc., Southfield, MI, USA.

45.

Relyence, Weibull. 2020. Relyence Corporation, Greensburg, PA, USA.

46.

Reynolds, W.D., Drury, C.F., Yang, X.M., Tan, C.S., and Yang, J.Y. 2014. Impacts of 48 years of consistent cropping, fertilization and land management on the physical quality of a clay loam soil. Can. J. Soil Sci. 94: 403–419. https://doi.org/10.4141/cjss2013-097.

47.

Reynolds, W.D., Nurse, R.E., Phillips, L.A., Drury, C.F., Yang, X.M., and Page, E.R. 2020. Characterizing mass–volume–density–porosity relationships in a sandy loam soil amended with compost. Can. J. Soil Sci. 100: 289–301. https://doi.org/10.1139/cjss-2019-0149.

48.

Richards, L.A. 1931. Capillary conduction of liquids through porous mediums. J. Appl. Phys. 1: 318–333. https://doi.org/10.1063/1.1745010.

49.

Schjønning, P., McBride, R.A., Keller, T., and Obour, P.B. 2017. Predicting soil particle density from clay and soil organic matter contents. Geoderma, 286: 83–87. https://doi.org/10.1016/j.geoderma.2016.10.020.

50.

Signor, D., and Cerri, C.E.P. 2013. Nitrous oxide emissions in agricultural soils: a review. Pesq. Agropec. Trop. Goiânia, 43: 322–338.

51.

Srivastava, S., Subba Rao, A., Alivelu, K., Singh, K.N., Raju, N.S., and Rathore, A. 2006. Evaluation of crop responses to applied fertilizer phosphorus and derivation of optimum recommendations using the Mitscherlich–Bray equation. Comm. Soil Sci. Plant Anal. 37(05-06): 847–858. https://doi.org/10.1080/00103620600564182.

52.

Vandekerckhove, J., Matzke, D., and Wagenmakers, E.-J. 2015. Model comparison and the principle of parsimony. Pages 300–318 in J.R. Busemeyer, Z. Wang, J.T. Townsend, and A. Eidels, eds. The Oxford handbook of computational and mathematical psychology. Oxford University Press, New York, NY, USA.

53.

van Genuchten, M.Th. 1980. A closed-form equation for predicting the hydraulic conductivity of unsaturated soils. Soil Sci. Soc. Am. J. 44: 892–898.

54.

Weibull, W. 1951. A statistical distribution function of wide applicability. ASME J. Appl. Mech. 18: 293–297.

55.

Weisstein, E.W. 2020. Rayleigh distribution. From MathWorld, a Wolfram web resource. https://mathworld.wolfram.com/RayleighDistribution.html [accessed 5 Sept. 2020].

56.

Weisstein, E.W. 2004. Exponential distribution. From MathWorld, a Wolfram web resource. https://mathworld.wolfram.com/ExponentialDistribution.html [accessed 5 Sept. 2020].

57.

Yang, X.M., Drury, C.F., Reynolds, W.D., and Phillips, L.A. 2020. Nitrogen release from shoots and roots of crimson clover, hairy vetch, and red clover. Can. J. Soil Sci. 100: 179–188. https://doi.org/10.1139/cjss-2019-0164.

58.

Zok, F.W. 2017. On weakest link theory and Weibull statistics. J. Am. Ceram. Soc. 100: 1265–1268. https://doi.org/10.1111/jace.14665.

Appendices

Appendix A

Potentially useful metrics of the adapted Weibull derivative and cumulative functions are derived below, illustrated in Fig. 2, and summarized in Table 1.

A1. Weibull derivative function: mode, median, skewness, kurtosis, and integral average

The derivative function mode gives the coordinates and value of the derivative function peak (Fig. 2a). The mode is obtained by setting x equal to the modal x-value and the slope of the derivative function (d²Y/dx²) equal to zero:

(A1)

then solving for the modal x-value:

(A2)

and then back-substituting the modal x-value into eqs. 1 and 4 to produce

(A3)

where

$$Y_M = Y_F - (Y_F - Y_0)\exp\left(-\frac{c-1}{c}\right) \tag{A4}$$

Interestingly, Y_M (eq. A4) is independent of x and k. Note also that when c = 1, the mode occurs at x = x₀; and when 0 < c < 1, |dY/dx| → ∞ as x → x₀ (see e.g., Fig. 1a).
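The mode derivation can be checked numerically. The following is a minimal Python sketch, not the authors' implementation. It assumes that eq. 4 has the form Y(x) = Y_F − (Y_F − Y₀) exp{−[(x − x₀)/k]^c}, with k acting as a divisor; eqs. 1 and 4 are not reproduced in this appendix, so if k multiplies (x − x₀) in eq. 4, pass 1/k instead. The function names and parameter values are hypothetical.

```python
import math

def weibull_derivative(x, Y0, YF, x0, k, c):
    """dY/dx for the ASSUMED form Y = YF - (YF - Y0) * exp(-((x - x0) / k) ** c)."""
    z = (x - x0) / k
    return (YF - Y0) * (c / k) * z ** (c - 1.0) * math.exp(-(z ** c))

def weibull_mode(Y0, YF, x0, k, c):
    """Return (x_M, Y_M) at the peak of the derivative function."""
    if c <= 1.0:
        # No interior peak: the derivative is largest (c = 1) or
        # unbounded (c < 1) as x approaches x0, where Y = Y0.
        return x0, Y0
    z_c = (c - 1.0) / c              # value of ((x_M - x0) / k) ** c at the peak
    x_m = x0 + k * z_c ** (1.0 / c)
    y_m = YF - (YF - Y0) * math.exp(-z_c)
    return x_m, y_m

# Illustrative (made-up) parameters.
x_m, y_m = weibull_mode(Y0=0.0, YF=10.0, x0=2.0, k=5.0, c=3.0)
peak = weibull_derivative(x_m, 0.0, 10.0, 2.0, 5.0, 3.0)
```

Changing x0 or k in this sketch shifts x_m but leaves y_m unchanged, because y_m depends only on Y0, YF, and c.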

The median bisects the area under the derivative function curve into two equal halves (Fig. 2a), and is obtained by setting x equal to the median x-value, then equating a normalized form of eq. 4 to the 50th percentile, i.e.,

(A5)

then solving for the median x-value:

(A6)

and then back-substituting the median x-value into eq. 1 to obtain:

(A7)

where

(A8)

As might be expected, the median is less than the mode when the derivative function is left skewed, equal to the mode when the derivative function is symmetrical, and greater than the mode when the derivative function is right skewed (Fig. 2a).
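The relative positions of the median and mode can be illustrated with a short Python sketch. It is a hedged example only: it assumes the normalized cumulative function is 1 − exp{−[(x − x₀)/k]^c} (k as a divisor, which is an assumption because eq. 4 is not shown here), and the function names and c values are hypothetical.

```python
import math

def mode_x(x0, k, c):
    """Modal x for the ASSUMED form; the peak sits at x0 when c <= 1."""
    if c <= 1.0:
        return x0
    return x0 + k * ((c - 1.0) / c) ** (1.0 / c)

def median_x(x0, k, c):
    """x at which the normalized cumulative function equals 0.5."""
    return x0 + k * math.log(2.0) ** (1.0 / c)

# Shape constant at which median and mode coincide in this sketch (about 3.26).
c_cross = 1.0 / (1.0 - math.log(2.0))

# Illustrative shape constants: right skewed, crossover, left skewed.
for c in (1.5, c_cross, 5.0):
    print(f"c={c:.3f}  mode={mode_x(0.0, 1.0, c):.4f}  median={median_x(0.0, 1.0, c):.4f}")
```

Under these assumptions the median exceeds the mode for c below the crossover value and falls below it for larger c. The crossover, c = 1/(1 − ln 2), is near the shape value at which Weibull skewness passes through zero.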

The Fisher population skewness (S_F) and kurtosis (K_F) of the derivative function are given by (e.g., Chu 2013):

$$S_F = \frac{\Gamma\left(1+\frac{3}{c}\right) - 3\,\Gamma\left(1+\frac{1}{c}\right)\Gamma\left(1+\frac{2}{c}\right) + 2\,\Gamma^{3}\left(1+\frac{1}{c}\right)}{\left[\Gamma\left(1+\frac{2}{c}\right) - \Gamma^{2}\left(1+\frac{1}{c}\right)\right]^{3/2}} \tag{A9}$$

and

$$K_F = \frac{\Gamma\left(1+\frac{4}{c}\right) - 4\,\Gamma\left(1+\frac{1}{c}\right)\Gamma\left(1+\frac{3}{c}\right) + 6\,\Gamma^{2}\left(1+\frac{1}{c}\right)\Gamma\left(1+\frac{2}{c}\right) - 3\,\Gamma^{4}\left(1+\frac{1}{c}\right)}{\left[\Gamma\left(1+\frac{2}{c}\right) - \Gamma^{2}\left(1+\frac{1}{c}\right)\right]^{2}} - 3 \tag{A10}$$

where Γ is the Gamma function (available as a built-in function in most computer spreadsheets; given as GAMMA(x) in Excel®). The S_F is a measure of function symmetry. Negative skewness indicates left skew (excess of small x-values relative to a normal distribution), positive skewness indicates right skew (excess of large x-values relative to a normal distribution), and zero skewness indicates a symmetrical (normal) distribution (National Institute of Standards and Technology 2013). The K_F, in contrast, is a measure of the degree of function tailing relative to a normal distribution. Positive kurtosis (leptokurtic derivative function) indicates more/heavier tailing (more small and large “outlier” x-values) than a normal distribution, and negative kurtosis (platykurtic derivative function) indicates less/lighter tailing (fewer small and large outlier x-values) than a normal distribution (National Institute of Standards and Technology 2013). The kurtosis of a normal distribution is zero when K_F is defined as in eq. A10. Table 2 gives proposed S_F and K_F categories based on the inclusive graphic classifications of Folk (1980). As an example, the derivative function in Fig. 2a is moderately right skewed (S_F = 0.9620) and slightly leptokurtic (K_F = 1.0440), indicating that it has both a moderate excess of large x-values and slightly more outlier x-values than the derivative function of a normal distribution.
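Skewness and kurtosis of a Weibull-shaped derivative function depend only on the shape constant c, so they are easy to compute outside a spreadsheet. The sketch below uses the standard Weibull moment expressions in terms of the Gamma function, with kurtosis reported as excess kurtosis (zero for a normal distribution). The function name is hypothetical, and this is an illustration rather than the authors' code.

```python
import math

def weibull_skew_kurt(c):
    """Return (skewness, excess kurtosis) of a Weibull density with shape c > 0."""
    g1, g2, g3, g4 = (math.gamma(1.0 + i / c) for i in (1, 2, 3, 4))
    var = g2 - g1 ** 2                                   # variance for unit scale
    skew = (g3 - 3.0 * g1 * g2 + 2.0 * g1 ** 3) / var ** 1.5
    kurt = (g4 - 4.0 * g1 * g3 + 6.0 * g1 ** 2 * g2 - 3.0 * g1 ** 4) / var ** 2 - 3.0
    return skew, kurt

# Nested special cases named in the abstract: c = 1 (Mitscherlich) and c = 2 (Rayleigh).
skew_m, kurt_m = weibull_skew_kurt(1.0)
skew_r, kurt_r = weibull_skew_kurt(2.0)
```

For c = 1 the derivative function is a simple exponential decay, for which the skewness is 2 and the excess kurtosis is 6; the sketch reproduces these values.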

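The integral average of the Weibull derivative function is obtained using the fundamental theorem of integral calculus, i.e.,

$$\left(\frac{dY}{dx}\right)_{\mathrm{avg}} = \frac{1}{x_F - x_0}\int_{x_0}^{x_F}\frac{dY}{dx}\,dx \tag{A11}$$

which readily simplifies to

$$\left(\frac{dY}{dx}\right)_{\mathrm{avg}} = \frac{Y_F - Y_0}{x_F - x_0} \tag{A12}$$

since:

$$\int_{x_0}^{x_F}\frac{dY}{dx}\,dx = Y(x_F) - Y(x_0) = Y_F - Y_0 \tag{A13}$$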

The integral average of the derivative function describes a rectangle of length x_F − x₀ and height equal to the integral average, which has the same area as the area under the derivative function curve between x₀ and x_F (Fig. 2a). The integral average can consequently be viewed as a mean rate of change of Y with respect to x (i.e., mean dY/dx) between x₀ and x_F.
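The equal-area interpretation can be verified numerically. The following Python sketch compares the closed-form integral average (change in Y divided by interval length) with a midpoint-rule average of dY/dx. It assumes the form Y(x) = Y_F − (Y_F − Y₀) exp{−[(x − x₀)/k]^c} with k as a divisor, which is an assumption because eqs. 1 and 4 are not reproduced here; all names and values are hypothetical.

```python
import math

def Y(x, Y0, YF, x0, k, c):
    """ASSUMED cumulative function."""
    return YF - (YF - Y0) * math.exp(-(((x - x0) / k) ** c))

def dYdx(x, Y0, YF, x0, k, c):
    """Derivative of the assumed cumulative function."""
    z = (x - x0) / k
    return (YF - Y0) * (c / k) * z ** (c - 1.0) * math.exp(-(z ** c))

def integral_average(x_lo, x_hi, Y0, YF, x0, k, c):
    """Closed form: change in Y divided by the interval length."""
    return (Y(x_hi, Y0, YF, x0, k, c) - Y(x_lo, Y0, YF, x0, k, c)) / (x_hi - x_lo)

def integral_average_numeric(x_lo, x_hi, Y0, YF, x0, k, c, n=20000):
    """Midpoint-rule average of dY/dx over [x_lo, x_hi]."""
    h = (x_hi - x_lo) / n
    total = sum(dYdx(x_lo + (i + 0.5) * h, Y0, YF, x0, k, c) for i in range(n))
    return total * h / (x_hi - x_lo)

params = dict(Y0=1.0, YF=9.0, x0=0.0, k=4.0, c=2.0)  # illustrative values
x_f = 12.0                                            # illustrative end point
closed = integral_average(0.0, x_f, **params)
numeric = integral_average_numeric(0.0, x_f, **params)
```

The midpoint rule is used so that the derivative is never evaluated exactly at x₀, where it is unbounded when c < 1.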

A2. Weibull cumulative function: mean, inflection, quantiles, and domain

The mean is given by (Lai and Xie 2006; NCSS, Weibull 2020):

(A14)

(A15)

where Γ is the Gamma function (Fig. 2b). The mean x-value is an average x of the cumulative function weighted according to the corresponding derivative function (e.g., Glen 2020c), i.e.,

(A16)

The mean x-value lies to the left of the modal x-value for negative-skewed (left-tailed) derivative functions and to the right of the modal x-value for positive-skewed (right-tailed) derivative functions (Fig. 2a). Note also that although the mean x-value can be back-substituted into the derivative function (eq. 1), the resulting value applies only at the mean x-value and is, therefore, not the same as the integral average of the derivative function (Fig. 2a).
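The weighted-average interpretation of the mean can be illustrated as follows. This Python sketch assumes the derivative function is proportional to (c/k) z^(c−1) exp(−z^c) with z = (x − x₀)/k (k as a divisor, an assumption because eq. 1 is not reproduced here), for which the standard Weibull result gives a mean x of x₀ + k Γ(1 + 1/c). Function names and parameter values are hypothetical.

```python
import math

def mean_x(x0, k, c):
    """Closed-form mean x for the ASSUMED parametrization."""
    return x0 + k * math.gamma(1.0 + 1.0 / c)

def mean_x_numeric(x0, k, c, n=100000):
    """Derivative-weighted average x, evaluated with the midpoint rule."""
    z_max = (-math.log(1e-12)) ** (1.0 / c)   # truncate the negligible upper tail
    h = z_max / n
    num = den = 0.0
    for i in range(n):
        z = (i + 0.5) * h
        w = c * z ** (c - 1.0) * math.exp(-(z ** c))  # normalized derivative function
        num += z * w
        den += w
    return x0 + k * num / den

def mode_x(x0, k, c):
    """Modal x for c > 1 under the same assumption."""
    return x0 + k * ((c - 1.0) / c) ** (1.0 / c)

closed = mean_x(1.0, 4.0, 2.0)            # illustrative parameters
numeric = mean_x_numeric(1.0, 4.0, 2.0)
```

With these assumptions the mean lies to the right of the mode for c = 1.5 (right-skewed) and to the left of it for c = 5 (left-skewed), matching the description above.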

The inflection is derived in the same fashion as the derivative function mode, leading to

(A17)

(A18)

and hence the inflection is identical to the derivative function mode. The inflection locates the point where the cumulative function changes from convex (upward bending) to concave (downward bending) (Fig. 2b), thus demarking the maximum slope of Y vs. x (i.e., maximum dY/dx), and thereby the peak (mode) of the corresponding derivative function (Fig. 2a).
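That the inflection of the cumulative function coincides with the derivative function mode can be checked with a simple grid search for the steepest part of Y(x). The Python sketch below assumes Y(x) = Y_F − (Y_F − Y₀) exp{−[(x − x₀)/k]^c} with k as a divisor (an assumption, because eq. 4 is not shown in this appendix); names and values are hypothetical.

```python
import math

def Y(x, Y0, YF, x0, k, c):
    """ASSUMED cumulative function."""
    return YF - (YF - Y0) * math.exp(-(((x - x0) / k) ** c))

def steepest_x(Y0, YF, x0, k, c, n=20000):
    """Grid search for the x of maximum |slope| of Y(x) on [x0, x0 + 5k]."""
    h = 5.0 * k / n
    best_x, best_slope = x0, 0.0
    for i in range(n):
        xa = x0 + i * h
        slope = abs(Y(xa + h, Y0, YF, x0, k, c) - Y(xa, Y0, YF, x0, k, c)) / h
        if slope > best_slope:
            best_x, best_slope = xa + 0.5 * h, slope   # midpoint of the grid cell
    return best_x

# Illustrative parameters with c > 1, so that an interior inflection exists.
x_inflection = steepest_x(0.0, 10.0, 2.0, 5.0, 3.0)
x_mode = 2.0 + 5.0 * ((3.0 - 1.0) / 3.0) ** (1.0 / 3.0)
```

The absolute slope is used so that the same search works for decreasing curves (Y_F < Y₀), such as desorption data.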

The quantiles are specified points on the cumulative function that represent fractions of the function maximum, e.g., 0.2|Y_F − Y₀|, 0.6|Y_F − Y₀|, etc. Quantiles are obtained by setting x equal to the quantile x-value, then equating a normalized form of the cumulative function to the Qth quantile, i.e.,

(A19)

then solving for the quantile x-value:

(A20)

and then back-substituting the quantile x-value into eq. 4:

(A21)

Therefore, (for example) is given by

(A22)

Interestingly, the x-value of the 0.5 quantile is given by

(A23)

and hence, the 0.5 quantile (Fig. 2b) also locates the median (Fig. 2a). Note as well that time, t, Y(t), and rate (dY/dt) at the 0.5 quantile are sometimes used as diagnostic metrics in seed germination, seedling emergence, and mineralization studies (e.g., Gardarin et al. 2011; Gan et al. 1996; Yang et al. 2020). That is, the 0.5 quantile time is given by eq. A23; the germination, emergence, or mineralization rate at that time is given by

(A24)

and cumulative germination, emergence, or mineralization at the 0.5 quantile time is given by eq. A21.
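The quantile calculations can be sketched in a few lines of Python. The example assumes the normalized cumulative function is 1 − exp{−[(x − x₀)/k]^c} (k as a divisor, an assumption because eq. 4 is not reproduced here), so that the Q quantile occurs at x₀ + k[−ln(1 − Q)]^(1/c). The emergence parameters are made up for illustration, and the function names are hypothetical.

```python
import math

def quantile_point(Q, Y0, YF, x0, k, c):
    """Return (x_Q, Y_Q) for a fraction 0 <= Q < 1 of the range YF - Y0."""
    x_q = x0 + k * (-math.log(1.0 - Q)) ** (1.0 / c)
    y_q = Y0 + Q * (YF - Y0)
    return x_q, y_q

def rate_at(x, Y0, YF, x0, k, c):
    """dY/dx of the ASSUMED cumulative function at x."""
    z = (x - x0) / k
    return (YF - Y0) * (c / k) * z ** (c - 1.0) * math.exp(-(z ** c))

# Hypothetical emergence curve: Y in % emerged, x in days.
p = dict(Y0=0.0, YF=95.0, x0=4.0, k=3.0, c=2.5)
t50, y50 = quantile_point(0.5, **p)   # time to, and value of, half the range
r50 = rate_at(t50, **p)               # emergence rate at that time
```

Here t50, y50, and r50 play the roles of the 0.5 quantile time, cumulative value, and rate described above.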

The domain, x_D, of our adapted Weibull function is defined here as

$$x_D = x_F - x_0 \tag{A25}$$

where x₀ is the initial (minimum) x-value, and x_F is the final (end-point) x-value (Fig. 2b). The corresponding Weibull function range, Y_R, is therefore Y_R = Y_F − Y₀, where Y_F = Y(x_F) and Y₀ = Y(x₀). It is clear from eq. 4 and Fig. 2b that the adapted Weibull cumulative function is finite and defined at x₀ but asymptotic to the x-axis at x_F (i.e., x → ∞ as Y → Y_F). Hence, x₀ is readily obtained, but x_F must be estimated. We estimate x_F here by equating it to the 0.9999 quantile in eq. A20, i.e.,

(A26)

which is the x-value that corresponds to

(A27)

or

(A28)

Our estimated consequently approximates to ≥99.99% of (Fig. 2b). Note also that the derivative function departs from zero at x₀ and returns to zero at x_F (Fig. 2a).
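The end-point estimate can be sketched in Python as follows. The example treats x_F as the 0.9999 quantile and assumes the normalized cumulative function is 1 − exp{−[(x − x₀)/k]^c} (k as a divisor, an assumption because eq. 4 is not reproduced in this appendix). The function names and parameter values are hypothetical.

```python
import math

def domain(x0, k, c, Q=0.9999):
    """Return (x_F, x_D), with x_F estimated as the Q quantile and x_D = x_F - x0."""
    x_f = x0 + k * (-math.log(1.0 - Q)) ** (1.0 / c)
    return x_f, x_f - x0

def fraction_reached(x, x0, k, c):
    """ASSUMED normalized cumulative function, (Y - Y0) / (YF - Y0)."""
    return 1.0 - math.exp(-(((x - x0) / k) ** c))

x_f, x_d = domain(x0=4.0, k=3.0, c=2.5)        # illustrative parameters
reached = fraction_reached(x_f, 4.0, 3.0, 2.5)  # fraction of the range at x_F
```

Because −ln(0.0001) is about 9.21, the estimated domain in this sketch is roughly 9.21^(1/c) times k, so it shortens as c increases.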

Citation

W. Daniel Reynolds, Craig F. Drury, Lori A. Phillips, Xueming Yang, and Ikechukwu V. Agomoh "An adapted Weibull function for agricultural applications," Canadian Journal of Soil Science 101(4), 680-702, (25 June 2021). https://doi.org/10.1139/cjss-2021-0046

Received: 15 April 2021; Accepted: 9 June 2021; Published: 25 June 2021
