Detect a discrepancy between the empirical normal distribution of a sample of (transformed to normal scale) NLME parameters and a specified "target" normal distribution.
detectSampleDiscrepancy.Rd
In brief, this function tests whether the Bhattacharyya distances to the normal distribution defined by targetMuNormal and targetSigmaNormal differ significantly between the values (transformed to normal scale) in testSample and those in targetSample. The Bhattacharyya distance measures the overlap between two (multivariate) normal distributions. It is assumed that targetSample is consistent with targetMuNormal and targetSigmaNormal. A discrepancy is then reported for each parameter X (column X in testSample and targetSample), and for each pair of parameters X,Y, for which the Bonferroni-corrected p-value of the T-test for differing mean Bhattacharyya distances is significant at the level tAlpha.
Specifically, for each parameter X and each pair of parameters X,Y among the variable parameter columns in testSample and targetSample, the procedure goes as follows:
- Transform testSample and targetSample to normal scale based on the parameter transformation in spec.
- Divide the two transformed samples into Ntests chunks of Npop rows.
- Calculate the Bhattacharyya distances between the normal distributions defined by the empirical mean vector and covariance matrix of each (test and target) chunk and the normal distribution defined by targetMuNormal and targetSigmaNormal. The result of this calculation is two vectors of length Ntests containing the Bhattacharyya distances for the test and target samples: btestSample and btargetSample.
- Perform a statistical T-test for a significant difference between the means of btestSample and btargetSample.
- Perform a Bonferroni p-value correction (multiplying the p-value by the number of single parameters or by the number of parameter pairs).
- If X is a single parameter with a significant Bonferroni-corrected p-value at the tAlpha critical level, generate a combined histogram plot comparing the X-values in testSample against targetSample. Otherwise, if X is a pair of parameters with a significant Bonferroni-corrected p-value at the tAlpha critical level, generate a combined 2d density plot comparing the X[1],X[2]-values in testSample and targetSample.
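The procedure above can be sketched in base R for one pair of parameters. This is a minimal illustration, not the package's implementation: the helper names bhattacharyyaNormal and chunkDistances, the parameter names CL and V, and the simulated samples are all hypothetical, the samples are assumed to be already on the normal scale, the T-test is assumed to be R's default (Welch) t.test, and capping the corrected p-value at 1 is a choice made here.

```r
# Bhattacharyya distance between N(mu1, Sigma1) and N(mu2, Sigma2):
#   D = 1/8 * t(d) %*% solve(S) %*% d + 1/2 * log(det(S) / sqrt(det(Sigma1) * det(Sigma2)))
# with d = mu1 - mu2 and S = (Sigma1 + Sigma2) / 2.
bhattacharyyaNormal <- function(mu1, Sigma1, mu2, Sigma2) {
  Sigma1 <- as.matrix(Sigma1)
  Sigma2 <- as.matrix(Sigma2)
  S <- (Sigma1 + Sigma2) / 2
  d <- mu1 - mu2
  term1 <- as.numeric(t(d) %*% solve(S, d)) / 8
  term2 <- 0.5 * log(det(S) / sqrt(det(Sigma1) * det(Sigma2)))
  term1 + term2
}

# Distance of each chunk's empirical normal distribution to the target distribution.
chunkDistances <- function(sampleNormal, targetMu, targetSigma, Ntests, Npop) {
  sampleNormal <- as.matrix(sampleNormal)
  vapply(seq_len(Ntests), function(i) {
    rows <- ((i - 1) * Npop + 1):(i * Npop)
    chunk <- sampleNormal[rows, , drop = FALSE]
    bhattacharyyaNormal(colMeans(chunk), cov(chunk), targetMu, targetSigma)
  }, numeric(1))
}

set.seed(1)
Npop <- 200
Ntests <- 5
tAlpha <- 0.01
n <- Npop * Ntests

# Hypothetical target distribution for two parameters on the normal scale.
targetMu <- c(CL = 0, V = 1)
targetSigma <- diag(c(0.2^2, 0.3^2))

# Target sample drawn consistently with the target; test sample with a shifted CL mean.
targetSampleNormal <- cbind(CL = rnorm(n, 0, 0.2), V = rnorm(n, 1, 0.3))
testSampleNormal <- cbind(CL = rnorm(n, 0.5, 0.2), V = rnorm(n, 1, 0.3))

btestSample <- chunkDistances(testSampleNormal, targetMu, targetSigma, Ntests, Npop)
btargetSample <- chunkDistances(targetSampleNormal, targetMu, targetSigma, Ntests, Npop)

# T-test on the mean distances, then Bonferroni correction over the number of pairs.
pRaw <- t.test(btestSample, btargetSample)$p.value
nPairs <- choose(2, 2)  # one pair when there are two parameters
pAdj <- min(1, pRaw * nPairs)
discrepancyDetected <- pAdj < tAlpha
```

Because the test sample's CL mean is shifted by 2.5 standard deviations, its chunk distances are much larger than those of the target sample, and the pair is flagged at this tAlpha.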
detectSampleDiscrepancy(
  obj,
  spec = specifyParamSampling(obj),
  testSample = sampleParamFromUncertainty(spec, Npop = Npop * Ntests),
  targetMuNormal = unlist(transformParamToNormal(spec, getParamEstimates(spec))[1,
    getNamesAllParameters(spec)]),
  targetSigmaNormal = getCovMatrixUncertainty(spec),
  targetSample = untransformParamFromNormal(spec, as.data.frame(MASS::mvrnorm(Npop *
    Ntests, mu = targetMuNormal, Sigma = targetSigmaNormal))),
  Npop = 200,
  Ntests = 5,
  tAlpha = 0.01,
  FLAGverbose = FALSE
)
Arguments
- obj
a filename (character string) denoting the path to a GPF file, or a GPF object, or an IQRnlmeParamSpec object. This argument is ignored if spec is specified.
- spec
an IQRnlmeParamSpec object. This argument overrides the argument obj. Default: specifyParamSampling(obj).
- testSample
a data.frame of Ntests*Npop rows with columns named as (some of) the parameters in spec (see getNamesAllParameters) and rows corresponding to different sampled tuples of such parameters. The parameter values in this data.frame are on the original scale (not transformed to normal scale). Default: sampleParamFromUncertainty(spec, Npop = Npop*Ntests).
- targetMuNormal
a named numeric vector specifying the mean of the target normal distribution (on the normal scale). Default: unlist(transformParamToNormal(spec, getParamEstimates(spec))[1L, getNamesAllParameters(spec)]).
- targetSigmaNormal
a square matrix specifying the variance-covariance matrix of the target normal distribution (on the normal scale). Default: getCovMatrixUncertainty(spec).
- targetSample
a data.frame of the same format as testSample. Default: untransformParamFromNormal(spec, as.data.frame(MASS::mvrnorm(Npop * Ntests, mu = targetMuNormal, Sigma = targetSigmaNormal))).
- Npop, Ntests
integers defining the size of the parameter samples: each sample is divided into Ntests chunks of Npop rows, so the number of rows in testSample and targetSample is equal to Npop*Ntests. If testSample and targetSample are specified, Npop is set to nrow(testSample) %/% Ntests.
- tAlpha
a double between 0 and 1 specifying the critical p-value level for the T-tests. Note: p-values are Bonferroni-corrected.
- FLAGverbose
a logical indicating whether information should be printed to the console.
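As an illustration of how Npop follows from user-supplied samples, the sketch below applies the integer division nrow(testSample) %/% Ntests to a hypothetical data.frame and splits it into chunks. The column names CL and V and the simulated values are invented for the example, and dropping the leftover rows is an assumption made here; the documentation above does not say how the function treats rows beyond Npop*Ntests.

```r
set.seed(42)
Ntests <- 5

# Hypothetical user-supplied sample with 1003 rows on the original scale.
testSample <- data.frame(CL = rlnorm(1003), V = rlnorm(1003))

# Integer division: 1003 %/% 5 gives 200 rows per chunk.
Npop <- nrow(testSample) %/% Ntests

# Assumption for this sketch: only the first Npop * Ntests rows are used.
nUsed <- Npop * Ntests
chunkId <- rep(seq_len(Ntests), each = Npop)
chunks <- split(testSample[seq_len(nUsed), , drop = FALSE], chunkId)

# Number of rows in each of the Ntests chunks.
chunkSizes <- vapply(chunks, nrow, integer(1))
```

Supplying samples whose row count is a multiple of Ntests avoids any ambiguity about leftover rows.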
Value
A named list with at least the following elements:
detectedSingleParameters: a character vector of the names of detected parameters with apparent sampling discrepancy;
listHistogramsDetectedSingleParams: a list of ggplot objects;
detectedPairParameters: a character vector of the names of detected parameter-pairs with apparent sampling discrepancy;
listDensPlotsDetectedPairParams: a list of ggplot objects.
Examples
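if (FALSE) {
  # Run the check on the example parameter file shipped with IQRtools,
  # using 10 chunks of 500 rows each
  discr <- detectSampleDiscrepancy(
    system.file("extdata", "nlme_param_sampling", "PKparameters.xlsx",
                package = "IQRtools"),
    Npop = 500, Ntests = 10)
  # Show the plots for detected single parameters and parameter pairs
  discr$histogramsDetectedSingleParams + ggplot2::theme(legend.position = "bottom")
  discr$densplotsDetectedPairParams + ggplot2::theme(legend.position = "bottom")
}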