Title: | Statistical Analyses and Power Calculations for Count Data and Faecal Egg Count Reduction Tests (FECRT) |
---|---|
Description: | Power calculations and hypothesis testing for the difference in mean of two negative binomial distributions A set of functions to allow analysis of count data (such as faecal egg count data) using Bayesian MCMC methods. Returns information on the possible values for mean count, coefficient of variation and zero inflation (true prevalence) present in the data. A complete faecal egg count reduction test (FECRT) model is implemented, which returns inference on the true efficacy of the drug from the pre- and post-treatment data provided, using non-parametric bootstrapping as well as using Bayesian MCMC. Functions to perform power analyses for faecal egg counts (including FECRT) are also provided. A working installation of JAGS (<http://mcmc-jags.sourceforge.net>) is required for MCMC-based methods |
Authors: | Matthew Denwood [aut, cre], Bob Wheeler [cph] (Copyright holder of the code in /src/dists.*) |
Maintainer: | Matthew Denwood <[email protected]> |
License: | GPL-2 |
Version: | 1.0.0-9 |
Built: | 2024-11-06 04:12:15 UTC |
Source: | https://github.com/mdenwood/bayescount |
The bayescount package analyses the over-dispersed count datasets(such as those commonly encountered in parasitology) usingcomputationally intensive statistical techniques such asnon-parametric bootstrapping and Bayesian MCMC. The FEC data analysisreturns information on the mean count, coefficient of variation andzero inflation (true prevalence) present in the data. The faecal eggcount reduction test (FECRT) analysis calculates efficacy as well aspost-treatment changes in count variation, and allows the analysis ofrepeat observations from the same animals. Power and sample sizecalculations for FECRT and FEC studies (given any expected parameter values) are also provided. Requires Just Another Gibbs Sampler (JAGS)for MCMC analysis functions (see http://mcmc-jags.sourceforge.net)
FEC Analysis
You can do that
FECRT analysis
That too
Power
And that
M. J. Denwood, S. W. J. Reid, S. Love, M. K. Nielsen, L. Matthews, I. J. McKendrick, and G. T. Innocent. Comparison of three alternative methods for analysis of equine Faecal Egg Count Reduction Test data. Prev. Vet. Med. (2009), doi:10.1016/j.prevetmed.2009.11.009
Denwood, M. J. (2010). A quantitative approach to improving the analysis of faecal worm egg count data. University of Glasgow. Retrieved from http://www.gla.ac.uk/media/media_149338_en.pdf
fecrt
, bayescount
, bayescount.single
, fec.power
, fecrt.power
, runjags
Analyses count data for difference to either known or fixed mean using any of (1) Christian approach, (2) my approach, (3) bootstrap, (4) MCMC (default 1-3). (4) calls count_model and then runs the model using runjags, and is the only one to permit ZI and LP. Requires Just Another Gibbs Sampler (JAGS) (see http://mcmc-jags.sourceforge.net)
count_analysis(data = stop("Data must be specified"), model = "ZILP", alt.prior = FALSE, adjust.zi.mean = FALSE, raw.output = FALSE, likelihood = FALSE, ...)
count_analysis(data = stop("Data must be specified"), model = "ZILP", alt.prior = FALSE, adjust.zi.mean = FALSE, raw.output = FALSE, likelihood = FALSE, ...)
count_model
, reduction_analysis
, bayescount
, autoextend.jags
, runjags
Creates a count model for the specified data and returns it as an un-run runjags object
count_model(data = stop("No data supplied"), model = stop("No model specified"), call.jags = FALSE, alt.prior = FALSE, monitor.lambda = FALSE, monitor.deviance = FALSE, ...)
count_model(data = stop("No data supplied"), model = stop("No model specified"), call.jags = FALSE, alt.prior = FALSE, monitor.lambda = FALSE, monitor.deviance = FALSE, ...)
count_analysis
, reduction_model
, bayescount
, autoextend.jags
, runjags
Power calculation based on BNB code and christian approach yet to be written. Do either test vs known single mean, or test for difference in mean to observed dataset (k can vary between datasets or be fixed).
count_power(type = "fec", cutoff = if ("fec") 400 else 95, alternative = "two.sided", conf.level = 0.05, maxiterations = 1e+06, mc.decimals = 2, mc.conf = 0.99, feedback = FALSE, ...)
count_power(type = "fec", cutoff = if ("fec") 400 else 95, alternative = "two.sided", conf.level = 0.05, maxiterations = 1e+06, mc.decimals = 2, mc.conf = 0.99, feedback = FALSE, ...)
count_analysis
, reduction_power
, bayescount
Precision analysis by Monte Carlo simulation and examination of distribution of mean
count_precision(meanepg = 200, g.faeces = 3, sensitivity = 1/25, replicates = 1, animals = 10, coeffvarmcm = 0.1, coeffvarrep = 0.4, coeffvarind = 0.3, coeffvargroup = 0.7, true.sample = FALSE, type = "confidence", precision = if (type == "confidence") 0.1 else NA, lower.interval = meanepg * (1 - precision), upper.interval = meanepg * (1 + precision), confidence = 0.95, iterations = if (type == "confidence") 10^6 else 10^5, mc.decimals = 2, mc.conf = 0.99, feedback = FALSE, forcesim = FALSE)
count_precision(meanepg = 200, g.faeces = 3, sensitivity = 1/25, replicates = 1, animals = 10, coeffvarmcm = 0.1, coeffvarrep = 0.4, coeffvarind = 0.3, coeffvargroup = 0.7, true.sample = FALSE, type = "confidence", precision = if (type == "confidence") 0.1 else NA, lower.interval = meanepg * (1 - precision), upper.interval = meanepg * (1 + precision), confidence = 0.95, iterations = if (type == "confidence") 10^6 else 10^5, mc.decimals = 2, mc.conf = 0.99, feedback = FALSE, forcesim = FALSE)
count_analysis
, count_power
, bayescount
Apply a Bayesian [zero-inflated] gamma / Weibull / lognormal / independant / simple Poisson model to count data to return possible values for mean count, coefficient of variation, and zero-inflation, as either summary statistics or mcmc objects. Convergence is assessed for each dataset by calculating the Gelman-Rubin statistic for each parameter, see autorun.jags
. Optionally, the log likelihood for the model fit is also calculated. The time taken to complete each analysis (not including calculation of the likelihood) is also recorded. The lower level functions in the runjags package are used for calling JAGS.
Note: The GUI interface for R in Windows may not continually refresh the output window, making it difficult to track the progress of the simulation (if silent.jags is FALSE). To avoid this, you can run the function from the terminal version of R (located in the Program Files/R/bin/ folder).
count.analysis(data = stop("Data must be specified"), model="ZILP", alt.prior = FALSE, adjust.zi.mean = FALSE, raw.output = FALSE, likelihood=FALSE, ...)
count.analysis(data = stop("Data must be specified"), model="ZILP", alt.prior = FALSE, adjust.zi.mean = FALSE, raw.output = FALSE, likelihood=FALSE, ...)
data |
an existing R object containing the data. Data can either be specified as a numeric vector of single counts for each sample, or as a matrix of repeated counts (columns) for each sample (rows). Repeated counts are modelled as part of the same Poisson process. Data can also be specified as a (ragged) list of repeated counts, with each element of the list representing a seperate sample. Finally, data can be specified as a list containing the elements 'totals' representing the sum of the repeated McMasters counts, and 'repeats' representing the number of McMasters counts performed per sample. Missing data (or unused elements of non-ragged arrays) may be represented using NA, which will be removed from the data before analysis. The likelihood calculation is not available for ragged arrays and missing data, and will print a warning. No default. |
model |
the model to use. Choices are "GP" (gamma Poisson = negative binomial), "ZIGP" (zero-inflated gamma Poisson = zero-inflated negative binomial), "LP" (lognormal Poisson), "ZILP" (zero-inflated lognormal Poisson), "WP" (Wiebull Poisson), "ZIWP" (zero-inflated Weibull Poisson), "SP" (simple Poisson), "ZISP" (zero-inflated simple Poisson) or "IP" (independant Poisson). Case insensitive. The simple Poisson model forces each count to have the same mean, wheras the independant Poisson process allows each count to have an unrelated mean (therefore a zero-inflated version is not possible). Default "ZILP". |
alt.prior |
should the model run the [ZI] [WP|GP|LP] models using the standard or the alternative prior distribution for variance? (logical) Can also be a character value of a user-specified prior distribution. Default FALSE. Where information concerning overdispersion in the data is sparse, the choice of prior distribution will have an affect on the posterior distribution for ALL parameters. It is recommended to run a simulation using both types of prior when working with small datasets, to make sure results are consistent. |
adjust.zi.mean |
should the mean count parameter of the zero-inflated models be adjusted to reflect the mean of the whole population? (logical) If FALSE the mean count of the zero-inflated models reflects the mean of the gamma or Poisson distribution only, if TRUE the mean includes extra zeros. Used for comparing results between zero-inflated and non zero-inflated models. Default FALSE. |
raw.output |
the function can return either a summary of the results (as with bayescount), or an MCMC object representing the estimates at each iteration for both chains (including the likelihood estimates where appropriate). If TRUE, the latter is output. (logical) Default FALSE. |
likelihood |
should the (log) likelihood for the fit of the model to the dataset be calculated? (logical) The likelihood for the [ZI] WP, LP and GP models are calculated using a likelihood function integrated over all possible values for lambda, which can take some time. The likelihood is calculated using a thinned chain of 1000 values to reduce the time taken. Default FALSE. |
... |
additional arguments to be passed directly to |
Either a vector similar to that obtained from bayescount
containing an indication of the error/crash/convergence status, the number of sampled updates used, and a lower/upper 95% highest posterior density interval (see HPDinterval
), and median estimate for each relevant parameter (optionally including the likelihood), or an MCMC object representing the estimates at each iteration for both chains (optionally including the likelihood).
# use a zero-inflated lognormal Poisson model to analyse some count # data, and suppressing JAGS output: ## Not run: bayescount.single(data=c(0,5,3,7,0,4,3,8,0), model="ZILP", silent.jags=TRUE) ## End(Not run)
# use a zero-inflated lognormal Poisson model to analyse some count # data, and suppressing JAGS output: ## Not run: bayescount.single(data=c(0,5,3,7,0,4,3,8,0), model="ZILP", silent.jags=TRUE) ## End(Not run)
Apply a Bayesian (zero-inflated) (gamma / Weibull / lognormal / independant / simple) Poisson model to count data to return possible values for mean count, variance, shape paramater, scale parameter (overdispersion or 'k') and zero-infaltion where appropriate to the model selected. This function generates the model specifications and starting values, and is used by the higher level functions bayescount and bayescount.single, but can also be called directly. Requires Just Another Gibbs Sampler (JAGS). *THIS SOFTWARE IS INTENDED FOR EDUCATIONAL PURPOSES ONLY AND SHOULD NOT BE RELIED UPON FOR REAL WORLD APPLICATIONS* The GUI interface for R in Windows may not continually refresh the output window, making it difficult to track the progress of the simulation (if silent.jags is FALSE). To avoid this, you can run the function from the terminal version of R (located in the Program Files/R/bin/ folder).
count.model(data=stop("No data supplied"), model=stop("No model specified"), call.jags = TRUE, alt.prior=FALSE, monitor.lambda=FALSE, monitor.deviance=FALSE, ...)
count.model(data=stop("No data supplied"), model=stop("No model specified"), call.jags = TRUE, alt.prior=FALSE, monitor.lambda=FALSE, monitor.deviance=FALSE, ...)
data |
an existing R vector containing the data (integer vector). No default. |
model |
model to use. Choices are "GP" (gamma Poisson = negative binomial), "ZIGP" (zero-inflated gamma Poisson = zero-inflated negative binomial), "LP" (lognormal Poisson), "ZILP" (zero-inflated lognormal Poisson), "WP" (Wiebull Poisson), "ZIWP" (zero-inflated Weibull Poisson), "SP" (simple Poisson), "ZISP" (zero-inflated simple Poisson) or "IP" (independant Poisson). The simple Poisson model forces each count to have the same mean, wheras the independant Poisson process allows each count to have an unrelated mean (therefore a zero-inflated version is not possible). Default "ZILP". |
call.jags |
should the function run the model using run.jags? If not, the model specification, initial values and data in the correct format are returned so that the model can be run using run.jags with no modifications. If TRUE, the model is run and the results are returned. |
alt.prior |
should the model run the [ZI] [WP|GP|LP] models using the standard or the alternative prior distribution for variance? (logical) Can also be a character value of a user-specified prior distribution. Default FALSE. Where information concerning overdispersion in the data is sparse, the choice of prior distribution will have an affect on the posterior distribution for ALL parameters. It is recommended to run a simulation using both types of prior when working with small datasets, to make sure results are consistent. |
monitor.lambda |
should the model or model specification monitor the mean of the Poisson process for each datapoint? This is required to calculate the likelihood for the Independant Poisson model only, but may be useful for other purposes. Default FALSE. |
monitor.deviance |
option to monitor the deviance of the model (using the built-in 'deviance' monitor). Default FALSE. |
... |
additional arguments to be passed directly to |
Either a list of character strings representing [[1]] the model specification, [[2]] the data string, [[3]] the initial value string(s) and [[4]] the monitors required (if call.jags is FALSE) or an MCMC object returned from run.jags (if call.jags is TRUE).
# Return the model specification and starting values for a # lognormal Poisson, then run the model using run.jags: ## Not run: data <- rpois(100, rlnorm(3, 0.2)) strings <- run.model(model="LP", data=data, call.jags=FALSE) modelstring <- strings[[1]] datastring <- strings[[2]] initial.values <- strings[[3]] monitors <- strings[[4]] run.jags(model=modelstring, inits=initial.values, check.conv=TRUE, data=datastring, monitor=monitors) # This is equivalent to using: run.model(model="LP", data=data, call.jags=TRUE, check.conv=TRUE) ## End(Not run)
# Return the model specification and starting values for a # lognormal Poisson, then run the model using run.jags: ## Not run: data <- rpois(100, rlnorm(3, 0.2)) strings <- run.model(model="LP", data=data, call.jags=FALSE) modelstring <- strings[[1]] datastring <- strings[[2]] initial.values <- strings[[3]] monitors <- strings[[4]] run.jags(model=modelstring, inits=initial.values, check.conv=TRUE, data=datastring, monitor=monitors) # This is equivalent to using: run.model(model="LP", data=data, call.jags=TRUE, check.conv=TRUE) ## End(Not run)
Finds the power for a faecal egg count study with the given combination of parameters. This represents the probability that the observed empirical mean FEC will lie between the lower.limit and upper.limit specified. The power is calculated using the negative binomial distribution when considering the true mean of a single individual, or using Monte Carlo integration for more than one animal. Confidence intervals for the true power are produced for the latter.
count.power(meanepg=200, g.faeces=3, sensitivity=1/25, replicates=1, animals=10, coeffvarrep=0.4, coeffvarind=0.3, coeffvargroup=0.7, true.sample=FALSE, accuracy=0.1, lower.limit=meanepg*(1-accuracy), upper.limit=meanepg*(1+accuracy), maxiterations=1000000, precision=2, confidence = 0.99, feedback=FALSE, forcesim=FALSE)
count.power(meanepg=200, g.faeces=3, sensitivity=1/25, replicates=1, animals=10, coeffvarrep=0.4, coeffvarind=0.3, coeffvargroup=0.7, true.sample=FALSE, accuracy=0.1, lower.limit=meanepg*(1-accuracy), upper.limit=meanepg*(1+accuracy), maxiterations=1000000, precision=2, confidence = 0.99, feedback=FALSE, forcesim=FALSE)
meanepg |
the mean egg count of the group (in EPG). If this is unknown then use a value likely to be encountered, or iterate over a distribution of values to obtain a distribution of values for power. |
g.faeces |
the number of grams of faeces used per sample (must be the same for all samples). |
sensitivity |
the minimum egg detection threshold or counting sensitivity of the technique used. If using the McMasters technique, this is the number of McMasters chambers counted divided by 50. |
replicates |
the number of different samples (individually analysed) taken from each animal. Must be the same for all animals. This would normally be 1, but increasing this provides an effective way of improving power at the cost of additional laboratory work. |
animals |
the number of animals in the group. Can be 1, in which case the ability to predict the true mean of the animal (if true.sample=TRUE) or the true mean of a group from which the animal is taken (if true.sample=FALSE) can be calculated. |
coeffvarrep |
coefficient of variation between sub-samples taken from the same faecal sample, assuming 1g faeces is used per sub-sample (the effective value is automatically adjusted according to g.faeces in the function). The default value for this is taken from an intensive study of a small group of horses, but may not be accurate in other groups of animals. |
coeffvarind |
coefficient of variation between samples taken from different faecal piles from the same animal over a period of days, not including that due to coeffvarrep. The default value for this is taken from an intensive study of a small group of horses, but may not be accurate in other groups of animals. |
coeffvargroup |
coefficient of variation between animals. This includes the variability between samples taken from different animals within a group, not including that due to within animal variability. The default value for this is inferred from a combination of several equine datasets and an intensive study of a small group of horses, but may not be accurate in other groups of animals. |
true.sample |
option to calculate the power for the population mean (if true.sample=FALSE) or the true sample mean (if true.sample=TRUE). The difference is that conceptually the true mean of a small group of animals may not reflect the mean of the population from which they are derived. If only one animal is considered, the true sample mean is the true mean of the individual whereas the population mean is the mean of a (theoretical) group of animals from which it is derived. The power will always be greater (or identical) for the true sample mean. |
accuracy |
specify the lower and upper limits as the meanepg +/- this fraction of the meanepg. Ignored if non-default values are specified for lower.limit or upper.limit. |
lower.limit |
lower limit for the tolerance. Specifying lower.limit and upper.limit separately allows non-symmetrical tolerance around the mean with which to calculate the power. The default is to use meanepg*(1-accuracy). |
upper.limit |
upper limit for the tolerance. Specifying lower.limit and upper.limit separately allows non-symmetrical tolerance around the mean with which to calculate the power. The default is to use meanepg*(1+accuracy). |
maxiterations |
the maximum number of iterations to use for the Monte Carlo integration, or the number to use if precision=NA. If precision is defined, then only the number of iterations required to estimate the power to the given precision are performed (up to a maximum of maxiterations). |
precision |
the number of decimal places with which to calculate the power, unless maxiterations is reached first. A larger precision will give a more precise estimate of the true power but will take longer to calculate. Specifying this as NA performs the calculation on a fixed (=maxiterations) number of iterations. |
confidence |
the degree of confidence required with which to report confidence limits for the true power when using Monte Carlo integration to report the power. |
feedback |
option to display a progress indicator for calculation of the values used for Monte Carlo integration. Using feedback with some GUI versions of R may slow down the analysis considerably. |
forcesim |
option to force the function to use the Monte Carlo method of approximating the power when the default would be to use a negative binomial approximation (when true.sample=FALSE and coeffvargroup is very small, or when true.sample=TRUE and animals=1). Results using the two methods should be very similar, however there may be differences due to the differenct optimisation mechanisms when calculating both lower and upper limits. In addition, Monte Carlo integration will take longer and gives a confidence interval for the true mean rather than the absolute value. |
Returns a list containing the elements 'roundedci' and 'ci', which specifies the median and confidence limits (as defined by 'confidence') for the true power both rounded by 'precison' and unrounded. For analyses using the Monte Carlo integration method, 'within' 'without' and 'total' are also returned, and indicate the number of iterations for which the observed mean fell outside and inside the specified limits and the total number of iterations.
Finds the appropriate tolerance with which to consider the observed mean for a faecal egg count study with the given combination of power and other parameters. The precision is calculated using the negative binomial distribution when considering the true mean of a single individual, or using Monte Carlo integration for more than one animal. Confidence intervals for the true power are produced for the latter. Tolerance can be defined as either a lower limit only if upper.limit is defined, as an upper limit only if lower.limit is defined, or as both (equidistant) limits if neither are defined.
count.precision(meanepg=200, g.faeces=3, sensitivity=1/25, replicates=1, animals=10, coeffvarrep=0.4, coeffvarind=0.3, coeffvargroup=0.7, true.sample=FALSE, lower.limit=NA, upper.limit=NA, iterations=100000, power = 0.95, confidence = 0.99, feedback=FALSE, forcesim=FALSE)
count.precision(meanepg=200, g.faeces=3, sensitivity=1/25, replicates=1, animals=10, coeffvarrep=0.4, coeffvarind=0.3, coeffvargroup=0.7, true.sample=FALSE, lower.limit=NA, upper.limit=NA, iterations=100000, power = 0.95, confidence = 0.99, feedback=FALSE, forcesim=FALSE)
meanepg |
the mean egg count of the group (in EPG). If this is unknown then use a value likely to be encountered, or iterate over a distribution of values to obtain a distribution of values for tolerance. |
g.faeces |
the number of grams of faeces used per sample (must be the same for all samples). |
sensitivity |
the minimum egg detection threshold or counting sensitivity of the technique used. If using the McMasters technique, this is the number of McMasters chambers counted divided by 50. |
replicates |
the number of different samples (individually analysed) taken from each animal. Must be the same for all animals. This would normally be 1, but increasing this provides an effective way of improving power at the cost of additional laboratory work. |
animals |
the number of animals in the group. Can be 1, in which case the ability to predict the true mean of the animal (if true.sample=TRUE) or the true mean of a group from which the animal is taken (if true.sample=FALSE) can be calculated. |
coeffvarrep |
coefficient of variation between sub-samples taken from the same faecal sample, assuming 1g faeces is used per sub-sample (the effective value is automatically adjusted according to g.faeces in the function). The default value for this is taken from an intensive study of a small group of horses, but may not be accurate in other groups of animals. |
coeffvarind |
coefficient of variation between samples taken from different faecal piles from the same animal over a period of days, not including that due to coeffvarrep. The default value for this is taken from an intensive study of a small group of horses, but may not be accurate in other groups of animals. |
coeffvargroup |
coefficient of variation between animals. This includes the variability between samples taken from different animals within a group, not including that due to within animal variability. The default value for this is inferred from a combination of several equine datasets and an intensive study of a small group of horses, but may not be accurate in other groups of animals. |
true.sample |
option to calculate the power for the population mean (if true.sample=FALSE) or the true sample mean (if true.sample=TRUE). The difference is that conceptually the true mean of a small group of animals may not reflect the mean of the population from which they are derived. If only one animal is considered, the true sample mean is the true mean of the individual whereas the population mean is the mean of a (theoretical) group of animals from which it is derived. The power will always be greater (or identical) for the true sample mean. |
lower.limit |
(optional) lower limit for the tolerance. If this is supplied, the required upper limit to maintain the desired power with the specified lower limit will be calculated. If neither lower limit or upper limit are specified, symmetrical limits about the mean are found that satisfy the power condition. |
upper.limit |
(optional) upper limit for the tolerance. If this is supplied, the required lower limit to maintain the desired power with the specified upper limit will be calculated. If neither lower limit or upper limit are specified, symmetrical limits about the mean are found that satisfy the power condition. |
iterations |
the number of iterations to use for the Monte Carlo integration. More iterations will take longer but provide a smaller confidence interval for the true power. |
power |
the desired power with which to interpret the given tolerances. Default is 0.95, so that the observed mean FEC will lie within the calculated lower and upper limits 95% of the time. |
confidence |
the degree of confidence required with which to report confidence limits for the true power when using Monte Carlo integration to report the power. Only used for calculating the true power after determining the tolerance limits. |
feedback |
option to display a progress indicator for calculation of the values used for Monte Carlo integration. Using feedback with some GUI versions of R may slow down the analysis considerably. |
forcesim |
option to force the function to use the Monte Carlo method of approximating the power when the default would be to use a negative binomial approximation (when true.sample=FALSE and coeffvargroup is very small, or when true.sample=TRUE and animals=1). Results using the two methods should be very similar, however there may be differences due to the differenct optimisation mechanisms when calculating both lower and upper limits. In addition, Monte Carlo integration will take longer and gives a confidence interval for the true mean rather than the absolute value. |
Returns a list containing the elements 'limits', which is the calculated lower and upper tolerance level which provides the required power, and 'power' which specifies the median estimate and confidence intervals for the true power when using Monte Carlo integration, or the absolute value (replicated 3 times for consistency) if using the negative binomial approximation method. The true power returned may not exactly match the required power input due to the integer nature of FEC data.
Finds the power for a faecal egg count reduction test with the given combination of parameters. This represents the probability that the observed empirical mean reduction will lie between the lower.limit and upper.limit specified. The power is calculated using Monte Carlo integration and confidence intervals for the true power are produced. This function can be used to determine the probability that the observed empirical mean reduction will lie within a the range (for example) 90% to 100% if the true reduction is 80% (the false negative rate if considering anthelmintic resistance), or the probability that the observed empirical mean reduction will lie within a the range (for example) 0% to 90% if the true reduction is 95% (the false positive rate if considering anthelmintic resistance)
fecrt.power(meanepg=200, reduction = 80, g.faeces=3, sensitivity=1/25, replicates=1, animals=10, pre.coeffvarrep=0.4, pre.coeffvarind=0.3, pre.coeffvargroup=0.7, post.coeffvarrep=0.4, post.coeffvarind=0.3, post.coeffvargroup=0.7, true.sample=FALSE, lower.limit=0, upper.limit=90, maxiterations=1000000, precision=2, confidence = 0.99, feedback=FALSE)
fecrt.power(meanepg=200, reduction = 80, g.faeces=3, sensitivity=1/25, replicates=1, animals=10, pre.coeffvarrep=0.4, pre.coeffvarind=0.3, pre.coeffvargroup=0.7, post.coeffvarrep=0.4, post.coeffvarind=0.3, post.coeffvargroup=0.7, true.sample=FALSE, lower.limit=0, upper.limit=90, maxiterations=1000000, precision=2, confidence = 0.99, feedback=FALSE)
meanepg |
the mean pre-treatment egg count of the group (in EPG). If this is unknown then use a value likely to be encountered, or iterate over a distribution of values to obtain a distribution of values for power. |
reduction |
the true mean egg count reduction (in %). If this is unknown then use a value likely to be encountered, or iterate over a distribution of values to obtain a distribution of values for power. |
g.faeces |
the number of grams of faeces used per sample (must be the same for all samples both pre and post treatment). |
sensitivity |
the minimum egg detection threshold or counting sensitivity of the technique used. If using the McMasters technique, this is the number of McMasters chambers counted divided by 50. |
replicates |
the number of different samples (individually analysed) taken from each animal. Must be the same for all animals, both pre and post treatment. This would normally be 1, but increasing this provides an effective way of improving power at the cost of additional laboratory work. |
animals |
the number of animals in the group. Can be 1, in which case the ability to predict the true mean reduction for the animal (if true.sample=TRUE) or the true mean reduction for a group from which the animal is taken (if true.sample=FALSE) can be calculated. The same number of animals MUST be sample pre and post treatment. |
pre.coeffvarrep |
coefficient of variation between sub-samples taken from the same faecal sample, assuming 1g faeces is used per sub-sample (the effective value is automatically adjusted according to g.faeces in the function). The default value for this is taken from an intensive study of a small group of horses, but may not be accurate in other groups of animals. This value is applied only to the pre-treatment distributions. |
pre.coeffvarind |
coefficient of variation between samples taken from different faecal piles from the same animal over a period of days, not including that due to coeffvarrep. The default value for this is taken from an intensive study of a small group of horses, but may not be accurate in other groups of animals. This value is applied only to the pre-treatment distributions. |
pre.coeffvargroup |
coefficient of variation between animals. This includes the variability between samples taken from different animals within a group, not including that due to within animal variability. The default value for this is inferred from a combination of several equine datasets and an intensive study of a small group of horses, but may not be accurate in other groups of animals. This value is applied only to the pre-treatment distributions. |
post.coeffvarrep |
coefficient of variation between sub-samples taken from the same faecal sample, assuming 1g faeces is used per sub-sample (the effective value is automatically adjusted according to g.faeces in the function). The default value for this is taken from an intensive study of a small group of horses, but may not be accurate in other groups of animals. This value is applied only to the post-treatment distributions. |
post.coeffvarind |
coefficient of variation between samples taken from different faecal piles from the same animal over a period of days, not including that due to coeffvarrep. The default value for this is taken from an intensive study of a small group of horses, but may not be accurate in other groups of animals. This value is applied only to the post-treatment distributions. |
post.coeffvargroup |
coefficient of variation between animals. This includes the variability between samples taken from different animals within a group, not including that due to within animal variability. The default value for this is inferred from a combination of several equine datasets and an intensive study of a small group of horses, but may not be accurate in other groups of animals. This value is applied only to the post-treatment distributions. |
true.sample |
option to calculate the power for the population mean reduction (if true.sample=FALSE) or the true sample mean reduction (if true.sample=TRUE). The difference is that conceptually the true mean of a small group of animals may not reflect the mean of the population from which they are derived. If only one animal is considered, the true sample mean is the true mean of the individual whereas the population mean is the mean of a (theoretical) group of animals from which it is derived. The power will always be greater (or identical) for the true sample mean. |
lower.limit |
lower limit for the tolerance (in %). |
upper.limit |
lower limit for the tolerance (in %). |
maxiterations |
the maximum number of iterations to use for the Monte Carlo integration, or the number to use if precision=NA. If precision is defined, then only the number of iterations required to estimate the power to the given precision are performed (up to a maximum of maxiterations). |
precision |
the number of decimal places with which to calculate the power, unless maxiterations is reached first. A larger precision will give a more precise estimate of the true power but will take longer to calculate. Specifying this as NA performs the calculation on a fixed (=maxiterations) number of iterations. |
confidence |
the degree of confidence required with which to report confidence limits for the true power when using Monte Carlo integration to report the power. |
feedback |
option to display a progress indicator for calculation of the values used for Monte Carlo integration. Using feedback with some GUI versions of R may slow down the analysis considerably. |
Returns a list containing the elements 'roundedci' and 'ci', which specifies the median and confidence limits (as defined by 'confidence') for the true power both rounded by 'precison' and unrounded. The additional variables 'within' 'without' and 'total' are also returned, and indicate the number of iterations for which the observed mean reduction fell outside and inside the specified limits and the total number of iterations.
Finds the appropriate tolerance with which to consider the observed mean for a faecal egg count reduction test with the given combination of power and other parameters. The precision is calculated using Monte Carlo integration, and confidence intervals for the true power are produced. Tolerance can be defined as either a lower limit only if upper.limit is defined, as an upper limit only if lower.limit is defined, or as both (equidistant) limits if neither are defined. This function can be used for example to determine the upper limit for a true reduction of 80% by setting lower.limit=0, or to determine the lower limit for a true reduction of 99% by setting upper.limit=100.
fecrt.precision(meanepg=200, reduction = 95, g.faeces=3, sensitivity=1/25, replicates=1, animals=10, pre.coeffvarrep=0.4, pre.coeffvarind=0.3, pre.coeffvargroup=0.7, post.coeffvarrep=0.4, post.coeffvarind=0.3, post.coeffvargroup=0.7, true.sample=FALSE, lower.limit=NA, upper.limit=NA, iterations=100000, power = 0.95, confidence = 0.99, feedback=FALSE)
fecrt.precision(meanepg=200, reduction = 95, g.faeces=3, sensitivity=1/25, replicates=1, animals=10, pre.coeffvarrep=0.4, pre.coeffvarind=0.3, pre.coeffvargroup=0.7, post.coeffvarrep=0.4, post.coeffvarind=0.3, post.coeffvargroup=0.7, true.sample=FALSE, lower.limit=NA, upper.limit=NA, iterations=100000, power = 0.95, confidence = 0.99, feedback=FALSE)
meanepg |
the mean pre-treatment egg count of the group (in EPG). If this is unknown then use a value likely to be encountered, or iterate over a distribution of values to obtain a distribution of values for tolerance. |
reduction |
the true mean egg count reduction (in %). If this is unknown then use a value likely to be encountered, or iterate over a distribution of values to obtain a distribution of values for tolerance. |
g.faeces |
the number of grams of faeces used per sample (must be the same for all samples both pre and post treatment). |
sensitivity |
the minimum egg detection threshold or counting sensitivity of the technique used. If using the McMasters technique, this is the number of McMasters chambers counted divided by 50. |
replicates |
the number of different samples (individually analysed) taken from each animal. Must be the same for all animals, both pre and post treatment. This would normally be 1, but increasing this provides an effective way of improving power at the cost of additional laboratory work. |
animals |
the number of animals in the group. Can be 1, in which case the ability to predict the true mean reduction for the animal (if true.sample=TRUE) or the true mean reduction for a group from which the animal is taken (if true.sample=FALSE) can be calculated. The same number of animals MUST be sample pre and post treatment. |
pre.coeffvarrep |
coefficient of variation between sub-samples taken from the same faecal sample, assuming 1g faeces is used per sub-sample (the effective value is automatically adjusted according to g.faeces in the function). The default value for this is taken from an intensive study of a small group of horses, but may not be accurate in other groups of animals. This value is applied only to the pre-treatment distributions. |
pre.coeffvarind |
coefficient of variation between samples taken from different faecal piles from the same animal over a period of days, not including that due to coeffvarrep. The default value for this is taken from an intensive study of a small group of horses, but may not be accurate in other groups of animals. This value is applied only to the pre-treatment distributions. |
pre.coeffvargroup |
coefficient of variation between animals. This includes the variability between samples taken from different animals within a group, not including that due to within animal variability. The default value for this is inferred from a combination of several equine datasets and an intensive study of a small group of horses, but may not be accurate in other groups of animals. This value is applied only to the pre-treatment distributions. |
post.coeffvarrep |
coefficient of variation between sub-samples taken from the same faecal sample, assuming 1g faeces is used per sub-sample (the effective value is automatically adjusted according to g.faeces in the function). The default value for this is taken from an intensive study of a small group of horses, but may not be accurate in other groups of animals. This value is applied only to the post-treatment distributions. |
post.coeffvarind |
coefficient of variation between samples taken from different faecal piles from the same animal over a period of days, not including that due to coeffvarrep. The default value for this is taken from an intensive study of a small group of horses, but may not be accurate in other groups of animals. This value is applied only to the post-treatment distributions. |
post.coeffvargroup |
coefficient of variation between animals. This includes the variability between samples taken from different animals within a group, not including that due to within animal variability. The default value for this is inferred from a combination of several equine datasets and an intensive study of a small group of horses, but may not be accurate in other groups of animals. This value is applied only to the post-treatment distributions. |
true.sample |
option to calculate the power for the population mean (if true.sample=FALSE) or the true sample mean (if true.sample=TRUE). The difference is that conceptually the true mean of a small group of animals may not reflect the mean of the population from which they are derived. If only one animal is considered, the true sample mean is the true mean of the individual whereas the population mean is the mean of a (theoretical) group of animals from which it is derived. The power will always be greater (or identical) for the true sample mean. |
lower.limit |
(optional) lower limit for the tolerance (in %). If this is supplied, the required upper limit to maintain the desired power with the specified lower limit will be calculated. If neither lower limit or upper limit are specified, symmetrical limits about the true mean reduction are found that satisfy the power condition. |
upper.limit |
(optional) upper limit for the tolerance (in %). If this is supplied, the required lower limit to maintain the desired power with the specified upper limit will be calculated. If neither lower limit or upper limit are specified, symmetrical limits about the true mean reduction are found that satisfy the power condition. |
iterations |
the number of iterations to use for the Monte Carlo integration. More iterations will take longer but provide a smaller confidence interval for the true power. |
power |
the desired power with which to interpret the given tolerances. Default is 0.95, so that the observed mean FEC will lie within the calculated lower and upper limits 95% of the time. |
confidence |
the degree of confidence required with which to report confidence limits for the true power when using Monte Carlo integration to report the power. Only used for calculating the true power after determining the tolerance limits. |
feedback |
option to display a progress indicator for calculation of the values used for Monte Carlo integration. Using feedback with some GUI versions of R may slow down the analysis considerably. |
Returns a list containing the elements 'limits', which is the calculated lower and upper tolerance level which provides the required power, and 'power' which specifies the median estimate and confidence intervals for the true power. The true power returned may not exactly match the required power input due to the integer nature of FECRT data.
Function to calculate the (log) likelihood of obtaining a set of data from one of the following distributions: Poisson (P), gamma (G), lognormal (L), Weibull (W), gamma Poisson (GP), lognormal Poisson (LP), Weibull Poisson (WP), all with or without zero-inflation (ZI). For mixture models, the likelihood is calculated for the data by integrating over each possible value of lambda for each data point, which may take some time for large datasets. *THIS FUNCTION IS DEPRECATED AND WILL BE REMOVED FROM BAYESCOUNT VERSION 1*
likelihood(model, data, mean=NA, variance=NA, zi=NA, shape=NA, scale=NA, iterations=min(1000, (length(as.matrix(mean)[,1])*length(shape))), log=TRUE, silent=FALSE, raw.output=FALSE)
likelihood(model, data, mean=NA, variance=NA, zi=NA, shape=NA, scale=NA, iterations=min(1000, (length(as.matrix(mean)[,1])*length(shape))), log=TRUE, silent=FALSE, raw.output=FALSE)
model |
the distribution to fit to the data. Choices are: 'IP', 'P', 'ZIP', 'G', 'ZIG', 'L', 'ZIL', 'W', 'ZIW', 'GP', 'ZIGP', 'LP', 'ZILP', 'WP', 'ZIWP' (case insensitive). No default. |
data |
a vector of data to fit the distribution to. Count data for the count models, continuous data for the continuous distributions. |
mean |
a vector of mean values over which to calculate the likelihood. Must be a matrix with number of columns equal to the number of datapoints for the IP model only, since using the IP model each datapoint has an independant mean (this is effectively a shortcut for repeating the likelihood function for each datapoint). Mean is required for the IP, (ZI)P, (ZI)L, and (ZI)L models. Otherwise ignored (with a warning if silent=FALSE). |
variance |
a vector of values for variance over which to calculate the likelihood, with length equal to length of mean. Required for the (ZI)L, and (ZI)L models. Otherwise ignored (with a warning if silent=FALSE). |
zi |
a vector of values for zero-inflation, with length equal to either scale/shape or mean/variance as appropriate. Required for zero-inflated models only; if supplied for other models the appropriate zero-inflated model will be used instead (with a warning if silent=FALSE). |
shape |
a vector of values for the shape parameter. Must be of length equal to that of the scale parameter. Required for the (ZI)W, (ZI)WP, (ZI)G, and (ZI)GP models. Otherwise ignored (with a warning if silent=FALSE). |
scale |
a vector of values for the scale parameter. Must be of length equal to that of the shape parameter. Required for the (ZI)W, (ZI)WP, (ZI)G, and (ZI)GP models. Otherwise ignored (with a warning if silent=FALSE). |
iterations |
the total number of iterative samples for which to calculate the likelihood. If this is smaller than the number of iterations provided for mean/variance/shape/scale, then the chain will be thinned to the number of iterations required (this is useful for the mixture models where integrating the likelihood for each datapoint may take some time). Default is 1000 iterations, or the length of the parameters supplied if this is shorter than 1000. |
log |
choose to output the log likelihood (TRUE) or the likelihood (FALSE). Default TRUE. |
silent |
should warning messages and progress indicators be supressed? Default FALSE. |
raw.output |
output a vector of length equal to the number of iterations representing the likelihood at each iteration if TRUE, or a summary of the results (median and 95 percent confidence interval) if FALSE. Default FALSE. |
The (log) likelihood is returned either as a vector of length equal to the number of iterations representing the likelihood at each iteration if raw.output==TRUE, or a summary of the results (median, maximum and 95% highest posterior density interval (see HPDinterval
) if raw.output==FALSE. It is possible for the functions that use integrate to produce incalculable probabilities for some iterations, in which case the likelihood for these iterations are returned as missing data (and therefore the summary statistics for the whole simulation are also returned as missing data). If only 1 iteration of values for mean/variance/shape/scale are provided, a single value for likelihood is output.
# calculate the likelihood of obtaining a set of count data from # a zero-inflated Poisson distribution with set mean and zero-inflation # values ## Not run: data <- rpois(100, 10) data[1:15] <- 0 likelihood('ZIP', data, mean=10, zi=15) # now calculate the likelihood for the same data using an MCMC object # to provide the values for mean and zero-inflation values <- bayescount.single(data, model='ZISP', raw.output=TRUE) means <- c(values[,'mean'][[1]], values[,'mean'][[2]]) zis <- (1-c(values[,'prob'][[1]], values[,'prob'][[2]]))*100 # The function outputs the prevalence of disease when raw.ouput is # TRUE, so zero-inflation must be calculated from this likes <- likelihood('ZIP', data, mean=means, zi=zis, raw.output=TRUE)$likelihood hist(likes, breaks='fd', col='red') ## End(Not run)
# calculate the likelihood of obtaining a set of count data from # a zero-inflated Poisson distribution with set mean and zero-inflation # values ## Not run: data <- rpois(100, 10) data[1:15] <- 0 likelihood('ZIP', data, mean=10, zi=15) # now calculate the likelihood for the same data using an MCMC object # to provide the values for mean and zero-inflation values <- bayescount.single(data, model='ZISP', raw.output=TRUE) means <- c(values[,'mean'][[1]], values[,'mean'][[2]]) zis <- (1-c(values[,'prob'][[1]], values[,'prob'][[2]]))*100 # The function outputs the prevalence of disease when raw.ouput is # TRUE, so zero-inflation must be calculated from this likes <- likelihood('ZIP', data, mean=means, zi=zis, raw.output=TRUE)$likelihood hist(likes, breaks='fd', col='red') ## End(Not run)
Function to calculate the equivalent values for the mean and standard deviation of a normal distribution from the mean and standard deviation of the log-normal distribution and vice versa. Outputs from this function can be used with the dnorm() or dlnorm functions, and with the equivalent distributions in JAGS.
normal_params(log.mean, log.sd, coeff.variation = sqrt(exp(log.sd^2) - 1))
normal_params(log.mean, log.sd, coeff.variation = sqrt(exp(log.sd^2) - 1))
lmean <- 2.5 lsd <- 0.2 mean <- normal.params(lmean,lsd)[[1]] sd <- normal.params(lmean,lsd)[[2]] curve(dlnorm(x, lmean, lsd), from=0, to=25) dev.new() curve(dnorm(x, mean, sd), from=0, to=25) mean <- 10 sd <- 2 lmean <- lnormal.params(mean,sd)[[1]] lsd <- lnormal.params(mean,sd)[[2]] curve(dnorm(x, mean, sd), from=0, to=20) dev.new() curve(dlnorm(x, lmean, lsd), from=0, to=20)
lmean <- 2.5 lsd <- 0.2 mean <- normal.params(lmean,lsd)[[1]] sd <- normal.params(lmean,lsd)[[2]] curve(dlnorm(x, lmean, lsd), from=0, to=25) dev.new() curve(dnorm(x, mean, sd), from=0, to=25) mean <- 10 sd <- 2 lmean <- lnormal.params(mean,sd)[[1]] lsd <- lnormal.params(mean,sd)[[2]] curve(dnorm(x, mean, sd), from=0, to=20) dev.new() curve(dlnorm(x, lmean, lsd), from=0, to=20)
Function to calculate the equivalent values for the mean and standard deviation of a log-normal distribution from the mean and standard deviation of the normal distribution. Outputs from this function can be used with the dlnorm() function, and with the lognormal distribution in JAGS.
lnormal.params(mean, sd, coeff.variation)
lnormal.params(mean, sd, coeff.variation)
mean |
either a single value or vector of values for the mean of the normal distribution. |
sd |
either a single value or vector of values for the standard deviation of the normal distribution. Ignored if values are supplied for coeff.variation. |
coeff.variation |
either a single value or vector of values for the coefficient of dispersion. |
A list with elements representing the mean of the lognormal distribution, the standard deviation of the lognormal distribution, and the coefficient of variation.
## Not run: mean <- 10 sd <- 2 lmean <- lnormal.params(mean,sd)[[1]] lsd <- lnormal.params(mean,sd)[[2]] curve(dnorm(x, mean, sd), from=0, to=20) dev.new() curve(dlnorm(x, lmean, lsd), from=0, to=20) ## End(Not run)
## Not run: mean <- 10 sd <- 2 lmean <- lnormal.params(mean,sd)[[1]] lsd <- lnormal.params(mean,sd)[[2]] curve(dnorm(x, mean, sd), from=0, to=20) dev.new() curve(dlnorm(x, lmean, lsd), from=0, to=20) ## End(Not run)
Crude function to maximise the likelihood of one of the following distributions: Poisson (P), gamma (G), lognormal (L), Weibull (W), gamma Poisson (GP), lognormal Poisson (LP), Weibull Poisson (WP), all with or without zero-inflation (ZI). Uses the likelihood() function to calculate the likelihood at each iteration. For mixture models, the likelihood is calculated for the data by integrating over each possible value of lambda for each data point, which may take some time for large datasets. Starting values for each parameter are optional, but may improve the speed and reliability of the function if appropriate values are provided. If missing, starting values will be calculated from the data. This function is provided for interest only, and is vastly inferior as a data analysis tool to the Bayesian MCMC methods used by bayescount(). *THIS FUNCTION IS DEPRECATED AND WILL BE REMOVED FROM BAYESCOUNT VERSION 1 - SEE ?count.analysis FOR AN ALTERNATIVE USING MCMC*
maximise.likelihood(data=stop("Data must be specified"), model=stop("Please specify a distribution"), mean=NA, variance=NA, zi=NA, shape=NA, scale=NA, silent=FALSE)
maximise.likelihood(data=stop("Data must be specified"), model=stop("Please specify a distribution"), mean=NA, variance=NA, zi=NA, shape=NA, scale=NA, silent=FALSE)
data |
a vector of data to fit the distribution to. Count data for the count models, continuous data for the continuous distributions. |
model |
the distribution to fit to the data. Choices are: "P", "ZIP", "G", "ZIG", "L", "ZIL", "W", "ZIW", "GP", "ZIGP", "LP", "ZILP", "WP", "ZIWP" (case insensitive). No default. |
mean |
the starting value for mean. Optional. |
variance |
the starting value for variance. Optional. |
zi |
the starting value for zero-inflation. Optional. |
shape |
the starting value for the shape parameter. Optional. |
scale |
the starting value for the scale parameter. Optional. |
silent |
should warning messages and progress indicators be supressed? Default FALSE. |
The values for each parameter at the maximum likelihood are output. No standard error is given (Bayesian MCMC using bayescount() gives a much better analysis)
# obtain values for mean and zero-inflation of a zero-inflated # gamma Poisson model: data <- rpois(100, rgamma(100, shape=1, scale=8)) data[1:15] <- 0 #maximise.likelihood(data, "ZIGP")
# obtain values for mean and zero-inflation of a zero-inflated # gamma Poisson model: data <- rpois(100, rgamma(100, shape=1, scale=8)) data[1:15] <- 0 #maximise.likelihood(data, "ZIGP")
Function to calculate the equivalent values for the mean and standard deviation of a normal distribution from the mean and standard deviation of the log-normal distribution. Outputs from this function can be used with the dnorm() function, and with the normal distribution in JAGS.
normal.params(log.mean, log.sd, coeff.variation=sqrt(exp(log.sd^2)-1))
normal.params(log.mean, log.sd, coeff.variation=sqrt(exp(log.sd^2)-1))
log.mean |
either a single value or vector of values for the mean of the lognormal distribution. |
log.sd |
either a single value or vector of values for the standard deviation of the lognormal distribution. Ignored if values are supplied for coeff.variation. |
coeff.variation |
either a single value or vector of values for the coefficient of dispersion. |
A list with elements representing the mean of the normal distribution, the standard deviation of the normal distribution, and the coefficient of variation.
## Not run: lmean <- 2.5 lsd <- 0.2 mean <- normal.params(lmean,lsd)[[1]] sd <- normal.params(lmean,lsd)[[2]] curve(dlnorm(x, lmean, lsd), from=0, to=25) dev.new() curve(dnorm(x, mean, sd), from=0, to=25) ## End(Not run)
## Not run: lmean <- 2.5 lsd <- 0.2 mean <- normal.params(lmean,lsd)[[1]] sd <- normal.params(lmean,lsd)[[2]] curve(dlnorm(x, lmean, lsd), from=0, to=25) dev.new() curve(dnorm(x, mean, sd), from=0, to=25) ## End(Not run)
Analyses two sets of count data for using any of (1) Christian approach, (2) my approach, (3) bootstrap, (4) MCMC (default 1-3). (4) calls reduction_model and then runs the model using runjags, and is the only one to permit ZI and LP. Requires Just Another Gibbs Sampler (JAGS) (see http://mcmc-jags.sourceforge.net)
reduction_analysis(name = NA, pre.data = NA, post.data = NA, data = list(pre = pre.data, post = post.data), animal.names = FALSE, efficacy = 95, confidence = 95, restrict.efficacy = TRUE, control.animals = FALSE, paired.model = FALSE, fix.variation = FALSE, fix.efficacy = TRUE, zero.inflation = FALSE, individual.analysis = FALSE, divide.data = 1, record.chains = FALSE, write.file = FALSE, bootstrap.iters = 10000, plot.graph = TRUE, skip.mcmc = NULL, stat.method = c("MCMC", "WAAVP", "Bootstrap"), custom.model = NA, ...)
reduction_analysis(name = NA, pre.data = NA, post.data = NA, data = list(pre = pre.data, post = post.data), animal.names = FALSE, efficacy = 95, confidence = 95, restrict.efficacy = TRUE, control.animals = FALSE, paired.model = FALSE, fix.variation = FALSE, fix.efficacy = TRUE, zero.inflation = FALSE, individual.analysis = FALSE, divide.data = 1, record.chains = FALSE, write.file = FALSE, bootstrap.iters = 10000, plot.graph = TRUE, skip.mcmc = NULL, stat.method = c("MCMC", "WAAVP", "Bootstrap"), custom.model = NA, ...)
name |
a name for the analysis (character). Missing by default (function will require it to be input). |
pre.data |
the pre-treatment data. Either a numeric vector, amatrix with repeated McMasters counts from the same sample (each row) indifferent columns, or an array with different animals in dimension 1,repeat samples from the same animal in dimension 2, repeated McMasterscounts from the same sample in dimension 3 (can be of length 1 if only 1count recorded). If an array, then the use.paired must be TRUE. Nodefault. Ignored if a value is specified for data. |
post.data |
the post-treatment data, for acceptable formats seepre.data. |
data |
either a path to a comma delimited csv file, or an existingR object containing the data. Data can be specified in one of thefollowing ways. If a matrix, the first column is pre-treatment data,the second column is the post-treatment data, and the third column (ifsupplied) indicates control (1) or treatment (0) animal. Alternatively,the first column is animal names id animal.names = TRUE, with columns 2and 3 making up the pre- and post-treatment data and optionally column 4the treatment of control status. If the data is specified as a list,then the first element is taken as the pre-treatment data, and thesecond element is taken as the post-treatment data. These elements ofthe list can be provided as for pre.data and post.data. If the data isin a comma delimited file, it must be in the format specified for asingle matrix. Missing data, or unused elements of non-ragged arrays,may be represented using NA, which will be removed from the data beforeanalysis. This argument is taken from the specified values of pre.dataand post.data by default. If a value is specified for data thenarguments specified for pre.data and post.data are ignored. |
animal.names |
either a character vector of names to be used foreach animal specified in the data, TRUE or FALSE. If TRUE, then animalnames are taken from the first column of the matrix supplied for data (amatrix must be supplied for data if animal.names is TRUE). If FALSE,then animal names are generated automatically. Default FALSE. |
efficacy |
the target % efficacy of the drug used. Used tocalculate the probability of resistance with the MCMC and bootstrapmethods. Default 95%. |
confidence |
the degree of confidence required with which to reportthe confidence limits for the true FEC reduction. |
restrict.efficacy |
option to restrict the estimate of efficacyfrom the MCMC method to values of greater than 0 (i.e. the mean faecalegg count cannot increase after treatment). If TRUE, then the prior forchange in mean egg count is uniform between 0 and 1, and if FALSE thenthe prior for change in mean egg count is uniform between 0 and 10. Ifusing control animals, then restrict.efficacy applies to the change inmean of the treated animals only, allowing the efficacy to be below 0. Default TRUE. |
control.animals |
indication of which animals are to be used ascontrols. Should be either a vector of TRUE/FALSE (or 1/0)corresponding to whether each animal is a control (TRUE) or treatment(FALSE), or simply FALSE (the default) in which case all animals areassumed to be treated. Ignored if data is specified as a matrix (or csvfile) with 3 columns, in which case the third column should reflect thetreatment status. Default FALSE. |
paired.model |
the FECRT can be run using two compound gammadistributions to describe the variability within and between animals inplace of the single gamma distribution combining the two sources ofvariability. When using two compound gamma distributions forpre-treatment data, the post-treatment data are paired to thepre-treatment data by animal. The advantage of using a singledistribution is that the model is more identifiable, and therefore islikely to converge more quickly and return errors less frequently. Theadvantage of the paired model is that it allows repeat measurements tobe incorporated in the model, and non-randomly missing data (ie.protocols that involve post-treatment sampling of only animals with ahigh pre-treatment count) to be modelled appropriately. The simplemodel cannot be used when using repeat samples within animal, and willprovide inaccurate results if animals are targeted for post-treatmentsampling based on their pre-treatment count. Default uses the simplemodel (FALSE). |
zero.inflation |
option to use a zero-inflated gamma Poisson inplace of the gamma Poisson distribution. If TRUE, zero inflateddistributions are used pre- and post-treatment (with the prevalencefixed between pre- and post-treatment distributions). Default FALSE. |
divide.data |
count division factor to allow egg count data in eggsper gram to be used raw (numeric). Default 1 (no transformation todata). |
record.chains |
option to allow the MCMC chains to be recorded forfuture use. If TRUE, the function returns the MCMC object as part ofthe return value (the MCMC object is not printed using the printmethod). If write.file==TRUE, the results are also saved using afilename containing the name of the analysis. Default FALSE. |
write.file |
option to write the results of the analysis to a textfile using a filename containing the name of the analysis. The contentsof the text file are identical to the return value of the function. Default FALSE. |
bootstrap.iters |
the number of bootstrap iterations to use for thebootstrap method. Default 10000. |
plot.graph |
an option to plot the posterior true egg countreduction from the MCMC method graphically. If write.file==TRUE thenthe graph is saved as a PDF, otherwise the graph is plotted on theactive device. Default FALSE. |
skip.mcmc |
option to omit the MCMC analysis, and return bootstrapand WAAVP method analysis results alone. Default FALSE. |
... |
other options to be passed directly to |
Should probably discuss different methods here
Returns a list of the results obtained using each method, printed using a custom print method.
M. J. Denwood, S. W. J. Reid, S. Love, M. K. Nielsen, L. Matthews, I. J. McKendrick, and G. T. Innocent. Comparison of three alternative methods for analysis of equine Faecal Egg Count Reduction Test data. Prev. Vet. Med. (2009), doi:10.1016/j.prevetmed.2009.11.009
reduction_model
, count_analysis
, bayescount
, autoextend.jags
, runjags
This function applies a Bayesian [zero-inflated] gamma Poisson (negative binomial) model to faecal egg count reduction test (FECRT) data to return possible values for the mean drug efficacy. Pre-treatment data are assumed to arise from either a gamma-Poisson or zero-inflated gamma-Poisson distribution, with post-treatment data described by a separate gamma-Poisson or zero-inflated gamma-Poisson distribution. The change in mean between these distributions is therefore the mean egg count reduction. If paired.model=TRUE, a slightly different formulation is used whereby the observed count pre-treatment is assumed to follow a compound gamma-gamma-Poisson distribution with the variability within and between animals separated. The post treatment mean for each animal is derived from the pre-treatment animal mean and FEC reduction. This formulation allows data with non-random missing post-treatment counts to be analysed correctly, and also allows data with repeat counts from an individual to be analysed - providing a method of increasing the power of the method substantially. Results are also obtained using non-parametric bootstrapping and the method advocated in the 1992 World Association for the Advancement of Veterinary Parasitology (W.A.A.V.P.) methods for the detection of anthelmintic resistance in nematodes of veterinary importance (unless the data contains repeat values or missing values). Confidence intervals for the relevant statistics are printed to file if write.file = TRUE, and returned using a custom print method.
Lower level running of JAGS and assessing the simulation for convergence and required run length is done using autorun.jags
. Note: The GUI interface for R in Windows may not continually refresh the output window, making it difficult to track the progress of the simulation (if silent.jags is FALSE). To avoid this, you can run the function from the terminal version of R (located in the Program Files/R/bin/ folder).
*THIS SOFTWARE IS INTENDED FOR EDUCATIONAL PURPOSES ONLY AND SHOULD NOT BE RELIED UPON FOR REAL WORLD APPLICATIONS*
fecrt.power
and fecrt.power.limits
for power analysis methods for the FECRT, bayescount
, autorun.jags
#reduction_analysis()
#reduction_analysis()
Does not require Just Another Gibbs Sampler (JAGS) (see http://mcmc-jags.sourceforge.net) but not much use otherwise
reduction_model(name = NA, pre.data = NA, post.data = NA, data = list(pre = pre.data, post = post.data), animal.names = FALSE, efficacy = 95, confidence = 95, restrict.efficacy = TRUE, control.animals = FALSE, paired.model = FALSE, fix.variation = FALSE, fix.efficacy = TRUE, zero.inflation = FALSE, individual.analysis = FALSE, divide.data = 1, record.chains = FALSE, write.file = FALSE, bootstrap.iters = 10000, plot.graph = TRUE, skip.mcmc = NULL, stat.method = c("MCMC", "WAAVP", "Bootstrap"), custom.model = NA, ...)
reduction_model(name = NA, pre.data = NA, post.data = NA, data = list(pre = pre.data, post = post.data), animal.names = FALSE, efficacy = 95, confidence = 95, restrict.efficacy = TRUE, control.animals = FALSE, paired.model = FALSE, fix.variation = FALSE, fix.efficacy = TRUE, zero.inflation = FALSE, individual.analysis = FALSE, divide.data = 1, record.chains = FALSE, write.file = FALSE, bootstrap.iters = 10000, plot.graph = TRUE, skip.mcmc = NULL, stat.method = c("MCMC", "WAAVP", "Bootstrap"), custom.model = NA, ...)
name |
a name for the analysis (character). Missing by default (function will require it to be input). |
pre.data |
the pre-treatment data. Either a numeric vector, amatrix with repeated McMasters counts from the same sample (each row) indifferent columns, or an array with different animals in dimension 1,repeat samples from the same animal in dimension 2, repeated McMasterscounts from the same sample in dimension 3 (can be of length 1 if only 1count recorded). If an array, then the use.paired must be TRUE. Nodefault. Ignored if a value is specified for data. |
post.data |
the post-treatment data, for acceptable formats seepre.data. |
data |
either a path to a comma delimited csv file, or an existingR object containing the data. Data can be specified in one of thefollowing ways. If a matrix, the first column is pre-treatment data,the second column is the post-treatment data, and the third column (ifsupplied) indicates control (1) or treatment (0) animal. Alternatively,the first column is animal names id animal.names = TRUE, with columns 2and 3 making up the pre- and post-treatment data and optionally column 4the treatment of control status. If the data is specified as a list,then the first element is taken as the pre-treatment data, and thesecond element is taken as the post-treatment data. These elements ofthe list can be provided as for pre.data and post.data. If the data isin a comma delimited file, it must be in the format specified for asingle matrix. Missing data, or unused elements of non-ragged arrays,may be represented using NA, which will be removed from the data beforeanalysis. This argument is taken from the specified values of pre.dataand post.data by default. If a value is specified for data thenarguments specified for pre.data and post.data are ignored. |
animal.names |
either a character vector of names to be used foreach animal specified in the data, TRUE or FALSE. If TRUE, then animalnames are taken from the first column of the matrix supplied for data (amatrix must be supplied for data if animal.names is TRUE). If FALSE,then animal names are generated automatically. Default FALSE. |
efficacy |
the target % efficacy of the drug used. Used tocalculate the probability of resistance with the MCMC and bootstrapmethods. Default 95%. |
confidence |
the degree of confidence required with which to reportthe confidence limits for the true FEC reduction. |
restrict.efficacy |
option to restrict the estimate of efficacyfrom the MCMC method to values of greater than 0 (i.e. the mean faecalegg count cannot increase after treatment). If TRUE, then the prior forchange in mean egg count is uniform between 0 and 1, and if FALSE thenthe prior for change in mean egg count is uniform between 0 and 10. Ifusing control animals, then restrict.efficacy applies to the change inmean of the treated animals only, allowing the efficacy to be below 0. Default TRUE. |
control.animals |
indication of which animals are to be used ascontrols. Should be either a vector of TRUE/FALSE (or 1/0)corresponding to whether each animal is a control (TRUE) or treatment(FALSE), or simply FALSE (the default) in which case all animals areassumed to be treated. Ignored if data is specified as a matrix (or csvfile) with 3 columns, in which case the third column should reflect thetreatment status. Default FALSE. |
paired.model |
the FECRT can be run using two compound gammadistributions to describe the variability within and between animals inplace of the single gamma distribution combining the two sources ofvariability. When using two compound gamma distributions forpre-treatment data, the post-treatment data are paired to thepre-treatment data by animal. The advantage of using a singledistribution is that the model is more identifiable, and therefore islikely to converge more quickly and return errors less frequently. Theadvantage of the paired model is that it allows repeat measurements tobe incorporated in the model, and non-randomly missing data (ie.protocols that involve post-treatment sampling of only animals with ahigh pre-treatment count) to be modelled appropriately. The simplemodel cannot be used when using repeat samples within animal, and willprovide inaccurate results if animals are targeted for post-treatmentsampling based on their pre-treatment count. Default uses the simplemodel (FALSE). |
zero.inflation |
option to use a zero-inflated gamma Poisson inplace of the gamma Poisson distribution. If TRUE, zero inflateddistributions are used pre- and post-treatment (with the prevalencefixed between pre- and post-treatment distributions). Default FALSE. |
divide.data |
count division factor to allow egg count data in eggsper gram to be used raw (numeric). Default 1 (no transformation todata). |
record.chains |
option to allow the MCMC chains to be recorded forfuture use. If TRUE, the function returns the MCMC object as part ofthe return value (the MCMC object is not printed using the printmethod). If write.file==TRUE, the results are also saved using afilename containing the name of the analysis. Default FALSE. |
write.file |
option to write the results of the analysis to a textfile using a filename containing the name of the analysis. The contentsof the text file are identical to the return value of the function. Default FALSE. |
bootstrap.iters |
the number of bootstrap iterations to use for thebootstrap method. Default 10000. |
plot.graph |
an option to plot the posterior true egg countreduction from the MCMC method graphically. If write.file==TRUE thenthe graph is saved as a PDF, otherwise the graph is plotted on theactive device. Default FALSE. |
skip.mcmc |
option to omit the MCMC analysis, and return bootstrapand WAAVP method analysis results alone. Default FALSE. |
... |
other options to be passed directly to |
Should probably discuss different methods here
Returns a list of the results obtained using each method, printed using a custom print method.
M. J. Denwood, S. W. J. Reid, S. Love, M. K. Nielsen, L. Matthews, I. J. McKendrick, and G. T. Innocent. Comparison of three alternative methods for analysis of equine Faecal Egg Count Reduction Test data. Prev. Vet. Med. (2009), doi:10.1016/j.prevetmed.2009.11.009
reduction_analysis
, count_model
, bayescount
, autoextend.jags
, runjags
This function applies a Bayesian [zero-inflated] gamma Poisson (negative binomial) model to faecal egg count reduction test (FECRT) data to return possible values for the mean drug efficacy. Pre-treatment data are assumed to arise from either a gamma-Poisson or zero-inflated gamma-Poisson distribution, with post-treatment data described by a separate gamma-Poisson or zero-inflated gamma-Poisson distribution. The change in mean between these distributions is therefore the mean egg count reduction. If paired.model=TRUE, a slightly different formulation is used whereby the observed count pre-treatment is assumed to follow a compound gamma-gamma-Poisson distribution with the variability within and between animals separated. The post treatment mean for each animal is derived from the pre-treatment animal mean and FEC reduction. This formulation allows data with non-random missing post-treatment counts to be analysed correctly, and also allows data with repeat counts from an individual to be analysed - providing a method of increasing the power of the method substantially. Results are also obtained using non-parametric bootstrapping and the method advocated in the 1992 World Association for the Advancement of Veterinary Parasitology (W.A.A.V.P.) methods for the detection of anthelmintic resistance in nematodes of veterinary importance (unless the data contains repeat values or missing values). Confidence intervals for the relevant statistics are printed to file if write.file = TRUE, and returned using a custom print method.
Lower level running of JAGS and assessing the simulation for convergence and required run length is done using autorun.jags
. Note: The GUI interface for R in Windows may not continually refresh the output window, making it difficult to track the progress of the simulation (if silent.jags is FALSE). To avoid this, you can run the function from the terminal version of R (located in the Program Files/R/bin/ folder).
*THIS SOFTWARE IS INTENDED FOR EDUCATIONAL PURPOSES ONLY AND SHOULD NOT BE RELIED UPON FOR REAL WORLD APPLICATIONS*
#reduction_model()
#reduction_model()
Power calculation based on BNB code and Christian approximation.
reduction_pval(pre_data, post_data, H0_1 = 0.9, H0_2 = 0.95, edt_pre = 1, edt_post = 1, prob_priors = c(1, 1), k_change = 1, true_k = NA, delta_method = TRUE, beta_iters = 10^6)
reduction_pval(pre_data, post_data, H0_1 = 0.9, H0_2 = 0.95, edt_pre = 1, edt_post = 1, prob_priors = c(1, 1), k_change = 1, true_k = NA, delta_method = TRUE, beta_iters = 10^6)
count_analysis
, reduction_power
, bayescount
Precision analysis by Monte Carlo simulation and examination of distribution of mean
reduction_precision(meanepg = 200, reduction = 95, g.faeces = 3, sensitivity = 1/25, replicates = 1, animals = 10, pre.coeffvarrep = 0.4, pre.coeffvarind = 0.3, pre.coeffvargroup = 0.7, post.coeffvarrep = 0.4, post.coeffvarind = 0.3, post.coeffvargroup = 0.7, true.sample = FALSE, lower.interval = NA, upper.interval = NA, iterations = 1e+05, mc.decimals = 0.95, mc.conf = 0.99, feedback = FALSE, mc.output = FALSE)
reduction_precision(meanepg = 200, reduction = 95, g.faeces = 3, sensitivity = 1/25, replicates = 1, animals = 10, pre.coeffvarrep = 0.4, pre.coeffvarind = 0.3, pre.coeffvargroup = 0.7, post.coeffvarrep = 0.4, post.coeffvarind = 0.3, post.coeffvargroup = 0.7, true.sample = FALSE, lower.interval = NA, upper.interval = NA, iterations = 1e+05, mc.decimals = 0.95, mc.conf = 0.99, feedback = FALSE, mc.output = FALSE)
reduction_analysis
, reduction_power
, bayescount