Econstudentlog

Common Errors in Statistics… (III)

This will be my last post about the book. I liked most of it, and I gave it four stars on goodreads, but that doesn’t mean I agreed with every observation in it. Here’s one of the things I didn’t like:

“In the univariate [model selection] case, if the errors were not normally distributed, we could take advantage of permutation methods to obtain exact significance levels in tests of the coefficients. Exact permutation methods do not exist in the multivariable case.

When selecting variables to incorporate in a multivariable model, we are forced to perform repeated tests of hypotheses, so that the resultant p-values are no longer meaningful. One solution, if sufficient data are available, is to divide the dataset into two parts, using the first part to select variables, and the second part to test these same variables for significance.” (chapter 13)

The basic idea is to use the results of hypothesis tests to decide which variables to include in the model. This is common practice, and it is bad practice. I found it surprising that such a piece of advice would be included in this book; I’d have thought this was precisely the sort of thing a book like this one would tell people not to do. I’ve said this before multiple times on this blog, but I’ll keep saying it, especially when I find this sort of advice in statistics textbooks: using hypothesis testing as a basis for model selection is an invalid approach to model selection, and it’s in general a terrible idea. “There is no statistical theory that supports the notion that hypothesis testing with a fixed α level is a basis for model selection.” (Burnham & Anderson). Use information criteria, not hypothesis tests, to make your model selection decisions. (And read Burnham & Anderson’s book on these topics.)
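
None of the code below is from the book — it is my own quick pure-Python sketch, with made-up data, of what the information-criterion alternative looks like when choosing between candidate predictors. The Gaussian AIC formula n·log(RSS/n) + 2k is standard; the data and variable names are entirely hypothetical.

```python
import math

# Hypothetical data: y depends on x1; x2 is unrelated noise.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [0.3, -0.1, 0.2, 0.0, -0.2, 0.1, -0.3, 0.2]
y  = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

def rss_simple(ys, xs):
    """Residual sum of squares of the simple OLS fit y = a + b*x."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    b = sum((x - mx) * (yv - my) for x, yv in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return sum((yv - (a + b * x)) ** 2 for x, yv in zip(xs, ys))

def aic(rss, n, k):
    """Gaussian AIC up to an additive constant: n*log(RSS/n) + 2k."""
    return n * math.log(rss / n) + 2 * k

n = len(y)
my = sum(y) / n
rss0 = sum((yv - my) ** 2 for yv in y)   # intercept-only model
print(aic(rss0, n, 2))                   # k counts intercept + error variance
print(aic(rss_simple(y, x1), n, 3))      # the real predictor: much lower AIC
print(aic(rss_simple(y, x2), n, 3))      # the noise predictor: no improvement
```

The model with the lowest AIC is preferred; no fixed-α hypothesis test is involved anywhere in the decision.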

Anyway, much of the material in the book is good, and it’s a very decent book overall. I’ve added some quotes and observations from the last part of the book below.

“OLS is not the only modeling technique. To diminish the effect of outliers, and treat prediction errors as proportional to their absolute magnitude rather than their squares, one should use least absolute deviation (LAD) regression. This would be the case if the conditional distribution of the dependent variable were characterized by a distribution with heavy tails (compared to the normal distribution, increased probability of values far from the mean). One should also employ LAD regression when the conditional distribution of the dependent variable given the predictors is not symmetric and we wish to estimate its median rather than its mean value.
If it is not clear which variable should be viewed as the predictor and which the dependent variable, as is the case when evaluating two methods of measurement, then one should employ Deming or errors-in-variables (EIV) regression.
If one’s primary interest is not in the expected value of the dependent variable but in its extremes (the number of bacteria that will survive treatment or the number of individuals who will fall below the poverty line), then one ought consider the use of quantile regression.
If distinct strata exist, one should consider developing separate regression models for each stratum, a technique known as ecological regression […] If one’s interest is in classification or if the majority of one’s predictors are dichotomous, then one should consider the use of classification and regression trees (CART) […] If the outcomes are limited to success or failure, one ought employ logistic regression. If the outcomes are counts rather than continuous measurements, one should employ a generalized linear model (GLM).”
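
To make the OLS-versus-LAD contrast in the passage above concrete, here is a small sketch of my own: made-up data, a deliberately simplified one-parameter model y = b·x through the origin, and a crude grid search standing in for the linear-programming solver a real LAD fit would use. A single gross outlier drags the least-squares slope, while the least-absolute-deviations slope stays near the slope of the clean points.

```python
# Hypothetical data: last y value is a gross outlier; clean slope is about 2.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.0, 4.1, 5.9, 8.0, 10.1, 12.0, 13.9, 40.0]

# OLS slope for y = b*x has a closed form.
b_ols = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

# LAD slope: minimise the sum of absolute residuals by a coarse grid search.
def lad_loss(b):
    return sum(abs(yi - b * xi) for xi, yi in zip(x, y))

b_lad = min((b / 1000 for b in range(0, 5001)), key=lad_loss)

print(b_ols)  # pulled well above 2 by the single outlier
print(b_lad)  # stays at the slope of the clean points, 2.0
```

The squared-error loss lets the outlier dominate; the absolute-error loss treats its contribution as proportional to its magnitude, exactly the property the book describes.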

“Linear regression is a much misunderstood and mistaught concept. If a linear model provides a good fit to data, this does not imply that a plot of the dependent variable with respect to the predictor would be a straight line, only that a plot of the dependent variable with respect to some not-necessarily monotonic function of the predictor would be a line. For example, y = A + B log[x] and y = A cos(x) + B sin(x) are both linear models whose coefficients A and B might be derived by OLS or LAD methods. Y = Ax^5 is a linear model. Y = x^A is nonlinear. […] Perfect correlation (ρ^2 = 1) does not imply that two variables are identical but rather that one of them, Y, say, can be written as a linear function of the other, Y = a + bX, where b is the slope of the regression line and a is the intercept. […] Nonlinear regression methods are appropriate when the form of the nonlinear model is known in advance. For example, a typical pharmacological model will have the form A exp[bX] + C exp[dW]. The presence of numerous locally optimal but globally suboptimal solutions creates challenges, and validation is essential. […] To be avoided are a recent spate of proprietary algorithms available solely in software form that guarantee to find a best-fitting solution. In the words of John von Neumann, “With four parameters I can fit an elephant and with five I can make him wiggle his trunk.””
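
The point that “linear” means linear in the coefficients can be shown in a few lines. This is my own illustration with noiseless hypothetical data: generate y = A + B·log(x), transform the predictor to u = log(x), and simple OLS on u recovers A and B exactly.

```python
import math

# Hypothetical noiseless data from y = A + B*log(x), with A = 1, B = 3.
A_true, B_true = 1.0, 3.0
x = [1, 2, 4, 8, 16, 32]
y = [A_true + B_true * math.log(xi) for xi in x]

u = [math.log(xi) for xi in x]               # the "straightened" predictor
mu, my = sum(u) / len(u), sum(y) / len(y)
B = sum((ui - mu) * (yi - my) for ui, yi in zip(u, y)) / \
    sum((ui - mu) ** 2 for ui in u)
A = my - B * mu

print(A, B)  # recovers A = 1, B = 3 (up to rounding)
```

The fitted curve is anything but a straight line in x, yet the model is linear in A and B, which is all that OLS requires.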

“[T]he most common errors associated with quantile regression include: 1. Failing to evaluate whether the model form is appropriate, for example, forcing linear fit through an obvious nonlinear response. (Of course, this is also a concern with mean regression, OLS, LAD, or EIV.) 2. Trying to over-interpret a single quantile estimate (say 0.85) with a statistically significant nonzero slope (p < 0.05) when the majority of adjacent quantiles (say 0.5–0.84 and 0.86–0.95) are clearly zero (p > 0.20). 3. Failing to use all the information a quantile regression provides. Even if you think you are only interested in relations near maximum (say 0.90–0.99), your understanding will be enhanced by having estimates (and sampling variation via confidence intervals) across a wide range of quantiles (say 0.01–0.99).”

“Survival analysis is used to assess time-to-event data including time to recovery and time to revision. Most contemporary survival analysis is built around the Cox model […] Possible sources of error in the application of this model include all of the following: *Neglecting the possible dependence of the baseline function λ_0 on the predictors. *Overmatching, that is, using highly correlated predictors that may well mask each other’s effects. *Using the parametric Breslow or Kaplan–Meier estimators of the survival function rather than the nonparametric Nelson–Aalen estimator. *Excluding patients based on post-hoc criteria. Pathology workups on patients who died during the study may reveal that some of them were wrongly diagnosed. Regardless, patients cannot be eliminated from the study as we lack the information needed to exclude those who might have been similarly diagnosed but who are still alive at the conclusion of the study. *Failure to account for differential susceptibility (frailty) of the patients”.

“In reporting the results of your modeling efforts, you need to be explicit about the methods used, the assumptions made, the limitations on your model’s range of application, potential sources of bias, and the method of validation […] Multivariable regression is plagued by the same problems univariate regression is heir to, plus many more of its own. […] If choosing the correct functional form of a model in a univariate case presents difficulties, consider that in the case of k variables, there are k linear terms (should we use logarithms? should we add polynomial terms?) and k(k − 1) first-order cross products of the form x_i x_k. Should we include any of the k(k − 1)(k − 2) second-order cross products? A common error is to attribute the strength of a relationship to the magnitude of the predictor’s regression coefficient […] Just scale the units in which the predictor is reported to see how erroneous such an assumption is. […] One of the main problems in multiple regression is multicollinearity, which is the correlation among predictors. Even relatively weak levels of multicollinearity are enough to generate instability in multiple regression models […]. A simple solution is to evaluate the correlation matrix M among predictors, and use this matrix to choose the predictors that are less correlated. […] Test M for each predictor, using the variance inflation factor (VIF) given by 1/(1 − R^2), where R^2 is the multiple coefficient of determination of the predictor against all other predictors. If VIF is large for a given predictor (>8, say) delete this predictor and reestimate the model. […] Dropping collinear variables from the analysis can result in a substantial loss of power”.
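
The VIF check described above is easy to sketch by hand. In the two-predictor case the “regression of the predictor against all other predictors” is just a simple regression, so VIF = 1/(1 − R²) reduces to a few lines; the collinear data below are my own hypothetical example, and the cutoff of 8 is the book’s rule of thumb.

```python
# Hypothetical, deliberately collinear predictors: x2 is roughly 2 * x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]

def r_squared(ys, xs):
    """R^2 of the simple OLS regression of ys on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

vif = 1.0 / (1.0 - r_squared(x2, x1))
print(vif)  # far above the rule-of-thumb cutoff of 8
```

With more predictors one would regress each predictor on all the others and compute the multiple R² the same way; the interpretation of the resulting VIF is unchanged.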

“It can be difficult to predict the equilibrium point for a supply-and-demand model, because producers change their price in response to demand and consumers change their demand in response to price. Failing to account for endogenous variables can lead to biased estimates of the regression coefficients.
Endogeneity can arise not only as a result of omitted variables, but of measurement error, autocorrelated errors, simultaneity, and sample selection errors. One solution is to make use of instrumental variables that should satisfy two conditions: 1. They should be correlated with the endogenous explanatory variables, conditional on the other covariates. 2. They should not be correlated with the error term in the explanatory equation, that is, they should not suffer from the same problem as the original predictor.
Instrumental variables are commonly used to estimate causal effects in contexts in which controlled experiments are not possible, for example in estimating the effects of past and projected government policies.”
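
Here is a toy sketch of my own of why an instrument helps. The data are constructed so that the error term u is correlated with the predictor x (that is the endogeneity) but orthogonal to the instrument z, and I use the simple one-instrument IV estimator slope = cov(z, y)/cov(z, x); none of this is from the book.

```python
# Hypothetical data: u is a confounder built to be exactly orthogonal to z.
z = [1, 2, 3, 4, 5, 6, 7, 8]
u = [1, -1, -1, 1, 1, -1, -1, 1]           # correlated with x, not with z
x = [zi + ui for zi, ui in zip(z, u)]      # endogenous predictor
y = [2 * xi + ui for xi, ui in zip(x, u)]  # true structural slope is 2

def cov(a, b):
    """Sum of products of deviations (unnormalised covariance)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))

print(cov(x, y) / cov(x, x))   # naive OLS slope: biased upward (2.16 here)
print(cov(z, y) / cov(z, x))   # IV slope: recovers the true value 2
```

Because u enters both x and the error, OLS attributes part of u's effect to x; the instrument, being uncorrelated with u, strips that part out.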

“[T]he following errors are frequently associated with factor analysis: *Applying it to datasets with too few cases in relation to the number of variables analyzed […], without noticing that correlation coefficients have very wide confidence intervals in small samples. *Using oblique rotation to get a number of factors bigger or smaller than the number of factors obtained in the initial extraction by principal components, as a way to show the validity of a questionnaire. For example, obtaining only one factor by principal components and using the oblique rotation to justify that there were two differentiated factors, even when the two factors were correlated and the variance explained by the second factor was very small. *Confusion among the total variance explained by a factor and the variance explained in the reduced factorial space. In this way a researcher interpreted that a given group of factors explaining 70% of the variance before rotation could explain 100% of the variance after rotation.”

“Poisson regression is appropriate when the dependent variable is a count, as is the case with the arrival of individuals in an emergency room. It is also applicable to the spatial distributions of tornadoes and of clusters of galaxies. To be applicable, the events underlying the outcomes must be independent […] A strong assumption of the Poisson regression model is that the mean and variance are equal (equidispersion). When the variance of a sample exceeds the mean, the data are said to be overdispersed. Fitting the Poisson model to overdispersed data can lead to misinterpretation of coefficients due to poor estimates of standard errors. Naturally occurring count data are often overdispersed due to correlated errors in time or space, or other forms of nonindependence of the observations. One solution is to fit a Poisson model as if the data satisfy the assumptions, but adjust the model-based standard errors usually employed. Another solution is to estimate a negative binomial model, which allows for scalar overdispersion.”
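
The quickest diagnostic for the equidispersion assumption is the sample variance-to-mean ratio, which should sit near 1 for Poisson-like counts. A small sketch with two hypothetical count samples of my own:

```python
import statistics

# Hypothetical counts: the first sample is Poisson-like, the second is
# clumped (lots of zeros plus a few bursts), as clustered data often are.
poisson_like = [0, 2, 3, 5, 1, 4, 3, 6, 2, 4]
overdispersed = [0, 1, 0, 12, 0, 0, 15, 1, 0, 11]

for counts in (poisson_like, overdispersed):
    ratio = statistics.variance(counts) / statistics.mean(counts)
    print(round(ratio, 2))   # near 1 for the first, far above 1 for the second
```

A ratio well above 1 is the warning sign that model-based Poisson standard errors will be too small, and that robust errors or a negative binomial model are called for.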

“When multiple observations are collected for each principal sampling unit, we refer to the collected information as panel data, correlated data, or repeated measures. […] The dependency of observations violates one of the tenets of regression analysis: that observations are supposed to be independent and identically distributed or IID. Several concerns arise when observations are not independent. First, the effective number of observations (that is, the effective amount of information) is less than the physical number of observations […]. Second, any model that fails to specifically address [the] correlation is incorrect […]. Third, although the correct specification of the correlation will yield the most efficient estimator, that specification is not the only one to yield a consistent estimator.”

“The basic issue in deciding whether to utilize a fixed- or random-effects model is whether the sampling units (for which multiple observations are collected) represent the collection of most or all of the entities for which inference will be drawn. If so, the fixed-effects estimator is to be preferred. On the other hand, if those same sampling units represent a random sample from a larger population for which we wish to make inferences, then the random-effects estimator is more appropriate. […] Fixed- and random-effects models address unobserved heterogeneity. The random-effects model assumes that the panel-level effects are randomly distributed. The fixed-effects model assumes a constant disturbance that is a special case of the random-effects model. If the random-effects assumption is correct, then the random-effects estimator is more efficient than the fixed-effects estimator. If the random-effects assumption does not hold […], then the random effects model is not consistent. To help decide whether the fixed- or random-effects model is more appropriate, use the Durbin–Wu–Hausman test comparing coefficients from each model. […] Although fixed-effects estimators and random-effects estimators are referred to as subject-specific estimators, the GEEs available through PROC GENMOD in SAS or xtgee in Stata, are called population-averaged estimators. This label refers to the interpretation of the fitted regression coefficients. Subject-specific estimators are interpreted in terms of an effect for a given panel, whereas population-averaged estimators are interpreted in terms of an effect averaged over panels.”

“A favorite example in comparing subject-specific and population-averaged estimators is to consider the difference in interpretation of regression coefficients for a binary outcome model on whether a child will exhibit symptoms of respiratory illness. The predictor of interest is whether or not the child’s mother smokes. Thus, we have repeated observations on children and their mothers. If we were to fit a subject-specific model, we would interpret the coefficient on smoking as the change in likelihood of respiratory illness as a result of the mother switching from not smoking to smoking. On the other hand, the interpretation of the coefficient in a population-averaged model is the likelihood of respiratory illness for the average child with a nonsmoking mother compared to the likelihood for the average child with a smoking mother. Both models offer equally valid interpretations. The interpretation of interest should drive model selection; some studies ultimately will lead to fitting both types of models. […] In addition to model-based variance estimators, fixed-effects models and GEEs [Generalized Estimating Equation models] also admit modified sandwich variance estimators. SAS calls this the empirical variance estimator. Stata refers to it as the Robust Cluster estimator. Whatever the name, the most desirable property of the variance estimator is that it yields inference for the regression coefficients that is robust to misspecification of the correlation structure. […] Specification of GEEs should include careful consideration of reasonable correlation structure so that the resulting estimator is as efficient as possible. To protect against misspecification of the correlation structure, one should base inference on the modified sandwich variance estimator. This is the default estimator in SAS, but the user must specify it in Stata.”

“There are three main approaches to [model] validation: 1. Independent verification (obtained by waiting until the future arrives or through the use of surrogate variables). 2. Splitting the sample (using one part for calibration, the other for verification) 3. Resampling (taking repeated samples from the original sample and refitting the model each time).
Goodness of fit is no guarantee of predictive success. […] Splitting the sample into two parts, one for estimating the model parameters, the other for verification, is particularly appropriate for validating time series models in which the emphasis is on prediction or reconstruction. If the observations form a time series, the more recent observations should be reserved for validation purposes. Otherwise, the data used for validation should be drawn at random from the entire sample. Unfortunately, when we split the sample and use only a portion of it, the resulting estimates will be less precise. […] The proportion to be set aside for validation purposes will depend upon the loss function. If both the goodness-of-fit error in the calibration sample and the prediction error in the validation sample are based on mean-squared error, Picard and Berk [1990] report that we can minimize their sum by using between a quarter and a third of the sample for validation purposes.”
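
The time-ordered split the passage recommends is simple to sketch. The series, the two-thirds/one-third split, and the linear-trend model below are all my own hypothetical choices; the point is only the mechanics: fit on the earlier part, report error on the most recent part.

```python
# Hypothetical near-linear time series; the most recent third is held out.
series = [1.2, 2.1, 2.9, 4.2, 5.1, 5.8, 7.1, 8.0, 8.9, 10.1, 11.0, 11.9]
t = list(range(len(series)))

split = int(len(series) * 2 / 3)            # roughly a third for validation
t_fit, y_fit = t[:split], series[:split]
t_val, y_val = t[split:], series[split:]

# Simple OLS trend fitted on the calibration part only.
mt, my = sum(t_fit) / len(t_fit), sum(y_fit) / len(y_fit)
b = sum((ti - mt) * (yi - my) for ti, yi in zip(t_fit, y_fit)) / \
    sum((ti - mt) ** 2 for ti in t_fit)
a = my - b * mt

mse_val = sum((yi - (a + b * ti)) ** 2 for ti, yi in zip(t_val, y_val)) / len(t_val)
print(mse_val)   # honest out-of-sample error on the held-out recent data
```

Note that the recent observations, not a random subset, are reserved for validation, exactly as the book prescribes for time series.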


November 13, 2017 Posted by | Books, Statistics

Common Errors in Statistics… (II)

Some more observations from the book below:

“[A] multivariate test can be more powerful than a test based on a single variable alone, providing the additional variables are relevant. Adding variables that are unlikely to have value in discriminating among the alternative hypotheses simply because they are included in the dataset can only result in a loss of power. Unfortunately, what works when making a comparison between two populations based on a single variable fails when we attempt a multivariate comparison. Unless the data are multivariate normal, Hotelling’s T^2, the multivariate analog of Student’s t, will not provide tests with the desired significance level. Only samples far larger than those we are likely to afford in practice are likely to yield multivariate results that are close to multivariate normal. […] [A]n exact significance level can [however] be obtained in the multivariate case regardless of the underlying distribution by making use of the permutation distribution of Hotelling’s T^2.”

“If you are testing against a one-sided alternative, for example, no difference versus improvement, then you require a one-tailed or one-sided test. If you are doing a head-to-head comparison — which alternative is best? — then a two-tailed test is required. […] A comparison of two experimental effects requires a statistical test on their difference […]. But in practice, this comparison is often based on an incorrect procedure involving two separate tests in which researchers conclude that effects differ when one effect is significant (p < 0.05) but the other is not (p > 0.05). Nieuwenhuis, Forstmann, and Wagenmakers [2011] reviewed 513 behavioral, systems, and cognitive neuroscience articles in five top-ranking journals and found that 78 used the correct procedure and 79 used the incorrect procedure. […] When the logic of a situation calls for demonstration of similarity rather than differences among responses to various treatments, then equivalence tests are often more relevant than tests with traditional no-effect null hypotheses […] Two distributions F and G, such that G[x] = F[x − δ], are said to be equivalent providing |δ| < Δ, where Δ is the smallest difference of clinical significance. To test for equivalence, we obtain a confidence interval for δ, rejecting equivalence only if this interval contains values in excess of |Δ|. The width of a confidence interval decreases as the sample size increases; thus, a very large sample may be required to demonstrate equivalence just as a very large sample may be required to demonstrate a clinically significant effect.”
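
The equivalence logic at the end of that passage can be sketched in a few lines. Everything below is my own hypothetical illustration: two made-up samples, a made-up clinical margin Δ = 1.0, and a normal-approximation interval (a t quantile would be the more careful choice at this sample size); equivalence is declared only when the whole interval for δ sits inside (−Δ, Δ).

```python
import math
import statistics

DELTA = 1.0   # hypothetical smallest difference of clinical significance
a = [5.1, 4.9, 5.3, 5.0, 4.8, 5.2, 5.1, 4.9]
b = [5.0, 5.2, 4.9, 5.1, 5.3, 4.8, 5.0, 5.2]

diff = statistics.mean(a) - statistics.mean(b)
se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
lo, hi = diff - 1.96 * se, diff + 1.96 * se   # approximate 95% interval

equivalent = -DELTA < lo and hi < DELTA
print(equivalent)   # True here: the whole interval sits inside the margin
```

Note the asymmetry with ordinary testing: a tiny sample gives a wide interval and therefore *fails* to demonstrate equivalence, which is exactly the book's point about sample size.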

“The most common test for comparing the means of two populations is based upon Student’s t. For Student’s t-test to provide significance levels that are exact rather than approximate, all the observations must be independent and, under the null hypothesis, all the observations must come from identical normal distributions. Even if the distribution is not normal, the significance level of the t-test is almost exact for sample sizes greater than 12; for most of the distributions one encounters in practice, the significance level of the t-test is usually within a percent or so of the correct value for sample sizes between 6 and 12. For testing against nonnormal alternatives, more powerful tests than the t-test exist. For example, a permutation test replacing the original observations with their normal scores is more powerful than the t-test […]. Permutation tests are derived by looking at the distribution of values the test statistic would take for each of the possible assignments of treatments to subjects. For example, if in an experiment two treatments were assigned at random to six subjects so that three subjects got one treatment and three the other, there would have been a total of 20 possible assignments of treatments to subjects. To determine a p-value, we compute for the data in hand each of the 20 possible values the test statistic might have taken. We then compare the actual value of the test statistic with these 20 values. If our test statistic corresponds to the most extreme value, we say that p = 1/20 = 0.05 (or 1/10 = 0.10 if this is a two-tailed permutation test). Against specific normal alternatives, this two-sample permutation test provides a most powerful unbiased test of the distribution-free hypothesis that the centers of the two distributions are the same […]. 
Violation of assumptions can affect not only the significance level of a test but the power of the test […] For example, although the significance level of the t-test is robust to departures from normality, the power of the t-test is not.”
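
The 3-versus-3 example in the passage above is small enough to run in full. This sketch (my own, with hypothetical responses in which the first three subjects received the treatment) enumerates all C(6,3) = 20 assignments and uses the treatment-group sum as the test statistic:

```python
from itertools import combinations

# Hypothetical responses; the first three subjects got the treatment.
responses = [12.1, 11.8, 12.5, 9.9, 10.2, 10.4]
observed = sum(responses[:3])

# All 20 ways three of the six responses could have been the treatment group.
sums = [sum(responses[i] for i in idx) for idx in combinations(range(6), 3)]

# One-sided p-value: fraction of assignments at least as extreme as observed.
p = sum(1 for s in sums if s >= observed) / len(sums)
print(len(sums), p)   # 20 assignments; p = 1/20 = 0.05 when observed is most extreme
```

No distributional assumption enters anywhere: the reference distribution is generated entirely by the random-assignment mechanism itself.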

“Group randomized trials (GRTs) in public health research typically use a small number of randomized groups with a relatively large number of participants per group. Typically, some naturally occurring groups are targeted: work sites, schools, clinics, neighborhoods, even entire towns or states. A group can be assigned to either the intervention or control arm but not both; thus, the group is nested within the treatment. This contrasts with the approach used in multicenter clinical trials, in which individuals within groups (treatment centers) may be assigned to any treatment. GRTs are characterized by a positive correlation of outcomes within a group and by the small number of groups. Feng et al. [2001] report a positive intraclass correlation (ICC) between the individuals’ target-behavior outcomes within the same group. […] The variance inflation factor (VIF) as a result of such commonalities is 1 + (n − 1)σ. […] Although σ in GRTs is usually quite small, the VIFs could still be quite large because VIF is a function of the product of the correlation and group size n. […] To be appropriate, an analysis method for GRTs needs to acknowledge both the ICC and the relatively small number of groups.”
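
The design-effect arithmetic in that formula (the book writes σ for the ICC) is worth seeing with numbers. The ICC and group size below are my own hypothetical choices: even an ICC of 0.01 inflates the variance nearly six-fold once groups hold 500 people.

```python
# VIF = 1 + (n - 1) * ICC, with a small hypothetical ICC and a large group.
icc, n = 0.01, 500
vif = 1 + (n - 1) * icc
print(vif)   # 5.99: the effective information is cut roughly six-fold
```

An analysis that ignores the clustering therefore reports standard errors that are far too small, which is exactly the failure mode the passage warns about.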

“Recent simulations reveal that the classic test based on Pearson correlation is almost distribution free [Good, 2009]. Still, too often we treat a test of the correlation between two variables X and Y as if it were a test of their independence. X and Y can have a zero correlation coefficient, yet be totally dependent (for example, Y = X2). Even when the expected value of Y is independent of the expected value of X, the variance of Y might be directly proportional to the variance of X.”
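
The Y = X² example can be computed exactly. With a symmetric set of x values (my own minimal choice), the Pearson correlation is identically zero even though y is a deterministic function of x:

```python
# Zero correlation despite total dependence: y = x^2 on symmetric x values.
x = [-2, -1, 0, 1, 2]
y = [xi ** 2 for xi in x]

mx, my = sum(x) / len(x), sum(y) / len(y)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
sxx = sum((xi - mx) ** 2 for xi in x)
syy = sum((yi - my) ** 2 for yi in y)
r = sxy / (sxx * syy) ** 0.5

print(r)   # exactly 0, yet y is fully determined by x
```

Correlation measures only the linear component of association; a test of zero correlation is not a test of independence.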

“[O]ne of the most common statistical errors is to assume that because an effect is not statistically significant it does not exist. One of the most common errors in using the analysis of variance is to assume that, because a factor such as sex does not yield a significant p-value, we may eliminate it from the model. […] The process of eliminating nonsignificant factors one by one from an analysis of variance means that we are performing a series of tests rather than a single test; thus, the actual significance level is larger than the declared significance level.”

“The greatest error associated with the use of statistical procedures is to make the assumption that one single statistical methodology can suffice for all applications. From time to time, a new statistical procedure will be introduced or an old one revived along with the assertion that at last the definitive solution has been found. […] Every methodology [however] has a proper domain of application and another set of applications for which it fails. Every methodology has its drawbacks and its advantages, its assumptions and its sources of error.”

“[T]o use the bootstrap or any other statistical methodology effectively, one has to be aware of its limitations. The bootstrap is of value in any situation in which the sample can serve as a surrogate for the population. If the sample is not representative of the population because the sample is small or biased, not selected at random, or its constituents are not independent of one another, then the bootstrap will fail. […] When using Bayesian methods[:] Do not use an arbitrary prior. Never report a p-value. Incorporate potential losses in the decision. Report the Bayes’ factor. […] In performing a meta-analysis, we need to distinguish between observational studies and randomized trials. Confounding and selection bias can easily distort the findings from observational studies. […] Publication and selection bias also plague the meta-analysis of completely randomized trials. […] One can not incorporate in a meta-analysis what one is not aware of. […] Similarly, the decision as to which studies to incorporate can dramatically affect the results. Meta-analyses of the same issue may reach opposite conclusions […] Where there are substantial differences between the different studies incorporated in a meta-analysis (their subjects or their environments), or substantial quantitative differences in the results from the different trials, a single overall summary estimate of treatment benefit has little practical applicability […]. Any analysis that ignores this heterogeneity is clinically misleading and scientifically naive […]. Heterogeneity should be scrutinized, with an attempt to explain it […] Bayesian methods can be effective in meta-analyses […]. In such situations, the parameters of various trials are considered to be random samples from a distribution of trial parameters. The parameters of this higher-level distribution are called hyperparameters, and they also have distributions. The model is called hierarchical. 
The extent to which the various trials reinforce each other is determined by the data. If the trials are very similar, the variation of the hyperparameters will be small, and the analysis will be very close to a classical meta-analysis. If the trials do not reinforce each other, the conclusions of the hierarchical Bayesian analysis will show a very high variance in the results. A hierarchical Bayesian analysis avoids the necessity of a prior decision as to whether the trials can be combined; the extent of the combination is determined purely by the data. This does not come for free; in contrast to the meta-analyses discussed above, all the original data (or at least the sufficient statistics) must be available for inclusion in the hierarchical model. The Bayesian method is also vulnerable to […] selection bias”.

“For small samples of three to five observations, summary statistics are virtually meaningless. Reproduce the actual observations; this is easier to do and more informative. Though the arithmetic mean or average is in common use for summarizing measurements, it can be very misleading. […] When the arithmetic mean is meaningful, it is usually equal to or close to the median. Consider reporting the median in the first place. The geometric mean is more appropriate than the arithmetic in three sets of circumstances: 1. When losses or gains can best be expressed as a percentage rather than a fixed value. 2. When rapid growth is involved, as is the case with bacterial and viral populations. 3. When the data span several orders of magnitude, as with the concentration of pollutants. […] Most populations are actually mixtures of populations. If multiple modes are observed in samples greater than 25 in size, the number of modes should be reported. […] The terms dispersion, precision, and accuracy are often confused. Dispersion refers to the variation within a sample or a population. Standard measures of dispersion include the variance, the mean absolute deviation, the interquartile range, and the range. Precision refers to how close several estimates based upon successive samples will come to one another, whereas accuracy refers to how close an estimate based on a sample will come to the population parameter it is estimating.”
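
Circumstance 3 above (data spanning orders of magnitude) is easy to illustrate; the pollutant-style concentrations below are my own made-up numbers. The geometric mean is the exponential of the mean log, and one large value that dominates the arithmetic mean barely moves it:

```python
import math

# Hypothetical concentrations spanning three-plus orders of magnitude.
concentrations = [0.2, 0.5, 1.0, 2.0, 500.0]

arith = sum(concentrations) / len(concentrations)
geom = math.exp(sum(math.log(c) for c in concentrations) / len(concentrations))

print(arith)   # about 100.7, dominated by the single large value
print(geom)    # about 2.5, close to the bulk of the data
```

(Recent Python versions also ship this as `statistics.geometric_mean`.)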

“One of the most egregious errors in statistics, one encouraged, if not insisted upon by the editors of journals in the biological and social sciences, is the use of the notation “Mean ± Standard Error” to report the results of a set of observations. The standard error is a useful measure of population dispersion if the observations are continuous measurements that come from a normal or Gaussian distribution. […] But if the observations come from a nonsymmetric distribution such as an exponential or a Poisson, or a truncated distribution such as the uniform, or a mixture of populations, we cannot draw any such inference. Recall that the standard error equals the standard deviation divided by the square root of the sample size […] As the standard error depends on the squares of individual observations, it is particularly sensitive to outliers. A few extreme or outlying observations will have a dramatic impact on its value. If you can not be sure your observations come from a normal distribution, then consider reporting your results either in the form of a histogram […] or a Box and Whiskers plot […] If the underlying distribution is not symmetric, the use of the ± SE notation can be deceptive as it suggests a nonexistent symmetry. […] When the estimator is other than the mean, we cannot count on the Central Limit Theorem to ensure a symmetric sampling distribution. We recommend that you use the bootstrap whenever you report an estimate of a ratio or dispersion. […] If you possess some prior knowledge of the shape of the population distribution, you should take advantage of that knowledge by using a parametric bootstrap […]. The parametric bootstrap is particularly recommended for use in determining the precision of percentiles in the tails (P20, P10, P90, and so forth).”
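
The bootstrap recommendation for estimators other than the mean can be sketched with the stdlib alone. This is my own minimal percentile bootstrap for a median, on a made-up skewed sample with an outlier; the resample count and seed are arbitrary:

```python
import random
import statistics

random.seed(1)
sample = [1.1, 1.3, 1.8, 2.0, 2.2, 2.4, 3.1, 3.5, 9.7]  # skewed, one outlier

# Percentile bootstrap: resample with replacement, recompute the median.
boot = sorted(
    statistics.median(random.choices(sample, k=len(sample)))
    for _ in range(2000)
)
lo, hi = boot[int(0.025 * 2000)], boot[int(0.975 * 2000)]
print(lo, hi)   # a 95% percentile interval for the median
```

Nothing here leans on the Central Limit Theorem or on a symmetric sampling distribution, which is precisely why the book recommends the bootstrap in this situation.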

“A common error is to misinterpret the confidence interval as a statement about the unknown parameter. It is not true that the probability that a parameter is included in a 95% confidence interval is 95%. What is true is that if we derive a large number of 95% confidence intervals, we can expect the true value of the parameter to be included in the computed intervals 95% of the time. (That is, the true values will be included if the assumptions on which the tests and confidence intervals are based are satisfied 100% of the time.) Like the p-value, the upper and lower confidence limits of a particular confidence interval are random variables, for they depend upon the sample that is drawn. […] In interpreting a confidence interval based on a test of significance, it is essential to realize that the center of the interval is no more likely than any other value, and the confidence to be placed in the interval is no greater than the confidence we have in the experimental design and statistical test it is based upon.”
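
The correct frequentist reading above is easy to demonstrate by simulation. This sketch (my own; the population, sample size, and trial count are arbitrary, and it uses the known-σ interval for simplicity) draws many samples from a fixed population and counts how often the computed 95% interval covers the fixed true mean:

```python
import random
import statistics

random.seed(7)
TRUE_MEAN, SIGMA, N, TRIALS = 10.0, 2.0, 30, 2000
Z = 1.96   # standard normal 97.5% quantile

covered = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    m = statistics.mean(sample)
    half = Z * SIGMA / N ** 0.5          # known-sigma interval for simplicity
    if m - half <= TRUE_MEAN <= m + half:
        covered += 1

print(covered / TRIALS)   # close to 0.95
```

The parameter never moves; it is the interval endpoints that vary from sample to sample, and about 95% of them happen to bracket it.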

“How accurate our estimates are and how consistent they will be from sample to sample will depend upon the nature of the error terms. If none of the many factors that contribute to the value of ε make more than a small contribution to the total, then ε will have a Gaussian distribution. If the {εi} are independent and normally distributed (Gaussian), then the ordinary least-squares estimates of the coefficients produced by most statistical software will be unbiased and have minimum variance. These desirable properties, indeed the ability to obtain coefficient values that are of use in practical applications, will not be present if the wrong model has been adopted. They will not be present if successive observations are dependent. The values of the coefficients produced by the software will not be of use if the associated losses depend on some function of the observations other than the sum of the squares of the differences between what is observed and what is predicted. In many practical problems, one is more concerned with minimizing the sum of the absolute values of the differences or with minimizing the maximum prediction error. Finally, if the error terms come from a distribution that is far from Gaussian, a distribution that is truncated, flattened or asymmetric, the p-values and precision estimates produced by the software may be far from correct.”
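To make the point about loss functions concrete, here is a toy illustration of mine (using a crude grid search rather than a proper solver): it fits a line through the origin to data with occasional large outliers under two different criteria, the sum of squared errors that OLS minimizes and the sum of absolute errors. The two losses are minimized by different slopes, and the absolute-error fit is less influenced by the outliers.

```python
import random

# Simulated data: y = 2x + noise, where 10% of the noise draws are large outliers
rng = random.Random(7)
xs = [i / 10 for i in range(1, 101)]
ys = [2 * x + (rng.gauss(0, 0.5) if rng.random() < 0.9 else rng.gauss(0, 10))
      for x in xs]

def sse(b):
    """Sum of squared errors -- the loss ordinary least squares minimizes."""
    return sum((y - b * x) ** 2 for x, y in zip(xs, ys))

def sae(b):
    """Sum of absolute errors -- an alternative loss, more robust to outliers."""
    return sum(abs(y - b * x) for x, y in zip(xs, ys))

# Crude grid search over candidate slopes from 1.0 to 3.0
grid = [i / 1000 for i in range(1000, 3001)]
b_ols = min(grid, key=sse)   # slope minimizing squared error
b_lad = min(grid, key=sae)   # slope minimizing absolute error
```

In practice one would use quantile regression or a linear-programming formulation for the absolute-error fit; the grid search is only there to keep the example self-contained.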

“I have attended far too many biology conferences at which speakers have used a significant linear regression of one variable on another as “proof” of a “linear” relationship or first-order behavior. […] The unfortunate fact, which should not be forgotten, is that if EY = a f[X], where f is a monotonically increasing function of X, then any attempt to fit the equation Y = bg[X], where g is also a monotonically increasing function of X, will result in a value of b that is significantly different from zero. The “trick,” […] is in selecting an appropriate (cause-and-effect-based) functional form g to begin with. Regression methods and expensive software will not find the correct form for you.”
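The point is easy to verify by simulation. In the sketch below (my own, with made-up parameters), the true relationship is E[Y] = 3·sqrt(X), yet a straight-line fit of Y on X still produces an overwhelmingly "significant" slope; significance tells you nothing about whether the linear form is the right one.

```python
import random

# True relationship is monotone but nonlinear: E[Y] = 3 * sqrt(X)
rng = random.Random(3)
xs = [rng.uniform(1, 100) for _ in range(200)]
ys = [3 * x ** 0.5 + rng.gauss(0, 1) for x in xs]

# Pearson correlation and the usual t statistic for a straight-line fit
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
sxx = sum((x - mx) ** 2 for x in xs)
syy = sum((y - my) ** 2 for y in ys)
r = sxy / (sxx * syy) ** 0.5
t = r * ((n - 2) / (1 - r * r)) ** 0.5  # hugely "significant" despite the wrong form
```

Any other monotone g (log, square root, whatever) fitted to these data would be "significant" too, which is exactly the trap the quoted passage warns about.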

November 4, 2017 Posted by | Books, Statistics

A few diabetes papers of interest

i. Chronic Fatigue in Type 1 Diabetes: Highly Prevalent but Not Explained by Hyperglycemia or Glucose Variability.

“Fatigue is a classical symptom of hyperglycemia, but the relationship between chronic fatigue and diabetes has not been systematically studied. […] glucose control [in diabetics] is often suboptimal with persistent episodes of hyperglycemia that may result in sustained fatigue. Fatigue may also sustain in diabetic patients because it is associated with the presence of a chronic disease, as has been demonstrated in patients with rheumatoid arthritis and various neuromuscular disorders (2,3).

It is important to distinguish between acute and chronic fatigue, because chronic fatigue, defined as severe fatigue that persists for at least 6 months, leads to substantial impairments in patients’ daily functioning (4,5). In contrast, acute fatigue can largely vary during the day and generally does not cause functional impairments.

Literature provides limited evidence for higher levels of fatigue in diabetic patients (6,7), but its chronicity, impact, and determinants are unknown. In various chronic diseases, it has been proven useful to distinguish between precipitating and perpetuating factors of chronic fatigue (3,8). Illness-related factors trigger acute fatigue, while other factors, often cognitions and behaviors, cause fatigue to persist. Sleep disturbances, low self-efficacy concerning fatigue, reduced physical activity, and a strong focus on fatigue are examples of these fatigue-perpetuating factors (8–10). An episode of hyperglycemia or hypoglycemia could trigger acute fatigue for diabetic patients (11,12). However, variations in blood glucose levels might also contribute to chronic fatigue, because these variations continuously occur.

The current study had two aims. First, we investigated the prevalence and impact of chronic fatigue in a large sample of type 1 diabetic (T1DM) patients and compared the results to a group of age- and sex-matched population-based controls. Secondly, we searched for potential determinants of chronic fatigue in T1DM.”

“A significantly higher percentage of T1DM patients were chronically fatigued (40%; 95% CI 34–47%) than matched controls (7%; 95% CI 3–10%). Mean fatigue severity was also significantly higher in T1DM patients (31 ± 14) compared with matched controls (17 ± 9; P < 0.001). T1DM patients with a comorbidity_mr [a comorbidity affecting patients’ daily functioning, based on medical records – US] or clinically relevant depressive symptoms [based on scores on the Beck Depression Inventory for Primary Care – US] were significantly more often chronically fatigued than patients without a comorbidity_mr (55 vs. 36%; P = 0.014) or without clinically relevant depressive symptoms (88 vs. 31%; P < 0.001). Patients who reported neuropathy, nephropathy, or cardiovascular disease as complications of diabetes were more often chronically fatigued […] Chronically fatigued T1DM patients were significantly more impaired compared with nonchronically fatigued T1DM patients on all aspects of daily functioning […]. Fatigue was the most troublesome symptom of the 34 assessed diabetes-related symptoms. The five most troublesome symptoms were overall sense of fatigue, lack of energy, increasing fatigue in the course of the day, fatigue in the morning when getting up, and sleepiness or drowsiness”.

“This study establishes that chronic fatigue is highly prevalent and clinically relevant in T1DM patients. While current blood glucose level was only weakly associated with chronic fatigue, cognitive behavioral factors were by far the strongest potential determinants.”

“Another study found that type 2 diabetic, but not T1DM, patients had higher levels of fatigue compared with healthy controls (7). This apparent discrepancy may be explained by the relatively small sample size of this latter study, potential selection bias (patients were not randomly selected), and the use of a different fatigue questionnaire.”

“Not only was chronic fatigue highly prevalent, fatigue also had a large impact on T1DM patients. Chronically fatigued T1DM patients had more functional impairments than nonchronically fatigued patients, and T1DM patients considered fatigue as the most burdensome diabetes-related symptom.

Contrary to what was expected, there was at best a weak relationship between blood glucose level and chronic fatigue. Chronically fatigued T1DM patients spent slightly less time in hypoglycemia, but average glucose levels, glucose variability, hyperglycemia, or HbA1c were not related to chronic fatigue. In type 2 diabetes mellitus also, no relationship was found between fatigue and HbA1c (7).”

“Regarding demographic characteristics, current health status, diabetes-related factors, and fatigue-related cognitions and behaviors as potential determinants of chronic fatigue, we found that sleeping problems, physical activity, self-efficacy concerning fatigue, age, depression, and pain were significantly associated with chronic fatigue in T1DM. Although depression was strongly related, it could not completely explain the presence of chronic fatigue (38), as 31% was chronically fatigued without having clinically relevant depressive symptoms.”

Some comments may be worth adding here. It’s important to note to people who may not be aware of this that although chronic fatigue is a weird entity that’s hard to get a handle on (and, to be frank, is somewhat controversial), specific organic causes have been identified that greatly increase the risk. Many survivors of cancer experience chronic fatigue (see e.g. this paper, or wikipedia), and chronic fatigue is also not uncommon in a kidney failure setting (“The silence of renal disease creeps up on us (doctors and patients). Do not dismiss odd chronic symptoms such as fatigue or ‘not being quite with it’ without considering checking renal function” (Oxford Handbook of Clinical Medicine, 9th edition. My italics – US)). As noted above, a linkage with rheumatoid arthritis and some neuromuscular disorders has also been observed. The brief discussion of related topics in Houghton & Grey made it clear to me that some people with chronic fatigue are almost certainly suffering from an organic illness which has not been diagnosed or treated. Here’s a relevant quote from that book’s coverage: “it is unusual to find a definite organic cause for fatigue. However, consider anaemia, thyroid dysfunction, Addison’s disease and hypopituitarism.” It’s sort of neat, if you think about the potential diabetes-fatigue link investigated by the authors above, that these diseases are likely to be relevant: type 1 diabetics are more likely to develop them (anemia is not linked to diabetes, as far as I know, but the rest of them clearly are), because their development is driven by some of the same genetic mutations which cause type 1 diabetes. The combinations of some of these diseases even have fancy names of their own, like ‘Type I Polyglandular Autoimmune Syndrome’ and ‘Schmidt Syndrome’ (if you’re interested, here are a couple of medscape links). 
It’s noteworthy that although most of these diseases are uncommon in the general population, their incidence is likely to be greatly increased in type 1 diabetics due to the common genetic pathways at play (variants regulating T-cell function seem to be important, but there’s no need to go into these details here). Sperling et al. note in their book that: “Hypothyroid or hyperthyroid AITD [autoimmune thyroid disease] has been observed in 10–24% of patients with type 1 diabetes”. In one series including 151 patients with APS [/PAS]-2, when they looked at disease combinations they found that: “Of combinations of the component diseases, [type 1] diabetes with thyroid disease was the most common, occurring in 33%. The second, diabetes with adrenal insufficiency, made up 15%” (same source).

From estimates like these it seems likely that a not-insubstantial proportion of type 1 diabetics over time go on to develop other health problems that might, if unaddressed/undiagnosed, cause fatigue, and in my opinion this may potentially be a much more important cause than direct metabolic effects such as hyperglycemia, or chronic inflammation. If this is the case you’d however expect to see a substantial sex difference, as the autoimmune syndromes are in general much more likely to hit females than males. I’m not completely sure how to interpret a few of the results reported, but to me it doesn’t look like the sex differences in this study are anywhere near ‘large enough’ to support such an explanatory model, though. Another big problem is that fatigue seems to be more common in young patients, which is weird; most long-term complications display significant (positive) duration dependence, and when diabetes is a component of an autoimmune syndrome the diabetes tends to develop first, with the other diseases hitting later, usually in middle age. Duration and age are strongly correlated, and negative duration dependence in a diabetes complication setting is a surprising and unusual finding that badly needs to be explained; it may in my opinion be the sign of a poor disease model. It would make more sense for disease-related fatigue to present late rather than early, and I don’t really know what to make of that negative age gradient. ‘More studies needed’ (preferably by people familiar with those autoimmune syndromes..), etc…

ii. Risk for End-Stage Renal Disease Over 25 Years in the Population-Based WESDR Cohort.

“It is well known that diabetic nephropathy is the leading cause of end-stage renal disease (ESRD) in many regions, including the U.S. (1). Type 1 diabetes accounts for >45,000 cases of ESRD per year (2), and the incidence may be higher than in people with type 2 diabetes (3). Despite this, there are few population-based data available regarding the prevalence and incidence of ESRD in people with type 1 diabetes in the U.S. (4). A declining incidence of ESRD has been suggested by findings of lower incidence with increasing calendar year of diagnosis and in comparison with older reports in some studies in Europe and the U.S. (5–8). This is consistent with better diabetes management tools becoming available and increased renoprotective efforts, including the greater use of ACE inhibitors and angiotensin type II receptor blockers, over the past two to three decades (9). Conversely, no reduction in the incidence of ESRD across enrollment cohorts was found in a recent clinic-based study (9). Further, an increase in ESRD has been suggested for older but not younger people (9). Recent improvements in diabetes care have been suggested to delay rather than prevent the development of renal disease in people with type 1 diabetes (4).

A decrease in the prevalence of proliferative retinopathy by increasing calendar year of type 1 diabetes diagnosis was previously reported in the Wisconsin Epidemiologic Study of Diabetic Retinopathy (WESDR) cohort (10); therefore, we sought to determine if a similar pattern of decline in ESRD would be evident over 25 years of follow-up. Further, we investigated factors that may mediate a possible decline in ESRD as well as other factors associated with incident ESRD over time.”

“At baseline, 99% of WESDR cohort members were white and 51% were male. Individuals were 3–79 years of age (mean 29) with diabetes duration of 0–59 years (mean 15), diagnosed between 1922 and 1980. Four percent of individuals used three or more daily insulin injections and none used an insulin pump. Mean HbA1c was 10.1% (87 mmol/mol). Only 16% were using an antihypertensive medication, none was using an ACE inhibitor, and 3% reported a history of renal transplant or dialysis (ESRD). At 25 years, 514 individuals participated (52% of original cohort at baseline, n = 996) and 367 were deceased (37% of baseline). Mean HbA1c was much lower than at baseline (7.5%, 58 mmol/mol), the decline likely due to the improvements in diabetes care, with 80% of participants using intensive insulin management (three or more daily insulin injections or insulin pump). The decline in HbA1c was steady, becoming slightly steeper following the results of the DCCT (25). Overall, at the 25-year follow-up, 47% had proliferative retinopathy, 53% used aspirin daily, and 54% reported taking antihypertensive medications, with the majority (87%) using an ACE inhibitor. Thirteen percent reported a history of ESRD.”

“Prevalence of ESRD was negligible until 15 years of diabetes duration and then steadily increased with 5, 8, 10, 13, and 14% reporting ESRD by 15–19, 20–24, 25–29, 30–34, and 35+ years of diabetes duration, respectively. […] After 15 years of diagnosis, prevalence of ESRD increased with duration in people diagnosed from 1960 to 1980, with the lowest increase in people with the most recent diagnosis. People diagnosed from 1922 to 1959 had consistent rather than increasing levels of ESRD with duration of 20+ years. If not for their greater mortality (at the 25-year follow-up, 48% of the deceased had been diagnosed prior to 1960), an increase with duration may have also been observed.

From baseline, the unadjusted cumulative 25-year incidence of ESRD was 17.9% (95% CI 14.3–21.5) in males, 10.3% (7.4–13.2) in females, and 14.2% (11.9–16.5) overall. For those diagnosed in 1970–1980, the cumulative incidence at 14, 20, and 25 years of follow-up (or ∼15–25, 20–30, and 25–35 years diabetes duration) was 5.2, 7.9, and 9.3%, respectively. At 14, 20, and 25 years of follow-up (or 35, 40, and 45 up to 65+ years diabetes duration), the cumulative incidence in those diagnosed during 1922–1969 was 13.6, 16.3, and 18.8%, respectively, consistent with the greater prevalence observed for these diagnosis periods at longer duration of diabetes.”

“The unadjusted hazard of ESRD was reduced by 70% among those diagnosed in 1970–1980 as compared with those in 1922–1969 (HR 0.29 [95% CI 0.19–0.44]). Duration (by 10%) and HbA1c (by an additional 10%) partially mediated this association […] Blood pressure and antihypertensive medication use each further attenuated the association. When fully adjusted for these and [other risk factors included in the model], period of diagnosis was no longer significant (HR 0.89 [0.55–1.45]). Sensitivity analyses for the hazard of incident ESRD or death due to renal disease showed similar findings […] The most parsimonious model included diabetes duration, HbA1c, age, sex, systolic and diastolic blood pressure, and history of antihypertensive medication […]. A 32% increased risk for incident ESRD was found per increasing year of diabetes duration at 0–15 years (HR 1.32 per year [95% CI 1.16–1.51]). The hazard plateaued (1.01 per year [0.98–1.05]) after 15 years of duration of diabetes. Hazard of ESRD increased with increasing HbA1c (1.28 per 1% or 10.9 mmol/mol increase [1.14–1.45]) and blood pressure (1.51 per 10 mmHg increase in systolic pressure [1.35–1.68]; 1.12 per 5 mmHg increase in diastolic pressure [1.01–1.23]). Use of antihypertensive medications increased the hazard of incident ESRD nearly fivefold [this finding is almost certainly due to confounding by indication, as also noted by the authors later on in the paper – US], and males had approximately two times the risk as compared with females. […] Having proliferative retinopathy was strongly associated with increased risk (HR 5.91 [3.00–11.6]) and attenuated the association between sex and ESRD.”

“The current investigation […] sought to provide much-needed information on the prevalence and incidence of ESRD and associated risk specific to people with type 1 diabetes. Consistent with a few previous studies (5,7,8), we observed decreased prevalence and incidence of ESRD among individuals with type 1 diabetes diagnosed in the 1970s compared with prior to 1970. The Epidemiology of Diabetes Complications (EDC) Study, another large cohort of people with type 1 diabetes followed over a long period of time, reported cumulative incidence rates of 2–6% for those diagnosed after 1970 and with similar duration (7), comparable to our findings. Slightly higher cumulative incidence (7–13%) reported from older studies at slightly lower duration also supports a decrease in incidence of ESRD (28–30). Cumulative incidences through 30 years in European cohorts were even lower (3.3% in Sweden [6] and 7.8% in Finland [5]), compared with the 9.3% noted for those diagnosed during 1970–1980 in the WESDR cohort. The lower incidence could be associated with nationally organized care, especially in Sweden where a nationwide intensive diabetes management treatment program was implemented at least a decade earlier than recommendations for intensive care followed from the results of the DCCT in the U.S.”

“We noted an increased risk of incident ESRD in the first 15 years of diabetes not evident at longer durations. This pattern, also demonstrated by others, could be due to a greater earlier risk among people most genetically susceptible, as only a subset of individuals with type 1 diabetes will develop renal disease (27,28). The risk plateau associated with greater durations of diabetes and lower risk associated with increasing age may also reflect more death at longer durations and older ages. […] Because age and duration are highly correlated, we observed a positive association between age and ESRD only in univariate analyses, without adjustment for duration. The lack of adjustment for diabetes duration may have, in part, explained the increasing incidence of ESRD shown with age for some people in a recent investigation (9). Adjustment for both age and duration was found appropriate after testing for collinearity in the current analysis.”

“In conclusion, this U.S. population-based report showed a lower prevalence and incidence of ESRD among those more recently diagnosed, explained by improvements in glycemic and blood pressure control over the last several decades. Even lower rates may be expected for those diagnosed during the current era of diabetes care. Intensive diabetes management, especially for glycemic control, remains important even in long-standing diabetes as potentially delaying the development of ESRD.”

iii. Earlier Onset of Complications in Youth With Type 2 Diabetes.

“The prevalence of type 2 diabetes in youth is increasing worldwide, coinciding with the rising obesity epidemic (1,2). […] Diabetes is associated with both microvascular and macrovascular complications. The evolution of these complications has been well described in type 1 diabetes (6) and in adult type 2 diabetes (7), wherein significant complications typically manifest 15–20 years after diagnosis (8). Because type 2 diabetes is a relatively new disease in children (first described in the 1980s), long-term outcome data on complications are scant, and risk factors for the development of complications are incompletely understood. The available literature suggests that development of complications in youth with type 2 diabetes may be more rapid than in adults, thus afflicting individuals at the height of their individual and social productivity (9). […] A small but notable proportion of type 2 diabetes is associated with a polymorphism of hepatic nuclear factor (HNF)-1α, a transcription factor expressed in many tissues […] It is not yet known what effect the HNF-1α polymorphism has on the risk of complications associated with diabetes.”

“The main objective of the current study was to describe the time course and risk factors for microvascular complications (nephropathy, retinopathy, and neuropathy) and macrovascular complications (cardiac, cerebrovascular, and peripheral vascular diseases) in a large cohort of youth [diagnosed with type 2 diabetes] who have been carefully followed for >20 years and to compare this evolution with that of youth with type 1 diabetes. We also compared vascular complications in the youth with type 2 diabetes with nondiabetic control youth. Finally, we addressed the impact of HNF-1α G319S on the evolution of complications in young patients with type 2 diabetes.”

“All prevalent cases of type 2 diabetes and type 1 diabetes (control group 1) seen between January 1986 and March 2007 in the DER-CA for youth aged 1–18 years were included. […] The final type 2 diabetes cohort included 342 youth, and the type 1 diabetes control group included 1,011. The no diabetes control cohort comprised 1,710 youth matched to the type 2 diabetes cohort from the repository […] Compared with the youth with type 1 diabetes, the youth with type 2 diabetes were, on average, older at the time of diagnosis and more likely to be female. They were more likely to have a higher BMIz, live in a rural area, have a low SES, and have albuminuria at diagnosis. […] one-half of the type 2 diabetes group was either a heterozygote (GS) or a homozygote (SS) for the HNF-1α polymorphism […] At the time of the last available follow-up in the DER-CA, the youth with diabetes were, on average, between 15 and 16 years of age. […] The median follow-up times in the repository were 4.4 (range 0–27.4) years for youth with type 2 diabetes, 6.7 (0–28.2) years for youth with type 1 diabetes, and 6.0 (0–29.9) years for nondiabetic control youth.”

“After controlling for low SES, sex, and BMIz, the risk associated with type 2 versus type 1 diabetes of any complication was an HR of 1.47 (1.02–2.12, P = 0.04). […] In the univariate analysis, youth with type 2 diabetes were at significantly higher risk of developing any vascular (HR 6.15 [4.26–8.87], P < 0.0001), microvascular (6.26 [4.32–9.10], P < 0.0001), or macrovascular (4.44 [1.71–11.52], P < 0.0001) disease compared with control youth without diabetes. In addition, the youth with type 2 diabetes had an increased risk of ophthalmologic (19.49 [9.75–39.00], P < 0.0001), renal (16.13 [7.66–33.99], P < 0.0001), and neurologic (2.93 [1.79–4.80], P ≤ 0.001) disease. There were few cardiovascular, cerebrovascular, and peripheral vascular disease events in all groups (five or fewer events per group). Despite this, there was still a statistically significant higher risk of peripheral vascular disease in the type 2 diabetes group (6.25 [1.68–23.28], P = 0.006).”

“Differences in renal and neurologic complications between the two diabetes groups began to occur before 5 years postdiagnosis, whereas differences in ophthalmologic complications began 10 years postdiagnosis. […] Both cardiovascular and cerebrovascular complications were rare in both groups, but peripheral vascular complications began to occur 15 years after diagnosis in the type 2 diabetes group […] The presence of HNF-1α G319S polymorphism in youth with type 2 diabetes was found to be protective of complications. […] Overall, major complications were rare in the type 1 diabetes group, but they occurred in 1.1% of the type 2 diabetes cohort at 10 years, in 26.0% at 15 years, and in 47.9% at 20 years after diagnosis (P < 0.001) […] youth with type 2 diabetes have a higher risk of any complication than youth with type 1 diabetes and nondiabetic control youth. […] The time to both renal and neurologic complications was significantly shorter in youth with type 2 diabetes than in control youth, whereas differences were not significant with respect to ophthalmologic and cardiovascular complications between cohorts. […] The current study is consistent with the literature, which has shown high rates of cardiovascular risk factors in youth with type 2 diabetes. However, despite the high prevalence of risk, this study reports low rates of clinical events. Because the median follow-up time was between 5 and 8 years, it is possible that a longer follow-up period would be required to correctly evaluate macrovascular outcomes in young adults. Also possible is that diagnoses of mild disease are not being made because of a low index of suspicion in 20- and 30-year-old patients.”

“In conclusion, youth with type 2 diabetes have an increased risk of complications early in the course of their disease. Microvascular complications and cardiovascular risk factors are highly prevalent, whereas macrovascular complications are rare in young adulthood. HbA1c is an important modifiable risk factor; thus, optimizing glycemic control should remain an important goal of therapy.”

iv. HbA1c and Coronary Heart Disease Risk Among Diabetic Patients.

“We prospectively investigated the association of HbA1c at baseline and during follow-up with CHD risk among 17,510 African American and 12,592 white patients with type 2 diabetes. […] During a mean follow-up of 6.0 years, 7,258 incident CHD cases were identified. The multivariable-adjusted hazard ratios of CHD associated with different levels of HbA1c at baseline (<6.0 [reference group], 6.0–6.9, 7.0–7.9, 8.0–8.9, 9.0–9.9, 10.0–10.9, and ≥11.0%) were 1.00, 1.07 (95% CI 0.97–1.18), 1.16 (1.04–1.31), 1.15 (1.01–1.32), 1.26 (1.09–1.45), 1.27 (1.09–1.48), and 1.24 (1.10–1.40) (P trend = 0.002) for African Americans and 1.00, 1.04 (0.94–1.14), 1.15 (1.03–1.28), 1.29 (1.13–1.46), 1.41 (1.22–1.62), 1.34 (1.14–1.57), and 1.44 (1.26–1.65) (P trend <0.001) for white patients, respectively. The graded association of HbA1c during follow-up with CHD risk was observed among both African American and white diabetic patients (all P trend <0.001). Each one percentage increase of HbA1c was associated with a greater increase in CHD risk in white versus African American diabetic patients. When stratified by sex, age, smoking status, use of glucose-lowering agents, and income, this graded association of HbA1c with CHD was still present. […] The current study in a low-income population suggests a graded positive association between HbA1c at baseline and during follow-up with the risk of CHD among both African American and white diabetic patients with low socioeconomic status.”

A few more observations from the conclusions:

“Diabetic patients experience high mortality from cardiovascular causes (2). Observational studies have confirmed the continuous and positive association between glycemic control and the risk of cardiovascular disease among diabetic patients (4,5). But the findings from RCTs are sometimes uncertain. Three large RCTs (7–9) designed primarily to determine whether targeting different glucose levels can reduce the risk of cardiovascular events in patients with type 2 diabetes failed to confirm the benefit. Several reasons for the inconsistency of these studies can be considered. First, small sample sizes, short follow-up duration, and few CHD cases in some RCTs may limit the statistical power. Second, most epidemiological studies only assess a single baseline measurement of HbA1c with CHD risk, which may produce potential bias. The recent analysis of 10 years of posttrial follow-up of the UKPDS showed continued reductions for myocardial infarction and death from all causes despite an early loss of glycemic differences (10). The scientific evidence from RCTs was not sufficient to generate strong recommendations for clinical practice. Thus, consensus groups (AHA, ACC, and ADA) have provided a conservative endorsement (class IIb recommendation, level of evidence A) for the cardiovascular benefits of glycemic control (11). In the absence of conclusive evidence from RCTs, observational epidemiological studies might provide useful information to clarify the relationship between glycemia and CHD risk. In the current study with 30,102 participants with diabetes and 7,258 incident CHD cases during a mean follow-up of 6.0 years, we found a graded positive association by various HbA1c intervals of clinical relevance or by using HbA1c as a continuous variable at baseline and during follow-up with CHD risk among both African American and white diabetic patients. 
Each one percentage increase in baseline and follow-up HbA1c was associated with a 2 and 5% increased risk of CHD in African American and 6 and 11% in white diabetic patients. Each one percentage increase of HbA1c was associated with a greater increase in CHD risk in white versus African American diabetic patients.”

v. Blood Viscosity in Subjects With Normoglycemia and Prediabetes.

“Blood viscosity (BV) is the force that counteracts the free sliding of the blood layers within the circulation and depends on the internal cohesion between the molecules and the cells. Abnormally high BV can have several negative effects: the heart is overloaded to pump blood in the vascular bed, and the blood itself, more viscous, can damage the vessel wall. Furthermore, according to Poiseuille’s law (1), BV is inversely related to flow and might therefore reduce the delivery of insulin and glucose to peripheral tissues, leading to insulin resistance or diabetes (2–5).

It is generally accepted that BV is increased in diabetic patients (6–8). Although the reasons for this alteration are still under investigation, it is believed that the increase in osmolarity causes increased capillary permeability and, consequently, increased hematocrit and viscosity (9). It has also been suggested that the osmotic diuresis, consequence of hyperglycemia, could contribute to reduce plasma volume and increase hematocrit (10).

Cross-sectional studies have also supported a link between BV, hematocrit, and insulin resistance (11–17). Recently, a large prospective study has demonstrated that BV and hematocrit are risk factors for type 2 diabetes. Subjects in the highest quartile of BV were >60% more likely to develop diabetes than their counterparts in the lowest quartile (18). This finding confirms previous observations obtained in smaller or selected populations, in which the association between hemoglobin or hematocrit and occurrence of type 2 diabetes was investigated (19–22).

These observations suggest that the elevation in BV may be very early, well before the onset of diabetes, but definite data in subjects with normal glucose or prediabetes are missing. In the current study, we evaluated the relationship between BV and blood glucose in subjects with normal glucose or prediabetes in order to verify whether alterations in viscosity are appreciable in these subjects and at which blood glucose concentration they appear.”

“According to blood glucose levels, participants were divided into three groups: group A, blood glucose <90 mg/dL; group B, blood glucose between 90 and 99 mg/dL; and group C, blood glucose between 100 and 125 mg/dL. […] Hematocrit (P < 0.05) and BV (P between 0.01 and 0.001) were significantly higher in subjects with prediabetes and in those with blood glucose ranging from 90 to 99 mg/dL compared with subjects with blood glucose <90 mg/dL. […] The current study shows, for the first time, a direct relationship between BV and blood glucose in nondiabetic subjects. It also suggests that, even within glucose values considered completely normal, individuals with higher blood glucose levels have increases in BV comparable with those observed in subjects with prediabetes. […] Overall, changes in viscosity in diabetic patients are accepted as common and as a result of the disease. However, the relationship between blood glucose, diabetes, and viscosity may be much more complex. […] the main finding of the study is that BV significantly increases already at high-normal blood glucose levels, independently of other common determinants of hemorheology. Intervention studies are needed to verify whether changes in BV can influence the development of type 2 diabetes.”

vi. Higher Relative Risk for Multiple Sclerosis in a Pediatric and Adolescent Diabetic Population: Analysis From DPV Database.

“Type 1 diabetes and multiple sclerosis (MS) are organ-specific inflammatory diseases, which result from an autoimmune attack against either pancreatic β-cells or the central nervous system; a combined appearance has been described repeatedly (1–3). For children and adolescents below the age of 21 years, the prevalence of type 1 diabetes in Germany and Austria is ∼19.4 cases per 100,000 population, and for MS it is 7–10 per 100,000 population (4–6). A Danish cohort study revealed a three times higher risk for the development of MS in patients with type 1 diabetes (7). Further, an Italian study conducted in Sardinia showed a five times higher risk for the development of type 1 diabetes in MS patients (8,9). An American study on female adults in whom diabetes developed before the age of 21 years yielded an up to 20 times higher risk for the development of MS (10).

These findings support the hypothesis of clustering between type 1 diabetes and MS. The pathogenesis behind this association is still unclear, but T-cell cross-reactivity was discussed as well as shared disease associations due to the HLA-DRB1-DQB1 gene loci […] The aim of this study was to evaluate the prevalence of MS in a diabetic population and to look for possible factors related to the co-occurrence of MS in children and adolescents with type 1 diabetes using a large multicenter survey from the Diabetes Patienten Verlaufsdokumentation (DPV) database.”

“We used a large database of pediatric and adolescent type 1 diabetic patients to analyze the RR of MS co-occurrence. The DPV database includes ∼98% of the pediatric diabetic population in Germany and Austria below the age of 21 years. In children and adolescents, the RR for MS in type 1 diabetes was estimated to be three to almost five times higher in comparison with the healthy population.”

November 2, 2017 Posted by | Cardiology, Diabetes, Epidemiology, Genetics, Immunology, Medicine, Nephrology, Statistics, Studies | Leave a comment

Common Errors in Statistics…

“Pressed by management or the need for funding, too many research workers have no choice but to go forward with data analysis despite having insufficient statistical training. Alas, though a semester or two of undergraduate statistics may develop familiarity with the names of some statistical methods, it is not enough to be aware of all the circumstances under which these methods may be applicable.

The purpose of the present text is to provide a mathematically rigorous but readily understandable foundation for statistical procedures. Here are such basic concepts in statistics as null and alternative hypotheses, p-value, significance level, and power. Assisted by reprints from the statistical literature, we reexamine sample selection, linear regression, the analysis of variance, maximum likelihood, Bayes’ Theorem, meta-analysis and the bootstrap. New to this edition are sections on fraud and on the potential sources of error to be found in epidemiological and case-control studies.

Examples of good and bad statistical methodology are drawn from agronomy, astronomy, bacteriology, chemistry, criminology, data mining, epidemiology, hydrology, immunology, law, medical devices, medicine, neurology, observational studies, oncology, pricing, quality control, seismology, sociology, time series, and toxicology. […] Lest the statisticians among you believe this book is too introductory, we point out the existence of hundreds of citations in statistical literature calling for the comprehensive treatment we have provided. Regardless of past training or current specialization, this book will serve as a useful reference; you will find applications for the information contained herein whether you are a practicing statistician or a well-trained scientist who just happens to apply statistics in the pursuit of other science.”

I’ve been reading this book. I really like it so far; it’s a nice book. A lot of the stuff included is review, but there are of course also some new ideas here and there (for example I’d never heard about Stein’s paradox before), and given how much stuff you need to remember and keep in mind in order not to make silly mistakes when analyzing data or interpreting the results of statistical analyses, occasional reviews of these things are probably a very good idea.

I have added some more observations from the first 100 pages or so below:

“Test only relevant null hypotheses. The null hypothesis has taken on an almost mythic role in contemporary statistics. Obsession with the null (more accurately spelled and pronounced nil) has been allowed to shape the direction of our research. […] Virtually any quantifiable hypothesis can be converted into null form. There is no excuse and no need to be content with a meaningless nil. […] we need to have an alternative hypothesis or alternatives firmly in mind when we set up a test. Too often in published research, such alternative hypotheses remain unspecified or, worse, are specified only after the data are in hand. We must specify our alternatives before we commence an analysis, preferably at the same time we design our study. Are our alternatives one-sided or two-sided? If we are comparing several populations at the same time, are their means ordered or unordered? The form of the alternative will determine the statistical procedures we use and the significance levels we obtain. […] The critical values and significance levels are quite different for one-tailed and two-tailed tests and, all too often, the wrong test has been employed in published work. McKinney et al. [1989] reviewed some 70-plus articles that appeared in six medical journals. In over half of these articles, Fisher’s exact test was applied improperly. Either a one-tailed test had been used when a two-tailed test was called for or the authors of the paper simply had not bothered to state which test they had used. […] the F-ratio and the chi-square are what are termed omnibus tests, designed to be sensitive to all possible alternatives. As such, they are not particularly sensitive to ordered alternatives such as “more fertilizer equals more growth” or “more aspirin equals faster relief of headache.” Tests for such ordered responses at k distinct treatment levels should properly use the Pitman correlation”.
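The Pitman correlation test for ordered alternatives mentioned above can be carried out as a simple permutation test: the statistic is the sum of dose × response, and its null distribution is obtained by shuffling the responses across dose levels. A minimal Python sketch (the dose levels and responses are made-up illustration data):

```python
import random

def pitman_permutation_test(doses, responses, n_perm=10_000, seed=0):
    """One-sided permutation test against an ordered (monotone) alternative.

    The statistic is the Pitman correlation, sum(dose * response); under
    the null the responses are exchangeable across dose levels, so the
    observed statistic is referred to its permutation distribution.
    """
    rng = random.Random(seed)
    observed = sum(d * r for d, r in zip(doses, responses))
    perm = list(responses)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(perm)
        if sum(d * r for d, r in zip(doses, perm)) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)   # add-one-corrected p-value

# Made-up data: three ordered fertilizer levels, growth rising with dose.
doses     = [1, 1, 1, 2, 2, 2, 3, 3, 3]
responses = [4.1, 3.8, 4.4, 5.0, 5.3, 4.9, 6.2, 6.0, 6.5]
p = pitman_permutation_test(doses, responses)
```

Unlike an omnibus F-test, this statistic pours all its power into the monotone direction, which is exactly what "more fertilizer equals more growth" calls for.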

“Before we initiate data collection, we must have a firm idea of what we will measure and how we will measure it. A good response variable

  • Is easy to record […]
  • Can be measured objectively on a generally accepted scale.
  • Is measured in appropriate units.
  • Takes values over a sufficiently large range that discriminates well.
  • Is well defined. […]
  • Has constant variance over the range used in the experiment (Bishop and Talbot, 2001).”

“A second fundamental principle is also applicable to both experiments and surveys: Collect exact values whenever possible. Worry about grouping them in intervals or discrete categories later.”

“Sample size must be determined for each experiment; there is no universally correct value. We need to understand and make use of the relationships among effect size, sample size, significance level, power, and the precision of our measuring instruments. Increase the precision (and hold all other parameters fixed) and we can decrease the required number of observations. Decreases in any or all of the intrinsic and extrinsic sources of variation will also result in a decrease in the required number. […] The smallest effect size of practical interest may be determined through consultation with one or more domain experts. The smaller this value, the greater the number of observations that will be required. […] Strictly speaking, the significance level and power should be chosen so as to minimize the overall cost of any project, balancing the cost of sampling with the costs expected from Type I and Type II errors. […] When determining sample size for data drawn from the binomial or any other discrete distribution, one should always display the power curve. […] As a result of inspecting the power curve by eye, you may come up with a less-expensive solution than your software. […] If the data do not come from a well-tabulated distribution, then one might use a bootstrap to estimate the power and significance level. […] Many researchers today rely on menu-driven software to do their power and sample-size calculations. Most such software comes with default settings […] — settings that are readily altered, if, that is, investigators bother to take the time.”
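To illustrate the advice about displaying the power curve for discrete data, here is a small self-contained Python sketch for a one-sided exact binomial test (the sample size and hypothesized proportions are arbitrary illustration values):

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial probability mass function."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def critical_value(n, p0, alpha):
    """Smallest k with P(X >= k | p0) <= alpha (one-sided upper-tail test)."""
    tail = 1.0                      # P(X >= 0) = 1
    for k in range(n + 1):
        if tail <= alpha:
            return k
        tail -= binom_pmf(k, n, p0)
    return n + 1

def power(n, p0, p1, alpha=0.05):
    """Probability of rejecting H0: p = p0 when the true value is p1."""
    k_crit = critical_value(n, p0, alpha)
    return sum(binom_pmf(k, n, p1) for k in range(k_crit, n + 1))

# Display the power curve for n = 50, H0: p = 0.5, before committing to n.
for p1 in (0.5, 0.6, 0.7, 0.8):
    print(f"p1 = {p1:.1f}  power = {power(50, 0.5, p1):.3f}")
```

Scanning such a table by eye across candidate values of n is exactly the kind of inspection that can turn up a cheaper design than a software default would suggest.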

“The relative ease with which a program like Stata […] can produce a sample size may blind us to the fact that the number of subjects with which we begin a study may bear little or no relation to the number with which we conclude it. […] Potential subjects can and do refuse to participate. […] Worse, they may agree to participate initially, then drop out at the last minute […]. They may move without a forwarding address before a scheduled follow-up, or may simply not bother to show up for an appointment. […] The key to a successful research program is to plan for such drop-outs in advance and to start the trials with some multiple of the number required to achieve a given power and significance level. […] it is the sample you end with, not the sample you begin with, that determines the power of your tests. […] An analysis of those who did not respond to a survey or a treatment can sometimes be as or more informative than the survey itself. […] Be sure to incorporate in your sample design and in your budget provisions for sampling nonresponders.”

“[A] randomly selected sample may not be representative of the population as a whole. For example, if a minority comprises less than 10% of a population, then a jury of 12 persons selected at random from that population will fail to contain a single member of that minority at least 28% of the time.”
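The jury figure is easy to verify: if each juror is an independent draw from a population in which the minority share is q, the probability that none of 12 jurors belongs to the minority is (1 − q)^12, which at q = 0.10 is about 0.282, and larger still for any smaller q, hence "at least 28%":

```python
def prob_no_minority(minority_share, jurors=12):
    """Probability that a randomly drawn jury contains no minority member,
    assuming each juror is an independent draw from the population."""
    return (1 - minority_share) ** jurors

p = prob_no_minority(0.10)   # minority is exactly 10% of the population
print(f"{p:.3f}")            # about 0.282, i.e. at least 28% of the time
```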

“The proper starting point for the selection of the best method of estimation is with the objectives of our study: What is the purpose of our estimate? If our estimate is θ* and the actual value of the unknown parameter is θ, what losses will we be subject to? It is difficult to understand the popularity of the method of maximum likelihood and other estimation procedures that do not take these losses into consideration. The majority of losses will be monotonically nondecreasing in nature, that is, the further apart the estimate θ* and the true value θ, the larger our losses are likely to be. Typical forms of the loss function are the absolute deviation |θ* − θ|, the square deviation (θ* − θ)², and the jump, that is, no loss if |θ* − θ| < i, and a big loss otherwise. Or the loss function may resemble the square deviation but take the form of a step function increasing in discrete increments. Desirable estimators are impartial, consistent, efficient, robust, and minimum loss. […] Interval estimates are to be preferred to point estimates; they are less open to challenge for they convey information about the estimate’s precision.”

“Estimators should be consistent, that is, the larger the sample, the greater the probability the resultant estimate will be close to the true population value. […] [A] consistent estimator […] is to be preferred to another if the first consistent estimator can provide the same degree of accuracy with fewer observations. To simplify comparisons, most statisticians focus on the asymptotic relative efficiency (ARE), defined as the limit with increasing sample size of the ratio of the number of observations required for each of two consistent statistical procedures to achieve the same degree of accuracy. […] Estimators that are perfectly satisfactory for use with symmetric, normally distributed populations may not be as desirable when the data come from nonsymmetric or heavy-tailed populations, or when there is a substantial risk of contamination with extreme values. When estimating measures of central location, one way to create a more robust estimator is to trim the sample of its minimum and maximum values […]. As information is thrown away, trimmed estimators are [however] less efficient. […] Many semiparametric estimators are not only robust but provide for high ARE with respect to their parametric counterparts. […] The accuracy of an estimate […] and the associated losses will vary from sample to sample. A minimum loss estimator is one that minimizes the losses when the losses are averaged over the set of all possible samples. Thus, its form depends upon all of the following: the loss function, the population from which the sample is drawn, and the population characteristic that is being estimated. An estimate that is optimal in one situation may only exacerbate losses in another. […] It is easy to envision situations in which we are less concerned with the average loss than with the maximum possible loss we may incur by using a particular estimation procedure. An estimate that minimizes the maximum possible loss is termed a mini–max estimator.”
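The trade-off described above — robustness bought at some cost in efficiency — is easy to demonstrate. Here is a minimal Python sketch (with made-up measurements) showing how a single gross error ruins the sample mean while barely moving a trimmed mean:

```python
def trimmed_mean(xs, trim=1):
    """Mean after discarding the `trim` smallest and largest values."""
    core = sorted(xs)[trim:len(xs) - trim]
    return sum(core) / len(core)

clean = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]   # made-up data
contaminated = clean + [1000.0]                          # one gross error

plain_mean = sum(contaminated) / len(contaminated)       # dragged far from 10
robust_mean = trimmed_mean(contaminated)                 # stays near 10
```

The price, as the quote notes, is that the trimmed estimator throws information away and so needs somewhat more observations to achieve the same precision on well-behaved data.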

“In survival studies and reliability analyses, we follow each subject and/or experiment unit until either some event occurs or the experiment is terminated; the latter observation is referred to as censored. The principal sources of error are the following:

  • Lack of independence within a sample
  • Lack of independence of censoring
  • Too many censored values
  • Wrong test employed”

“Lack of independence within a sample is often caused by the existence of an implicit factor in the data. For example, if we are measuring survival times for cancer patients, diet may be correlated with survival times. If we do not collect data on the implicit factor(s) (diet in this case), and the implicit factor has an effect on survival times, then we no longer have a sample from a single population. Rather, we have a sample that is a mixture drawn from several populations, one for each level of the implicit factor, each with a different survival distribution. Implicit factors can also affect censoring times, by affecting the probability that a subject will be withdrawn from the study or lost to follow-up. […] Stratification can be used to control for an implicit factor. […] This is similar to using blocking in analysis of variance. […] If the pattern of censoring is not independent of the survival times, then survival estimates may be too high (if subjects who are more ill tend to be withdrawn from the study), or too low (if subjects who will survive longer tend to drop out of the study and are lost to follow-up). If a loss or withdrawal of one subject could increase the probability of loss or withdrawal of other subjects, this would also lead to lack of independence between censoring and the subjects. […] A study may end up with many censored values as a result of having large numbers of subjects withdrawn or lost to follow-up, or from having the study end while many subjects are still alive. Large numbers of censored values decrease the equivalent number of subjects exposed (at risk) at later times, reducing the effective sample sizes. […] Survival tests perform better when the censoring is not too heavy, and, in particular, when the pattern of censoring is similar across the different groups.”

“Kaplan–Meier survival analysis (KMSA) is the appropriate starting point [in the type 2 censoring setting]. KMSA can estimate survival functions even in the presence of censored cases and requires minimal assumptions. If covariates other than time are thought to be important in determining duration to outcome, results reported by KMSA will represent misleading averages, obscuring important differences in groups formed by the covariates (e.g., men vs. women). Since this is often the case, methods that incorporate covariates, such as event-history models and Cox regression, may be preferred. For small samples, the permutation distributions of the Gehan–Breslow, Mantel–Cox, and Tarone–Ware survival test statistics and not the chi-square distribution should be used to compute p-values. If the hazard or survival functions are not parallel, then none of the three tests […] will be particularly good at detecting differences between the survival functions.”
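As a concrete illustration of what KMSA computes, here is a minimal, hand-rolled Kaplan–Meier estimator in Python; in practice one would use a vetted survival-analysis package, and the data below are made up:

```python
def kaplan_meier(times, events):
    """Minimal Kaplan-Meier estimator.

    times  : observed follow-up times (event or censoring)
    events : 1 if the event occurred at that time, 0 if censored
    Returns [(event_time, survival_probability), ...].
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        removed = sum(1 for tt, e in data if tt == t)
        if deaths:
            surv *= 1 - deaths / n_at_risk   # product-limit update
            curve.append((t, surv))
        n_at_risk -= removed                 # censored subjects leave the risk set
        i += removed
    return curve

# Six made-up subjects; 0 marks a censored observation.
curve = kaplan_meier(times=[3, 5, 5, 8, 10, 12], events=[1, 1, 0, 1, 0, 1])
```

Note how the censored observations at times 5 and 10 shrink the risk set without producing a step in the curve — which is also why heavy censoring erodes the effective sample size at later times, as described above.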

November 1, 2017 Posted by | Books, Statistics | Leave a comment

A few diabetes papers of interest

i. The Pharmacogenetics of Type 2 Diabetes: A Systematic Review.

“We performed a systematic review to identify which genetic variants predict response to diabetes medications.

RESEARCH DESIGN AND METHODS We performed a search of electronic databases (PubMed, EMBASE, and Cochrane Database) and a manual search to identify original, longitudinal studies of the effect of diabetes medications on incident diabetes, HbA1c, fasting glucose, and postprandial glucose in prediabetes or type 2 diabetes by genetic variation.

RESULTS Of 7,279 citations, we included 34 articles (N = 10,407) evaluating metformin (n = 14), sulfonylureas (n = 4), repaglinide (n = 8), pioglitazone (n = 3), rosiglitazone (n = 4), and acarbose (n = 4). […] Significant medication–gene interactions for glycemic outcomes included 1) metformin and the SLC22A1, SLC22A2, SLC47A1, PRKAB2, PRKAA2, PRKAA1, and STK11 loci; 2) sulfonylureas and the CYP2C9 and TCF7L2 loci; 3) repaglinide and the KCNJ11, SLC30A8, NEUROD1/BETA2, UCP2, and PAX4 loci; 4) pioglitazone and the PPARG2 and PTPRD loci; 5) rosiglitazone and the KCNQ1 and RBP4 loci; and 6) acarbose and the PPARA, HNF4A, LIPC, and PPARGC1A loci. Data were insufficient for meta-analysis.

CONCLUSIONS We found evidence of pharmacogenetic interactions for metformin, sulfonylureas, repaglinide, thiazolidinediones, and acarbose consistent with their pharmacokinetics and pharmacodynamics.”

“In this systematic review, we identified 34 articles on the pharmacogenetics of diabetes medications, with several reporting statistically significant interactions between genetic variants and medications for glycemic outcomes. Most pharmacogenetic interactions were only evaluated in a single study, did not use a control group, and/or did not report enough information to judge internal validity. However, our results do suggest specific, biologically plausible, gene–medication interactions, and we recommend confirmation of the biologically plausible interactions as a priority, including those for drug transporters, metabolizers, and targets of action. […] Given the number of comparisons reported in the included studies and the lack of accounting for multiple comparisons in approximately 53% of studies, many of the reported findings may [however] be false positives.”
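The multiple-comparisons point is worth making concrete. A standard remedy is a family-wise correction such as the Holm step-down procedure; a short Python sketch with made-up p-values shows how few "significant" findings survive it:

```python
def holm_bonferroni(pvalues, alpha=0.05):
    """Holm step-down procedure: returns a reject/keep flag per hypothesis
    while controlling the family-wise error rate at `alpha`."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if pvalues[i] <= alpha / (m - rank):   # thresholds step up: a/m, a/(m-1), ...
            reject[i] = True
        else:
            break                              # all larger p-values are kept too
    return reject

# Ten made-up gene-medication interaction p-values.
pvals = [0.001, 0.04, 0.03, 0.2, 0.5, 0.04, 0.8, 0.12, 0.6, 0.3]
naive_hits = sum(1 for p in pvals if p <= 0.05)   # "significant" at face value
corrected = holm_bonferroni(pvals)                # far fewer survive
```

With these illustrative numbers, four comparisons clear an uncorrected 0.05 threshold but only one survives the correction — which is roughly the worry the review authors raise about the 53% of studies that did not adjust.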

ii. Insights Offered by Economic Analyses.

“This issue of Diabetes Care includes three economic analyses. The first describes the incremental costs of diabetes over a lifetime and highlights how interventions to prevent diabetes may reduce lifetime costs (1). The second demonstrates that although an expensive, intensive lifestyle intervention for type 2 diabetes does not reduce adverse cardiovascular outcomes over 10 years, it significantly reduces the costs of non-intervention−related medical care (2). The third demonstrates that although the use of the International Association of the Diabetes and Pregnancy Study Groups (IADPSG) criteria for the screening and diagnosis of gestational diabetes mellitus (GDM) results in a threefold increase in the number of people labeled as having GDM, it reduces the risk of maternal and neonatal adverse health outcomes and reduces costs (3). The first report highlights the enormous potential value of intervening in adults at high risk for type 2 diabetes to prevent its development. The second illustrates the importance of measuring economic outcomes in addition to standard clinical outcomes to fully assess the value of new treatments. The third demonstrates the importance of rigorously weighing the costs of screening and treatment against the costs of health outcomes when evaluating new approaches to care.”

“The costs of diabetes monitoring and treatment accrue as a function of the duration of diabetes, so adults who are younger at diagnosis are more likely to survive to develop the late, expensive complications of diabetes, thus they incur higher lifetime costs attributable to diabetes. Zhuo et al. report that people with diabetes diagnosed at age 40 spend approximately $125,000 more for medical care over their lifetimes than people without diabetes. For people diagnosed with diabetes at age 50, the discounted lifetime excess medical spending is approximately $91,000; for those diagnosed at age 60, it is approximately $54,000; and for those diagnosed at age 65, it is approximately $36,000 (1).

These results are very consistent with results reported by the Diabetes Prevention Program (DPP) Research Group, which assessed the cost-effectiveness of diabetes prevention. […] In the simulated lifetime economic analysis [included in that study] the lifestyle intervention was more cost-effective in younger participants than in older participants (5). By delaying the onset of type 2 diabetes, the lifestyle intervention delayed or prevented the need for diabetes monitoring and treatment, surveillance of diabetic microvascular and neuropathic complications, and treatment of the late, expensive complications and comorbidities of diabetes, including end-stage renal disease and cardiovascular disease (5). Although this finding was controversial at the end of the randomized, controlled clinical trial, all but 1 of 12 economic analyses published by 10 research groups in nine countries have demonstrated that lifestyle intervention for the prevention of type 2 diabetes is very cost-effective, if not cost-saving, compared with a placebo intervention (6).

Empiric, within-trial economic analyses of the DPP have now demonstrated that the incremental costs of the lifestyle intervention are almost entirely offset by reductions in the costs of medical care outside the study, especially the cost of self-monitoring supplies, prescription medications, and outpatient and inpatient care (7). Over 10 years, the DPP intensive lifestyle intervention cost only ∼$13,000 per quality-adjusted life-year gained when the analysis used an intent-to-treat approach (7) and was even more cost-effective when the analysis assessed outcomes and costs among adherent participants (8).”

“The American Diabetes Association has reported that although institutional care (hospital, nursing home, and hospice care) still accounts for 52% of annual per capita health care expenditures for people with diabetes, outpatient medications and supplies now account for 30% of expenditures (9). Between 2007 and 2012, annual per capita expenditures for inpatient care increased by 2%, while expenditures for medications and supplies increased by 51% (9). As the costs of diabetes medications and supplies continue to increase, it will be even more important to consider cost savings arising from the less frequent use of medications when evaluating the benefits of nonpharmacologic interventions.”

iii. The Lifetime Cost of Diabetes and Its Implications for Diabetes Prevention. (This is the Zhuo et al. paper mentioned above.)

“We aggregated annual medical expenditures from the age of diabetes diagnosis to death to determine lifetime medical expenditure. Annual medical expenditures were estimated by sex, age at diagnosis, and diabetes duration using data from 2006–2009 Medical Expenditure Panel Surveys, which were linked to data from 2005–2008 National Health Interview Surveys. We combined survival data from published studies with the estimated annual expenditures to calculate lifetime spending. We then compared lifetime spending for people with diabetes with that for those without diabetes. Future spending was discounted at 3% annually. […] The discounted excess lifetime medical spending for people with diabetes was $124,600 ($211,400 if not discounted), $91,200 ($135,600), $53,800 ($70,200), and $35,900 ($43,900) when diagnosed with diabetes at ages 40, 50, 60, and 65 years, respectively. Younger age at diagnosis and female sex were associated with higher levels of lifetime excess medical spending attributed to diabetes.

CONCLUSIONS Having diabetes is associated with substantially higher lifetime medical expenditures despite being associated with reduced life expectancy. If prevention costs can be kept sufficiently low, diabetes prevention may lead to a reduction in long-term medical costs.”
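The gap between the discounted and undiscounted figures ($124,600 vs. $211,400 at age 40) comes purely from the 3% annual discount rate. Here is a minimal Python sketch of that calculation; the flat $3,700/year stream is a deliberate simplification of the paper's survival-weighted, age-varying spending profile, so the totals are illustrative only:

```python
def present_value(annual_costs, rate=0.03):
    """Discounted sum of a stream of annual costs; year 0 is undiscounted."""
    return sum(c / (1 + rate) ** t for t, c in enumerate(annual_costs))

# Hypothetical: $3,700/year of excess spending over 34 years of disease
# duration (roughly the figures the paper reports for diagnosis at age 40).
stream = [3700.0] * 34
undiscounted = sum(stream)            # simple total over the 34 years
discounted = present_value(stream)    # markedly smaller at 3% per year
```

Even this crude version reproduces the qualitative point: costs incurred decades in the future count for much less, which is why younger age at diagnosis raises discounted lifetime spending less than proportionally.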

The selection criteria employed in this paper are not perfect; they excluded all individuals below the age of 30 “because they likely had type 1 diabetes”, which although true is only ‘mostly true’. Some of those individuals had(/have) type 2, but if you’re evaluating prevention schemes it probably makes sense to err on the side of caution (better to miss some type 2 patients than to include some type 1s), assuming the timing of the intervention is not too important. This gets more complicated if prevention schemes are more likely to have large and persistent effects in young people; however I don’t think that’s the case. As a counterpoint, drug adherence studies often seem to find that young people aren’t particularly motivated to adhere to their treatment schedules compared to their older counterparts (who might have more advanced disease and so are more likely to achieve symptomatic relief by adhering to treatments).

A few more observations from the paper:

“The prevalence of participants with diabetes in the study population was 7.4%, of whom 54% were diagnosed between the ages of 45 and 64 years. The mean age at diagnosis was 55 years, and the mean length of time since diagnosis was 9.4 years (39% of participants with diabetes had been diagnosed for ≤5 years, 32% for 6–15 years, and 27% for ≥16 years). […] The observed annual medical spending for people with diabetes was $13,966—more than twice that for people without diabetes.”

“Regardless of diabetes status, the survival-adjusted annual medical spending decreased after age 60 years, primarily because of a decreasing probability of survival. Because the probability of survival decreased more rapidly in people with diabetes than in those without, corresponding spending declined as people died and no longer accrued medical costs. For example, among men diagnosed with diabetes at age 40 years, 34% were expected to survive to age 80 years; among men of the same age who never developed diabetes, 55% were expected to survive to age 80 years. The expected annual expenditure for a person diagnosed with diabetes at age 40 years declined from $8,500 per year at age 40 years to $3,400 at age 80 years, whereas the expenses for a comparable person without diabetes declined from $3,900 to $3,200 over that same interval. […] People diagnosed with diabetes at age 40 years lived with the disease for an average of 34 years after diagnosis. Those diagnosed when older lived fewer years and, therefore, lost fewer years of life. […] The annual excess medical spending attributed to diabetes […] was smaller among people who were diagnosed at older ages. For men diagnosed at age 40 years, annual medical spending was $3,700 higher than that of similar men without diabetes; spending was $2,900 higher for those diagnosed at age 50 years; $2,200 higher for those diagnosed at age 60 years; and $2,000 higher for those diagnosed at age 65 years. Among women diagnosed with diabetes, the excess annual medical spending was consistently higher than for men of the same age at diagnosis.”

“Regardless of age at diagnosis, people with diabetes spent considerably more on health care after age 65 years than their nondiabetic counterparts. Health care spending attributed to diabetes after age 65 years ranged from $23,900 to $40,900, depending on sex and age at diagnosis. […] Of the total excess lifetime medical spending among an average diabetic patient diagnosed at age 50 years, prescription medications and inpatient care accounted for 44% and 35% of costs, respectively. Outpatient care and other medical care accounted for 17% and 4% of costs, respectively.”

“Our findings differed from those of studies of the lifetime costs of other chronic conditions. For instance, smokers have a lower average lifetime medical cost than nonsmokers (29) because of their shorter life spans. Smokers have a life expectancy about 10 years less than those who do not smoke (30); life expectancy is 16 years less for those who develop smoking-induced cancers (31). As a result, smoking cessation leads to increased lifetime spending (32). Studies of the lifetime costs for an obese person relative to a person with normal body weight show mixed results: estimated excess lifetime medical costs for people with obesity range from $3,790 less to $39,000 more than costs for those who are nonobese (33,34). […] obesity, when considered alone, results in much lower annual excess medical costs than diabetes (–$940 to $1,150 for obesity vs. $2,000 to $4,700 for diabetes) when compared with costs for people who are nonobese (33,34).”

iv. Severe Hypoglycemia and Mortality After Cardiovascular Events for Type 1 Diabetic Patients in Sweden.

“This study examines factors associated with all-cause mortality after cardiovascular complications (myocardial infarction [MI] and stroke) in patients with type 1 diabetes. In particular, we aim to determine whether a previous history of severe hypoglycemia is associated with increased mortality after a cardiovascular event in type 1 diabetic patients.

Hypoglycemia is the most common and dangerous acute complication of type 1 diabetes and can be life threatening if not promptly treated (1). The average individual with type 1 diabetes experiences about two episodes of symptomatic hypoglycemia per week, with an annual prevalence of 30–40% for hypoglycemic episodes requiring assistance for recovery (2). We define severe hypoglycemia to be an episode of hypoglycemia that requires hospitalization in this study. […] Patients with type 1 diabetes are more susceptible to hypoglycemia than those with type 2 diabetes, and therefore it is potentially of greater relevance if severe hypoglycemia is associated with mortality (6).”

“This study uses a large linked data set comprising health records from the Swedish National Diabetes Register (NDR), which were linked to administrative records on hospitalization, prescriptions, and national death records. […] [The] study is based on data from four sources: 1) risk factor data from the Swedish NDR […], 2) hospital records of inpatient episodes from the National Inpatients Register (IPR) […], 3) death records […], and 4) prescription data records […]. A study comparing registered diagnoses in the IPR with information in medical records found positive predictive values of IPR diagnoses were 85–95% for most diagnoses (8). In terms of NDR coverage, a recent study found that 91% of those aged 18–34 years and with type 1 diabetes in the Prescribed Drug Register could be matched with those in the NDR for 2007–2009 (9).”

“The outcome of the study was all-cause mortality after a major cardiovascular complication (MI or stroke). Our sample for analysis included patients with type 1 diabetes who visited a clinic after 2002 and experienced a major cardiovascular complication after this clinic visit. […] We define type 1 diabetes as diabetes diagnosed under the age of 30 years, being reported as being treated with insulin only at some clinic visit, and when alive, having had at least one prescription for insulin filled per year between 2006 and 2010 […], and not having filled a prescription for metformin at any point between July 2005 and December 2010 (under the assumption that metformin users were more likely to be type 2 diabetes patients).”

“Explanatory variables included in both models were type of complication (MI or stroke), age at complication, duration of diabetes, sex, smoking status, HbA1c, BMI, systolic blood pressure, diastolic blood pressure, chronic kidney disease status based on estimated glomerular filtration rate, microalbuminuria and macroalbuminuria status, HDL, LDL, total–to–HDL cholesterol ratio, triglycerides, lipid medication status, clinic visits within the year prior to the CVD event, and prior hospitalization events: hypoglycemia, hyperglycemia, MI, stroke, heart failure, AF, amputation, PVD, ESRD, IHD/unstable angina, PCI, and CABG. The last known value for each clinical risk factor, prior to the cardiovascular complication, was used for analysis. […] Initially, all explanatory variables were included and excluded if the variable was not statistically significant at a 5% level (P < 0.05) via stepwise backward elimination.” [Aaaaaaargh! – US. These guys are doing a lot of things right, but this is not one of them. Just to mention this one more time: “Generally, hypothesis testing is a very poor basis for model selection […] There is no statistical theory that supports the notion that hypothesis testing with a fixed α level is a basis for model selection.” (Burnham & Anderson)]
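To make the criticized procedure concrete, here is a minimal sketch of p-value-based backward elimination — shown for illustration, not endorsement (as noted above, information criteria are the better tool). All names and the simulated data are my own; p-values use a normal approximation to the t-statistic rather than any particular software's output.

```python
import math
import numpy as np

def ols_pvalues(X, y):
    """OLS fit; two-sided p-values via a normal approximation to the t-statistic."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)
    se = np.sqrt(sigma2 * np.diag(XtX_inv))
    z = np.abs(beta / se)
    return np.array([math.erfc(v / math.sqrt(2)) for v in z])  # 2*(1 - Phi(|z|))

def backward_eliminate(X, y, names, alpha=0.05):
    """Repeatedly drop the least-significant variable until all remaining
    p-values are below alpha -- the scheme quoted above."""
    names = list(names)
    while len(names) > 1:
        pvals = ols_pvalues(X, y)
        worst = int(np.argmax(pvals[1:])) + 1  # never drop the intercept (column 0)
        if pvals[worst] < alpha:
            break
        X = np.delete(X, worst, axis=1)
        del names[worst]
    return names

rng = np.random.default_rng(0)
n = 200
x1, x2, x3 = rng.normal(size=(3, n))
y = 2.0 * x1 + rng.normal(size=n)          # only x1 truly matters here
X = np.column_stack([np.ones(n), x1, x2, x3])
kept = backward_eliminate(X, y, ["intercept", "x1", "x2", "x3"])
print(kept)  # x1 survives; each pure-noise variable still has a ~5% chance of surviving
```

That last comment is part of the problem: with many candidate variables, some noise variables will survive by chance, and the retained p-values no longer mean what they appear to mean.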

“Patients who had prior hypoglycemic events had an estimated HR for mortality of 1.79 (95% CI 1.37–2.35) in the first 28 days after a CVD event and an estimated HR of 1.25 (95% CI 1.02–1.53) of mortality after 28 days post CVD event in the backward regression model. The univariate analysis showed a similar result compared with the backward regression model, with prior hypoglycemic events having an estimated HR for mortality of 1.79 (95% CI 1.38–2.32) and 1.35 (95% CI 1.11–1.65) in the logistic and Cox regressions, respectively. Even when all explanatory factors were included in the models […], the mortality increase associated with a prior severe hypoglycemic event was still significant, and the P values and SE are similar when compared with the backward stepwise regression. Similarly, when explanatory factors were included individually, the mortality increase associated with a prior severe hypoglycemic event was also still significant.” [Again, this sort of testing scheme is probably not a good approach to getting at a good explanatory model, but it’s what they did – US]
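As an aside, a reported hazard ratio and its 95% CI are enough to back out the log-scale standard error and a Wald p-value, since the CI is (approximately) symmetric on the log scale. A quick sketch using the first HR quoted above (this back-calculation is mine, not the paper's):

```python
import math

def hr_stats(hr, lo, hi, z_crit=1.96):
    """Recover the log-scale SE and a two-sided Wald p-value from a reported
    hazard ratio and its 95% CI (assumed symmetric on the log scale)."""
    se = (math.log(hi) - math.log(lo)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # 2*(1 - Phi(|z|))
    return se, z, p

# HR 1.79 (95% CI 1.37-2.35) for mortality in the first 28 days after a CVD event
se, z, p = hr_stats(1.79, 1.37, 2.35)
print(round(se, 3), round(z, 2), p)
```

Here the implied p-value is far below 0.05, which is consistent with the authors' finding that the hypoglycemia association remained significant across their model specifications.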

“The 5-year cumulative estimated mortality risk for those without complications after MI and stroke were 40.1% (95% CI 35.2–45.1) and 30.4% (95% CI 26.3–34.6), respectively. Patients with prior heart failure were at the highest estimated 5-year cumulative mortality risk, with those who suffered an MI and stroke having a 56.0% (95% CI 47.5–64.5) and 44.0% (95% CI 35.8–52.2) 5-year cumulative mortality risk, respectively. Patients who had a prior severe hypoglycemic event and suffered an MI had an estimated 5-year cumulative mortality risk at age 60 years of 52.4% (95% CI 45.3–59.5), and those who suffered a stroke had a 5-year cumulative mortality risk of 39.8% (95% CI 33.4–46.3). Patients at age 60 years who suffer a major CVD complication have over twofold risk of 5-year mortality compared with the general type 1 diabetic Swedish population, who had an estimated 5-year mortality risk of 13.8% (95% CI 12.0–16.1).”

“We found evidence that prior severe hypoglycemia is associated with reduced survival after a major CVD event but no evidence that prior severe hypoglycemia is associated with an increased risk of a subsequent CVD event.

Compared with the general type 1 diabetic Swedish population, a major CVD complication increased 5-year mortality risk at age 60 years by >25% and 15% in patients with an MI and stroke, respectively. Patients with a history of a hypoglycemic event had an even higher mortality after a major CVD event, with approximately an additional 10% being dead at the 5-year mark. This risk was comparable with that in those with late-stage kidney disease. This information is useful in determining the prognosis of patients after a major cardiovascular event and highlights the need to include this as a risk factor in simulation models (18) that are used to improve decision making (19).”

“This is the first study that has found some evidence of a dose-response relationship, where patients who experienced two or more severe hypoglycemic events had higher mortality after a cardiovascular event compared with those who experienced one severe hypoglycemic event. A lack of statistical power prevented us from investigating this further when we tried to stratify by number of prior severe hypoglycemic events in our regression models. There was no evidence of a dose-response relationship between repeated episodes of severe hypoglycemia and vascular outcomes or death in previous type 2 diabetes studies (5).”

v. Alterations in White Matter Structure in Young Children With Type 1 Diabetes.

“Careful regulation of insulin dosing, dietary intake, and activity levels are essential for optimal glycemic control in individuals with type 1 diabetes. However, even with optimal treatment many children with type 1 diabetes have blood glucose levels in the hyperglycemic range for more than half the day and in the hypoglycemic range for an hour or more each day (1). Brain cells may be especially sensitive to aberrant blood glucose levels, as glucose is the brain’s principal substrate for its energy needs.

Research in animal models has shown that white matter (WM) may be especially sensitive to dysglycemia-associated insult in diabetes (2–4). […] Early childhood is a period of rapid myelination and brain development (6) and of increased sensitivity to insults affecting the brain (6,7). Hence, study of the developing brain is particularly important in type 1 diabetes.”

“WM structure can be measured with diffusion tensor imaging (DTI), a method based on magnetic resonance imaging (MRI) that uses the movement of water molecules to characterize WM brain structure (8,9). Results are commonly reported in terms of mathematical scalars (representing vectors in vector space) such as fractional anisotropy (FA), axial diffusivity (AD), and radial diffusivity (RD). FA reflects the degree of diffusion anisotropy of water (how diffusion varies along the three axes) within a voxel (three-dimensional pixel) and is determined by fiber diameter and density, myelination, and intravoxel fiber-tract coherence (increases in which would increase FA), as well as extracellular diffusion and interaxonal spacing (increases in which would decrease FA) (10). AD, a measure of water diffusivity along the main axis of diffusion within a voxel, is thought to reflect fiber coherence and structure of axonal membranes (increases in which would increase AD), as well as microtubules, neurofilaments, and axonal branching (increases in which would decrease AD) (11,12). RD, the mean of the diffusivities perpendicular to the vector with the largest eigenvalue, is thought to represent degree of myelination (13,14) (more myelin would decrease RD values) and axonal “leakiness” (which would increase RD). Often, however, a combination of these WM characteristics results in opposing contributions to the final observed FA/AD/RD value, and thus DTI scalars should not be interpreted globally as “good” or “bad” (15). Rather, these scalars can show between-group differences and relationships between WM structure and clinical variables and are suggestive of underlying histology. Definitive conclusions about histology of WM can only be derived from direct microscopic examination of biological tissue.”
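The scalars described above are standard functions of the three eigenvalues of the diffusion tensor within a voxel. A minimal sketch of the usual definitions (these are the textbook formulas, not anything specific to this study's pipeline):

```python
import numpy as np

def dti_scalars(evals):
    """Standard DTI scalars from the three eigenvalues of the diffusion tensor,
    sorted so that l1 corresponds to the principal (axial) direction."""
    l1, l2, l3 = sorted(evals, reverse=True)
    ad = l1                      # axial diffusivity: along the main axis
    rd = (l2 + l3) / 2.0         # radial diffusivity: mean of the two minor axes
    md = (l1 + l2 + l3) / 3.0    # mean diffusivity
    fa = np.sqrt(1.5 * ((l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2)
                 / (l1 ** 2 + l2 ** 2 + l3 ** 2))
    return {"AD": ad, "RD": rd, "MD": md, "FA": fa}

print(dti_scalars([1.0, 1.0, 1.0])["FA"])  # isotropic diffusion: FA = 0
print(dti_scalars([1.7, 0.3, 0.2])["FA"])  # coherent fiber-like profile: FA near 1
```

The formulas make the point in the quote concrete: FA, AD, and RD are compressions of the same three numbers, so several distinct histological changes can push a given scalar in opposite directions at once.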

“Children (ages 4 to <10 years) with type 1 diabetes (n = 127) and age-matched nondiabetic control subjects (n = 67) had diffusion weighted magnetic resonance imaging scans in this multisite neuroimaging study. Participants with type 1 diabetes were assessed for HbA1c history and lifetime adverse events, glucose levels were monitored using a continuous glucose monitor (CGM) device, and cognition was assessed with standardized measures.

RESULTS Between-group analysis showed that children with type 1 diabetes had significantly reduced axial diffusivity (AD) in widespread brain regions compared with control subjects. Within the type 1 diabetes group, earlier onset of diabetes was associated with increased radial diffusivity (RD) and longer duration was associated with reduced AD, reduced RD, and increased fractional anisotropy (FA). In addition, HbA1c values were significantly negatively associated with FA values and were positively associated with RD values in widespread brain regions. Significant associations of AD, RD, and FA were found for CGM measures of hyperglycemia and glucose variability but not for hypoglycemia. Finally, we observed a significant association between WM structure and cognitive ability in children with type 1 diabetes but not in control subjects. […] These results suggest vulnerability of the developing brain in young children to effects of type 1 diabetes associated with chronic hyperglycemia and glucose variability.”

“The profile of reduced overall AD in type 1 diabetes observed here suggests possible axonal damage associated with diabetes (30). Reduced AD was associated with duration of type 1 diabetes suggesting that longer exposure to diabetes worsens the insult to WM structure. However, measures of hyperglycemia and glucose variability were either not associated or were positively associated with AD values, suggesting that these measures did not contribute to the observed decreased AD in the type 1 diabetes group. A possible explanation for these observations is that several biological processes influence WM structure in type 1 diabetes. Some processes may be related to insulin insufficiency or C-peptide levels independent of glucose levels (31,32) and may affect WM coherence (and reduce AD values as observed in the between-group results). Other processes related to hyperglycemia and glucose variability may target myelin (resulting in reduced FA and increased RD) as well as reduced axonal branching (both would result in increased AD values). Alternatively, these seemingly conflicting AD observations may be due to a dominant effect of age, which could overshadow effects from dysglycemia.

Early age of onset is one of the most replicable risk factors for cognitive impairments in type 1 diabetes (33,34). It has been hypothesized that young children are especially vulnerable to brain insults resulting from episodes of chronic hyperglycemia, hypoglycemia, and acute hypoglycemic complications of type 1 diabetes (seizures and severe hypoglycemic episodes). In addition, fear of hypoglycemia often results in caregivers maintaining relatively higher blood glucose to avoid lows altogether (1), especially in very young children. However, our study suggests that this approach of aggressive hypoglycemia avoidance resulting in hyperglycemia may not be optimal and may be detrimental to WM structure in young children.

Neuronal damage (reflected in altered WM structure) may affect neuronal signal transfer and, thus, cognition (35). Cognitive domains commonly reported to be affected in children with type 1 diabetes include general intellectual ability, visuospatial abilities, attention, memory, processing speed, and executive function (36–38). In our sample, even though the duration of illness was relatively short (2.9 years on average), there were modest but significant cognitive differences between children with type 1 diabetes and control subjects (24).”

“In summary, we present results from the largest study to date investigating WM structure in very young children with type 1 diabetes. We observed significant and widespread brain differences in the WM microstructure of children with type 1 diabetes compared with nondiabetic control subjects and significant associations between WM structure and measures of hyperglycemia, glucose variability, and cognitive ability in the type 1 diabetic population.”

vi. Ultrasound Findings After Surgical Decompression of the Tarsal Tunnel in Patients With Painful Diabetic Polyneuropathy: A Prospective Randomized Study.

“Polyneuropathy is a common complication in diabetes. The prevalence of neuropathy in patients with diabetes is ∼30%. During the course of the disease, up to 50% of the patients will eventually develop neuropathy (1). Its clinical features are characterized by numbness, tingling, or burning sensations and typically extend in a distinct stocking and glove pattern. Prevention plays a key role since poor glucose control is a major risk factor in the development of diabetic polyneuropathy (DPN) (1,2).

There is no clear definition for the onset of painful diabetic neuropathy. Different hypotheses have been formulated.

Hyperglycemia in diabetes can lead to osmotic swelling of the nerves, related to increased glucose conversion into sorbitol by the enzyme aldose reductase (2,3). High sorbitol concentrations might also directly cause axonal degeneration and demyelination (2). Furthermore, stiffening and thickening of ligamental structures and the plantar fascia make underlying structures more prone to biomechanical compression (4–6). A thicker and stiffer retinaculum might restrict movements and lead to alterations of the nerve in the tarsal tunnel.

Both swelling of the nerve and changes in the tarsal tunnel might lead to nerve damage through compression.

Furthermore, vascular changes may diminish endoneural blood flow and oxygen distribution. Decreased blood supply in the (compressed) nerve might lead to ischemic damage as well as impaired nerve regeneration.

Several studies suggest that surgical decompression of nerves at narrow anatomic sites, e.g., the tarsal tunnel, is beneficial and has a positive effect on pain, sensitivity, balance, long-term risk of ulcers and amputations, and quality of life (3,7–10). Since the effect of decompression of the tibial nerve in patients with DPN has not been proven with a randomized clinical trial, its contribution as treatment for patients with painful DPN is still controversial. […] In this study, we compare the mean CSA and any changes in shape of the tibial nerve before and after decompression of the tarsal tunnel using ultrasound in order to test the hypothesis that the tarsal tunnel leads to compression of the tibial nerve in patients with DPN.”

“This study, with a large sample size and standardized sonographic imaging procedure with a good reliability, is the first randomized controlled trial that evaluates the effect of decompression of the tibial nerve on the CSA. Although no effect on CSA after surgery was found, this study using ultrasound demonstrates a larger and swollen tibial nerve and thicker flexor retinaculum at the ankle in patients with DPN compared with healthy control subjects.”

I would have been interested to know if there were any observable changes in symptom relief measures post-surgery, even if such variables are less ‘objective’ than measures like CSA (less objective, but perhaps more relevant to the patient…), but the authors did not look at those kinds of variables.

vii. Nonalcoholic Fatty Liver Disease Is Independently Associated With an Increased Incidence of Chronic Kidney Disease in Patients With Type 1 Diabetes.

“Nonalcoholic fatty liver disease (NAFLD) has reached epidemic proportions worldwide (1). Up to 30% of adults in the U.S. and Europe have NAFLD, and the prevalence of this disease is much higher in people with diabetes (1,2). Indeed, the prevalence of NAFLD on ultrasonography ranges from ∼50 to 70% in patients with type 2 diabetes (3–5) and ∼40 to 50% in patients with type 1 diabetes (6,7). Notably, patients with diabetes and NAFLD are also more likely to develop more advanced forms of NAFLD that may result in end-stage liver disease (8). However, accumulating evidence indicates that NAFLD is associated not only with liver-related morbidity and mortality but also with an increased risk of developing cardiovascular disease (CVD) and other serious extrahepatic complications (8–10).”

“Increasing evidence indicates that NAFLD is strongly associated with an increased risk of CKD [chronic kidney disease, US] in people with and without diabetes (11). Indeed, we have previously shown that NAFLD is associated with an increased prevalence of CKD in patients with both type 1 and type 2 diabetes (15–17), and that NAFLD independently predicts the development of incident CKD in patients with type 2 diabetes (18). However, many of the risk factors for CKD are different in patients with type 1 and type 2 diabetes, and to date, it is uncertain whether NAFLD is an independent risk factor for incident CKD in type 1 diabetes or whether measurement of NAFLD improves risk prediction for CKD, taking account of traditional risk factors for CKD.

Therefore, the aim of the current study was to investigate 1) whether NAFLD is associated with an increased incidence of CKD and 2) whether measurement of NAFLD improves risk prediction for CKD, adjusting for traditional risk factors, in type 1 diabetic patients.”

“Using a retrospective, longitudinal cohort study design, we have initially identified from our electronic database all Caucasian type 1 diabetic outpatients with preserved kidney function (i.e., estimated glomerular filtration rate [eGFR] ≥60 mL/min/1.73 m2) and with no macroalbuminuria (n = 563), who regularly attended our adult diabetes clinic between 1999 and 2001. Type 1 diabetes was diagnosed by the typical presentation of disease, the absolute dependence on insulin treatment for survival, the presence of undetectable fasting C-peptide concentrations, and the presence of anti–islet cell autoantibodies. […] Overall, 261 type 1 diabetic outpatients were included in the final analysis and were tested for the development of incident CKD during the follow-up period […] All participants were periodically seen (every 3–6 months) for routine medical examinations of glycemic control and chronic complications of diabetes. No participants were lost to follow-up. […] For this study, the development of incident CKD was defined as occurrence of eGFR <60 mL/min/1.73 m2 and/or macroalbuminuria (21). Both of these outcome measures were confirmed in all participants on at least two consecutive occasions (within 3–6 months after the first examination).”
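For readers unfamiliar with the eGFR thresholds used here: the paper's eGFRMDRD is presumably the 4-variable MDRD study equation. A sketch of its common IDMS-traceable form follows; the coefficients are the standard published ones, which I am assuming rather than taking from this paper.

```python
def egfr_mdrd(scr_mg_dl, age_years, female, black=False):
    """4-variable MDRD study equation (common IDMS-traceable form).
    Returns eGFR in mL/min/1.73 m^2; standard published coefficients assumed."""
    egfr = 175.0 * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr

# e.g. a 50-year-old man with serum creatinine 1.0 mg/dL
print(round(egfr_mdrd(1.0, 50, female=False)))  # comfortably above the 60 threshold
```

The study's CKD outcome then amounts to this value dropping below 60 (and/or macroalbuminuria developing), confirmed on repeat testing.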

“At baseline, the mean eGFRMDRD was 92 ± 23 mL/min/1.73 m2 (median 87.9 [IQR 74–104]), or eGFREPI was 98.6 ± 19 mL/min/1.73 m2 (median 99.7 [84–112]). Most patients (n = 234; 89.7%) had normal albuminuria, whereas 27 patients (10.3%) had microalbuminuria. NAFLD was present in 131 patients (50.2%). […] At baseline, patients who developed CKD at follow-up were older, more likely to be female and obese, and had a longer duration of diabetes than those who did not. These patients also had higher values of systolic blood pressure, A1C, triglycerides, serum GGT, and urinary ACR and lower values of eGFRMDRD and eGFREPI. Moreover, there was a higher percentage of patients with hypertension, metabolic syndrome, microalbuminuria, and some degree of diabetic retinopathy in patients who developed CKD at follow-up compared with those remaining free from CKD. The proportion using antihypertensive drugs (that always included the use of ACE inhibitors or angiotensin receptor blockers) was higher in those who progressed to CKD. Notably, […] this patient group also had a substantially higher frequency of NAFLD on ultrasonography.”

“During follow-up (mean duration 5.2 ± 1.7 years, range 2–10), 61 patients developed CKD using the MDRD study equation to estimate eGFR (i.e., ∼4.5% of participants progressed every year to eGFR <60 mL/min/1.73 m2 or macroalbuminuria). Of these, 28 developed an eGFRMDRD <60 mL/min/1.73 m2 with abnormal albuminuria (micro- or macroalbuminuria), 21 developed a reduced eGFRMDRD with normal albuminuria (but 9 of them had some degree of diabetic retinopathy at baseline), and 12 developed macroalbuminuria alone. None of them developed kidney failure requiring chronic dialysis. […] The annual eGFRMDRD decline for the whole cohort was 2.68 ± 3.5 mL/min/1.73 m2 per year. […] NAFLD patients had a greater annual decline in eGFRMDRD than those without NAFLD at baseline (3.28 ± 3.8 vs. 2.10 ± 3.0 mL/min/1.73 m2 per year, P < 0.005). Similarly, the frequency of a renal functional decline (arbitrarily defined as ≥25% loss of baseline eGFRMDRD) was greater among those with NAFLD than among those without the disease (26 vs. 11%, P = 0.005). […] Interestingly, BMI was not significantly associated with CKD.”

“Our novel findings indicate that NAFLD is strongly associated with an increased incidence of CKD during a mean follow-up of 5 years and that measurement of NAFLD improves risk prediction for CKD, independently of traditional risk factors (age, sex, diabetes duration, A1C, hypertension, baseline eGFR, and microalbuminuria [i.e., the last two factors being the strongest known risk factors for CKD]), in type 1 diabetic adults. Additionally, although NAFLD was strongly associated with obesity, obesity (or increased BMI) did not explain the association between NAFLD and CKD. […] The annual cumulative incidence rate of CKD in our cohort of patients (i.e., ∼4.5% per year) was essentially comparable to that previously described in other European populations with type 1 diabetes and similar baseline characteristics (∼2.5–9% of patients who progressed every year to CKD) (25,26). In line with previously published information (25–28), we also found that hypertension, microalbuminuria, and lower eGFR at baseline were strong predictors of incident CKD in type 1 diabetic patients.”

“There is a pressing and unmet need to determine whether NAFLD is associated with a higher risk of CKD in people with type 1 diabetes. It has only recently been recognized that NAFLD represents an important burden of disease for type 2 diabetic patients (11,17,18), but the magnitude of the problem of NAFLD and its association with risk of CKD in type 1 diabetes is presently poorly recognized. Although there is clear evidence that NAFLD is closely associated with a higher prevalence of CKD both in those without diabetes (11) and in those with type 1 and type 2 diabetes (15–17), only four prospective studies have examined the association between NAFLD and risk of incident CKD (18,29–31), and only one of these studies was published in patients with type 2 diabetes (18). […] The underlying mechanisms responsible for the observed association between NAFLD and CKD are not well understood. […] The possible clinical implication for these findings is that type 1 diabetic patients with NAFLD may benefit from more intensive surveillance or early treatment interventions to decrease the risk for CKD. Currently, there is no approved treatment for NAFLD. However, NAFLD and CKD share numerous cardiometabolic risk factors, and treatment strategies for NAFLD and CKD should be similar and aimed primarily at modifying the associated cardiometabolic risk factors.”


October 25, 2017 Posted by | Cardiology, Diabetes, Epidemiology, Genetics, Health Economics, Medicine, Nephrology, Neurology, Pharmacology, Statistics, Studies |

Infectious Disease Surveillance (IV)

I have added some more observations from the second half of the book below.

“The surveillance systems for all stages of HIV infection, including stage 3 (AIDS), are the most highly developed, complex, labor-intensive, and expensive of all routine infectious disease surveillance systems. […] Although some behaviorally based prevention interventions (e.g., individual counseling and testing) are relatively inexpensive and simple to implement, others are expensive and difficult to maintain. Consequently, HIV control programs have added more treatment-based methods in recent years. These consist primarily of routine and, in some populations, repeated and frequent testing for HIV with an emphasis on diagnosing every infected person as quickly as possible, linking them to clinical care, prescribing ART, monitoring for retention in care, and maintaining an undetectable viral load. This approach is referred to as “treatment as prevention.” […] Prior to the advent of HAART in the mid-1990s, surveillance consisted primarily of collecting initial HIV diagnosis, followed by monitoring of progression to AIDS and death. The current need to monitor adherence to treatment and care has led to surveillance to collect results of all CD4 count and viral load tests conducted on HIV-infected persons. Treatment guidelines recommend such testing quarterly [11], leading to dozens of laboratory tests being reported for each HIV-infected person in care; hence, the need to receive laboratory results electronically and efficiently has increased. […] The standard set by CDC for completeness is that at least 85% of diagnosed cases are reported to public health within the year of diagnosis. […] As HIV-infected persons live longer as a consequence of ART, the scope of HIV surveillance has expanded […] A critical part of collecting HIV data is maintaining the database.”

“The World Health Organization (WHO) estimates that 8.7 million new cases of TB and 1.4 million deaths from TB occurred in 2011 worldwide [2]. […] WHO estimates that one of every three individuals worldwide is infected with TB [6]. An estimated 5–10% of persons with LTBI [latent TB infection] in the general population will eventually develop active TB disease. Persons with latent infection who are immune suppressed for any reason are more likely to develop active disease. It is estimated that people infected with human immunodeficiency virus (HIV) are 21–34 times more likely to progress from latent to active TB disease […] By 2010, the percentage of all TB cases tested for HIV was 65% and the prevalence of coinfection was 6% [in the United States] [4]. […] From a global perspective, the United States is considered a low morbidity and mortality country for TB. In 2010, the national annual incidence rate for TB was 3.6 per 100,000 persons with 11,182 reported cases of TB […] In 1953, 113,531 tuberculosis cases were reported in the United States […] Tuberculosis surveillance in the United States has changed a great deal in depth and quality since its inception more than a century ago. […] To assure uniformity and standardization of surveillance data, all TB programs in the United States report verified TB cases via the Report of Verified Case of Tuberculosis (RVCT) [43]. The RVCT collects demographic, diagnostic, clinical, and risk-factor information on incident TB cases […] A companion form, the Follow-up 1 (FU-1), records the date of specimen collection and results of the initial drug susceptibility test at the time of diagnosis for all culture-confirmed TB cases. […] The Follow-up 2 (FU-2) form collects outcome data on patient treatment and additional clinical and laboratory information. […] Since 1993, the RVCT, FU-1, and FU-2 have been used to collect demographic and clinical information, as well as laboratory results for all reported TB cases in the United States […] The RVCT collects information about known risk factors for TB disease; and in an effort to more effectively monitor TB caused by drug-resistant strains, CDC also gathers information regarding drug susceptibility testing for culture-confirmed cases on the FU-2.”

“Surveillance data may come from widely different systems with different specific purposes. It is essential that the purpose and context of any specific system be understood before attempting to analyze and interpret the surveillance data produced by that system. It is also essential to understand the methodology by which the surveillance system collects data. […] The most fundamental challenge for analysis and interpretation of surveillance data is the identification of a baseline. […] For infections characterized by seasonal outbreaks, the baseline range will vary by season in a generally predictable manner […] The comparison of observations to the baseline range allows characterization of the impact of intentional interventions or natural phenomenon and determination of the direction of change. […] Resource investment in surveillance often occurs in response to a newly recognized disease […] a suspected change in the frequency, virulence, geography, or risk population of a familiar disease […] or following a natural disaster […] In these situations, no baseline data are available against which to judge the significance of data collected under newly implemented surveillance.”

“Differences in data collection methods may result in apparent differences in disease occurrence between geographic regions or over time that are merely artifacts resulting from variations in surveillance methodology. Data should be analyzed using standard periods of observation […] It may be helpful to examine the same data by varied time frames. An outbreak of short duration may be recognizable through hourly, daily, or weekly grouping of data but obscured if data are examined only on an annual basis. Conversely, meaningful longer-term trends may be recognized more efficiently by examining data on an annual basis or at multiyear intervals. […] An early approach to analysis of infectious disease surveillance data was to convert observation of numbers into observations of rates. Describing surveillance observations as rates […] standardizes the data in a way that allows comparisons of the impact of disease across time and geography and among different populations”.
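The count-to-rate conversion is trivial but worth making explicit. Using the U.S. TB figures quoted earlier (11,182 cases in 2010) together with the 2010 census population of roughly 308.7 million (the population figure is my addition, not from the book) reproduces the quoted rate:

```python
def incidence_rate(cases, population, per=100_000):
    """Convert a case count to a rate per `per` persons."""
    return cases / population * per

# 11,182 reported U.S. TB cases in 2010; 2010 census population ~308.7 million
rate = incidence_rate(11_182, 308_700_000)
print(round(rate, 1))  # ~3.6 per 100,000, matching the figure quoted above
```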

“Understanding the sensitivity and specificity of surveillance systems is important. […] Statistical methods based on tests of randomness have been applied to infectious disease surveillance data for the purpose of analysis of aberrations. Methods include adaptations of quality control charts from industry; Bayesian, cluster, regression, time series, and bootstrap analyses; and application of smoothing algorithms, simulation, and spatial statistics [1,14]. […] Time series forecasting and regression methods have been fitted to mortality data series to forecast future epidemics of seasonal diseases, most commonly influenza, and to estimate the excess associated mortality. […] While statistical analysis can be applied to surveillance data, the use of statistics for this purpose is often limited by the nature of surveillance data. Populations under surveillance are often not random samples of a general population, and may not be broadly representative, complicating efforts to use statistics to estimate morbidity and mortality impacts on populations. […] The more information an epidemiologist has about the purpose of the surveillance system, the people who perform the reporting, and the circumstances under which the data are collected and conveyed through the system, the more likely it is that the epidemiologist will interpret the data correctly. […] In the context of public health practice, a key value of surveillance data is not just in the observations from the surveillance system but also in the fact that these data often stimulate action to collect better data, usually through field investigations. Field investigations may improve understanding of risk factors that were suggested by the surveillance data itself. Often, field investigations triggered by surveillance observations lead to research studies such as case control comparisons that identify and better define the strength of risk factors.”
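The simplest of the control-chart-style methods mentioned just flags counts exceeding a baseline mean plus a multiple of its standard deviation. A minimal sketch with made-up weekly case counts (the data and threshold choice are mine):

```python
import statistics

def flag_aberrations(baseline, current, k=2.0):
    """Flag observations exceeding baseline mean + k standard deviations --
    a minimal control-chart-style aberration detector."""
    mu = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    threshold = mu + k * sd
    flagged = [i for i, count in enumerate(current) if count > threshold]
    return flagged, threshold

# hypothetical weekly case counts: a stable baseline, then a spike in week 3
baseline = [12, 15, 11, 14, 13, 12, 16, 14]
current = [13, 15, 12, 41, 14]
flagged, threshold = flag_aberrations(baseline, current)
print(flagged)  # [3]
```

This also illustrates the baseline problem discussed above: without baseline data (e.g. for a newly recognized disease), there is no `baseline` series from which to compute the threshold at all.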

“The increasing frequency of disease outbreaks that have spread across national borders has led to the development of multicountry surveillance networks. […] Countries that participate in surveillance networks typically agree to share disease outbreak information and to collaborate in efforts to control disease spread. […] Multicountry disease surveillance networks now exist in many parts of the world, such as the Middle East, Southeast Asia, Southern Africa, Southeastern Europe, and East Africa. […] Development of accurate and reliable diagnoses of illnesses is a fundamental challenge in global surveillance. Clinical specimen collection, analysis, and laboratory confirmation of the etiology of disease outbreaks are important components of any disease surveillance system [37]. In many areas of the world, however, insufficient diagnostic capacity leads to no or faulty diagnoses, inappropriate treatments, and disease misreporting. For example, surveillance for malaria is challenged by a common reliance on clinical symptoms for diagnosis, which has been shown to be a poor predictor of actual infection [38,39]. […] A WHO report indicates that more than 60% of laboratory equipment in countries with limited resources is outdated or not functioning [46]. Even when there is sufficient laboratory capacity, laboratory-based diagnosis of disease can also be slow, delaying detection of outbreaks. For example, it can take more than a month to determine whether a patient is infected with drug-resistant strains of tuberculosis. […] The International Health Regulations (IHR) codify the measures that countries must take to limit the international spread of disease while ensuring minimum interference with trade and travel. […] From the perspective of an individual nation, there are few incentives to report an outbreak of a disease to the international community. Rather, the decision to report diseases may result in adverse consequences — significant drops in tourism and trade, closings of borders, and other measures that the IHR are supposed to prevent.”

“Concerns about biological terrorism have raised the profile of infectious disease surveillance in the United States and around the globe [14]. […] Improving global surveillance for biological terrorism and emerging infectious diseases is now a major focus of the U.S. Department of Defense’s (DoD) threat reduction programs [17]. DoD spends more on global health surveillance than any other U.S. governmental agency [18].”

“Zoonoses, or diseases that can transmit between humans and animals, have been responsible for nearly two-thirds of infectious disease outbreaks that have occurred since 1950 and more than $200 billion in worldwide economic losses in the last 10 years [52]. Despite the significant economic and health threats caused by these diseases, worldwide capacity for surveillance of zoonotic diseases is insufficient [52]. […] Over the last few decades, there have been significant changes in the way in which infectious disease surveillance is practiced. New regulations and goals for infectious disease surveillance have given rise to the development of new surveillance approaches and methods and have resulted in participation by nontraditional sectors, including the security community. Though most of these developments have positively shaped global surveillance, there remain key challenges that stand in the way of continued improvements. These include insufficient diagnostic capabilities and lack of trained staff, lack of integration between human and animal-health surveillance efforts, disincentives for countries to report disease outbreaks, and lack of information exchange between public health agencies and other sectors that are critical for surveillance.”

“The biggest limitations to the development and sustainment of electronic disease surveillance systems, particularly in resource-limited countries, are the ease with which data are collected, accessed, and used by public health officials. Systems that require large amounts of resources, whether that is in the form of the workforce or information technology (IT) infrastructure, will not be successful in the long term. Successful systems run on existing hardware that can be maintained by modestly trained IT professionals and are easy to use by end users in public health [20].”

October 20, 2017 Posted by | Books, Epidemiology, Infectious disease, Medicine, Statistics

Beyond Significance Testing (V)

I never really finished my intended coverage of this book. Below I have added some observations from the last couple of chapters.

“Estimation of the magnitudes and precisions of interaction effects should be the focus of the analysis in factorial designs. Methods to calculate standardized mean differences for contrasts in such designs are not as well developed as those for one-way designs. Standardizers for single-factor contrasts should reflect variability as a result of intrinsic off-factors that vary naturally in the population, but variability due to extrinsic off-factors that do not vary naturally should be excluded. Measures of association may be preferred in designs with three or more factors or where some factors are random. […] There are multivariate versions of d statistics and measures of association for designs with two or more continuous outcomes. For example, a Mahalanobis distance is a multivariate d statistic, and it estimates the difference between two group centroids (the sets of all univariate means) in standard deviation units controlling for intercorrelation.”
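The Mahalanobis distance mentioned in the passage is easy to compute directly. A sketch with simulated two-outcome data, using the pooled within-group covariance matrix to "control for intercorrelation" as described (the data are made up):

```python
# Mahalanobis distance between two group centroids: the multivariate
# analogue of d, i.e., the distance between mean vectors in standard
# deviation units after accounting for the covariance of the outcomes.
import numpy as np

rng = np.random.default_rng(0)
g1 = rng.normal([0.0, 0.0], 1.0, size=(50, 2))   # group 1, two outcomes
g2 = rng.normal([1.0, 0.5], 1.0, size=(50, 2))   # group 2, shifted means

def mahalanobis_d(a: np.ndarray, b: np.ndarray) -> float:
    diff = a.mean(axis=0) - b.mean(axis=0)       # difference of centroids
    n1, n2 = len(a), len(b)
    pooled = ((n1 - 1) * np.cov(a, rowvar=False)
              + (n2 - 1) * np.cov(b, rowvar=False)) / (n1 + n2 - 2)
    return float(np.sqrt(diff @ np.linalg.inv(pooled) @ diff))

print(f"Mahalanobis D = {mahalanobis_d(g1, g2):.2f}")
```

With uncorrelated unit-variance outcomes, as simulated here, D reduces to the ordinary Euclidean distance between the standardized mean differences.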

“Replication is a foundational scientific activity but one neglected in the behavioral sciences. […] There is no single nomenclature to classify replication studies (e.g., Easley et al., 2000), but there is enough consensus to outline at least the broad types […] Internal replication includes statistical resampling and cross-validation by the original researcher(s). Resampling includes bootstrapping and related computer-based methods, such as the jackknife technique, that randomly combine the cases in an original data set in different ways to estimate the effect of idiosyncrasies in the sample on the results […] Such procedures are not replication in the usual scientific sense. The total sample in cross-validation is randomly divided into a derivation sample and a cross-validation sample, and the same analyses are conducted in each one. External replication is conducted by people other than the original researchers, and it involves new samples collected at different times or places.
There are two broad contexts for external replication. The first concerns different kinds of replications of experimental studies. One is exact replication, also known as direct replication, literal replication, or precise replication, where all major aspects of an original study — its sampling methods, design, and outcome measures — are closely copied. True exact replications exist more in theory than in practice because it is difficult to perfectly duplicate a study […] Another type is operational replication — also referred to as partial replication or improvisational replication — where just the sampling and methods of an original study are duplicated. […] The outcome of operational replication is potentially more informative than that of literal replication, because robust effects should stand out against variations in procedures, settings, or samples.
In balanced replication, operational replications are used as control conditions. Other conditions may represent the manipulation of additional substantive variables to test new hypotheses. […] The logic of balanced replication is similar to that of strong inference, which features designing studies to rule out competing explanations, and to that of dismantling research. The aim of the latter is to study elements of treatments with multiple components in smaller combinations to find the ones responsible for treatment efficacy.
A researcher who conducts a construct replication or conceptual replication avoids close imitation of the specific methods of an original study. An ideal construct replication would be carried out by telling a skilled researcher little more than the original empirical result. This researcher would then specify the design, measures, and data analysis methods deemed appropriate to test whether a finding has generality beyond the particular situation studied in an original work.”
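The bootstrap procedure described at the start of that passage (internal replication by resampling) can be sketched in a few lines of standard-library Python; the data are invented:

```python
# Bootstrap resampling: draw many samples *with replacement* from the
# original data and recompute the estimate each time, to gauge how much
# the result depends on the idiosyncrasies of the one sample at hand.
import random
from statistics import mean

random.seed(1)
sample = [4.1, 5.3, 3.8, 6.0, 4.9, 5.5, 4.4, 5.1, 3.9, 5.8]

boot_means = []
for _ in range(2000):
    resample = random.choices(sample, k=len(sample))  # with replacement
    boot_means.append(mean(resample))

boot_means.sort()
lo, hi = boot_means[49], boot_means[1949]  # ~2.5th / 97.5th percentiles
print(f"mean = {mean(sample):.2f}, 95% bootstrap CI ~ ({lo:.2f}, {hi:.2f})")
```

As the quoted authors stress, this is not replication in the scientific sense: no new cases are collected, so the interval reflects only sampling variability within the original study.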

“There is evidence that only small proportions — in some cases < 1% — of all published studies in the behavioral sciences are specifically described as replications (e.g., Easley et al., 2000; Kmetz, 2002). […] K. Hunt (1975), S. Schmidt (2009), and others have argued that most replication in the behavioral sciences occurs covertly in the form of follow-up studies, which combine direct replication (or at least construct replication) with new procedures, measures, or hypotheses in the same investigation. Such studies may be described by their authors as “extensions” of previous works with new elements but not as “replications,” […] the problem with this informal approach to replication is that it is not explicit and therefore is unsystematic. […] Perhaps replication would be more highly valued if confidence intervals were reported more often. Then readers of empirical articles would be able to see the low precision with which many studies are conducted. […] Wide confidence intervals indicate that a study contains only limited information, a fact that is concealed when only results of statistical tests are reported”.

“Because sets of related investigations in the behavioral sciences are generally made up of follow-up studies, the explanation of observed variability in their results is a common goal in meta-analysis. That is, the meta-analyst tries to identify and measure characteristics of follow-up studies that give rise to variability among the results. These characteristics include attributes of samples (e.g., mean age, gender), settings in which cases are tested (e.g., inpatient vs. outpatient), and the type of treatment administered (e.g., duration, dosage). Other factors concern properties of the outcome measures (e.g., self-report vs. observational), quality of the research design, source of funding (e.g., private vs. public), professional backgrounds of the authors, or date of publication. The last reflects the potential impact of temporal factors such as changing societal attitudes. […] Study factors are conceptualized as meta-analytic predictors, and study outcome measured with the same standardized effect size is typically the criterion. Each predictor is actually a moderator variable, which implies interaction. This is because the criterion, study effect size, usually represents the association between the independent and dependent variables. If observed variation in effect sizes across a set of studies is explained by a meta-analytic predictor, the relation between the independent and dependent variables changes across the levels of that predictor. For the same reason, the terms moderator variable analysis and meta-regression describe the process of estimating whether study characteristics explain variability in results. […] study factors can covary, such as when different variations of a treatment tend to be administered to patients with acute versus chronic forms of a disorder. If meta-analytic predictors covary, it is necessary to control for overlapping explained proportions of variability in effect sizes.
It is also possible for meta-analytic predictors to interact, which means that they have a joint influence on observed effect sizes. Interaction also implies that to understand variability in results, one must consider the predictors together. This is a subtle point, one that requires some elaboration: Each individual predictor in meta-analysis is a moderator variable. But the relation of one meta-analytic predictor to study outcome may depend on another predictor. For example, the effect of treatment type on observed effect sizes may depend on whether cases with mild versus severe forms of an illness were studied. A different kind of phenomenon is mediation, or indirect effects among study factors. Suppose that one factor is degree of early exposure to a toxic agent and another is illness chronicity. The exposure factor may affect study outcome both directly and indirectly through its influence on chronicity. Indirect effects can be estimated in meta-analysis by applying techniques from structural equation modeling to covariance matrices of study factors and effect sizes pooled over related studies. The use of both techniques together is called mediational meta-analysis or model-driven meta-analysis. […] It is just as important in meta-analysis as when conducting a primary study to clearly specify the hypotheses and operational definitions of constructs.”
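A minimal version of the moderator-variable analysis (meta-regression) described above: regress study effect sizes on a study-level predictor, weighting each study by the inverse of its sampling variance. This is a fixed-effect sketch with invented numbers; real meta-regressions usually add a between-study variance component (random effects):

```python
# Weighted least squares meta-regression: does a study-level moderator
# (here a hypothetical treatment "dose") explain variation in effect
# sizes across studies? All numbers are invented for illustration.
import numpy as np

d = np.array([0.20, 0.35, 0.55, 0.70, 0.90])      # study effect sizes
var = np.array([0.04, 0.03, 0.05, 0.02, 0.04])    # their sampling variances
dose = np.array([10.0, 20.0, 30.0, 40.0, 50.0])   # hypothetical moderator

w = 1.0 / var                                     # inverse-variance weights
X = np.column_stack([np.ones_like(dose), dose])   # intercept + moderator
W = np.diag(w)
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ d)  # WLS estimate

print(f"intercept = {beta[0]:.3f}, slope per unit dose = {beta[1]:.4f}")
```

A positive slope here would mean the independent-to-dependent-variable relation strengthens with dose, i.e., the moderation (interaction) the quoted passage describes.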

“There are ways to estimate in meta-analysis what is known as the fail-safe N, which is the number of additional studies where the average effect size is zero that would be needed to increase the p value in a meta-analysis for the test of the mean observed effect size to > .05 (i.e., the nil hypothesis is not rejected). These additional studies are assumed to be file drawer studies or to be otherwise not found in the literature search of a meta-analysis. If the estimated number of such studies is so large that it is unlikely that so many studies (e.g., 2,000) with a mean nil effect size could exist, more confidence in the results may be warranted. […] Studies from each source are subject to different types of biases. For example, bias for statistical significance implies that published studies have more H0 rejections and larger effect sizes than do unpublished studies […] There are techniques in meta-analysis for estimating the extent of publication bias […]. If such bias is indicated, a meta-analysis based mainly on published sources may be inappropriate.”
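The fail-safe N the authors describe is, in Rosenthal's classic file-drawer formulation, a simple function of the per-study z statistics; the z values below are invented:

```python
# Rosenthal's fail-safe N: the number of unretrieved null-result studies
# (mean z = 0) needed to pull the combined one-tailed p above .05.
# Standard formula: N_fs = (sum of z)^2 / z_crit^2 - k, z_crit = 1.645.

def fail_safe_n(z_values, z_crit=1.645):
    k = len(z_values)
    return (sum(z_values) ** 2) / (z_crit ** 2) - k

zs = [2.1, 1.8, 2.5, 1.2, 2.9, 1.6]   # invented per-study z statistics
print(f"fail-safe N ~ {fail_safe_n(zs):.0f} additional null studies")
```

So for these six hypothetical studies, roughly 48 file-drawer studies averaging zero effect would be needed to overturn the combined result; whether that number is implausibly large is the judgment call the quoted passage describes.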

“For two reasons, it is crucial to assess the […] research quality for each found primary study. The first is to eliminate from further consideration studies so flawed that their results are untrustworthy. […] The other reason concerns the remaining (nonexcluded) studies, which may be divided into those that are well designed versus those with significant limitations. Results synthesized from the former group may be given greater weight in the analysis than those from the latter group. […] Relatively high proportions of found studies in meta-analyses are often discarded due to poor rated quality, a sad comment on the status of a research literature. […] It is probably best to see meta-analysis as a way to better understand the status of a research area than as an end in itself or some magical substitute for critical thought. Its emphasis on effect sizes and the explicit description of study retrieval methods and assumptions is an improvement over narrative literature reviews. It also has the potential to address hypotheses not directly tested in primary studies. […] But meta-analysis does not solve the replication crisis in the behavioral sciences.”

“Conventional meta-analysis and Bayesian analysis are both methods for research synthesis, and it is worthwhile to briefly summarize their relative strengths. Both methods accumulate evidence about a parameter of interest and generate confidence intervals for that parameter. Both methods also allow sensitivity analysis of the consequences of making different kinds of decisions that may affect the results. Because meta-analysis is based on traditional statistical methods, it tests basically the same kinds of hypotheses that are evaluated in primary studies with traditional statistical tests. This limits the kinds of questions that can be addressed in meta-analysis. For example, a standard meta-analysis cannot answer the question, What is the probability that treatment has an effect? It could be determined whether zero is included in the confidence interval based on the average effect size across a set of studies, but this would not address the question just posed. In contrast, there is no special problem in dealing with this kind of question in Bayesian statistics. A Bayesian approach takes into account both previous knowledge and the inherent plausibility of the hypothesis, but meta-analysis is concerned only with the former. It is possible to combine meta-analytical and Bayesian methods in the same analysis (see Howard et al., 2000).”
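The contrast the authors draw can be made concrete: the question "what is the probability that treatment has an effect?" is answered directly by a Bayesian posterior. A minimal conjugate normal-normal sketch with invented numbers (prior δ ~ N(0, τ²), observed mean effect d with standard error se):

```python
# Posterior probability that an effect is positive under a conjugate
# normal-normal model. The prior and data values are invented; this is
# a sketch of the kind of question meta-analysis alone cannot answer.
import math

def posterior_prob_positive(d, se, tau=1.0):
    # Posterior of delta is normal, with precision-weighted mean/variance.
    prec = 1 / tau**2 + 1 / se**2
    post_mean = (d / se**2) / prec
    post_sd = math.sqrt(1 / prec)
    # P(delta > 0) under the posterior normal, via the error function.
    return 0.5 * (1 + math.erf(post_mean / (post_sd * math.sqrt(2))))

print(f"P(effect > 0 | data) = {posterior_prob_positive(0.4, 0.15):.3f}")
```

Note how the answer depends on the prior scale τ, which is exactly the sensitivity (and the vulnerability) of the Bayesian approach discussed below.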

“Bayesian methods are no more magical than any other set of statistical techniques. One drawback is that there is no direct way in Bayesian estimation to control type I or type II errors regarding the dichotomous decision to reject or retain some hypothesis. Researchers can do so in traditional significance testing, but too often they ignore power (the complement of the probability of a type II error) or specify an arbitrary level of type I error (e.g., α = .05), so this capability is usually wasted. Specification of prior probabilities or prior distributions in Bayesian statistics affects estimates of their posterior counterparts. If these specifications are grossly wrong, the results could be meaningless […] assumptions in Bayesian analyses should be explicitly stated and thus open to scrutiny. Bowers and Davis (2012) criticized the application of Bayesian methods in neuroscience. They noted in particular that Bayesian methods offer little improvement over more standard statistical techniques, but they also noted problems with use of the former, such as the specification of prior probabilities or utility functions in ways that are basically arbitrary. As with more standard statistical methods, Bayesian techniques are not immune to misuse. […] The main point of this chapter — and that of the whole book — is [however] that there are alternatives to the unthinking overreliance on significance testing that has handicapped the behavioral sciences for so long.”

 

October 19, 2017 Posted by | Books, Statistics

A few diabetes papers of interest

i. Neurocognitive Functioning in Children and Adolescents at the Time of Type 1 Diabetes Diagnosis: Associations With Glycemic Control 1 Year After Diagnosis.

“Children and youth with type 1 diabetes are at risk for developing neurocognitive dysfunction, especially in the areas of psychomotor speed, attention/executive functioning, and visuomotor integration (1,2). Most research suggests that deficits emerge over time, perhaps in response to the cumulative effect of glycemic extremes (3–6). However, the idea that cognitive changes emerge gradually has been challenged (7–9). Ryan (9) argued that if diabetes has a cumulative effect on cognition, cognitive test performance should be positively correlated with illness duration. Yet he found comparable deficits in psychomotor speed (the most commonly noted area of deficit) in adolescents and young adults with illness duration ranging from 6 to 25 years. He therefore proposed a diathesis model in which cognitive declines in diabetes are especially likely to occur in more vulnerable patients, at crucial periods, in response to illness-related events (e.g., severe hyperglycemia) known to have an impact on the central nervous system (CNS) (8). This model accounts for the finding that cognitive deficits are more likely in children with early-onset diabetes, and for the accelerated cognitive aging seen in diabetic individuals later in life (7). A third hypothesized crucial period is the time leading up to diabetes diagnosis, during which severe fluctuations in blood glucose and persistent hyperglycemia often occur. Concurrent changes in blood-brain barrier permeability could result in a flood of glucose into the brain, with neurotoxic effects (9).”

“In the current study, we report neuropsychological test findings for children and adolescents tested within 3 days of diabetes diagnosis. The purpose of the study was to determine whether neurocognitive impairments are detectable at diagnosis, as predicted by the diathesis hypothesis. We hypothesized that performance on tests of psychomotor speed, visuomotor integration, and attention/executive functioning would be significantly below normative expectations, and that differences would be greater in children with earlier disease onset. We also predicted that diabetic ketoacidosis (DKA), a primary cause of diabetes-related neurological morbidity (12) and a likely proxy for severe peri-onset hyperglycemia, would be associated with poorer performance.”

“Charts were reviewed for 147 children/adolescents aged 5–18 years (mean = 10.4 ± 3.2 years) who completed a short neuropsychological screening during their inpatient hospitalization for new-onset type 1 diabetes, as part of a pilot clinical program intended to identify patients in need of further neuropsychological evaluation. Participants were patients at a large urban children’s hospital in the southwestern U.S. […] Compared with normative expectations, children/youth with type 1 diabetes performed significantly worse on GPD, GPN, VMI, and FAS (P < 0.0001 in all cases), with large decrements evident on all four measures (Fig. 1). A small but significant effect was also evident in DSB (P = 0.022). High incidence of impairment was evident on all neuropsychological tasks completed by older participants (aged 9–18 years) except DSF/DSB (Fig. 2).”

“Deficits in neurocognitive functioning were evident in children and adolescents within days of type 1 diabetes diagnosis. Participants performed >1 SD below normative expectations in bilateral psychomotor speed (GP) and 0.7–0.8 SDs below expected performance in visuomotor integration (VMI) and phonemic fluency (FAS). Incidence of impairment was much higher than normative expectations on all tasks except DSF/DSB. For example, >20% of youth were impaired in dominant hand fine-motor control, and >30% were impaired with their nondominant hand. These findings provide provisional support for Ryan’s hypothesis (7–9) that the peri-onset period may be a time of significant cognitive vulnerability.

Importantly, deficits were not evident on all measures. Performance on measures of attention/executive functioning (TMT-A, TMT-B, DSF, and DSB) was largely consistent with normative expectations, as was reading ability (WRAT-4), suggesting that the below-average performance in other areas was not likely due to malaise or fatigue. Depressive symptoms at diagnosis were associated with performance on TMT-B and FAS, but not on other measures. Thus, it seems unlikely that depressive symptoms accounted for the observed motor slowing.

Instead, the findings suggest that the visual-motor system may be especially vulnerable to early effects of type 1 diabetes. This interpretation is especially compelling given that psychomotor impairment is the most consistently reported long-term cognitive effect of type 1 diabetes. The sensitivity of the visual-motor system at diabetes diagnosis is consistent with a growing body of neuroimaging research implicating posterior white matter tracts and associated gray matter regions (particularly cuneus/precuneus) as areas of vulnerability in type 1 diabetes (30–32). These regions form part of the neural system responsible for integrating visual inputs with motor outputs, and in adults with type 1 diabetes, structural pathology in these regions is directly correlated to performance on GP [grooved pegboard test] (30,31). Arbelaez et al. (33) noted that these brain areas form part of the “default network” (34), a system engaged during internally focused cognition that has high resting glucose metabolism and may be especially vulnerable to glucose variability.”

“It should be noted that previous studies (e.g., Northam et al. [3]) have not found evidence of neurocognitive dysfunction around the time of diabetes diagnosis. This may be due to study differences in measures, outcomes, and/or time frame. We know of no other studies that completed neuropsychological testing within days of diagnosis. Given our time frame, it is possible that our findings reflect transient effects rather than more permanent changes in the CNS. Contrary to predictions, we found no association between DKA at diagnosis and neurocognitive performance […] However, even transient effects could be considered potential indicators of CNS vulnerability. Neurophysiological changes at the time of diagnosis have been shown to persist under certain circumstances or for some patients. […] [Some] findings suggest that some individuals may be particularly susceptible to the effects of glycemic extremes on neurocognitive function, consistent with a large body of research in developmental neuroscience indicating individual differences in neurobiological vulnerability to adverse events. Thus, although it is possible that the neurocognitive impairments observed in our study might resolve with euglycemia, deficits at diagnosis could still be considered a potential marker of CNS vulnerability to metabolic perturbations (both acute and chronic).”

“In summary, this study provides the first demonstration that type 1 diabetes–associated neurocognitive impairment can be detected at the time of diagnosis, supporting the possibility that deficits arise secondary to peri-onset effects. Whether these effects are transient markers of vulnerability or represent more persistent changes in CNS awaits further study.”

ii. Association Between Impaired Cardiovascular Autonomic Function and Hypoglycemia in Patients With Type 1 Diabetes.

“Cardiovascular autonomic neuropathy (CAN) is a chronic complication of diabetes and an independent predictor of cardiovascular disease (CVD) morbidity and mortality (1–3). The mechanisms of CAN are complex and not fully understood. It can be assessed by simple cardiovascular reflex tests (CARTs) and heart rate variability (HRV) studies that were shown to be sensitive, noninvasive, and reproducible (3,4).”

“HbA1c fails to capture information on the daily fluctuations in blood glucose levels, termed glycemic variability (GV). Recent observations have fostered the notion that GV, independent of HbA1c, may confer an additional risk for the development of micro- and macrovascular diabetes complications (8,9). […] the relationship between GV and chronic complications, specifically CAN, in patients with type 1 diabetes has not been systematically studied. In addition, limited data exist on the relationship between hypoglycemic components of the GV and measures of CAN among subjects with type 1 diabetes (11,12). Therefore, we have designed a prospective study to evaluate the impact and the possible sustained effects of GV on measures of cardiac autonomic function and other cardiovascular complications among subjects with type 1 diabetes […] In the present communication, we report cross-sectional analyses at baseline between indices of hypoglycemic stress on measures of cardiac autonomic function.”

“The following measures of CAN were predefined as outcomes of interests and analyzed: expiration-to-inspiration ratio (E:I), Valsalva ratio, 30:15 ratios, low-frequency (LF) power (0.04 to 0.15 Hz), high-frequency (HF) power (0.15 to 0.4 Hz), and LF/HF at rest and during CARTs. […] We found that LBGI [low blood glucose index] and AUC [area under the curve] hypoglycemia were associated with reduced LF and HF power of HRV [heart rate variability], suggesting an impaired autonomic function, which was independent of glucose control as assessed by the HbA1c.”
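The LF and HF power measures referred to above are band-integrals of the power spectrum of the (evenly resampled) RR-interval series. A simplified sketch with a synthetic signal mixing a 0.1 Hz (LF) and a 0.25 Hz (HF, respiratory-range) oscillation; real HRV pipelines interpolate actual beat intervals and use tapered spectral estimators:

```python
# Frequency-domain HRV: integrate the power spectrum of an evenly
# resampled RR-interval series over the LF (0.04-0.15 Hz) and HF
# (0.15-0.4 Hz) bands. The RR series below is synthetic.
import numpy as np

fs = 4.0                                    # resampling frequency, Hz
t = np.arange(0, 300, 1 / fs)               # 5 minutes of data
rr = (0.8 + 0.03 * np.sin(2 * np.pi * 0.10 * t)     # LF oscillation
          + 0.02 * np.sin(2 * np.pi * 0.25 * t))    # HF oscillation

x = rr - rr.mean()                          # remove DC component
psd = np.abs(np.fft.rfft(x)) ** 2 / len(x)  # crude periodogram
freqs = np.fft.rfftfreq(len(x), d=1 / fs)

def band_power(lo, hi):
    mask = (freqs >= lo) & (freqs < hi)
    return psd[mask].sum()

lf, hf = band_power(0.04, 0.15), band_power(0.15, 0.40)
print(f"LF/HF ratio = {lf / hf:.2f}")  # -> 2.25, i.e. (0.03/0.02)^2
```

The study's finding of reduced LF and HF power corresponds to smaller band-integrals of this kind, i.e., less oscillatory modulation of the heartbeat interval.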

“Our findings are in concordance with a recent report demonstrating attenuation of the baroreflex sensitivity and of the sympathetic response to various cardiovascular stressors after antecedent hypoglycemia among healthy subjects who were exposed to acute hypoglycemic stress (18). Similar associations […] were also reported in a small study of subjects with type 2 diabetes (19). […] higher GV and hypoglycemic stress may have an acute effect on modulating autonomic control with inducing a sympathetic/vagal imbalance and a blunting of the cardiac vagal control (18). The impairment in the normal counter-regulatory autonomic responses induced by hypoglycemia on the cardiovascular system could be important in healthy individuals but may be particularly detrimental in individuals with diabetes who have hitherto compromised cardiovascular function and/or subclinical CAN. In these individuals, hypoglycemia may also induce QT interval prolongation, increase plasma catecholamine levels, and lower serum potassium (19,20). In concert, these changes may lower the threshold for serious arrhythmia (19,20) and could result in an increased risk of cardiovascular events and sudden cardiac death. Conversely, the presence of CAN may increase the risk of hypoglycemia through hypoglycemia unawareness and subsequent impaired ability to restore euglycemia (21) through impaired sympathoadrenal response to hypoglycemia or delayed gastric emptying. […] A possible pathogenic role of GV/hypoglycemic stress on CAN development and progressions should be also considered. Prior studies in healthy and diabetic subjects have found that higher exposure to hypoglycemia reduces the counter-regulatory hormone (e.g., epinephrine, glucagon, and adrenocorticotropic hormone) and blunts autonomic nervous system responses to subsequent hypoglycemia (21). […] Our data […] suggest that wide glycemic fluctuations, particularly hypoglycemic stress, may increase the risk of CAN in patients with type 1 diabetes.”
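For reference, the low blood glucose index (LBGI) used in this paper is, in Kovatchev and colleagues' standard formulation, the mean of squared risk values assigned only to readings on the hypoglycemic side of a symmetrized glucose scale. A sketch (glucose in mg/dL; the readings are invented, and the constants are from the published transform, so treat them as quoted rather than derived here):

```python
# Low blood glucose index (LBGI), standard Kovatchev formulation:
# symmetrize the glucose scale, assign a squared risk to readings on
# the low side only, and average over all readings.
import math

def lbgi(readings_mg_dl):
    risks = []
    for bg in readings_mg_dl:
        f = 1.509 * (math.log(bg) ** 1.084 - 5.381)  # symmetrizing transform
        risks.append(10 * f * f if f < 0 else 0.0)   # low-side risk only
    return sum(risks) / len(risks)

day = [55, 80, 110, 145, 190, 70, 100, 60]  # hypothetical SMBG readings
print(f"LBGI = {lbgi(day):.2f}")
```

The transform is centered near 112.5 mg/dL, so hyperglycemic readings contribute nothing to LBGI; a companion high blood glucose index uses the positive side of the same scale.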

“In summary, in this cohort of relatively young and uncomplicated patients with type 1 diabetes, GV and higher hypoglycemic stress were associated with impaired HRV reflective of sympathetic/parasympathetic dysfunction with potential important clinical consequences.”

iii. Elevated Levels of hs-CRP Are Associated With High Prevalence of Depression in Japanese Patients With Type 2 Diabetes: The Diabetes Distress and Care Registry at Tenri (DDCRT 6).

“In the last decade, several studies have been published that suggest a close association between diabetes and depression. Patients with diabetes have a high prevalence of depression (1) […] and a high prevalence of complications (3). In addition, depression is associated with mortality in these patients (4). […] Because of this strong association, several recent studies have suggested the possibility of a common biological pathway such as inflammation as an underlying mechanism of the association between depression and diabetes (5). […] Multiple mechanisms are involved in the association between diabetes and inflammation, including modulation of lipolysis, alteration of glucose uptake by adipose tissue, and an indirect mechanism involving an increase in free fatty acid levels blocking the insulin signaling pathway (10). Psychological stress can also cause inflammation via innervation of cytokine-producing cells and activation of the sympathetic nervous systems and adrenergic receptors on macrophages (11). Depression enhances the production of inflammatory cytokines (12–14). Overproduction of inflammatory cytokines may stimulate corticotropin-releasing hormone production, a mechanism that leads to hypothalamic-pituitary axis activity. Conversely, cytokines induce depressive-like behaviors; in studies where healthy participants were given endotoxin infusions to trigger cytokine release, the participants developed classic depressive symptoms (15). Based on this evidence, it could be hypothesized that inflammation is the common biological pathway underlying the association between diabetes and depression.”

“[F]ew studies have examined the clinical role of inflammation and depression as biological correlates in patients with diabetes. […] In this study, we hypothesized that high CRP [C-reactive protein] levels were associated with the high prevalence of depression in patients with diabetes and that this association may be modified by obesity or glycemic control. […] Patient data were derived from the second-year survey of a diabetes registry at Tenri Hospital, a regional tertiary care teaching hospital in Japan. […] 3,573 patients […] were included in the study. […] Overall, mean age, HbA1c level, and BMI were 66.0 years, 7.4% (57.8 mmol/mol), and 24.6 kg/m2, respectively. Patients with major depression tended to be relatively young […] and female […] with a high BMI […], high HbA1c levels […], and high hs-CRP levels […]; had more diabetic nephropathy […], required more insulin therapy […], and exercised less […]”.

“In conclusion, we observed that hs-CRP levels were associated with a high prevalence of major depression in patients with type 2 diabetes with a BMI of ≥25 kg/m2. […] In patients with a BMI of <25 kg/m2, no significant association was found between hs-CRP quintiles and major depression […] We did not observe a significant association between hs-CRP and major depression in either of HbA1c subgroups. […] Our results show that the association between hs-CRP and diabetes is valid even in an Asian population, but it might not be extended to nonobese subjects. […] several factors such as obesity and glycemic control may modify the association between inflammation and depression. […] Obesity is strongly associated with chronic inflammation.”

iv. A Novel Association Between Nondipping and Painful Diabetic Polyneuropathy.

“Sleep problems are common in painful diabetic polyneuropathy (PDPN) (1) and contribute to the effect of pain on quality of life. Nondipping (the absence of the nocturnal fall in blood pressure [BP]) is a recognized feature of diabetic cardiac autonomic neuropathy (CAN) and is attributed to the abnormal prevalence of nocturnal sympathetic activity (2). […] This study aimed to evaluate the relationship of the circadian pattern of BP with both neuropathic pain and pain-related sleep problems in PDPN […] Investigating the relationship between PDPN and BP circadian pattern, we found patients with PDPN exhibited impaired nocturnal decrease in BP compared with those without neuropathy, as well as higher nocturnal systolic BP than both those without DPN and with painless DPN. […] in multivariate analysis including comorbidities and most potential confounders, neuropathic pain was an independent determinant of ∆ in BP and nocturnal systolic BP.”

“PDPN could behave as a marker for the presence and severity of CAN. […] PDPN should increasingly be regarded as a condition of high cardiovascular risk.”

v. Reduced Testing Frequency for Glycated Hemoglobin, HbA1c, Is Associated With Deteriorating Diabetes Control.

I think a potentially important take-away from this paper, which the authors don't really talk about, is this: when you're analyzing time series data in which the HbA1c variable is available at the individual level at some base frequency, and you encounter individuals for whom the variable is unobserved for some time periods, or not observed at the frequency you'd expect, such (implicit) missing values may not be missing at random (for more on these topics see e.g. this post). More specifically, in light of the findings of this paper I think it would make a lot of sense, when doing time-to-event analyses, to default to the assumption that missing values indicate worse-than-average metabolic control during the unobserved part of the time series, especially when the values are missing for an extended period of time.
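To make that default concrete, here is a minimal stdlib-Python sketch of the kind of gap-flagging I have in mind; the function name, the 90-day base interval and the 1.5x slack factor are all my own illustrative assumptions, not anything taken from the paper:

```python
from datetime import date, timedelta

def flag_unobserved_periods(test_dates, expected_interval_days=90, slack=1.5):
    """Flag gaps between successive HbA1c tests that exceed the expected
    testing interval by a slack factor. Per the discussion above, such gaps
    are plausibly not-missing-at-random, i.e. candidate periods of
    worse-than-average metabolic control."""
    dates = sorted(test_dates)
    threshold = timedelta(days=expected_interval_days * slack)
    gaps = []
    for prev, nxt in zip(dates, dates[1:]):
        if nxt - prev > threshold:
            gaps.append((prev, nxt, (nxt - prev).days))
    return gaps

tests = [date(2010, 1, 5), date(2010, 4, 2), date(2011, 1, 20), date(2011, 4, 11)]
print(flag_unobserved_periods(tests))
# one flagged gap: roughly nine unobserved months between Apr 2010 and Jan 2011
```

Flagged intervals could then enter a time-to-event model as indicators of suspected poor control rather than being treated as ignorable.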

The authors of the paper consider metabolic control an outcome to be explained by the testing frequency. That's one way to approach these things, but it's not the only one; it's also important to keep in mind that patients sometimes make a conscious decision not to show up for their appointments/tests. That is, the testing frequency is not fully determined by the medical staff, although the staff of course have an important impact on this variable.

Some observations from the paper:

“We examined repeat HbA1c tests (400,497 tests in 79,409 patients, 2008–2011) processed by three U.K. clinical laboratories. We examined the relationship between retest interval and 1) percentage change in HbA1c and 2) proportion of cases showing a significant HbA1c rise. The effect of demographics factors on these findings was also explored. […] Figure 1 shows the relationship between repeat requesting interval (categorized in 1-month intervals) and percentage change in HbA1c concentration in the total data set. From 2 months onward, there was a direct relationship between retesting interval and control. A testing frequency of >6 months was associated with deterioration in control. The optimum testing frequency in order to maximize the downward trajectory in HbA1c between two tests was approximately four times per year. Our data also indicate that testing more frequently than 2 months has no benefit over testing every 2–4 months. Relative to the 2–3 month category, all other categories demonstrated statistically higher mean change in HbA1c (all P < 0.001). […] similar patterns were observed for each of the three centers, with the optimum interval to improvement in overall control at ∼3 months across all centers.”
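The core computation behind Figure 1, binning repeat tests by retest interval in 1-month categories and averaging the percentage change in HbA1c within each bin, can be sketched as follows; the 30-day bin width and the toy data are my assumptions, not the paper's:

```python
from collections import defaultdict

def mean_change_by_interval(records, bin_days=30):
    """records: (interval_days, pct_change_hba1c) pairs for repeat tests.
    Groups repeat tests into 1-month retest-interval categories and returns
    the mean percentage change in HbA1c per category."""
    bins = defaultdict(list)
    for interval_days, pct_change in records:
        bins[interval_days // bin_days].append(pct_change)
    return {month: sum(v) / len(v) for month, v in sorted(bins.items())}

# illustrative data: short intervals show falling HbA1c, long intervals rising
data = [(35, -1.2), (70, -3.5), (80, -2.9), (200, 1.8), (400, 4.1)]
print(mean_change_by_interval(data))
```

With real data, the bin whose mean change is most negative corresponds to the "optimum testing frequency" the authors identify at roughly 3-month intervals.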

“[I]n patients with poor control, the pattern was similar to that seen in the total group, except that 1) there was generally a more marked decrease or more modest increase in change of HbA1c concentration throughout and, consequently, 2) a downward trajectory in HbA1c was observed when the interval between tests was up to 8 months, rather than the 6 months as seen in the total group. In patients with a starting HbA1c of <6% (<42 mmol/mol), there was a generally linear relationship between interval and increase in HbA1c, with all intervals demonstrating an upward change in mean HbA1c. The intermediate group showed a similar pattern as those with a starting HbA1c of <6% (<42 mmol/mol), but with a steeper slope.”

“In order to examine the potential link between monitoring frequency and the risk of major deterioration in control, we then assessed the relationship between testing interval and proportion of patients demonstrating an increase in HbA1c beyond the normal biological and analytical variation in HbA1c […] Using this definition of significant increase as a ≥9.9% rise in subsequent HbA1c, our data show that the proportion of patients showing this magnitude of rise increased month to month, with increasing intervals between tests for each of the three centers. […] testing at 2–3-monthly intervals would, at a population level, result in a marked reduction in the proportion of cases demonstrating a significant increase compared with annual testing […] irrespective of the baseline HbA1c, there was a generally linear relationship between interval and the proportion demonstrating a significant increase in HbA1c, though the slope of this relationship increased with rising initial HbA1c.”

“Previous data from our and other groups on requesting patterns indicated that relatively few patients in general practice were tested annually (5,6). […] Our data indicate that for a HbA1c retest interval of more than 2 months, there was a direct relationship between retesting interval and control […], with a retest frequency of greater than 6 months being associated with deterioration in control. The data showed that for diabetic patients as a whole, the optimum repeat testing interval should be four times per year, particularly in those with poorer diabetes control (starting HbA1c >7% [≥53 mmol/mol]). […] The optimum retest interval across the three centers was similar, suggesting that our findings may be unrelated to clinical laboratory factors, local policies/protocols on testing, or patient demographics.”

It might be important to mention that there are notable cross-country differences in how often people with diabetes get their HbA1c measured. I'm unsure whether standards have changed since, but at least in Denmark a specific treatment goal of the Danish Regions a few years ago was that 95% of diabetics should have had their HbA1c measured within the last year (here's a relevant link to some stuff I wrote about related topics a while back).

October 2, 2017 Posted by | Cardiology, Diabetes, Immunology, Medicine, Neurology, Psychology, Statistics, Studies

A few diabetes papers of interest

i. Impact of Parental Socioeconomic Status on Excess Mortality in a Population-Based Cohort of Subjects With Childhood-Onset Type 1 Diabetes.

“Numerous reports have shown that individuals with lower SES during childhood have increased morbidity and all-cause mortality at all ages (10–14). Although recent epidemiological studies have shown that all-cause mortality in patients with T1D increases with lower SES in the individuals themselves (15,16), the association between parental SES and mortality among patients with childhood-onset T1D has not been reported to the best of our knowledge. Our hypothesis was that low parental SES additionally increases mortality in subjects with childhood-onset T1D. In this study, we used large population-based Swedish databases to 1) explore in a population-based study how parental SES affects mortality in a patient with childhood-onset T1D, 2) describe and compare how the effect differs among various age-at-death strata, and 3) assess whether the adult patient’s own SES affects mortality independently of parental SES.”

“The Swedish Childhood Diabetes Registry (SCDR) is a dynamic population-based cohort reporting incident cases of T1D since 1 July 1977, which to date has collected >16,000 prospective cases. […] All patients recorded in the SCDR from 1 January 1978 to 31 December 2008 were followed until death or 31 December 2010. The cohort was subjected to crude analyses and stratified analyses by age-at-death groups (0–17, 18–24, and ≥25 years). Time at risk was calculated from date of birth until death or 31 December 2010. Kaplan-Meier analyses and log-rank tests were performed to compare the effect of low maternal educational level, low paternal educational level, and family income support (any/none). Cox regression analyses were performed to estimate and compare the hazard ratios (HRs) for the socioeconomic variables and to adjust for the potential confounding variables age at onset and sex.”

“The study included 14,647 patients with childhood-onset T1D. A total of 238 deaths (male 154, female 84) occurred in 349,762 person-years at risk. The majority of mortalities occurred among the oldest age-group (≥25 years of age), and most of the deceased subjects had onset of T1D at the ages of 10–14.99 years […]. Mean follow-up was 23.9 years and maximum 46.5 years. The overall standardized mortality ratio up to the age of 47 years was 2.3 (95% CI 1.35–3.63); for females, it was 2.6 (1.28–4.66) and for males, 2.1 (1.27–3.49). […] Analyses on the effect of low maternal educational level showed an increased mortality for male patients (HR 1.43 [95% CI 1.01–2.04], P = 0.048) and a nonsignificant increased mortality for female patients (1.21 [0.722–2.018], P = 0.472). Paternal educational level had no significant effect on mortality […] Having parents who ever received income support was associated with an increased risk of death in both males (HR 1.89 [95% CI 1.36–2.64], P < 0.001) and females (2.30 [1.43–3.67], P = 0.001) […] Excluding the 10% of patients with the highest accumulated income support to parents during follow-up showed that having parents who ever received income support still was a risk factor for mortality.”

“A Cox model including maternal educational level together with parental income support, adjusting for age at onset and sex, showed that having parents who received income support was associated with a doubled mortality risk (HR 1.96 [95% CI 1.49–2.58], P < 0.001) […] In a Cox model including the adult patient’s own SES, having parents who received income support was still an independent risk factor in the younger age-at-death group (18–24 years). Among those who died at age ≥25 years of age, the patient’s own SES was a stronger predictor for mortality (HR 2.46 [95% CI 1.54–3.93], P < 0.001)”

“Despite a well-developed health-care system in Sweden, overall mortality up to the age of 47 years is doubled in both males and females with childhood-onset T1D. These results are in accordance with previous Swedish studies and reports from other comparable countries […] Previous studies indicated that low SES during childhood is associated with low glycemic control and diabetes-related morbidity in patients with T1D (8,9), and the current study implies that mortality in adulthood is also affected by parental SES. […] The findings, when stratified by age-at-death group, show that adult patients’ own need of income support independently predicted mortality in those who died at ≥25 years of age, whereas among those who died in the younger age-group (18–24 years), parental requirement of income support was still a strong independent risk factor. None of the present SES measures seem to predict mortality in the ages 0–17 years perhaps due to low numbers and, thus, power.”

ii. Exercise Training Improves but Does Not Normalize Left Ventricular Systolic and Diastolic Function in Adolescents With Type 1 Diabetes.

“Adults and adolescents with type 1 diabetes have reduced exercise capacity (8–10), which increases their risk for cardiovascular morbidity and mortality (11). The causes for this reduced exercise capacity are unclear. However, recent studies have shown that adolescents with type 1 diabetes have lower stroke volume during exercise, which has been attributed to alterations in left ventricular function (9,10). Reduced left ventricular compliance resulting in an inability to fill the left ventricle appropriately during exercise has been shown to contribute to the lower stroke volume during exercise in both adults and adolescents with type 1 diabetes (12).

Exercise training is recommended as part of the management of type 1 diabetes. However, the effects of exercise training on left ventricular function at rest and during exercise in adolescents with type 1 diabetes have not been investigated. In particular, it is unclear whether exercise training improves cardiac hemodynamics during exercise in adolescents with diabetes. Therefore, we aimed to assess left ventricular volumes at rest and during exercise in a group of adolescents with type 1 diabetes compared with adolescents without diabetes before and after a 20-week exercise-training program. We hypothesized that exercise training would improve exercise capacity and exercise stroke volume in adolescents with diabetes.”

RESEARCH DESIGN AND METHODS Fifty-three adolescents with type 1 diabetes (aged 15.6 years) were divided into two groups: exercise training (n = 38) and nontraining (n = 15). Twenty-two healthy adolescents without diabetes (aged 16.7 years) were included and, with the 38 participants with type 1 diabetes, participated in a 20-week exercise-training intervention. Assessments included VO2max and body composition. Left ventricular parameters were obtained at rest and during acute exercise using MRI.

RESULTS Exercise training improved aerobic capacity (10%) and stroke volume (6%) in both trained groups, but the increase in the group with type 1 diabetes remained lower than trained control subjects. […]

CONCLUSIONS These data demonstrate that in adolescents, the impairment in left ventricular function seen with type 1 diabetes can be improved, although not normalized, with regular intense physical activity. Importantly, diastolic dysfunction, a common mechanism causing heart failure in older subjects with diabetes, appears to be partially reversible in this age group.”

“This study confirms that aerobic capacity is reduced in [diabetic] adolescents and that this, at least in part, can be attributed to impaired left ventricular function and a blunted cardiac response to exercise (9). Importantly, although an aerobic exercise-training program improved the aerobic capacity and cardiac function in adolescents with type 1 diabetes, it did not normalize them to the levels seen in the training group without diabetes. Both left ventricular filling and contractility improved after exercise training in adolescents with diabetes, suggesting that aerobic fitness may prevent or delay the well-described impairment in left ventricular function in diabetes (9,10).

The increase in peak aerobic capacity (∼12%) seen in this study was consistent with previous exercise interventions in adults and adolescents with diabetes (14). However, the baseline peak aerobic capacity was lower in the participants with diabetes and improved with training to a level similar to the baseline observed in the participants without diabetes; therefore, trained adolescents with diabetes remained less fit than equally trained adolescents without diabetes. This suggests there are persistent differences in the cardiovascular function in adolescents with diabetes that are not overcome by exercise training.”

“Although regular exercise potentially could improve HbA1c, the majority of studies have failed to show this (31–34). Exercise training improved aerobic capacity in this study without affecting glucose control in the participants with diabetes, suggesting that the effects of glycemic status and exercise training may work independently to improve aerobic capacity.”

….

iii. Change in Medical Spending Attributable to Diabetes: National Data From 1987 to 2011.

“Diabetes care has changed substantially in the past 2 decades. We examined the change in medical spending and use related to diabetes between 1987 and 2011. […] Using the 1987 National Medical Expenditure Survey and the Medical Expenditure Panel Surveys in 2000–2001 and 2010–2011, we compared per person medical expenditures and uses among adults ≥18 years of age with or without diabetes at the three time points. Types of medical services included inpatient care, emergency room (ER) visits, outpatient visits, prescription drugs, and others. We also examined the changes in unit cost, defined by the expenditure per encounter for medical services.”

RESULTS The excess medical spending attributed to diabetes was $2,588 (95% CI, $2,265 to $3,104), $4,205 ($3,746 to $4,920), and $5,378 ($5,129 to $5,688) per person, respectively, in 1987, 2000–2001, and 2010–2011. Of the $2,790 increase, prescription medication accounted for 55%; inpatient visits accounted for 24%; outpatient visits accounted for 15%; and ER visits and other medical spending accounted for 6%. The growth in prescription medication spending was due to the increase in both the volume of use and unit cost, whereas the increase in outpatient expenditure was almost entirely driven by more visits. In contrast, the increase in inpatient and ER expenditures was caused by the rise of unit costs. […] The increase was observed across all components of medical spending, with the greatest absolute increase in the spending on prescription medications ($1,528 increase), followed by inpatient visits ($680 increase) and outpatient visits ($430 increase). The absolute change in the spending on ER and other medical services use was relatively small. In relative terms, the spending on ER visits grew more than five times, faster than that of prescription medication and other medical components. […] Among the total annual diabetes-attributable medical spending, the spending on inpatient and outpatient visits dropped from 40% and 23% to 31% and 19%, respectively, between 1987 and 2011, whereas spending on prescription medication increased from 27% to 41%.”
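The paper's attribution of spending growth to volume versus unit cost rests on the identity spending = volume × unit cost. One standard symmetric way to split a change in spending into a volume effect and a unit-cost effect is sketched below; both the decomposition method and the numbers are illustrative assumptions of mine, not necessarily what the authors did:

```python
def decompose_spending_change(vol0, price0, vol1, price1):
    """Split the change in per-person spending (volume x unit cost) between
    two periods into a volume effect and a unit-cost effect, averaging the
    two base-period choices so the parts sum exactly to the total change."""
    volume_effect = (vol1 - vol0) * (price0 + price1) / 2
    price_effect = (price1 - price0) * (vol0 + vol1) / 2
    total_change = vol1 * price1 - vol0 * price0
    return volume_effect, price_effect, total_change

# illustrative (not the paper's) numbers: fills per person and cost per fill
v_eff, p_eff, total = decompose_spending_change(10, 40, 14, 60)
print(v_eff, p_eff, total)  # 200.0 240.0 440
```

Run on the survey data, this kind of decomposition is what supports statements like "the increase in outpatient expenditure was almost entirely driven by more visits" (volume effect dominant) versus "inpatient and ER expenditures was caused by the rise of unit costs" (price effect dominant).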

“The unit costs rose universally in all five measures of medical care in adults with and without diabetes. For each hospital admission, diabetes patients spent significantly more than persons without diabetes. The gap increased from $1,028 to $1,605 per hospital admission between 1987 and 2001, and dropped slightly to $1,360 per hospital admission in 2011. Diabetes patients also had higher spending per ER visit and per purchase of prescription medications.”

“From 1999 to 2011, national data suggest that growth in the use and price of prescription medications in the general population is 2.6% and 3.6% per year, respectively; and the growth has decelerated in recent years (22). Our analysis suggests that the growth rates in the use and prices of prescription medications for diabetes patients are considerably higher. The higher rate of growth is likely, in part, due to the growing emphasis on achieving glycemic targets, the use of newer medications, and the use of multidrug treatment strategies in modern diabetes care practice (23,24). In addition, the growth of medication spending is fueled by the rising prices per drug, particularly the drugs that are newly introduced in the market. For example, the prices for newer drug classes such as glitazones, dipeptidyl peptidase-4 inhibitors, and incretins have been 8 to 10 times those of sulfonylureas and 5 to 7 times those of metformin (9).”

“Between 1987 and 2011, medical spending increased both in persons with and in persons without diabetes; and the increase was substantially greater among persons with diabetes. As a result, the medical spending associated with diabetes nearly doubled. The growth was primarily driven by the spending in prescription medications. Further studies are needed to assess the cost-effectiveness of increased spending on drugs.”

iv. Determinants of Adherence to Diabetes Medications: Findings From a Large Pharmacy Claims Database.

“Adults with type 2 diabetes are often prescribed multiple medications to treat hyperglycemia, diabetes-associated conditions such as hypertension and dyslipidemia, and other comorbidities. Medication adherence is an important determinant of outcomes in patients with chronic diseases. For those with diabetes, adherence to medications is associated with better control of intermediate risk factors (1–4), lower odds of hospitalization (3,5–7), lower health care costs (5,7–9), and lower mortality (3,7). Estimates of rates of adherence to diabetes medications vary widely depending on the population studied and how adherence is defined. One review found that adherence to oral antidiabetic agents ranged from 36 to 93% across studies and that adherence to insulin was ∼63% (10).”

“Using a large pharmacy claims database, we assessed determinants of adherence to oral antidiabetic medications in >200,000 U.S. adults with type 2 diabetes. […] We selected a cohort of members treated for diabetes with noninsulin medications (oral agents or GLP-1 agonists) in the second half of 2010 who had continuous prescription benefits eligibility through 2011. Each patient was followed for 12 months from their index diabetes claim date identified during the 6-month targeting period. From each patient’s prescription history, we collected the date the prescription was filled, how many days the supply would last, the National Drug Code number, and the drug name. […] Given the difficulty in assessing insulin adherence with measures such as medication possession ratio (MPR), we excluded patients using insulin when defining the cohort.”

“We looked at a wide range of variables […] Predictor variables were defined a priori and grouped into three categories: 1) patient factors including age, sex, education, income, region, past exposure to therapy (new to diabetes therapy vs. continuing therapy), and concurrent chronic conditions; 2) prescription factors including refill channel (retail vs. mail order), total pill burden per day, and out of pocket costs; and 3) prescriber factors including age, sex, and specialty. […] Our primary outcome of interest was adherence to noninsulin antidiabetic medications. To assess adherence, we calculated an MPR for each patient. The ratio captures how often patients refill their medications and is a standard metric that is consistent with the National Quality Forum’s measure of adherence to medications for chronic conditions. MPR was defined as the proportion of days a patient had a supply of medication during a calendar year or equivalent period. We considered patients to be adherent if their MPR was 0.8 or higher, implying that they had their medication supplies for at least 80% of the days. An MPR of 0.8 or above is a well-recognized index of adherence (11,12). Studies have suggested that patients with chronic diseases need to achieve at least 80% adherence to derive the full benefits of their medications (13). […] [W]e [also] determined whether a patient was persistent, that is whether they had not discontinued or had at least a 45-day gap in their targeted therapy.”
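A minimal stdlib-Python sketch of the MPR calculation as described, i.e. the proportion of days in the observation period covered by a medication supply, with adherence defined as MPR ≥ 0.8. The handling of overlapping refills here (carrying early-refill supply forward) is my assumption, since the paper doesn't spell it out:

```python
from datetime import date, timedelta

def medication_possession_ratio(fills, period_start, period_end):
    """fills: list of (fill_date, days_supply) pairs. Returns the fraction
    of days in [period_start, period_end) on which the patient had a supply,
    stacking overlapping supply from early refills forward in time."""
    covered_days = 0
    next_free_day = period_start
    for fill_date, days_supply in sorted(fills):
        start = max(fill_date, next_free_day)   # defer overlapping supply
        end = start + timedelta(days=days_supply)
        # count only coverage falling inside the observation window
        covered_days += max(0, (min(end, period_end) - max(start, period_start)).days)
        next_free_day = end
    return covered_days / (period_end - period_start).days

fills = [(date(2011, 1, 1), 90), (date(2011, 4, 15), 90),
         (date(2011, 7, 10), 90), (date(2011, 12, 1), 90)]
mpr = medication_possession_ratio(fills, date(2011, 1, 1), date(2012, 1, 1))
print(round(mpr, 3), "adherent" if mpr >= 0.8 else "nonadherent")
# → 0.825 adherent
```

With these illustrative fills the patient is covered 301 of 365 days, so despite two refill gaps the MPR clears the 0.8 cutoff and the patient counts as adherent.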

“Previous exposure to diabetes therapy had a significant impact on adherence. Patients new to therapy were 61% less likely to be adherent to their diabetes medication. There was also a clear age effect. Patients 25–44 years of age were 49% less likely to be adherent when compared with patients 45–64 years of age. Patients aged 65–74 years were 27% more likely to be adherent, and those aged 75 years and above were 41% more likely to be adherent when compared with the 45–64 year age-group. Men were significantly more likely to be adherent than women […I dislike the use of the word ‘significant’ in such contexts; there is a difference in the level of adherence, but it is not large in absolute terms; the male vs female OR is 1.14 (CI 1.12-1.16) – US]. Education level and household income were both associated with adherence. The higher the estimated academic achievement, the more likely the patient was to be adherent. Patients completing graduate school were 41% more likely to be adherent when compared with patients with a high school equivalent education. Patients with an annual income >$60,000 were also more likely to be adherent when compared with patients with a household income <$30,000.”

“The largest effect size was observed for patients obtaining their prescription antidiabetic medications by mail. Patients using the mail channel were more than twice as likely to be adherent to their antidiabetic medications when compared with patients filling their prescriptions at retail pharmacies. Total daily pill burden was positively associated with antidiabetic medication adherence. For each additional pill a patient took per day, adherence to antidiabetic medications increased by 22%. Patient out-of-pocket costs were negatively associated with adherence. For each additional $15 in out-of-pocket costs per month, diabetes medication adherence decreased by 11%. […] We found few meaningful differences in patient adherence according to prescriber factors.”

“In our study, characteristics that suggest a “healthier” patient (being younger, new to diabetes therapy, and taking few other medications) were all associated with lower odds of adherence to antidiabetic medications. This suggests that acceptance of a chronic illness diagnosis and the potential consequences may be an important, but perhaps overlooked, determinant of medication-taking behavior. […] Our findings regarding income and costs are important reminders that prescribers should consider the impact of medication costs on patients with diabetes. Out-of-pocket costs are an important determinant of adherence to statins (26) and a self-reported cause of underuse of medications in one in seven insured patients with diabetes (27). Lower income has previously been shown to be associated with poor adherence to diabetes medications (15) and a self-reported cause of cost-related medication underuse (27).”

v. The Effect of Alcohol Consumption on Insulin Sensitivity and Glycemic Status: A Systematic Review and Meta-analysis of Intervention Studies.

“Moderate alcohol consumption, compared with abstaining and heavy drinking, is related to a reduced risk of type 2 diabetes (1,2). Although the risk is reduced with moderate alcohol consumption in both men and women, the association may differ for men and women. In a meta-analysis, consumption of 24 g alcohol/day reduced the risk of type 2 diabetes by 40% among women, whereas consumption of 22 g alcohol/day reduced the risk by 13% among men (1).

The association of alcohol consumption with type 2 diabetes may be explained by increased insulin sensitivity, anti-inflammatory effects, or effects of adiponectin (3). Several intervention studies have examined the effect of moderate alcohol consumption on these potential underlying pathways. A meta-analysis of intervention studies by Brien et al. (4) showed that alcohol consumption significantly increased adiponectin levels but did not affect inflammatory factors. Unfortunately, the effect of alcohol consumption on insulin sensitivity has not been summarized quantitatively. A review of cross-sectional studies by Hulthe and Fagerberg (5) suggested a positive association between moderate alcohol consumption and insulin sensitivity, although the three intervention studies included in their review did not show an effect (6–8). Several other intervention studies also reported inconsistent results (9,10). Consequently, consensus is lacking about the effect of moderate alcohol consumption on insulin sensitivity. Therefore, we aimed to conduct a systematic review and meta-analysis of intervention studies investigating the effect of alcohol consumption on insulin sensitivity and other relevant glycemic measures.”

“22 articles met criteria for inclusion in the qualitative synthesis. […] Of the 22 studies, 15 used a crossover design and 7 a parallel design. The intervention duration of the studies ranged from 2 to 12 weeks […] Of the 22 studies, 2 were excluded from the meta-analysis because they did not include an alcohol-free control group (14,19), and 4 were excluded because they did not have a randomized design […] Overall, 14 studies were included in the meta-analysis”

“A random-effects model was used because heterogeneity was present (P < 0.01, I2 = 91%). […] For HbA1c, a random-effects model was used because the I2 statistic indicated evidence for some heterogeneity (I2 = 30%).” [Cough, you’re not supposed to make these decisions that way, cough – US. This is not the first time I’ve seen this approach applied, and I don’t like it; it’s bad practice to allow the results of (frequently under-powered) heterogeneity tests to influence model selection decisions. As Borenstein and Hedges point out in their book, “A report should state the computational model used in the analysis and explain why this model was selected. A common mistake is to use the fixed-effect model on the basis that there is no evidence of heterogeneity. As [already] explained […], the decision to use one model or the other should depend on the nature of the studies, and not on the significance of this test”]
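The quantities that decision hinged on (the Q statistic, I², and the between-study variance τ²) are straightforward to compute. Here is a stdlib-only sketch of DerSimonian–Laird random-effects pooling on made-up inputs, which at least makes transparent what the heterogeneity test is measuring; it is an illustration of the standard estimator, not of this paper's actual analysis:

```python
import math

def dersimonian_laird(effects, variances):
    """Pool study effect sizes with DerSimonian-Laird random-effects weights.
    Returns (pooled effect, standard error, Q statistic, I^2 in percent)."""
    w = [1 / v for v in variances]                  # fixed-effect weights
    fixed = sum(wi * ei for wi, ei in zip(w, effects)) / sum(w)
    q = sum(wi * (ei - fixed) ** 2 for wi, ei in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                   # between-study variance
    i2 = (max(0.0, (q - df) / q) * 100) if q > 0 else 0.0
    w_star = [1 / (v + tau2) for v in variances]    # random-effects weights
    pooled = sum(wi * ei for wi, ei in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, se, q, i2

# made-up study effects (e.g. mean differences) and within-study variances
pooled, se, q, i2 = dersimonian_laird([0.2, -0.1, 0.5, 0.3], [0.04, 0.05, 0.03, 0.06])
print(round(pooled, 3), round(se, 3), round(i2, 1))
```

Note that nothing in this computation licenses the inference "Q not significant, therefore fixed-effect model": with few, small studies the Q test has little power, which is exactly Borenstein and Hedges' objection.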

“This meta-analysis shows that moderate alcohol consumption did not affect estimates of insulin sensitivity or fasting glucose levels, but it decreased fasting insulin concentrations and HbA1c. Sex-stratified analysis suggested that moderate alcohol consumption may improve insulin sensitivity and decrease fasting insulin concentrations in women but not in men. The meta-regression suggested no influence of dosage and duration on the results. However, the number of studies may have been too low to detect influences by dosage and duration. […] The primary finding that alcohol consumption does not influence insulin sensitivity concords with the intervention studies included in the review of Hulthe and Fagerberg (5). This is in contrast with observational studies suggesting a significant association between moderate alcohol consumption and improved insulin sensitivity (34,35). […] We observed lower levels of HbA1c in subjects consuming moderate amounts of alcohol compared with abstainers. This has also been shown in several observational studies (39,43,44). Alcohol may decrease HbA1c by suppressing the acute rise in blood glucose after a meal and increasing the early insulin response (45). This would result in lower glucose concentrations over time and, thus, lower HbA1c concentrations. Unfortunately, the underlying mechanism of glycemic control by alcohol is not clearly understood.”

vi. Predictors of Lower-Extremity Amputation in Patients With an Infected Diabetic Foot Ulcer.

“Infection is a frequent complication of diabetic foot ulcers, with up to 58% of ulcers being infected at initial presentation at a diabetic foot clinic, increasing to 82% in patients hospitalized for a diabetic foot ulcer (1). These diabetic foot infections (DFIs) are associated with poor clinical outcomes for the patient and high costs for both the patient and the health care system (2). Patients with a DFI have a 50-fold increased risk of hospitalization and 150-fold increased risk of lower-extremity amputation compared with patients with diabetes and no foot infection (3). Among patients with a DFI, ∼5% will undergo a major amputation and 20–30% a minor amputation, with the presence of peripheral arterial disease (PAD) greatly increasing amputation risk (4–6).”

“As infection of a diabetic foot wound heralds a poor outcome, early diagnosis and treatment are important. Unfortunately, systemic signs of inflammation such as fever and leukocytosis are often absent even with a serious foot infection (10,11). As local signs and symptoms of infection are also often diminished, because of concomitant peripheral neuropathy and ischemia (12), diagnosing and defining resolution of infection can be difficult.”

“The system developed by the International Working Group on the Diabetic Foot (IWGDF) and the Infectious Diseases Society of America (IDSA) provides criteria for the diagnosis of infection of ulcers and classifies it into three categories: mild, moderate, or severe. The system was validated in three relatively small cohorts of patients […] The European Study Group on Diabetes and the Lower Extremity (Eurodiale) prospectively studied a large cohort of patients with a diabetic foot ulcer (17), enabling us to determine the prognostic value of the IWGDF system for clinically relevant lower-extremity amputations. […] We prospectively studied 575 patients with an infected diabetic foot ulcer presenting to 1 of 14 diabetic foot clinics in 10 European countries. […] Among these patients, 159 (28%) underwent an amputation. […] Patients were followed monthly until healing of the foot ulcer(s), major amputation, or death — up to a maximum of 1 year.”

“One hundred and ninety-nine patients had a grade 2 (mild) infection, 338 a grade 3 (moderate), and 38 a grade 4 (severe). Amputations were performed on 159 (28%) patients (126 minor and 33 major) within the year of follow-up; 103 patients (18%) underwent amputations proximal to and including the hallux. […] The independent predictors of any amputation were as follows: periwound edema, HR 2.01 (95% CI 1.33–3.03); foul smell, HR 1.74 (1.17–2.57); purulent and nonpurulent exudate, HR 1.67 (1.17–2.37) and 1.49 (1.02–2.18), respectively; deep ulcer, HR 3.49 (1.84–6.60); positive probe-to-bone test, HR 6.78 (3.79–12.15); pretibial edema, HR 1.53 (1.02–2.31); fever, HR 2.00 (1.15–3.48); elevated CRP levels but less than three times the upper limit of normal, HR 2.74 (1.40–5.34); and elevated CRP levels more than three times the upper limit, HR 3.84 (2.07–7.12). […] In comparison with mild infection, the presence of a moderate infection increased the hazard for any amputation by a factor of 2.15 (95% CI 1.25–3.71) and 3.01 (1.51–6.01) for amputations excluding the lesser toes. For severe infection, the hazard for any amputation increased by a factor of 4.12 (1.99–8.51) and for amputations excluding the lesser toes by a factor of 5.40 (2.20–13.26). Larger ulcer size and presence of PAD were also independent predictors of both any amputation and amputations excluding the lesser toes, with HRs between 1.81 and 3 (and 95% CIs between 1.05 and 6.6).”

“Previously published studies that have aimed to identify independent risk factors for lower-extremity amputation in patients with a DFI have noted an association with older age (5,22), the presence of fever (5), elevated acute-phase reactants (5,22,23), higher HbA1c levels (24), and renal insufficiency (5,22).”

“The new risk scores we developed for any amputation, and amputations excluding the lesser toes had higher prognostic capability, based on the area under the ROC curve (0.80 and 0.78, respectively), than the IWGDF system (0.67) […] which is currently the only one in use for infected diabetic foot ulcers. […] these Eurodiale scores were developed based on the available data of our cohort, and they will need to be validated in other populations before any firm conclusions can be drawn. The advantage of these newly developed scores is that they are easier for clinicians to perform […] These newly developed risk scores can be readily used in daily clinical practice without the necessity of obtaining additional laboratory testing.”
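The comparison above is in terms of the area under the ROC curve. For readers unfamiliar with the measure, here is a minimal sketch (with entirely made-up labels and scores, not the Eurodiale data) of the Mann–Whitney formulation: the AUC is the probability that a randomly chosen case receives a higher risk score than a randomly chosen non-case.

```python
def auc(scores, labels):
    """Mann-Whitney estimate of the ROC AUC: the proportion of
    (case, non-case) pairs in which the case has the higher score
    (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical data: 1 = amputation, 0 = no amputation
labels = [1, 1, 1, 0, 0, 0, 0, 0]
score_a = [9, 7, 8, 3, 4, 2, 6, 1]   # a well-discriminating score
score_b = [5, 2, 6, 4, 7, 1, 3, 8]   # a near-random score
print(auc(score_a, labels))  # 1.0 for this toy data
print(auc(score_b, labels))
```

An AUC of 0.5 corresponds to a useless score; the quoted 0.80 vs. 0.67 difference means the Eurodiale scores rank amputation cases above non-cases considerably more often than the IWGDF system does.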

September 12, 2017 Posted by | Cardiology, Diabetes, Economics, Epidemiology, Health Economics, Infectious disease, Medicine, Microbiology, Statistics | Leave a comment

Utility of Research Autopsies for Understanding the Dynamics of Cancer

A few links:
Pancreatic cancer.
Jaccard index.
Limited heterogeneity of known driver gene mutations among the metastases of individual patients with pancreatic cancer.
Epitope.
Tissue-specific mutation accumulation in human adult stem cells during life.
Epigenomic reprogramming during pancreatic cancer progression links anabolic glucose metabolism to distant metastasis.

August 25, 2017 Posted by | Cancer/oncology, Genetics, Immunology, Lectures, Medicine, Statistics | Leave a comment

Quantifying tumor evolution through spatial computational modeling

Two general remarks: 1. She talks very fast – in my opinion unpleasantly fast – and the lecture would have been at least slightly easier to follow if she’d slowed down a little. 2. A few of the lectures uploaded in this lecture series (from the IAS Mathematical Methods in Cancer Evolution and Heterogeneity Workshop) seem to have sound issues; in this lecture there are multiple 1–2-second-long ‘chunks’ where the sound drops out and some words are lost. This is really annoying, and a similar problem (likely ‘the same problem’) previously led me to quit another lecture in the series; in this case I decided to give it a shot anyway, and I actually think it’s not a big deal: the sound losses are very short in duration, and usually no more than one or two words are lost, so you can generally figure out what was said. There were incidentally also some issues with the monitor roughly 27 minutes in, but this isn’t a big deal either, as no information was lost and, unlike the people who originally attended the lecture, you can just skip ahead approximately one minute (which is how long it took to solve that problem).

A few relevant links to stuff she talks about in the lecture:

A Big Bang model of human colorectal tumor growth.
Approximate Bayesian computation.
Site frequency spectrum.
Identification of neutral tumor evolution across cancer types.
Using tumour phylogenetics to identify the roots of metastasis in humans.

August 22, 2017 Posted by | Cancer/oncology, Evolutionary biology, Genetics, Lectures, Mathematics, Medicine, Statistics | Leave a comment

A few diabetes papers of interest

i. Rates of Diabetic Ketoacidosis: International Comparison With 49,859 Pediatric Patients With Type 1 Diabetes From England, Wales, the U.S., Austria, and Germany.

“Rates of DKA in youth with type 1 diabetes vary widely nationally and internationally, from 15% to 70% at diagnosis (4) to 1% to 15% per established patient per year (9–11). However, data from systematic comparisons between countries are limited. To address this gap in the literature, we analyzed registry and audit data from three organizations: the Prospective Diabetes Follow-up Registry (DPV) in Germany and Austria, the National Paediatric Diabetes Audit (NPDA) in England and Wales, and the T1D Exchange (T1DX) in the U.S. These countries have similarly advanced, yet differing, health care systems in which data on DKA and associated factors are collected. Our goal was to identify indicators of risk for DKA admissions in pediatric patients with >1-year duration of disease with an aim to better understand where targeted preventive programs might lead to a reduction in the frequency of this complication of management of type 1 diabetes.”

“RESULTS The frequency of DKA was 5.0% in DPV, 6.4% in NPDA, and 7.1% in T1DX […] Mean HbA1c was lowest in DPV (63 mmol/mol [7.9%]), intermediate in T1DX (69 mmol/mol [8.5%]), and highest in NPDA (75 mmol/mol [9.0%]). […] In multivariable analyses, higher odds of DKA were found in females (odds ratio [OR] 1.23, 99% CI 1.10–1.37), ethnic minorities (OR 1.27, 99% CI 1.11–1.44), and HbA1c ≥7.5% (≥58 mmol/mol) (OR 2.54, 99% CI 2.09–3.09 for HbA1c from 7.5 to <9% [58 to <75 mmol/mol] and OR 8.74, 99% CI 7.18–10.63 for HbA1c ≥9.0% [≥75 mmol/mol]).”
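Odds-ratio intervals like those reported above are symmetric on the log-odds scale, which means the underlying standard error (and the point estimate, as the geometric mean of the bounds) can be recovered from the published bounds alone. A small sketch, using the reported female-sex OR of 1.23 (99% CI 1.10–1.37) and the 99% normal quantile z ≈ 2.576:

```python
import math

def se_from_ci(lower, upper, z=2.576):
    """Recover the standard error of log(OR) from a reported CI,
    assuming the interval was computed as exp(beta +/- z*SE)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def point_estimate(lower, upper):
    """Under log-symmetry, the point estimate is the geometric
    mean of the CI bounds."""
    return math.sqrt(lower * upper)

# Female sex: OR 1.23, 99% CI 1.10-1.37 (from the quoted paper)
print(round(point_estimate(1.10, 1.37), 2))  # ~1.23, matching the reported OR
print(round(se_from_ci(1.10, 1.37), 3))      # SE of log(OR)
```

This is a handy consistency check when reading papers: if the geometric mean of the bounds is far from the reported estimate, the interval was probably not a plain Wald interval on the log scale.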

Poor metabolic control is obviously very important, but it’s important to remember that poor metabolic control is itself an outcome that needs to be explained. I would note that the mean HbA1c values here, especially the 75 mmol/mol one, seem really high; this is not a very satisfactory level of glycemic control, and it corresponds to an average glucose level of roughly 12 mmol/l. And that’s a population average, meaning that many individuals have values much higher than this. Actually the most surprising thing to me about these data is that the DKA event rates are not much higher than they are, considering the level of metabolic control achieved. Another slightly surprising finding is that teenagers (13-17 yrs) were not actually all that much more likely to have experienced DKA than small children (0-6 yrs); the OR is only ~1.5. Of course this cannot be taken as an indication that DKA in teenagers does not make up a substantial proportion of the total number of DKA events in pediatric samples, as the type 1 prevalence is much higher among teenagers than among small children (incidence peaks in adolescence).
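For reference, the unit conversions behind that remark can be sketched as follows. The IFCC-to-NGSP master equation and the ADAG estimated-average-glucose regression are standard, but treat the exact constants below as assumptions to double-check rather than authoritative values:

```python
def ifcc_to_ngsp(mmol_mol):
    """Convert HbA1c from IFCC units (mmol/mol) to NGSP percent,
    using the master equation NGSP ~= 0.0915 * IFCC + 2.15."""
    return 0.0915 * mmol_mol + 2.15

def estimated_average_glucose(ngsp_pct):
    """ADAG regression: eAG (mmol/L) ~= 1.59 * HbA1c(%) - 2.59."""
    return 1.59 * ngsp_pct - 2.59

pct = ifcc_to_ngsp(75)                 # the NPDA mean from the quote
eag = estimated_average_glucose(pct)
print(round(pct, 1))   # -> 9.0 (%)
print(round(eag, 1))   # ~12 mmol/L, as noted above
```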

“In 2004–2009 in the U.S., the mean hospital cost per pediatric DKA admission was $7,142 (range $4,125–11,916) (6), and insurance claims data from 2007 reported an excess of $5,837 in annual medical expenditures for youth with insulin-treated diabetes with DKA compared with those without DKA (7). In Germany, pediatric patients with diabetes with DKA had diabetes-related costs that were up to 3.6-fold higher compared with those without DKA (8).”

“DKA frequency was lower in pump users than in injection users (OR 0.84, 99% CI 0.76–0.93). Heterogeneity in the association with DKA between registries was seen for pump use and age category, and the overall rate should be interpreted accordingly. A lower rate of DKA in pump users was only found in T1DX, in contrast to no association of pump use with DKA in DPV or NPDA. […] In multivariable analyses […], age, type 1 diabetes duration, and pump use were not significantly associated with DKA in the fully adjusted model. […] pump use was associated with elevated odds of DKA in the <6-year-olds and in the 6- to <13-year-olds but with reduced odds of DKA in the 13- to <18-year-olds.”

All else equal, pump use should probably increase the risk of DKA; but all else is never equal, and in these data pump users actually had a lower DKA event rate than diabetics treated with injections. One should not conclude from this finding that pump use decreases the risk of DKA: selection bias and unobserved heterogeneities are problems that are almost impossible to correct for adequately, and I find it highly unlikely that selection bias is only a potential problem in the US (see below). There are many ways selection bias can be a relevant problem; financial and insurance-related reasons (particularly relevant in the US, and likely the main factors the authors have in mind) are far from the only ones. I could easily imagine selection dynamics playing a major role even in a hypothetical setting where all newly diagnosed children were started on pump therapy as a matter of course. In such a setting, very poorly controlled individuals might have ten DKA events in a short period of time because they didn’t test their blood glucose often enough, disregarded alarms, forgot or postponed refilling a near-empty pump, failed to switch the battery in time, and so on; the diabetologist/endocrinologist would then recommend that these patients, who were doing very poorly on pump treatment, switch to injection therapy, and what you would end up with is a compliant/motivated group of patients on pump therapy and a noncompliant/poorly motivated group on injection therapy. This would happen even though everybody started on pump therapy, so that pump therapy exposure was completely unrelated to outcomes. Pump therapy simply requires more of the patient than injection therapy does, and if the patient is unwilling or unable to put in the work required, that treatment option will fail.
In my opinion the default assumption here should be that these treatment groups are (‘significantly’) different, not that they are similar.

A few more quotes from the paper:

“The major finding of these analyses is high rates of pediatric DKA across the three registries, even though DKA events at the time of diagnosis were not included. In the prior 12 months, ∼1 in 20 (DPV), 1 in 16 (NPDA), and 1 in 14 (T1DX) pediatric patients with a duration of diabetes ≥1 year were diagnosed with DKA and required treatment in a health care facility. Female sex, ethnic minority status, and elevated HbA1c were consistent indicators of risk for DKA across all three registries. These indicators of increased risk for DKA are similar to previous reports (10,11,18,19), and our rates of DKA are within the range in the pediatric diabetes literature of 1–15% per established patient per year (10,11).

Compared with patients receiving injection therapy, insulin pump use was associated with a lower risk of DKA only in the U.S. in the T1DX, but no difference was seen in the DPV or NPDA. Country-specific factors on the associations of risk factors with DKA require further investigation. For pump use, selection bias may play a role in the U.S. The odds of DKA in pump users was not increased in any registry, which is a marked difference from some (10) but not all historic data (20).”

ii. Effect of Long-Acting Insulin Analogs on the Risk of Cancer: A Systematic Review of Observational Studies.

“NPH insulin has been the mainstay treatment for type 1 diabetes and advanced type 2 diabetes since the 1950s. However, this insulin is associated with an increased risk of nocturnal hypoglycemia, and its relatively short half-life requires frequent administration (1,2). Consequently, structurally modified insulins, known as long-acting insulin analogs (glargine and detemir), were developed in the 1990s to circumvent these limitations. However, there are concerns that long-acting insulin analogs may be associated with an increased risk of cancer. Indeed, some laboratory studies showed long-acting insulin analogs were associated with cancer cell proliferation and protected against apoptosis via their higher binding affinity to IGF-I receptors (3,4).

In 2009, four observational studies associated the use of insulin glargine with an increased risk of cancer (5–8). These studies raised important concerns but were also criticized for important methodological shortcomings (9–13). Since then, several observational studies assessing the association between long-acting insulin analogs and cancer have been published but yielded inconsistent findings (14–28). […] Several meta-analyses of observational studies have investigated the association between insulin glargine and cancer risk (34–37). These meta-analyses assessed the quality of included studies, but the methodological issues particular to pharmacoepidemiologic research were not fully considered. In addition, given the presence of important heterogeneity in this literature, the appropriateness of pooling the results of these studies remains unclear. We therefore conducted a systematic review of observational studies examining the association between long-acting insulin analogs and cancer incidence, with a particular focus on methodological strengths and weaknesses of these studies.”

“[W]e assessed the quality of studies for key components, including time-related biases (immortal time, time-lag, and time-window), inclusion of prevalent users, inclusion of lag periods, and length of follow-up between insulin initiation and cancer incidence.

Immortal time bias is defined by a period of unexposed person-time that is misclassified as exposed person-time or excluded, resulting in the exposure of interest appearing more favorable (40,41). Time-lag bias occurs when treatments used later in the disease management process are compared with those used earlier for less advanced stages of the disease. Such comparisons can result in confounding by disease duration or severity of disease if duration and severity of disease are not adequately considered in the design or analysis of the study (29). This is particularly true for chronic disease with dynamic treatment processes such as type 2 diabetes. Currently, American and European clinical guidelines suggest using basal insulin (e.g., NPH, glargine, and detemir) as a last line of treatment if HbA1c targets are not achieved with other antidiabetic medications (42). Therefore, studies that compare long-acting insulin analogs to nonbasal insulin may introduce confounding by disease duration. Time-window bias occurs when the opportunity for exposure differs between case subjects and control subjects (29,43).

The importance of considering a lag period is necessary for latency considerations (i.e., a minimum time between treatment initiation and the development of cancer) and to minimize protopathic and detection bias. Protopathic bias, or reverse causation, is present when a medication (exposure) is prescribed for early symptoms related to the outcome of interest, which can lead to an overestimation of the association. Lagging the exposure by a predefined time window in cohort studies or excluding exposures in a predefined time window before the event in case-control studies is a means of minimizing this bias (44). Detection bias is present when the exposure leads to higher detection of the outcome of interest due to the increased frequency of clinic visits (e.g., newly diagnosed patients with type 2 diabetes or new users of another antidiabetic medication), which also results in an overestimation of risk (45). Thus, including a lag period, such as starting follow-up after 1 year of the initiation of a drug, simultaneously considers a latency period while also minimizing protopathic and detection bias.”
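Immortal time bias is easy to reproduce in simulation. In the sketch below (entirely synthetic data, not related to any of the reviewed studies), treatment has no effect on survival at all, yet a naive analysis that classifies anyone who ever starts treatment as “exposed” makes the exposed group look protected, simply because starting treatment requires surviving long enough to start it:

```python
import random

random.seed(42)

def simulate_naive_exposure(n=20000, mean_survival=5.0, mean_start=3.0):
    """Survival time and treatment-start time are independent
    exponentials, so treatment truly has no effect. 'Exposed' =
    treatment started before death (the naive, time-independent
    classification that creates immortal person-time)."""
    exposed, unexposed = [], []
    for _ in range(n):
        death = random.expovariate(1 / mean_survival)
        start = random.expovariate(1 / mean_start)
        (exposed if start < death else unexposed).append(death)
    return sum(exposed) / len(exposed), sum(unexposed) / len(unexposed)

mean_exp, mean_unexp = simulate_naive_exposure()
print(mean_exp > mean_unexp)  # True: a spurious 'benefit' of treatment
```

The fix in real analyses is to treat exposure as time-dependent, so that the pre-initiation person-time is counted as unexposed.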

“We systematically searched MEDLINE and EMBASE from 2000 to 2014 to identify all observational studies evaluating the relationship between the long-acting insulin analogs and the risk of any and site-specific cancers (breast, colorectal, prostate). […] 16 cohort and 3 case-control studies were included in this systematic review (5–8,14–28). All studies evaluated insulin glargine, with four studies also investigating insulin detemir (15,17,25,28). […] The study populations ranged from 1,340 to 275,164 patients […]. The mean or median durations of follow-up and age ranged from 0.9 to 7.0 years and from 52.3 to 77.4 years, respectively. […] Thirteen of 15 studies reported no association between insulin glargine and detemir and any cancer. Four of 13 studies reported an increased risk of breast cancer with insulin glargine. In the quality assessment, 7 studies included prevalent users, 11 did not consider a lag period, 6 had time-related biases, and 16 had short (<5 years) follow-up.”

“Of the 19 studies in this review, immortal time bias may have been introduced in one study based on the time-independent exposure and cohort entry definitions that were used in this cohort study […] Time-lag bias may have occurred in four studies […] A variation of time-lag bias was observed in a cohort study of new insulin users (28). For the exposure definition, highest duration since the start of insulin use was compared with the lowest. It is expected that the risk of cancer would increase with longer duration of insulin use; however, the opposite was reported (with RRs ranging from 0.50 to 0.90). The protective association observed could be due to competing risks (e.g., death from cardiovascular-related events) (47,48). Patients with diabetes have a higher risk of cardiovascular-related deaths compared with patients with no diabetes (49,50). Therefore, patients with diabetes who die of cardiovascular-related events do not have the opportunity to develop cancer, resulting in an underestimation of the risk of cancer. […] Time-window bias was observed in two studies (18,22). […] HbA1c and diabetes duration were not accounted for in 15 of the 19 studies, resulting in likely residual confounding (7,8,14–18,20–26,28). […] Seven studies included prevalent users of insulin (8,15,18,20,21,23,25), which is problematic because of the corresponding depletion of susceptible subjects in other insulin groups compared with long-acting insulin analogs. Protopathic or detection bias could have resulted in 11 of the 19 studies because a lag period was not incorporated in the study design (6,7,14–16,18–21,23,28).”

“CONCLUSIONS The observational studies examining the risk of cancer associated with long-acting insulin analogs have important methodological shortcomings that limit the conclusions that can be drawn. Thus, uncertainty remains, particularly for breast cancer risk.”

iii. Impact of Socioeconomic Status on Cardiovascular Disease and Mortality in 24,947 Individuals With Type 1 Diabetes.

“Socioeconomic status (SES) is a powerful predictor of cardiovascular disease (CVD) and death. We examined the association in a large cohort of patients with type 1 diabetes. […] Clinical data from the Swedish National Diabetes Register were linked to national registers, whereby information on income, education, marital status, country of birth, comorbidities, and events was obtained. […] Type 1 diabetes was defined on the basis of epidemiologic data: treatment with insulin and a diagnosis at the age of 30 years or younger. This definition has been validated as accurate in 97% of the cases listed in the register (14).”

“We included 24,947 patients. Mean (SD) age and follow-up was 39.1 (13.9) and 6.0 (1.0) years. Death and fatal/nonfatal CVD occurred in 926 and 1378 individuals. Compared with being single, being married was associated with 50% lower risk of death, cardiovascular (CV) death, and diabetes-related death. Individuals in the two lowest quintiles had twice as great a risk of fatal/nonfatal CVD, coronary heart disease, and stroke and roughly three times as great a risk of death, diabetes-related death, and CV death as individuals in the highest income quintile. Compared with having ≤9 years of education, individuals with a college/university degree had 33% lower risk of fatal/nonfatal stroke.”

“Individuals with 10–12 years of education were comparable at baseline (considering distribution of age and sex) with those with a college/university degree […]. Individuals with a college/university degree had higher income, had 5 mmol/mol lower HbA1c, were more likely to be married/cohabiting, used insulin pump more frequently (17.5% vs. 14.5%), smoked less (5.8% vs. 13.1%), and had less albuminuria (10.8% vs. 14.2%). […] Women had substantially lower income and higher education, were more often married, used insulin pump more frequently, had less albuminuria, and smoked more frequently than men […] Individuals with high income were more likely to be married/cohabiting, had lower HbA1c, and had lower rates of smoking as well as albuminuria”.

“CONCLUSIONS Low SES increases the risk of CVD and death by a factor of 2–3 in type 1 diabetes.”

“The effect of SES was striking despite rigorous adjustments for risk factors and confounders. Individuals in the two lowest income quintiles had two to three times higher risk of CV events and death than those in the highest income quintile. Compared with low educational level, having high education was associated with ∼30% lower risk of stroke. Compared with being single, individuals who were married/cohabiting had >50% lower risk of death, CV death, and diabetes-related death. Immigrants had 20–40% lower risk of fatal/nonfatal CVD, all-cause death, and diabetes-related death. Additionally, we show that males had 44%, 63%, and 29% higher risk of all-cause death, CV death, and diabetes-related death, respectively.

Despite rigorous adjustments for covariates and equitable access to health care at a negligible cost (20,21), SES and sex were robust predictors of CVD disease and mortality in type 1 diabetes; their effect was comparable with that of smoking, which represented an HR of 1.56 (95% CI 1.29–1.91) for all-cause death. […] Our study shows that men with type 1 diabetes are at greater risk of CV events and death compared with women. This should be viewed in the light of a recent meta-analysis of 26 studies, which showed higher excess risk in women compared with men. Overall, women had 40% greater excess risk of all-cause mortality, and twice the excess risk of fatal/nonfatal vascular events, compared with men (29). Thus, whereas the excess risk (i.e., the risk of patients with diabetes compared with the nondiabetic population) of vascular disease is higher in women with diabetes, we show that men with diabetes are still at substantially greater risk of all-cause death, CV death, and diabetes death compared with women with diabetes. Other studies are in line with our findings (10,11,13,30–32).”

iv. Interventions That Restore Awareness of Hypoglycemia in Adults With Type 1 Diabetes: A Systematic Review and Meta-analysis.

“Hypoglycemia remains the major limiting factor toward achieving good glycemic control (1). Recurrent hypoglycemia reduces symptomatic and hormone responses to subsequent hypoglycemia (2), associated with impaired awareness of hypoglycemia (IAH). IAH occurs in up to one-third of adults with type 1 diabetes (T1D) (3,4), increasing their risk of severe hypoglycemia (SH) sixfold (3) and contributing to substantial morbidity, with implications for employment (5), driving (6), and mortality. Distribution of risk of SH is skewed: one study showed that 5% of subjects accounted for 54% of all SH episodes, with IAH one of the main risk factors (7). “Dead-in-bed,” related to nocturnal hypoglycemia, is a leading cause of death in people with T1D <40 years of age (8).”

“This systematic review assessed the clinical effectiveness of treatment strategies for restoring hypoglycemia awareness (HA) and reducing SH risk in those with IAH and performed a meta-analysis, where possible, for different approaches in restoring awareness in T1D adults. Interventions to restore HA were broadly divided into three categories: educational (inclusive of behavioral), technological, and pharmacotherapeutic. […] Forty-three studies (18 randomized controlled trials, 25 before-and-after studies) met the inclusion criteria, comprising 27 educational, 11 technological, and 5 pharmacological interventions. […] A meta-analysis for educational interventions on change in mean SH rates per person per year was performed. Combining before-and-after and RCT studies, six studies (n = 1,010 people) were included in the meta-analysis […] A random-effects meta-analysis revealed an effect size of a reduction in SH rates of 0.44 per patient per year with 95% CI 0.253–0.628. [here’s the forest plot, US] […] Most of the educational interventions were observational and mostly retrospective, with few RCTs. The overall risk of bias is considered medium to high and the study quality moderate. Most, if not all, of the RCTs did not use double blinding and lacked information on concealment. The strength of association of the effect of educational interventions is moderate. The ability of educational interventions to restore IAH and reduce SH is consistent and direct with educational interventions showing a largely positive outcome. There is substantial heterogeneity between studies, and the estimate is imprecise, as reflected by the large CIs. The strength of evidence is moderate to high.”
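A random-effects pooled estimate of the kind quoted is commonly computed with the DerSimonian–Laird method. Below is a minimal sketch using made-up study effects and variances (illustrative only, not the six studies in the meta-analysis):

```python
import math

def dersimonian_laird(effects, variances):
    """DerSimonian-Laird random-effects pooling.
    Returns (pooled effect, 95% CI lower, 95% CI upper)."""
    k = len(effects)
    w = [1 / v for v in variances]                      # fixed-effect weights
    fe = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fe) ** 2 for wi, yi in zip(w, effects))  # heterogeneity
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                  # between-study variance
    w_re = [1 / (v + tau2) for v in variances]          # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se

# Hypothetical per-study reductions in SH rate (per patient per year)
effects = [0.30, 0.55, 0.40, 0.62, 0.35, 0.50]
variances = [0.010, 0.020, 0.015, 0.030, 0.012, 0.025]
pooled, lo, hi = dersimonian_laird(effects, variances)
print(round(pooled, 2), round(lo, 2), round(hi, 2))
```

The between-study variance tau² is what distinguishes this from fixed-effect pooling; substantial heterogeneity, as noted in the quote, inflates tau² and widens the pooled CI.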

v. Trends of Diagnosis-Specific Work Disability After Newly Diagnosed Diabetes: A 4-Year Nationwide Prospective Cohort Study.

“There is little evidence to show which specific diseases contribute to excess work disability among those with diabetes. […] In this study, we used a large nationwide register-based data set, which includes information on work disability for all working-age inhabitants of Sweden, in order to investigate trends of diagnosis-specific work disability (sickness absence and disability pension) among people with diabetes for 4 years directly after the recorded onset of diabetes. We compared work disability trends among people with diabetes with trends among those without diabetes. […] The register data of diabetes medication and in- and outpatient hospital visits were used to identify all recorded new diabetes cases among the population aged 25–59 years in Sweden in 2006 (n = 14,098). Data for a 4-year follow-up of ICD-10 physician-certified sickness absence and disability pension days (2007‒2010) were obtained […] Comparisons were made using a random sample of the population without recorded diabetes (n = 39,056).”

“RESULTS The most common causes of work disability were mental and musculoskeletal disorders; diabetes as a reason for disability was rare. Most of the excess work disability among people with diabetes compared with those without diabetes was owing to mental disorders (mean difference adjusted for confounding factors 18.8‒19.8 compensated days/year), musculoskeletal diseases (12.1‒12.8 days/year), circulatory diseases (5.9‒6.5 days/year), diseases of the nervous system (1.8‒2.0 days/year), and injuries (1.0‒1.2 days/year).”

“CONCLUSIONS The increased risk of work disability among those with diabetes is largely attributed to comorbid mental, musculoskeletal, and circulatory diseases. […] Diagnosis of diabetes as the cause of work disability was rare.”

August 19, 2017 Posted by | Cancer/oncology, Cardiology, Diabetes, Health Economics, Medicine, Statistics | Leave a comment

Infectious Disease Surveillance (II)

Some more observations from the book below.

“There are three types of influenza viruses — A, B, and C — of which only types A and B cause widespread outbreaks in humans. Influenza A viruses are classified into subtypes based on antigenic differences between their two surface glycoproteins, hemagglutinin and neuraminidase. Seventeen hemagglutinin subtypes (H1–H17) and nine neuraminidase subtypes (N1–N9) have been identified. […] The internationally accepted naming convention for influenza viruses contains the following elements: the type (e.g., A, B, C), geographical origin (e.g., Perth, Victoria), strain number (e.g., 361), year of isolation (e.g., 2011), for influenza A the hemagglutinin and neuraminidase antigen description (e.g., H1N1), and for nonhuman origin viruses the host of origin (e.g., swine) [4].”
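The naming convention lends itself to a mechanical formatter. A small sketch, under the assumptions stated in the quote (host segment only for nonhuman-origin viruses, antigen description only for influenza A); the exact separator and parenthesis placement here are illustrative, not an official specification:

```python
def strain_name(vtype, origin, strain, year, subtype=None, host=None):
    """Assemble an influenza strain name following the convention
    quoted above: type/[host/]origin/strain/year, plus (HxNy) for
    influenza A."""
    parts = [vtype]
    if host:                      # only for nonhuman-origin viruses
        parts.append(host)
    parts += [origin, str(strain), str(year)]
    name = "/".join(parts)
    if vtype == "A" and subtype:  # antigen description, type A only
        name += f" ({subtype})"
    return name

print(strain_name("A", "Perth", 361, 2011, subtype="H1N1"))
# -> A/Perth/361/2011 (H1N1)
print(strain_name("A", "Iowa", 15, 2015, subtype="H3N2", host="swine"))
```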

“Only two antiviral drug classes are licensed for chemoprophylaxis and treatment of influenza—the adamantanes (amantadine and rimantadine) and the neuraminidase inhibitors (oseltamivir and zanamivir). […] Antiviral resistant strains arise through selection pressure in individual patients during treatment [which can lead to treatment failure]. […] they usually do not transmit further (because of impaired virus fitness) and have limited public health implications. On the other hand, primarily resistant viruses have emerged in the past decade and in some cases have completely replaced the susceptible strains. […] Surveillance of severe influenza illness is challenging because most cases remain undiagnosed. […] In addition, most of the influenza burden on the healthcare system is because of complications such as secondary bacterial infections and exacerbations of pre-existing chronic diseases, and often influenza is not suspected as an underlying cause. Even if suspected, the virus could have been already cleared from the respiratory secretions when the testing is performed, making diagnostic confirmation impossible. […] Only a small proportion of all deaths caused by influenza are classified as influenza-related on death certificates. […] mortality surveillance based only on death certificates is not useful for the rapid assessment of an influenza epidemic or pandemic severity. Detection of excess mortality in real time can be done by establishing specific monitoring systems that overcome these delays [such as sentinel surveillance systems, US].”
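The kind of real-time excess-mortality monitoring mentioned can be illustrated with a toy threshold rule on weekly counts: flag any week whose count exceeds the mean of the preceding baseline weeks by more than z standard deviations. (Actual systems fit seasonal regression models and correct for reporting delays; this is only a sketch with hypothetical data.)

```python
import statistics

def excess_weeks(counts, baseline_weeks=8, z=2.0):
    """Flag week indices whose death count exceeds the mean of the
    preceding `baseline_weeks` weeks by more than z standard
    deviations of that baseline window."""
    flagged = []
    for i in range(baseline_weeks, len(counts)):
        window = counts[i - baseline_weeks:i]
        mean = statistics.mean(window)
        sd = statistics.stdev(window)
        if sd > 0 and counts[i] > mean + z * sd:
            flagged.append(i)
    return flagged

# Hypothetical weekly all-cause death counts with a spike at weeks 10-11
weekly = [100, 103, 98, 101, 99, 102, 97, 100, 101, 99, 140, 135, 102]
print(excess_weeks(weekly))  # the spike weeks are flagged
```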

“Influenza vaccination programs are extremely complex and costly. More than half a billion doses of influenza vaccines are produced annually in two separate vaccine production cycles, one for the Northern Hemisphere and one for the Southern Hemisphere [54]. Because the influenza virus evolves constantly and vaccines are reformulated yearly, both vaccine effectiveness and safety need to be monitored routinely. Vaccination campaigns are also organized annually and require continuous public health efforts to maintain an acceptable level of vaccination coverage in the targeted population. […] huge efforts are made and resources spent to produce and distribute influenza vaccines annually. Despite these efforts, vaccination coverage among those at risk in many parts of the world remains low.”

“The Active Bacterial Core surveillance (ABCs) network and its predecessor have been examples of using surveillance as information for action for over 20 years. ABCs has been used to measure disease burden, to provide data for vaccine composition and recommended-use policies, and to monitor the impact of interventions. […] sites represent wide geographic diversity and approximately reflect the race and urban-to-rural mix of the U.S. population [37]. Currently, the population under surveillance is 19–42 million and varies by pathogen and project. […] ABCs has continuously evolved to address challenging questions posed by the six pathogens (H. influenzae, GAS [Group A Streptococcus], GBS [Group B Streptococcus], S. pneumoniae, N. meningitidis, and MRSA) and other emerging infections. […] For the six core pathogens, the objectives are (1) to determine the incidence and epidemiologic characteristics of invasive disease in geographically diverse populations in the United States through active, laboratory, and population-based surveillance; (2) to determine molecular epidemiologic patterns and microbiologic characteristics of isolates collected as part of routine surveillance in order to track antimicrobial resistance; (3) to detect the emergence of new strains with new resistance patterns and/or virulence and contribute to development and evaluation of new vaccines; and (4) to provide an infrastructure for surveillance of other emerging pathogens and for conducting studies aimed at identifying risk factors for disease and evaluating prevention policies.”

“Food may become contaminated by over 250 bacterial, viral, and parasitic pathogens. Many of these agents cause diarrhea and vomiting, but there is no single clinical syndrome common to all foodborne diseases. Most of these agents can also be transmitted by nonfoodborne routes, including contact with animals or contaminated water. Therefore, for a given illness, it is often unclear whether the source of infection is foodborne or not. […] Surveillance systems for foodborne diseases provide extremely important information for prevention and control.”

“Since 1995, the Centers for Disease Control and Prevention (CDC) has routinely used an automated statistical outbreak detection algorithm that compares current reports of each Salmonella serotype with the preceding 5-year mean number of cases for the same geographic area and week of the year to look for unusual clusters of infection [5]. The sensitivity of Salmonella serotyping to detect outbreaks is greatest for rare serotypes, because a small increase is more noticeable against a rare background. The utility of serotyping has led to its widespread adoption in surveillance for food pathogens in many countries around the world [6]. […] Today, a new generation of subtyping methods […] is increasing the specificity of laboratory-based surveillance and its power to detect outbreaks […] Molecular subtyping allows comparison of the molecular “fingerprint” of bacterial strains. In the United States, the CDC coordinates a network called PulseNet that captures data from standardized molecular subtyping by PFGE [pulsed field gel electrophoresis]. By comparing new submissions and past data, public health officials can rapidly identify geographically dispersed clusters of disease that would otherwise not be apparent and evaluate them as possible foodborne-disease outbreaks [8]. The ability to identify geographically dispersed outbreaks has become increasingly important as more foods are mass-produced and widely distributed. […] Similar networks have been developed in Canada, Europe, the Asia Pacific region, Latin America and the Caribbean region, the Middle Eastern region and, most recently, the African region”.
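
The aberration-detection logic described in the quote is easy to illustrate. Below is a toy Python sketch of a threshold rule of this general kind — comparing a current weekly count against a historical mean — not the CDC's actual algorithm; the counts, the 2-SD threshold, and the function name are all invented for illustration:

```python
import statistics

def flag_cluster(current_count, historical_counts, threshold_sd=2.0):
    """Flag an unusual cluster when the current weekly count exceeds
    the historical mean by more than threshold_sd standard deviations."""
    mean = statistics.mean(historical_counts)
    sd = statistics.stdev(historical_counts)
    return current_count > mean + threshold_sd * sd

# The same absolute increase (3 extra cases) stands out against a rare
# serotype's background but disappears against a common serotype's:
print(flag_cluster(4, [1, 0, 2, 1, 1]))        # True
print(flag_cluster(24, [21, 20, 22, 19, 23]))  # False
```

Which is exactly the point made in the quote: the sensitivity of serotype-based surveillance is greatest for rare serotypes, because the background noise is small.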

“Food consumption and practices have changed during the past 20 years in the United States, resulting in a shift from readily detectable, point-source outbreaks (e.g., attendance at a wedding dinner), to widespread outbreaks that occur over many communities with only a few illnesses in each community. One of the changes has been establishment of large food-producing facilities that disseminate products throughout the country. If a food product is contaminated with a low level of pathogen, contaminated food products are distributed across many states; and only a few illnesses may occur in each community. This type of outbreak is often difficult to detect. PulseNet has been critical for the detection of widely dispersed outbreaks in the United States [17]. […] The growth of the PulseNet database […] and the use of increasingly sophisticated epidemiological approaches have led to a dramatic increase in the number of multistate outbreaks detected and investigated.”

“Each year, approximately 35 million people are hospitalized in the United States, accounting for 170 million inpatient days [1,2]. There are no recent estimates of the numbers of healthcare-associated infections (HAI). However, two decades ago, HAI were estimated to affect more than 2 million hospital patients annually […] The mortality attributed to these HAI was estimated at about 100,000 deaths annually. […] Almost 85% of HAI in the United States are associated with bacterial pathogens, and 33% are thought to be preventable [4]. […] The primary purpose of surveillance [in the context of HAI] is to alert clinicians, epidemiologists, and laboratories of the need for targeted prevention activities required to reduce HAI rates. HAI surveillance data help to establish baseline rates that may be used to determine the potential need to change public health policy, to act and intervene in clinical settings, and to assess the effectiveness of microbiology methods, appropriateness of tests, and allocation of resources. […] As less than 10% of HAI in the United States occur as recognized epidemics [18], HAI surveillance should not be embarked on merely for the detection of outbreaks.”

“There are two types of rate comparisons — intrahospital and interhospital. The primary goals of intrahospital comparison are to identify areas within the hospital where HAI are more likely to occur and to measure the efficacy of interventional efforts. […] Without external comparisons, hospital infection control departments may [however] not know if the endemic rates in their respective facilities are relatively high or where to focus the limited financial and human resources of the infection control program. […] The CDC has been the central aggregating institution for active HAI surveillance in the United States since the 1960s.”

“Low sensitivity (i.e., missed infections) in a surveillance system is usually more common than low specificity (i.e., patients reported to have infections who did not actually have infections).”

“Among the numerous analyses of CDC hospital data carried out over the years, characteristics consistently found to be associated with higher HAI rates include affiliation with a medical school (i.e., teaching vs. nonteaching), size of the hospital and ICU categorized by the number of beds (large hospitals and larger ICUs generally had higher infection rates), type of control or ownership of the hospital (municipal, nonprofit, investor owned), and region of the country [43,44]. […] Various analyses of SENIC and NNIS/NHSN data have shown that differences in patient risk factors are largely responsible for interhospital differences in HAI rates. After controlling for patients’ risk factors, average lengths of stay, and measures of the completeness of diagnostic workups for infection (e.g., culturing rates), the differences in the average HAI rates of the various hospital groups virtually disappeared. […] For all of these reasons, an overall HAI rate, per se, gives little insight into whether the facility’s infection control efforts are effective.”

“Although a hospital’s surveillance system might aggregate accurate data and generate appropriate risk-adjusted HAI rates for both internal and external comparison, comparison may be misleading for several reasons. First, the rates may not adjust for patients’ unmeasured intrinsic risks for infection, which vary from hospital to hospital. […] Second, if surveillance techniques are not uniform among hospitals or are used inconsistently over time, variations will occur in sensitivity and specificity for HAI case finding. Third, the sample size […] must be sufficient. This issue is of concern for hospitals with fewer than 200 beds, which represent about 10% of hospital admissions in the United States. In most CDC analyses, rates from hospitals with very small denominators tend to be excluded [37,46,49]. […] Although many healthcare facilities around the country aggregate HAI surveillance data for baseline establishment and interhospital comparison, the comparison of HAI rates is complex, and the value of the aggregated data must be balanced against the burden of their collection. […] If a hospital does not devote sufficient resources to data collection, the data will be of limited value, because they will be replete with inaccuracies. No national database has successfully dealt with all the problems in collecting HAI data and each varies in its ability to address these problems. […] While comparative data can be useful as a tool for the prevention of HAI, in some instances no data might be better than bad data.”

August 10, 2017 Posted by | Books, Data, Epidemiology, Infectious disease, Medicine, Statistics

Beyond Significance Testing (IV)

Below I have added some quotes from chapters 5, 6, and 7 of the book.

“There are two broad classes of standardized effect sizes for analysis at the group or variable level, the d family, also known as group difference indexes, and the r family, or relationship indexes […] Both families are metric- (unit-) free effect sizes that can compare results across studies or variables measured in different original metrics. Effect sizes in the d family are standardized mean differences that describe mean contrasts in standard deviation units, which can exceed 1.0 in absolute value. Standardized mean differences are signed effect sizes, where the sign of the statistic indicates the direction of the corresponding contrast. Effect sizes in the r family are scaled in correlation units that generally range from −1.0 to +1.0, where the sign indicates the direction of the relation […] Measures of association are unsigned effect sizes and thus do not indicate directionality.”

“The correlation rpb is for designs with two unrelated samples. […] rpb […] is affected by base rate, or the proportion of cases in one group versus the other, p and q. It tends to be highest in balanced designs. As the design becomes more unbalanced holding all else constant, rpb approaches zero. […] rpb is not directly comparable across studies with dissimilar relative group sizes […]. The correlation rpb is also affected by the total variability (i.e., ST). If this variation is not constant over samples, values of rpb may not be directly comparable.”
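The base-rate sensitivity of rpb can be seen directly from the standard approximate conversion between d and rpb, rpb = d/√(d² + 1/(pq)). A small Python sketch (the function name is mine, and the conversion assumes the usual large-sample simplification):

```python
import math

def rpb_from_d(d, p):
    """Approximate point-biserial r implied by a standardized mean
    difference d when a proportion p of cases is in one group (q = 1 - p)."""
    q = 1 - p
    return d / math.sqrt(d**2 + 1 / (p * q))

# Holding d = 0.5 fixed, r_pb shrinks as the design becomes unbalanced:
for p in (0.5, 0.8, 0.95):
    print(p, round(rpb_from_d(0.5, p), 3))   # 0.243, 0.196, 0.108
```

The same standardized mean difference yields smaller and smaller correlations as the group sizes diverge, which is why rpb values are not directly comparable across studies with dissimilar relative group sizes.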

“Too many researchers neglect to report reliability coefficients for scores analyzed. This is regrettable because effect sizes cannot be properly interpreted without knowing whether the scores are precise. The general effect of measurement error in comparative studies is to attenuate absolute standardized effect sizes and reduce the power of statistical tests. Measurement error also contributes to variation in observed results over studies. Of special concern is when both score reliabilities and sample sizes vary from study to study. If so, effects of sampling error are confounded with those due to measurement error. […] There are ways to correct some effect sizes for measurement error (e.g., Baguley, 2009), but corrected effect sizes are rarely reported. It is more surprising that measurement error is ignored in most meta-analyses, too. F. L. Schmidt (2010) found that corrected effect sizes were analyzed in only about 10% of the 199 meta-analytic articles published in Psychological Bulletin from 1978 to 2006. This implies that (a) estimates of mean effect sizes may be too low and (b) the wrong statistical model may be selected when attempting to explain between-studies variation in results. If a fixed effects model is mistakenly chosen over a random effects model, confidence intervals based on average effect sizes tend to be too narrow, which can make those results look more precise than they really are. Underestimating mean effect sizes while simultaneously overstating their precision is a potentially serious error.”

“[D]emonstration of an effect’s significance — whether theoretical, practical, or clinical — calls for more discipline-specific expertise than the estimation of its magnitude”.

“Some outcomes are categorical instead of continuous. The levels of a categorical outcome are mutually exclusive, and each case is classified into just one level. […] The risk difference (RD) is defined as pC − pT, and it estimates the parameter πC − πT. [Those ‘n-resembling letters’ are how wordpress displays pi; this is one of an almost infinite number of reasons why I detest blogging equations on this blog and usually do not do this – US] […] The risk ratio (RR) is the ratio of the risk rates […] which rate appears in the numerator versus the denominator is arbitrary, so one should always explain how RR is computed. […] The odds ratio (OR) is the ratio of the within-groups odds for the undesirable event. […] A convenient property of OR is that it can be converted to a kind of standardized mean difference known as logit d (Chinn, 2000). […] Reporting logit d may be of interest when the hypothetical variable that underlies the observed dichotomy is continuous.”

“The risk difference RD is easy to interpret but has a drawback: Its range depends on the values of the population proportions πC and πT. That is, the range of RD is greater when both πC and πT are closer to .50 than when they are closer to either 0 or 1.00. The implication is that RD values may not be comparable across different studies when the corresponding parameters πC and πT are quite different. The risk ratio RR is also easy to interpret. It has the shortcoming that only the finite interval from 0 to < 1.0 indicates lower risk in the group represented in the numerator, but the interval from > 1.00 to infinity is theoretically available for describing higher risk in the same group. The range of RR varies according to its denominator. This property limits the value of RR for comparing results across different studies. […] The odds ratio OR shares the limitation that the finite interval from 0 to < 1.0 indicates lower risk in the group represented in the numerator, but the interval from > 1.0 to infinity describes higher risk for the same group. Analyzing natural log transformations of OR and then taking antilogs of the results deals with this problem, just as for RR. The odds ratio may be the least intuitive of the comparative risk effect sizes, but it probably has the best overall statistical properties. This is because OR can be estimated in prospective studies, in studies that randomly sample from exposed and unexposed populations, and in retrospective studies where groups are first formed based on the presence or absence of a disease before their exposure to a putative risk factor is determined […]. Other effect sizes may not be valid in retrospective studies (RR) or in studies without random sampling ([Pearson correlations between dichotomous variables, US]).”
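
These comparative risk effect sizes are all simple functions of a 2×2 table. A sketch (the function and the illustrative counts are mine; the logit d line uses Chinn's ln(OR)·√3/π conversion):

```python
import math

def risk_effects(events_c, n_c, events_t, n_t):
    """RD, RR, OR, and logit d from event counts in a control and a
    treatment group. RR and OR put the control group in the numerator
    here, which - as the quote stresses - should always be stated."""
    p_c, p_t = events_c / n_c, events_t / n_t
    rd = p_c - p_t
    rr = p_c / p_t
    odds_ratio = (p_c / (1 - p_c)) / (p_t / (1 - p_t))
    logit_d = math.log(odds_ratio) * math.sqrt(3) / math.pi  # Chinn (2000)
    return rd, rr, odds_ratio, logit_d

# 20/100 events in the control group vs. 10/100 in the treatment group:
rd, rr, odds_ratio, logit_d = risk_effects(20, 100, 10, 100)
print(rd, rr, round(odds_ratio, 2), round(logit_d, 3))
```

Note how OR (2.25) exceeds RR (2.0) here; the two only approximate each other when the event is rare in both groups.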

“Sensitivity and specificity are determined by the threshold on a screening test. This means that different thresholds on the same test will generate different sets of sensitivity and specificity values in the same sample. But both sensitivity and specificity are independent of population base rate and sample size. […] Sensitivity and specificity affect predictive value, the proportion of test results that are correct […] In general, predictive values increase as sensitivity and specificity increase. […] Predictive value is also influenced by the base rate (BR), the proportion of all cases with the disorder […] In general, PPV [positive predictive value] decreases and NPV [negative…] increases as BR approaches zero. This means that screening tests tend to be more useful for ruling out rare disorders than correctly predicting their presence. It also means that most positive results may be false positives under low base rate conditions. This is why it is difficult for researchers or social policy makers to screen large populations for rare conditions without many false positives. […] The effect of BR on predictive values is striking but often overlooked, even by professionals […]. One misunderstanding involves confusing sensitivity and specificity, which are invariant to BR, with PPV and NPV, which are not. This means that diagnosticians fail to adjust their estimates of test accuracy for changes in base rates, which exemplifies the base rate fallacy. […] In general, test results have greater impact on changing the pretest odds when the base rate is moderate, neither extremely low (close to 0) nor extremely high (close to 1.0). But if the target disorder is either very rare or very common, only a result from a highly accurate screening test will change things much.”
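
The striking effect of the base rate on predictive values is worth seeing in numbers. A small sketch via Bayes' rule (the 95%-sensitive, 95%-specific test is a made-up example):

```python
def predictive_values(sensitivity, specificity, base_rate):
    """PPV and NPV of a screening test from its sensitivity, specificity,
    and the base rate of the disorder."""
    tp = sensitivity * base_rate               # true positives
    fp = (1 - specificity) * (1 - base_rate)   # false positives
    fn = (1 - sensitivity) * base_rate         # false negatives
    tn = specificity * (1 - base_rate)         # true negatives
    return tp / (tp + fp), tn / (tn + fn)

# The same test at three base rates:
for br in (0.5, 0.05, 0.001):
    ppv, npv = predictive_values(0.95, 0.95, br)
    print(br, round(ppv, 3), round(npv, 4))
```

At a base rate of 1 in 1,000 the PPV drops below 2% — fewer than one in fifty positive results is a true positive — while the NPV is essentially 1, which is the quoted point that screening tests tend to be more useful for ruling out rare disorders than for predicting their presence.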

“The technique of ANCOVA [ANalysis of COVAriance, US] has two more assumptions than ANOVA does. One is homogeneity of regression, which requires equal within-populations unstandardized regression coefficients for predicting outcome from the covariate. In nonexperimental designs where groups differ systematically on the covariate […] the homogeneity of regression assumption is rather likely to be violated. The second assumption is that the covariate is measured without error […] Violation of either assumption may lead to inaccurate results. For example, an unreliable covariate in experimental designs causes loss of statistical power and in nonexperimental designs may also cause inaccurate adjustment of the means […]. In nonexperimental designs where groups differ systematically, these two extra assumptions are especially likely to be violated. An alternative to ANCOVA is propensity score analysis (PSA). It involves the use of logistic regression to estimate the probability for each case of belonging to different groups, such as treatment versus control, in designs without randomization, given the covariate(s). These probabilities are the propensities, and they can be used to match cases from nonequivalent groups.”

August 5, 2017 Posted by | Books, Epidemiology, Papers, Statistics

A few diabetes papers of interest

i. Clinically Relevant Cognitive Impairment in Middle-Aged Adults With Childhood-Onset Type 1 Diabetes.

“Modest cognitive dysfunction is consistently reported in children and young adults with type 1 diabetes (T1D) (1). Mental efficiency, psychomotor speed, executive functioning, and intelligence quotient appear to be most affected (2); studies report effect sizes between 0.2 and 0.5 (small to modest) in children and adolescents (3) and between 0.4 and 0.8 (modest to large) in adults (2). Whether effect sizes continue to increase as those with T1D age, however, remains unknown.

A key issue not yet addressed is whether aging individuals with T1D have an increased risk of manifesting “clinically relevant cognitive impairment,” defined by comparing individual cognitive test scores to demographically appropriate normative means, as opposed to the more commonly investigated “cognitive dysfunction,” or between-group differences in cognitive test scores. Unlike the extensive literature examining cognitive impairment in type 2 diabetes, we know of only one prior study examining cognitive impairment in T1D (4). This early study reported a higher rate of clinically relevant cognitive impairment among children (10–18 years of age) diagnosed before compared with after age 6 years (24% vs. 6%, respectively) or a non-T1D cohort (6%).”

“This study tests the hypothesis that childhood-onset T1D is associated with an increased risk of developing clinically relevant cognitive impairment detectable by middle age. We compared cognitive test results between adults with and without T1D and used demographically appropriate published norms (10–12) to determine whether participants met criteria for impairment for each test; aging and dementia studies have selected a score ≥1.5 SD worse than the norm on that test, corresponding to performance at or below the seventh percentile (13).”
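
The correspondence between those two cutoffs is easy to verify: under a normal distribution, a score 1.5 SD below the normative mean sits at roughly the 7th percentile.

```python
from statistics import NormalDist

# Percentile corresponding to a score 1.5 SD below the normative mean:
percentile = NormalDist().cdf(-1.5) * 100
print(round(percentile, 1))   # 6.7
```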

“During 2010–2013, 97 adults diagnosed with T1D and aged <18 years (age and duration 49 ± 7 and 41 ± 6 years, respectively; 51% female) and 138 similarly aged adults without T1D (age 49 ± 7 years; 55% female) completed extensive neuropsychological testing. Biomedical data on participants with T1D were collected periodically since 1986–1988.  […] The prevalence of clinically relevant cognitive impairment was five times higher among participants with than without T1D (28% vs. 5%; P < 0.0001), independent of education, age, or blood pressure. Effect sizes were large (Cohen d 0.6–0.9; P < 0.0001) for psychomotor speed and visuoconstruction tasks and were modest (d 0.3–0.6; P < 0.05) for measures of executive function. Among participants with T1D, prevalent cognitive impairment was related to 14-year average A1c >7.5% (58 mmol/mol) (odds ratio [OR] 3.0; P = 0.009), proliferative retinopathy (OR 2.8; P = 0.01), and distal symmetric polyneuropathy (OR 2.6; P = 0.03) measured 5 years earlier; higher BMI (OR 1.1; P = 0.03); and ankle-brachial index ≥1.3 (OR 4.2; P = 0.01) measured 20 years earlier, independent of education.”

“Having T1D was the only factor significantly associated with the between-group difference in clinically relevant cognitive impairment in our sample. Traditional risk factors for age-related cognitive impairment, in particular older age and high blood pressure (24), were not related to the between-group difference we observed. […] Similar to previous studies of younger adults with T1D (14,26), we found no relationship between the number of severe hypoglycemic episodes and cognitive impairment. Rather, we found that chronic hyperglycemia, via its associated vascular and metabolic changes, may have triggered structural changes in the brain that disrupt normal cognitive function.”

Just to be absolutely clear about these results: The type 1 diabetics they recruited in this study were on average not yet fifty years old, yet more than one in four of them were cognitively impaired to a clinically relevant degree. This is a huge effect. As they note later in the paper:

“Unlike previous reports of mild/modest cognitive dysfunction in young adults with T1D (1,2), we detected clinically relevant cognitive impairment in 28% of our middle-aged participants with T1D. This prevalence rate in our T1D cohort is comparable to the prevalence of mild cognitive impairment typically reported among community-dwelling adults aged 85 years and older (29%) (20).”

The type 1 diabetics included in the study had had diabetes for roughly a decade more than I have. And the number of cognitively impaired individuals in that sample corresponds roughly to what you find when you test random 85+ year-olds. Having type 1 diabetes is not good for your brain.

ii. Comment on Nunley et al. Clinically Relevant Cognitive Impairment in Middle-Aged Adults With Childhood-Onset Type 1 Diabetes.

This one is a short comment to the above paper, below I’ve quoted ‘the meat’ of the comment:

“While the […] study provides us with important insights regarding cognitive impairment in adults with type 1 diabetes, we regret that depression has not been taken into account. A systematic review and meta-analysis published in 2014 identified significant objective cognitive impairment in adults and adolescents with depression regarding executive functioning, memory, and attention relative to control subjects (2). Moreover, depression is two times more common in adults with diabetes compared with those without this condition, regardless of type of diabetes (3). There is even evidence that the co-occurrence of diabetes and depression leads to additional health risks such as increased mortality and dementia (3,4); this might well apply to cognitive impairment as well. Furthermore, in people with diabetes, the presence of depression has been associated with the development of diabetes complications, such as retinopathy, and higher HbA1c values (3). These are exactly the diabetes-specific correlates that Nunley et al. (1) found.”

“We believe it is a missed opportunity that Nunley et al. (1) mainly focused on biological variables, such as hyperglycemia and microvascular disease, and did not take into account an emotional disorder widely represented among people with diabetes and closely linked to cognitive impairment. Even though severe or chronic cases of depression are likely to have been excluded in the group without type 1 diabetes based on exclusion criteria (1), data on the presence of depression (either measured through a diagnostic interview or by using a validated screening questionnaire) could have helped to interpret the present findings. […] Determining the role of depression in the relationship between cognitive impairment and type 1 diabetes is of significant importance. Treatment of depression might improve cognitive impairment both directly by alleviating cognitive depression symptoms and indirectly by improving treatment nonadherence and glycemic control, consequently lowering the risk of developing complications.”

iii. Prevalence of Diabetes and Diabetic Nephropathy in a Large U.S. Commercially Insured Pediatric Population, 2002–2013.

“[W]e identified 96,171 pediatric patients with diabetes and 3,161 pediatric patients with diabetic nephropathy during 2002–2013. We estimated prevalence of pediatric diabetes overall, by diabetes type, age, and sex, and prevalence of pediatric diabetic nephropathy overall, by age, sex, and diabetes type.”

“Although type 1 diabetes accounts for a majority of childhood and adolescent diabetes, type 2 diabetes is becoming more common with the increasing rate of childhood obesity and it is estimated that up to 45% of all new patients with diabetes in this age-group have type 2 diabetes (1,2). With the rising prevalence of diabetes in children, a rise in diabetes-related complications, such as nephropathy, is anticipated. Moreover, data suggest that the development of clinical macrovascular complications, neuropathy, and nephropathy may be especially rapid among patients with young-onset type 2 diabetes (age of onset <40 years) (3–6). However, the natural history of young patients with type 2 diabetes and resulting complications has not been well studied.”

I’m always interested in the identification mechanisms applied in papers like this one, and I’m a little confused about the high number of patients without prescriptions (almost one-third of patients); I sort of assume these patients do take (/are given) prescription drugs, but get them from sources not available to the researchers (parents get prescriptions for the antidiabetic drugs, and the researchers don’t have access to these data? Something like this..) but this is a bit unclear. The mechanism they employ in the paper is not perfect (no mechanism is), but it probably works:

“Patients who had one or more prescription(s) for insulin and no prescriptions for another antidiabetes medication were classified as having type 1 diabetes, while those who filled prescriptions for noninsulin antidiabetes medications were considered to have type 2 diabetes.”

When covering limitations of the paper, they observe incidentally in this context that:

“Klingensmith et al. (31) recently reported that in the initial month after diagnosis of type 2 diabetes around 30% of patients were treated with insulin only. Thus, we may have misclassified a small proportion of type 2 cases as type 1 diabetes or vice versa. Despite this, we found that 9% of patients had onset of type 2 diabetes at age <10 years, consistent with the findings of Klingensmith et al. (8%), but higher than reported by the SEARCH for Diabetes in Youth study (<3%) (31,32).”

Some more observations from the paper:

“There were 149,223 patients aged <18 years at first diagnosis of diabetes in the CCE database from 2002 through 2013. […] Type 1 diabetes accounted for a majority of the pediatric patients with diabetes (79%). Among these, 53% were male and 53% were aged 12 to <18 years at onset, while among patients with type 2 diabetes, 60% were female and 79% were aged 12 to <18 years at onset.”

“The overall annual prevalence of all diabetes increased from 1.86 to 2.82 per 1,000 during years 2002–2013; it increased on average by 9.5% per year from 2002 to 2006 and slowly increased by 0.6% after that […] The prevalence of type 1 diabetes increased from 1.48 to 2.32 per 1,000 during the study period (average increase of 8.5% per year from 2002 to 2006 and 1.4% after that; both P values <0.05). The prevalence of type 2 diabetes increased from 0.38 to 0.67 per 1,000 during 2002 through 2006 (average increase of 13.3% per year; P < 0.05) and then dropped from 0.56 to 0.49 per 1,000 during 2007 through 2013 (average decrease of 2.7% per year; P < 0.05). […] Prevalence of any diabetes increased by age, with the highest prevalence in patients aged 12 to <18 years (ranging from 3.47 to 5.71 per 1,000 from 2002 through 2013). […] The annual prevalence of diabetes increased over the study period mainly because of increases in type 1 diabetes.”
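
As a sanity check on these figures, the compound average growth rate implied by the endpoint prevalences can be computed directly (the paper's segment-specific averages presumably come from a fitted trend model, so this simple calculation only gives the overall picture):

```python
def avg_annual_pct_change(p_start, p_end, years):
    """Compound average annual percent change between two prevalence values."""
    return ((p_end / p_start) ** (1 / years) - 1) * 100

# Overall diabetes prevalence: 1.86 to 2.82 per 1,000 over the 11 years 2002-2013.
print(round(avg_annual_pct_change(1.86, 2.82, 11), 1))   # 3.9
```

About 3.9% per year overall, which is consistent with the reported pattern of fast growth through 2006 (9.5% per year) and slow growth thereafter (0.6% per year).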

“Dabelea et al. (8) reported, based on data from the SEARCH for Diabetes in Youth study, that the annual prevalence of type 1 diabetes increased from 1.48 to 1.93 per 1,000 and from 0.34 to 0.46 per 1,000 for type 2 diabetes from 2001 to 2009 in U.S. youth. In our study, the annual prevalence of type 1 diabetes was 1.48 per 1,000 in 2002 and 2.10 per 1,000 in 2009, which is close to their reported prevalence.”

“We identified 3,161 diabetic nephropathy cases. Among these, 1,509 cases (47.7%) were of specific diabetic nephropathy and 2,253 (71.3%) were classified as probable cases. […] The annual prevalence of diabetic nephropathy in pediatric patients with diabetes increased from 1.16 to 3.44% between 2002 and 2013; it increased by on average 25.7% per year from 2002 to 2005 and slowly increased by 4.6% after that (both P values <0.05).”

Do note that the relationship between nephropathy prevalence and diabetes prevalence is complicated and that you cannot explain an increase in the prevalence of nephropathy over time simply by referring to an increased prevalence of diabetes during the same time period. This would in fact be a very wrong thing to do, in part but not only on account of the data structure employed in this study. One problem which is probably easy to understand is that if more children got diabetes but the same proportion of those new diabetics got nephropathy, the diabetes prevalence would go up but the diabetic nephropathy prevalence would remain fixed; when you calculate the diabetic nephropathy prevalence you implicitly condition on diabetes status. But this just scratches the surface of the issues you encounter when you try to link these variables, because the relationship between the two variables is complicated; there’s an age pattern to diabetes risk, with risk (incidence) increasing with age (up to a point, after which it falls – in most samples I’ve seen in the past peak incidence in pediatric populations is well below the age of 18). However diabetes prevalence increases monotonically with age as long as the age-specific death rate of diabetics is lower than the age-specific incidence, because diabetes is chronic, and then on top of that you have nephropathy-related variables, which display diabetes-related duration-dependence (meaning that although nephropathy risk is also increasing with age when you look at that variable in isolation, that age-risk relationship is confounded by diabetes duration – a type 1 diabetic at the age of 12 who’s had diabetes for 10 years has a higher risk of nephropathy than a 16-year-old who developed diabetes the year before).
When a newly diagnosed pediatric patient is included in the diabetes sample here, this will actually decrease the nephropathy prevalence in the short run, but not in the long run, assuming no changes in diabetes treatment outcomes over time. This is because the probability that that individual has diabetes-related kidney problems as a newly diagnosed child is zero, so he or she will only contribute to the denominator during the first years of illness (the situation in the middle-aged type 2 context is different; there you do sometimes have newly diagnosed patients who have already developed complications). This is one reason why it would be quite wrong to say that increased diabetes prevalence in this sample is the reason why diabetic nephropathy is increasing as well. Unless the time period you look at is very long (e.g. a setting where you follow all individuals with a diagnosis until the age of 18), the impact of increasing prevalence of one condition may well be expected to have a negative impact on the estimated risk of associated conditions, if those associated conditions display duration-dependence (which all major diabetes complications do). A second factor supporting a default assumption that increasing diabetes incidence should, if anything, lower the rate of diabetes-related complications is that treatment options have tended to increase over time; especially if you take a long view (look back 30–40 years), the increase in treatment options and improved medical technology have led to improved metabolic control and better outcomes.
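
The denominator effect described above is easy to demonstrate with a toy cohort model. All numbers below (cohort sizes and a 2%-per-year-of-duration nephropathy risk) are invented purely for illustration; the point is the direction of the effect, not the magnitudes:

```python
# Toy model: nephropathy prevalence among diabetics falls, in the short
# run, when incidence rises, because new cases dilute the denominator.
# All figures are invented for illustration.

def nephropathy_prob(duration_years):
    """Assumed duration-dependent risk: 2% per year of diabetes duration."""
    return min(0.02 * duration_years, 1.0)

def neph_prevalence(cohort_sizes):
    """cohort_sizes[d] = number of diabetics with duration d years."""
    cases = sum(n * nephropathy_prob(d) for d, n in enumerate(cohort_sizes))
    return cases / sum(cohort_sizes)

# Scenario A: constant incidence, 100 new cases per year, durations 0-9.
constant = [100] * 10
# Scenario B: incidence recently doubled, so short durations dominate.
rising = [200, 200, 200] + [100] * 7

print(f"constant incidence: {neph_prevalence(constant):.3f}")  # 0.090
print(f"rising incidence:   {neph_prevalence(rising):.3f}")    # 0.074
```

Here the complication prevalence among diabetics drops from 9.0% to about 7.4% even though the per-duration risk is completely unchanged – rising incidence alone pushed the measured nephropathy prevalence down.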

That both variables grew over time might be taken to indicate both that more children got diabetes and that a larger proportion of this increased number of children with diabetes developed kidney problems, but this stuff is a lot more complicated than it might look. It’s in particular important to keep in mind that, say, the 2005 sample and the 2010 sample do not include the same individuals, although there’ll of course be some overlap; in age-stratified samples like this you always have some level of implicit continuous replacement, with newly diagnosed patients entering and replacing the 18-year-olds who leave the sample. As long as prevalence is constant over time, associated outcome variables may be reasonably easy to interpret, but when you have dynamic samples as well as increasing prevalence over time it gets difficult to say much with any degree of certainty unless you crunch the numbers in a lot of detail (and it might be difficult even if you do that). A factor I didn’t mention above but which is of course also relevant is that you need to be careful about how to interpret prevalence rates when you look at complications with high mortality rates (and late-stage diabetic nephropathy is indeed a complication with high mortality); in such a situation improvements in treatment outcomes may have large effects on prevalence rates but no effect on incidence. Increased prevalence is not always bad news; sometimes it is very good news. Gleevec substantially increased the prevalence of CML – because treated patients now survive for many years with the disease.

In terms of the prevalence–outcomes (/complication risk) connection, there are also in my opinion reasons to assume that there may be multiple causal pathways between prevalence and outcomes. For example, a very low prevalence of a condition in a given area may mean that fewer specialists are educated to take care of these patients than would be the case in an area with a higher prevalence, and this may translate into a more poorly developed care infrastructure. Greatly increasing prevalence may on the other hand lead to a lower level of care for all patients with the illness, not just the newly diagnosed ones, due to binding budget constraints and care rationing. And why might prevalence change in the first place? Might such changes not sometimes reflect changes in diagnostic practices, rather than changes in the true prevalence? If that’s the case, you might not be comparing apples to apples when you’re comparing the evolving complication rates. There are in my opinion many reasons to believe that the relationship between chronic conditions and the complication rates of these conditions is far from simple to model.

All this said, kidney problems in children with diabetes are still rare compared to the numbers you see when you look at adult samples with longer diabetes duration. It’s also worth distinguishing between microalbuminuria and overt nephropathy; children rarely proceed to develop diabetes-related kidney failure, although poor metabolic control in childhood may mean that they develop this complication later, in early adulthood. As they note in the paper:

“It has been reported that overt diabetic nephropathy and kidney failure caused by either type 1 or type 2 diabetes are uncommon during childhood or adolescence (24). In this study, the annual prevalence of diabetic nephropathy for all cases ranged from 1.16 to 3.44% in pediatric patients with diabetes and was extremely low in the whole pediatric population (range 2.15 to 9.70 per 100,000), confirming that diabetic nephropathy is a very uncommon condition in youth aged <18 years. We observed that the prevalence of diabetic nephropathy increased in both specific and unspecific cases before 2006, with a leveling off of the specific nephropathy cases after 2005, while the unspecific cases continued to increase.”

iv. Adherence to Oral Glucose-Lowering Therapies and Associations With 1-Year HbA1c: A Retrospective Cohort Analysis in a Large Primary Care Database.

“Between a third and a half of medicines prescribed for type 2 diabetes (T2DM), a condition in which multiple medications are used to control cardiovascular risk factors and blood glucose (1,2), are not taken as prescribed (3–6). However, estimates vary widely depending on the population being studied and the way in which adherence to recommended treatment is defined.”

“A number of previous studies have used retrospective databases of electronic health records to examine factors that might predict adherence. A recent large cohort database examined overall adherence to oral therapy for T2DM, taking into account changes of therapy. It concluded that overall adherence was 69%, with individuals newly started on treatment being significantly less likely to adhere (19).”

“The impact of continuing to take glucose-lowering medicines intermittently, but not as recommended, is unknown. Medication possession (expressed as a ratio of actual possession to expected possession), derived from prescribing records, has been identified as a valid adherence measure for people with diabetes (7). Previous studies have been limited to small populations in managed-care systems in the U.S. and focused on metformin and sulfonylurea oral glucose-lowering treatments (8,9). Further studies need to be carried out in larger groups of people that are more representative of the general population.

The Clinical Practice Research Database (CPRD) is a long established repository of routine clinical data from more than 13 million patients registered with primary care services in England. […] The Genetics of Diabetes and Audit Research Tayside Study (GoDARTS) database is derived from integrated health records in Scotland with primary care, pharmacy, and hospital data on 9,400 patients with diabetes. […] We conducted a retrospective cohort study using [these databases] to examine the prevalence of nonadherence to treatment for type 2 diabetes and investigate its potential impact on HbA1c reduction stratified by type of glucose-lowering medication.”
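
For readers unfamiliar with the adherence measure used in these studies: the medication possession ratio is days of medication supplied divided by days in the observation window. Below is a minimal sketch with invented prescription data; a real analysis would also have to handle overlapping prescriptions, therapy switches and hospital stays:

```python
from datetime import date

def medication_possession_ratio(prescriptions, start, end):
    """MPR: days supplied / days observed.
    prescriptions: list of (issue_date, days_supplied) tuples."""
    period_days = (end - start).days
    supplied = sum(days for issued, days in prescriptions
                   if start <= issued <= end)
    return min(supplied / period_days, 1.0)  # cap possession at 100%

# Three 28-day prescriptions collected over a 120-day window (made up):
rx = [(date(2016, 1, 1), 28), (date(2016, 2, 5), 28), (date(2016, 3, 20), 28)]
mpr = medication_possession_ratio(rx, date(2016, 1, 1), date(2016, 4, 30))
print(f"MPR = {mpr:.2f}, adherent (>=80%)? {mpr >= 0.8}")
# prints: MPR = 0.70, adherent (>=80%)? False
```

This patient kept collecting prescriptions throughout the year-fraction observed, yet only possessed medication for 70% of it – exactly the kind of intermittent use the paper links to smaller HbA1c reductions.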

“In CPRD and GoDARTS, 13% and 15% of patients, respectively, were nonadherent. Proportions of nonadherent patients varied by the oral glucose-lowering treatment prescribed (range 8.6% [thiazolidinedione] to 18.8% [metformin]). Nonadherent, compared with adherent, patients had a smaller HbA1c reduction (0.4% [4.4 mmol/mol] and 0.46% [5.0 mmol/mol] for CPRD and GoDARTS, respectively). Difference in HbA1c response for adherent compared with nonadherent patients varied by drug (range 0.38% [4.1 mmol/mol] to 0.75% [8.2 mmol/mol] lower in adherent group). Decreasing levels of adherence were consistently associated with a smaller reduction in HbA1c.”

“These findings show an association between adherence to oral glucose-lowering treatment, measured by the proportion of medication obtained on prescription over 1 year, and the corresponding decrement in HbA1c, in a population of patients newly starting treatment and continuing to collect prescriptions. The association is consistent across all commonly used oral glucose-lowering therapies, and the findings are consistent between the two data sets examined, CPRD and GoDARTS. Nonadherent patients, taking on average <80% of the intended medication, had about half the expected reduction in HbA1c. […] Reduced medication adherence for commonly used glucose-lowering therapies among patients persisting with treatment is associated with smaller HbA1c reductions compared with those taking treatment as recommended. Differences observed in HbA1c responses to glucose-lowering treatments may be explained in part by their intermittent use.”

“Low medication adherence is related to increased mortality (20). The mean difference in HbA1c between patients with MPR <80% and ≥80% is between 0.37% and 0.55% (4 mmol/mol and 6 mmol/mol), equivalent to up to a 10% reduction in death or an 18% reduction in diabetes complications (21).”

v. Health Care Transition in Young Adults With Type 1 Diabetes: Perspectives of Adult Endocrinologists in the U.S.

“Empiric data are limited on best practices in transition care, especially in the U.S. (10,13–16). Prior research, largely from the patient perspective, has highlighted challenges in the transition process, including gaps in care (13,17–19); suboptimal pediatric transition preparation (13,20); increased post-transition hospitalizations (21); and patient dissatisfaction with the transition experience (13,17–19). […] Young adults with type 1 diabetes transitioning from pediatric to adult care are at risk for adverse outcomes. Our objective was to describe experiences, resources, and barriers reported by a national sample of adult endocrinologists receiving and caring for young adults with type 1 diabetes.”

“We received responses from 536 of 4,214 endocrinologists (response rate 13%); 418 surveys met the eligibility criteria. Respondents (57% male, 79% Caucasian) represented 47 states; 64% had been practicing >10 years and 42% worked at an academic center. Only 36% of respondents reported often/always reviewing pediatric records and 11% reported receiving summaries for transitioning young adults with type 1 diabetes, although >70% felt that these activities were important for patient care.”

“A number of studies document deficiencies in provider hand-offs across other chronic conditions and point to the broader relevance of our findings. For example, in two studies of inflammatory bowel disease, adult gastroenterologists reported inadequacies in young adult transition preparation (31) and infrequent receipt of medical histories from pediatric providers (32). In a study of adult specialists caring for young adults with a variety of chronic diseases (33), more than half reported that they had no contact with the pediatric specialists.

Importantly, more than half of the endocrinologists in our study reported a need for increased access to mental health referrals for young adult patients with type 1 diabetes, particularly in nonacademic settings. Report of barriers to care was highest for patient scenarios involving mental health issues, and endocrinologists without easy access to mental health referrals were significantly more likely to report barriers to diabetes management for young adults with psychiatric comorbidities such as depression, substance abuse, and eating disorders.”

“Prior research (34,35) has uncovered the lack of mental health resources in diabetes care. In the large cross-national Diabetes Attitudes, Wishes and Needs (DAWN) study (36) […] diabetes providers often reported not having the resources to manage mental health problems; half of specialist diabetes physicians felt unable to provide psychiatric support for patients and one-third did not have ready access to outside expertise in emotional or psychiatric matters. Our results, which resonate with the DAWN findings, are particularly concerning in light of the vulnerability of young adults with type 1 diabetes for adverse medical and mental health outcomes (4,34,37,38). […] In a recent report from the Mental Health Issues of Diabetes conference (35), which focused on type 1 diabetes, a major observation included the lack of trained mental health professionals, both in academic centers and the community, who are knowledgeable about the mental health issues germane to diabetes.”

August 3, 2017 Posted by | Diabetes, Epidemiology, Medicine, Nephrology, Neurology, Pharmacology, Psychiatry, Psychology, Statistics, Studies | Leave a comment

Beyond Significance Testing (III)

There are many ways to misinterpret significance tests, and this book spends quite a bit of time and effort on these kinds of issues. I decided to include in this post quite a few quotes from chapter 4 of the book, which deals with these topics in some detail. I also included some notes on effect sizes.

“[P] < .05 means that the likelihood of the data or results even more extreme given random sampling under the null hypothesis is < .05, assuming that all distributional requirements of the test statistic are satisfied and there are no other sources of error variance. […] the odds-against-chance fallacy […] [is] the false belief that p indicates the probability that a result happened by sampling error; thus, p < .05 says that there is less than a 5% likelihood that a particular finding is due to chance. There is a related misconception I call the filter myth, which says that p values sort results into two categories, those that are a result of “chance” (H0 not rejected) and others that are due to “real” effects (H0 rejected). These beliefs are wrong […] When p is calculated, it is already assumed that H0 is true, so the probability that sampling error is the only explanation is already taken to be 1.00. It is thus illogical to view p as measuring the likelihood of sampling error. […] There is no such thing as a statistical technique that determines the probability that various causal factors, including sampling error, acted on a particular result.

Most psychology students and professors may endorse the local Type I error fallacy [which is] the mistaken belief that p < .05 given α = .05 means that the likelihood that the decision just taken to reject H0 is a type I error is less than 5%. […] p values from statistical tests are conditional probabilities of data, so they do not apply to any specific decision to reject H0. This is because any particular decision to do so is either right or wrong, so no probability is associated with it (other than 0 or 1.0). Only with sufficient replication could one determine whether a decision to reject H0 in a particular study was correct. […] the valid research hypothesis fallacy […] refers to the false belief that the probability that H1 is true is > .95, given p < .05. The complement of p is a probability, but 1 – p is just the probability of getting a result even less extreme under H0 than the one actually found. This fallacy is endorsed by most psychology students and professors”.
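
A quick simulation makes the odds-against-chance and inverse probability fallacies vivid: the share of “significant” results that come from true null hypotheses depends on base rates and power, not on α. All parameter values below (90% true nulls, d = 0.5, n = 25 per group) are invented for illustration:

```python
import random
from math import sqrt, erfc

random.seed(1)

def two_sample_p(n, delta):
    """Two-sided p from a z-test on two N(., 1) samples of size n each."""
    m0 = sum(random.gauss(0, 1) for _ in range(n)) / n
    m1 = sum(random.gauss(delta, 1) for _ in range(n)) / n
    z = (m1 - m0) / sqrt(2 / n)
    return erfc(abs(z) / sqrt(2))

trials = 20_000
null_sig = effect_sig = 0
for _ in range(trials):
    if random.random() < 0.9:                      # true null hypothesis
        null_sig += two_sample_p(25, 0.0) < 0.05
    else:                                          # real effect, d = 0.5
        effect_sig += two_sample_p(25, 0.5) < 0.05

share_null = null_sig / (null_sig + effect_sig)
print(f"share of 'significant' results from true nulls: {share_null:.2f}")
```

With these inputs roughly half of all rejections come from true nulls, even though every individual test used α = .05 – p < .05 emphatically does not mean there is a 5% chance the finding is due to chance.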

“[S]everal different false conclusions may be reached after deciding to reject or fail to reject H0. […] the magnitude fallacy is the false belief that low p values indicate large effects. […] p values are confounded measures of effect size and sample size […]. Thus, effects of trivial magnitude need only a large enough sample to be statistically significant. […] the zero fallacy […] is the mistaken belief that the failure to reject a nil hypothesis means that the population effect size is zero. Maybe it is, but you cannot tell based on a result in one sample, especially if power is low. […] The equivalence fallacy occurs when the failure to reject H0: µ1 = µ2 is interpreted as saying that the populations are equivalent. This is wrong because even if µ1 = µ2, distributions can differ in other ways, such as variability or distribution shape.”

“[T]he reification fallacy is the faulty belief that failure to replicate a result is the failure to make the same decision about H0 across studies […]. In this view, a result is not considered replicated if H0 is rejected in the first study but not in the second study. This sophism ignores sample size, effect size, and power across different studies. […] The sanctification fallacy refers to dichotomous thinking about continuous p values. […] Differences between results that are “significant” versus “not significant” by close margins, such as p = .03 versus p = .07 when α = .05, are themselves often not statistically significant. That is, relatively large changes in p can correspond to small, nonsignificant changes in the underlying variable (Gelman & Stern, 2006). […] Classical parametric statistical tests are not robust against outliers or violations of distributional assumptions, especially in small, unrepresentative samples. But many researchers believe just the opposite, which is the robustness fallacy. […] most researchers do not provide evidence about whether distributional or other assumptions are met”.

“Many [of the above] fallacies involve wishful thinking about things that researchers really want to know. These include the probability that H0 or H1 is true, the likelihood of replication, and the chance that a particular decision to reject H0 is wrong. Alas, statistical tests tell us only the conditional probability of the data. […] But there is [however] a method that can tell us what we want to know. It is not a statistical technique; rather, it is good, old-fashioned replication, which is also the best way to deal with the problem of sampling error. […] Statistical significance provides even in the best case nothing more than low-level support for the existence of an effect, relation, or difference. That best case occurs when researchers estimate a priori power, specify the correct construct definitions and operationalizations, work with random or at least representative samples, analyze highly reliable scores in distributions that respect test assumptions, control other major sources of imprecision besides sampling error, and test plausible null hypotheses. In this idyllic scenario, p values from statistical tests may be reasonably accurate and potentially meaningful, if they are not misinterpreted. […] The capability of significance tests to address the dichotomous question of whether effects, relations, or differences are greater than expected levels of sampling error may be useful in some new research areas. Due to the many limitations of statistical tests, this period of usefulness should be brief. Given evidence that an effect exists, the next steps should involve estimation of its magnitude and evaluation of its substantive significance, both of which are beyond what significance testing can tell us. […] It should be a hallmark of a maturing research area that significance testing is not the primary inference method.”

“[An] effect size [is] a quantitative reflection of the magnitude of some phenomenon used for the sake of addressing a specific research question. In this sense, an effect size is a statistic (in samples) or parameter (in populations) with a purpose, that of quantifying a phenomenon of interest. More specific definitions may depend on study design. […] cause size refers to the independent variable and specifically to the amount of change in it that produces a given effect on the dependent variable. A related idea is that of causal efficacy, or the ratio of effect size to the size of its cause. The greater the causal efficacy, the more that a given change on an independent variable results in proportionally bigger changes on the dependent variable. The idea of cause size is most relevant when the factor is experimental and its levels are quantitative. […] An effect size measure […] is a named expression that maps data, statistics, or parameters onto a quantity that represents the magnitude of the phenomenon of interest. This expression connects dimensions or generalized units that are abstractions of variables of interest with a specific operationalization of those units.”

“A good effect size measure has the [following properties:] […] 1. Its scale (metric) should be appropriate for the research question. […] 2. It should be independent of sample size. […] 3. As a point estimate, an effect size should have good statistical properties; that is, it should be unbiased, consistent […], and efficient […]. 4. The effect size [should be] reported with a confidence interval. […] Not all effect size measures […] have all the properties just listed. But it is possible to report multiple effect sizes that address the same question in order to improve the communication of the results.” 

“Examples of outcomes with meaningful metrics include salaries in dollars and post-treatment survival time in years. Means or contrasts for variables with meaningful units are unstandardized effect sizes that can be directly interpreted. […] In medical research, physical measurements with meaningful metrics are often available. […] But in psychological research there are typically no “natural” units for abstract, nonphysical constructs such as intelligence, scholastic achievement, or self-concept. […] Therefore, metrics in psychological research are often arbitrary instead of meaningful. An example is the total score for a set of true-false items. Because responses can be coded with any two different numbers, the total is arbitrary. Standard scores such as percentiles and normal deviates are arbitrary, too […] Standardized effect sizes can be computed for results expressed in arbitrary metrics. Such effect sizes can also be directly compared across studies where outcomes have different scales. This is because standardized effect sizes are based on units that have a common meaning regardless of the original metric.”

“1. It is better to report unstandardized effect sizes for outcomes with meaningful metrics. This is because the original scale is lost when results are standardized. 2. Unstandardized effect sizes are best for comparing results across different samples measured on the same outcomes. […] 3. Standardized effect sizes are better for comparing conceptually similar results based on different units of measure. […] 4. Standardized effect sizes are affected by the corresponding unstandardized effect sizes plus characteristics of the study, including its design […], whether factors are fixed or random, the extent of error variance, and sample base rates. This means that standardized effect sizes are less directly comparable over studies that differ in their designs or samples. […] 5. There is no such thing as T-shirt effect sizes (Lenth, 2006–2009) that classify standardized effect sizes as “small,” “medium,” or “large” and apply over all research areas. This is because what is considered a large effect in one area may be seen as small or trivial in another. […] 6. There is usually no way to directly translate standardized effect sizes into implications for substantive significance. […] It is standardized effect sizes from sets of related studies that are analyzed in most meta-analyses.”
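
To make the standardization point concrete, here is Cohen’s d, one common standardized mean difference, applied to the same invented result expressed on two different arbitrary scales. The raw mean differences differ by a factor of ten, but d is identical:

```python
from statistics import mean, stdev

def cohens_d(group1, group2):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * stdev(group1) ** 2 +
                  (n2 - 1) * stdev(group2) ** 2) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / pooled_var ** 0.5

# The same (made-up) result on two arbitrary scales, one ten times the other:
scale_a = ([14, 16, 15, 17, 18, 16], [12, 13, 14, 12, 15, 13])
scale_b = ([140, 160, 150, 170, 180, 160], [120, 130, 140, 120, 150, 130])

print(round(cohens_d(*scale_a), 3))   # same standardized effect size
print(round(cohens_d(*scale_b), 3))   # despite different raw units
```

Both calls return the same d, which is exactly why standardized measures travel across studies with different metrics while unstandardized ones do not.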

July 16, 2017 Posted by | Books, Psychology, Statistics | Leave a comment

Beyond Significance Testing (II)

I have added some more quotes and observations from the book below.

“The least squares estimators M and s2 are not robust against the effects of extreme scores. […] Conventional methods to construct confidence intervals rely on sample standard deviations to estimate standard errors. These methods also rely on critical values in central test distributions, such as t and z, that assume normality or homoscedasticity […] Such distributional assumptions are not always plausible. […] One option to deal with outliers is to apply transformations, which convert original scores with a mathematical operation to new ones that may be more normally distributed. The effect of applying a monotonic transformation is to compress one part of the distribution more than another, thereby changing its shape but not the rank order of the scores. […] It can be difficult to find a transformation that works in a particular data set. Some distributions can be so severely nonnormal that basically no transformation will work. […] An alternative that also deals with departures from distributional assumptions is robust estimation. Robust (resistant) estimators are generally less affected than least squares estimators by outliers or nonnormality.”

“An estimator’s quantitative robustness can be described by its finite-sample breakdown point (BP), or the smallest proportion of scores that when made arbitrarily very large or small renders the statistic meaningless. The lower the value of BP, the less robust the estimator. For both M and s2, BP = 0, the lowest possible value. This is because the value of either statistic can be distorted by a single outlier, and the ratio 1/N approaches zero as sample size increases. In contrast, BP = .50 for the median because its value is not distorted by arbitrarily extreme scores unless they make up at least half the sample. But the median is not an optimal estimator because its value is determined by a single score, the one at the 50th percentile. In this sense, all the other scores are discarded by the median. A compromise between the sample mean and the median is the trimmed mean. A trimmed mean Mtr is calculated by (a) ordering the scores from lowest to highest, (b) deleting the same proportion of the most extreme scores from each tail of the distribution, and then (c) calculating the average of the scores that remain. […] A common practice is to trim 20% of the scores from each tail of the distribution when calculating trimmed estimators. This proportion tends to maintain the robustness of trimmed means while minimizing their standard errors when sampling from symmetrical distributions […] For 20% trimmed means, BP = .20, which says they are robust against arbitrarily extreme scores unless such outliers make up at least 20% of the sample.”
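
The trimmed-mean recipe quoted above (order the scores, delete a proportion from each tail, average the rest) takes only a few lines of code; the example data are invented:

```python
def trimmed_mean(scores, trim=0.20):
    """Mean after deleting the proportion `trim` of the most extreme
    scores from each tail of the distribution."""
    ordered = sorted(scores)
    k = int(len(ordered) * trim)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

data = [2, 3, 3, 4, 4, 5, 5, 6, 6, 300]   # one wild outlier
print(sum(data) / len(data))   # ordinary mean, dragged up to 33.8
print(trimmed_mean(data))      # 20% trimmed mean: 4.5, outlier ignored
```

A single outlier of 300 moves the ordinary mean to 33.8 while the 20% trimmed mean stays at 4.5, illustrating the BP = 0 versus BP = .20 contrast from the quote.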

“The standard H0 is both a point hypothesis and a nil hypothesis. A point hypothesis specifies the numerical value of a parameter or the difference between two or more parameters, and a nil hypothesis states that this value is zero. The latter is usually a prediction that an effect, difference, or association is zero. […] Nil hypotheses as default explanations may be fine in new research areas when it is unknown whether effects exist at all. But they are less suitable in established areas when it is known that some effect is probably not zero. […] Nil hypotheses are tested much more often than non-nil hypotheses even when the former are implausible. […] If a nil hypothesis is implausible, estimated probabilities of data will be too low. This means that risk for Type I error is basically zero and a Type II error is the only possible kind when H0 is known in advance to be false.”

“Too many researchers treat the conventional levels of α, either .05 or .01, as golden rules. If other levels of α are specified, they tend to be even lower […]. Sanctification of .05 as the highest “acceptable” level is problematic. […] Instead of blindly accepting either .05 or .01, one does better to […] [s]pecify a level of α that reflects the desired relative seriousness (DRS) of Type I error versus Type II error. […] researchers should not rely on a mechanical ritual (i.e., automatically specify .05 or .01) to control risk for Type I error that ignores the consequences of Type II error.”

“Although p and α are derived in the same theoretical sampling distribution, p does not estimate the conditional probability of a Type I error […]. This is because p is based on a range of results under H0, but α has nothing to do with actual results and is supposed to be specified before any data are collected. Confusion between p and α is widespread […] To differentiate the two, Gigerenzer (1993) referred to p as the exact level of significance. If p = .032 and α = .05, H0 is rejected at the .05 level, but .032 is not the long-run probability of Type I error, which is .05 for this example. The exact level of significance is the conditional probability of the data (or any result even more extreme) assuming H0 is true, given all other assumptions about sampling, distributions, and scores. […] Because p values are estimated assuming that H0 is true, they do not somehow measure the likelihood that H0 is correct. […] The false belief that p is the probability that H0 is true, or the inverse probability error […] is widespread.”

“Probabilities from significance tests say little about effect size. This is because essentially any test statistic (TS) can be expressed as the product TS = ES × f(N) […] where ES is an effect size and f(N) is a function of sample size. This equation explains how it is possible that (a) trivial effects can be statistically significant in large samples or (b) large effects may not be statistically significant in small samples. So p is a confounded measure of effect size and sample size.”
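
The TS = ES × f(N) decomposition is easy to see numerically. For a two-sample comparison with standardized effect size d, the test statistic is approximately d·√(n/2); the sample sizes below are invented to show both directions of the confounding:

```python
from math import sqrt, erfc

def approx_p(d, n_per_group):
    """Two-sided p for a two-sample comparison, using the large-sample
    normal approximation TS = d * sqrt(n/2), i.e. TS = ES x f(N)."""
    ts = d * sqrt(n_per_group / 2)
    return erfc(abs(ts) / sqrt(2))

# A trivial effect (d = 0.05) becomes "significant" with enough cases:
for n in (100, 1_000, 10_000):
    print(n, round(approx_p(0.05, n), 4))
# ...while a large effect (d = 0.8) misses p < .05 in a tiny sample:
print(8, round(approx_p(0.8, 8), 4))
```

The same d = 0.05 is nowhere near significance at n = 100 per group but comfortably below .05 at n = 10,000, while d = 0.8 with 8 cases per group is not significant at all.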

“Power is the probability of getting statistical significance over many random replications when H1 is true. It varies directly with sample size and the magnitude of the population effect size. […] This combination leads to the greatest power: a large population effect size, a large sample, a higher level of α […], a within-subjects design, a parametric test rather than a nonparametric test (e.g., t instead of Mann–Whitney), and very reliable scores. […] Power .80 is generally desirable, but an even higher standard may be needed if consequences of Type II error are severe. […] Reviews from the 1970s and 1980s indicated that the typical power of behavioral science research is only about .50 […] and there is little evidence that power is any higher in more recent studies […] Ellis (2010) estimated that < 10% of studies have samples sufficiently large to detect smaller population effect sizes. Increasing sample size would address low power, but the number of additional cases necessary to reach even nominal power when studying smaller effects may be so great as to be practically impossible […] Too few researchers, generally < 20% (Osborne, 2008), bother to report prospective power despite admonitions to do so […] The concept of power does not stand without significance testing. As statistical tests play a smaller role in the analysis, the relevance of power also declines. If significance tests are not used, power is irrelevant. Cumming (2012) described an alternative called precision for research planning, where the researcher specifies a target margin of error for estimating the parameter of interest. […] The advantage over power analysis is that researchers must consider both effect size and precision in study planning.”
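
The a priori power analysis the quote recommends reduces, in the simplest two-group case, to a short formula. The sketch below uses the large-sample normal approximation, so the exact t-based answers are slightly larger:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sample comparison with
    standardized population effect size d (normal approximation)."""
    z = NormalDist().inv_cdf
    return ceil(2 * ((z(1 - alpha / 2) + z(power)) / d) ** 2)

# Required sample sizes grow fast as the population effect shrinks:
for d in (0.8, 0.5, 0.2):
    print(f"d = {d}: n = {n_per_group(d)} per group")
```

With the conventional α = .05 and power = .80 this gives roughly 25, 63 and 393 cases per group for d = 0.8, 0.5 and 0.2 respectively, which helps explain why so few studies are adequately powered to detect small effects.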

“Classical nonparametric tests are alternatives to the parametric t and F tests for means (e.g., the Mann–Whitney test is the nonparametric analogue to the t test). Nonparametric tests generally work by converting the original scores to ranks. They also make fewer assumptions about the distributions of those ranks than do parametric tests applied to the original scores. Nonparametric tests date to the 1950s–1960s, and they share some limitations. One is that they are not generally robust against heteroscedasticity, and another is that their application is typically limited to single-factor designs […] Modern robust tests are an alternative. They are generally more flexible than nonparametric tests and can be applied in designs with multiple factors. […] At the end of the day, robust statistical tests are subject to many of the same limitations as other statistical tests. For example, they assume random sampling albeit from population distributions that may be nonnormal or heteroscedastic; they also assume that sampling error is the only source of error variance. Alternative tests, such as the Welch–James and Yuen–Welch versions of a robust t test, do not always yield the same p value for the same data, and it is not always clear which alternative is best (Wilcox, 2003).”

July 11, 2017 Posted by | Books, Psychology, Statistics | Leave a comment

Beyond Significance Testing (I)

“This book introduces readers to the principles and practice of statistics reform in the behavioral sciences. It (a) reviews the now even larger literature about shortcomings of significance testing; (b) explains why these criticisms have sufficient merit to justify major changes in the ways researchers analyze their data and report the results; (c) helps readers acquire new skills concerning interval estimation and effect size estimation; and (d) reviews alternative ways to test hypotheses, including Bayesian estimation. […] I assume that the reader has had undergraduate courses in statistics that covered at least the basics of regression and factorial analysis of variance. […] This book is suitable as a textbook for an introductory course in behavioral science statistics at the graduate level.”

I’m currently reading this book. I have so far read 8 out of the 10 chapters included, and I’m currently sort of hovering between a 3- and 4-star goodreads rating; some parts of the book are really great, but there are also a few aspects I don’t like. Some parts of the coverage are rather technical, and I’m still debating to what extent I should cover the technical stuff in detail later here on the blog; there are quite a few equations included in the book, and I find it annoying to cover math using the wordpress format of this blog. For now I’ll start out with a reasonably non-technical post with some quotes and key ideas from the first parts of the book.

“In studies of intervention outcomes, a statistically significant difference between treated and untreated cases […] has nothing to do with whether treatment leads to any tangible benefits in the real world. In the context of diagnostic criteria, clinical significance concerns whether treated cases can no longer be distinguished from control cases not meeting the same criteria. For example, does treatment typically prompt a return to normal levels of functioning? A treatment effect can be statistically significant yet trivial in terms of its clinical significance, and clinically meaningful results are not always statistically significant. Accordingly, the proper response to claims of statistical significance in any context should be “so what?” — or, more pointedly, “who cares?” — without more information.”

“There are free computer tools for estimating power, but most researchers — probably at least 80% (e.g., Ellis, 2010) — ignore the power of their analyses. […] Ignoring power is regrettable because the median power of published nonexperimental studies is only about .50 (e.g., Maxwell, 2004). This implies a 50% chance of correctly rejecting the null hypothesis based on the data. In this case the researcher may as well not collect any data but instead just toss a coin to decide whether or not to reject the null hypothesis. […] A consequence of low power is that the research literature is often difficult to interpret. Specifically, if there is a real effect but power is only .50, about half the studies will yield statistically significant results and the rest will yield no statistically significant findings. If all these studies were somehow published, the number of positive and negative results would be roughly equal. In an old-fashioned, narrative review, the research literature would appear to be ambiguous, given this balance. It may be concluded that “more research is needed,” but any new results will just reinforce the original ambiguity, if power remains low.”

“Statistical tests of a treatment effect that is actually clinically significant may fail to reject the null hypothesis of no difference when power is low. If the researcher in this case ignored whether the observed effect size is clinically significant, a potentially beneficial treatment may be overlooked. This is exactly what was found by Freiman, Chalmers, Smith, and Kuebler (1978), who reviewed 71 randomized clinical trials of mainly heart- and cancer-related treatments with “negative” results (i.e., not statistically significant). They found that if the authors of 50 of the 71 trials had considered the power of their tests along with the observed effect sizes, those authors should have concluded just the opposite, or that the treatments resulted in clinically meaningful improvements.”

“Even if researchers avoided the kinds of mistakes just described, there are grounds to suspect that p values from statistical tests are simply incorrect in most studies: 1. They (p values) are estimated in theoretical sampling distributions that assume random sampling from known populations. Very few samples in behavioral research are random samples. Instead, most are convenience samples collected under conditions that have little resemblance to true random sampling. […] 2. Results of more quantitative reviews suggest that, due to assumptions violations, there are few actual data sets in which significance testing gives accurate results […] 3. Probabilities from statistical tests (p values) generally assume that all other sources of error besides sampling error are nil. This includes measurement error […] Other sources of error arise from failure to control for extraneous sources of variance or from flawed operational definitions of hypothetical constructs. It is absurd to assume in most studies that there is no error variance besides sampling error. Instead it is more practical to expect that sampling error makes up the small part of all possible kinds of error when the number of cases is reasonably large (Ziliak & McCloskey, 2008).”

“The p values from statistical tests do not tell researchers what they want to know, which often concerns whether the data support a particular hypothesis. This is because p values merely estimate the conditional probability of the data under a statistical hypothesis — the null hypothesis — that in most studies is an implausible, straw man argument. In fact, p values do not directly “test” any hypothesis at all, but they are often misinterpreted as though they describe hypotheses instead of data. Although p values ultimately provide a yes-or-no answer (i.e., reject or fail to reject the null hypothesis), the question — p < α?, where α is the criterion level of statistical significance, usually .05 or .01 — is typically uninteresting. The yes-or-no answer to this question says nothing about scientific relevance, clinical significance, or effect size. […] determining clinical significance is not just a matter of statistics; it also requires strong knowledge about the subject matter.”
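A simulation makes the gap between "probability of the data under H0" and "probability of H0" vivid. Suppose (an assumption of mine, not the book's) that 80% of the hypotheses a field tests are true nulls and that the studies of real effects are underpowered; then even among results reaching p < .05, a large share are false positives:

```python
# p estimates P(data | H0), not P(H0 | data). With mostly-true nulls and
# low power, P(H0 true | p < .05) is far above .05. All settings are
# illustrative choices of mine.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n, d, studies = 20, 0.5, 2000
false_pos = true_pos = 0
for i in range(studies):
    null_is_true = (i % 5 != 0)              # 80% of hypotheses are true nulls
    shift = 0.0 if null_is_true else d
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(shift, 1.0, n)
    if stats.ttest_ind(a, b).pvalue < 0.05:
        if null_is_true:
            false_pos += 1
        else:
            true_pos += 1
share_false = false_pos / (false_pos + true_pos)
print(f"P(H0 true | p < .05) = {share_false:.2f}")
```

Under these settings roughly a third of the "significant" results come from true nulls — nowhere near the .05 that a naive reading of the p value suggests.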

“[M]any null hypotheses have little if any scientific value. For example, Anderson et al. (2000) reviewed null hypotheses tested in several hundred empirical studies published from 1978 to 1998 in two environmental sciences journals. They found many implausible null hypotheses that specified things such as equal survival probabilities for juvenile and adult members of a species or that growth rates did not differ across species, among other assumptions known to be false before collecting data. I am unaware of a similar survey of null hypotheses in the behavioral sciences, but I would be surprised if the results would be very different.”

“Hoekstra, Finch, Kiers, and Johnson (2006) examined a total of 266 articles published in Psychonomic Bulletin & Review during 2002–2004. Results of significance tests were reported in about 97% of the articles, but confidence intervals were reported in only about 6%. Sadly, p values were misinterpreted in about 60% of surveyed articles. Fidler, Burgman, Cumming, Buttrose, and Thomason (2006) sampled 200 articles published in two different biology journals. Results of significance testing were reported in 92% of articles published during 2001–2002, but this rate dropped to 78% in 2005. There were also corresponding increases in the reporting of confidence intervals, but power was estimated in only 8% and p values were misinterpreted in 63%. […] Sun, Pan, and Wang (2010) reviewed a total of 1,243 works published in 14 different psychology and education journals during 2005–2007. The percentage of articles reporting effect sizes was 49%, and 57% of these authors interpreted their effect sizes.”

“It is a myth that the larger the sample, the more closely it approximates a normal distribution. This idea probably stems from a misunderstanding of the central limit theorem, which applies to certain group statistics such as means. […] This theorem justifies approximating distributions of random means with normal curves, but it does not apply to distributions of scores in individual samples. […] larger samples do not generally have more normal distributions than smaller samples. If the population distribution is, say, positively skewed, this shape will tend to show up in the distributions of random samples that are either smaller or larger.”
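This myth is easy to dispel by simulation: a big sample from a skewed population is just as skewed as a small one, while it is the distribution of sample *means* that the central limit theorem normalizes. The exponential population below is my own illustrative choice:

```python
# A large sample from a skewed population stays skewed; only the
# distribution of sample means becomes approximately normal.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
big_sample = rng.exponential(1.0, 100_000)               # one huge skewed sample
means = rng.exponential(1.0, (5_000, 100)).mean(axis=1)  # 5,000 means of n = 100

print(f"skewness of one large sample: {stats.skew(big_sample):.2f}")  # near 2
print(f"skewness of sample means:     {stats.skew(means):.2f}")       # near 0.2
```

The exponential distribution has skewness 2, and the 100,000-case sample faithfully reproduces it; the means, by contrast, are nearly symmetric.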

“A standard error is the standard deviation in a sampling distribution, the probability distribution of a statistic across all random samples drawn from the same population(s) and with each sample based on the same number of cases. It estimates the amount of sampling error in standard deviation units. The square of a standard error is the error variance. […] Variability of the sampling distributions […] decreases as the sample size increases. […] The standard error sM, which estimates variability of the group statistic M, is often confused with the standard deviation s, which measures variability at the case level. This confusion is a source of misinterpretation of both statistical tests and confidence intervals […] Note that the standard error sM itself has a standard error (as do standard errors for all other kinds of statistics). This is because the value of sM varies over random samples. This explains why one should not overinterpret a confidence interval or p value from a significance test based on a single sample.”
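The s-versus-sM confusion is worth seeing in numbers: the case-level standard deviation s stays put as n grows, while the standard error of the mean, sM = s/√n, shrinks. A quick sketch with IQ-like scores (my choice of scale):

```python
# s (case-level variability) is stable across n; s_M = s / sqrt(n)
# (variability of the mean M) shrinks as the sample grows.
import numpy as np

rng = np.random.default_rng(5)
for n in (25, 100, 2500):
    sample = rng.normal(100.0, 15.0, n)    # e.g., IQ-like scores
    s = sample.std(ddof=1)                 # variability of individual cases
    s_m = s / np.sqrt(n)                   # variability of the group statistic M
    print(f"n = {n:5d}: s = {s:5.1f}, s_M = {s_m:4.1f}")
```

Reading sM as if it described case-level variability — or vice versa — is precisely the misinterpretation the quote warns about.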

“Standard errors estimate sampling error under random sampling. What they measure when sampling is not random may not be clear. […] Standard errors also ignore […] other sources of error [:] 1. Measurement error [which] refers to the difference between an observed score X and the true score on the underlying construct. […] Measurement error reduces absolute effect sizes and the power of statistical tests. […] 2. Construct definition error [which] involves problems with how hypothetical constructs are defined or operationalized. […] 3. Specification error [which] refers to the omission from a regression equation of at least one predictor that covaries with the measured (included) predictors. […] 4. Treatment implementation error occurs when an intervention does not follow prescribed procedures. […] Gosset used the term real error to refer all types of error besides sampling error […]. In reasonably large samples, the impact of real error may be greater than that of sampling error.”
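The claim that measurement error reduces absolute effect sizes has a classic quantitative form: under classical test theory, an observed correlation is attenuated to roughly r_true × √(rel_x × rel_y). A simulation of that attenuation, with reliabilities of .50 as my illustrative assumption:

```python
# Measurement error attenuates effect sizes: two noisy measures of the
# same construct correlate far below the true-score correlation of 1.0.
import numpy as np

rng = np.random.default_rng(11)
true_scores = rng.normal(0.0, 1.0, 20_000)
x = true_scores + rng.normal(0.0, 1.0, 20_000)   # reliability ≈ .50
y = true_scores + rng.normal(0.0, 1.0, 20_000)   # reliability ≈ .50

r_obs = np.corrcoef(x, y)[0, 1]
print(f"observed r = {r_obs:.2f} (attenuation formula predicts {np.sqrt(0.5 * 0.5):.2f})")
```

A true correlation of 1.0 shows up as an observed correlation of about .50 — real error swamping the signal in exactly the way Gosset's "real error" notion suggests.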

“The technique of bootstrapping […] is a computer-based method of resampling that recombines the cases in a data set in different ways to estimate statistical precision, with fewer assumptions than traditional methods about population distributions. Perhaps the best known form is nonparametric bootstrapping, which generally makes no assumptions other than that the distribution in the sample reflects the basic shape of that in the population. It treats your data file as a pseudo-population in that cases are randomly selected with replacement to generate other data sets, usually of the same size as the original. […] The technique of nonparametric bootstrapping seems well suited for interval estimation when the researcher is either unwilling or unable to make a lot of assumptions about population distributions. […] potential limitations of nonparametric bootstrapping: 1. Nonparametric bootstrapping simulates random sampling, but true random sampling is rarely used in practice. […] 2. […] If the shape of the sample distribution is very different compared with that in the population, results of nonparametric bootstrapping may have poor external validity. 3. The “population” from which bootstrapped samples are drawn is merely the original data file. If this data set is small or the observations are not independent, resampling from it will not somehow fix these problems. In fact, resampling can magnify the effects of unusual features in a small data set […] 4. Results of bootstrap analyses are probably quite biased in small samples, but this is true of many traditional methods, too. […] [In] parametric bootstrapping […] the researcher specifies the numerical and distributional properties of a theoretical probability density function, and then the computer randomly samples from that distribution. 
When repeated many times by the computer, values of statistics in these synthesized samples vary randomly about the parameters specified by the researcher, which simulates sampling error.”
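A minimal version of nonparametric bootstrapping is only a few lines: resample the data file with replacement, recompute the statistic each time, and read a percentile confidence interval off the bootstrap distribution. The skewed sample below is simulated for illustration:

```python
# Nonparametric bootstrap: treat the sample as a pseudo-population,
# resample with replacement, and form a percentile CI for the mean.
import numpy as np

rng = np.random.default_rng(8)
data = rng.exponential(2.0, 40)          # a small, skewed sample

boot_means = np.array([
    rng.choice(data, size=data.size, replace=True).mean()
    for _ in range(5000)
])
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"sample mean = {data.mean():.2f}, 95% bootstrap CI = ({lo:.2f}, {hi:.2f})")
```

Note how literally this enacts limitation 3 from the quote: the "population" is just these 40 cases, so resampling cannot repair a small or non-independent data set.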

July 9, 2017 Posted by | Books, Psychology, Statistics | Leave a comment

Melanoma therapeutic strategies that select against resistance

A short lecture, but interesting:

If you’re not an oncologist, these two links in particular might be helpful to have a look at before you start out: BRAF (gene) & Myc. A very substantial proportion of the talk is devoted to math and stats methodology (which some people will find interesting and others …will not).

July 3, 2017 Posted by | Biology, Cancer/oncology, Genetics, Lectures, Mathematics, Medicine, Statistics | Leave a comment

The Personality Puzzle (I)

I don’t really like this book, which is a personality psychology introductory textbook by David Funder. I’ve read the first 400 pages (out of 700), but I’m still debating whether or not to finish it; it just isn’t very good. The level of coverage is low, it’s very fluffy, and the signal-to-noise ratio is nowhere near where I’d like it to be when I’m reading academic texts. Some parts of it frankly read like popular science. However, despite not feeling that the book is all that great, I can’t justify not blogging it; stuff I don’t blog I tend to forget, and if I’m reading a mediocre textbook anyway I should at least try to pick out some of the decent stuff in there which keeps me reading, and try to make it easier for me to recall that stuff later. Some parts of, and arguments/observations included in, the book are in my opinion just plain silly or stupid, but I won’t go into those things in this post because I don’t really see what would be the point of doing that.

The main reason why I decided to give the book a go was that I liked Funder’s book Personality Judgment, which I read a few years ago and which deals with some topics also covered, superficially, in this text. It’s a much better book, in my opinion, at least as far as I can remember (…though I have actually been starting to wonder whether it was really all that great, given that it was written by the same guy who wrote this book…), if you’re interested in these matters. If you’re interested in a more ‘pure’ personality psychology text, a significantly better alternative is Leary et al.‘s Handbook of Individual Differences in Social Behavior. Because of the multi-author format it also includes some very poor chapters, but those tend to be somewhat easy to identify and skip to get to the good stuff if you’re so inclined, and the general coverage is at a much higher level than that of this book.

Below I have added some quotes and observations from the first 150 pages of the book.

“A theory that accounts for certain things extremely well will probably not explain everything else so well. And a theory that tries to explain almost everything […] would probably not provide the best explanation for any one thing. […] different [personality psychology] basic approaches address different sets of questions […] each basic approach usually just ignores the topics it is not good at explaining.”

“Personality psychology tends to emphasize how individuals are different from one another. […] Other areas of psychology, by contrast, are more likely to treat people as if they were the same or nearly the same. Not only do the experimental subfields of psychology, such as cognitive and social psychology, tend to ignore how people are different from each other, but also the statistical analyses central to their research literally put individual differences into their “error” terms […] Although the emphasis of personality psychology often entails categorizing and labeling people, it also leads the field to be extraordinarily sensitive — more than any other area of psychology — to the fact that people really are different.”

“If you want to “look at” personality, what do you look at, exactly? Four different things. First, and perhaps most obviously, you can have the person describe herself. Personality psychologists often do exactly this. Second, you can ask people who know the person to describe her. Third, you can check on how the person is faring in life. And finally, you can observe what the person does and try to measure her behavior as directly and objectively as possible. These four types of clues can be called S [self-judgments], I [informants], L [life], and B [behavior] data […] The point of the four-way classification […] is not to place every kind of data neatly into one and only one category. Rather, the point is to illustrate the types of data that are relevant to personality and to show how they all have both advantages and disadvantages.”

“For cost-effectiveness, S data simply cannot be beat. […] According to one analysis, 70 percent of the articles in an important personality journal were based on self-report (Vazire, 2006).”

“I data are judgments by knowledgeable “informants” about general attributes of the individual’s personality. […] Usually, close acquaintanceship paired with common sense is enough to allow people to make judgments of each other’s attributes with impressive accuracy […]. Indeed, they may be more accurate than self-judgments, especially when the judgments concern traits that are extremely desirable or extremely undesirable […]. Only when the judgments are of a technical nature (e.g., the diagnosis of a mental disorder) does psychological education become relevant. Even then, acquaintances without professional training are typically well aware when someone has psychological problems […] psychologists often base their conclusions on contrived tests of one kind or another, or on observations in carefully constructed and controlled environments. Because I data derive from behaviors informants have seen in daily social interactions, they enjoy an extra chance of being relevant to aspects of personality that affect important life outcomes. […] I data reflect the opinions of people who interact with the person every day; they are the person’s reputation. […] personality judgments can [however] be [both] unfair as well as mistaken […] The most common problem that arises from letting people choose their own informants — the usual practice in research — may be the “letter of recommendation effect” […] research participants may tend to nominate informants who think well of them, leading to I data that provide a more positive picture than might have been obtained from more neutral parties.”

“L data […] are verifiable, concrete, real-life facts that may hold psychological significance. […] An advantage of using archival records is that they are not prone to the potential biases of self-report or the judgments of others. […] [However] L data have many causes, so trying to establish direct connections between specific attributes of personality and life outcomes is chancy. […] a psychologist can predict a particular outcome from psychological data only to the degree that the outcome is psychologically caused. L data often are psychologically caused only to a small degree.”

“The idea of B data is that participants are found, or put, in some sort of a situation, sometimes referred to as a testing situation, and then their behavior is directly observed. […] B data are expensive [and] are not used very often compared to the other types. Relatively few psychologists have the necessary resources.”

“Reliable data […] are measurements that reflect what you are trying to assess and are not affected by anything else. […] When trying to measure a stable attribute of personality—a trait rather than a state — the question of reliability reduces to this: Can you get the same result more than once? […] Validity is the degree to which a measurement actually reflects what one thinks or hopes it does. […] for a measure to be valid, it must be reliable. But a reliable measure is not necessarily valid. […] A measure that is reliable gives the same answer time after time. […] But even if a measure is the same time after time, that does not necessarily mean it is correct.”

“[M]ost personality tests provide S data. […] Other personality tests yield B data. […] IQ tests […] yield B data. Imagine trying to assess intelligence using an S-data test, asking questions such as “Are you an intelligent person?” and “Are you good at math?” Researchers have actually tried this, but simply asking people whether they are smart turns out to be a poor way to measure intelligence”.

“The answer an individual gives to any one question might not be particularly informative […] a single answer will tend to be unreliable. But if a group of similar questions is asked, the average of the answers ought to be much more stable, or reliable, because random fluctuations tend to cancel each other out. For this reason, one way to make a personality test more reliable is simply to make it longer.”

“The factor analytic method of test construction is based on a statistical technique. Factor analysis identifies groups of things […] that seem to have something in common. […] To use factor analysis to construct a personality test, researchers begin with a long list of […] items […] The next step is to administer these items to a large number of participants. […] The analysis is based on calculating correlation coefficients between each item and every other item. Many items […] will not correlate highly with anything and can be dropped. But the items that do correlate with each other can be assembled into groups. […] The next steps are to consider what the items have in common, and then name the factor. […] Factor analysis has been used not only to construct tests, but also to decide how many fundamental traits exist […] Various analysts have come up with different answers.”
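A bare-bones sketch of the idea behind factor analysis: items driven by the same latent trait correlate with each other, and the eigenvalues of the item correlation matrix then flag how many factors are present. The six simulated items below, driven by two independent latent traits, are entirely made up for illustration:

```python
# Six items driven by two latent factors: the correlation matrix has two
# large eigenvalues and four small ones, suggesting a two-factor solution.
import numpy as np

rng = np.random.default_rng(21)
n = 3_000
trait_a = rng.normal(size=n)             # first latent factor
trait_b = rng.normal(size=n)             # second, independent factor

items = np.column_stack(
    [trait_a + rng.normal(scale=0.7, size=n) for _ in range(3)] +
    [trait_b + rng.normal(scale=0.7, size=n) for _ in range(3)]
)
eigvals = np.sort(np.linalg.eigvalsh(np.corrcoef(items, rowvar=False)))[::-1]
print(np.round(eigvals, 2))   # two eigenvalues near 2.3, four near 0.3
```

The "how many fundamental traits exist" disputes mentioned above amount to disagreements about where to draw the line in exactly this kind of eigenvalue pattern — which is far less clear-cut with real item data than with this clean simulation.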

[The Big Five were derived from factor analyses.]

“The empirical strategy of test construction is an attempt to allow reality to speak for itself. […] Like the factor analytic approach described earlier, the first step of the empirical approach is to gather lots of items. […] The second step, however, is quite different. For this step, you need to have a sample of participants who have already independently been divided into the groups you are interested in. Occupational groups and diagnostic categories are often used for this purpose. […] Then you are ready for the third step: administering your test to your participants. The fourth step is to compare the answers given by the different groups of participants. […] The basic assumption of the empirical approach […] is that certain kinds of people answer certain questions on personality inventories in distinctive ways. If you answer questions the same way as members of some occupational or diagnostic group did in the original derivation study, then you might belong to that group too. […] responses to empirically derived tests are difficult to fake. With a personality test of the straightforward, S-data variety, you can describe yourself the way you want to be seen, and that is indeed the score you will get. But because the items on empirically derived scales sometimes seem backward or absurd, it is difficult to know how to answer in such a way as to guarantee the score you want. This is often held up as one of the great advantages of the empirical approach […] [However] empirically derived tests are only as good as the criteria by which they are developed or against which they are cross-validated. […] the empirical correlates of item responses by which these tests are assembled are those found in one place, at one time, with one group of participants. If no attention is paid to item content, then there is no way to be confident that the test will work in a similar manner at another time, in another place, with different participants.
[…] A particular concern is that the empirical correlates of item response might change over time. The MMPI was developed decades ago and has undergone a major revision only once”.
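The four steps of empirical keying can be sketched in a few lines: administer many items to two pre-classified groups, then retain whichever items discriminate between them, regardless of what the items appear to be "about". Everything here — the group sizes, the 0.4 cutoff, which items "really" differ — is a hypothetical setup of mine:

```python
# Toy empirical test construction: keep the items whose answers best
# separate two known groups in a derivation sample.
import numpy as np

rng = np.random.default_rng(17)
n_per_group, n_items = 200, 40
group = np.repeat([0, 1], n_per_group)               # e.g., two occupations

# Only the first 5 items genuinely differ between groups; the rest are noise.
effects = np.concatenate([np.full(5, 0.8), np.zeros(n_items - 5)])
responses = rng.normal(size=(2 * n_per_group, n_items)) + group[:, None] * effects

# Step 4: keep items whose group-mean difference exceeds a chosen cutoff.
diffs = responses[group == 1].mean(axis=0) - responses[group == 0].mean(axis=0)
keyed = np.flatnonzero(np.abs(diffs) > 0.4)
print("items retained:", keyed)
```

The cross-validation worry in the quote is visible here too: with enough noise items and a small derivation sample, some items would be retained by chance and would fail to discriminate in a new sample.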

“It is not correct, for example, that the significance level provides the probability that the substantive (non-null) hypothesis is true. […] the significance level gives the probability of getting the result one found if the null hypothesis were true. One statistical writer offered the following analogy (Dienes, 2011): The probability that a person is dead, given that a shark has bitten his head off, is 1.0. However, the probability that a person’s head was bitten off by a shark, given that he is dead, is much lower. The probability of the data given the hypothesis, and of the hypothesis given the data, is not the same thing. And the latter is what we really want to know. […] An effect size is more meaningful than a significance level. […] It is both facile and misleading to use the frequently taught method of squaring correlations if the intention is to evaluate effect size.”
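Dienes's shark analogy is just Bayes' rule, and plugging in numbers makes the asymmetry stark. The base rates below are invented purely for illustration:

```python
# P(dead | shark bite) = 1, yet P(shark bite | dead) is tiny, because
# shark bites are vanishingly rare among causes of death.
p_dead_given_shark = 1.0
p_shark = 1e-7            # hypothetical: share of people bitten in a year
p_dead = 1e-2             # hypothetical: share of people who die in a year

# Bayes' rule: P(shark | dead) = P(dead | shark) * P(shark) / P(dead)
p_shark_given_dead = p_dead_given_shark * p_shark / p_dead
print(f"P(shark | dead) = {p_shark_given_dead:.0e}")
```

The same inversion applies to significance testing: p gives (roughly) the probability of the data given H0, and no amount of staring at it yields the probability of H0 given the data without bringing in prior probabilities.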

June 30, 2017 Posted by | Books, Psychology, Statistics | Leave a comment