Econstudentlog

Big Data (I?)

Below a few observations from the first half of the book, as well as some links related to the topic coverage.

“The data we derive from the Web can be classified as structured, unstructured, or semi-structured. […] Carefully structured and tabulated data is relatively easy to manage and is amenable to statistical analysis, indeed until recently statistical analysis methods could be applied only to structured data. In contrast, unstructured data is not so easily categorized, and includes photos, videos, tweets, and word-processing documents. Once the use of the World Wide Web became widespread, it transpired that many such potential sources of information remained inaccessible because they lacked the structure needed for existing analytical techniques to be applied. However, by identifying key features, data that appears at first sight to be unstructured may not be completely without structure. Emails, for example, contain structured metadata in the heading as well as the actual unstructured message […] and so may be classified as semi-structured data. Metadata tags, which are essentially descriptive references, can be used to add some structure to unstructured data. […] Dealing with unstructured data is challenging: since it cannot be stored in traditional databases or spreadsheets, special tools have had to be developed to extract useful information. […] Approximately 80 per cent of the world’s data is unstructured in the form of text, photos, and images, and so is not amenable to the traditional methods of structured data analysis. ‘Big data’ is now used to refer not just to the total amount of data generated and stored electronically, but also to specific datasets that are large in both size and complexity, with which new algorithmic techniques are required in order to extract useful information from them.”
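
To make the ‘semi-structured’ point concrete, here is a minimal Python sketch (the message text and addresses are invented) that separates the structured header metadata of an email from its unstructured body using only the standard library:

```python
from email import message_from_string

# A made-up email: structured headers (metadata) followed by unstructured body text.
raw = """From: alice@example.com
To: bob@example.com
Subject: Quarterly figures
Date: Mon, 2 Jul 2018 09:15:00 +0000

Hi Bob, the attached numbers look odd to me - can you have a look?
"""

msg = message_from_string(raw)

# The structured part: named header fields with well-defined meanings.
structured = {key: msg[key] for key in ("From", "To", "Subject", "Date")}
# The unstructured part: free text, which requires different tools to analyse.
unstructured = msg.get_payload()

print(structured)
print(unstructured)
```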

“In the digital age we are no longer entirely dependent on samples, since we can often collect all the data we need on entire populations. But the size of these increasingly large sets of data cannot alone provide a definition for the term ‘big data’ — we must include complexity in any definition. Instead of carefully constructed samples of ‘small data’ we are now dealing with huge amounts of data that has not been collected with any specific questions in mind and is often unstructured. In order to characterize the key features that make data big and move towards a definition of the term, Doug Laney, writing in 2001, proposed using the three ‘v’s: volume, variety, and velocity. […] ‘Volume’ refers to the amount of electronic data that is now collected and stored, which is growing at an ever-increasing rate. Big data is big, but how big? […] Generally, we can say the volume criterion is met if the dataset is such that we cannot collect, store, and analyse it using traditional computing and statistical methods. […] Although a great variety of data [exists], ultimately it can all be classified as structured, unstructured, or semi-structured. […] Velocity is necessarily connected with volume: the faster the data is generated, the more there is. […] Velocity also refers to the speed at which data is electronically processed. For example, sensor data, such as that generated by an autonomous car, is necessarily generated in real time. If the car is to work reliably, the data […] must be analysed very quickly […] Variability may be considered as an additional dimension of the velocity concept, referring to the changing rates in flow of data […] computer systems are more prone to failure [during peak flow periods]. […] As well as the original three ‘v’s suggested by Laney, we may add ‘veracity’ as a fourth. Veracity refers to the quality of the data being collected. […] Taken together, the four main characteristics of big data – volume, variety, velocity, and veracity – present a considerable challenge in data management.” [As regular readers of this blog might be aware, not everybody would agree with the author here about the inclusion of veracity as a defining feature of big data – “Many have suggested that there are more V’s that are important to the big data problem [than volume, variety & velocity] such as veracity and value (IEEE BigData 2013). Veracity refers to the trustworthiness of the data, and value refers to the value that the data adds to creating knowledge about a topic or situation. While we agree that these are important data characteristics, we do not see these as key features that distinguish big data from regular data. It is important to evaluate the veracity and value of all data, both big and small. (Knoth & Schmid)]

“Anyone who uses a personal computer, laptop, or smartphone accesses data stored in a database. Structured data, such as bank statements and electronic address books, are stored in a relational database. In order to manage all this structured data, a relational database management system (RDBMS) is used to create, maintain, access, and manipulate the data. […] Once […] the database [has been] constructed we can populate it with data and interrogate it using structured query language (SQL). […] An important aspect of relational database design involves a process called normalization which includes reducing data duplication to a minimum and hence reduces storage requirements. This allows speedier queries, but even so as the volume of data increases the performance of these traditional databases decreases. The problem is one of scalability. Since relational databases are essentially designed to run on just one server, as more and more data is added they become slow and unreliable. The only way to achieve scalability is to add more computing power, which has its limits. This is known as vertical scalability. So although structured data is usually stored and managed in an RDBMS, when the data is big, say in terabytes or petabytes and beyond, the RDBMS no longer works efficiently, even for structured data. An important feature of relational databases and a good reason for continuing to use them is that they conform to the following group of properties: atomicity, consistency, isolation, and durability, usually known as ACID. Atomicity ensures that incomplete transactions cannot update the database; consistency excludes invalid data; isolation ensures one transaction does not interfere with another transaction; and durability means that the database must update before the next transaction is carried out. All these are desirable properties but storing and accessing big data, which is mostly unstructured, requires a different approach. […] given the current data explosion there has been intensive research into new storage and management techniques. In order to store these massive datasets, data is distributed across servers. As the number of servers involved increases, the chance of failure at some point also increases, so it is important to have multiple, reliably identical copies of the same data, each stored on a different server. Indeed, with the massive amounts of data now being processed, systems failure is taken as inevitable and so ways of coping with this are built into the methods of storage.”
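
As a rough illustration of the normalization and SQL querying described above, here is a small sketch using Python's built-in sqlite3 module; the tables, names, and amounts are invented, and a real bank-scale schema would of course look rather different:

```python
import sqlite3

# In-memory toy database; table and column names are invented for illustration.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Normalized design: customer details are stored once and referenced by key,
# rather than being repeated on every transaction row.
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("""CREATE TABLE transactions (
                   id INTEGER PRIMARY KEY,
                   customer_id INTEGER REFERENCES customers(id),
                   amount REAL)""")

cur.execute("INSERT INTO customers VALUES (1, 'Ada'), (2, 'Boole')")
cur.executemany("INSERT INTO transactions (customer_id, amount) VALUES (?, ?)",
                [(1, -120.0), (1, 45.5), (2, 300.0)])

# Interrogate the structured data with SQL: net amount per customer.
cur.execute("""SELECT c.name, SUM(t.amount)
               FROM customers c JOIN transactions t ON t.customer_id = c.id
               GROUP BY c.name
               ORDER BY c.name""")
print(cur.fetchall())   # [('Ada', -74.5), ('Boole', 300.0)]
```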

“A distributed file system (DFS) provides effective and reliable storage for big data across many computers. […] Hadoop DFS [is] one of the most popular DFS […] When we use Hadoop DFS, the data is distributed across many nodes, often tens of thousands of them, physically situated in data centres around the world. […] The NameNode deals with all requests coming in from a client computer; it distributes storage space, and keeps track of storage availability and data location. It also manages all the basic file operations (e.g. opening and closing files) and controls data access by client computers. The DataNodes are responsible for actually storing the data and in order to do so, create, delete, and replicate blocks as necessary. Data replication is an essential feature of the Hadoop DFS. […] It is important that several copies of each block are stored so that if a DataNode fails, other nodes are able to take over and continue with processing tasks without loss of data. […] Data is written to a DataNode only once but will be read by an application many times. […] One of the functions of the NameNode is to determine the best DataNode to use given the current usage, ensuring fast data access and processing. The client computer then accesses the data block from the chosen node. DataNodes are added as and when required by the increased storage requirements, a feature known as horizontal scalability. One of the main advantages of Hadoop DFS over a relational database is that you can collect vast amounts of data, keep adding to it, and, at that time, not yet have any clear idea of what you want to use it for. […] structured data with identifiable rows and columns can be easily stored in a RDBMS while unstructured data can be stored cheaply and readily using a DFS.”
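
The block-replication idea can be illustrated with a toy sketch. This is not Hadoop's actual placement policy (which is rack-aware and coordinated by the NameNode); it merely shows why storing each block on several DataNodes makes the loss of one node survivable. Node and block names are invented:

```python
import random

# Toy illustration of DFS-style block replication: each block is copied to
# three distinct DataNodes, so a single node failure loses no data.
REPLICATION = 3
datanodes = [f"datanode-{i}" for i in range(10)]   # invented node names
blocks = [f"block-{i}" for i in range(6)]          # a file split into 6 blocks

placement = {b: random.sample(datanodes, REPLICATION) for b in blocks}

def surviving_copies(failed_node):
    """Copies of each block still readable after one DataNode fails."""
    return {b: [n for n in nodes if n != failed_node]
            for b, nodes in placement.items()}

failed = datanodes[0]
for block, nodes in surviving_copies(failed).items():
    print(block, "->", nodes)   # every block still has at least two replicas
```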

NoSQL is the generic name used to refer to non-relational databases and stands for Not only SQL. […] The non-relational model has some features that are necessary in the management of big data, namely scalability, availability, and performance. With a relational database you cannot keep scaling vertically without loss of function, whereas with NoSQL you scale horizontally and this enables performance to be maintained. […] Within the context of a distributed database system, consistency refers to the requirement that all copies of data should be the same across nodes. […] Availability requires that if a node fails, other nodes still function […] Data, and hence DataNodes, are distributed across physically separate servers and communication between these machines will sometimes fail. When this occurs it is called a network partition. Partition tolerance requires that the system continues to operate even if this happens. In essence, what the CAP [Consistency, Availability, Partition Tolerance] Theorem states is that for any distributed computer system, where the data is shared, only two of these three criteria can be met. There are therefore three possibilities; the system must be: consistent and available, consistent and partition tolerant, or partition tolerant and available. Notice that since in a RDMS the network is not partitioned, only consistency and availability would be of concern and the RDMS model meets both of these criteria. In NoSQL, since we necessarily have partitioning, we have to choose between consistency and availability. By sacrificing availability, we are able to wait until consistency is achieved. If we choose instead to sacrifice consistency it follows that sometimes the data will differ from server to server. The somewhat contrived acronym BASE (Basically Available, Soft, and Eventually consistent) is used as a convenient way of describing this situation. BASE appears to have been chosen in contrast to the ACID properties of relational databases. ‘Soft’ in this context refers to the flexibility in the consistency requirement. The aim is not to abandon any one of these criteria but to find a way of optimizing all three, essentially a compromise. […] The name NoSQL derives from the fact that SQL cannot be used to query these databases. […] There are four main types of non-relational or NoSQL database: key-value, column-based, document, and graph – all useful for storing large amounts of structured and semi-structured data. […] Currently, an approach called NewSQL is finding a niche. […] the aim of this latent technology is to solve the scalability problems associated with the relational model, making it more useable for big data.”
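
A toy sketch of the consistency/availability choice during a network partition may help. This is not any real NoSQL engine, just a two-replica, in-memory illustration of why giving up consistency (‘eventual consistency’) keeps reads available while a partition lasts:

```python
# Toy two-replica key-value store illustrating the CAP trade-off during a
# network partition (everything here is invented for illustration).
class Replica:
    def __init__(self):
        self.data = {}

primary, secondary = Replica(), Replica()

def write(key, value, partitioned):
    primary.data[key] = value
    if not partitioned:
        secondary.data[key] = value        # replication succeeds
    # during a partition the secondary silently falls behind

def read(key, partitioned, prefer="availability"):
    if not partitioned:
        return secondary.data.get(key)
    if prefer == "availability":
        return secondary.data.get(key)     # may be stale ("eventually consistent")
    raise RuntimeError("unavailable until the partition heals")  # choose consistency

write("x", 1, partitioned=False)
write("x", 2, partitioned=True)
print(read("x", partitioned=True, prefer="availability"))   # prints 1: stale but available
```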

“A popular way of dealing with big data is to divide it up into small chunks and then process each of these individually, which is basically what MapReduce does by spreading the required calculations or queries over many, many computers. […] Bloom filters are particularly suited to applications where storage is an issue and where the data can be thought of as a list. The basic idea behind Bloom filters is that we want to build a system, based on a list of data elements, to answer the question ‘Is X in the list?’ With big datasets, searching through the entire set may be too slow to be useful, so we use a Bloom filter which, being a probabilistic method, is not 100 per cent accurate—the algorithm may decide that an element belongs to the list when actually it does not; but it is a fast, reliable, and storage efficient method of extracting useful knowledge from data. Bloom filters have many applications. For example, they can be used to check whether a particular Web address leads to a malicious website. In this case, the Bloom filter would act as a blacklist of known malicious URLs against which it is possible to check, quickly and accurately, whether it is likely that the one you have just clicked on is safe or not. Web addresses newly found to be malicious can be added to the blacklist. […] A related example is that of malicious email messages, which may be spam or may contain phishing attempts. A Bloom filter provides us with a quick way of checking each email address and hence we would be able to issue a timely warning if appropriate. […] they can [also] provide a very useful way of detecting fraudulent credit card transactions.”
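
Here is a minimal Bloom filter sketch roughly along the lines described above: a bit array plus a handful of hash functions, giving fast membership checks with no false negatives but a small false-positive rate. The blacklisted URLs are, of course, made up, and real blacklisting services are considerably more elaborate:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: no false negatives, a tunable false-positive rate."""
    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits, self.n_hashes = n_bits, n_hashes
        self.bits = bytearray(n_bits)        # one byte per bit, for simplicity

    def _positions(self, item):
        for i in range(self.n_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        # True may occasionally be wrong (false positive); False never is.
        return all(self.bits[pos] for pos in self._positions(item))

# Hypothetical blacklist of known malicious URLs.
blacklist = BloomFilter()
for url in ("http://malware.example", "http://phish.example"):
    blacklist.add(url)

print("http://phish.example" in blacklist)   # True
print("https://example.org" in blacklist)    # almost certainly False
```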

Links:

Data.
Punched card.
Clickstream log.
HTTP cookie.
Australian Square Kilometre Array Pathfinder.
The Millionaire Calculator.
Data mining.
Supervised machine learning.
Unsupervised machine learning.
Statistical classification.
Cluster analysis.
Moore’s Law.
Cloud storage. Cloud computing.
Data compression. Lossless data compression. Lossy data compression.
ASCII. Huffman algorithm. Variable-length encoding.
Data compression ratio.
Grayscale.
Discrete cosine transform.
JPEG.
Bit array. Hash function.
PageRank algorithm.
Common Crawl.


July 14, 2018 Posted by | Books, Data, Statistics, Computer science

Frontiers in Statistical Quality Control (I)

“The XIth International Workshop on Intelligent Statistical Quality Control took place in Sydney, Australia from August 20 to August 23, 2013. […] The 23 papers in this volume were carefully selected by the scientific program committee, reviewed by its members, revised by the authors and, finally, adapted by the editors for this volume. The focus of the book lies on three major areas of statistical quality control: statistical process control (SPC), acceptance sampling and design of experiments. The majority of the papers deal with statistical process control while acceptance sampling, and design of experiments are treated to a lesser extent.”

I’m currently reading this book. It’s quite technical and a bit longer than many of the other non-fiction books I’ve read this year (…but shorter than others; it is still ~400 pages of content exclusively devoted to statistical papers), so it may take me a while to finish it. I figured the fact that I may not finish the book for a while was not a good argument against blogging relevant sections of it now, especially as it’s already been some time since I read the first few chapters.

When reading a book like this one I care a lot more about understanding the concepts than about understanding the proofs, so as usual the amount of math included in the post is limited; please don’t assume it’s because there are no equations in the book.

Below I have added some ideas and observations from the first 100 pages or so of the book’s coverage.

“A growing number of [statistical quality control] applications involve monitoring with rare event data. […] The most common approaches for monitoring such processes involve using an exponential distribution to model the time between the events or using a Bernoulli distribution to model whether or not each opportunity for the event results in its occurrence. The use of a sequence of independent Bernoulli random variables leads to a geometric distribution for the number of non-occurrences between the occurrences of the rare events. One surveillance method is to use a power transformation on the exponential or geometric observations to achieve approximate normality of the in control distribution and then use a standard individuals control chart. We add to the argument that use of this approach is very counterproductive and cover some alternative approaches. We discuss the choice of appropriate performance metrics. […] Most often the focus is on detecting process deterioration, i.e., an increase in the probability of the adverse event or a decrease in the average time between events. Szarka and Woodall (2011) reviewed the extensive number of methods that have been proposed for monitoring processes using Bernoulli data. Generally, it is difficult to better the performance of the Bernoulli cumulative sum (CUSUM) chart of Reynolds and Stoumbos (1999). The Bernoulli and geometric CUSUM charts can be designed to be equivalent […] Levinson (2011) argued that control charts should not be used with healthcare rare event data because in many situations there is an assignable cause for each error, e.g., each hospital-acquired infection or serious prescription error, and each incident should be investigated. We agree that serious adverse events should be investigated whether or not they result in a control chart signal. The investigation of rare adverse events, however, and the implementation of process improvements to prevent future such errors, does not preclude using a control chart to determine if the rate of such events has increased or decreased over time. In fact, a control chart can be used to evaluate the success of any process improvement initiative.”
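
For readers unfamiliar with CUSUM charts, the sketch below shows a generic upper one-sided CUSUM for Bernoulli data based on log-likelihood-ratio increments; it is meant only to convey the idea, not to reproduce the specific design of Reynolds and Stoumbos, and the event rates and control limit are invented:

```python
import math, random

def bernoulli_cusum(x_stream, p0, p1, h):
    """Generic upper one-sided CUSUM for Bernoulli data (log-likelihood-ratio
    increments); signals when the statistic exceeds the control limit h."""
    up = math.log(p1 / p0)                 # increment for an adverse event
    down = math.log((1 - p1) / (1 - p0))   # increment for a non-event
    c = 0.0
    for t, x in enumerate(x_stream, start=1):
        c = max(0.0, c + (up if x else down))
        if c > h:
            return t                       # trial number at which the chart signals
    return None                            # no signal in the observed stream

random.seed(1)
p0, p1 = 0.001, 0.003          # in-control and out-of-control event rates (made up)
h = 4.0                        # control limit; in practice chosen for a target ANOS
# The event rate shifts from p0 to p1 after 5,000 opportunities:
stream = [random.random() < (p0 if t < 5000 else p1) for t in range(20000)]
print(bernoulli_cusum(stream, p0, p1, h))
```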

“The choice of appropriate performance metrics for comparing surveillance schemes for monitoring Bernoulli and exponential data is quite important. The usual Average Run Length (ARL) metric refers to the average number of points plotted on the chart until a signal is given. This metric is most clearly appropriate when the time between the plotted points is constant. […] In some cases, such as in monitoring the number of near-miss accidents, it may be informative to use a metric that reflects the actual time required to obtain an out-of-control signal. Thus one can consider the number of Bernoulli trials until an out-of-control signal is given for Bernoulli data, leading to its average, the ANOS. The ANOS will be proportional to the average time before a signal if the rate at which the Bernoulli trials are observed is constant over time. For exponentially distributed data one could consider the average time to signal, the ATS. If the process is stable, then ANOS = ARL / p and ATS = ARL * θ, where p and θ are the Bernoulli probability and the exponential mean, respectively. […] To assess out-of-control performance we believe it is most realistic to consider steady-state performance where the shift in the parameter occurs at some time after monitoring has begun. […] Under this scenario one cannot easily convert the ARL metric to the ANOS and ATS metrics. Consideration of steady state performance of competing methods is important because some methods have an implicit headstart feature that results in good zero-state performance, but poor steady-state performance.”
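
The stable-process conversions quoted above amount to simple arithmetic; a tiny worked example with invented numbers (and noting, as the passage does, that these conversions break down under steady-state shifts):

```python
# Converting between metrics when the process is stable (all values invented):
ARL = 200        # average number of plotted points until a signal
p = 0.002        # Bernoulli probability of the adverse event
theta = 36.0     # exponential mean time between events, e.g. in hours

ANOS = ARL / p       # 100,000 opportunities (Bernoulli trials) per signal
ATS = ARL * theta    # 7,200 hours per signal
print(ANOS, ATS)
```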

“Data aggregation is frequently done when monitoring rare events and for count data generally. For example, one might monitor the number of accidents per month in a plant or the number of patient falls per week in a hospital. […] Schuh et al. (2013) showed […] that there can be significantly long expected delays in detecting process deterioration when data are aggregated over time even when there are few samples with zero events. One can always aggregate data over long enough time periods to avoid zero counts, but the consequence is slower detection of increases in the rate of the adverse event. […] aggregating event data over fixed time intervals, as frequently done in practice, can result in significant delays in detecting increases in the rate of adverse events. […] Another type of aggregation is to wait until one has observed a given number of events before updating a control chart based on a proportion or waiting time. […] This type of aggregation […] does not appear to delay the detection of process changes nearly as much as aggregating data over fixed time periods. […] We believe that the adverse effect of aggregating data over time has not been fully appreciated in practice and more research work is needed on this topic. Only a couple of the most basic scenarios for count data have been studied. […] Virtually all of the work on monitoring the rate of rare events is based on the assumption that there is a sustained shift in the rate. In some applications the rate change may be transient. In this scenario other performance metrics would be needed, such as the probability of detecting the process shift during the transient period. The effect of data aggregation over time might be larger if shifts in the parameter are not sustained.”

Big data is a popular term that is used to describe the large, diverse, complex and/or longitudinal datasets generated from a variety of instruments, sensors and/or computer-based transactions. […] The acquisition of data does not automatically transfer to new knowledge about the system under study. […] To be able to gain knowledge from big data, it is imperative to understand both the scale and scope of big data. The challenges with processing and analyzing big data are not only limited to the size of the data. These challenges include the size, or volume, as well as the variety and velocity of the data (Zikopoulos et al. 2012). Known as the 3V’s, the volume, variety, and/or velocity of the data are the three main characteristics that distinguish big data from the data we have had in the past. […] Many have suggested that there are more V’s that are important to the big data problem such as veracity and value (IEEE BigData 2013). Veracity refers to the trustworthiness of the data, and value refers to the value that the data adds to creating knowledge about a topic or situation. While we agree that these are important data characteristics, we do not see these as key features that distinguish big data from regular data. It is important to evaluate the veracity and value of all data, both big and small. Both veracity and value are related to the concept of data quality, an important research area in the Information Systems (IS) literature for more than 50 years. The research literature discussing the aspects and measures of data quality is extensive in the IS field, but seems to have reached a general agreement that the multiple aspects of data quality can be grouped into several broad categories […]. Two of the categories relevant here are contextual and intrinsic dimensions of data quality. Contextual aspects of data quality are context specific measures that are subjective in nature, including concepts like value-added, believability, and relevance. […] Intrinsic aspects of data quality are more concrete in nature, and include four main dimensions: accuracy, timeliness, consistency, and completeness […] From our perspective, many of the contextual and intrinsic aspects of data quality are related to the veracity and value of the data. That said, big data presents new challenges in conceptualizing, evaluating, and monitoring data quality.”

The application of SPC methods to big data is similar in many ways to the application of SPC methods to regular data. However, many of the challenges inherent to properly studying and framing a problem can be more difficult in the presence of massive amounts of data. […] it is important to note that building the model is not the end-game. The actual use of the analysis in practice is the goal. Thus, some consideration needs to be given to the actual implementation of the statistical surveillance applications. This brings us to another important challenge, that of the complexity of many big data applications. SPC applications have a tradition of back of the napkin methods. The custom within SPC practice is the use of simple methods that are easy to explain like the Shewhart control chart. These are often the best methods to use to gain credibility because they are easy to understand and easy to explain to a non-statistical audience. However, big data often does not lend itself to easy-to-compute or easy-to-explain methods. While a control chart based on a neural net may work well, it may be so difficult to understand and explain that it may be abandoned for inferior, yet simpler methods. Thus, it is important to consider the dissemination and deployment of advanced analytical methods in order for them to be effectively used in practice. […] Another challenge in monitoring high dimensional data sets is the fact that not all of the monitored variables are likely to shift at the same time; thus, some method is necessary to identify the process variables that have changed. In high dimensional data sets, the decomposition methods used with multivariate control charts can become very computationally expensive. Several authors have considered variable selection methods combined with control charts to quickly detect process changes in a variety of practical scenarios including fault detection, multistage processes, and profile monitoring. […] All of these methods based on variable selection techniques are based on the idea of monitoring subsets of potentially faulty variables. […] Some variable reduction methods are needed to better identify shifts. We believe that further work in the areas combining variable selection methods and surveillance are important for quickly and efficiently diagnosing changes in high-dimensional data.

“A multiple stream process (MSP) is a process that generates several streams of output. From the statistical process control standpoint, the quality variable and its specifications are the same in all streams. A classical example is a filling process such as the ones found in beverage, cosmetics, pharmaceutical and chemical industries, where a filler machine may have many heads. […] Although multiple-stream processes are found very frequently in industry, the literature on schemes for the statistical control of such kind of processes is far from abundant. This paper presents a survey of the research on this topic. […] The first specific techniques for the statistical control of MSPs are the group control charts (GCCs) […] Clearly the chief motivation for these charts was to avoid the proliferation of control charts that would arise if every stream were controlled with a separate pair of charts (one for location and other for spread). Assuming the in-control distribution of the quality variable to be the same in all streams (an assumption which is sometimes too restrictive), the control limits should be the same for every stream. So, the basic idea is to build only one chart (or a pair of charts) with the information from all streams.”

“The GCC will work well if the values of the quality variable in the different streams are independent and identically distributed, that is, if there is no cross-correlation between streams. However, such an assumption is often unrealistic. In many real multiple-stream processes, the value of the observed quality variable is typically better described as the sum of two components: a common component (let’s refer to it as “mean level”), exhibiting variation that affects all streams in the same way, and the individual component of each stream, which corresponds to the difference between the stream observation and the common mean level. […] [T]he presence of the mean level component leads to reduced sensitivity of Boyd’s GCC to shifts in the individual component of a stream if the variance […] of the mean level is large with respect to the variance […] of the individual stream components. Moreover, the GCC is a Shewhart-type chart; if the data exhibit autocorrelation, the traditional form of estimating the process standard deviation (for establishing the control limits) based on the average range or average standard deviation of individual samples (even with the Bonferroni or Dunn-Sidak correction) will result in too frequent false alarms, due to the underestimation of the process total variance. […] [I]in the converse situation […] the GCC will have little sensitivity to causes that affect all streams — at least, less sensitivity than would have a chart on the average of the measurements across all streams, since this one would have tighter limits than the GCC. […] Therefore, to monitor MSPs with the two components described, Mortell and Runger (1995) proposed using two control charts: First, a chart for the grand average between streams, to monitor the mean level. […] For monitoring the individual stream components, they proposed using a special range chart (Rt chart), whose statistic is the range between streams, that is, the difference between the largest stream average and the smallest stream average […] the authors commented that both the chart on the average of all streams and the Rt chart can be used even when at each sampling time only a subset of the streams are sampled (provided that the number of streams sampled remains constant). The subset can be varied periodically or even chosen at random. […] it is common in practice to measure only a subset of streams at each sampling time, especially when the number of streams is large. […] Although almost the totality of Mortell and Runger’s paper is about the monitoring of the individual streams, the importance of the chart on the average of all streams for monitoring the mean level of the process cannot be overemphasized.”
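
The two plotting statistics described in this scheme are straightforward to compute; the sketch below does so for a simulated sample (the control limits, which are the harder part of the design, are omitted, and all numbers are invented):

```python
import numpy as np

def mortell_runger_stats(sample):
    """Plotting statistics described in the quoted passage: the grand average
    across streams (for the common 'mean level') and the range between stream
    averages (Rt, for the individual stream components).  `sample` is an
    (n_streams x n_obs_per_stream) array."""
    stream_means = sample.mean(axis=1)
    grand_average = stream_means.mean()
    rt = stream_means.max() - stream_means.min()
    return grand_average, rt

rng = np.random.default_rng(0)
mean_level = rng.normal(100.0, 0.5)                       # common component
sample = mean_level + rng.normal(0.0, 0.2, size=(6, 3))   # 6 streams, 3 obs each
print(mortell_runger_stats(sample))
```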

“Epprecht and Barros (2013) studied a filling process application where the stream variances were similar, but the stream means differed, wandered, changed from day to day, were very difficult to adjust, and the production runs were too short to enable good estimation of the parameters of the individual streams. The solution adopted to control the process was to adjust the target above the nominal level to compensate for the variation between streams, as a function of the lower specification limit, of the desired false-alarm rate and of a point (shift, power) arbitrarily selected. This would be a MSP version of “acceptance control charts” (Montgomery 2012, Sect. 10.2) if taking samples with more than one observation per stream [is] feasible.”

Most research works consider a small to moderate number of streams. Some processes may have hundreds of streams, and in this case the issue of how to control the false-alarm rate while keeping enough detection power […] becomes a real problem. […] Real multiple-stream processes can be very ill-behaved. The author of this paper has seen a plant with six 20-stream filling processes in which the stream levels had different means and variances and could not be adjusted separately (one single pump and 20 hoses). For many real cases with particular twists like this one, it happens that no previous solution in the literature is applicable. […] The appropriateness and efficiency of [different monitoring methods] depends on the dynamic behaviour of the process over time, on the degree of cross-correlation between streams, on the ratio between the variabilities of the individual streams and of the common component (note that these three factors are interrelated), on the type and size of shifts that are likely and/or relevant to detect, on the ease or difficulty to adjust all streams in the same target, on the process capability, on the number of streams, on the feasibility of taking samples of more than one observation per stream at each sampling time (or even the feasibility of taking one observation of every stream at each sampling time!), on the length of the production runs, and so on. So, the first problem in a practical application is to characterize the process and select the appropriate monitoring scheme (or to adapt one, or to develop a new one). This analysis may not be trivial for the average practitioner in industry. […] Jirasettapong and Rojanarowan (2011) is the only work I have found on the issue of selecting the most suitable monitoring scheme for an MSP. It considers only a limited number of alternative schemes and a few aspects of the problem. More comprehensive analyses are needed.”

June 27, 2018 Posted by | Books, Data, Engineering, Statistics

Alcohol and Aging (II)

I gave the book 3 stars on goodreads.

As is usual for publications of this nature, the book includes many chapters covering similar topics, so the coverage can get a bit repetitive if you’re reading it from cover to cover the way I did; most of the various chapter authors obviously didn’t read the other contributions included in the book, and as each chapter is meant to stand on its own you end up with a lot of chapter introductions covering much the same ground. If you can disregard such aspects it’s a decent book which covers a wide variety of topics.

Below I have added some observations from some of the chapters of the book which I did not cover in my first post.

It is widely accepted that consuming heavy amounts of alcohol and binge drinking are detrimental to the brain. Animal studies that have examined the anatomical changes that occur to the brain as a consequence of consuming alcohol indicate that heavy alcohol consumption and binge drinking leads to the death of existing neurons [10, 11] and prevents production of new neurons [12, 13]. […] While animal studies indicate that consuming even moderate amounts of alcohol is detrimental to the brain, the evidence from epidemiological studies is less clear. […] Epidemiological studies that have examined the relationship between late life alcohol consumption and cognition have frequently reported that older adults who consume light to moderate amounts of alcohol are less likely to develop dementia and have higher cognitive functioning compared to older adults who do not consume alcohol. […] In a meta-analysis of 15 prospective cohort studies, consuming light to moderate amounts of alcohol was associated with significantly lower relative risk (RR) for Alzheimer’s disease (RR=0.72, 95% CI=0.61–0.86), vascular dementia (RR=0.75, 95% CI=0.57–0.98), and any type of dementia (RR=0.74, 95% CI=0.61–0.91), but not cognitive decline (RR=0.28, 95 % CI=0.03–2.83) [31]. These findings are consistent with a previous meta-analysis by Peters et al. [33] in which light to moderate alcohol consumption was associated with a decreased risk for dementia (RR=0.63, 95 % CI=0.53–0.75) and Alzheimer’s disease (RR=0.57, 95 % CI=0.44–0.74), but not vascular dementia (RR=0.82, 95% CI=0.50–1.35) or cognitive decline RR=0.89, 95% CI=0.67–1.17). […] Mild cognitive impairment (MCI) has been used to describe the prodromal stage of Alzheimer’s disease […]. There is no strong evidence to suggest that consuming alcohol is protective against MCI [39, 40] and several studies have reported non-significant findings [41–43].”

The majority of research on the relationship between alcohol consumption and cognitive outcomes has focused on the amount of alcohol consumed during old age, but there is a growing body of research that has examined the relationship between alcohol consumption during middle age and cognitive outcomes several years or decades later. The evidence from this area of research is mixed with some studies not detecting a significant relationship [17, 58, 59], while others have reported that light to moderate alcohol consumption is associated with preserved cognition [60] and decreased risk for cognitive impairment [31, 61, 62]. […] Several epidemiological studies have reported that light to moderate alcohol consumption is associated with a decreased risk for stroke, diabetes, and heart disease [36, 84, 85]. Similar to the U-shaped relationship between alcohol consumption and dementia, heavy alcohol consumption has been associated with poor health [86, 87]. The decreased risk for several metabolic and vascular health conditions for alcohol consumers has been attributed to antioxidants [54], greater concentrations of high-density lipoprotein cholesterol in the bloodstream [88], and reduced blood clot formation [89]. Stroke, diabetes, heart disease, and related conditions have all been associated with lower cognitive functioning during old age [90, 91]. The reduced prevalence of metabolic and vascular health conditions among light to moderate alcohol consumers may contribute to the decreased risk for dementia and cognitive decline for older adults who consume alcohol. A limitation of the hypothesis that the reduced risk for dementia among light and moderate alcohol consumers is conferred through the reduced prevalence of adverse health conditions associated with dementia is the possibility that this relationship is confounded by reverse causality. Alcohol consumption decreases with advancing age and adults may reduce their alcohol consumption in response to the onset of adverse health conditions […] the higher prevalence of dementia and lower cognitive functioning among abstainers may be due in part to their worse health rather than their alcohol consumption.”

A limitation of large cohort studies is that subjects who choose not to participate or are unable to participate are often less healthy than those who do participate. Non-response bias becomes more pronounced with age because only subjects who have survived to old age and are healthy enough to participate are observed. Studies on alcohol consumption and cognition are sensitive to non-response bias because light and moderate drinkers who are not healthy enough to participate in the study will not be observed. Adults who survive to old age despite consuming very high amounts of alcohol represent an even more select segment of the general population because they may have genetic, behavioral, health, social, or other factors that protect them against the negative effects of heavy alcohol consumption. As a result, the analytic sample of epidemiological studies is more likely to be comprised of “healthy” drinkers, which biases results in favor of finding a positive effect of light to moderate alcohol consumption for cognition and health in general. […] The incidence of Alzheimer’s disease doubles every 5 years after 65 years of age [94] and nearly 40% of older adults aged 85 and over are diagnosed with Alzheimer’s disease [7]. The relatively old age of onset for most dementia cases means the observed protective effect of light to moderate alcohol consumption for dementia may be due to alcohol consumers being more likely to die or drop out of a study as a result of their alcohol consumption before they develop dementia. This bias may be especially strong for heavy alcohol consumers. Not properly accounting for death as a competing outcome has been observed to artificially increase the risk of dementia among older adults with diabetes [95] and the effect that death and other competing outcomes may have on the relationship between alcohol consumption and dementia risk is unclear. […] The majority of epidemiological studies that have studied the relationship between alcohol consumption and cognition treat abstainers as the reference category. This can be problematic because often times the abstainer or non-drinking category includes older adults who stopped consuming alcohol because of poor health […] Not differentiating former alcohol consumers from lifelong abstainers has been found to explain some but not all of the benefit of alcohol consumption for preventing mortality from cardiovascular causes [96].”

“It is common for people to engage in other behaviors while consuming alcohol. This complicates the relationship between alcohol consumption and cognition because many of the behaviors associated with alcohol consumption are positively and negatively associated with cognitive functioning. For example, alcohol consumers are more likely to smoke than non-drinkers [104] and smoking has been associated with an increased risk for dementia and cognitive decline [105]. […] The relationship between alcohol consumption and cognition may also differ between people with or without a history of mental illness. Depression reduces the volume of the hippocampus [106] and there is growing evidence that depression plays an important role in dementia. Depression during middle age is recognized as a risk factor for dementia [107], and high depressive symptoms during old age may be an early symptom of dementia [108]. Middle aged adults with depression or other mental illness who self-medicate with alcohol may be at especially high risk for dementia later in life because of synergistic effects that alcohol and depression has on the brain. […] While current evidence from epidemiological studies indicates that consuming light to moderate amounts of alcohol, in particular wine, does not negatively affect cognition and in many cases is associated with cognitive health, adults who do not consume alcohol should not be encouraged to increase their alcohol consumption until further research clarifies these relationships. Inconsistencies between studies on how alcohol consumption categories are defined make it difficult to determine the “optimal” amount of alcohol consumption to prevent dementia. It is likely that the optimal amount of alcohol varies according to a person’s gender, as well as genetic, physiological, behavioral, and health characteristics, making the issue extremely complex.”

Falls are the leading cause of both fatal and nonfatal injuries among older adults, with one in three older adults falling each year, and 20–30% of people who fall suffer moderate to severe injuries such as lacerations, hip fractures, and head traumas. In fact, falls are the foremost cause of both fractures and traumatic brain injury (TBI) among older adults […] In 2013, 2.5 million nonfatal falls among older adults were treated in ED and more than 734,000 of these patients were hospitalized. […] Our analysis of the 2012 Nationwide Emergency Department Sample (NEDS) data set show that fall-related injury was a presenting problem among 12% of all ED visits by those aged 65+, with significant differences among age groups: 9% among the 65–74 age group, 12 % among the 75–84 age group, and 18 % among the 85+ age group [4]. […] heavy alcohol use predicts fractures. For example, among those 55+ years old in a health survey in England, men who consumed more than 8 units of alcohol and women who consumed more than 6 units on their heaviest drinking day in the past week had significantly increased odds of fractures (OR =1.65, 95% CI =1.37–1.98 for men and OR=2.07, 95% CI =1.28–3.35 for women) [63]. […] The 2008–2009 Canadian Community Health Survey-Healthy Aging also showed that consumption of at least one alcoholic drink per week increased the odds of falling by 40 % among those 65+ years [57].”

I was at first not much impressed by the effect sizes mentioned above, because there are surely 100 relevant variables they didn’t account for/couldn’t account for – but then I thought a bit more about it. An important observation here – they don’t mention it in the coverage, but it sprang to mind – is that sick or frail elderly people consume less alcohol than their healthier counterparts and are more likely not to consume alcohol at all (we know this), and that frail or sick(er) elderly people are more likely to suffer a fall or fracture than people who are relatively healthy (again, we know this). On those grounds alone you’d expect alcohol consumption to be found to have a ‘protective effect’ simply due to confounding by (reverse) indication – unless the researchers were really careful about adjusting for such things, but no such adjustments are mentioned in the coverage, which makes sense as these are just raw numbers being reported. The point is that the null here should not be ‘these groups should be expected to have the same fall/fracture rate’, but rather ‘people who drink alcohol should be expected to be doing better, all else equal’ – but they aren’t, quite the reverse. So the ‘true effect size’ here may be larger than what you’d think.

I’m reasonably sure things are a lot more complicated than the above makes it appear (because of those 100 relevant variables we were talking about…), but I find it interesting anyway. Two more things to note: 1. Have another look at the numbers above if they didn’t sink in the first time. This is more than 10% of emergency department visits for that age group. Falls are a really big deal. 2. Fractures in the elderly are also a potentially really big deal. Here’s a sample quote: “One-fifth of hip fracture victims will die within 6 months of the injury, and only 50% will return to their previous level of independence.” (link). In some contexts, a fall is worse news than a cancer diagnosis, and they are very common events in the elderly. This also means that even relatively small effect sizes here can translate into quite large public health effects, because baseline incidence is so high.

The older adult population is a disproportionate consumer of prescription and over-the-counter medications. In a nationally representative sample of community-dwelling adults aged 57–84 years from the National Social Life, Health, and Aging Project (NSHAP) in 2005–2006, 81 % regularly used at least one prescription medication on a regular basis and 29% used at least five prescription medications. Forty-two percent used at least one nonprescription medication and concurrent use with a prescription medication was common, with 46% of prescription medication users also using OTC medications [2]. Prescription drug use by older adults in the U.S. is also growing. The percentage of older adults taking at least one prescription drug in the last 30 days increased from 73.6% in 1988–1994 to 89.7 % in 2007–2010 and the percentage taking five or more prescription drugs in the last 30 days increased from 13.8% in 1988–1994 to 39.7 % in 2007–2010 [3].”

The aging process can affect the response to a medication by altering its pharmacokinetics and pharmacodynamics [9, 10]. Reduced gastrointestinal motility and gastric acidity can alter the rate or extent of drug absorption. Changes in body composition, including decreased total body water and increased body fat can alter drug distribution. For alcohol, changes in body composition result in higher blood alcohol levels in older adults compared to younger adults after the same dose or quantity  of alcohol consumed. Decreased size of the liver, hepatic blood flow, and function of Phase I (oxidation, reduction, and hydrolysis) metabolic pathways result in reduced drug metabolism and increased drug exposure for drugs that undergo Phase I metabolism. Phase II hepatic metabolic pathways are generally preserved with aging. Decreased size of the kidney, renal blood flow, and glomerular filtration result in slower elimination of medications and metabolites by the kidney and increased drug exposure for medications that undergo renal elimination. Age-related impairment of homeostatic mechanisms and changes in receptor number and function can result in changes in pharmacodynamics as well. Older adults are generally more sensitive to the effects of medications and alcohol which act on the central nervous system for example. The consequences of these physiologic changes with aging are that older adults often experience increased drug exposure for the same dose (higher drug concentrations over time) and increased sensitivity to medications (greater response at a given drug concentration) than their younger counterparts.”

“Aging-related changes in physiology are not the only sources of variability in pharmacokinetics and pharmacodynamics that must be considered for an individual person. Older adults experience more chronic diseases that may decrease drug metabolism and renal elimination than younger cohorts. Frailty may result in further decline in drug metabolism, including Phase II metabolic pathways in the liver […] Drug interactions must also be considered […] A drug interaction is defined as a clinically meaningful change in the effect of one drug when coadministered with another drug [12]. Many drugs, including alcohol, have the potential for a drug interaction when administered concurrently, but whether a clinically meaningful change in effect occurs for a specific person depends on patient-specifc factors including age. Drug interactions are generally classified as pharmacokinetic interactions, where one drug alters the absorption, distribution, metabolism, or elimination of another drug resulting in increased or decreased drug exposure, or pharmacodynamic interactions, where one drug alters the response to another medication through additive or antagonistic pharmacologic effects [13]. An adverse drug event occurs when a pharmacokinetic or pharmacodynamic interaction or combination of both results in changes in drug exposure or response that lead to negative clinical outcomes. The adverse drug event could be a therapeutic failure if drug exposure is decreased or the pharmacologic response is antagonistic. The adverse drug event could be drug toxicity if the drug exposure is increased or the pharmacologic response is additive or synergistic. The threshold for experiencing an adverse event is often lower in older adults due to physiologic changes with aging and medical comorbidities, increasing their risk of experiencing an adverse drug event when medications are taken concurrently.”

“A large number of potential medication–alcohol interactions have been reported in the literature. Mechanisms of these interactions range from pharmacokinetic interactions affecting either alcohol or medication exposure to pharmacodynamics interactions resulting in exaggerated response. […] Epidemiologic evidence suggests that concurrent use of alcohol and medications among older adults is common. […] In a nationally representative U.S. sample of community-dwelling older adults in the National Social Life, Health and Aging Project (NSHAP) 2005–2006, 41% of participants reported consuming alcohol at least once per week and 20% were at risk for an alcohol–medication interaction because they were using both alcohol and alcohol-interacting medications on a regular basis [17]. […] Among participants in the Pennsylvania Assistance Contract for the Elderly program (aged 65–106 years) taking at least one prescription medication, 77% were taking an alcohol-interacting medication and 19% of the alcohol-interacting medication users reported concurrent use of alcohol [18]. […] Although these studies do not document adverse outcomes associated with alcohol–medication interactions, they do document that the potential exists for many older adults. […] High prevalence of concurrent use of alcohol and alcohol-interacting medications have also been reported in Australian men (43% of sedative or anxiolytic users were daily drinkers) [19], in older adults in Finland (42% of at-risk alcohol users were also taking alcohol-interacting medications) [20], and in older Irish adults (72% of participants were exposed to alcohol-interacting medications and 60% of these reported concurrent alcohol use) [21]. Drinking and medication use patterns in older adults may differ across countries, but alcohol–medication interactions appear to be a worldwide concern. […] Polypharmacy in general, and psychotropic burden specifically, has been associated with an increased risk of experiencing a geriatric syndrome such as falls or delirium, in older adults [26, 27]. Based on its pharmacology, alcohol can be considered as a psychotropic drug, and alcohol use should be assessed as part of the medication regimen evaluation to support efforts to prevent or manage geriatric syndromes. […] Combining alcohol and CNS active medications can be particularly problematic […] Older adults suffering from sleep problems or pain may be a particular risk for alcohol–medication interaction-related adverse events.”

In general, alcohol use in younger couples has been found to be highly concordant, that is, individuals in a relationship tend to engage in similar drinking behaviors [67,68]. Less is known, however, about alcohol use concordance between older couples. Graham and Braun [69] examined similarities in drinking behavior between spouses in a study of 826 community-dwelling older adults in Ontario, Canada. Results showed high concordance of drinking between spouses — whether they drank at all, how much they drank, and how frequently. […] Social learning theory suggests that alcohol use trajectories are strongly influenced by attitudes and behaviors of an individual’s social networks, particularly family and friends. When individuals engage in social activities with family and friends who approve of and engage in drinking, alcohol use, and misuse are reinforced [58, 59]. Evidence shows that among older adults, participation in social activities is correlated with higher levels of alcohol consumption [34, 60]. […] Brennan and Moos [29] […] found that older adults who reported less empathy and support from friends drank more alcohol, were more depressed, and were less self-confident. More stressors involving friends were associated with more drinking problems. Similar to the findings on marital conflict […], conflict in close friendships can prompt alcohol-use problems; conversely, these relationships can suffer as a result of alcohol-related problems. […] As opposed to social network theory […], social selection theory proposes that alcohol consumption changes an individual’s social context [33]. Studies among younger adults have shown that heavier drinkers chose partners and friends who approve of heavier drinking [70] and that excessive drinking can alienate social networks. The Moos study supports the idea that social selection also has a strong influence on drinking behavior among older adults.”

“Traditionally, treatment studies in addiction have excluded patients over the age of 65. This bias has left a tremendous gap in knowledge regarding treatment outcomes and an understanding of the neurobiology of addiction in older adults.”

Alcohol use causes well-established changes in sleep patterns, such as decreased sleep latency, decreased stage IV sleep, and precipitation or aggravation of sleep apnea [101]. There are also age-associated changes in sleep patterns including increased REM episodes, a decrease in REM length, a decrease in stage III and IV sleep, and increased awakenings. Age-associated changes in sleep can all be worsened by alcohol use and depression. Moeller and colleagues [102] demonstrated in younger subjects that alcohol and depression had additive effects upon sleep disturbances when they occurred together [102]. Wagman and colleagues [101] also have demonstrated that abstinent alcoholics did not sleep well because of insomnia, frequent awakenings, and REM fragmentation [101]; however, when these subjects ingested alcohol, sleep periodicity normalized and REM sleep was temporarily suppressed, suggesting that alcohol use could be used to self-medicate for sleep disturbances. A common anecdote from patients is that alcohol is used to help with sleep problems. […] The use of alcohol to self-medicate is considered maladaptive [34] and is associated with a host of negative outcomes. […] The use of alcohol to aid with sleep has been found to disrupt sleep architecture and cause sleep-related problems and daytime sleepiness [35, 36, 46]. Though alcohol is commonly used to aid with sleep initiation, it can worsen sleep-related breathing disorders and cause snoring and obstructive sleep apnea [36].”

Epidemiologic studies have clearly demonstrated that comorbidity between alcohol use and other psychiatric symptoms is common in younger age groups. Less is known about comorbidity between alcohol use and psychiatric illness in late life [88]. […] Blow et al. [90] reviewed the diagnosis of 3,986 VA patients between ages 60 and 69 presenting for alcohol treatment [90]. The most common comorbid psychiatric disorder was an affective disorder found in 21 % of the patients. […] Blazer et al. [91] studied 997 community dwelling elderly of whom only 4.5% had a history of alcohol use problems [91]; […] of these subjects, almost half had a comorbid diagnosis of depression or dysthymia. Comorbid depressive symptoms are not only common in late life but are also an important factor in the course and prognosis of psychiatric disorders. Depressed alcoholics have been shown to have a more complicated clinical course of depression with an increased risk of suicide and more social dysfunction than non-depressed alcoholics [9296]. […]  Alcohol use prior to late life has also been shown to influence treatment of late life depression. Cook and colleagues [94] found that a prior history of alcohol use problems predicted a more severe and chronic course for depression [94]. […] The effect of past heavy alcohol use is [also] highlighted in the findings from the Liverpool Longitudinal Study demonstrating a fivefold increase in psychiatric illness among elderly men who had a lifetime history of 5 or more years of heavy drinking [24]. The association between heavy alcohol consumption in earlier years and psychiatric morbidity in later life was not explained by current drinking habits. […] While Wernicke-Korsakoff’s syndrome is well described and often caused by alcohol use disorders, alcohol-related dementia may be difficult to differentiate from Alzheimer’s disease. Clinical diagnostic criteria for alcohol-related dementia (ARD) have been proposed and now validated in at least one trial, suggesting a method for distinguishing ARD, including Wernicke-Korsakoff’s syndrome, from other types of dementia [97, 98]. […] Finlayson et al. [100] found that 49 of 216 (23%) elderly patients presenting for alcohol treatment had dementia associated with alcohol use disorders [100].”

 

May 24, 2018 Posted by | Books, Demographics, Epidemiology, Medicine, Neurology, Pharmacology, Psychiatry, Statistics | Leave a comment

Trade-offs when doing medical testing

I was considering whether or not to blog today about the molecular biology text I recently read, but I decided against it. However, as I did feel like blogging today, I decided instead to add here a few comments I left on SCC. I rarely leave comments on other blogs, but it does happen, and the question I was ‘answering’ (partially – other guys had already added some pretty good comments by the time I joined the debate) is probably one that I imagine a lot of e.g. undergrads are asking themselves, namely: “What’s the standard procedure, when designing a medical test, to determine the right tradeoff between sensitivity and specificity (where I’m picturing a tradeoff involved in choosing the threshold for a positive test or something similar)?”

The ‘short version’, if you want an answer to this question, is probably to read Newman and Kohn’s wonderful book on these and related topics (which I blogged here), but that’s not actually a ‘short answer’ in terms of how people usually think about these things. I’ll just reproduce my own comment here, and mention that other guys had already covered some key topics by the time I joined ‘the fray’:

“Some good comments already. I don’t know to what extent the following points have been included in the links provided, but I decided to add them here anyway.

One point worth emphasizing is that you’ll always want a mixture of sensitivity and specificity (or, more broadly, test properties) that’ll mean that your test has clinical relevance. This relates both to the type of test you consider and when/whether to test at all (rather than treat/not treat without testing first). If you’re worried someone has disease X and the clinical presentation suggests the risk is high, some tests will for example be inappropriate even if they are very good at making the distinction between individuals requiring treatment and individuals not requiring treatment, for example because they take time to perform – time the patient might not have – which is not an uncommon situation in emergency medicine. If you’re so worried you’d treat him regardless of the test result, you shouldn’t test. And the same goes for e.g. low-sensitivity screens; if a positive screening result does not imply that you’ll actually act on it, you shouldn’t perform the screen (in screening contexts cost effectiveness is usually critically dependent on how you follow up on the test result, and in many contexts inadequate follow-up means that the value of the test goes down a lot […on a related note I have been thinking that I was perhaps not as kind as I could have been when I reviewed Juth & Munthe’s book and I have actually considered whether or not to change my rating of the book; it does give a decent introduction to some key trade-offs with which you’re confronted when you’re dealing with topics related to screening]).

Cost effectiveness is another variable that would/should probably (in an ideal world?) enter the analysis when you’re judging what is or is not a good mixture of sensitivity and specificity – you should be willing to pay more for more precise tests, but only to the extent that those more precise tests lead to better outcomes (you’re usually optimizing over patient outcomes, not test accuracy).

Skef also mentions this, but the relative values of specificity and sensitivity may well vary during the diagnostic process; i.e. the (ideal) trade-off will depend on what you plan to use the test for. Is the idea behind testing this guy to make (reasonably?) sure he doesn’t have colon cancer, or to figure out if he needs a more accurate, but also more expensive, test? Screening setups will usually involve a multi-level testing structure, and tests at different levels will not treat these trade-offs the same way, nor should they. This also means that the properties of individual tests cannot really be viewed in isolation, which makes the problem of finding ‘the ideal mix’ of test properties (whatever these might be) even harder; if you have three potential tests, for example, it’s not enough to compare the tests individually against each other; you’d ideally also want to implicitly take into account that different combinations of tests have different properties, and that the timing of the test may also be an important parameter in the decision problem.”
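
To make the threshold point a little more concrete, here’s a small Python sketch of my own (not something from the comment thread or from Newman & Kohn); the biomarker distributions, the assumed prevalence, and the assumed costs of false negatives and false positives are all made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
healthy = rng.normal(50, 10, 100_000)   # hypothetical biomarker values, disease absent
diseased = rng.normal(65, 10, 100_000)  # hypothetical biomarker values, disease present

prevalence = 0.05             # assumed disease prevalence in the tested population
cost_fn, cost_fp = 10.0, 1.0  # assumed: a missed case is ten times worse than a false alarm

for threshold in range(45, 80, 5):
    sensitivity = np.mean(diseased > threshold)
    specificity = np.mean(healthy <= threshold)
    # expected cost per person tested, under the assumed prevalence and cost ratio
    expected_cost = (prevalence * (1 - sensitivity) * cost_fn
                     + (1 - prevalence) * (1 - specificity) * cost_fp)
    print(f"threshold {threshold}: sens {sensitivity:.2f}, spec {specificity:.2f}, "
          f"expected cost {expected_cost:.3f}")
```

The point is of course not the specific numbers, but rather that the ‘optimal’ threshold moves as soon as you change the assumed prevalence or the cost ratio – which is part of why there is no context-free answer to the question.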

On a related note I think that, in general, looking for some kind of ‘approved method’ that you can use to save yourself from thinking is a very dangerous approach when you’re doing applied statistics. If you’re not thinking about relevant trade-offs and how to deal with them, odds are you’re missing a big part of the picture. If somebody claims to have somehow discovered some simple approach to dealing with all of the relevant trade-offs, well, you should be very skeptical. Statistics usually doesn’t work like that.

May 4, 2018 Posted by | Medicine, Statistics | Leave a comment

Medical Statistics (III)

In this post I’ll include some links and quotes related to topics covered in chapters 4, 6, and 7 of the book. Before diving in, I’ll however draw attention to some of Gerd Gigerenzer’s work, as it is quite relevant in particular to the coverage in chapter 4 (‘Presenting research findings’), even if the authors seem unaware of this. One of Gigerenzer’s key insights, which I consider important and which I have thus tried to keep in mind, unfortunately goes unmentioned in the book; namely the idea that how you communicate risk might be very important in terms of whether or not people actually understand what you are trying to tell them. A related observation is that people have studied these things and figured out that some types of risk communication are demonstrably better than others at enabling people to understand the issues at hand and the trade-offs involved in a given situation. I covered some of these ideas in a comment on SCC some time ago (if those comments spark your interest, you should definitely go read the book).

IMRAD format.
CONSORT Statement (randomized trials).
Equator Network.

“Abstracts may appear easy to write since they are very short […] and often required to be written in a structured format. It is therefore perhaps surprising that they are sometimes poorly written, too bland, contain inaccuracies, and/or are simply misleading.1  The reason for poor quality abstracts are complex; abstracts are often written at the end of a long process of data collection, analysis, and writing up, when time is short and researchers are weary. […] statistical issues […] can lead to an abstract that is not a fair representation of the research conducted. […] it is important that the abstract is consistent with the body of text and that it gives a balanced summary of the work. […] To maximize its usefulness, a summary or abstract should include estimates and confidence intervals for the main findings and not simply present P values.”

“The methods section should describe how the study was conducted. […] it is important to include the following: *The setting or area […] The date(s) […] subjects included […] study design […] measurements used […] source of any non-original data […] sample size, including a justification […] statistical methods, including any computer software used […] The discussion section is where the findings of the study are discussed and interpreted […] this section tends to include less statistics than the results section […] Some medical journals have a specific structure for the discussion for researchers to follow, and so it is important to check the journal’s guidelines before submitting. […] [When] reporting statistical analyses from statistical programs: *Don’t put unedited computer output into a research document. *Extract the relevant data only and reformat as needed […] Beware of presenting percentages for very small samples as they may be misleading. Simply give the numbers alone. […] In general the following is recommended for P values: *Give the actual P value whenever possible. *Rounding: Two significant figures are usually enough […] [Confidence intervals] should be given whenever possible to indicate the precision of estimates. […] Avoid graphs with missing zeros or stretched scales […] a table or graph should stand alone so that a reader does not need to read the […] article to be able to understand it.”

Statistical data type.
Level of measurement.
Descriptive statistics.
Summary statistics.
Geometric mean.
Harmonic mean.
Mode.
Interquartile range.
Histogram.
Stem and leaf plot.
Box and whisker plot.
Dot plot.

“Quantitative data are data that can be measured numerically and may be continuous or discrete. *Continuous data lie on a continuum and so can take any value between two limits. […] *Discrete data do not lie on a continuum and can only take certain values, usually counts (integers) […] On an interval scale, differences between values at different points of the scale have the same meaning […] Data can be regarded as on a ratio scale if the ratio of the two measurements has a meaning. For example we can say that twice as many people in one group had a particular characteristic compared with another group and this has a sensible meaning. […] Quantitative data are always ordinal – the data values can be arranged in a numerical order from the smallest to the largest. […] *Interval scale data are always ordinal. Ratio scale data are always interval scale data and therefore must also be ordinal. *In practice, continuous data may look discrete because of the way they are measured and/or reported. […] All continuous measurements are limited by the accuracy of the instrument used to measure them, and many quantities such as age and height are reported in whole numbers for convenience”.

“Categorical data are data where individuals fall into a number of separate categories or classes. […] Different categories of categorical data may be assigned a number for coding purposes […] and if there are several categories, there may be an implied ordering, such as with stage of cancer where stage I is the least advanced and stage IV is the most advanced. This means that such data are ordinal but not interval because the ‘distance’ between adjacent categories has no real measurement attached to it. The ‘gap’ between stages I and II disease is not necessarily the same as the ‘gap’ between stages III and IV. […] Where categorical data are coded with numerical codes, it might appear that there is an ordering but this may not necessarily be so. It is important to distinguish between ordered and non-ordered data because it affects the analysis.”

“It is usually useful to present more than one summary measure for a set of data […] If the data are going to be analyzed using methods based on means then it makes sense to present means rather than medians. If the data are skewed they may need to be transformed before analysis and so it is best to present summaries based on the transformed data, such as geometric means. […] For very skewed data rather than reporting the median, it may be helpful to present a different percentile (i.e. not the 50th), which better reflects the shape of the distribution. […] Some researchers are reluctant to present the standard deviation when the data are skewed and so present the median and range and/or quartiles. If analyses are planned which are based on means then it makes sense to be consistent and give standard deviations. Further, the useful relationship that approximately 95% of the data lie between mean +/- 2 standard deviations, holds even for skewed data […] If data are transformed, the standard deviation cannot be back-transformed correctly and so for transformed data a standard deviation cannot be given. In this case the untransformed standard deviation can be given or another measure of spread. […] For discrete data with a narrow range, such as stage of cancer, it may be better to present the actual frequency distribution to give a fair summary of the data, rather than calculate a mean or dichotomize it. […] It is often useful to tabulate one categorical variable against another to show the proportions or percentages of the categories of one variable by the other”.
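
As an aside, the mean +/- 2 SD claim is easy to check numerically even for quite skewed data; below a small sketch of my own (with an arbitrary lognormal sample), which also shows the geometric mean as the back-transformed mean of the logged data:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)  # a clearly right-skewed sample

m, sd = x.mean(), x.std()
share = np.mean((x > m - 2 * sd) & (x < m + 2 * sd))
print(f"share of observations within mean +/- 2 SD: {share:.3f}")  # still roughly 0.95

geometric_mean = np.exp(np.log(x).mean())  # back-transformed mean of log(x)
print(f"median: {np.median(x):.3f}, geometric mean: {geometric_mean:.3f}")
```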

Random variable.
Independence (probability theory).
Probability.
Probability distribution.
Binomial distribution.
Poisson distribution.
Continuous probability distribution.
Normal distribution.
Uniform distribution.

“The central limit theorem is a very important mathematical theorem that links the Normal distribution with other distributions in a unique and surprising way and is therefore very useful in statistics. *The sum of a large number of independent random variables will follow an approximately Normal distribution irrespective of their underlying distributions. *This means that any random variable which can be regarded as the sum of a large number of small, independent contributions is likely to follow the Normal distribution. [I didn’t really like this description as it’s insufficiently detailed for my taste (and this was pretty much all they wrote about the CLT in that chapter); and one problem with the CLT is that people often think it applies when it might not actually do so, because the data restrictions implied by the theorem(s) are not really fully appreciated. On a related note people often seem to misunderstand what these theorems actually say and where they apply – see e.g. paragraph 10 in this post. See also the wiki link above for a more comprehensive treatment of these topics – US] *The Normal distribution can be used as an approximation to the Binomial distribution when n is large […] The Normal distribution can be used as an approximation to the Poisson distribution as the mean of the Poisson distribution increases […] The main advantage in using the Normal rather than the Binomial or the Poisson distribution is that it makes it easier to calculate probabilities and confidence intervals”
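
A quick simulation makes the quoted statement a bit more tangible; the choice of exponential summands and n = 50 below is mine and arbitrary, and of course this sketch says nothing about the settings where the theorem’s assumptions fail:

```python
import numpy as np

rng = np.random.default_rng(2)
n_terms, n_sums = 50, 100_000
# sums of 50 independent exponential(1) variables - individually very non-Normal
sums = rng.exponential(scale=1.0, size=(n_sums, n_terms)).sum(axis=1)

# standardize using the known mean (1) and variance (1) of the summands
standardized = (sums - n_terms) / np.sqrt(n_terms)
print(np.quantile(standardized, [0.05, 0.5, 0.95]))  # roughly -1.64, 0, 1.64
```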

“The t distribution plays an important role in statistics as the sampling distribution of the sample mean divided by its standard error and is used in significance testing […] The shape is symmetrical about the mean value, and is similar to the Normal distribution but with a higher peak and longer tails to take account of the reduced precision in smaller samples. The exact shape is determined by the mean and variance plus the degrees of freedom. As the degrees of freedom increase, the shape comes closer to the Normal distribution […] The chi-squared distribution also plays an important role in statistics. If we take several variables, say n, which each follow a standard Normal distribution, and square each and add them, the sum of these will follow a chi-squared distribution with n degrees of freedom. This theoretical result is very useful and widely used in statistical testing […] The chi-squared distribution is always positive and its shape is uniquely determined by the degrees of freedom. The distribution becomes more symmetrical as the degrees of freedom increases. […] [The (noncentral) F distribution] is the distribution of the ratio of two chi-squared distributions and is used in hypothesis testing when we want to compare variances, such as in doing analysis of variance […] Sometimes data may follow a positively skewed distribution which becomes a Normal distribution when each data point is log-transformed [..] In this case the original data can be said to follow a lognormal distribution. The transformation of such data from log-normal to Normal is very useful in allowing skewed data to be analysed using methods based on the Normal distribution since these are usually more powerful than alternative methods”.
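
The chi-squared construction described in the quote is likewise easy to verify by simulation (a small sketch of mine, with an arbitrary choice of 5 degrees of freedom):

```python
import numpy as np

rng = np.random.default_rng(3)
df, n = 5, 200_000
z = rng.standard_normal(size=(n, df))
chi2_samples = (z ** 2).sum(axis=1)  # sum of df squared standard Normal variables
print(chi2_samples.mean(), chi2_samples.var())  # theory: mean = df, variance = 2*df
```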

Half-Normal distribution.
Bivariate Normal distribution.
Negative binomial distribution.
Beta distribution.
Gamma distribution.
Conditional probability.
Bayes theorem.

April 26, 2018 Posted by | Books, Data, Mathematics, Medicine, Statistics | Leave a comment

Medical Statistics (II)

In this post I’ll include some links and quotes related to topics covered in chapters 2 and 3 of the book. Chapter 2 is about ‘Collecting data’ and chapter 3 is about ‘Handling data: what steps are important?’

“Data collection is a key part of the research process, and the collection method will impact on later statistical analysis of the data. […] Think about the anticipated data analysis [in advance] so that data are collected in the appropriate format, e.g. if a mean will be needed for the analysis, then don’t record the data in categories, record the actual value. […] *It is useful to pilot the data collection process in a range of circumstances to make sure it will work in practice. *This usually involves trialling the data collection form on a smaller sample than intended for the study and enables problems with the data collection form to be identified and resolved prior to main data collection […] In general don’t expect the person filling out the form to do calculations as this may lead to errors, e.g. calculating a length of time between two dates. Instead, record each piece of information to allow computation of the particular value later […] The coding scheme should be designed at the same time as the form so that it can be built into the form. […] It may be important to distinguish between data that are simply missing from the original source and data that the data extractor failed to record. This can be achieved using different codes […] The use of numerical codes for non-numerical data may give the false impression that these data can be treated as if they were numerical data in the statistical analysis. This is not so.”

“It is critical that data quality is monitored and that this happens as the study progresses. It may be too late if problems are only discovered at the analysis stage. If checks are made during the data collection then problems can be corrected. More frequent checks may be worthwhile at the beginning of data collection when processes may be new and staff may be less experienced. […] The layout […] affects questionnaire completion rates and therefore impacts on the overall quality of the data collected.”

“Sometimes researchers need to develop a new measurement or questionnaire scale […] To do this rigorously requires a thorough process. We will outline the main steps here and note the most common statistical measures used in the process. […] Face validity *Is the scale measuring what it sets out to measure? […] Content validity *Does the scale cover all the relevant areas? […] *Between-observers consistency: is there agreement between different observers assessing the same individuals? *Within-observers consistency: is there agreement between assessments on the same individuals by the same observer on two different occasions? *Test-retest consistency: are assessments made on two separate occasions on the same individual similar? […] If a scale has several questions or items which all address the same issue then we usually expect each individual to get similar scores for those questions, i.e. we expect their responses to be internally consistent. […] Cronbach’s alpha […] is often used to assess the degree of internal consistency. [It] is calculated as an average of all correlations among the different questions on the scale. […] *Values are usually expected to be above 0.7 and below 0.9 *Alpha below 0.7 broadly indicates poor internal consistency *Alpha above 0.9 suggests that the items are very similar and perhaps fewer items could be used to obtain the same overall information”.
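
For illustration, here’s a minimal sketch of how Cronbach’s alpha could be computed from a subjects-by-items score matrix, using the standard variance-based formula; the scores are invented, and real scale development would obviously involve proper software and far more subjects:

```python
import numpy as np

scores = np.array([  # rows = subjects, columns = items on the scale (made-up data)
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
], dtype=float)

k = scores.shape[1]
sum_item_variances = scores.var(axis=0, ddof=1).sum()
total_score_variance = scores.sum(axis=1).var(ddof=1)
alpha = (k / (k - 1)) * (1 - sum_item_variances / total_score_variance)
print(f"Cronbach's alpha = {alpha:.2f}")  # high here, since the made-up items are nearly identical
```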

Bland–Altman plot.
Coefficient of variation.
Intraclass correlation.
Cohen’s kappa.
Likert scale. (“The key characteristic of Likert scales is that the scale is symmetrical. […] Care is needed when analyzing Likert scale data even though a numerical code is assigned to the responses, since the data are ordinal and discrete. Hence an average may be misleading […] It is quite common to collapse Likert scales into two or three categories such as agree versus disagree, but this has the disadvantage that data are discarded.”)
Visual analogue scale. (“VAS scores can be treated like continuous data […] Where it is feasible to use a VAS, it is preferable as it provides greater statistical power than a categorical scale”)

“Correct handling of data is essential to produce valid and reliable statistics. […] Data from research studies need to be coded […] It is important to document the coding scheme for categorical variables such as sex where it will not be obviously [sic, US] what the values mean […] It is strongly recommended that a unique numerical identifier is given to each subject, even if the research is conducted anonymously. […] Computerized datasets are often stored in a spreadsheet format with rows and columns of data. For most statistical analyses it is best to enter the data so that each row represents a different subject and each column a different variable. […] Prefixes or suffixes can be used to denote […] repeated measurements. If there are several repeated variables, use the same ‘scheme’ for all to avoid confusion. […] Try to avoid mixing suffixes and prefixes as it can cause confusion.”

“When data are entered onto a computer at different times it may be necessary to join datasets together. […] It is important to avoid over-writing a current dataset with a new updated version without keeping the old version as a separate file […] the two datasets must use exactly the same variable names for the same variables and the same coding. Any spelling mistakes will prevent a successful joining. […] It is worth checking that the joining has worked as expected by checking that the total number of observations in the updated file is the sum of the two previous files, and that the total number of variables is unchanged. […] When new data are collected on the same individuals at a later stage […], it may [again] be necessary to merge datasets. In order to do this the unique subject identifier must be used to identify the records that must be matched. For the merge to work, all variable names in the two datasets must be different except for the unique identifier. […] Spreadsheets are useful for entering and storing data. However, care should be taken when cutting and pasting different datasets to avoid misalignment of data. […] it is best not to join or sort datasets using a spreadsheet […in some research contexts, I’d add, this is also just plain impossible to even try, due to the amount of data involved – US…] […] It is important to ensure that a unique copy of the current file, the ‘master copy’, is stored at all times. Where the study involves more than one investigator, everyone needs to know who has responsibility for this. It is also important to avoid having two people revising the same file at the same time. […] It is important to keep a record of any changes that are made to the dataset and keep dated copies of datasets as changes are made […] Don’t overwrite datasets with edited versions as older versions may be needed later on.”
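
As a concrete, if simplified, sketch of the joining and merging steps described above, something along these lines could be done in e.g. pandas; the file names, the subject_id variable, and the one-to-one structure of the merge are all assumptions on my part:

```python
import pandas as pd

# joining two batches entered at different times (same variables, same coding)
first = pd.read_csv("entered_batch1.csv")
second = pd.read_csv("entered_batch2.csv")
combined = pd.concat([first, second], ignore_index=True)
assert len(combined) == len(first) + len(second)      # row count adds up
assert list(combined.columns) == list(first.columns)  # no new or missing variables

# merging later follow-up data on the same subjects, via the unique identifier
followup = pd.read_csv("followup.csv")
merged = combined.merge(followup, on="subject_id", how="left", validate="one_to_one")
```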

“Where possible, it is important to do some [data entry] checks early on to leave time for addressing problems while the study is in progress. […] *Check a random sample of forms for data entry accuracy. If this reveals problems then further checking may be needed. […] If feasible, consider checking data entry forms for key variables, e.g. the primary outcome. […] Range checks: […] tabulate all data to ensure there are no invalid values […] make sure responses are consistent with each other within subjects, e.g. check for any impossible or unlikely combination of responses such as a male with a pregnancy […] Check where feasible that any gaps are true gaps and not missed data entry […] Sometimes finding one error may lead to others being uncovered. For example, if a spreadsheet was used for data entry and one entry was missed, all following entries may be in the wrong columns. Hence, always consider if the discovery of one error may imply that there are others. […] Plots can be useful for checking larger datasets.”
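
Again, just to make it concrete, a few such checks might look like this in pandas (the variable names and the valid ranges are made up):

```python
import pandas as pd

df = pd.read_csv("trial_data.csv")  # hypothetical entered dataset

print(df["sex"].value_counts(dropna=False))                  # tabulate to spot invalid codes
out_of_range = df[(df["age"] < 18) | (df["age"] > 110)]      # simple range check
impossible = df[(df["sex"] == "M") & (df["pregnant"] == 1)]  # consistency check
print(len(out_of_range), "out-of-range ages;", len(impossible), "impossible combinations")
```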

Data monitoring committee.
Damocles guidelines.
Overview of stopping rules for clinical trials.
Pocock boundary.
Haybittle–Peto boundary.

“Trials are only stopped early when it is considered that the evidence for either benefit or harm is overwhelmingly strong. In such cases, the effect size will inevitably be larger than anticipated at the outset of the trial in order to trigger the early stop. Hence effect estimates from trials stopped early tend to be more extreme than would be the case if these trials had continued to the end, and so estimates of the efficacy or harm of a particular treatment may be exaggerated. This phenomenon has been demonstrated in recent reviews.1,2 […] Sometimes it becomes apparent part way through a trial that the assumptions made in the original sample size calculations are not correct. For example, where the primary outcome is a continuous variable, an estimate of the standard deviation (SD) is needed to calculate the required sample size. When the data are summarized during the trial, it may become apparent that the observed SD is different from that expected. This has implications for the statistical power. If the observed SD is smaller than expected then it may be reasonable to reduce the sample size but if it is bigger then it may be necessary to increase it.”
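
The SD point is easy to illustrate with the standard two-group sample size formula for a continuous outcome; the numbers below are arbitrary, and an actual trial would of course follow its protocol-specified procedures rather than this back-of-the-envelope sketch:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(sd, delta, alpha=0.05, power=0.9):
    """Two-group sample size for a continuous outcome: 2*(z_a/2 + z_b)^2 * sd^2 / delta^2."""
    z = NormalDist().inv_cdf
    return ceil(2 * (z(1 - alpha / 2) + z(power)) ** 2 * sd ** 2 / delta ** 2)

print(n_per_group(sd=10, delta=5))  # size planned with the assumed SD
print(n_per_group(sd=13, delta=5))  # size needed if the observed SD turns out larger
```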

April 16, 2018 Posted by | Books, Medicine, Statistics | Leave a comment

Medical Statistics (I)

I was more than a little critical of the book in my review on goodreads, and the review is sufficiently detailed that I thought it would be worth including it in this post. Here’s what I wrote on goodreads (slightly edited to take full advantage of the better editing options on wordpress):

“The coverage is excessively focused on significance testing. The book also provides very poor coverage of model selection topics, where the authors not once but repeatedly recommend employing statistically invalid approaches to model selection (the authors recommend using hypothesis testing mechanisms to guide model selection, as well as using adjusted R-squared for model selection decisions – both of which are frankly awful ideas, for reasons which are obvious to people familiar with the field of model selection. “Generally, hypothesis testing is a very poor basis for model selection […] There is no statistical theory that supports the notion that hypothesis testing with a fixed α level is a basis for model selection.” “While adjusted R2 is useful as a descriptive statistic, it is not useful in model selection” – quotes taken directly from Burnham & Anderson’s book Model Selection and Multi-Model Inference: A Practical Information-Theoretic Approach).

The authors do not at any point in the coverage even mention the option of using statistical information criteria to guide model selection decisions, and frankly repeatedly recommend doing things which are known to be deeply problematic. The authors also cover material from Borenstein and Hedges’ meta-analysis text in the book, yet still somehow manage to give poor advice in the context of meta-analysis along similar lines (implicitly advising people to base the decision of whether to use fixed effects or random effects on the results of heterogeneity tests, despite this approach being criticized as problematic in the aforementioned text).

Basic and not terrible, but there are quite a few problems with this text.”

I’ll add a few more details about the above-mentioned problems before moving on to the main coverage. As for the model selection topic I refer specifically to my coverage of Burnham and Anderson’s book here and here – these guys spent a lot of pages talking about why you shouldn’t do what the authors of this book recommend, and I’m sort of flabbergasted medical statisticians don’t know this kind of stuff by now. To people who’ve read both these books, it’s not really in question who’s in the right here.
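
To make the alternative concrete: information-criterion-based model selection of the kind Burnham & Anderson advocate can be as simple as comparing AIC values across candidate models, as in the little sketch below (the data are simulated and the setup is mine, not an example taken from either book):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 200
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 0.5 * x1 + rng.normal(size=n)  # x2 is pure noise in this setup

for name, X in [("x1 only", np.column_stack([x1])),
                ("x1 + x2", np.column_stack([x1, x2]))]:
    fit = sm.OLS(y, sm.add_constant(X)).fit()
    print(f"{name}: AIC = {fit.aic:.1f}")  # the smaller-AIC model is preferred
```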

I believe part of the reason why I was very annoyed at the authors at times was that they seem to promote exactly the sort of blind, unthinking hypothesis-testing approach to things that is unfortunately very common – the entire book is saturated with hypothesis testing stuff, which means that many other topics are woefully insufficiently covered. The meta-analysis example is probably quite illustrative; the authors spend multiple pages on study heterogeneity and how to deal with it, but the entire coverage there is centered around the discussion of a most likely underpowered test, the result of which should perhaps, in the best-case scenario, direct the researcher’s attention to topics he should have been thinking carefully about from the very start of his data analysis. You don’t need to quote many words from Borenstein and Hedges (here’s a relevant link) to get to the heart of the matter here:

“It makes sense to use the fixed-effect model if two conditions are met. First, we believe that all the studies included in the analysis are functionally identical. Second, our goal is to compute the common effect size for the identified population, and not to generalize to other populations. […] this situation is relatively rare. […] By contrast, when the researcher is accumulating data from a series of studies that had been performed by researchers operating independently, it would be unlikely that all the studies were functionally equivalent. Typically, the subjects or interventions in these studies would have differed in ways that would have impacted on the results, and therefore we should not assume a common effect size. Therefore, in these cases the random-effects model is more easily justified than the fixed-effect model.

A report should state the computational model used in the analysis and explain why this model was selected. A common mistake is to use the fixed-effect model on the basis that there is no evidence of heterogeneity. As [already] explained […], the decision to use one model or the other should depend on the nature of the studies, and not on the significance of this test [because the test will often have low power anyway].”

Yet these guys spend their efforts here talking about a test that is unlikely to yield useful information and which if anything probably distracts the reader from the main issues at hand; are the studies functionally equivalent? Do we assume there’s one (‘true’) effect size, or many? What do those coefficients we’re calculating actually mean? The authors do in fact include a lot of cautionary notes about how to interpret the test, but in my view all this means is that they’re devoting critical pages to peripheral issues – and perhaps even reinforcing the view that the test is important, or why else would they spend so much effort on it? – rather than promote good thinking about the key topics at hand.
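
For readers who haven’t seen the two computational models side by side, here’s a minimal sketch of inverse-variance fixed-effect pooling versus DerSimonian-Laird random-effects pooling; the effect sizes and standard errors are invented for illustration:

```python
import numpy as np

effects = np.array([0.30, 0.10, 0.55, 0.20, 0.45])  # e.g. log odds ratios (invented)
se = np.array([0.12, 0.15, 0.20, 0.10, 0.18])       # their standard errors (invented)

w_fixed = 1 / se**2
pooled_fixed = np.sum(w_fixed * effects) / np.sum(w_fixed)

# DerSimonian-Laird estimate of the between-study variance tau^2
q = np.sum(w_fixed * (effects - pooled_fixed) ** 2)
c = np.sum(w_fixed) - np.sum(w_fixed**2) / np.sum(w_fixed)
tau2 = max(0.0, (q - (len(effects) - 1)) / c)

w_random = 1 / (se**2 + tau2)
pooled_random = np.sum(w_random * effects) / np.sum(w_random)

print(f"fixed-effect estimate:   {pooled_fixed:.3f}")
print(f"random-effects estimate: {pooled_random:.3f} (tau^2 = {tau2:.3f})")
```

The sketch only shows the mechanics; the point of the quote above is precisely that the choice between the two models should be driven by the nature of the studies, not by the outcome of a heterogeneity test.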

Anyway, enough of the critical comments. Below a few links related to the first chapter of the book, as well as some quotes.

Declaration of Helsinki.
Randomized controlled trial.
Minimization (clinical trials).
Blocking (statistics).
Informed consent.
Blinding (RCTs). (…related xkcd link).
Parallel study. Crossover trial.
Zelen’s design.
Superiority, equivalence, and non-inferiority trials.
Intention-to-treat concept: A review.
Case-control study. Cohort study. Nested case-control study. Cross-sectional study.
Bradford Hill criteria.
Research protocol.
Sampling.
Type 1 and type 2 errors.
Clinical audit. A few quotes on this topic:

“‘Clinical audit’ is a quality improvement process that seeks to improve the patient care and outcomes through systematic review of care against explicit criteria and the implementation of change. Aspects of the structures, processes and outcomes of care are selected and systematically evaluated against explicit criteria. […] The aim of audit is to monitor clinical practice against agreed best practice standards and to remedy problems. […] the choice of topic is guided by indications of areas where improvement is needed […] Possible topics [include] *Areas where a problem has been identified […] *High volume practice […] *High risk practice […] *High cost […] *Areas of clinical practice where guidelines or firm evidence exists […] The organization carrying out the audit should have the ability to make changes based on their findings. […] In general, the same methods of statistical analysis are used for audit as for research […] The main difference between audit and research is in the aim of the study. A clinical research study aims to determine what practice is best, whereas an audit checks to see that best practice is being followed.”

A few more quotes from the end of the chapter:

“In clinical medicine and in medical research it is fairly common to categorize a biological measure into two groups, either to aid diagnosis or to classify an outcome. […] It is often useful to categorize a measurement in this way to guide decision-making, and/or to summarize the data but doing this leads to a loss of information which in turn has statistical consequences. […] If a continuous variable is used for analysis in a research study, a substantially smaller sample size will be needed than if the same variable is categorized into two groups […] *Categorization of a continuous variable into two groups loses much data and should be avoided whenever possible *Categorization of a continuous variable into several groups is less problematic”

“Research studies require certain specific data which must be collected to fulfil the aims of the study, such as the primary and secondary outcomes and main factors related to them. Beyond these data there are often other data that could be collected and it is important to weigh the costs and consequences of not collecting data that will be needed later against the disadvantages of collecting too much data. […] collecting too much data is likely to add to the time and cost to data collection and processing, and may threaten the completeness and/or quality of all of the data so that key data items are threatened. For example if a questionnaire is overly long, respondents may leave some questions out or may refuse to fill it out at all.”

Stratified samples are used when fixed numbers are needed from particular sections or strata of the population in order to achieve balance across certain important factors. For example a study designed to estimate the prevalence of diabetes in different ethnic groups may choose a random sample with equal numbers of subjects in each ethnic group to provide a set of estimates with equal precision for each group. If a simple random sample is used rather than a stratified sample, then estimates for minority ethnic groups may be based on small numbers and have poor precision. […] Cluster samples may be chosen where individuals fall naturally into groups or clusters. For example, patients on a hospital wards or patients in a GP practice. If a sample is needed of these patients, it may be easier to list the clusters and then to choose a random sample of clusters, rather than to choose a random sample of the whole population. […] Cluster sampling is less efficient statistically than simple random sampling […] the ICC summarizes the extent of the ‘clustering effect’. When individuals in the same cluster are much more alike than individuals in different clusters with respect to an outcome, then the clustering effect is greater and the impact on the required sample size is correspondingly greater. In practice there can be a substantial effect on the sample size even when the ICC is quite small. […] As well as considering how representative a sample is, it is important […] to consider the size of the sample. A sample may be unbiased and therefore representative, but too small to give reliable estimates. […] Prevalence estimates from small samples will be imprecise and therefore may be misleading. […] The greater the variability of a measure, the greater the number of subjects needed in the sample to estimate it precisely. […] the power of a study is the ability of the study to detect a difference if one exists.”
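
The ‘clustering effect’ mentioned in the quote is usually summarized by the design effect, 1 + (m-1)*ICC, where m is the cluster size; a tiny sketch with arbitrary numbers shows how quickly even small ICCs inflate the required sample size:

```python
def design_effect(cluster_size, icc):
    # inflation factor for cluster sampling relative to simple random sampling
    return 1 + (cluster_size - 1) * icc

n_srs = 400  # sample size a simple random sample would need (made-up number)
for icc in (0.01, 0.05, 0.10):
    deff = design_effect(cluster_size=30, icc=icc)
    print(f"ICC {icc:.2f}: design effect {deff:.2f}, required n about {round(n_srs * deff)}")
```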

April 9, 2018 Posted by | Books, Epidemiology, Medicine, Statistics | Leave a comment

Networks

I actually think this was a really nice book, considering the format – I gave it four stars on goodreads. One of the things I noticed people didn’t like about it in the reviews is that it ‘jumps’ a bit in terms of topic coverage; it covers a wide variety of applications and analytical settings. I mostly don’t consider this a weakness of the book – even if occasionally it does get a bit excessive – and I can definitely understand the authors’ choice of approach; it’s sort of hard to illustrate the potential the analytical techniques described within this book have if you’re not allowed to talk about all the areas in which they have been – or could be gainfully – applied. A related point is that many people who read the book might be familiar with the application of these tools in specific contexts but have perhaps not thought about the fact that similar methods are applied in many other areas (and they might all of them be a bit annoyed the authors don’t talk more about computer science applications, or foodweb analyses, or infectious disease applications, or perhaps sociometry…). Most of the book is about graph-theory-related stuff, but a very decent amount of the coverage deals with applications, in a broad sense of the word at least, not theory. The discussion of theoretical constructs in the book always felt to me driven to a large degree by their usefulness in specific contexts.

I have covered related topics before here on the blog, also quite recently – e.g. there’s at least some overlap between this book and Holland’s book about complexity theory in the same series (I incidentally think these books probably go well together) – and as I found the book slightly difficult to blog as it was I decided against covering it in as much detail as I sometimes do when covering these texts – this means that I decided to leave out the links I usually include in posts like these.

Below some quotes from the book.

“The network approach focuses all the attention on the global structure of the interactions within a system. The detailed properties of each element on its own are simply ignored. Consequently, systems as different as a computer network, an ecosystem, or a social group are all described by the same tool: a graph, that is, a bare architecture of nodes bounded by connections. […] Representing widely different systems with the same tool can only be done by a high level of abstraction. What is lost in the specific description of the details is gained in the form of universality – that is, thinking about very different systems as if they were different realizations of the same theoretical structure. […] This line of reasoning provides many insights. […] The network approach also sheds light on another important feature: the fact that certain systems that grow without external control are still capable of spontaneously developing an internal order. […] Network models are able to describe in a clear and natural way how self-organization arises in many systems. […] In the study of complex, emergent, and self-organized systems (the modern science of complexity), networks are becoming increasingly important as a universal mathematical framework, especially when massive amounts of data are involved. […] networks are crucial instruments to sort out and organize these data, connecting individuals, products, news, etc. to each other. […] While the network approach eliminates many of the individual features of the phenomenon considered, it still maintains some of its specific features. Namely, it does not alter the size of the system — i.e. the number of its elements — or the pattern of interaction — i.e. the specific set of connections between elements. Such a simplified model is nevertheless enough to capture the properties of the system. […] The network approach [lies] somewhere between the description by individual elements and the description by big groups, bridging the two of them. In a certain sense, networks try to explain how a set of isolated elements are transformed, through a pattern of interactions, into groups and communities.”

“[T]he random graph model is very important because it quantifies the properties of a totally random network. Random graphs can be used as a benchmark, or null case, for any real network. This means that a random graph can be used in comparison to a real-world network, to understand how much chance has shaped the latter, and to what extent other criteria have played a role. The simplest recipe for building a random graph is the following. We take all the possible pair of vertices. For each pair, we toss a coin: if the result is heads, we draw a link; otherwise we pass to the next pair, until all the pairs are finished (this means drawing the link with a probability p = ½, but we may use whatever value of p). […] Nowadays [the random graph model] is a benchmark of comparison for all networks, since any deviations from this model suggests the presence of some kind of structure, order, regularity, and non-randomness in many real-world networks.”
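
The coin-tossing recipe quoted above translates almost directly into code; here’s a sketch using only the Python standard library (the number of nodes and the value of p are arbitrary choices of mine):

```python
import itertools
import random

random.seed(0)
n, p = 100, 0.05  # arbitrary network size and link probability
# for every possible pair of vertices, draw the link with probability p
edges = [pair for pair in itertools.combinations(range(n), 2) if random.random() < p]
print(len(edges), "edges; expected about", round(p * n * (n - 1) / 2))
```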

“…in networks, topology is more important than metrics. […] In the network representation, the connections between the elements of a system are much more important than their specific positions in space and their relative distances. The focus on topology is one of its biggest strengths of the network approach, useful whenever topology is more relevant than metrics. […] In social networks, the relevance of topology means that social structure matters. […] Sociology has classified a broad range of possible links between individuals […]. The tendency to have several kinds of relationships in social networks is called multiplexity. But this phenomenon appears in many other networks: for example, two species can be connected by different strategies of predation, two computers by different cables or wireless connections, etc. We can modify a basic graph to take into account this multiplexity, e.g. by attaching specific tags to edges. […] Graph theory [also] allows us to encode in edges more complicated relationships, as when connections are not reciprocal. […] If a direction is attached to the edges, the resulting structure is a directed graph […] In these networks we have both in-degree and out-degree, measuring the number of inbound and outbound links of a node, respectively. […] in most cases, relations display a broad variation or intensity [i.e. they are not binary/dichotomous]. […] Weighted networks may arise, for example, as a result of different frequencies of interactions between individuals or entities.”

“An organism is […] the outcome of several layered networks and not only the deterministic result of the simple sequence of genes. Genomics has been joined by epigenomics, transcriptomics, proteomics, metabolomics, etc., the disciplines that study these layers, in what is commonly called the omics revolution. Networks are at the heart of this revolution. […] The brain is full of networks where various web-like structures provide the integration between specialized areas. In the cerebellum, neurons form modules that are repeated again and again: the interaction between modules is restricted to neighbours, similarly to what happens in a lattice. In other areas of the brain, we find random connections, with a more or less equal probability of connecting local, intermediate, or distant neurons. Finally, the neocortex — the region involved in many of the higher functions of mammals — combines local structures with more random, long-range connections. […] typically, food chains are not isolated, but interwoven in intricate patterns, where a species belongs to several chains at the same time. For example, a specialized species may predate on only one prey […]. If the prey becomes extinct, the population of the specialized species collapses, giving rise to a set of co-extinctions. An even more complicated case is where an omnivore species predates a certain herbivore, and both eat a certain plant. A decrease in the omnivore’s population does not imply that the plant thrives, because the herbivore would benefit from the decrease and consume even more plants. As more species are taken into account, the population dynamics can become more and more complicated. This is why a more appropriate description than ‘foodchains’ for ecosystems is the term foodwebs […]. These are networks in which nodes are species and links represent relations of predation. Links are usually directed (big fishes eat smaller ones, not the other way round). These networks provide the interchange of food, energy, and matter between species, and thus constitute the circulatory system of the biosphere.”

“In the cell, some groups of chemicals interact only with each other and with nothing else. In ecosystems, certain groups of species establish small foodwebs, without any connection to external species. In social systems, certain human groups may be totally separated from others. However, such disconnected groups, or components, are a strikingly small minority. In all networks, almost all the elements of the systems take part in one large connected structure, called a giant connected component. […] In general, the giant connected component includes not less than 90 to 95 per cent of the system in almost all networks. […] In a directed network, the existence of a path from one node to another does not guarantee that the journey can be made in the opposite direction. Wolves eat sheep, and sheep eat grass, but grass does not eat sheep, nor do sheep eat wolves. This restriction creates a complicated architecture within the giant connected component […] according to an estimate made in 1999, more than 90 per cent of the WWW is composed of pages connected to each other, if the direction of edges is ignored. However, if we take direction into account, the proportion of nodes mutually reachable is only 24 per cent, the giant strongly connected component. […] most networks are sparse, i.e. they tend to be quite frugal in connections. Take, for example, the airport network: the personal experience of every frequent traveller shows that direct flights are not that common, and intermediate stops are necessary to reach several destinations; thousands of airports are active, but each city is connected to less than 20 other cities, on average. The same happens in most networks. A measure of this is given by the mean number of connection of their nodes, that is, their average degree.”

“[A] puzzling contradiction — a sparse network can still be very well connected — […] attracted the attention of the Hungarian mathematicians […] Paul Erdős and Alfréd Rényi. They tackled it by producing different realizations of their random graph. In each of them, they changed the density of edges. They started with a very low density: less than one edge per node. It is natural to expect that, as the density increases, more and more nodes will be connected to each other. But what Erdős and Rényi found instead was a quite abrupt transition: several disconnected components coalesced suddenly into a large one, encompassing almost all the nodes. The sudden change happened at one specific critical density: when the average number of links per node (i.e. the average degree) was greater than one, then the giant connected component suddenly appeared. This result implies that networks display a very special kind of economy, intrinsic to their disordered structure: a small number of edges, even randomly distributed between nodes, is enough to generate a large structure that absorbs almost all the elements. […] Social systems seem to be very tightly connected: in a large enough group of strangers, it is not unlikely to find pairs of people with quite short chains of relations connecting them. […] The small-world property consists of the fact that the average distance between any two nodes (measured as the shortest path that connects them) is very small. Given a node in a network […], few nodes are very close to it […] and few are far from it […]: the majority are at the average — and very short — distance. This holds for all networks: starting from one specific node, almost all the nodes are at very few steps from it; the number of nodes within a certain distance increases exponentially fast with the distance. Another way of explaining the same phenomenon […] is the following: even if we add many nodes to a network, the average distance will not increase much; one has to increase the size of a network by several orders of magnitude to notice that the paths to new nodes are (just a little) longer. The small-world property is crucial to many network phenomena. […] The small-world property is something intrinsic to networks. Even the completely random Erdős-Renyi graphs show this feature. By contrast, regular grids do not display it. If the Internet was a chessboard-like lattice, the average distance between two routers would be of the order of 1,000 jumps, and the Net would be much slower [the authors note elsewhere that “The Internet is composed of hundreds of thousands of routers, but just about ten ‘jumps’ are enough to bring an information packet from one of them to any other.”] […] The key ingredient that transforms a structure of connections into a small world is the presence of a little disorder. No real network is an ordered array of elements. On the contrary, there are always connections ‘out of place’. It is precisely thanks to these connections that networks are small worlds. […] Shortcuts are responsible for the small-world property in many […] situations.”
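
The Erdős–Rényi transition described above is easy to reproduce by simulation; below a sketch using networkx, where the graph size and the average degrees tried are arbitrary choices of mine:

```python
import networkx as nx

n = 10_000
for avg_degree in (0.5, 0.9, 1.1, 2.0, 4.0):
    g = nx.gnp_random_graph(n, avg_degree / (n - 1), seed=42)
    giant = max(nx.connected_components(g), key=len)
    print(f"average degree {avg_degree}: largest component has {len(giant)} nodes")
```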

“Body size, IQ, road speed, and other magnitudes have a characteristic scale: that is, an average value that in the large majority of cases is a rough predictor of the actual value that one will find. […] While height is a homogeneous magnitude, the number of social connection[s] is a heterogeneous one. […] A system with this feature is said to be scale-free or scale-invariant, in the sense that it does not have a characteristic scale. This can be rephrased by saying that the individual fluctuations with respect to the average are too large for us to make a correct prediction. […] In general, a network with heterogeneous connectivity has a set of clear hubs. When a graph is small, it is easy to find whether its connectivity is homogeneous or heterogeneous […]. In the first case, all the nodes have more or less the same connectivity, while in the latter it is easy to spot a few hubs. But when the network to be studied is very big […] things are not so easy. […] the distribution of the connectivity of the nodes of the […] network […] is the degree distribution of the graph. […] In homogeneous networks, the degree distribution is a bell curve […] while in heterogeneous networks, it is a power law […]. The power law implies that there are many more hubs (and much more connected) in heterogeneous networks than in homogeneous ones. Moreover, hubs are not isolated exceptions: there is a full hierarchy of nodes, each of them being a hub compared with the less connected ones.”

“Looking at the degree distribution is the best way to check if a network is heterogeneous or not: if the distribution is fat tailed, then the network will have hubs and heterogeneity. A mathematically perfect power law is never found, because this would imply the existence of hubs with an infinite number of connections. […] Nonetheless, a strongly skewed, fat-tailed distribution is a clear signal of heterogeneity, even if it is never a perfect power law. […] While the small-world property is something intrinsic to networked structures, hubs are not present in all kind of networks. For example, power grids usually have very few of them. […] hubs are not present in random networks. A consequence of this is that, while random networks are small worlds, heterogeneous ones are ultra-small worlds. That is, the distance between their vertices is relatively smaller than in their random counterparts. […] Heterogeneity is not equivalent to randomness. On the contrary, it can be the signature of a hidden order, not imposed by a top-down project, but generated by the elements of the system. The presence of this feature in widely different networks suggests that some common underlying mechanism may be at work in many of them. […] the Barabási–Albert model gives an important take-home message. A simple, local behaviour, iterated through many interactions, can give rise to complex structures. This arises without any overall blueprint”.
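
The heterogeneity point can be seen by comparing the degree distributions of a random graph and a Barabási–Albert (preferential attachment) graph of roughly the same size and density; again a networkx sketch with arbitrary parameters:

```python
import networkx as nx

n = 10_000
random_g = nx.gnp_random_graph(n, 6 / (n - 1), seed=1)  # homogeneous, average degree ~6
ba_g = nx.barabasi_albert_graph(n, 3, seed=1)           # preferential attachment, average degree ~6

for name, g in [("random graph", random_g), ("Barabasi-Albert graph", ba_g)]:
    degrees = [d for _, d in g.degree()]
    print(f"{name}: mean degree {sum(degrees) / n:.1f}, max degree {max(degrees)}")
```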

Homogamy, the tendency of like to marry like, is very strong […] Homogamy is a specific instance of homophily: this consists of a general trend of like to link to like, and is a powerful force in shaping social networks […] assortative mixing [is] a special form of homophily, in which nodes tend to connect with others that are similar to them in the number of connections. By contrast [when] high- and low-degree nodes are more connected to each other [it] is called disassortative mixing. Both cases display a form of correlation in the degrees of neighbouring nodes. When the degrees of neighbours are positively correlated, then the mixing is assortative; when negatively, it is disassortative. […] In random graphs, the neighbours of a given node are chosen completely at random: as a result, there is no clear correlation between the degrees of neighbouring nodes […]. On the contrary, correlations are present in most real-world networks. Although there is no general rule, most natural and technological networks tend to be disassortative, while social networks tend to be assortative. […] Degree assortativity and disassortativity are just an example of the broad range of possible correlations that bias how nodes tie to each other.”

“[N]etworks (neither ordered lattices nor random graphs), can have both large clustering and small average distance at the same time. […] in almost all networks, the clustering of a node depends on the degree of that node. Often, the larger the degree, the smaller the clustering coefficient. Small-degree nodes tend to belong to well-interconnected local communities. Similarly, hubs connect with many nodes that are not directly interconnected. […] Central nodes usually act as bridges or bottlenecks […]. For this reason, centrality is an estimate of the load handled by a node of a network, assuming that most of the traffic passes through the shortest paths (this is not always the case, but it is a good approximation). For the same reason, damaging central nodes […] can impair radically the flow of a network. Depending on the process one wants to study, other definitions of centrality can be introduced. For example, closeness centrality computes the distance of a node to all others, and reach centrality factors in the portion of all nodes that can be reached in one step, two steps, three steps, and so on.”
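
Below a short sketch of the clustering-versus-degree pattern and of closeness centrality, using the same kind of networkx toy model as above (again arbitrary choices of mine):

```python
import networkx as nx

g = nx.barabasi_albert_graph(2_000, 3, seed=3)
clustering = nx.clustering(g)
degrees = dict(g.degree())

hubs = sorted(degrees, key=degrees.get, reverse=True)[:5]
print("clustering of the five biggest hubs:", [round(clustering[v], 3) for v in hubs])
print("average clustering overall:", round(nx.average_clustering(g), 3))

closeness = nx.closeness_centrality(g)
print("most central node by closeness:", max(closeness, key=closeness.get))
```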

“Domino effects are not uncommon in foodwebs. Networks in general provide the backdrop for large-scale, sudden, and surprising dynamics. […] most of the real-world networks show a doubled-edged kind of robustness. They are able to function normally even when a large fraction of the network is damaged, but suddenly certain small failures, or targeted attacks, bring them down completely. […] networks are very different from engineered systems. In an airplane, damaging one element is enough to stop the whole machine. In order to make it more resilient, we have to use strategies such as duplicating certain pieces of the plane: this makes it almost 100 per cent safe. In contrast, networks, which are mostly not blueprinted, display a natural resilience to a broad range of errors, but when certain elements fail, they collapse. […] A random graph of the size of most real-world networks is destroyed after the removal of half of the nodes. On the other hand, when the same procedure is performed on a heterogeneous network (either a map of a real network or a scale-free model of a similar size), the giant connected component resists even after removing more than 80 per cent of the nodes, and the distance within it is practically the same as at the beginning. The scene is different when researchers simulate a targeted attack […] In this situation the collapse happens much faster […]. However, now the most vulnerable is the second: while in the homogeneous network it is necessary to remove about one-fifth of its more connected nodes to destroy it, in the heterogeneous one this happens after removing the first few hubs. Highly connected nodes seem to play a crucial role, in both errors and attacks. […] hubs are mainly responsible for the overall cohesion of the graph, and removing a few of them is enough to destroy it.”
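
The error-versus-attack asymmetry can likewise be reproduced in a few lines; in the sketch below (the model, the sizes, and the 5 per cent removal fraction are all arbitrary choices of mine) nodes are removed either at random or in decreasing order of degree, and the size of the giant connected component is then compared:

```python
import random
import networkx as nx

def giant_size(g):
    # size of the largest connected component (0 if the graph is empty)
    return max((len(c) for c in nx.connected_components(g)), default=0)

g0 = nx.barabasi_albert_graph(5_000, 2, seed=7)  # a heterogeneous (scale-free-like) network
k = int(0.05 * g0.number_of_nodes())             # remove 5% of the nodes

random.seed(7)
g_err = g0.copy()                                # random failures
g_err.remove_nodes_from(random.sample(list(g_err.nodes()), k))

g_att = g0.copy()                                # targeted attack on the most connected nodes
hubs = sorted(g_att.degree(), key=lambda nd: nd[1], reverse=True)[:k]
g_att.remove_nodes_from([node for node, _ in hubs])

print("after random failures, giant component size:", giant_size(g_err))
print("after targeted attack, giant component size:", giant_size(g_att))
```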

“Studies of errors and attacks have shown that hubs keep different parts of a network connected. This implies that they also act as bridges for spreading diseases. Their numerous ties put them in contact with both infected and healthy individuals: so hubs become easily infected, and they infect other nodes easily. […] The vulnerability of heterogeneous networks to epidemics is bad news, but understanding it can provide good ideas for containing diseases. […] if we can immunize just a fraction, it is not a good idea to choose people at random. Most of the time, choosing at random implies selecting individuals with a relatively low number of connections. Even if they block the disease from spreading in their surroundings, hubs will always be there to put it back into circulation. A much better strategy would be to target hubs. Immunizing hubs is like deleting them from the network, and the studies on targeted attacks show that eliminating a small fraction of hubs fragments the network: thus, the disease will be confined to a few isolated components. […] in the epidemic spread of sexually transmitted diseases the timing of the links is crucial. Establishing an unprotected link with a person before they establish an unprotected link with another person who is infected is not the same as doing so afterwards.”
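
A similarly crude toy simulation can be used to illustrate the immunization point. The sketch below is my own, not the book’s: a discrete-time SIR-type process on a hub-rich model network, comparing random immunization with hub-targeted immunization. The transmission probability, the immunized fraction, and the network model are all made-up values chosen purely for illustration.

```python
# A minimal sketch (not from the book): random vs. hub-targeted immunization
# in a crude discrete-time SIR process on a hub-rich model network.
import random
import networkx as nx

def outbreak_size(g, immune, beta=0.2, n_seeds=10, steps=200, rng_seed=5):
    """Fraction of all nodes ever infected, given a set of immunized nodes."""
    rng = random.Random(rng_seed)
    susceptible = set(g.nodes()) - immune
    infected = set(rng.sample(sorted(susceptible), n_seeds))
    susceptible -= infected
    recovered = set()
    for _ in range(steps):
        newly_infected = set()
        for v in infected:
            for w in g.neighbors(v):
                if w in susceptible and rng.random() < beta:
                    newly_infected.add(w)
        recovered |= infected            # infectious for one time step, then recovered
        susceptible -= newly_infected
        infected = newly_infected
        if not infected:
            break
    return len(recovered) / g.number_of_nodes()

g = nx.barabasi_albert_graph(n=3000, m=3, seed=4)
k = 300                                  # immunize 10% of the population
rng = random.Random(4)
random_immune = set(rng.sample(sorted(g.nodes()), k))
hub_immune = set(sorted(g.nodes(), key=lambda v: g.degree(v), reverse=True)[:k])

print("outbreak size, random immunization:", round(outbreak_size(g, random_immune), 2))
print("outbreak size, hub immunization   :", round(outbreak_size(g, hub_immune), 2))
# Immunizing the same number of hubs typically confines the outbreak to a much
# smaller fraction of the network than immunizing people at random.
```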

April 3, 2018 Posted by | Biology, Books, Ecology, Engineering, Epidemiology, Genetics, Mathematics, Statistics | Leave a comment

Safety-Critical Systems

Some links related to topics covered in the lecture:

Safety-critical system.
Safety engineering.
Fault tree analysis.
Failure mode and effects analysis.
Fail-safe.
Value of a statistical life.
ALARP principle.
Hazards and Risk (HSA).
Software system safety.
Aleatoric and epistemic uncertainty.
N-version programming.
An experimental evaluation of the assumption of independence in multiversion programming (Knight & Leveson).
Safety integrity level.
Software for Dependable Systems – Sufficient Evidence? (consensus study report).

March 15, 2018 Posted by | Computer science, Economics, Engineering, Lectures, Statistics | Leave a comment

Prevention of Late-Life Depression (I)

“Late-life depression is a common and highly disabling condition and is also associated with higher health care utilization and overall costs. The presence of depression may complicate the course and treatment of comorbid major medical conditions that are also highly prevalent among older adults — including diabetes, hypertension, and heart disease. Furthermore, a considerable body of evidence has demonstrated that, for older persons, residual symptoms and functional impairment due to depression are common — even when appropriate depression therapies are being used. Finally, the worldwide phenomenon of a rapidly expanding older adult population means that unprecedented numbers of seniors — and the providers who care for them — will be facing the challenge of late-life depression. For these reasons, effective prevention of late-life depression will be a critical strategy to lower overall burden and cost from this disorder. […] This textbook will illustrate the imperative for preventing late-life depression, introduce a broad range of approaches and key elements involved in achieving effective prevention, and provide detailed examples of applications of late-life depression prevention strategies”.

I gave the book two stars on goodreads. There are 11 chapters in the book, written by 22 different contributors/authors, so of course there’s a lot of variation in the quality of the material included; the two-star rating was an overall assessment of the quality of the material, and the last two chapters – but in particular chapter 10 – did a really good job convincing me that the book did not deserve a 3rd star (if you decide to read the book, I advise you to skip chapter 10). In general I think many of the authors are way too focused on statistical significance and much too hesitant to report actual effect sizes, which are much more interesting. Gender is mentioned repeatedly throughout the coverage as an important variable, to the extent that people who do not read the book carefully might think this is one of the most important variables at play; but when you look at actual effect sizes, you get reported ORs of ~1.4 for this variable, compared to e.g. ORs in the ~8–9 range for the bereavement variable (see below). You can quibble about population attributable fraction and so on here, but if the effect size is that small it’s unlikely to be all that useful in terms of directing prevention efforts/resource allocation (especially considering that women make up the majority of the total population in these older age groups anyway, as they have higher life expectancy than their male counterparts).

Anyway, below I’ve added some quotes and observations from the first few chapters of the book.

“Meta-analyses of more than 30 randomized trials conducted in the High Income Countries show that the incidence of new depressive and anxiety disorders can be reduced by 25–50 % over 1–2 years, compared to usual care, through the use of learning-based psychotherapies (such as interpersonal psychotherapy, cognitive behavioral therapy, and problem solving therapy) […] The case for depression prevention is compelling and represents the key rationale for this volume: (1) Major depression is both prevalent and disabling, typically running a relapsing or chronic course. […] (2) Major depression is often comorbid with other chronic conditions like diabetes, amplifying the disability associated with these conditions and worsening family caregiver burden. (3) Depression is associated with worse physical health outcomes, partly mediated through poor treatment adherence, and it is associated with excess mortality after myocardial infarction, stroke, and cancer. It is also the major risk factor for suicide across the life span and particularly in old age. (4) Available treatments are only partially effective in reducing symptom burden, sustaining remission, and averting years lived with disability.”

“[M]any people suffering from depression do not receive any care and approximately a third of those receiving care do not respond to current treatments. The risk of recurrence is high, also in older persons: half of those who have experienced a major depression will experience one or even more recurrences [4]. […] Depression increases the risk of death: among people suffering from depression the risk of dying is 1.65 times higher than among people without a depression [7], with a dose-response relation between severity and duration of depression and the resulting excess mortality [8]. In adults, the average length of a depressive episode is 8 months but among 20 % of people the depression lasts longer than 2 years [9]. […] It has been estimated that in Australia […] 60 % of people with an affective disorder receive treatment, and using guidelines and standards only 34 % receive effective treatment [14]. This translates into preventing 15 % of Years Lived with Disability [15], a measure of disease burden [14] and stresses the need for prevention [16]. Primary health care providers frequently do not recognize depression, in particular among the elderly. Older people may present their depressive symptoms differently from younger adults, with more emphasis on physical complaints [17, 18]. Adequate diagnosis of late-life depression can also be hampered by comorbid conditions such as Parkinson’s disease and dementia that may have similar symptoms, or by the fact that elderly people as well as care workers may assume that “feeling down” is part of becoming older [17, 18]. […] Many people suffering from depression do not seek professional help or are not identified as depressed [21]. Almost 14 % of elderly people living in community-type living suffer from a severe depression requiring clinical attention [22] and more than 50 % of those have a chronic course [4, 23]. Smit et al. reported an incidence of 6.1 % of chronic or recurrent depression among a sample of 2,200 elderly people (ages 55–85) [21].”

“Prevention differs from intervention and treatment as it is aimed at general population groups who vary in risk level for mental health problems such as late-life depression. The Institute of Medicine (IOM) has introduced a prevention framework, which provides a useful model for comprehending the different objectives of the interventions [29]. The overall goal of prevention programs is reducing risk factors and enhancing protective factors.
The IOM framework distinguishes three types of prevention interventions: (1) universal preventive interventions, (2) selective preventive interventions, and (3) indicated preventive interventions. Universal preventive interventions are targeted at the general audience, regardless of their risk status or the presence of symptoms. Selective preventive interventions serve those sub-populations who have a significantly higher than average risk of a disorder, either imminently or over a lifetime. Indicated preventive interventions target identified individuals with minimal but detectable signs or symptoms suggesting a disorder. This type of prevention consists of early recognition and early intervention of the diseases to prevent deterioration [30]. For each of the three types of interventions, the goal is to reduce the number of new cases. The goal of treatment, on the other hand, is to reduce prevalence or the total number of cases. By reducing incidence you also reduce prevalence [5]. […] prevention research differs from treatment research in various ways. One of the most important differences is the fact that participants in treatment studies already meet the criteria for the illness being studied, such as depression. The intervention is targeted at improvement or remission of the specific condition quicker than if no intervention had taken place. In prevention research, the participants do not meet the specific criteria for the illness being studied and the overall goal of the intervention is to prevent the development of a clinical illness at a lower rate than a comparison group [5].”

“A couple of risk factors [for depression] occur more frequently among the elderly than among young adults. The loss of a loved one or the loss of a social role (e.g., employment), decrease of social support and network, and the increasing chance of isolation occur more frequently among the elderly. Many elderly also suffer from physical diseases: 64 % of elderly aged 65–74 have a chronic disease [36] […]. It is important to note that depression often co-occurs with other disorders such as physical illness and other mental health problems (comorbidity). Losing a spouse can have significant mental health effects. Almost half of all widows and widowers during the first year after the loss meet the criteria for depression according to the DSM-IV [37]. Depression after loss of a loved one is normal in times of mourning. However, when depressive symptoms persist during a longer period of time it is possible that a depression is developing. Zisook and Shuchter found that a year after the loss of a spouse 16 % of widows and widowers met the criteria of a depression compared to 4 % of those who did not lose their spouse [38]. […] People with a chronic physical disease are also at a higher risk of developing a depression. An estimated 12–36 % of those with a chronic physical illness also suffer from clinical depression [40]. […] around 25 % of cancer patients suffer from depression [40]. […] Depression is relatively common among elderly residing in hospitals and retirement- and nursing homes. An estimated 6–11 % of residents have a depressive illness and among 30 % have depressive symptoms [41]. […] Loneliness is common among the elderly. Among those of 60 years or older, 43 % reported being lonely in a study conducted by Perissinotto et al. […] Loneliness is often associated with physical and mental complaints; apart from depression it also increases the chance of developing dementia and excess mortality [43].”

“From the public health perspective it is important to know what the potential health benefits would be if the harmful effect of certain risk factors could be removed. What health benefits would arise from this, and at what effort and cost? To measure this, the population attributable fraction (PAF) can be used. The PAF is expressed as a percentage and demonstrates the decrease in incidence (number of new cases) when the harmful effects of the targeted risk factors are fully taken away. For public health it would be more effective to design an intervention targeted at a risk factor with a high PAF than a low PAF. […] An intervention needs to be efficacious in order to be implemented; this means that it has to show a statistically significant difference with placebo or other treatment. Secondly, it needs to be effective; it needs to prove its benefits also in real life (“everyday care”) circumstances. Thirdly, it needs to be efficient. The measure to address this is the Number Needed to Treat (NNT). The NNT expresses how many people need to be treated to prevent the onset of one new case of the disorder; the lower the number, the more efficient the intervention [45]. To summarize, an indicated preventative intervention would ideally be targeted at a relatively small group of people with a high, absolute chance of developing the disease, and a risk profile that is responsible for a high PAF. Furthermore, there needs to be an intervention that is both effective and efficient. […] a more detailed and specific description of the target group results in a higher absolute risk, a lower NNT, and also a lower PAF. This is helpful in determining the costs and benefits of interventions aiming at more specific or broader subgroups in the population. […] Unfortunately very large samples are required to demonstrate reductions in universal or selective interventions [46]. […] If the incidence rate is higher in the target population, which is usually the case in selective and even more so in indicated prevention, the number of participants needed to prove an effect is much smaller [5]. This shows that, even though universal interventions may be effective, their effect is harder to prove than that of indicated prevention. […] Indicated and selective preventions appear to be the most successful in preventing depression to date; however, more research needs to be conducted in larger samples to determine which prevention method is really most effective.”
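
The PAF and NNT measures described here are simple to compute. Below a small sketch of my own with made-up numbers: the exposure prevalences and incidence rates are invented for illustration, and I’m loosely treating the ORs mentioned elsewhere in the coverage (roughly 1.4 for gender, roughly 8.8 for bereavement) as if they were relative risks.

```python
# A minimal sketch with made-up numbers (not from the book): PAF via Levin's
# formula, and the number needed to treat for a preventive intervention.

def paf(p_exposed, relative_risk):
    """Levin's formula: the fraction of new cases that would not occur if the
    excess risk associated with the exposure were removed."""
    excess = p_exposed * (relative_risk - 1)
    return excess / (1 + excess)

def nnt(incidence_control, incidence_intervention):
    """How many people must receive the intervention to prevent one new case."""
    return 1 / (incidence_control - incidence_intervention)

# A weak but very common risk factor vs. a strong but rare one
# (the prevalences 0.55 and 0.03 are invented for illustration):
print(round(paf(p_exposed=0.55, relative_risk=1.4), 2))   # ~0.18
print(round(paf(p_exposed=0.03, relative_risk=8.8), 2))   # ~0.19
# A rare, strong exposure can account for roughly the same share of cases
# as a weak exposure that affects half the population.

# If an indicated prevention programme lowered 1-year incidence in a high-risk
# group from 12% to 8% (invented numbers), the NNT would be 1/0.04 = 25:
print(round(nnt(0.12, 0.08)))
```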

“Groffen et al. [6] recently conducted an investigation among a sample of 4,809 participants from the Reykjavik Study (aged 66–93 years). Similar to the findings presented by Vink and colleagues [3], education level was related to depression risk: participants with lower education levels were more likely to report depressed mood in late-life than those with a college education (odds ratio [OR] = 1.87, 95 % confidence interval [CI] = 1.35–2.58). […] Results from a meta-analysis by Lorant and colleagues [8] showed that lower SES individuals had a greater odds of developing depression than those in the highest SES group (OR = 1.24, p= 0.004); however, the studies involved in this review did not focus on older populations. […] Cole and Dendukuri [10] performed a meta-analysis of studies involving middle-aged and older adult community residents, and determined that female gender was a risk factor for depression in this population (Pooled OR = 1.4, 95 % CI = 1.2–1.8), but not old age. Blazer and colleagues [11] found a significant positive association between older age and depressive symptoms in a sample consisting of community-dwelling older adults; however, when potential confounders such as physical disability, cognitive impairment, and gender were included in the analysis, the relationship between chronological age and depressive symptoms was reversed (p< 0.01). A study by Schoevers and colleagues [14] had similar results […] these findings suggest that higher incidence of depression observed among the oldest-old may be explained by other relevant factors. By contrast, the association of female gender with increased risk of late-life depression has been observed to be a highly consistent finding.”

“In an examination of marital bereavement, Turvey et al. [16] analyzed data among 5,449 participants aged 70 years […] recently bereaved participants had nearly nine times the odds of developing syndromal depression as married participants (OR = 8.8, 95 % CI = 5.1–14.9, p<0.0001), and they also had significantly higher risk of depressive symptoms 2 years after the spousal loss. […] Caregiving burden is well-recognized as a predisposing factor for depression among older adults [18]. Many older persons are coping with physically and emotionally challenging caregiving roles (e.g., caring for a spouse/partner with a serious illness or with cognitive or physical decline). Additionally, many caregivers experience elements of grief, as they mourn the loss of relationship with or the decline of valued attributes of their care recipients. […] Concepts of social isolation have also been examined with regard to late-life depression risk. For example, among 892 participants aged 65 years […], Gureje et al. [13] found that women with a poor social network and rural residential status were more likely to develop major depressive disorder […] Harlow and colleagues [21] assessed the association between social network and depressive symptoms in a study involving both married and recently widowed women between the ages of 65 and 75 years; they found that number of friends at baseline had an inverse association with CES-D (Center for Epidemiologic Studies Depression Scale) score after 1 month (p< 0.05) and 12 months (p= 0.06) of follow-up. In a study that explicitly addressed the concept of loneliness, Jaremka et al. [22] conducted a study relating this factor to late-life depression; importantly, loneliness has been validated as a distinct construct, distinguishable among older adults from depression. Among 229 participants (mean age = 70 years) in a cohort of older adults caring for a spouse with dementia, loneliness (as measured by the NYU scale) significantly predicted incident depression (p<0.001). Finally, social support has been identified as important to late-life depression risk. For example, Cui and colleagues [23] found that low perceived social support significantly predicted worsening depression status over a 2-year period among 392 primary care patients aged 65 years and above.”

“Saunders and colleagues [26] reported […] findings with alcohol drinking behavior as the predictor. Among 701 community-dwelling adults aged 65 years and above, the authors found a significant association between prior heavy alcohol consumption and late-life depression among men: compared to those who were not heavy drinkers, men with a history of heavy drinking had a nearly fourfold higher odds of being diagnosed with depression (OR = 3.7, 95 % CI = 1.3–10.4, p< 0.05). […] Almeida et al. found that obese men were more likely than non-obese (body mass index [BMI] < 30) men to develop depression (HR = 1.31, 95 % CI = 1.05–1.64). Consistent with these results, presence of the metabolic syndrome was also found to increase risk of incident depression (HR = 2.37, 95 % CI = 1.60–3.51). Finally, leisure-time activities are also important to study with regard to late-life depression risk, as these too are readily modifiable behaviors. For example, Magnil et al. [30] examined such activities among a sample of 302 primary care patients aged 60 years. The authors observed that those who lacked leisure activities had an increased risk of developing depressive symptoms over the 2-year study period (OR = 12, 95 % CI = 1.1–136, p= 0.041). […] an important future direction in addressing social and behavioral risk factors in late-life depression is to make more progress in trials that aim to alter those risk factors that are actually modifiable.”

February 17, 2018 Posted by | Books, Epidemiology, Health Economics, Medicine, Psychiatry, Psychology, Statistics | Leave a comment

Random stuff

I have almost stopped posting posts like these, which has resulted in the accumulation of a very large number of links and studies which I figured I might like to blog at some point. This post is mainly an attempt to deal with the backlog – I won’t cover the material in too much detail.

i. Do Bullies Have More Sex? The answer seems to be a qualified yes. A few quotes:

“Sexual behavior during adolescence is fairly widespread in Western cultures (Zimmer-Gembeck and Helfland 2008) with nearly two thirds of youth having had sexual intercourse by the age of 19 (Finer and Philbin 2013). […] Bullying behavior may aid in intrasexual competition and intersexual selection as a strategy when competing for mates. In line with this contention, bullying has been linked to having a higher number of dating and sexual partners (Dane et al. 2017; Volk et al. 2015). This may be one reason why adolescence coincides with a peak in antisocial or aggressive behaviors, such as bullying (Volk et al. 2006). However, not all adolescents benefit from bullying. Instead, bullying may only benefit adolescents with certain personality traits who are willing and able to leverage bullying as a strategy for engaging in sexual behavior with opposite-sex peers. Therefore, we used two independent cross-sectional samples of older and younger adolescents to determine which personality traits, if any, are associated with leveraging bullying into opportunities for sexual behavior.”

“…bullying by males signals the ability to provide good genes, material resources, and protect offspring (Buss and Shackelford 1997; Volk et al. 2012) because bullying others is a way of displaying attractive qualities such as strength and dominance (Gallup et al. 2007; Reijntjes et al. 2013). As a result, this makes bullies attractive sexual partners to opposite-sex peers while simultaneously suppressing the sexual success of same-sex rivals (Gallup et al. 2011; Koh and Wong 2015; Zimmer-Gembeck et al. 2001). Females may denigrate other females, targeting their appearance and sexual promiscuity (Leenaars et al. 2008; Vaillancourt 2013), which are two qualities relating to male mate preferences. Consequently, derogating these qualities lowers a rival’s appeal as a mate and also intimidates or coerces rivals into withdrawing from intrasexual competition (Campbell 2013; Dane et al. 2017; Fisher and Cox 2009; Vaillancourt 2013). Thus, males may use direct forms of bullying (e.g., physical, verbal) to facilitate intersexual selection (i.e., appear attractive to females), while females may use relational bullying to facilitate intrasexual competition, by making rivals appear less attractive to males.”

The study relies on the use of self-report data, which I find very problematic – so I won’t go into the results here. I’m not quite clear on how those studies mentioned in the discussion ‘have found self-report data [to be] valid under conditions of confidentiality’ – and I remain skeptical. You’ll usually want data from independent observers (e.g. teacher or peer observations) when analyzing these kinds of things. Note in the context of the self-report data problem that if there’s a strong stigma associated with being bullied (there often is, or bullying wouldn’t work as well), asking people if they have been bullied is not much better than asking people if they’re bullying others.

ii. Some topical advice that some people might soon regret not having followed, from the wonderful Things I Learn From My Patients thread:

“If you are a teenage boy experimenting with fireworks, do not empty the gunpowder from a dozen fireworks and try to mix it in your mother’s blender. But if you do decide to do that, don’t hold the lid down with your other hand and stand right over it. This will result in the traumatic amputation of several fingers, burned and skinned forearms, glass shrapnel in your face, and a couple of badly scratched corneas as a start. You will spend months in rehab and never be able to use your left hand again.”

iii. I haven’t talked about the AlphaZero-Stockfish match, but I was of course aware of it and did read a bit about that stuff. Here’s a reddit thread where one of the Stockfish programmers answers questions about the match. A few quotes:

“Which of the two is stronger under ideal conditions is, to me, neither particularly interesting (they are so different that it’s kind of like comparing the maximum speeds of a fish and a bird) nor particularly important (since there is only one of them that you and I can download and run anyway). What is super interesting is that we have two such radically different ways to create a computer chess playing entity with superhuman abilities. […] I don’t think there is anything to learn from AlphaZero that is applicable to Stockfish. They are just too different, you can’t transfer ideas from one to the other.”

“Based on the 100 games played, AlphaZero seems to be about 100 Elo points stronger under the conditions they used. The current development version of Stockfish is something like 40 Elo points stronger than the version used in Google’s experiment. There is a version of Stockfish translated to hand-written x86-64 assembly language that’s about 15 Elo points stronger still. This adds up to roughly half the Elo difference between AlphaZero and Stockfish shown in Google’s experiment.”

“It seems that Stockfish was playing with only 1 GB for transposition tables (the area of memory used to store data about the positions previously encountered in the search), which is way too little when running with 64 threads.” [I seem to recall a comp sci guy observing elsewhere that this was less than what was available to his smartphone version of Stockfish, but I didn’t bookmark that comment].

“The time control was a very artificial fixed 1 minute/move. That’s not how chess is traditionally played. Quite a lot of effort has gone into Stockfish’s time management. It’s pretty good at deciding when to move quickly, and when to spend a lot of time on a critical decision. In a fixed time per move game, it will often happen that the engine discovers that there is a problem with the move it wants to play just before the time is out. In a regular time control, it would then spend extra time analysing all alternative moves and trying to find a better one. When you force it to move after exactly one minute, it will play the move it already knows is bad. There is no doubt that this will cause it to lose many games it would otherwise have drawn.”

iv. Thrombolytics for Acute Ischemic Stroke – no benefit found.

“Thrombolysis has been rigorously studied in >60,000 patients for acute thrombotic myocardial infarction, and is proven to reduce mortality. It is theorized that thrombolysis may similarly benefit ischemic stroke patients, though a much smaller number (8120) has been studied in relevant, large scale, high quality trials thus far. […] There are 12 such trials 1-12. Despite the temptation to pool these data the studies are clinically heterogeneous. […] Data from multiple trials must be clinically and statistically homogenous to be validly pooled.14 Large thrombolytic studies demonstrate wide variations in anatomic stroke regions, small- versus large-vessel occlusion, clinical severity, age, vital sign parameters, stroke scale scores, and times of administration. […] Examining each study individually is therefore, in our opinion, both more valid and more instructive. […] Two of twelve studies suggest a benefit […] In comparison, twice as many studies showed harm and these were stopped early. This early stoppage means that the number of subjects in studies demonstrating harm would have included over 2400 subjects based on originally intended enrollments. Pooled analyses are therefore missing these phantom data, which would have further eroded any aggregate benefits. In their absence, any pooled analysis is biased toward benefit. Despite this, there remain five times as many trials showing harm or no benefit (n=10) as those concluding benefit (n=2), and 6675 subjects in trials demonstrating no benefit compared to 1445 subjects in trials concluding benefit.”

“Thrombolytics for ischemic stroke may be harmful or beneficial. The answer remains elusive. We struggled therefore, debating between a ‘yellow’ or ‘red’ light for our recommendation. However, over 60,000 subjects in trials of thrombolytics for coronary thrombosis suggest a consistent beneficial effect across groups and subgroups, with no studies suggesting harm. This consistency was found despite a very small mortality benefit (2.5%), and a very narrow therapeutic window (1% major bleeding). In comparison, the variation in trial results of thrombolytics for stroke and the daunting but consistent adverse effect rate caused by ICH suggested to us that thrombolytics are dangerous unless further study exonerates their use.”

“There is a Cochrane review that pooled estimates of effect. 17 We do not endorse this choice because of clinical heterogeneity. However, we present the NNT’s from the pooled analysis for the reader’s benefit. The Cochrane review suggested a 6% reduction in disability […] with thrombolytics. This would mean that 17 were treated for every 1 avoiding an unfavorable outcome. The review also noted a 1% increase in mortality (1 in 100 patients die because of thrombolytics) and a 5% increase in nonfatal intracranial hemorrhage (1 in 20), for a total of 6% harmed (1 in 17 suffers death or brain hemorrhage).”

v. Suicide attempts in Asperger Syndrome. An interesting finding: “Over 35% of individuals with AS reported that they had attempted suicide in the past.”

Related: Suicidal ideation and suicide plans or attempts in adults with Asperger’s syndrome attending a specialist diagnostic clinic: a clinical cohort study.

“374 adults (256 men and 118 women) were diagnosed with Asperger’s syndrome in the study period. 243 (66%) of 367 respondents self-reported suicidal ideation, 127 (35%) of 365 respondents self-reported plans or attempts at suicide, and 116 (31%) of 368 respondents self-reported depression. Adults with Asperger’s syndrome were significantly more likely to report lifetime experience of suicidal ideation than were individuals from a general UK population sample (odds ratio 9·6 [95% CI 7·6–11·9], p<0·0001), people with one, two, or more medical illnesses (p<0·0001), or people with psychotic illness (p=0·019). […] Lifetime experience of depression (p=0·787), suicidal ideation (p=0·164), and suicide plans or attempts (p=0·06) did not differ significantly between men and women […] Individuals who reported suicide plans or attempts had significantly higher Autism Spectrum Quotient scores than those who did not […] Empathy Quotient scores and ages did not differ between individuals who did or did not report suicide plans or attempts (table 4). Patients with self-reported depression or suicidal ideation did not have significantly higher Autism Spectrum Quotient scores, Empathy Quotient scores, or age than did those without depression or suicidal ideation”.

The fact that people with Asperger’s are more likely to be depressed and to contemplate suicide is consistent with previous observations that they’re also more likely to die from suicide – for example, a paper I blogged a while back found that in that particular (large, Swedish population-based cohort) study, people with ASD were more than 7 times as likely to die from suicide as the comparable controls.

Also related: Suicidal tendencies hard to spot in some people with autism.

This link has some great graphs and tables of suicide data from the US.

Also autism-related: Increased perception of loudness in autism. This is one of the ‘important ones’ for me personally – I am much more sound-sensitive than are most people.

vi. Early versus Delayed Invasive Intervention in Acute Coronary Syndromes.

“Earlier trials have shown that a routine invasive strategy improves outcomes in patients with acute coronary syndromes without ST-segment elevation. However, the optimal timing of such intervention remains uncertain. […] We randomly assigned 3031 patients with acute coronary syndromes to undergo either routine early intervention (coronary angiography ≤24 hours after randomization) or delayed intervention (coronary angiography ≥36 hours after randomization). The primary outcome was a composite of death, myocardial infarction, or stroke at 6 months. A prespecified secondary outcome was death, myocardial infarction, or refractory ischemia at 6 months. […] Early intervention did not differ greatly from delayed intervention in preventing the primary outcome, but it did reduce the rate of the composite secondary outcome of death, myocardial infarction, or refractory ischemia and was superior to delayed intervention in high-risk patients.”

vii. Some wikipedia links:

Behrens–Fisher problem.
Sailing ship tactics (I figured I had to read up on this if I were to get anything out of the Aubrey-Maturin books).
Anatomical terms of muscle.
Phatic expression (“a phatic expression […] is communication which serves a social function such as small talk and social pleasantries that don’t seek or offer any information of value.”)
Three-domain system.
Beringian wolf (featured).
Subdural hygroma.
Cayley graph.
Schur polynomial.
Solar neutrino problem.
Hadamard product (matrices).
True polar wander.
Newton’s cradle.

viii. Determinant versus permanent (mathematics – technical).

ix. Some years ago I wrote a few English-language posts about various statistical/demographic properties of immigrants living in Denmark, based on numbers included in a publication by Statistics Denmark which is only published in Danish; I did it by translating the observations included in that publication. I briefly considered doing the same thing again when the 2017 data arrived, but decided against it, as I recalled that writing those posts took a lot of time and it didn’t seem worth the effort. Danish readers might nevertheless be interested in having a look at the data, if they haven’t already – here’s a link to the publication Indvandrere i Danmark 2017.

x. A banter blitz session with grandmaster Peter Svidler, who recently became the first Russian ever to win the Russian Chess Championship 8 times. He’s currently shared-second in the World Rapid Championship after 10 rounds and is now in the top 10 on the live rating list in both classical and rapid – seems like he’s had a very decent year.

xi. I recently discovered Dr. Whitecoat’s blog. The patient encounters are often interesting.

December 28, 2017 Posted by | Astronomy, autism, Biology, Cardiology, Chess, Computer science, History, Mathematics, Medicine, Neurology, Physics, Psychiatry, Psychology, Random stuff, Statistics, Studies, Wikipedia, Zoology | Leave a comment

Occupational Epidemiology (III)

This will be my last post about the book.

Some observations from the final chapters:

“Often there is confusion about the difference between systematic reviews and metaanalyses. A meta-analysis is a quantitative synthesis of two or more studies […] A systematic review is a synthesis of evidence on the effects of an intervention or an exposure which may also include a meta-analysis, but this is not a prerequisite. It may be that the results of the studies which have been included in a systematic review are reported in such a way that it is impossible to synthesize them quantitatively. They can then be reported in a narrative manner.10 However, a meta-analysis always requires a systematic review of the literature. […] There is a long history of debate about the value of meta-analysis for occupational cohort studies or other occupational aetiological studies. In 1994, Shapiro argued that ‘meta-analysis of published non-experimental data should be abandoned’. He reasoned that ‘relative risks of low magnitude (say, less than 2) are virtually beyond the resolving power of the epidemiological microscope because we can seldom demonstrably eliminate all sources of bias’.13 Because the pooling of studies in a meta-analysis increases statistical power, the pooled estimate may easily become significant and thus incorrectly taken as an indication of causality, even though the biases in the included studies may not have been taken into account. Others have argued that the method of meta-analysis is important but should be applied appropriately, taking into account the biases in individual studies.14 […] We believe that the synthesis of aetiological studies should be based on the same general principles as for intervention studies, and the existing methods adapted to the particular challenges of cohort and case-control studies. […] Since 2004, there is a special entity, the Cochrane Occupational Safety and Health Review Group, that is responsible for the preparing and updating of reviews of occupational safety and health interventions […]. There were over 100 systematic reviews on these topics in the Cochrane Library in 2012.”
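
For readers who haven’t seen what the ‘quantitative synthesis’ step of a meta-analysis actually involves, below is a small sketch of my own of fixed-effect inverse-variance pooling of relative risks; the three studies and their confidence intervals are invented. It also illustrates Shapiro’s point: individually unimpressive estimates easily become ‘significant’ once pooled, whether or not the biases in the underlying studies have been dealt with.

```python
# A minimal sketch (not from the book): fixed-effect inverse-variance pooling
# of log relative risks. The (RR, lower CI, upper CI) triples are invented.
import math

studies = [(1.3, 1.0, 1.7), (1.6, 1.1, 2.3), (1.1, 0.8, 1.5)]

weights, weighted_logs = [], []
for rr, lo, hi in studies:
    log_rr = math.log(rr)
    se = (math.log(hi) - math.log(lo)) / (2 * 1.96)   # SE recovered from the 95% CI
    w = 1 / se ** 2                                   # fixed-effect (inverse-variance) weight
    weights.append(w)
    weighted_logs.append(w * log_rr)

pooled_log = sum(weighted_logs) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))
print("pooled RR:", round(math.exp(pooled_log), 2),
      "95% CI:", round(math.exp(pooled_log - 1.96 * pooled_se), 2),
      "to", round(math.exp(pooled_log + 1.96 * pooled_se), 2))
# Pooling increases precision, which is exactly why a 'significant' pooled
# estimate is not in itself evidence that the included studies were unbiased.
```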

“The believability of a systematic review’s results depends largely on the quality of the included studies. Therefore, assessing and reporting on the quality of the included studies is important. For intervention studies, randomized trials are regarded as of higher quality than observational studies, and the conduct of the study (e.g. in terms of response rate or completeness of follow-up) also influences quality. A conclusion derived from a few high-quality studies will be more reliable than when the conclusion is based on even a large number of low-quality studies. Some form of quality assessment is nowadays commonplace in intervention reviews but is still often missing in reviews of aetiological studies. […] It is tempting to use quality scores, such as the Jadad scale for RCTs34 and the Downs and Black scale for non-RCT intervention studies35 but these, in their original format, are insensitive to variation in the importance of risk areas for a given research question. The score system may give the same value to two studies (say, 10 out of 12) when one, for example, lacked blinding and the other did not randomize, thus implying that their quality is equal. This would not be a problem if randomization and blinding were equally important for all questions in all reviews, but this is not the case. For RCTs an important development in this regard has been the Cochrane risk of bias tool.36 This is a checklist of six important domains that have been shown to be important areas of bias in RCTs: random sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessment, incomplete outcome data, and selective reporting.”

“[R]isks of bias tools developed for intervention studies cannot be used for reviews of aetiological studies without relevant modification. This is because, unlike interventions, exposures are usually more complicated to assess when we want to attribute the outcome to them alone. These scales do not cover all items that may need assessment in an aetiological study, such as confounding and information bias relating to exposures. […] Surprisingly little methodological work has been done to develop validated tools for aetiological epidemiology and most tools in use are not validated,38 […] Two separate checklists, for observational studies of incidence and prevalence and for risk factor assessment, have been developed and validated recently.40 […] Publication and other reporting bias is probably a much bigger issue for aetiological studies than for intervention studies. This is because, for clinical trials, the introduction of protocol registration, coupled with the regulatory system for new medications, has helped in assessing and preventing publication and reporting bias. No such checks exist for observational studies.”

“Most ill health that arises from occupational exposures can also arise from nonoccupational exposures, and the same type of exposure can occur in occupational and non-occupational settings. With the exception of malignant mesothelioma (which is essentially only caused by exposure to asbestos), there is no way to determine which exposure caused a particular disorder, nor where the causative exposure occurred. This means that usually it is not possible to determine the burden just by counting the number of cases. Instead, approaches to estimating this burden have been developed. There are also several ways to define burden and how best to measure it.”

“The population attributable fraction (PAF) is the proportion of cases that would not have occurred in the absence of an occupational exposure. It can be estimated by combining two measures — a risk estimate (usually relative risk (RR) or odds ratio) of the disorder of interest that is associated with exposure to the substance of concern; and an estimate of the proportion of the population exposed to the substance at work (p(E)). This approach has been used in several studies, particularly for estimating cancer burden […] There are several possible equations that can be used to calculate the PAF, depending on the available data […] PAFs cannot in general be combined by summing directly because: (1) summing PAFs for overlapping exposures (i.e. agents to which the same ‘ever exposed’ workers may have been exposed) may give an overall PAF exceeding 100%, and (2) summing disjoint (not concurrently occurring) exposures also introduces upward bias. Strategies to avoid this include partitioning exposed numbers between overlapping exposures […] or estimating only for the ‘dominant’ carcinogen with the highest risk. Where multiple exposures remain, one approach is to assume that the exposures are independent and their joint effects are multiplicative. The PAFs can then be combined to give an overall PAF for that cancer using a product sum. […] Potential sources of bias for PAFs include inappropriate choice of risk estimates, imprecision in the risk estimates and estimates of proportions exposed, inaccurate risk exposure period and latency assumptions, and a lack of separate risk estimates in some cases for women and/or cancer incidence. In addition, a key decision is the choice of which diseases and exposures are to be included.”
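
The product-sum combination of PAFs mentioned in the quote looks like this in practice; a minimal sketch of my own, with invented per-exposure PAF values.

```python
# A minimal sketch (not from the book): combining PAFs for multiple exposures
# under the independence/multiplicative assumption, rather than summing them.

def combined_paf(pafs):
    """Overall PAF for independent exposures: 1 - product of (1 - PAF_i)."""
    remaining = 1.0
    for p in pafs:
        remaining *= (1 - p)
    return 1 - remaining

pafs = [0.21, 0.07, 0.05, 0.03]    # invented per-exposure PAFs for one cancer site
print("naive sum  :", round(sum(pafs), 3))            # 0.36 - overstates the burden
print("product-sum:", round(combined_paf(pafs), 3))   # ~0.32
# Direct summation can exceed 100% when many exposures are involved;
# the product-sum cannot.
```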

“The British Cancer Burden study is perhaps the most detailed study of occupationally related cancers in that it includes all those relevant carcinogens classified at the end of 2008 […] In the British study the attributable fractions ranged from less than 0.01% to 95% overall, the most important cancer sites for occupational attribution being, for men, mesothelioma (97%), sinonasal (46%), lung (21.1%), bladder (7.1%), and non-melanoma skin cancer (7.1%) and, for women, mesothelioma (83%), sinonasal (20.1%), lung (5.3%), breast (4.6%), and nasopharynx (2.5%). Occupation also contributed 2% or more overall to cancers of the larynx, oesophagus, and stomach, and soft tissue sarcoma with, in addition for men, melanoma of the eye (due to welding), and non-Hodgkin lymphoma. […] The overall results from the occupational risk factors component of the Global Burden of Disease 2010 study illustrate several important aspects of burden studies.14 Of the estimated 850 000 occupationally related deaths worldwide, the top three causes were: (1) injuries (just over a half of all deaths); (2) particulate matter, gases, and fumes leading to COPD; and (3) carcinogens. When DALYs were used as the burden measure, injuries still accounted for the highest proportion (just over one-third), but ergonomic factors leading to low back pain resulted in almost as many DALYs, and both were almost an order of magnitude higher than the DALYs from carcinogens. The difference in relative contributions of the various risk factors between deaths and DALYs arises because of the varying ages of those affected, and the differing chronicity of the resulting conditions. Both measures are valid, but they represent a different aspect of the burden arising from the hazardous exposures […]. Both the British and Global Burden of Disease studies draw attention to the important issues of: (1) multiple occupational carcinogens causing specific types of cancer, for example, the British study evaluated 21 lung carcinogens; and (2) specific carcinogens causing several different cancers, for example, IARC now defines asbestos as a group 1 or 2A carcinogen for seven cancer sites. These issues require careful consideration for burden estimation and for prioritizing risk reduction strategies. […] The long latency of many cancers means that estimates of current burden are based on exposures occurring in the past, often much higher than those existing today. […] long latency [also] means that risk reduction measures taken now will take a considerable time to be reflected in reduced disease incidence.”

“Exposures and effects are linked by dynamic processes occurring across time. These processes can often be usefully decomposed into two distinct biological relationships, each with several components: 1. The exposure-dose relationship […] 2. The dose-effect relationship […] These two component relationships are sometimes represented by two different mathematical models: a toxicokinetic model […], and a disease process model […]. Depending on the information available, these models may be relatively simple or highly complex. […] Often the various steps in the disease process do not occur at the same rate, some of these processes are ‘fast’, such as cell killing, while others are ‘slow’, such as damage repair. Frequently a few slow steps in a process become limiting to the overall rate, which sets the temporal pattern for the entire exposure-response relationship. […] It is not necessary to know the full mechanism of effects to guide selection of an exposure-response model or exposure metric. Because of the strong influence of the rate-limiting steps, often it is only necessary to have observations on the approximate time course of effects. This is true whether the effects appear to be reversible or irreversible, and whether damage progresses proportionately with each unit of exposure (actually dose) or instead occurs suddenly, and seemingly without regard to the amount of exposure, such as an asthma attack.”

“In this chapter, we argue that formal disease process models have the potential to improve the sensitivity of epidemiology for detecting new and emerging occupational and environmental risks where there is limited mechanistic information. […] In our approach, these models are often used to create exposure or dose metrics, which are in turn used in epidemiological models to estimate exposure-disease associations. […] Our goal is a methodology to formulate strong tests of our exposure-disease hypotheses in which a hypothesis is developed in as much biological detail as it can be, expressed in a suitable dynamic (temporal) model, and tested by its fit with a rich data set, so that its flaws and misperceptions of reality are fully displayed. Rejecting such a fully developed biological hypothesis is more informative than either rejecting or failing to reject a generic or vaguely defined hypothesis.” For example, the hypothesis ‘truck drivers have more risk of lung cancer than non-drivers’13 is of limited usefulness for prevention […]. Hypothesizing that a particular chemical agent in truck exhaust is associated with lung cancer — whether the hypothesis is refuted or supported by data — is more likely to lead to successful prevention activities. […] we believe that the choice of models against which to compare the data should, so far as possible, be guided by explicit hypotheses about the underlying biological processes. In other words, you can get as much as possible from epidemiology by starting from well-thought-out hypotheses that are formalized as mathematical models into which the data will be placed. The disease process models can serve this purpose.2″

“The basic idea of empirical Bayes (EB) and semi-Bayes (SB) adjustments for multiple associations is that the observed variation of the estimated relative risks around their geometric mean is larger than the variation of the true (but unknown) relative risks. In SB adjustments, an a priori value for the extra variation is chosen which assigns a reasonable range of variation to the true relative risks and this value is then used to adjust the observed relative risks.7 The adjustment consists in shrinking outlying relative risks towards the overall mean (of the relative risks for all the different exposures being considered). The larger the individual variance of the relative risks, the stronger the shrinkage, so that the shrinkage is stronger for less reliable estimates based on small numbers. Typical applications in which SB adjustments are a useful alternative to traditional methods of adjustment for multiple comparisons are in large occupational surveillance studies, where many relative risks are estimated with few or no a priori beliefs about which associations might be causal.7”
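
A minimal sketch of my own of the shrinkage mechanics described here; the occupations, ORs, standard errors, and the a priori variance are all made-up values.

```python
# A minimal sketch (not from the book) of semi-Bayes shrinkage: each estimated
# log OR is pulled towards the overall mean, and the less reliable the estimate
# (the larger its variance), the harder it is pulled. All numbers are invented.
import math

# (occupation, estimated OR, SE of the log OR)
estimates = [("A", 2.5, 0.45), ("B", 1.5, 0.15), ("C", 0.7, 0.35), ("D", 1.2, 0.10)]

tau2 = 0.25   # a priori variance assigned to the true log ORs (the semi-Bayes choice)
log_ors = [math.log(or_) for _, or_, _ in estimates]
overall_mean = sum(log_ors) / len(log_ors)   # log of the geometric mean OR

for (name, or_, se), log_or in zip(estimates, log_ors):
    shrinkage = se ** 2 / (se ** 2 + tau2)            # weight given to the overall mean
    adjusted = (1 - shrinkage) * log_or + shrinkage * overall_mean
    print(f"{name}: OR {or_:.2f} -> adjusted OR {math.exp(adjusted):.2f} "
          f"(shrinkage {shrinkage:.0%})")
# Imprecise, outlying estimates (like A) move a lot; precise ones (like D) barely move.
```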

“The advantage of [the SB adjustment] approach over classical Bonferroni corrections is that on the average it produces more valid estimates of the odds ratio for each occupation/exposure. If we do a study which involves assessing hundreds of occupations, the problem is not only that we get many ‘false positive’ results by chance. A second problem is that even the ‘true positives’ tend to have odds ratios that are too high. For example, if we have a group of occupations with true odds ratios around 1.5, then the ones that stand out in the analysis are those with the highest odds ratios (e.g. 2.5) which will be elevated partly because of real effects and partly by chance. The Bonferroni correction addresses the first problem (too many chance findings) but not the second, that the strongest odds ratios are probably too high. In contrast, SB adjustment addresses the second problem by correcting for the anticipated regression to the mean that would have occurred if the study had been repeated, and thereby on the average produces more valid odds ratio estimates for each occupation/exposure. […] most epidemiologists write their Methods and Results sections as frequentists and their Introduction and Discussion sections as Bayesians. In their Methods and Results sections, they ‘test’ their findings as if their data are the only data that exist. In the Introduction and Discussion, they discuss their findings with regard to their consistency with previous studies, as well as other issues such as biological plausibility. This creates tensions when a small study has findings which are not statistically significant but which are consistent with prior knowledge, or when a study finds statistically significant findings which are inconsistent with prior knowledge. […] In some (but not all) instances, things can be made clearer if we include Bayesian methods formally in the Methods and Results sections of our papers”.

“In epidemiology, risk is most often quantified in terms of relative risk — i.e. the ratio of the probability of an adverse outcome in someone with a specified exposure to that in someone who is unexposed, or exposed at a different specified level. […] Relative risks can be estimated from a wider range of study designs than individual attributable risks. They have the advantage that they are often stable across different groups of people (e.g. of different ages, smokers, and non-smokers) which makes them easier to estimate and quantify. Moreover, high relative risks are generally unlikely to be explained by unrecognized bias or confounding. […] However, individual attributable risks are a more relevant measure by which to quantify the impact of decisions in risk management on individuals. […] Individual attributable risk is the difference in the probability of an adverse outcome between someone with a specified exposure and someone who is unexposed, or exposed at a different specified level. It is the critical measure when considering the impact of decisions in risk management on individuals. […] Population attributable risk is the difference in the frequency of an adverse outcome between a population with a given distribution of exposures to a hazardous agent, and that in a population with no exposure, or some other specified distribution of exposures. It depends on the prevalence of exposure at different levels within the population, and on the individual attributable risk for each level of exposure. It is a measure of the impact of the agent at a population level, and is relevant to decisions in risk management for populations. […] Population attributable risks are highest when a high proportion of a population is exposed at levels which carry high individual attributable risks. On the other hand, an exposure which carries a high individual attributable risk may produce only a small population attributable risk if the prevalence of such exposure is low.”
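
A small numerical sketch of my own contrasting the three measures; all risks and prevalences below are invented.

```python
# A minimal sketch with made-up numbers (not from the book): relative risk,
# individual attributable risk, and population attributable risk.

def individual_attributable_risk(risk_exposed, risk_unexposed):
    return risk_exposed - risk_unexposed

def population_attributable_risk(prevalence_exposed, risk_exposed, risk_unexposed):
    # outcome frequency in the actual population minus that in a
    # counterfactual population in which nobody is exposed
    population_risk = (prevalence_exposed * risk_exposed
                       + (1 - prevalence_exposed) * risk_unexposed)
    return population_risk - risk_unexposed

# A strong but rare exposure vs. a weaker but very common one (invented values):
scenarios = {
    "rare, strong exposure ": dict(prevalence_exposed=0.01, risk_exposed=0.050, risk_unexposed=0.010),
    "common, weak exposure ": dict(prevalence_exposed=0.40, risk_exposed=0.015, risk_unexposed=0.010),
}
for label, d in scenarios.items():
    rr = d["risk_exposed"] / d["risk_unexposed"]
    iar = individual_attributable_risk(d["risk_exposed"], d["risk_unexposed"])
    par = population_attributable_risk(**d)
    print(f"{label}: RR = {rr:.1f}, individual AR = {iar:.3f}, population AR = {par:.4f}")
# The rare exposure carries the larger individual attributable risk, but the
# common one accounts for more cases at the population level.
```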

“Hazard characterization entails quantification of risks in relation to routes, levels, and durations of exposure. […] The findings from individual studies are often used to determine a no observed adverse effect level (NOAEL), lowest observed effect level (LOEL), or benchmark dose lower 95% confidence limit (BMDL) for relevant effects […] [NOAEL] is the highest dose or exposure concentration at which there is no discernible adverse effect. […] [LOEL] is the lowest dose or exposure concentration at which a discernible effect is observed. If comparison with unexposed controls indicates adverse effects at all of the dose levels in an experiment, a NOAEL cannot be derived, but the lowest dose constitutes a LOEL, which might be used as a comparator for estimated exposures or to derive a toxicological reference value […] A BMDL is defined in relation to a specified adverse outcome that is observed in a study. Usually, this is the outcome which occurs at the lowest levels of exposure and which is considered critical to the assessment of risk. Statistical modelling is applied to the experimental data to estimate the dose or exposure concentration which produces a specified small level of effect […]. The BMDL is the lower 95% confidence limit for this estimate. As such, it depends both on the toxicity of the test chemical […], and also on the sample sizes used in the study (other things being equal, larger sample sizes will produce more precise estimates, and therefore higher BMDLs). In addition to accounting for sample size, BMDLs have the merit that they exploit all of the data points in a study, and do not depend so critically on the spacing of doses that is adopted in the experimental design (by definition a NOAEL or LOEL can only be at one of the limited number of dose levels used in the experiment). On the other hand, BMDLs can only be calculated where an adverse effect is observed. Even if there are no clear adverse effects at any dose level, a NOAEL can be derived (it will be the highest dose administered).”
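
A minimal sketch of my own of how a benchmark dose is obtained from a dose-response model for a specified benchmark response (here 10% extra risk). The logistic model and its parameters are invented; in an actual assessment the curve would be fitted to the experimental data, and the BMDL would be the lower 95% confidence limit of the estimated BMD.

```python
# A minimal sketch (not from the book): the benchmark dose (BMD) for a 10% extra
# risk under an assumed logistic dose-response curve. Parameters are invented;
# the BMDL would come from the lower 95% confidence limit of the fitted BMD.
import math

def response(dose, intercept=-4.0, slope=0.8):
    """Assumed probability of the adverse effect at a given dose (logistic model)."""
    return 1 / (1 + math.exp(-(intercept + slope * dose)))

def extra_risk(dose):
    p0 = response(0.0)
    return (response(dose) - p0) / (1 - p0)

def benchmark_dose(bmr=0.10, lo=0.0, hi=50.0, tol=1e-6):
    """Dose at which the extra risk equals the benchmark response (bisection)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if extra_risk(mid) < bmr:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print("background response  :", round(response(0.0), 3))        # ~0.018
print("BMD (10% extra risk) :", round(benchmark_dose(0.10), 2))  # ~2.46 dose units
```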

December 8, 2017 Posted by | Books, Cancer/oncology, Epidemiology, Medicine, Statistics | Leave a comment

Occupational Epidemiology (II)

Some more observations from the book below.

“RD [Retinal detachment] is the separation of the neurosensory retina from the underlying retinal pigment epithelium.1 RD is often preceded by posterior vitreous detachment — the separation of the posterior vitreous from the retina as a result of vitreous degeneration and shrinkage2 — which gives rise to the sudden appearance of floaters and flashes. Late symptoms of RD may include visual field defects (shadows, curtains) or even blindness. The success rate of RD surgery has been reported to be over 90%;3 however, a loss of visual acuity is frequently reported by patients, particularly if the macula is involved.4 Since the natural history of RD can be influenced by early diagnosis, patients experiencing symptoms of posterior vitreous detachment are advised to undergo an ophthalmic examination.5 […] Studies of the incidence of RD give estimates ranging from 6.3 to 17.9 cases per 100 000 person-years.6 […] Age is a well-known risk factor for RD. In most studies the peak incidence was recorded among subjects in their seventh decade of life. A secondary peak at a younger age (20–30 years) has been identified […] attributed to RD among highly myopic patients.6 Indeed, depending on the severity, myopia is associated with a four- to ten-fold increase in risk of RD.7 [Diabetics with retinopathy are also at increased risk of RD, US] […] While secondary prevention of RD is current practice, no effective primary prevention strategy is available at present. The idea is widespread among practitioners that RD is not preventable, probably the consequence of our historically poor understanding of the aetiology of RD. For instance, on the website of the Mayo Clinic — one of the top-ranked hospitals for ophthalmology in the US — it is possible to read that ‘There’s no way to prevent retinal detachment’.9”

“Intraocular pressure […] is influenced by physical activity. Dynamic exercise causes an acute reduction in intraocular pressure, whereas physical fitness is associated with a lower baseline value.29 Conversely, a sudden rise in intraocular pressure has been reported during the Valsalva manoeuvre.30-32 […] Occupational physical activity may […] cause both short- and long-term variations in intraocular pressure. On the one hand, physically demanding jobs may contribute to decreased baseline levels by increasing physical fitness but, on the other hand, lifting tasks may cause an important acute increase in pressure. Moreover, the eye of a manual worker who performs repeated lifting tasks involving the Valsalva manoeuvre may undergo several dramatic changes in intraocular pressure within a single working shift. […] A case-control study was carried out to test the hypothesis that repeated lifting tasks involving the Valsalva manoeuvre could be a risk factor for RD. […] heavy lifting was a strong risk factor for RD (OR 4.4, 95% CI 1.6–13). Intriguingly, body mass index (BMI) also showed a clear association with RD (top quartile: OR 6.8, 95% CI 1.6–29). […] Based on their findings, the authors concluded that heavy occupational lifting (involving the Valsalva manoeuvre) may be a relevant risk factor for RD in myopics.

“The proportion of the world’s population over 60 is forecast to double from 11.6% in 2012 to 21.8% in 2050.1 […] the International Labour Organization notes that, worldwide, just 40% of the working age population has legal pension coverage, and only 26% of the working population is effectively covered by old-age pension schemes. […] in less developed regions, labour force participation in those over 65 is much higher than in more developed regions.8 […] Longer working lives increase cumulative exposures, as well as increasing the time since exposure — important when there is a long latency period between exposure and resultant disease. Further, some exposures may have a greater effect when they occur to older workers, e.g. carcinogens that are promoters rather than initiators. […] Older workers tend to have more chronic health conditions. […] Older workers have fewer injuries, but take longer to recover. […] For some ‘knowledge workers’, like physicians, even a relatively minor cognitive decline […] might compromise their competence. […]  Most past studies have treated age as merely a confounding variable and rarely, if ever, have considered it an effect modifier. […]  Jex and colleagues24 argue that conceptually we should treat age as the variable of interest so that other variables are viewed as moderating the impact of age. […] The single best improvement to epidemiological research on ageing workers is to conduct longitudinal studies, including follow-up of workers into retirement. Cross-sectional designs almost certainly incur the healthy survivor effect, since unhealthy workers may retire early.25 […] Analyses should distinguish ageing per se, genetic factors, work exposures, and lifestyle in order to understand their relative and combined effects on health.”

“Musculoskeletal disorders have long been recognized as an important source of morbidity and disability in many occupational populations.1,2 Most musculoskeletal disorders, for most people, are characterized by recurrent episodes of pain that vary in severity and in their consequences for work. Most episodes subside uneventfully within days or weeks, often without any intervention, though about half of people continue to experience some pain and functional limitations after 12 months.3,4 In working populations, musculoskeletal disorders may lead to a spell of sickness absence. Sickness absence is increasingly used as a health parameter of interest when studying the consequences of functional limitations due to disease in occupational groups. Since duration of sickness absence contributes substantially to the indirect costs of illness, interventions increasingly address return to work (RTW).5 […] The Clinical Standards Advisory Group in the United Kingdom reported RTW within 2 weeks for 75% of all low back pain (LBP) absence episodes and suggested that approximately 50% of all work days lost due to back pain in the working population are from the 85% of people who are off work for less than 7 days.6″

“Any RTW curve over time can be described with a mathematical Weibull function.15 This Weibull function is characterized by a scale parameter λ and a shape parameter k. The scale parameter λ is a function of different covariates that include the intervention effect, preferably expressed as hazard ratio (HR) between the intervention group and the reference group in a Cox’s proportional hazards regression model. The shape parameter k reflects the relative increase or decrease in survival time, thus expressing how much the RTW rate will decrease with prolonged sick leave. […] a HR as measure of effect can be introduced as a covariate in the scale parameter λ in the Weibull model and the difference in areas under the curve between the intervention model and the basic model will give the improvement in sickness absence days due to the intervention. By introducing different times of starting the intervention among those workers still on sick leave, the impact of timing of enrolment can be evaluated. Subsequently, the estimated changes in total sickness absence days can be expressed in a benefit/cost ratio (BC ratio), where benefits are the costs saved due to a reduction in sickness absence and costs are the expenditures relating to the intervention.15”
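The mechanics are easy to sketch. Below is a minimal Python illustration of the idea (not the authors' actual model): a Weibull curve for the probability of still being on sick leave, a hazard ratio applied to the cumulative hazard, and the difference in areas under the two curves converted into a benefit/cost ratio. All parameter values, costs and the follow-up period are invented for illustration.

```python
import numpy as np

# Hypothetical inputs: Weibull scale (days) and shape, hazard ratio, follow-up and costs.
lam, k = 30.0, 0.8
hr = 1.3                       # hazard ratio (RTW) for intervention vs. reference
followup_days = 365
cost_per_absence_day = 200.0   # assumed cost of one day of sickness absence
intervention_cost = 500.0      # assumed cost of the intervention per worker

t = np.linspace(0.0, followup_days, 10_000)
dt = t[1] - t[0]

def p_still_on_sick_leave(t, lam, k, hr=1.0):
    # Weibull 'survival' curve for remaining on sick leave; an HR > 1 multiplies the
    # cumulative hazard, i.e. speeds up return to work.
    return np.exp(-hr * (t / lam) ** k)

# Expected sickness absence days = area under the curve of P(still on sick leave).
auc_reference = p_still_on_sick_leave(t, lam, k).sum() * dt
auc_intervention = p_still_on_sick_leave(t, lam, k, hr).sum() * dt

days_saved = auc_reference - auc_intervention
bc_ratio = days_saved * cost_per_absence_day / intervention_cost
print(f"sickness absence days saved per worker: {days_saved:.1f}; benefit/cost ratio: {bc_ratio:.2f}")
```

Delaying the start of the intervention can be mimicked in the same framework by applying the HR only from the enrolment day onwards, which is how the timing-of-enrolment question enters the calculation.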

“A crucial factor in understanding why interventions are effective or not is the timing of the enrolment of workers on sick leave into the intervention. The RTW pattern over time […] has important consequences for appropriate timing of the best window for effective clinical and occupational interventions. The evidence presented by Palmer and colleagues clearly suggests that [in the context of LBP] a stepped care approach is required. In the first step of rapid RTW, most workers will return to work even without specific interventions. Simple, short interventions involving effective coordination and cooperation between primary health care and the workplace will be sufficient to help the majority of workers to achieve an early RTW. In the second step, more expensive, structured interventions are reserved for those who are having difficulties returning, typically between 4 weeks and 3 months. However, to date there is little evidence on the optimal timing of such interventions for workers on sick leave due to LBP.14,15 […] the cost-benefits of a structured RTW intervention among workers on sick leave will be determined by the effectiveness of the intervention, the natural speed of RTW in the target population, the timing of the enrolment of workers into the intervention, and the costs of both the intervention and of a day of sickness absence. […] The cost-effectiveness of a RTW intervention will be determined by the effectiveness of the intervention, the costs of the intervention and of a day of sickness absence, the natural course of RTW in the target population, the timing of the enrolment of workers into the RTW intervention, and the time lag before the intervention takes effect. The latter three factors are seldom taken into consideration in systematic reviews and guidelines for management of RTW, although their impact may easily be as important  as classical measures of effectiveness, such as effect size or HR.”

“In order to obtain information of the highest quality and utility, surveillance schemes have to be designed, set up, and managed with the same methodological rigour as high-calibre prospective cohort studies. Whether surveillance schemes are voluntary or not, considerable effort has to be invested to ensure a satisfactory and sufficient denominator, the best numerator quality, and the most complete ascertainment. Although the force of statute is relied upon in some surveillance schemes, even in these the initial and continuing motivation of the reporters (usually physicians) is paramount. […] There is a surveillance ‘pyramid’ within which the patient’s own perception is at the base, the GP is at a higher level, and the clinical specialist is close to the apex. The source of the surveillance reports affects the numerator because case severity and case mix differ according to the level in the pyramid.19 Although incidence rate estimates may be expected to be lower at the higher levels in the surveillance pyramid this is not necessarily always the case. […] Although surveillance undertaken by physicians who specialize in the organ system concerned or in occupational disease (or in both aspects) may be considered to be the medical ‘gold standard’ it can suffer from a more limited patient catchment because of various referral filters. Surveillance by GPs will capture numerator cases as close to the base of the pyramid as possible, but may suffer from greater diagnostic variation than surveillance by specialists. Limiting recruitment to GPs with a special interest, and some training, in occupational medicine is a compromise between the two levels.20

“When surveillance is part of a statutory or other compulsory scheme then incident case identification is a continuous and ongoing process. However, when surveillance is voluntary, for a research objective, it may be preferable to sample over shorter, randomly selected intervals, so as to reduce the demands associated with the data collection and ‘reporting fatigue’. Evidence so far suggests that sampling over shorter time intervals results in higher incidence estimates than continuous sampling.21 […] Although reporting fatigue is an important consideration in tempering conclusions drawn from […] multilevel models, it is possible to take account of this potential bias in various ways. For example, when evaluating interventions, temporal trends in outcomes resulting from other exposures can be used to control for fatigue.23,24 The phenomenon of reporting fatigue may be characterized by an ‘excess of zeroes’ beyond what is expected of a Poisson distribution and this effect can be quantified.27 […] There are several considerations in determining incidence from surveillance data. It is possible to calculate an incidence rate based on the general population, on the population of working age, or on the total working population,19 since these denominator bases are generally readily available, but such rates are not the most useful in determining risk. Therefore, incidence rates are usually calculated in respect of specific occupations or industries.22 […] Ideally, incidence rates should be expressed in relation to quantitative estimates of exposure but most surveillance schemes would require additional data collection as special exercises to achieve this aim.” [for much more on these topics, see also M’ikanatha & Iskander’s book.]
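The 'excess of zeroes' diagnostic is simple to illustrate on simulated data: compare the observed share of zero reports with the share a Poisson distribution with the same mean would imply. This is only a rough sketch of the idea, not the modelling approach used in the cited studies.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated monthly case reports from 200 reporters: most follow a Poisson process,
# but a fraction report nothing at all (e.g. because of reporting fatigue).
true_mean, fatigued_share, n = 2.0, 0.3, 200
reports = rng.poisson(true_mean, size=n)
reports[rng.random(n) < fatigued_share] = 0   # 'structural' zeroes

observed_zero_share = np.mean(reports == 0)
expected_zero_share = stats.poisson.pmf(0, reports.mean())  # zero share implied by a Poisson fit

print(f"observed share of zero reports:  {observed_zero_share:.2f}")
print(f"Poisson-implied share of zeroes: {expected_zero_share:.2f}")
# A clear excess of observed zeroes points towards a zero-inflated model
# (and possibly reporting fatigue) rather than a plain Poisson.
```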

“Estimates of lung cancer risk attributable to occupational exposures vary considerably by geographical area and depend on study design, especially on the exposure assessment method, but may account for around 5–20% of cancers among men, but less (<5%) among women;2 among workers exposed to (suspected) lung carcinogens, the percentage will be higher. […] most exposure to known lung carcinogens originates from occupational settings and will affect millions of workers worldwide.  Although it has been established that these agents are carcinogenic, only limited evidence is available about the risks encountered at much lower levels in the general population. […] One of the major challenges in community-based occupational epidemiological studies has been valid assessment of the occupational exposures experienced by the population at large. Contrary to the detailed information usually available for an industrial population (e.g. in a retrospective cohort study in a large chemical company) that often allows for quantitative exposure estimation, community-based studies […] have to rely on less precise and less valid estimates. The choice of method of exposure assessment to be applied in an epidemiological study depends on the study design, but it boils down to choosing between acquiring self-reported exposure, expert-based individual exposure assessment, or linking self-reported job histories with job-exposure matrices (JEMs) developed by experts. […] JEMs have been around for more than three decades.14 Their main distinction from either self-reported or expert-based exposure assessment methods is that exposures are no longer assigned at the individual subject level but at job or task level. As a result, JEMs make no distinction in assigned exposure between individuals performing the same job, or even between individuals performing a similar job in different companies. […] With the great majority of occupational exposures having a rather low prevalence (<10%) in the general population it is […] extremely important that JEMs are developed aiming at a highly specific exposure assessment so that only jobs with a high likelihood (prevalence) and intensity of exposure are considered to be exposed. Aiming at a high sensitivity would be disastrous because a high sensitivity would lead to an enormous number of individuals being assigned an exposure while actually being unexposed […] Combinations of the methods just described exist as well”.
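Mechanically, applying a JEM is a table lookup: experts assign an exposure prevalence/intensity to each job code, and every subject reporting that job receives the same value. A toy sketch in Python/pandas, with invented job codes and exposure values:

```python
import pandas as pd

# Toy job-exposure matrix: exposure is assigned at job level, not individual level.
jem = pd.DataFrame({
    "job_code":         ["7212", "9313", "2221", "5223"],
    "silica_exposed":   [1, 1, 0, 0],        # high-specificity yes/no assignment
    "silica_intensity": [0.8, 0.3, 0.0, 0.0],
})

# Self-reported job histories from a community-based study.
subjects = pd.DataFrame({
    "subject_id": [1, 2, 3, 4],
    "job_code":   ["7212", "2221", "9313", "5223"],
})

# Every subject with the same job code receives the same exposure estimate.
linked = subjects.merge(jem, on="job_code", how="left")
print(linked)
```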

“Community-based studies, by definition, address a wider range of types of exposure and a much wider range of encountered exposure levels (e.g. relatively high exposures in primary production but often lower in downstream use, or among indirectly exposed individuals). A limitation of single community-based studies is often the relatively low number of exposed individuals. Pooling across studies might therefore be beneficial. […] Pooling projects need careful planning and coordination, because the original studies were conducted for different purposes, at different time periods, using different questionnaires. This heterogeneity is sometimes perceived as a disadvantage but also implies variations that can be studied and thereby provide important insights. Every pooling project has its own dynamics but there are several general challenges that most pooling projects confront. Creating common variables for all studies can stretch from simple re-naming of variables […] or recoding of units […] to the re-categorization of national educational systems […] into years of formal education. Another challenge is to harmonize the different classification systems of, for example, diseases (e.g. International Classification of Disease (ICD)-9 versus ICD-10), occupations […], and industries […]. This requires experts in these respective fields as well as considerable time and money. Harmonization of data may mean losing some information; for example, ISCO-68 contains more detail than ISCO-88, which makes it possible to recode ISCO-68 to ISCO-88 with only a little loss of detail, but it is not possible to recode ISCO-88 to ISCO-68 without losing one or two digits in the job code. […] Making the most of the data may imply that not all studies will qualify for all analyses. For example, if a study did not collect data regarding lung cancer cell type, it can contribute to the overall analyses but not to the cell type-specific analyses. It is important to remember that the quality of the original data is critical; poor data do not become better by pooling.”

December 6, 2017 Posted by | Books, Cancer/oncology, Demographics, Epidemiology, Health Economics, Medicine, Ophthalmology, Statistics | Leave a comment

Quotes

i. “The party that negotiates in haste is often at a disadvantage.” (Howard Raiffa)

ii. “Advice: don’t embarrass your bargaining partner by forcing him or her to make all the concessions.” (-ll-)

iii. “Disputants often fare poorly when they each act greedily and deceptively.” (-ll-)

iv. “Each man does seek his own interest, but, unfortunately, not according to the dictates of reason.” (Kenneth Waltz)

v. “Whatever is said after I’m gone is irrelevant.” (Jimmy Savile)

vi. “Trust is an important lubricant of a social system. It is extremely efficient; it saves a lot of trouble to have a fair degree of reliance on other people’s word. Unfortunately this is not a commodity which can be bought very easily. If you have to buy it, you already have some doubts about what you have bought.” (Kenneth Arrow)

vii. “… an author never does more damage to his readers than when he hides a difficulty.” (Évariste Galois)

viii. “A technical argument by a trusted author, which is hard to check and looks similar to arguments known to be correct, is hardly ever checked in detail” (Vladimir Voevodsky)

ix. “Suppose you want to teach the “cat” concept to a very young child. Do you explain that a cat is a relatively small, primarily carnivorous mammal with retractible claws, a distinctive sonic output, etc.? I’ll bet not. You probably show the kid a lot of different cats, saying “kitty” each time, until it gets the idea. To put it more generally, generalizations are best made by abstraction from experience. They should come one at a time; too many at once overload the circuits.” (Ralph P. Boas Jr.)

x. “Every author has several motivations for writing, and authors of technical books always have, as one motivation, the personal need to understand; that is, they write because they want to learn, or to understand a phenomenon, or to think through a set of ideas.” (Albert Wymore)

xi. “Great mathematics is achieved by solving difficult problems not by fabricating elaborate theories in search of a problem.” (Harold Davenport)

xii. “Is science really gaining in its assault on the totality of the unsolved? As science learns one answer, it is characteristically true that it also learns several new questions. It is as though science were working in a great forest of ignorance, making an ever larger circular clearing within which, not to insist on the pun, things are clear… But as that circle becomes larger and larger, the circumference of contact with ignorance also gets longer and longer. Science learns more and more. But there is an ultimate sense in which it does not gain; for the volume of the appreciated but not understood keeps getting larger. We keep, in science, getting a more and more sophisticated view of our essential ignorance.” (Warren Weaver)

xiii. “When things get too complicated, it sometimes makes sense to stop and wonder: Have I asked the right question?” (Enrico Bombieri)

xiv. “The mean and variance are unambiguously determined by the distribution, but a distribution is, of course, not determined by its mean and variance: A number of different distributions have the same mean and the same variance.” (Richard von Mises)

xv. “Algorithms existed for at least five thousand years, but people did not know that they were algorithmizing. Then came Turing (and Post and Church and Markov and others) and formalized the notion.” (Doron Zeilberger)

xvi. “When a problem seems intractable, it is often a good idea to try to study “toy” versions of it in the hope that as the toys become increasingly larger and more sophisticated, they would metamorphose, in the limit, to the real thing.” (-ll-)

xvii. “The kind of mathematics foisted on children in schools is not meaningful, fun, or even very useful. This does not mean that an individual child cannot turn it into a valuable and enjoyable personal game. For some the game is scoring grades; for others it is outwitting the teacher and the system. For many, school math is enjoyable in its repetitiveness, precisely because it is so mindless and dissociated that it provides a shelter from having to think about what is going on in the classroom. But all this proves is the ingenuity of children. It is not a justification for school math to say that despite its intrinsic dullness, inventive children can find excitement and meaning in it.” (Seymour Papert)

xviii. “The optimist believes that this is the best of all possible worlds, and the pessimist fears that this might be the case.” (Ivar Ekeland)

xix. “An equilibrium is not always an optimum; it might not even be good. This may be the most important discovery of game theory.” (-ll-)

xx. “It’s not all that rare for people to suffer from a self-hating monologue. Any good theories about what’s going on there?”

“If there’s things you don’t like about your life, you can blame yourself, or you can blame others. If you blame others and you’re of low status, you’ll be told to cut that out and start blaming yourself. If you blame yourself and you can’t solve the problems, self-hate is the result.” (Nancy Lebovitz & ‘The Nybbler’)

December 1, 2017 Posted by | Mathematics, Quotes/aphorisms, Science, Statistics | 4 Comments

Common Errors in Statistics… (III)

This will be my last post about the book. I liked most of it, and I gave it four stars on goodreads, but that doesn’t mean there weren’t any observations included in the book with which I took issue/disagreed. Here’s one of the things I didn’t like:

“In the univariate [model selection] case, if the errors were not normally distributed, we could take advantage of permutation methods to obtain exact significance levels in tests of the coefficients. Exact permutation methods do not exist in the multivariable case.

When selecting variables to incorporate in a multivariable model, we are forced to perform repeated tests of hypotheses, so that the resultant p-values are no longer meaningful. One solution, if sufficient data are available, is to divide the dataset into two parts, using the first part to select variables, and the second part to test these same variables for significance.” (chapter 13)

The basic idea is to use the results of hypothesis tests to decide which variables to include in the model. This is both common practice and bad practice. I found it surprising that such advice would be included in this book, as I’d figured beforehand that this was precisely the sort of thing a book like this one would tell people not to do. I’ve said it before multiple times on this blog, but I’ll keep saying it, especially when I find this sort of advice in statistics textbooks: Using hypothesis testing as a basis for model selection is invalid, and it’s in general a terrible idea. “There is no statistical theory that supports the notion that hypothesis testing with a fixed α level is a basis for model selection.” (Burnham & Anderson). Use information criteria, not hypothesis tests, to make your model selection decisions. (And read Burnham & Anderson’s book on these topics.)
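For contrast, here is a minimal sketch of what information-criterion-based selection looks like in practice (simulated data, statsmodels): a small set of candidate models is specified up front and compared on AIC, instead of variables being added or dropped on the basis of p-values.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "x1": rng.normal(size=n),
    "x2": rng.normal(size=n),
    "x3": rng.normal(size=n),   # pure noise
})
df["y"] = 1.0 + 0.8 * df["x1"] + 0.4 * df["x2"] + rng.normal(size=n)

# A small set of candidate models specified in advance, compared on AIC.
candidates = ["y ~ x1", "y ~ x1 + x2", "y ~ x1 + x2 + x3"]
fits = {f: smf.ols(f, data=df).fit() for f in candidates}
for formula, fit in sorted(fits.items(), key=lambda kv: kv[1].aic):
    print(f"AIC = {fit.aic:8.1f}   {formula}")
```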

Anyway, much of the stuff included in the book was good stuff and it’s a very decent book. I’ve added some quotes and observations from the last part of the book below.

“OLS is not the only modeling technique. To diminish the effect of outliers, and treat prediction errors as proportional to their absolute magnitude rather than their squares, one should use least absolute deviation (LAD) regression. This would be the case if the conditional distribution of the dependent variable were characterized by a distribution with heavy tails (compared to the normal distribution, increased probability of values far from the mean). One should also employ LAD regression when the conditional distribution of the dependent variable given the predictors is not symmetric and we wish to estimate its median rather than its mean value.
If it is not clear which variable should be viewed as the predictor and which the dependent variable, as is the case when evaluating two methods of measurement, then one should employ Deming or error in variable (EIV) regression.
If one’s primary interest is not in the expected value of the dependent variable but in its extremes (the number of bacteria that will survive treatment or the number of individuals who will fall below the poverty line), then one ought consider the use of quantile regression.
If distinct strata exist, one should consider developing separate regression models for each stratum, a technique known as ecological regression […] If one’s interest is in classification or if the majority of one’s predictors are dichotomous, then one should consider the use of classification and regression trees (CART) […] If the outcomes are limited to success or failure, one ought employ logistic regression. If the outcomes are counts rather than continuous measurements, one should employ a generalized linear model (GLM).”
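To make the OLS-versus-LAD contrast concrete, here is a short sketch using statsmodels' quantile regression at the median (which is LAD) alongside OLS on simulated data with heavy-tailed errors:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 300
x = rng.uniform(0, 10, n)
# Heavy-tailed (t with 2 df) errors: a setting where LAD is more robust than OLS.
y = 2.0 + 0.5 * x + rng.standard_t(df=2, size=n)

X = sm.add_constant(x)
ols_fit = sm.OLS(y, X).fit()
lad_fit = sm.QuantReg(y, X).fit(q=0.5)   # median (LAD) regression

print("OLS slope:", round(ols_fit.params[1], 3))
print("LAD slope:", round(lad_fit.params[1], 3))
```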

“Linear regression is a much misunderstood and mistaught concept. If a linear model provides a good fit to data, this does not imply that a plot of the dependent variable with respect to the predictor would be a straight line, only that a plot of the dependent variable with respect to some not-necessarily monotonic function of the predictor would be a line. For example, y = A + B log[x] and y = A cos(x) + B sin(x) are both linear models whose coefficients A and B might be derived by OLS or LAD methods. Y = Ax^5 is a linear model. Y = x^A is nonlinear. […] Perfect correlation (ρ^2 = 1) does not imply that two variables are identical but rather that one of them, Y, say, can be written as a linear function of the other, Y = a + bX, where b is the slope of the regression line and a is the intercept. […] Nonlinear regression methods are appropriate when the form of the nonlinear model is known in advance. For example, a typical pharmacological model will have the form A exp[bX] + C exp[dW]. The presence of numerous locally optimal but globally suboptimal solutions creates challenges, and validation is essential. […] To be avoided are a recent spate of proprietary algorithms available solely in software form that guarantee to find a best-fitting solution. In the words of John von Neumann, “With four parameters I can fit an elephant and with five I can make him wiggle his trunk.””
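The point that 'linear' refers to the coefficients rather than the shape of the curve is easy to demonstrate: y = A + B log[x] is fitted by ordinary least squares on the transformed predictor. A small simulated sketch:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.uniform(1, 100, 200)
y = 3.0 + 2.0 * np.log(x) + rng.normal(scale=0.5, size=200)

# The model is linear in A and B even though the plot of y against x is curved.
X = sm.add_constant(np.log(x))
fit = sm.OLS(y, X).fit()
print(fit.params)   # estimates of A and B, recovered by OLS
# A model like Y = x^A, by contrast, is nonlinear in its parameter and needs
# nonlinear least squares.
```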

“[T]he most common errors associated with quantile regression include: 1. Failing to evaluate whether the model form is appropriate, for example, forcing linear fit through an obvious nonlinear response. (Of course, this is also a concern with mean regression, OLS, LAD, or EIV.) 2. Trying to over interpret a single quantile estimate (say 0.85) with a statistically significant nonzero slope (p < 0.05) when the majority of adjacent quantiles (say 0.5 − 0.84 and 0.86 − 0.95) are clearly zero (p > 0.20). 3. Failing to use all the information a quantile regression provides. Even if you think you are only interested in relations near maximum (say 0.90 − 0.99), your understanding will be enhanced by having estimates (and sampling variation via confidence intervals) across a wide range of quantiles (say 0.01 − 0.99).”

“Survival analysis is used to assess time-to-event data including time to recovery and time to revision. Most contemporary survival analysis is built around the Cox model […] Possible sources of error in the application of this model include all of the following: *Neglecting the possible dependence of the baseline function λ0 on the predictors. *Overmatching, that is, using highly correlated predictors that may well mask each other’s effects. *Using the parametric Breslow or Kaplan–Meier estimators of the survival function rather than the nonparametric Nelson–Aalen estimator. *Excluding patients based on post-hoc criteria. Pathology workups on patients who died during the study may reveal that some of them were wrongly diagnosed. Regardless, patients cannot be eliminated from the study as we lack the information needed to exclude those who might have been similarly diagnosed but who are still alive at the conclusion of the study. *Failure to account for differential susceptibility (frailty) of the patients”.

“In reporting the results of your modeling efforts, you need to be explicit about the methods used, the assumptions made, the limitations on your model’s range of application, potential sources of bias, and the method of validation […] Multivariable regression is plagued by the same problems univariate regression is heir to, plus many more of its own. […] If choosing the correct functional form of a model in a univariate case presents difficulties, consider that in the case of k variables, there are k linear terms (should we use logarithms? should we add polynomial terms?) and k(k − 1) first-order cross products of the form x_i x_k. Should we include any of the k(k − 1)(k − 2) second-order cross products? A common error is to attribute the strength of a relationship to the magnitude of the predictor’s regression coefficient […] Just scale the units in which the predictor is reported to see how erroneous such an assumption is. […] One of the main problems in multiple regression is multicollinearity, which is the correlation among predictors. Even relatively weak levels of multicollinearity are enough to generate instability in multiple regression models […]. A simple solution is to evaluate the correlation matrix M among predictors, and use this matrix to choose the predictors that are less correlated. […] Test M for each predictor, using the variance inflation factor (VIF) given by (1 − R^2)^(−1), where R^2 is the multiple coefficient of determination of the predictor against all other predictors. If VIF is large for a given predictor (>8, say) delete this predictor and reestimate the model. […] Dropping collinear variables from the analysis can result in a substantial loss of power”.
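The VIF rule of thumb quoted above can be computed directly with statsmodels; the sketch below does so for simulated, deliberately collinear predictors.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(3)
n = 500
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.1 * rng.normal(size=n)   # nearly collinear with x1
x3 = rng.normal(size=n)

X = sm.add_constant(np.column_stack([x1, x2, x3]))
# VIF_j = 1/(1 - R^2_j), where R^2_j comes from regressing predictor j on the others.
for j, name in zip(range(1, 4), ["x1", "x2", "x3"]):
    print(name, round(variance_inflation_factor(X, j), 1))
```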

“It can be difficult to predict the equilibrium point for a supply-and-demand model, because producers change their price in response to demand and consumers change their demand in response to price. Failing to account for endogeneous variables can lead to biased estimates of the regression coefficients.
Endogeneity can arise not only as a result of omitted variables, but of measurement error, autocorrelated errors, simultaneity, and sample selection errors. One solution is to make use of instrument variables that should satisfy two conditions: 1. They should be correlated with the endogenous explanatory variables, conditional on the other covariates. 2. They should not be correlated with the error term in the explanatory equation, that is, they should not suffer from the same problem as the original predictor.
Instrumental variables are commonly used to estimate causal effects in contexts in which controlled experiments are not possible, for example in estimating the effects of past and projected government policies.”
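The logic of instrumental variables can be shown with a hand-rolled two-stage least squares on simulated data. (Dedicated IV routines exist and should be used for real inference, since the second-stage standard errors printed this way are not correct, but the two OLS stages make the idea concrete.)

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 5000
z = rng.normal(size=n)                               # instrument
u = rng.normal(size=n)                               # unobserved confounder
x = 0.8 * z + 0.6 * u + rng.normal(size=n)           # endogenous predictor
y = 1.0 + 0.5 * x + 0.9 * u + rng.normal(size=n)     # true causal effect of x is 0.5

print("naive OLS slope:", round(sm.OLS(y, sm.add_constant(x)).fit().params[1], 3))

# Stage 1: regress the endogenous predictor on the instrument.
x_hat = sm.OLS(x, sm.add_constant(z)).fit().fittedvalues
# Stage 2: regress the outcome on the fitted values from stage 1.
tsls = sm.OLS(y, sm.add_constant(x_hat)).fit()
print("2SLS slope:", round(tsls.params[1], 3))
# The point estimate is the IV estimate, but the second-stage standard errors are not
# valid; use a dedicated IV routine for inference.
```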

“[T]he following errors are frequently associated with factor analysis: *Applying it to datasets with too few cases in relation to the number of variables analyzed […], without noticing that correlation coefficients have very wide confidence intervals in small samples. *Using oblique rotation to get a number of factors bigger or smaller than the number of factors obtained in the initial extraction by principal components, as a way to show the validity of a questionnaire. For example, obtaining only one factor by principal components and using the oblique rotation to justify that there were two differentiated factors, even when the two factors were correlated and the variance explained by the second factor was very small. *Confusion among the total variance explained by a factor and the variance explained in the reduced factorial space. In this way a researcher interpreted that a given group of factors explaining 70% of the variance before rotation could explain 100% of the variance after rotation.”

“Poisson regression is appropriate when the dependent variable is a count, as is the case with the arrival of individuals in an emergency room. It is also applicable to the spatial distributions of tornadoes and of clusters of galaxies.2 To be applicable, the events underlying the outcomes must be independent […] A strong assumption of the Poisson regression model is that the mean and variance are equal (equidispersion). When the variance of a sample exceeds the mean, the data are said to be overdispersed. Fitting the Poisson model to overdispersed data can lead to misinterpretation of coefficients due to poor estimates of standard errors. Naturally occurring count data are often overdispersed due to correlated errors in time or space, or other forms of nonindependence of the observations. One solution is to fit a Poisson model as if the data satisfy the assumptions, but adjust the model-based standard errors usually employed. Another solution is to estimate a negative binomial model, which allows for scalar overdispersion.”
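A brief sketch of both remedies mentioned above, using statsmodels on simulated overdispersed counts: rescaling the Poisson standard errors by the Pearson dispersion ('quasi-Poisson'), and fitting a negative binomial model instead. The data-generating values are arbitrary.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 1000
x = rng.normal(size=n)
# Overdispersed counts: a gamma-distributed multiplicative error makes the variance
# exceed the mean (negative-binomial-type data).
mu = np.exp(0.5 + 0.7 * x)
y = rng.poisson(mu * rng.gamma(shape=2.0, scale=0.5, size=n))

X = sm.add_constant(x)
poisson_naive = sm.GLM(y, X, family=sm.families.Poisson()).fit()
poisson_scaled = sm.GLM(y, X, family=sm.families.Poisson()).fit(scale="X2")  # Pearson-scaled SEs
negbin = sm.GLM(y, X, family=sm.families.NegativeBinomial()).fit()

for name, fit in [("Poisson", poisson_naive), ("quasi-Poisson", poisson_scaled), ("neg. binomial", negbin)]:
    print(f"{name:14s} slope = {fit.params[1]:.3f}, SE = {fit.bse[1]:.3f}")
```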

“When multiple observations are collected for each principal sampling unit, we refer to the collected information as panel data, correlated data, or repeated measures. […] The dependency of observations violates one of the tenets of regression analysis: that observations are supposed to be independent and identically distributed or IID. Several concerns arise when observations are not independent. First, the effective number of observations (that is, the effective amount of information) is less than the physical number of observations […]. Second, any model that fails to specifically address [the] correlation is incorrect […]. Third, although the correct specification of the correlation will yield the most efficient estimator, that specification is not the only one to yield a consistent estimator.”

“The basic issue in deciding whether to utilize a fixed- or random-effects model is whether the sampling units (for which multiple observations are collected) represent the collection of most or all of the entities for which inference will be drawn. If so, the fixed-effects estimator is to be preferred. On the other hand, if those same sampling units represent a random sample from a larger population for which we wish to make inferences, then the random-effects estimator is more appropriate. […] Fixed- and random-effects models address unobserved heterogeneity. The random-effects model assumes that the panel-level effects are randomly distributed. The fixed-effects model assumes a constant disturbance that is a special case of the random-effects model. If the random-effects assumption is correct, then the random-effects estimator is more efficient than the fixed-effects estimator. If the random-effects assumption does not hold […], then the random effects model is not consistent. To help decide whether the fixed- or random-effects model is more appropriate, use the Durbin–Wu–Hausman3 test comparing coefficients from each model. […] Although fixed-effects estimators and random-effects estimators are referred to as subject-specific estimators, the GEEs available through PROC GENMOD in SAS or xtgee in Stata, are called population-averaged estimators. This label refers to the interpretation of the fitted regression coefficients. Subject-specific estimators are interpreted in terms of an effect for a given panel, whereas population-averaged estimators are interpreted in terms of an effect averaged over panels.”

“A favorite example in comparing subject-specific and population-averaged estimators is to consider the difference in interpretation of regression coefficients for a binary outcome model on whether a child will exhibit symptoms of respiratory illness. The predictor of interest is whether or not the child’s mother smokes. Thus, we have repeated observations on children and their mothers. If we were to fit a subject-specific model, we would interpret the coefficient on smoking as the change in likelihood of respiratory illness as a result of the mother switching from not smoking to smoking. On the other hand, the interpretation of the coefficient in a population-averaged model is the likelihood of respiratory illness for the average child with a nonsmoking mother compared to the likelihood for the average child with a smoking mother. Both models offer equally valid interpretations. The interpretation of interest should drive model selection; some studies ultimately will lead to fitting both types of models. […] In addition to model-based variance estimators, fixed-effects models and GEEs [Generalized Estimating Equation models] also admit modified sandwich variance estimators. SAS calls this the empirical variance estimator. Stata refers to it as the Robust Cluster estimator. Whatever the name, the most desirable property of the variance estimator is that it yields inference for the regression coefficients that is robust to misspecification of the correlation structure. […] Specification of GEEs should include careful consideration of reasonable correlation structure so that the resulting estimator is as efficient as possible. To protect against misspecification of the correlation structure, one should base inference on the modified sandwich variance estimator. This is the default estimator in SAS, but the user must specify it in Stata.”
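A minimal GEE sketch in Python/statsmodels along the lines described above: an exchangeable working correlation is specified, and the default covariance is the robust (sandwich) estimator, so inference has some protection against misspecifying that structure. The data are simulated.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
n_groups, per_group = 100, 8
g = np.repeat(np.arange(n_groups), per_group)
group_effect = rng.normal(scale=0.5, size=n_groups)[g]   # induces within-group correlation
x = rng.normal(size=n_groups * per_group)
y = 1.0 + 0.5 * x + group_effect + rng.normal(size=n_groups * per_group)
df = pd.DataFrame({"y": y, "x": x, "group": g})

# Population-averaged (GEE) model with an exchangeable working correlation;
# statsmodels reports robust (sandwich) standard errors by default.
gee = smf.gee("y ~ x", groups="group", data=df,
              cov_struct=sm.cov_struct.Exchangeable(),
              family=sm.families.Gaussian()).fit()
print(gee.summary())
```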

“There are three main approaches to [model] validation: 1. Independent verification (obtained by waiting until the future arrives or through the use of surrogate variables). 2. Splitting the sample (using one part for calibration, the other for verification) 3. Resampling (taking repeated samples from the original sample and refitting the model each time).
Goodness of fit is no guarantee of predictive success. […] Splitting the sample into two parts, one for estimating the model parameters, the other for verification, is particularly appropriate for validating time series models in which the emphasis is on prediction or reconstruction. If the observations form a time series, the more recent observations should be reserved for validation purposes. Otherwise, the data used for validation should be drawn at random from the entire sample. Unfortunately, when we split the sample and use only a portion of it, the resulting estimates will be less precise. […] The proportion to be set aside for validation purposes will depend upon the loss function. If both the goodness-of-fit error in the calibration sample and the prediction error in the validation sample are based on mean-squared error, Picard and Berk [1990] report that we can minimize their sum by using between a quarter and a third of the sample for validation purposes.”
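A short sketch of split-sample validation for a time series, following the advice to reserve the most recent observations for validation; the AR(1) setup and the two-thirds/one-third split are arbitrary illustrative choices.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 300
# Simulated AR(1) series.
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.7 * y[t - 1] + rng.normal()

# Reserve the most recent third for validation (do not sample at random here).
split = int(n * 2 / 3)
train, test = y[:split], y[split:]

# Calibrate a simple lag-1 regression on the training part ...
fit = sm.OLS(train[1:], sm.add_constant(train[:-1])).fit()
# ... and measure prediction error on the held-out, more recent part.
pred = fit.params[0] + fit.params[1] * test[:-1]
mse = np.mean((test[1:] - pred) ** 2)
print(f"in-sample MSE: {np.mean(fit.resid ** 2):.3f}   out-of-sample MSE: {mse:.3f}")
```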

November 13, 2017 Posted by | Books, Statistics | Leave a comment

Common Errors in Statistics… (II)

Some more observations from the book below:

“[A] multivariate test, can be more powerful than a test based on a single variable alone, providing the additional variables are relevant. Adding variables that are unlikely to have value in discriminating among the alternative hypotheses simply because they are included in the dataset can only result in a loss of power. Unfortunately, what works when making a comparison between two populations based on a single variable fails when we attempt a multivariate comparison. Unless the data are multivariate normal, Hotelling’s T^2, the multivariate analog of Student’s t, will not provide tests with the desired significance level. Only samples far larger than those we are likely to afford in practice are likely to yield multi-variate results that are close to multivariate normal. […] [A]n exact significance level can [however] be obtained in the multivariate case regardless of the underlying distribution by making use of the permutation distribution of Hotelling’s T^2.”

“If you are testing against a one-sided alternative, for example, no difference versus improvement, then you require a one-tailed or one-sided test. If you are doing a head-to-head comparison — which alternative is best? — then a two-tailed test is required. […] A comparison of two experimental effects requires a statistical test on their difference […]. But in practice, this comparison is often based on an incorrect procedure involving two separate tests in which researchers conclude that effects differ when one effect is significant (p < 0.05) but the other is not (p > 0.05). Nieuwenhuis, Forstmann, and Wagenmakers [2011] reviewed 513 behavioral, systems, and cognitive neuroscience articles in five top-ranking journals and found that 78 used the correct procedure and 79 used the incorrect procedure. […] When the logic of a situation calls for demonstration of similarity rather than differences among responses to various treatments, then equivalence tests are often more relevant than tests with traditional no-effect null hypotheses […] Two distributions F and G, such that G[x] = F[x − δ], are said to be equivalent providing |δ| < Δ, where Δ is the smallest difference of clinical significance. To test for equivalence, we obtain a confidence interval for δ, rejecting equivalence only if this interval contains values in excess of |Δ|. The width of a confidence interval decreases as the sample size increases; thus, a very large sample may be required to demonstrate equivalence just as a very large sample may be required to demonstrate a clinically significant effect.”
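The equivalence criterion described above is straightforward to operationalize: compute a confidence interval for the difference and check whether it lies entirely within ±Δ. The sketch below uses simulated data, an arbitrarily chosen Δ, and a 90% interval (the level corresponding to the usual two one-sided tests formulation).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
delta_max = 0.5          # smallest difference of clinical significance (assumed)
a = rng.normal(loc=10.0, scale=1.0, size=120)
b = rng.normal(loc=10.1, scale=1.0, size=120)

diff = a.mean() - b.mean()
se = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
df = a.size + b.size - 2
lo, hi = diff + np.array([-1, 1]) * stats.t.ppf(0.95, df) * se   # 90% CI

within_margin = (lo > -delta_max) and (hi < delta_max)
print(f"90% CI for difference: ({lo:.2f}, {hi:.2f}); within ±{delta_max}: {within_margin}")
```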

“The most common test for comparing the means of two populations is based upon Student’s t. For Student’s t-test to provide significance levels that are exact rather than approximate, all the observations must be independent and, under the null hypothesis, all the observations must come from identical normal distributions. Even if the distribution is not normal, the significance level of the t-test is almost exact for sample sizes greater than 12; for most of the distributions one encounters in practice,5 the significance level of the t-test is usually within a percent or so of the correct value for sample sizes between 6 and 12. For testing against nonnormal alternatives, more powerful tests than the t-test exist. For example, a permutation test replacing the original observations with their normal scores is more powerful than the t-test […]. Permutation tests are derived by looking at the distribution of values the test statistic would take for each of the possible assignments of treatments to subjects. For example, if in an experiment two treatments were assigned at random to six subjects so that three subjects got one treatment and three the other, there would have been a total of 20 possible assignments of treatments to subjects.6 To determine a p-value, we compute for the data in hand each of the 20 possible values the test statistic might have taken. We then compare the actual value of the test statistic with these 20 values. If our test statistic corresponds to the most extreme value, we say that p = 1/20 = 0.05 (or 1/10 = 0.10 if this is a two-tailed permutation test). Against specific normal alternatives, this two-sample permutation test provides a most powerful unbiased test of the distribution-free hypothesis that the centers of the two distributions are the same […]. Violation of assumptions can affect not only the significance level of a test but the power of the test […] For example, although the significance level of the t-test is robust to departures from normality, the power of the t-test is not.”
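The 20-assignment example (choosing which 3 of 6 subjects receive the first treatment: 6-choose-3 = 20) can be written out directly. The sketch below enumerates all assignments and computes the permutation p-value for a difference-in-means statistic on made-up responses.

```python
from itertools import combinations
import numpy as np

# Made-up responses for six subjects; the first three received treatment A.
responses = np.array([4.2, 5.1, 6.3, 2.0, 2.8, 3.1])
observed = responses[:3].mean() - responses[3:].mean()

# Enumerate all C(6, 3) = 20 ways of assigning three subjects to treatment A.
stats_under_null = []
for idx in combinations(range(6), 3):
    a = responses[list(idx)]
    b = np.delete(responses, list(idx))
    stats_under_null.append(a.mean() - b.mean())

# One-tailed p-value: how extreme is the observed assignment among all 20?
p = np.mean(np.array(stats_under_null) >= observed)
print(f"{len(stats_under_null)} assignments, one-tailed p = {p:.3f}")
```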

“Group randomized trials (GRTs) in public health research typically use a small number of randomized groups with a relatively large number of participants per group. Typically, some naturally occurring groups are targeted: work sites, schools, clinics, neighborhoods, even entire towns or states. A group can be assigned to either the intervention or control arm but not both; thus, the group is nested within the treatment. This contrasts with the approach used in multicenter clinical trials, in which individuals within groups (treatment centers) may be assigned to any treatment. GRTs are characterized by a positive correlation of outcomes within a group and by the small number of groups. Feng et al. [2001] report a positive intraclass correlation (ICC) between the individuals’ target-behavior outcomes within the same group. […] The variance inflation factor (VIF) as a result of such commonalities is 1 + (n − 1)σ. […] Although σ in GRTs is usually quite small, the VIFs could still be quite large because VIF is a function of the product of the correlation and group size n. […] To be appropriate, an analysis method of GRTs need acknowledge both the ICC and the relatively small number of groups.”

“Recent simulations reveal that the classic test based on Pearson correlation is almost distribution free [Good, 2009]. Still, too often we treat a test of the correlation between two variables X and Y as if it were a test of their independence. X and Y can have a zero correlation coefficient, yet be totally dependent (for example, Y = X^2). Even when the expected value of Y is independent of the expected value of X, the variance of Y might be directly proportional to the variance of X.”
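The Y = X^2 example is a one-liner to verify numerically: the sample correlation is essentially zero even though Y is a deterministic function of X.

```python
import numpy as np

rng = np.random.default_rng(9)
x = rng.normal(size=100_000)
y = x ** 2                                  # Y is completely determined by X ...
print(round(np.corrcoef(x, y)[0, 1], 3))    # ... yet the Pearson correlation is ~0
```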

“[O]ne of the most common statistical errors is to assume that because an effect is not statistically significant it does not exist. One of the most common errors in using the analysis of variance is to assume that because a factor such as sex does not yield a significant p-value that we may eliminate it from the model. […] The process of eliminating nonsignificant factors one by one from an analysis of variance means that we are performing a series of tests rather than a single test; thus, the actual significance level is larger than the declared significance level.”

“The greatest error associated with the use of statistical procedures is to make the assumption that one single statistical methodology can suffice for all applications. From time to time, a new statistical procedure will be introduced or an old one revived along with the assertion that at last the definitive solution has been found. […] Every methodology [however] has a proper domain of application and another set of applications for which it fails. Every methodology has its drawbacks and its advantages, its assumptions and its sources of error.”

“[T]o use the bootstrap or any other statistical methodology effectively, one has to be aware of its limitations. The bootstrap is of value in any situation in which the sample can serve as a surrogate for the population. If the sample is not representative of the population because the sample is small or biased, not selected at random, or its constituents are not independent of one another, then the bootstrap will fail. […] When using Bayesian methods[:] Do not use an arbitrary prior. Never report a p-value. Incorporate potential losses in the decision. Report the Bayes’ factor. […] In performing a meta-analysis, we need to distinguish between observational studies and randomized trials. Confounding and selection bias can easily distort the findings from observational studies. […] Publication and selection bias also plague the meta-analysis of completely randomized trials. […] One can not incorporate in a meta-analysis what one is not aware of. […] Similarly, the decision as to which studies to incorporate can dramatically affect the results. Meta-analyses of the same issue may reach opposite conclusions […] Where there are substantial differences between the different studies incorporated in a meta-analysis (their subjects or their environments), or substantial quantitative differences in the results from the different trials, a single overall summary estimate of treatment benefit has little practical applicability […]. Any analysis that ignores this heterogeneity is clinically misleading and scientifically naive […]. Heterogeneity should be scrutinized, with an attempt to explain it […] Bayesian methods can be effective in meta-analyses […]. In such situations, the parameters of various trials are considered to be random samples from a distribution of trial parameters. The parameters of this higher-level distribution are called hyperparameters, and they also have distributions. The model is called hierarchical. The extent to which the various trials reinforce each other is determined by the data. If the trials are very similar, the variation of the hyperparameters will be small, and the analysis will be very close to a classical meta-analysis. If the trials do not reinforce each other, the conclusions of the hierarchical Bayesian analysis will show a very high variance in the results. A hierarchical Bayesian analysis avoids the necessity of a prior decision as to whether the trials can be combined; the extent of the combination is determined purely by the data. This does not come for free; in contrast to the meta-analyses discussed above, all the original data (or at least the sufficient statistics) must be available for inclusion in the hierarchical model. The Bayesian method is also vulnerable to […] selection bias”.

“For small samples of three to five observations, summary statistics are virtually meaningless. Reproduce the actual observations; this is easier to do and more informative. Though the arithmetic mean or average is in common use for summarizing measurements, it can be very misleading. […] When the arithmetic mean is meaningful, it is usually equal to or close to the median. Consider reporting the median in the first place. The geometric mean is more appropriate than the arithmetic in three sets of circumstances: 1. When losses or gains can best be expressed as a percentage rather than a fixed value. 2. When rapid growth is involved, as is the case with bacterial and viral populations. 3. When the data span several orders of magnitude, as with the concentration of pollutants. […] Most populations are actually mixtures of populations. If multiple modes are observed in samples greater than 25 in size, the number of modes should be reported. […] The terms dispersion, precision, and accuracy are often confused. Dispersion refers to the variation within a sample or a population. Standard measures of dispersion include the variance, the mean absolute deviation, the interquartile range, and the range. Precision refers to how close several estimates based upon successive samples will come to one another, whereas accuracy refers to how close an estimate based on a sample will come to the population parameter it is estimating.”

“One of the most egregious errors in statistics, one encouraged, if not insisted upon by the editors of journals in the biological and social sciences, is the use of the notation “Mean ± Standard Error” to report the results of a set of observations. The standard error is a useful measure of population dispersion if the observations are continuous measurements that come from a normal or Gaussian distribution. […] But if the observations come from a nonsymmetric distribution such as an exponential or a Poisson, or a truncated distribution such as the uniform, or a mixture of populations, we cannot draw any such inference. Recall that the standard error equals the standard deviation divided by the square root of the sample size […] As the standard error depends on the squares of individual observations, it is particularly sensitive to outliers. A few extreme or outlying observations will have a dramatic impact on its value. If you can not be sure your observations come from a normal distribution, then consider reporting your results either in the form of a histogram […] or a Box and Whiskers plot […] If the underlying distribution is not symmetric, the use of the ± SE notation can be deceptive as it suggests a nonexistent symmetry. […] When the estimator is other than the mean, we cannot count on the Central Limit Theorem to ensure a symmetric sampling distribution. We recommend that you use the bootstrap whenever you report an estimate of a ratio or dispersion. […] If you possess some prior knowledge of the shape of the population distribution, you should take advantage of that knowledge by using a parametric bootstrap […]. The parametric bootstrap is particularly recommended for use in determining the precision of percentiles in the tails (P20, P10, P90, and so forth).”
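As a concrete version of that recommendation, here is a small nonparametric bootstrap for a ratio-type estimator (the coefficient of variation) on simulated skewed data; a parametric bootstrap would instead resample from a distribution fitted to the sample.

```python
import numpy as np

rng = np.random.default_rng(10)
sample = rng.lognormal(mean=0.0, sigma=0.8, size=80)   # skewed data

def coef_of_variation(x):
    return x.std(ddof=1) / x.mean()

# Nonparametric bootstrap: resample the observed data with replacement.
boot = np.array([
    coef_of_variation(rng.choice(sample, size=sample.size, replace=True))
    for _ in range(5000)
])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"CV = {coef_of_variation(sample):.2f}, bootstrap 95% percentile interval: ({lo:.2f}, {hi:.2f})")
```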

“A common error is to misinterpret the confidence interval as a statement about the unknown parameter. It is not true that the probability that a parameter is included in a 95% confidence interval is 95%. What is true is that if we derive a large number of 95% confidence intervals, we can expect the true value of the parameter to be included in the computed intervals 95% of the time. (That is, the true values will be included if the assumptions on which the tests and confidence intervals are based are satisfied 100% of the time.) Like the p-value, the upper and lower confidence limits of a particular confidence interval are random variables, for they depend upon the sample that is drawn. […] In interpreting a confidence interval based on a test of significance, it is essential to realize that the center of the interval is no more likely than any other value, and the confidence to be placed in the interval is no greater than the confidence we have in the experimental design and statistical test it is based upon.”
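The 'many repeated intervals' interpretation can be checked by simulation: draw many samples from a known distribution, compute a 95% interval from each, and count how often the true mean is covered.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
true_mean, n, reps = 5.0, 30, 10_000
covered = 0
for _ in range(reps):
    x = rng.normal(loc=true_mean, scale=2.0, size=n)
    half = stats.t.ppf(0.975, n - 1) * x.std(ddof=1) / np.sqrt(n)
    covered += (x.mean() - half) <= true_mean <= (x.mean() + half)
print(f"coverage over {reps} intervals: {covered / reps:.3f}")   # close to 0.95
```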

“How accurate our estimates are and how consistent they will be from sample to sample will depend upon the nature of the error terms. If none of the many factors that contribute to the value of ε make more than a small contribution to the total, then ε will have a Gaussian distribution. If the {εi} are independent and normally distributed (Gaussian), then the ordinary least-squares estimates of the coefficients produced by most statistical software will be unbiased and have minimum variance. These desirable properties, indeed the ability to obtain coefficient values that are of use in practical applications, will not be present if the wrong model has been adopted. They will not be present if successive observations are dependent. The values of the coefficients produced by the software will not be of use if the associated losses depend on some function of the observations other than the sum of the squares of the differences between what is observed and what is predicted. In many practical problems, one is more concerned with minimizing the sum of the absolute values of the differences or with minimizing the maximum prediction error. Finally, if the error terms come from a distribution that is far from Gaussian, a distribution that is truncated, flattened or asymmetric, the p-values and precision estimates produced by the software may be far from correct.”

“I have attended far too many biology conferences at which speakers have used a significant linear regression of one variable on another as “proof” of a “linear” relationship or first-order behavior. […] The unfortunate fact, which should not be forgotten, is that if EY = a f[X], where f is a monotonically, increasing function of X, then any attempt to fit the equation Y = bg[X], where g is also a monotonically increasing function of X, will result in a value of b that is significantly different from zero. The “trick,” […] is in selecting an appropriate (cause-and-effect-based) functional form g to begin with. Regression methods and expensive software will not find the correct form for you.”

November 4, 2017 Posted by | Books, Statistics | Leave a comment

A few diabetes papers of interest

i. Chronic Fatigue in Type 1 Diabetes: Highly Prevalent but Not Explained by Hyperglycemia or Glucose Variability.

“Fatigue is a classical symptom of hyperglycemia, but the relationship between chronic fatigue and diabetes has not been systematically studied. […] glucose control [in diabetics] is often suboptimal with persistent episodes of hyperglycemia that may result in sustained fatigue. Fatigue may also sustain in diabetic patients because it is associated with the presence of a chronic disease, as has been demonstrated in patients with rheumatoid arthritis and various neuromuscular disorders (2,3).

It is important to distinguish between acute and chronic fatigue, because chronic fatigue, defined as severe fatigue that persists for at least 6 months, leads to substantial impairments in patients’ daily functioning (4,5). In contrast, acute fatigue can largely vary during the day and generally does not cause functional impairments.

Literature provides limited evidence for higher levels of fatigue in diabetic patients (6,7), but its chronicity, impact, and determinants are unknown. In various chronic diseases, it has been proven useful to distinguish between precipitating and perpetuating factors of chronic fatigue (3,8). Illness-related factors trigger acute fatigue, while other factors, often cognitions and behaviors, cause fatigue to persist. Sleep disturbances, low self-efficacy concerning fatigue, reduced physical activity, and a strong focus on fatigue are examples of these fatigue-perpetuating factors (8–10). An episode of hyperglycemia or hypoglycemia could trigger acute fatigue for diabetic patients (11,12). However, variations in blood glucose levels might also contribute to chronic fatigue, because these variations continuously occur.

The current study had two aims. First, we investigated the prevalence and impact of chronic fatigue in a large sample of type 1 diabetic (T1DM) patients and compared the results to a group of age- and sex-matched population-based controls. Secondly, we searched for potential determinants of chronic fatigue in T1DM.”

“A significantly higher percentage of T1DM patients were chronically fatigued (40%; 95% CI 34–47%) than matched controls (7%; 95% CI 3–10%). Mean fatigue severity was also significantly higher in T1DM patients (31 ± 14) compared with matched controls (17 ± 9; P < 0.001). T1DM patients with a comorbidity_mr [a comorbidity affecting patients’ daily functioning, based on medical records – US] or clinically relevant depressive symptoms [based on scores on the Beck Depression Inventory for Primary Care – US] were significantly more often chronically fatigued than patients without a comorbidity_mr (55 vs. 36%; P = 0.014) or without clinically relevant depressive symptoms (88 vs. 31%; P < 0.001). Patients who reported neuropathy, nephropathy, or cardiovascular disease as complications of diabetes were more often chronically fatigued […] Chronically fatigued T1DM patients were significantly more impaired compared with nonchronically fatigued T1DM patients on all aspects of daily functioning […]. Fatigue was the most troublesome symptom of the 34 assessed diabetes-related symptoms. The five most troublesome symptoms were overall sense of fatigue, lack of energy, increasing fatigue in the course of the day, fatigue in the morning when getting up, and sleepiness or drowsiness”.

“This study establishes that chronic fatigue is highly prevalent and clinically relevant in T1DM patients. While current blood glucose level was only weakly associated with chronic fatigue, cognitive behavioral factors were by far the strongest potential determinants.”

“Another study found that type 2 diabetic, but not T1DM, patients had higher levels of fatigue compared with healthy controls (7). This apparent discrepancy may be explained by the relatively small sample size of this latter study, potential selection bias (patients were not randomly selected), and the use of a different fatigue questionnaire.”

“Not only was chronic fatigue highly prevalent, fatigue also had a large impact on T1DM patients. Chronically fatigued T1DM patients had more functional impairments than nonchronically fatigued patients, and T1DM patients considered fatigue as the most burdensome diabetes-related symptom.

Contrary to what was expected, there was at best a weak relationship between blood glucose level and chronic fatigue. Chronically fatigued T1DM patients spent slightly less time in hypoglycemia, but average glucose levels, glucose variability, hyperglycemia, or HbA1c were not related to chronic fatigue. In type 2 diabetes mellitus also, no relationship was found between fatigue and HbA1c (7).”

“Regarding demographic characteristics, current health status, diabetes-related factors, and fatigue-related cognitions and behaviors as potential determinants of chronic fatigue, we found that sleeping problems, physical activity, self-efficacy concerning fatigue, age, depression, and pain were significantly associated with chronic fatigue in T1DM. Although depression was strongly related, it could not completely explain the presence of chronic fatigue (38), as 31% was chronically fatigued without having clinically relevant depressive symptoms.”

Some comments may be worth adding here. It’s important to note, for people who may not be aware of this, that although chronic fatigue is a weird entity that’s hard to get a handle on (and, to be frank, is somewhat controversial), specific organic causes have been identified that greatly increase the risk. Many survivors of cancer experience chronic fatigue (see e.g. this paper, or wikipedia), and chronic fatigue is also not uncommon in a kidney failure setting (“The silence of renal disease creeps up on us (doctors and patients). Do not dismiss odd chronic symptoms such as fatigue or ‘not being quite with it’ without considering checking renal function” (Oxford Handbook of Clinical Medicine, 9th edition. My italics – US)). As noted above, links with RA and some neuromuscular disorders have also been observed. The brief discussion of related topics in Houghton & Grey made it clear to me that some people with chronic fatigue are almost certainly suffering from an organic illness which has not been diagnosed or treated. Here’s a relevant quote from that book’s coverage: “it is unusual to find a definite organic cause for fatigue. However, consider anaemia, thyroid dysfunction, Addison’s disease and hypopituitarism.” It’s sort of neat, if you think about the potential diabetes-fatigue link investigated by the guys above, that some of these diseases are likely to be relevant: type 1 diabetics are more likely to develop them because their development is driven by some of the same genetic variants which cause type 1 diabetes (anemia is not linked to diabetes, as far as I know, and I believe the relationship between autoimmune hypophysitis – which is a cause of hypopituitarism – and type 1 diabetes is at best unclear, but the others are definitely involved). The combinations of some of these diseases even have fancy names of their own, like ‘Type I Polyglandular Autoimmune Syndrome’ and ‘Schmidt Syndrome’ (if you’re interested, here are a couple of medscape links). It’s noteworthy that although most of these diseases are uncommon in the general population, their incidence/prevalence is likely to be greatly increased in type 1 diabetics due to the common genetic pathways at play (variants regulating T-cell function seem to be important, but there’s no need to go into these details here). Sperling et al. note in their book that: “Hypothyroid or hyperthyroid AITD [autoimmune thyroid disease] has been observed in 10–24% of patients with type 1 diabetes”. In one series including 151 patients with APS [/PAS]-2, when they looked at disease combinations they found that: “Of combinations of the component diseases, [type 1] diabetes with thyroid disease was the most common, occurring in 33%. The second, diabetes with adrenal insufficiency, made up 15%” (same source).

Estimates like these make it seem likely that a substantial proportion of type 1 diabetics over time go on to develop other health problems which, if undiagnosed or unaddressed, might cause fatigue, and in my opinion this may be a much more important cause than direct metabolic effects such as hyperglycemia or chronic inflammation. If this were the case you’d however expect to see a substantial sex difference, as the autoimmune syndromes are in general much more likely to hit females than males. I’m not completely sure how to interpret a few of the results reported, but to me the sex differences in this study don’t look anywhere near ‘large enough’ to support such an explanatory model. Another big problem is that fatigue seems to be more common in young patients, which is weird; most long-term complications display significant (positive) duration dependence, and when diabetes is a component of an autoimmune syndrome, the diabetes tends to develop first, with other diseases hitting later, usually in middle age. Duration and age are strongly correlated, and a negative duration dependence in a diabetes complication setting is a surprising and unusual finding that badly needs to be explained; in my opinion it may be the sign of a poor disease model. It’d make more sense for disease-related fatigue to present late rather than early, and I don’t really know what to make of that negative age gradient. ‘More studies needed’ (preferably by people familiar with those autoimmune syndromes…), etc.

ii. Risk for End-Stage Renal Disease Over 25 Years in the Population-Based WESDR Cohort.

“It is well known that diabetic nephropathy is the leading cause of end-stage renal disease (ESRD) in many regions, including the U.S. (1). Type 1 diabetes accounts for >45,000 cases of ESRD per year (2), and the incidence may be higher than in people with type 2 diabetes (3). Despite this, there are few population-based data available regarding the prevalence and incidence of ESRD in people with type 1 diabetes in the U.S. (4). A declining incidence of ESRD has been suggested by findings of lower incidence with increasing calendar year of diagnosis and in comparison with older reports in some studies in Europe and the U.S. (5–8). This is consistent with better diabetes management tools becoming available and increased renoprotective efforts, including the greater use of ACE inhibitors and angiotensin type II receptor blockers, over the past two to three decades (9). Conversely, no reduction in the incidence of ESRD across enrollment cohorts was found in a recent clinic-based study (9). Further, an increase in ESRD has been suggested for older but not younger people (9). Recent improvements in diabetes care have been suggested to delay rather than prevent the development of renal disease in people with type 1 diabetes (4).

A decrease in the prevalence of proliferative retinopathy by increasing calendar year of type 1 diabetes diagnosis was previously reported in the Wisconsin Epidemiologic Study of Diabetic Retinopathy (WESDR) cohort (10); therefore, we sought to determine if a similar pattern of decline in ESRD would be evident over 25 years of follow-up. Further, we investigated factors that may mediate a possible decline in ESRD as well as other factors associated with incident ESRD over time.”

“At baseline, 99% of WESDR cohort members were white and 51% were male. Individuals were 3–79 years of age (mean 29) with diabetes duration of 0–59 years (mean 15), diagnosed between 1922 and 1980. Four percent of individuals used three or more daily insulin injections and none used an insulin pump. Mean HbA1c was 10.1% (87 mmol/mol). Only 16% were using an antihypertensive medication, none was using an ACE inhibitor, and 3% reported a history of renal transplant or dialysis (ESRD). At 25 years, 514 individuals participated (52% of original cohort at baseline, n = 996) and 367 were deceased (37% of baseline). Mean HbA1c was much lower than at baseline (7.5%, 58 mmol/mol), the decline likely due to the improvements in diabetes care, with 80% of participants using intensive insulin management (three or more daily insulin injections or insulin pump). The decline in HbA1c was steady, becoming slightly steeper following the results of the DCCT (25). Overall, at the 25-year follow-up, 47% had proliferative retinopathy, 53% used aspirin daily, and 54% reported taking antihypertensive medications, with the majority (87%) using an ACE inhibitor. Thirteen percent reported a history of ESRD.”

“Prevalence of ESRD was negligible until 15 years of diabetes duration and then steadily increased with 5, 8, 10, 13, and 14% reporting ESRD by 15–19, 20–24, 25–29, 30–34, and 35+ years of diabetes duration, respectively. […] After 15 years of diagnosis, prevalence of ESRD increased with duration in people diagnosed from 1960 to 1980, with the lowest increase in people with the most recent diagnosis. People diagnosed from 1922 to 1959 had consistent rather than increasing levels of ESRD with duration of 20+ years. If not for their greater mortality (at the 25-year follow-up, 48% of the deceased had been diagnosed prior to 1960), an increase with duration may have also been observed.

From baseline, the unadjusted cumulative 25-year incidence of ESRD was 17.9% (95% CI 14.3–21.5) in males, 10.3% (7.4–13.2) in females, and 14.2% (11.9–16.5) overall. For those diagnosed in 1970–1980, the cumulative incidence at 14, 20, and 25 years of follow-up (or ∼15–25, 20–30, and 25–35 years diabetes duration) was 5.2, 7.9, and 9.3%, respectively. At 14, 20, and 25 years of follow-up (or 35, 40, and 45 up to 65+ years diabetes duration), the cumulative incidence in those diagnosed during 1922–1969 was 13.6, 16.3, and 18.8%, respectively, consistent with the greater prevalence observed for these diagnosis periods at longer duration of diabetes.”

“The unadjusted hazard of ESRD was reduced by 70% among those diagnosed in 1970–1980 as compared with those in 1922–1969 (HR 0.29 [95% CI 0.19–0.44]). Duration (by 10%) and HbA1c (by an additional 10%) partially mediated this association […] Blood pressure and antihypertensive medication use each further attenuated the association. When fully adjusted for these and [other risk factors included in the model], period of diagnosis was no longer significant (HR 0.89 [0.55–1.45]). Sensitivity analyses for the hazard of incident ESRD or death due to renal disease showed similar findings […] The most parsimonious model included diabetes duration, HbA1c, age, sex, systolic and diastolic blood pressure, and history of antihypertensive medication […]. A 32% increased risk for incident ESRD was found per increasing year of diabetes duration at 0–15 years (HR 1.32 per year [95% CI 1.16–1.51]). The hazard plateaued (1.01 per year [0.98–1.05]) after 15 years of duration of diabetes. Hazard of ESRD increased with increasing HbA1c (1.28 per 1% or 10.9 mmol/mol increase [1.14–1.45]) and blood pressure (1.51 per 10 mmHg increase in systolic pressure [1.35–1.68]; 1.12 per 5 mmHg increase in diastolic pressure [1.01–1.23]). Use of antihypertensive medications increased the hazard of incident ESRD nearly fivefold [this finding is almost certainly due to confounding by indication, as also noted by the authors later on in the paper – US], and males had approximately two times the risk as compared with females. […] Having proliferative retinopathy was strongly associated with increased risk (HR 5.91 [3.00–11.6]) and attenuated the association between sex and ESRD.”

“The current investigation […] sought to provide much-needed information on the prevalence and incidence of ESRD and associated risk specific to people with type 1 diabetes. Consistent with a few previous studies (5,7,8), we observed decreased prevalence and incidence of ESRD among individuals with type 1 diabetes diagnosed in the 1970s compared with prior to 1970. The Epidemiology of Diabetes Complications (EDC) Study, another large cohort of people with type 1 diabetes followed over a long period of time, reported cumulative incidence rates of 2–6% for those diagnosed after 1970 and with similar duration (7), comparable to our findings. Slightly higher cumulative incidence (7–13%) reported from older studies at slightly lower duration also supports a decrease in incidence of ESRD (28–30). Cumulative incidences through 30 years in European cohorts were even lower (3.3% in Sweden [6] and 7.8% in Finland [5]), compared with the 9.3% noted for those diagnosed during 1970–1980 in the WESDR cohort. The lower incidence could be associated with nationally organized care, especially in Sweden where a nationwide intensive diabetes management treatment program was implemented at least a decade earlier than recommendations for intensive care followed from the results of the DCCT in the U.S.”

“We noted an increased risk of incident ESRD in the first 15 years of diabetes not evident at longer durations. This pattern, also demonstrated by others, could be due to a greater earlier risk among people most genetically susceptible, as only a subset of individuals with type 1 diabetes will develop renal disease (27,28). The risk plateau associated with greater durations of diabetes and lower risk associated with increasing age may also reflect more death at longer durations and older ages. […] Because age and duration are highly correlated, we observed a positive association between age and ESRD only in univariate analyses, without adjustment for duration. The lack of adjustment for diabetes duration may have, in part, explained the increasing incidence of ESRD shown with age for some people in a recent investigation (9). Adjustment for both age and duration was found appropriate after testing for collinearity in the current analysis.”

“In conclusion, this U.S. population-based report showed a lower prevalence and incidence of ESRD among those more recently diagnosed, explained by improvements in glycemic and blood pressure control over the last several decades. Even lower rates may be expected for those diagnosed during the current era of diabetes care. Intensive diabetes management, especially for glycemic control, remains important even in long-standing diabetes as potentially delaying the development of ESRD.”

iii. Earlier Onset of Complications in Youth With Type 2 Diabetes.

“The prevalence of type 2 diabetes in youth is increasing worldwide, coinciding with the rising obesity epidemic (1,2). […] Diabetes is associated with both microvascular and macrovascular complications. The evolution of these complications has been well described in type 1 diabetes (6) and in adult type 2 diabetes (7), wherein significant complications typically manifest 15–20 years after diagnosis (8). Because type 2 diabetes is a relatively new disease in children (first described in the 1980s), long-term outcome data on complications are scant, and risk factors for the development of complications are incompletely understood. The available literature suggests that development of complications in youth with type 2 diabetes may be more rapid than in adults, thus afflicting individuals at the height of their individual and social productivity (9). […] A small but notable proportion of type 2 diabetes is associated with a polymorphism of hepatic nuclear factor (HNF)-1α, a transcription factor expressed in many tissues […] It is not yet known what effect the HNF-1α polymorphism has on the risk of complications associated with diabetes.”

“The main objective of the current study was to describe the time course and risk factors for microvascular complications (nephropathy, retinopathy, and neuropathy) and macrovascular complications (cardiac, cerebrovascular, and peripheral vascular diseases) in a large cohort of youth [diagnosed with type 2 diabetes] who have been carefully followed for >20 years and to compare this evolution with that of youth with type 1 diabetes. We also compared vascular complications in the youth with type 2 diabetes with nondiabetic control youth. Finally, we addressed the impact of HNF-1α G319S on the evolution of complications in young patients with type 2 diabetes.”

“All prevalent cases of type 2 diabetes and type 1 diabetes (control group 1) seen between January 1986 and March 2007 in the DER-CA for youth aged 1–18 years were included. […] The final type 2 diabetes cohort included 342 youth, and the type 1 diabetes control group included 1,011. The no diabetes control cohort comprised 1,710 youth matched to the type 2 diabetes cohort from the repository […] Compared with the youth with type 1 diabetes, the youth with type 2 diabetes were, on average, older at the time of diagnosis and more likely to be female. They were more likely to have a higher BMIz, live in a rural area, have a low SES, and have albuminuria at diagnosis. […] one-half of the type 2 diabetes group was either a heterozygote (GS) or a homozygote (SS) for the HNF-1α polymorphism […] At the time of the last available follow-up in the DER-CA, the youth with diabetes were, on average, between 15 and 16 years of age. […] The median follow-up times in the repository were 4.4 (range 0–27.4) years for youth with type 2 diabetes, 6.7 (0–28.2) years for youth with type 1 diabetes, and 6.0 (0–29.9) years for nondiabetic control youth.”

“After controlling for low SES, sex, and BMIz, the risk associated with type 2 versus type 1 diabetes of any complication was an HR of 1.47 (1.02–2.12, P = 0.04). […] In the univariate analysis, youth with type 2 diabetes were at significantly higher risk of developing any vascular (HR 6.15 [4.26–8.87], P < 0.0001), microvascular (6.26 [4.32–9.10], P < 0.0001), or macrovascular (4.44 [1.71–11.52], P < 0.0001) disease compared with control youth without diabetes. In addition, the youth with type 2 diabetes had an increased risk of ophthalmologic (19.49 [9.75–39.00], P < 0.0001), renal (16.13 [7.66–33.99], P < 0.0001), and neurologic (2.93 [1.79–4.80], P ≤ 0.001) disease. There were few cardiovascular, cerebrovascular, and peripheral vascular disease events in all groups (five or fewer events per group). Despite this, there was still a statistically significant higher risk of peripheral vascular disease in the type 2 diabetes group (6.25 [1.68–23.28], P = 0.006).”

“Differences in renal and neurologic complications between the two diabetes groups began to occur before 5 years postdiagnosis, whereas differences in ophthalmologic complications began 10 years postdiagnosis. […] Both cardiovascular and cerebrovascular complications were rare in both groups, but peripheral vascular complications began to occur 15 years after diagnosis in the type 2 diabetes group […] The presence of HNF-1α G319S polymorphism in youth with type 2 diabetes was found to be protective of complications. […] Overall, major complications were rare in the type 1 diabetes group, but they occurred in 1.1% of the type 2 diabetes cohort at 10 years, in 26.0% at 15 years, and in 47.9% at 20 years after diagnosis (P < 0.001) […] youth with type 2 diabetes have a higher risk of any complication than youth with type 1 diabetes and nondiabetic control youth. […] The time to both renal and neurologic complications was significantly shorter in youth with type 2 diabetes than in control youth, whereas differences were not significant with respect to ophthalmologic and cardiovascular complications between cohorts. […] The current study is consistent with the literature, which has shown high rates of cardiovascular risk factors in youth with type 2 diabetes. However, despite the high prevalence of risk, this study reports low rates of clinical events. Because the median follow-up time was between 5 and 8 years, it is possible that a longer follow-up period would be required to correctly evaluate macrovascular outcomes in young adults. Also possible is that diagnoses of mild disease are not being made because of a low index of suspicion in 20- and 30-year-old patients.”

“In conclusion, youth with type 2 diabetes have an increased risk of complications early in the course of their disease. Microvascular complications and cardiovascular risk factors are highly prevalent, whereas macrovascular complications are rare in young adulthood. HbA1c is an important modifiable risk factor; thus, optimizing glycemic control should remain an important goal of therapy.”

iv. HbA1c and Coronary Heart Disease Risk Among Diabetic Patients.

“We prospectively investigated the association of HbA1c at baseline and during follow-up with CHD risk among 17,510 African American and 12,592 white patients with type 2 diabetes. […] During a mean follow-up of 6.0 years, 7,258 incident CHD cases were identified. The multivariable-adjusted hazard ratios of CHD associated with different levels of HbA1c at baseline (<6.0 [reference group], 6.0–6.9, 7.0–7.9, 8.0–8.9, 9.0–9.9, 10.0–10.9, and ≥11.0%) were 1.00, 1.07 (95% CI 0.97–1.18), 1.16 (1.04–1.31), 1.15 (1.01–1.32), 1.26 (1.09–1.45), 1.27 (1.09–1.48), and 1.24 (1.10–1.40) (P trend = 0.002) for African Americans and 1.00, 1.04 (0.94–1.14), 1.15 (1.03–1.28), 1.29 (1.13–1.46), 1.41 (1.22–1.62), 1.34 (1.14–1.57), and 1.44 (1.26–1.65) (P trend <0.001) for white patients, respectively. The graded association of HbA1c during follow-up with CHD risk was observed among both African American and white diabetic patients (all P trend <0.001). Each one percentage increase of HbA1c was associated with a greater increase in CHD risk in white versus African American diabetic patients. When stratified by sex, age, smoking status, use of glucose-lowering agents, and income, this graded association of HbA1c with CHD was still present. […] The current study in a low-income population suggests a graded positive association between HbA1c at baseline and during follow-up with the risk of CHD among both African American and white diabetic patients with low socioeconomic status.”

A few more observations from the conclusions:

“Diabetic patients experience high mortality from cardiovascular causes (2). Observational studies have confirmed the continuous and positive association between glycemic control and the risk of cardiovascular disease among diabetic patients (4,5). But the findings from RCTs are sometimes uncertain. Three large RCTs (7–9) designed primarily to determine whether targeting different glucose levels can reduce the risk of cardiovascular events in patients with type 2 diabetes failed to confirm the benefit. Several reasons for the inconsistency of these studies can be considered. First, small sample sizes, short follow-up duration, and few CHD cases in some RCTs may limit the statistical power. Second, most epidemiological studies only assess a single baseline measurement of HbA1c with CHD risk, which may produce potential bias. The recent analysis of 10 years of posttrial follow-up of the UKPDS showed continued reductions for myocardial infarction and death from all causes despite an early loss of glycemic differences (10). The scientific evidence from RCTs was not sufficient to generate strong recommendations for clinical practice. Thus, consensus groups (AHA, ACC, and ADA) have provided a conservative endorsement (class IIb recommendation, level of evidence A) for the cardiovascular benefits of glycemic control (11). In the absence of conclusive evidence from RCTs, observational epidemiological studies might provide useful information to clarify the relationship between glycemia and CHD risk. In the current study with 30,102 participants with diabetes and 7,258 incident CHD cases during a mean follow-up of 6.0 years, we found a graded positive association by various HbA1c intervals of clinical relevance or by using HbA1c as a continuous variable at baseline and during follow-up with CHD risk among both African American and white diabetic patients. Each one percentage increase in baseline and follow-up HbA1c was associated with a 2 and 5% increased risk of CHD in African American and 6 and 11% in white diabetic patients. Each one percentage increase of HbA1c was associated with a greater increase in CHD risk in white versus African American diabetic patients.”

v. Blood Viscosity in Subjects With Normoglycemia and Prediabetes.

“Blood viscosity (BV) is the force that counteracts the free sliding of the blood layers within the circulation and depends on the internal cohesion between the molecules and the cells. Abnormally high BV can have several negative effects: the heart is overloaded to pump blood in the vascular bed, and the blood itself, more viscous, can damage the vessel wall. Furthermore, according to Poiseuille’s law (1), BV is inversely related to flow and might therefore reduce the delivery of insulin and glucose to peripheral tissues, leading to insulin resistance or diabetes (2–5).

It is generally accepted that BV is increased in diabetic patients (6–8). Although the reasons for this alteration are still under investigation, it is believed that the increase in osmolarity causes increased capillary permeability and, consequently, increased hematocrit and viscosity (9). It has also been suggested that the osmotic diuresis, consequence of hyperglycemia, could contribute to reduce plasma volume and increase hematocrit (10).

Cross-sectional studies have also supported a link between BV, hematocrit, and insulin resistance (11–17). Recently, a large prospective study has demonstrated that BV and hematocrit are risk factors for type 2 diabetes. Subjects in the highest quartile of BV were >60% more likely to develop diabetes than their counterparts in the lowest quartile (18). This finding confirms previous observations obtained in smaller or selected populations, in which the association between hemoglobin or hematocrit and occurrence of type 2 diabetes was investigated (19–22).

These observations suggest that the elevation in BV may be very early, well before the onset of diabetes, but definite data in subjects with normal glucose or prediabetes are missing. In the current study, we evaluated the relationship between BV and blood glucose in subjects with normal glucose or prediabetes in order to verify whether alterations in viscosity are appreciable in these subjects and at which blood glucose concentration they appear.”

“According to blood glucose levels, participants were divided into three groups: group A, blood glucose <90 mg/dL; group B, blood glucose between 90 and 99 mg/dL; and group C, blood glucose between 100 and 125 mg/dL. […] Hematocrit (P < 0.05) and BV (P between 0.01 and 0.001) were significantly higher in subjects with prediabetes and in those with blood glucose ranging from 90 to 99 mg/dL compared with subjects with blood glucose <90 mg/dL. […] The current study shows, for the first time, a direct relationship between BV and blood glucose in nondiabetic subjects. It also suggests that, even within glucose values considered completely normal, individuals with higher blood glucose levels have increases in BV comparable with those observed in subjects with prediabetes. […] Overall, changes in viscosity in diabetic patients are accepted as common and as a result of the disease. However, the relationship between blood glucose, diabetes, and viscosity may be much more complex. […] the main finding of the study is that BV significantly increases already at high-normal blood glucose levels, independently of other common determinants of hemorheology. Intervention studies are needed to verify whether changes in BV can influence the development of type 2 diabetes.”
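
For reference (standard physics rather than anything specific to the paper), the Hagen–Poiseuille relation the authors invoke is Q = π·ΔP·r⁴ / (8·η·L); for a fixed pressure difference ΔP, vessel radius r, and vessel length L, flow Q therefore falls in inverse proportion to the viscosity η, which is the mechanism behind the suggested link between high BV and reduced delivery of glucose and insulin to peripheral tissues.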

vi. Higher Relative Risk for Multiple Sclerosis in a Pediatric and Adolescent Diabetic Population: Analysis From DPV Database.

“Type 1 diabetes and multiple sclerosis (MS) are organ-specific inflammatory diseases, which result from an autoimmune attack against either pancreatic β-cells or the central nervous system; a combined appearance has been described repeatedly (1–3). For children and adolescents below the age of 21 years, the prevalence of type 1 diabetes in Germany and Austria is ∼19.4 cases per 100,000 population, and for MS it is 7–10 per 100,000 population (4–6). A Danish cohort study revealed a three times higher risk for the development of MS in patients with type 1 diabetes (7). Further, an Italian study conducted in Sardinia showed a five times higher risk for the development of type 1 diabetes in MS patients (8,9). An American study on female adults in whom diabetes developed before the age of 21 years yielded an up to 20 times higher risk for the development of MS (10).

These findings support the hypothesis of clustering between type 1 diabetes and MS. The pathogenesis behind this association is still unclear, but T-cell cross-reactivity was discussed as well as shared disease associations due to the HLA-DRB1-DQB1 gene loci […] The aim of this study was to evaluate the prevalence of MS in a diabetic population and to look for possible factors related to the co-occurrence of MS in children and adolescents with type 1 diabetes using a large multicenter survey from the Diabetes Patienten Verlaufsdokumentation (DPV) database.”

“We used a large database of pediatric and adolescent type 1 diabetic patients to analyze the RR of MS co-occurrence. The DPV database includes ∼98% of the pediatric diabetic population in Germany and Austria below the age of 21 years. In children and adolescents, the RR for MS in type 1 diabetes was estimated to be three to almost five times higher in comparison with the healthy population.”

November 2, 2017 Posted by | Cardiology, Diabetes, Epidemiology, Genetics, Immunology, Medicine, Nephrology, Statistics, Studies | Leave a comment

Common Errors in Statistics…

“Pressed by management or the need for funding, too many research workers have no choice but to go forward with data analysis despite having insufficient statistical training. Alas, though a semester or two of undergraduate statistics may develop familiarity with the names of some statistical methods, it is not enough to be aware of all the circumstances under which these methods may be applicable.

The purpose of the present text is to provide a mathematically rigorous but readily understandable foundation for statistical procedures. Here are such basic concepts in statistics as null and alternative hypotheses, p-value, significance level, and power. Assisted by reprints from the statistical literature, we reexamine sample selection, linear regression, the analysis of variance, maximum likelihood, Bayes’ Theorem, meta-analysis and the bootstrap. New to this edition are sections on fraud and on the potential sources of error to be found in epidemiological and case-control studies.

Examples of good and bad statistical methodology are drawn from agronomy, astronomy, bacteriology, chemistry, criminology, data mining, epidemiology, hydrology, immunology, law, medical devices, medicine, neurology, observational studies, oncology, pricing, quality control, seismology, sociology, time series, and toxicology. […] Lest the statisticians among you believe this book is too introductory, we point out the existence of hundreds of citations in statistical literature calling for the comprehensive treatment we have provided. Regardless of past training or current specialization, this book will serve as a useful reference; you will find applications for the information contained herein whether you are a practicing statistician or a well-trained scientist who just happens to apply statistics in the pursuit of other science.”

I’ve been reading this book. I really like it so far; it’s a nice book. A lot of the stuff included is review, but there are of course also some new ideas here and there (for example, I’d never heard about Stein’s paradox before), and given how much stuff you need to remember and keep in mind in order not to make silly mistakes when analyzing data or interpreting the results of statistical analyses, an occasional review of these things is probably a very good idea.

I have added some more observations from the first 100 pages or so below:

“Test only relevant null hypotheses. The null hypothesis has taken on an almost mythic role in contemporary statistics. Obsession with the null (more accurately spelled and pronounced nil) has been allowed to shape the direction of our research. […] Virtually any quantifiable hypothesis can be converted into null form. There is no excuse and no need to be content with a meaningless nil. […] we need to have an alternative hypothesis or alternatives firmly in mind when we set up a test. Too often in published research, such alternative hypotheses remain unspecified or, worse, are specified only after the data are in hand. We must specify our alternatives before we commence an analysis, preferably at the same time we design our study. Are our alternatives one-sided or two-sided? If we are comparing several populations at the same time, are their means ordered or unordered? The form of the alternative will determine the statistical procedures we use and the significance levels we obtain. […] The critical values and significance levels are quite different for one-tailed and two-tailed tests and, all too often, the wrong test has been employed in published work. McKinney et al. [1989] reviewed some 70-plus articles that appeared in six medical journals. In over half of these articles, Fisher’s exact test was applied improperly. Either a one-tailed test had been used when a two-tailed test was called for or the authors of the paper simply had not bothered to state which test they had used. […] the F-ratio and the chi-square are what are termed omnibus tests, designed to be sensitive to all possible alternatives. As such, they are not particularly sensitive to ordered alternatives such as “more fertilizer equals more growth” or “more aspirin equals faster relief of headache.” Tests for such ordered responses at k distinct treatment levels should properly use the Pitman correlation.”
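
To make the one-tailed/two-tailed point concrete, here is a minimal sketch (mine, not from the book) using a hypothetical 2×2 table; the counts are invented for illustration:

  from scipy.stats import fisher_exact

  table = [[8, 2],   # hypothetical counts: treatment successes / failures
           [3, 7]]   # control successes / failures

  _, p_two = fisher_exact(table, alternative="two-sided")
  _, p_one = fisher_exact(table, alternative="greater")
  print(f"two-tailed p = {p_two:.3f}, one-tailed p = {p_one:.3f}")

Which p-value is appropriate depends on the alternative hypothesis specified before the data were collected, which is exactly why the alternative has to be stated in advance and reported.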

“Before we initiate data collection, we must have a firm idea of what we will measure and how we will measure it. A good response variable

  • Is easy to record […]
  • Can be measured objectively on a generally accepted scale.
  • Is measured in appropriate units.
  • Takes values over a sufficiently large range that discriminates well.
  • Is well defined. […]
  • Has constant variance over the range used in the experiment (Bishop and Talbot, 2001).”

“A second fundamental principle is also applicable to both experiments and surveys: Collect exact values whenever possible. Worry about grouping them in intervals or discrete categories later.”

“Sample size must be determined for each experiment; there is no universally correct value. We need to understand and make use of the relationships among effect size, sample size, significance level, power, and the precision of our measuring instruments. Increase the precision (and hold all other parameters fixed) and we can decrease the required number of observations. Decreases in any or all of the intrinsic and extrinsic sources of variation will also result in a decrease in the required number. […] The smallest effect size of practical interest may be determined through consultation with one or more domain experts. The smaller this value, the greater the number of observations that will be required. […] Strictly speaking, the significance level and power should be chosen so as to minimize the overall cost of any project, balancing the cost of sampling with the costs expected from Type I and Type II errors. […] When determining sample size for data drawn from the binomial or any other discrete distribution, one should always display the power curve. […] As a result of inspecting the power curve by eye, you may come up with a less-expensive solution than your software. […] If the data do not come from a well-tabulated distribution, then one might use a bootstrap to estimate the power and significance level. […] Many researchers today rely on menu-driven software to do their power and sample-size calculations. Most such software comes with default settings […] — settings that are readily altered, if, that is, investigators bother to take the time.”
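
A minimal sketch (mine, not from the book) of the kind of simulated power calculation one might run when the data come from a discrete distribution; the binomial test, the assumed true proportion, and the sample sizes are all illustrative assumptions, and the printed table is a crude stand-in for the power curve the authors suggest inspecting by eye:

  import numpy as np
  from scipy.stats import binomtest

  def simulated_power(n, p_true, p_null=0.5, alpha=0.05, reps=2000, seed=0):
      """Fraction of simulated samples of size n that reject H0: p = p_null."""
      rng = np.random.default_rng(seed)
      rejections = 0
      for _ in range(reps):
          successes = int(rng.binomial(n, p_true))
          if binomtest(successes, n, p_null).pvalue < alpha:
              rejections += 1
      return rejections / reps

  for n in (20, 50, 100, 200):          # crude power "curve" over sample size
      print(n, simulated_power(n, p_true=0.65))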

“The relative ease with which a program like Stata […] can produce a sample size may blind us to the fact that the number of subjects with which we begin a study may bear little or no relation to the number with which we conclude it. […] Potential subjects can and do refuse to participate. […] Worse, they may agree to participate initially, then drop out at the last minute […]. They may move without a forwarding address before a scheduled follow-up, or may simply not bother to show up for an appointment. […] The key to a successful research program is to plan for such drop-outs in advance and to start the trials with some multiple of the number required to achieve a given power and significance level. […] it is the sample you end with, not the sample you begin with, that determines the power of your tests. […] An analysis of those who did not respond to a survey or a treatment can sometimes be as or more informative than the survey itself. […] Be sure to incorporate in your sample design and in your budget provisions for sampling nonresponders.”
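
A minimal sketch (mine) of the kind of drop-out allowance described above; the 20% expected drop-out rate is a made-up figure:

  import math

  def enrolment_target(n_required, expected_dropout):
      """Subjects to enrol so that roughly n_required are expected to complete."""
      return math.ceil(n_required / (1.0 - expected_dropout))

  print(enrolment_target(120, 0.20))   # -> 150 enrolled for 120 expected completers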

“[A] randomly selected sample may not be representative of the population as a whole. For example, if a minority comprises less than 10% of a population, then a jury of 12 persons selected at random from that population will fail to contain a single member of that minority at least 28% of the time.”
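
The 28% figure is just a back-of-envelope binomial calculation: if each of the 12 jurors is drawn independently from a population that is at most 10% minority, the chance that none of them belongs to the minority is at least (1 − 0.10)¹² ≈ 0.282, i.e. roughly 28% or more of the time.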

“The proper starting point for the selection of the best method of estimation is with the objectives of our study: What is the purpose of our estimate? If our estimate is θ* and the actual value of the unknown parameter is θ, what losses will we be subject to? It is difficult to understand the popularity of the method of maximum likelihood and other estimation procedures that do not take these losses into consideration. The majority of losses will be monotonically nondecreasing in nature, that is, the further apart the estimate θ* and the true value θ, the larger our losses are likely to be. Typical forms of the loss function are the absolute deviation |θ* – θ|, the square deviation (θ* − θ)2, and the jump, that is, no loss if |θ* − θ| < i, and a big loss otherwise. Or the loss function may resemble the square deviation but take the form of a step function increasing in discrete increments. Desirable estimators are impartial, consistent, efficient, robust, and minimum loss. […] Interval estimates are to be preferred to point estimates; they are less open to challenge for they convey information about the estimate’s precision.”
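
A minimal numerical sketch (mine, not from the book) of how the choice of loss function changes which estimate of central location is ‘best’; the exponential sample is an arbitrary illustrative choice:

  import numpy as np

  rng = np.random.default_rng(2)
  sample = rng.exponential(scale=1.0, size=1000)   # an asymmetric population

  candidates = np.linspace(0.0, 3.0, 601)
  squared_loss  = [np.mean((sample - c) ** 2) for c in candidates]
  absolute_loss = [np.mean(np.abs(sample - c)) for c in candidates]

  print("minimizer of squared loss :", candidates[np.argmin(squared_loss)],
        "(sample mean  =", round(float(sample.mean()), 3), ")")
  print("minimizer of absolute loss:", candidates[np.argmin(absolute_loss)],
        "(sample median =", round(float(np.median(sample)), 3), ")")

The minimizer of average squared loss sits at the sample mean while the minimizer of average absolute loss sits at the sample median, so the two loss functions recommend different estimators for the very same data.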

“Estimators should be consistent, that is, the larger the sample, the greater the probability the resultant estimate will be close to the true population value. […] [A] consistent estimator […] is to be preferred to another if the first consistent estimator can provide the same degree of accuracy with fewer observations. To simplify comparisons, most statisticians focus on the asymptotic relative efficiency (ARE), defined as the limit with increasing sample size of the ratio of the number of observations required for each of two consistent statistical procedures to achieve the same degree of accuracy. […] Estimators that are perfectly satisfactory for use with symmetric, normally distributed populations may not be as desirable when the data come from nonsymmetric or heavy-tailed populations, or when there is a substantial risk of contamination with extreme values. When estimating measures of central location, one way to create a more robust estimator is to trim the sample of its minimum and maximum values […]. As information is thrown away, trimmed estimators are [however] less efficient. […] Many semiparametric estimators are not only robust but provide for high ARE with respect to their parametric counterparts. […] The accuracy of an estimate […] and the associated losses will vary from sample to sample. A minimum loss estimator is one that minimizes the losses when the losses are averaged over the set of all possible samples. Thus, its form depends upon all of the following: the loss function, the population from which the sample is drawn, and the population characteristic that is being estimated. An estimate that is optimal in one situation may only exacerbate losses in another. […] It is easy to envision situations in which we are less concerned with the average loss than with the maximum possible loss we may incur by using a particular estimation procedure. An estimate that minimizes the maximum possible loss is termed a mini–max estimator.”
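
A minimal sketch (mine) of the robustness trade-off described above: a trimmed mean gives up some efficiency on clean Gaussian data but is far less disturbed by a couple of gross errors; the numbers are made up:

  import numpy as np
  from scipy import stats

  rng = np.random.default_rng(3)
  clean = rng.normal(loc=10.0, scale=1.0, size=98)
  contaminated = np.concatenate([clean, [60.0, 75.0]])   # two gross errors

  print("ordinary mean   :", round(float(contaminated.mean()), 2))
  print("10% trimmed mean:", round(float(stats.trim_mean(contaminated, 0.10)), 2))
  print("median          :", round(float(np.median(contaminated)), 2))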

“In survival studies and reliability analyses, we follow each subject and/or experiment unit until either some event occurs or the experiment is terminated; the latter observation is referred to as censored. The principal sources of error are the following:

  • Lack of independence within a sample
  • Lack of independence of censoring
  • Too many censored values
  • Wrong test employed”

“Lack of independence within a sample is often caused by the existence of an implicit factor in the data. For example, if we are measuring survival times for cancer patients, diet may be correlated with survival times. If we do not collect data on the implicit factor(s) (diet in this case), and the implicit factor has an effect on survival times, then we no longer have a sample from a single population. Rather, we have a sample that is a mixture drawn from several populations, one for each level of the implicit factor, each with a different survival distribution. Implicit factors can also affect censoring times, by affecting the probability that a subject will be withdrawn from the study or lost to follow-up. […] Stratification can be used to control for an implicit factor. […] This is similar to using blocking in analysis of variance. […] If the pattern of censoring is not independent of the survival times, then survival estimates may be too high (if subjects who are more ill tend to be withdrawn from the study), or too low (if subjects who will survive longer tend to drop out of the study and are lost to follow-up). If a loss or withdrawal of one subject could increase the probability of loss or withdrawal of other subjects, this would also lead to lack of independence between censoring and the subjects. […] A study may end up with many censored values as a result of having large numbers of subjects withdrawn or lost to follow-up, or from having the study end while many subjects are still alive. Large numbers of censored values decrease the equivalent number of subjects exposed (at risk) at later times, reducing the effective sample sizes. […] Survival tests perform better when the censoring is not too heavy, and, in particular, when the pattern of censoring is similar across the different groups.”

“Kaplan–Meier survival analysis (KMSA) is the appropriate starting point [in the type 2 censoring setting]. KMSA can estimate survival functions even in the presence of censored cases and requires minimal assumptions. If covariates other than time are thought to be important in determining duration to outcome, results reported by KMSA will represent misleading averages, obscuring important differences in groups formed by the covariates (e.g., men vs. women). Since this is often the case, methods that incorporate covariates, such as event-history models and Cox regression, may be preferred. For small samples, the permutation distributions of the Gehan–Breslow, Mantel–Cox, and Tarone–Ware survival test statistics and not the chi-square distribution should be used to compute p-values. If the hazard or survival functions are not parallel, then none of the three tests […] will be particularly good at detecting differences between the survival functions.”
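
Since KMSA is named as the appropriate starting point, here is a bare-bones sketch (mine, not from the book, and far less capable than a proper survival package) of the Kaplan–Meier product-limit estimate in the presence of right-censored observations; the toy data are invented:

  import numpy as np

  def kaplan_meier(times, event_observed):
      """Product-limit estimate: returns (event times, estimated survival)."""
      times = np.asarray(times, dtype=float)
      events = np.asarray(event_observed, dtype=bool)
      order = np.argsort(times)
      times, events = times[order], events[order]

      surv = 1.0
      n_at_risk = len(times)
      out_times, out_surv = [], []
      for t in np.unique(times):
          here = times == t
          d = int(np.sum(events & here))        # events observed at time t
          if d > 0:
              surv *= 1.0 - d / n_at_risk       # multiply in this step's survival factor
              out_times.append(t)
              out_surv.append(surv)
          n_at_risk -= int(np.sum(here))        # events and censorings both leave the risk set
      return np.array(out_times), np.array(out_surv)

  # toy data: event_observed = 1 means the event was seen, 0 means censored
  t  = [2, 3, 3, 5, 6, 8, 8, 12]
  ev = [1, 1, 0, 1, 0, 1, 0, 0]
  print(kaplan_meier(t, ev))

Censored observations leave the risk set without contributing an event, which is how the estimator uses the partial information they carry.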

November 1, 2017 Posted by | Books, Statistics | Leave a comment

A few diabetes papers of interest

i. The Pharmacogenetics of Type 2 Diabetes: A Systematic Review.

“We performed a systematic review to identify which genetic variants predict response to diabetes medications.

RESEARCH DESIGN AND METHODS We performed a search of electronic databases (PubMed, EMBASE, and Cochrane Database) and a manual search to identify original, longitudinal studies of the effect of diabetes medications on incident diabetes, HbA1c, fasting glucose, and postprandial glucose in prediabetes or type 2 diabetes by genetic variation.

RESULTS Of 7,279 citations, we included 34 articles (N = 10,407) evaluating metformin (n = 14), sulfonylureas (n = 4), repaglinide (n = 8), pioglitazone (n = 3), rosiglitazone (n = 4), and acarbose (n = 4). […] Significant medication–gene interactions for glycemic outcomes included 1) metformin and the SLC22A1, SLC22A2, SLC47A1, PRKAB2, PRKAA2, PRKAA1, and STK11 loci; 2) sulfonylureas and the CYP2C9 and TCF7L2 loci; 3) repaglinide and the KCNJ11, SLC30A8, NEUROD1/BETA2, UCP2, and PAX4 loci; 4) pioglitazone and the PPARG2 and PTPRD loci; 5) rosiglitazone and the KCNQ1 and RBP4 loci; and 6) acarbose and the PPARA, HNF4A, LIPC, and PPARGC1A loci. Data were insufficient for meta-analysis.

CONCLUSIONS We found evidence of pharmacogenetic interactions for metformin, sulfonylureas, repaglinide, thiazolidinediones, and acarbose consistent with their pharmacokinetics and pharmacodynamics.”

“In this systematic review, we identified 34 articles on the pharmacogenetics of diabetes medications, with several reporting statistically significant interactions between genetic variants and medications for glycemic outcomes. Most pharmacogenetic interactions were only evaluated in a single study, did not use a control group, and/or did not report enough information to judge internal validity. However, our results do suggest specific, biologically plausible, gene–medication interactions, and we recommend confirmation of the biologically plausible interactions as a priority, including those for drug transporters, metabolizers, and targets of action. […] Given the number of comparisons reported in the included studies and the lack of accounting for multiple comparisons in approximately 53% of studies, many of the reported findings may [however] be false positives.”

ii. Insights Offered by Economic Analyses.

“This issue of Diabetes Care includes three economic analyses. The first describes the incremental costs of diabetes over a lifetime and highlights how interventions to prevent diabetes may reduce lifetime costs (1). The second demonstrates that although an expensive, intensive lifestyle intervention for type 2 diabetes does not reduce adverse cardiovascular outcomes over 10 years, it significantly reduces the costs of non-intervention-related medical care (2). The third demonstrates that although the use of the International Association of the Diabetes and Pregnancy Study Groups (IADPSG) criteria for the screening and diagnosis of gestational diabetes mellitus (GDM) results in a threefold increase in the number of people labeled as having GDM, it reduces the risk of maternal and neonatal adverse health outcomes and reduces costs (3). The first report highlights the enormous potential value of intervening in adults at high risk for type 2 diabetes to prevent its development. The second illustrates the importance of measuring economic outcomes in addition to standard clinical outcomes to fully assess the value of new treatments. The third demonstrates the importance of rigorously weighing the costs of screening and treatment against the costs of health outcomes when evaluating new approaches to care.”

“The costs of diabetes monitoring and treatment accrue as a function of the duration of diabetes, so adults who are younger at diagnosis are more likely to survive to develop the late, expensive complications of diabetes, thus they incur higher lifetime costs attributable to diabetes. Zhuo et al. report that people with diabetes diagnosed at age 40 spend approximately $125,000 more for medical care over their lifetimes than people without diabetes. For people diagnosed with diabetes at age 50, the discounted lifetime excess medical spending is approximately $91,000; for those diagnosed at age 60, it is approximately $54,000; and for those diagnosed at age 65, it is approximately $36,000 (1).

These results are very consistent with results reported by the Diabetes Prevention Program (DPP) Research Group, which assessed the cost-effectiveness of diabetes prevention. […] In the simulated lifetime economic analysis [included in that study] the lifestyle intervention was more cost-effective in younger participants than in older participants (5). By delaying the onset of type 2 diabetes, the lifestyle intervention delayed or prevented the need for diabetes monitoring and treatment, surveillance of diabetic microvascular and neuropathic complications, and treatment of the late, expensive complications and comorbidities of diabetes, including end-stage renal disease and cardiovascular disease (5). Although this finding was controversial at the end of the randomized, controlled clinical trial, all but 1 of 12 economic analyses published by 10 research groups in nine countries have demonstrated that lifestyle intervention for the prevention of type 2 diabetes is very cost-effective, if not cost-saving, compared with a placebo intervention (6).

Empiric, within-trial economic analyses of the DPP have now demonstrated that the incremental costs of the lifestyle intervention are almost entirely offset by reductions in the costs of medical care outside the study, especially the cost of self-monitoring supplies, prescription medications, and outpatient and inpatient care (7). Over 10 years, the DPP intensive lifestyle intervention cost only ∼$13,000 per quality-adjusted life-year gained when the analysis used an intent-to-treat approach (7) and was even more cost-effective when the analysis assessed outcomes and costs among adherent participants (8).”

“The American Diabetes Association has reported that although institutional care (hospital, nursing home, and hospice care) still accounts for 52% of annual per capita health care expenditures for people with diabetes, outpatient medications and supplies now account for 30% of expenditures (9). Between 2007 and 2012, annual per capita expenditures for inpatient care increased by 2%, while expenditures for medications and supplies increased by 51% (9). As the costs of diabetes medications and supplies continue to increase, it will be even more important to consider cost savings arising from the less frequent use of medications when evaluating the benefits of nonpharmacologic interventions.”

iii. The Lifetime Cost of Diabetes and Its Implications for Diabetes Prevention. (This is the Zhuo et al. paper mentioned above.)

“We aggregated annual medical expenditures from the age of diabetes diagnosis to death to determine lifetime medical expenditure. Annual medical expenditures were estimated by sex, age at diagnosis, and diabetes duration using data from 2006–2009 Medical Expenditure Panel Surveys, which were linked to data from 2005–2008 National Health Interview Surveys. We combined survival data from published studies with the estimated annual expenditures to calculate lifetime spending. We then compared lifetime spending for people with diabetes with that for those without diabetes. Future spending was discounted at 3% annually. […] The discounted excess lifetime medical spending for people with diabetes was $124,600 ($211,400 if not discounted), $91,200 ($135,600), $53,800 ($70,200), and $35,900 ($43,900) when diagnosed with diabetes at ages 40, 50, 60, and 65 years, respectively. Younger age at diagnosis and female sex were associated with higher levels of lifetime excess medical spending attributed to diabetes.

CONCLUSIONS Having diabetes is associated with substantially higher lifetime medical expenditures despite being associated with reduced life expectancy. If prevention costs can be kept sufficiently low, diabetes prevention may lead to a reduction in long-term medical costs.”
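
A minimal sketch (mine, not the authors’ actual model) of the discounting arithmetic described in the methods: annual excess costs are weighted by the probability of still being alive in a given year and discounted back to the year of diagnosis at 3% per year. The cost and survival numbers below are placeholders, not estimates from the paper:

  def discounted_lifetime_cost(annual_excess_costs, survival_probs, rate=0.03):
      """Survival-weighted annual excess costs, discounted back to diagnosis."""
      total = 0.0
      for year, (cost, p_alive) in enumerate(zip(annual_excess_costs, survival_probs)):
          total += p_alive * cost / (1.0 + rate) ** year
      return total

  # placeholder inputs, NOT estimates from the paper:
  costs = [3700.0] * 40                                   # flat $3,700 excess per year
  surv  = [max(0.0, 1.0 - 0.02 * y) for y in range(40)]   # toy, linearly declining survival
  print(round(discounted_lifetime_cost(costs, surv)))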

The selection criteria employed in this paper are not perfect; they excluded all individuals below the age of 30 “because they likely had type 1 diabetes”, which although true is only ‘mostly true’. Some of those individuals had(/have) type 2, but if you’re evaluating prevention schemes it probably makes sense to err on the side of caution (better to miss some type 2 patients than to include some type 1s), assuming the timing of the intervention is not too important. This gets more complicated if prevention schemes are more likely to have large and persistent effects in young people – however, I don’t think that’s the case; as a counterpoint, drug adherence studies often seem to find that young people aren’t particularly motivated to adhere to their treatment schedules compared to their older counterparts (who might have more advanced disease and so are more likely to achieve symptomatic relief by adhering to treatments).

A few more observations from the paper:

“The prevalence of participants with diabetes in the study population was 7.4%, of whom 54% were diagnosed between the ages of 45 and 64 years. The mean age at diagnosis was 55 years, and the mean length of time since diagnosis was 9.4 years (39% of participants with diabetes had been diagnosed for ≤5 years, 32% for 6–15 years, and 27% for ≥16 years). […] The observed annual medical spending for people with diabetes was $13,966—more than twice that for people without diabetes.”

“Regardless of diabetes status, the survival-adjusted annual medical spending decreased after age 60 years, primarily because of a decreasing probability of survival. Because the probability of survival decreased more rapidly in people with diabetes than in those without, corresponding spending declined as people died and no longer accrued medical costs. For example, among men diagnosed with diabetes at age 40 years, 34% were expected to survive to age 80 years; among men of the same age who never developed diabetes, 55% were expected to survive to age 80 years. The expected annual expenditure for a person diagnosed with diabetes at age 40 years declined from $8,500 per year at age 40 years to $3,400 at age 80 years, whereas the expenses for a comparable person without diabetes declined from $3,900 to $3,200 over that same interval. […] People diagnosed with diabetes at age 40 years lived with the disease for an average of 34 years after diagnosis. Those diagnosed when older lived fewer years and, therefore, lost fewer years of life. […] The annual excess medical spending attributed to diabetes […] was smaller among people who were diagnosed at older ages. For men diagnosed at age 40 years, annual medical spending was $3,700 higher than that of similar men without diabetes; spending was $2,900 higher for those diagnosed at age 50 years; $2,200 higher for those diagnosed at age 60 years; and $2,000 higher for those diagnosed at age 65 years. Among women diagnosed with diabetes, the excess annual medical spending was consistently higher than for men of the same age at diagnosis.”

“Regardless of age at diagnosis, people with diabetes spent considerably more on health care after age 65 years than their nondiabetic counterparts. Health care spending attributed to diabetes after age 65 years ranged from $23,900 to $40,900, depending on sex and age at diagnosis. […] Of the total excess lifetime medical spending among an average diabetic patient diagnosed at age 50 years, prescription medications and inpatient care accounted for 44% and 35% of costs, respectively. Outpatient care and other medical care accounted for 17% and 4% of costs, respectively.”

“Our findings differed from those of studies of the lifetime costs of other chronic conditions. For instance, smokers have a lower average lifetime medical cost than nonsmokers (29) because of their shorter life spans. Smokers have a life expectancy about 10 years less than those who do not smoke (30); life expectancy is 16 years less for those who develop smoking-induced cancers (31). As a result, smoking cessation leads to increased lifetime spending (32). Studies of the lifetime costs for an obese person relative to a person with normal body weight show mixed results: estimated excess lifetime medical costs for people with obesity range from $3,790 less to $39,000 more than costs for those who are nonobese (33,34). […] obesity, when considered alone, results in much lower annual excess medical costs than diabetes (–$940 to $1,150 for obesity vs. $2,000 to $4,700 for diabetes) when compared with costs for people who are nonobese (33,34).”

iv. Severe Hypoglycemia and Mortality After Cardiovascular Events for Type 1 Diabetic Patients in Sweden.

“This study examines factors associated with all-cause mortality after cardiovascular complications (myocardial infarction [MI] and stroke) in patients with type 1 diabetes. In particular, we aim to determine whether a previous history of severe hypoglycemia is associated with increased mortality after a cardiovascular event in type 1 diabetic patients.

Hypoglycemia is the most common and dangerous acute complication of type 1 diabetes and can be life threatening if not promptly treated (1). The average individual with type 1 diabetes experiences about two episodes of symptomatic hypoglycemia per week, with an annual prevalence of 30–40% for hypoglycemic episodes requiring assistance for recovery (2). We define severe hypoglycemia to be an episode of hypoglycemia that requires hospitalization in this study. […] Patients with type 1 diabetes are more susceptible to hypoglycemia than those with type 2 diabetes, and therefore it is potentially of greater relevance if severe hypoglycemia is associated with mortality (6).”

“This study uses a large linked data set comprising health records from the Swedish National Diabetes Register (NDR), which were linked to administrative records on hospitalization, prescriptions, and national death records. […] [The] study is based on data from four sources: 1) risk factor data from the Swedish NDR […], 2) hospital records of inpatient episodes from the National Inpatients Register (IPR) […], 3) death records […], and 4) prescription data records […]. A study comparing registered diagnoses in the IPR with information in medical records found positive predictive values of IPR diagnoses were 85–95% for most diagnoses (8). In terms of NDR coverage, a recent study found that 91% of those aged 18–34 years and with type 1 diabetes in the Prescribed Drug Register could be matched with those in the NDR for 2007–2009 (9).”

“The outcome of the study was all-cause mortality after a major cardiovascular complication (MI or stroke). Our sample for analysis included patients with type 1 diabetes who visited a clinic after 2002 and experienced a major cardiovascular complication after this clinic visit. […] We define type 1 diabetes as diabetes diagnosed under the age of 30 years, being reported as being treated with insulin only at some clinic visit, and when alive, having had at least one prescription for insulin filled per year between 2006 and 2010 […], and not having filled a prescription for metformin at any point between July 2005 and December 2010 (under the assumption that metformin users were more likely to be type 2 diabetes patients).”

“Explanatory variables included in both models were type of complication (MI or stroke), age at complication, duration of diabetes, sex, smoking status, HbA1c, BMI, systolic blood pressure, diastolic blood pressure, chronic kidney disease status based on estimated glomerular filtration rate, microalbuminuria and macroalbuminuria status, HDL, LDL, total–to–HDL cholesterol ratio, triglycerides, lipid medication status, clinic visits within the year prior to the CVD event, and prior hospitalization events: hypoglycemia, hyperglycemia, MI, stroke, heart failure, AF, amputation, PVD, ESRD, IHD/unstable angina, PCI, and CABG. The last known value for each clinical risk factor, prior to the cardiovascular complication, was used for analysis. […] Initially, all explanatory variables were included and excluded if the variable was not statistically significant at a 5% level (P < 0.05) via stepwise backward elimination.” [Aaaaaaargh! – US. These guys are doing a lot of things right, but this is not one of them. Just to mention this one more time: “Generally, hypothesis testing is a very poor basis for model selection […] There is no statistical theory that supports the notion that hypothesis testing with a fixed α level is a basis for model selection.” (Burnham & Anderson)]
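
For readers unfamiliar with the procedure being criticized in the bracketed comment above, the sketch below shows what p-value-driven backward elimination amounts to in practice. Simulated data, arbitrary variable names and a plain linear model are used purely for illustration – this is the generic procedure, not the authors' Cox/logistic code – and the Burnham & Anderson point stands: the fixed α = 0.05 cutoff has no model-selection justification, which is why information criteria such as AIC are the usual alternative.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, p = 500, 10
X = rng.normal(size=(n, p))
y = 1.5 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(size=n)  # only two 'real' predictors

def backward_eliminate(X, y, alpha=0.05):
    """Drop the least significant predictor until all remaining p-values < alpha."""
    cols = list(range(X.shape[1]))
    while cols:
        fit = sm.OLS(y, sm.add_constant(X[:, cols])).fit()
        pvals = fit.pvalues[1:]                 # skip the intercept
        worst = int(pvals.argmax())
        if pvals[worst] < alpha:
            return cols, fit
        cols.pop(worst)                         # eliminate and refit
    return cols, None

kept, fit = backward_eliminate(X, y)
print("retained predictors:", kept, "| AIC of final model:", round(fit.aic, 1))
```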

“Patients who had prior hypoglycemic events had an estimated HR for mortality of 1.79 (95% CI 1.37–2.35) in the first 28 days after a CVD event and an estimated HR of 1.25 (95% CI 1.02–1.53) of mortality after 28 days post CVD event in the backward regression model. The univariate analysis showed a similar result compared with the backward regression model, with prior hypoglycemic events having an estimated HR for mortality of 1.79 (95% CI 1.38–2.32) and 1.35 (95% CI 1.11–1.65) in the logistic and Cox regressions, respectively. Even when all explanatory factors were included in the models […], the mortality increase associated with a prior severe hypoglycemic event was still significant, and the P values and SE are similar when compared with the backward stepwise regression. Similarly, when explanatory factors were included individually, the mortality increase associated with a prior severe hypoglycemic event was also still significant.” [Again, this sort of testing scheme is probably not a good approach to getting at a good explanatory model, but it’s what they did – US]
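
As an aside on reading these intervals: a Cox (or logistic) estimate with a 95% CI is just the exponentiated coefficient with roughly ±1.96 standard errors around it on the log scale, so the implied standard error can be backed out of the published interval. A quick back-calculation of my own (not a number reported in the paper):

```python
import math

hr, ci_low, ci_high = 1.79, 1.37, 2.35                      # figures quoted above
beta = math.log(hr)                                          # log-hazard estimate
se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)     # implied standard error
print(round(beta, 2), round(se, 2))                          # ~0.58 and ~0.14
print(round(math.exp(beta - 1.96 * se), 2),
      round(math.exp(beta + 1.96 * se), 2))                  # ~1.37 and ~2.34, i.e. the quoted interval up to rounding
```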

“The 5-year cumulative estimated mortality risk for those without complications after MI and stroke were 40.1% (95% CI 35.2–45.1) and 30.4% (95% CI 26.3–34.6), respectively. Patients with prior heart failure were at the highest estimated 5-year cumulative mortality risk, with those who suffered an MI and stroke having a 56.0% (95% CI 47.5–64.5) and 44.0% (95% CI 35.8–52.2) 5-year cumulative mortality risk, respectively. Patients who had a prior severe hypoglycemic event and suffered an MI had an estimated 5-year cumulative mortality risk at age 60 years of 52.4% (95% CI 45.3–59.5), and those who suffered a stroke had a 5-year cumulative mortality risk of 39.8% (95% CI 33.4–46.3). Patients at age 60 years who suffer a major CVD complication have over twofold risk of 5-year mortality compared with the general type 1 diabetic Swedish population, who had an estimated 5-year mortality risk of 13.8% (95% CI 12.0–16.1).”

“We found evidence that prior severe hypoglycemia is associated with reduced survival after a major CVD event but no evidence that prior severe hypoglycemia is associated with an increased risk of a subsequent CVD event.

Compared with the general type 1 diabetic Swedish population, a major CVD complication increased 5-year mortality risk at age 60 years by >25% and 15% in patients with an MI and stroke, respectively. Patients with a history of a hypoglycemic event had an even higher mortality after a major CVD event, with approximately an additional 10% being dead at the 5-year mark. This risk was comparable with that in those with late-stage kidney disease. This information is useful in determining the prognosis of patients after a major cardiovascular event and highlights the need to include this as a risk factor in simulation models (18) that are used to improve decision making (19).”

“This is the first study that has found some evidence of a dose-response relationship, where patients who experienced two or more severe hypoglycemic events had higher mortality after a cardiovascular event compared with those who experienced one severe hypoglycemic event. A lack of statistical power prevented us from investigating this further when we tried to stratify by number of prior severe hypoglycemic events in our regression models. There was no evidence of a dose-response relationship between repeated episodes of severe hypoglycemia and vascular outcomes or death in previous type 2 diabetes studies (5).”

v. Alterations in White Matter Structure in Young Children With Type 1 Diabetes.

“Careful regulation of insulin dosing, dietary intake, and activity levels are essential for optimal glycemic control in individuals with type 1 diabetes. However, even with optimal treatment many children with type 1 diabetes have blood glucose levels in the hyperglycemic range for more than half the day and in the hypoglycemic range for an hour or more each day (1). Brain cells may be especially sensitive to aberrant blood glucose levels, as glucose is the brain’s principal substrate for its energy needs.

Research in animal models has shown that white matter (WM) may be especially sensitive to dysglycemia-associated insult in diabetes (2–4). […] Early childhood is a period of rapid myelination and brain development (6) and of increased sensitivity to insults affecting the brain (6,7). Hence, study of the developing brain is particularly important in type 1 diabetes.”

“WM structure can be measured with diffusion tensor imaging (DTI), a method based on magnetic resonance imaging (MRI) that uses the movement of water molecules to characterize WM brain structure (8,9). Results are commonly reported in terms of mathematical scalars (representing vectors in vector space) such as fractional anisotropy (FA), axial diffusivity (AD), and radial diffusivity (RD). FA reflects the degree of diffusion anisotropy of water (how diffusion varies along the three axes) within a voxel (three-dimensional pixel) and is determined by fiber diameter and density, myelination, and intravoxel fiber-tract coherence (increases in which would increase FA), as well as extracellular diffusion and interaxonal spacing (increases in which would decrease FA) (10). AD, a measure of water diffusivity along the main axis of diffusion within a voxel, is thought to reflect fiber coherence and structure of axonal membranes (increases in which would increase AD), as well as microtubules, neurofilaments, and axonal branching (increases in which would decrease AD) (11,12). RD, the mean of the diffusivities perpendicular to the vector with the largest eigenvalue, is thought to represent degree of myelination (13,14) (more myelin would decrease RD values) and axonal “leakiness” (which would increase RD). Often, however, a combination of these WM characteristics results in opposing contributions to the final observed FA/AD/RD value, and thus DTI scalars should not be interpreted globally as “good” or “bad” (15). Rather, these scalars can show between-group differences and relationships between WM structure and clinical variables and are suggestive of underlying histology. Definitive conclusions about histology of WM can only be derived from direct microscopic examination of biological tissue.”
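
The three scalars described here are all simple functions of the eigenvalues of the diffusion tensor fitted in each voxel; the sketch below uses the standard textbook definitions (not code from the study):

```python
import numpy as np

def dti_scalars(l1, l2, l3):
    """Return (FA, AD, RD) for eigenvalues sorted so that l1 >= l2 >= l3."""
    md = (l1 + l2 + l3) / 3.0                      # mean diffusivity
    ad = l1                                        # axial diffusivity: main axis
    rd = (l2 + l3) / 2.0                           # radial diffusivity: perpendicular axes
    fa = np.sqrt(1.5 * ((l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2)
                 / (l1 ** 2 + l2 ** 2 + l3 ** 2))  # fractional anisotropy, in [0, 1]
    return fa, ad, rd

# Illustrative eigenvalues (in mm^2/s): a coherent, heavily myelinated tract
# versus a more isotropic voxel.
print(dti_scalars(1.7e-3, 0.4e-3, 0.3e-3))   # high FA (~0.76)
print(dti_scalars(0.9e-3, 0.8e-3, 0.7e-3))   # low FA (~0.12)
```

As the quoted caveat emphasizes, quite different combinations of tissue changes can push these three numbers in the same direction, so the scalars constrain the histology without determining it.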

“Children (ages 4 to <10 years) with type 1 diabetes (n = 127) and age-matched nondiabetic control subjects (n = 67) had diffusion weighted magnetic resonance imaging scans in this multisite neuroimaging study. Participants with type 1 diabetes were assessed for HbA1c history and lifetime adverse events, and glucose levels were monitored using a continuous glucose monitor (CGM) device and standardized measures of cognition.

RESULTS Between-group analysis showed that children with type 1 diabetes had significantly reduced axial diffusivity (AD) in widespread brain regions compared with control subjects. Within the type 1 diabetes group, earlier onset of diabetes was associated with increased radial diffusivity (RD) and longer duration was associated with reduced AD, reduced RD, and increased fractional anisotropy (FA). In addition, HbA1c values were significantly negatively associated with FA values and were positively associated with RD values in widespread brain regions. Significant associations of AD, RD, and FA were found for CGM measures of hyperglycemia and glucose variability but not for hypoglycemia. Finally, we observed a significant association between WM structure and cognitive ability in children with type 1 diabetes but not in control subjects. […] These results suggest vulnerability of the developing brain in young children to effects of type 1 diabetes associated with chronic hyperglycemia and glucose variability.”

“The profile of reduced overall AD in type 1 diabetes observed here suggests possible axonal damage associated with diabetes (30). Reduced AD was associated with duration of type 1 diabetes suggesting that longer exposure to diabetes worsens the insult to WM structure. However, measures of hyperglycemia and glucose variability were either not associated or were positively associated with AD values, suggesting that these measures did not contribute to the observed decreased AD in the type 1 diabetes group. A possible explanation for these observations is that several biological processes influence WM structure in type 1 diabetes. Some processes may be related to insulin insufficiency or C-peptide levels independent of glucose levels (31,32) and may affect WM coherence (and reduce AD values as observed in the between-group results). Other processes related to hyperglycemia and glucose variability may target myelin (resulting in reduced FA and increased RD) as well as reduced axonal branching (both would result in increased AD values). Alternatively, these seemingly conflicting AD observations may be due to a dominant effect of age, which could overshadow effects from dysglycemia.

Early age of onset is one of the most replicable risk factors for cognitive impairments in type 1 diabetes (33,34). It has been hypothesized that young children are especially vulnerable to brain insults resulting from episodes of chronic hyperglycemia, hypoglycemia, and acute hypoglycemic complications of type 1 diabetes (seizures and severe hypoglycemic episodes). In addition, fear of hypoglycemia often results in caregivers maintaining relatively higher blood glucose to avoid lows altogether (1), especially in very young children. However, our study suggests that this approach of aggressive hypoglycemia avoidance resulting in hyperglycemia may not be optimal and may be detrimental to WM structure in young children.

Neuronal damage (reflected in altered WM structure) may affect neuronal signal transfer and, thus, cognition (35). Cognitive domains commonly reported to be affected in children with type 1 diabetes include general intellectual ability, visuospatial abilities, attention, memory, processing speed, and executive function (36–38). In our sample, even though the duration of illness was relatively short (2.9 years on average), there were modest but significant cognitive differences between children with type 1 diabetes and control subjects (24).”

“In summary, we present results from the largest study to date investigating WM structure in very young children with type 1 diabetes. We observed significant and widespread brain differences in the WM microstructure of children with type 1 diabetes compared with nondiabetic control subjects and significant associations between WM structure and measures of hyperglycemia, glucose variability, and cognitive ability in the type 1 diabetic population.”

vi. Ultrasound Findings After Surgical Decompression of the Tarsal Tunnel in Patients With Painful Diabetic Polyneuropathy: A Prospective Randomized Study.

“Polyneuropathy is a common complication in diabetes. The prevalence of neuropathy in patients with diabetes is ∼30%. During the course of the disease, up to 50% of the patients will eventually develop neuropathy (1). Its clinical features are characterized by numbness, tingling, or burning sensations and typically extend in a distinct stocking and glove pattern. Prevention plays a key role since poor glucose control is a major risk factor in the development of diabetic polyneuropathy (DPN) (1,2).

There is no clear definition for the onset of painful diabetic neuropathy. Different hypotheses have been formulated.

Hyperglycemia in diabetes can lead to osmotic swelling of the nerves, related to increased glucose conversion into sorbitol by the enzyme aldose reductase (2,3). High sorbitol concentrations might also directly cause axonal degeneration and demyelination (2). Furthermore, stiffening and thickening of ligamental structures and the plantar fascia make underlying structures more prone to biomechanical compression (4–6). A thicker and stiffer retinaculum might restrict movements and lead to alterations of the nerve in the tarsal tunnel.

Both swelling of the nerve and changes in the tarsal tunnel might lead to nerve damage through compression.

Furthermore, vascular changes may diminish endoneural blood flow and oxygen distribution. Decreased blood supply in the (compressed) nerve might lead to ischemic damage as well as impaired nerve regeneration.

“Several studies suggest that surgical decompression of nerves at narrow anatomic sites, e.g., the tarsal tunnel, is beneficial and has a positive effect on pain, sensitivity, balance, long-term risk of ulcers and amputations, and quality of life (3,7–10). Since the effect of decompression of the tibial nerve in patients with DPN has not been proven with a randomized clinical trial, its contribution as treatment for patients with painful DPN is still controversial. […] In this study, we compare the mean CSA [cross-sectional area, US] and any changes in shape of the tibial nerve before and after decompression of the tarsal tunnel using ultrasound in order to test the hypothesis that the tarsal tunnel leads to compression of the tibial nerve in patients with DPN.”

“This study, with a large sample size and standardized sonographic imaging procedure with a good reliability, is the first randomized controlled trial that evaluates the effect of decompression of the tibial nerve on the CSA. Although no effect on CSA after surgery was found, this study using ultrasound demonstrates a larger and swollen tibial nerve and thicker flexor retinaculum at the ankle in patients with DPN compared with healthy control subjects.”

I would have been interested to know whether there were any observable changes in symptom-relief measures post-surgery, even if such variables are less ‘objective’ than measures like CSA (less objective, but perhaps more relevant to the patient…), but the authors did not look at those kinds of variables.

vii. Nonalcoholic Fatty Liver Disease Is Independently Associated With an Increased Incidence of Chronic Kidney Disease in Patients With Type 1 Diabetes.

“Nonalcoholic fatty liver disease (NAFLD) has reached epidemic proportions worldwide (1). Up to 30% of adults in the U.S. and Europe have NAFLD, and the prevalence of this disease is much higher in people with diabetes (1,2). Indeed, the prevalence of NAFLD on ultrasonography ranges from ∼50 to 70% in patients with type 2 diabetes (3–5) and ∼40 to 50% in patients with type 1 diabetes (6,7). Notably, patients with diabetes and NAFLD are also more likely to develop more advanced forms of NAFLD that may result in end-stage liver disease (8). However, accumulating evidence indicates that NAFLD is associated not only with liver-related morbidity and mortality but also with an increased risk of developing cardiovascular disease (CVD) and other serious extrahepatic complications (8–10).”

“Increasing evidence indicates that NAFLD is strongly associated with an increased risk of CKD [chronic kidney disease, US] in people with and without diabetes (11). Indeed, we have previously shown that NAFLD is associated with an increased prevalence of CKD in patients with both type 1 and type 2 diabetes (15–17), and that NAFLD independently predicts the development of incident CKD in patients with type 2 diabetes (18). However, many of the risk factors for CKD are different in patients with type 1 and type 2 diabetes, and to date, it is uncertain whether NAFLD is an independent risk factor for incident CKD in type 1 diabetes or whether measurement of NAFLD improves risk prediction for CKD, taking account of traditional risk factors for CKD.

Therefore, the aim of the current study was to investigate 1) whether NAFLD is associated with an increased incidence of CKD and 2) whether measurement of NAFLD improves risk prediction for CKD, adjusting for traditional risk factors, in type 1 diabetic patients.”

“Using a retrospective, longitudinal cohort study design, we have initially identified from our electronic database all Caucasian type 1 diabetic outpatients with preserved kidney function (i.e., estimated glomerular filtration rate [eGFR] ≥60 mL/min/1.73 m2) and with no macroalbuminuria (n = 563), who regularly attended our adult diabetes clinic between 1999 and 2001. Type 1 diabetes was diagnosed by the typical presentation of disease, the absolute dependence on insulin treatment for survival, the presence of undetectable fasting C-peptide concentrations, and the presence of anti–islet cell autoantibodies. […] Overall, 261 type 1 diabetic outpatients were included in the final analysis and were tested for the development of incident CKD during the follow-up period […] All participants were periodically seen (every 3–6 months) for routine medical examinations of glycemic control and chronic complications of diabetes. No participants were lost to follow-up. […] For this study, the development of incident CKD was defined as occurrence of eGFR <60 mL/min/1.73 m2 and/or macroalbuminuria (21). Both of these outcome measures were confirmed in all participants in at least two consecutive occasions (within 3–6 months after the first examination).”
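
The eGFR values used to define preserved kidney function and incident CKD here come from creatinine-based estimating equations; below is a sketch of the widely used 4-variable MDRD Study equation (the ‘eGFRMDRD’ of the following excerpts). Whether the paper used exactly this calibration of the constants isn't stated in the excerpt, so treat them as illustrative:

```python
def egfr_mdrd(scr_mg_dl, age_years, female, black=False):
    """Estimated GFR (mL/min/1.73 m^2) from serum creatinine (mg/dL), age, sex and race.

    Constants are those of the common IDMS-traceable 4-variable MDRD equation."""
    egfr = 175.0 * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr

print(round(egfr_mdrd(0.9, 45, female=True), 1))   # ~68, i.e. above the 60 cutoff
```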

“At baseline, the mean eGFRMDRD was 92 ± 23 mL/min/1.73 m2 (median 87.9 [IQR 74–104]), or eGFREPI was 98.6 ± 19 mL/min/1.73 m2 (median 99.7 [84–112]). Most patients (n = 234; 89.7%) had normal albuminuria, whereas 27 patients (10.3%) had microalbuminuria. NAFLD was present in 131 patients (50.2%). […] At baseline, patients who developed CKD at follow-up were older, more likely to be female and obese, and had a longer duration of diabetes than those who did not. These patients also had higher values of systolic blood pressure, A1C, triglycerides, serum GGT, and urinary ACR and lower values of eGFRMDRD and eGFREPI. Moreover, there was a higher percentage of patients with hypertension, metabolic syndrome, microalbuminuria, and some degree of diabetic retinopathy in patients who developed CKD at follow-up compared with those remaining free from CKD. The proportion using antihypertensive drugs (that always included the use of ACE inhibitors or angiotensin receptor blockers) was higher in those who progressed to CKD. Notably, […] this patient group also had a substantially higher frequency of NAFLD on ultrasonography.”

“During follow-up (mean duration 5.2 ± 1.7 years, range 2–10), 61 patients developed CKD using the MDRD study equation to estimate eGFR (i.e., ∼4.5% of participants progressed every year to eGFR <60 mL/min/1.73 m2 or macroalbuminuria). Of these, 28 developed an eGFRMDRD <60 mL/min/1.73 m2 with abnormal albuminuria (micro- or macroalbuminuria), 21 developed a reduced eGFRMDRD with normal albuminuria (but 9 of them had some degree of diabetic retinopathy at baseline), and 12 developed macroalbuminuria alone. None of them developed kidney failure requiring chronic dialysis. […] The annual eGFRMDRD decline for the whole cohort was 2.68 ± 3.5 mL/min/1.73 m2 per year. […] NAFLD patients had a greater annual decline in eGFRMDRD than those without NAFLD at baseline (3.28 ± 3.8 vs. 2.10 ± 3.0 mL/min/1.73 m2 per year, P < 0.005). Similarly, the frequency of a renal functional decline (arbitrarily defined as ≥25% loss of baseline eGFRMDRD) was greater among those with NAFLD than among those without the disease (26 vs. 11%, P = 0.005). […] Interestingly, BMI was not significantly associated with CKD.”
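
The ‘∼4.5% of participants progressed every year’ figure is just the crude incidence rate implied by the other numbers in the excerpt – a quick back-of-envelope check:

```python
# 61 incident CKD cases among 261 patients followed for a mean of 5.2 years.
events, patients, mean_followup_years = 61, 261, 5.2
person_years = patients * mean_followup_years
print(round(100 * events / person_years, 1))   # ~4.5 cases per 100 person-years
```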

“Our novel findings indicate that NAFLD is strongly associated with an increased incidence of CKD during a mean follow-up of 5 years and that measurement of NAFLD improves risk prediction for CKD, independently of traditional risk factors (age, sex, diabetes duration, A1C, hypertension, baseline eGFR, and microalbuminuria [i.e., the last two factors being the strongest known risk factors for CKD]), in type 1 diabetic adults. Additionally, although NAFLD was strongly associated with obesity, obesity (or increased BMI) did not explain the association between NAFLD and CKD. […] The annual cumulative incidence rate of CKD in our cohort of patients (i.e., ∼4.5% per year) was essentially comparable to that previously described in other European populations with type 1 diabetes and similar baseline characteristics (∼2.5–9% of patients who progressed every year to CKD) (25,26). In line with previously published information (25–28), we also found that hypertension, microalbuminuria, and lower eGFR at baseline were strong predictors of incident CKD in type 1 diabetic patients.”

“There is a pressing and unmet need to determine whether NAFLD is associated with a higher risk of CKD in people with type 1 diabetes. It has only recently been recognized that NAFLD represents an important burden of disease for type 2 diabetic patients (11,17,18), but the magnitude of the problem of NAFLD and its association with risk of CKD in type 1 diabetes is presently poorly recognized. Although there is clear evidence that NAFLD is closely associated with a higher prevalence of CKD both in those without diabetes (11) and in those with type 1 and type 2 diabetes (15–17), only four prospective studies have examined the association between NAFLD and risk of incident CKD (18,29–31), and only one of these studies was published in patients with type 2 diabetes (18). […] The underlying mechanisms responsible for the observed association between NAFLD and CKD are not well understood. […] The possible clinical implication for these findings is that type 1 diabetic patients with NAFLD may benefit from more intensive surveillance or early treatment interventions to decrease the risk for CKD. Currently, there is no approved treatment for NAFLD. However, NAFLD and CKD share numerous cardiometabolic risk factors, and treatment strategies for NAFLD and CKD should be similar and aimed primarily at modifying the associated cardiometabolic risk factors.”


October 25, 2017 Posted by | Cardiology, Diabetes, Epidemiology, Genetics, Health Economics, Medicine, Nephrology, Neurology, Pharmacology, Statistics, Studies | Leave a comment

Infectious Disease Surveillance (IV)

I have added some more observations from the second half of the book below.

“The surveillance systems for all stages of HIV infection, including stage 3 (AIDS), are the most highly developed, complex, labor-intensive, and expensive of all routine infectious disease surveillance systems. […] Although some behaviorally based prevention interventions (e.g., individual counseling and testing) are relatively inexpensive and simple to implement, others are expensive and difficult to maintain. Consequently, HIV control programs have added more treatment-based methods in recent years. These consist primarily of routine and, in some populations, repeated and frequent testing for HIV with an emphasis on diagnosing every infected person as quickly as possible, linking them to clinical care, prescribing ART, monitoring for retention in care, and maintaining an undetectable viral load. This approach is referred to as “treatment as prevention.” […] Prior to the advent of HAART in the mid-1990s, surveillance consisted primarily of collecting initial HIV diagnosis, followed by monitoring of progression to AIDS and death. The current need to monitor adherence to treatment and care has led to surveillance to collect results of all CD4 count and viral load tests conducted on HIV-infected persons. Treatment guidelines recommend such testing quarterly [11], leading to dozens of laboratory tests being reported for each HIV-infected person in care; hence, the need to receive laboratory results electronically and efficiently has increased. […] The standard set by CDC for completeness is that at least 85% of diagnosed cases are reported to public health within the year of diagnosis. […] As HIV-infected persons live longer as a consequence of ART, the scope of HIV surveillance has expanded […] A critical part of collecting HIV data is maintaining the database.”

“The World Health Organization (WHO) estimates that 8.7 million new cases of TB and 1.4 million deaths from TB occurred in 2011 worldwide [2]. […] WHO estimates that one of every three individuals worldwide is infected with TB [6]. An estimated 5–10% of persons with LTBI [latent TB infection] in the general population will eventually develop active TB disease. Persons with latent infection who are immune suppressed for any reason are more likely to develop active disease. It is estimated that people infected with human immunodeficiency virus (HIV) are 21–34 times more likely to progress from latent to active TB disease […] By 2010, the percentage of all TB cases tested for HIV was 65% and the prevalence of coinfection was 6% [in the United States] [4]. […] From a global perspective, the United States is considered a low morbidity and mortality country for TB. In 2010, the national annual incidence rate for TB was 3.6 per 100,000 persons with 11,182 reported cases of TB  […] In 1953, 113,531 tuberculosis cases were reported in the United States […] Tuberculosis surveillance in the United States has changed a great deal in depth and quality since its inception more than a century ago. […] To assure uniformity and standardization of surveillance data, all TB programs in the United States report verified TB cases via the Report of Verified Case of Tuberculosis (RVCT) [43]. The RVCT collects demographic, diagnostic, clinical, and risk-factor information on incident TB cases […] A companion form, the Follow-up 1 (FU-1), records the date of specimen collection and results of the initial drug susceptibility test at the time of diagnosis for all culture-confirmed TB cases. […]  The Follow-up 2 (FU-2) form collects outcome data on patient treatment and additional clinical and laboratory information. […] Since 1993, the RVCT, FU-1, and FU-2 have been used to collect demographic and clinical information, as well as laboratory results for all reported TB cases in the United States […] The RVCT collects information about known risk factors for TB disease; and in an effort to more effectively monitor TB caused by drug-resistant strains, CDC also gathers information regarding drug susceptibility testing for culture-confirmed cases on the FU-2.”

“Surveillance data may come from widely different systems with different specific purposes. It is essential that the purpose and context of any specific system be understood before attempting to analyze and interpret the surveillance data produced by that system. It is also essential to understand the methodology by which the surveillance system collects data. […] The most fundamental challenge for analysis and interpretation of surveillance data is the identification of a baseline. […] For infections characterized by seasonal outbreaks, the baseline range will vary by season in a generally predictable manner […] The comparison of observations to the baseline range allows characterization of the impact of intentional interventions or natural phenomenon and determination of the direction of change. […] Resource investment in surveillance often occurs in response to a newly recognized disease […] a suspected change in the frequency, virulence, geography, or risk population of a familiar disease […] or following a natural disaster […] In these situations, no baseline data are available against which to judge the significance of data collected under newly implemented surveillance.”

“Differences in data collection methods may result in apparent differences in disease occurrence between geographic regions or over time that are merely artifacts resulting from variations in surveillance methodology. Data should be analyzed using standard periods of observation […] It may be helpful to examine the same data by varied time frames. An outbreak of short duration may be recognizable through hourly, daily, or weekly grouping of data but obscured if data are examined only on an annual basis. Conversely, meaningful longer-term trends may be recognized more efficiently by examining data on an annual basis or at multiyear intervals. […] An early approach to analysis of infectious disease surveillance data was to convert observation of numbers into observations of rates. Describing surveillance observations as rates […] standardizes the data in a way that allows comparisons of the impact of disease across time and geography and among different populations”.
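
The conversion from counts to rates is trivial but worth spelling out, because it is what makes figures like the US TB numbers quoted earlier (11,182 reported cases, an incidence of 3.6 per 100,000 in 2010) comparable across populations of different sizes. A minimal sketch; the ∼310 million US population is my assumption, not a figure from the book:

```python
def rate_per_100k(cases, population):
    """Convert a raw case count into a rate per 100,000 population."""
    return cases / population * 100_000

# US TB, 2010: 11,182 reported cases; population assumed ~310 million.
print(round(rate_per_100k(11_182, 310_000_000), 1))   # ~3.6 per 100,000
```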

“Understanding the sensitivity and specificity of surveillance systems is important. […] Statistical methods based on tests of randomness have been applied to infectious disease surveillance data for the purpose of analysis of aberrations. Methods include adaptations of quality control charts from industry; Bayesian, cluster, regression, time series, and bootstrap analyses; and application of smoothing algorithms, simulation, and spatial statistics [1,14].[…] Time series forecasting and regression methods have been fitted to mortality data series to forecast future epidemics of seasonal diseases, most commonly influenza, and to estimate the excess associated mortality. […] While statistical analysis can be applied to surveillance data, the use of statistics for this purpose is often limited by the nature of surveillance data. Populations under surveillance are often not random samples of a general population, and may not be broadly representative, complicating efforts to use statistics to estimate morbidity and mortality impacts on populations. […] The more information an epidemiologist has about the purpose of the surveillance system, the people who perform the reporting, and the circumstances under which the data are collected and conveyed through the system, the more likely it is that the epidemiologist will interpret the data correctly. […] In the context of public health practice, a key value of surveillance data is not just in the observations from the surveillance system but also in the fact that these data often stimulate action to collect better data, usually through field investigations. Field investigations may improve understanding of risk factors that were suggested by the surveillance data itself. Often, field investigations triggered by surveillance observations lead to research studies such as case control comparisons that identify and better define the strength of risk factors.”
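
Of the aberration-detection approaches listed above, the industrial control-chart adaptation is the simplest to convey: establish a baseline distribution from historical counts and flag new observations that fall outside it. A toy sketch with made-up weekly counts and a simple two-standard-deviation rule, rather than any particular published algorithm:

```python
import numpy as np

baseline = np.array([12, 9, 14, 11, 10, 13, 12, 8, 11, 10])  # historical weekly counts (made up)
current = np.array([11, 13, 25, 12])                          # newly reported weeks (made up)

threshold = baseline.mean() + 2 * baseline.std(ddof=1)        # upper control limit
flagged = np.flatnonzero(current > threshold)
print(f"threshold ≈ {threshold:.1f}, flagged week indices: {flagged.tolist()}")
```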

“The increasing frequency of disease outbreaks that have spread across national borders has led to the development of multicountry surveillance networks. […] Countries that participate in surveillance networks typically agree to share disease outbreak information and to collaborate in efforts to control disease spread. […] Multicountry disease surveillance networks now exist in many parts of the world, such as the Middle East, Southeast Asia, Southern Africa, Southeastern Europe, and East Africa. […] Development of accurate and reliable diagnoses of illnesses is a fundamental challenge in global surveillance. Clinical specimen collection, analysis, and laboratory confirmation of the etiology of disease outbreaks are important components of any disease surveillance system [37]. In many areas of the world, however, insufficient diagnostic capacity leads to no or faulty diagnoses, inappropriate treatments, and disease misreporting. For example, surveillance for malaria is challenged by a common reliance on clinical symptoms for diagnosis, which has been shown to be a poor predictor of actual infection [38,39]. […] A WHO report indicates that more than 60% of laboratory equipment in countries with limited resources is outdated or not functioning [46]. Even when there is sufficient laboratory capacity, laboratory-based diagnosis of disease can also be slow, delaying detection of outbreaks. For example, it can take more than a month to determine whether a patient is infected with drug-resistant strains of tuberculosis. […] The International Health Regulations (IHR) codify the measures that countries must take to limit the international spread of disease while ensuring minimum interference with trade and travel. […] From the perspective of an individual nation, there are few incentives to report an outbreak of a disease to the international community. Rather, the decision to report diseases may result in adverse consequences — significant drops in tourism and trade, closings of borders, and other measures that the IHR are supposed to prevent.”

“Concerns about biological terrorism have raised the profile of infectious disease surveillance in the United States and around the globe [14]. […] Improving global surveillance for biological terrorism and emerging infectious diseases is now a major focus of the U.S. Department of Defense’s (DoD) threat reduction programs [17]. DoD spends more on global health surveillance than any other U.S. governmental agency [18].”

“Zoonoses, or diseases that can transmit between humans and animals, have been responsible for nearly two-thirds of infectious disease outbreaks that have occurred since 1950 and more than $200 billion in worldwide economic losses in the last 10 years [52]. Despite the significant economic and health threats caused by these diseases, worldwide capacity for surveillance of zoonotic diseases is insufficient [52]. […] Over the last few decades, there have been significant changes in the way in which infectious disease surveillance is practiced. New regulations and goals for infectious disease surveillance have given rise to the development of new surveillance approaches and methods and have resulted in participation by nontraditional sectors, including the security community. Though most of these developments have positively shaped global surveillance, there remain key challenges that stand in the way of continued improvements. These include insufficient diagnostic capabilities and lack of trained staff, lack of integration between human and animal-health surveillance efforts, disincentives for countries to report disease outbreaks, and lack of information exchange between public health agencies and other sectors that are critical for surveillance.”

“The biggest limitations to the development and sustainment of electronic disease surveillance systems, particularly in resource-limited countries, are the ease with which data are collected, accessed, and used by public health officials. Systems that require large amounts of resources, whether that is in the form of the workforce or information technology (IT) infrastructure, will not be successful in the long term. Successful systems run on existing hardware that can be maintained by modestly trained IT professionals and are easy to use by end users in public health [20].”

October 20, 2017 Posted by | Books, Epidemiology, Infectious disease, Medicine, Statistics | Leave a comment