This will be my last post about the book. Yesterday I finished reading Darwin’s Origin of Species, which was my 100th book this year (here’s the list), but I can’t face blogging that book at the moment so coverage of that one will have to wait a bit.
In my second post about this book I had originally planned to cover chapter 7 – ‘Analysing costs’ – but as I didn’t want to spend too much time on the post I ended up cutting it short. This omission means that some themes discussed below are closely related to stuff covered in the second post, whereas most of the remaining material, more specifically the material from chapters 8, 9 and 10, deals with decision analytic modelling, a quite different topic. In other words the coverage will be slightly more fragmented and less structured than I’d have liked it to be, but there’s not really much to do about that (it doesn’t help in this respect that I decided not to cover chapter 8, but doing that as well was out of the question).
I’ll start with coverage of some of the things they talk about in chapter 7, which as mentioned deals with how to analyze costs in a cost-effectiveness analysis context. They observe in the chapter that health cost data are often skewed to the right, for several reasons (costs incurred by an individual cannot be negative; for many patients the costs may be zero; some study participants may require much more care than the rest, creating a long tail). One way to address skewness is to use the median instead of the mean as the variable of interest, but a problem with this approach is that the median is not as useful to policy-makers as the mean: the mean times the size of the population of interest gives a good estimate of the total costs of an intervention, whereas the median is not very useful for arriving at such an estimate. Doing data transformations and analyzing transformed data is another way to deal with skewness, but the use of transformations in cost-effectiveness analysis has been questioned for a variety of reasons discussed in the chapter (to give a couple of examples, data transformation methods perform badly if inappropriate transformations are used, and many transformations cannot be used if there are data points with zero costs in the data, which is very common). Of the non-parametric methods aimed at dealing with skewness they discuss a variety of tests which are rarely used, as well as the bootstrap, the latter being one approach which has gained widespread use.
They observe in the context of the bootstrap that “it has increasingly been recognized that the conditions the bootstrap requires to produce reliable parameter estimates are not fundamentally different from the conditions required by parametric methods” and note in a later chapter (chapter 11) that: “it is not clear that bootstrap results in the presence of severe skewness are likely to be any more or less valid than parametric results […] bootstrap and parametric methods both rely on sufficient sample sizes and are likely to be valid or invalid in similar circumstances. Instead, interest in the bootstrap has increasingly focused on its usefulness in dealing simultaneously with issues such as censoring, missing data, multiple statistics of interest such as costs and effects, and non-normality.” Going back to the coverage in chapter 7, in the context of skewness they also briefly touch upon the potential use of a GLM framework to address this problem.
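To make the mean-versus-median point and the bootstrap idea a bit more concrete, here is a minimal sketch of my own (it is not from the book). The cost data are simulated and entirely hypothetical: a block of zero-cost patients plus a log-normal tail. The function name `bootstrap_mean_ci`, the population size, and the choice of a simple percentile interval are all my assumptions.

```python
import random
import statistics


def bootstrap_mean_ci(costs, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `costs`."""
    rng = random.Random(seed)
    n = len(costs)
    means = []
    for _ in range(n_resamples):
        # Resample patients with replacement and record the resample mean.
        resample = [costs[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical right-skewed cost data: many zeros and a long right tail.
data_rng = random.Random(1)
costs = [0.0] * 30 + [data_rng.lognormvariate(7, 1) for _ in range(70)]

mean_cost = statistics.fmean(costs)
median_cost = statistics.median(costs)
ci_low, ci_high = bootstrap_mean_ci(costs)

# The mean scales up to a total budget estimate; the median does not.
population_size = 10_000  # hypothetical number of eligible patients
estimated_total_cost = mean_cost * population_size

print(f"mean={mean_cost:.0f}, median={median_cost:.0f}")
print(f"95% bootstrap CI for the mean: ({ci_low:.0f}, {ci_high:.0f})")
print(f"estimated total cost: {estimated_total_cost:.0f}")
```

As the quotes above stress, an interval like this is only as trustworthy as the sample size allows; the bootstrap does not rescue a small, severely skewed sample.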
Data is often missing in cost datasets. Some parts of their coverage of these topics were to me but a review of stuff already covered in Bartholomew. Data can be missing for different reasons and through different mechanisms; one distinction is among data missing completely at random (MCAR), missing at random (MAR) (“missing data are correlated in an observable way with the mechanism that generates the cost, i.e. after adjusting the data for observable differences between complete and missing cases, the cost for those with missing data is the same, except for random variation, as for those with complete data”), and not missing at random (NMAR). The last type is also called non-ignorably missing data, and if you have that sort of data the implication is that the costs of those in the observed and unobserved groups differ in unpredictable ways, and if you ignore the process that drives these differences you’ll probably end up with a biased estimator. Another way to distinguish between different types of missing data is to look at patterns within the dataset, where you have:
“*univariate missingness – a single variable in a dataset is causing a problem through missing values, while the remaining variables contain complete information
*unit non-response – no data are recorded for any of the variables for some patients
*monotone missing – caused, for example, by drop-out in panel or longitudinal studies, resulting in variables observed up to a certain time point or wave but not beyond that
*multivariate missing – also called item non-response or general missingness, where some but not all of the variables are missing for some of the subjects.”
The authors note that the most common types of missingness in cost information analyses are the latter two. They discuss some techniques for dealing with missing data, such as complete-case analysis, available-case analysis, and imputation, but I won’t go into the details here. In the last parts of the chapter they talk a little bit about censoring, which can be viewed as a specific type of missing data, and ways to deal with it. Censoring happens when follow-up information on some subjects is not available for the full duration of interest, which may be caused e.g. by attrition (people dropping out of the trial), or insufficient follow-up (the final date of follow-up might be set before all patients reach the endpoint of interest, e.g. death). The two most common methods for dealing with censored cost data are the Kaplan-Meier sample average (KMSA) estimator and the inverse probability weighting (IPW) estimator, both of which are non-parametric interval methods. “Comparisons of the IPW and KMSA estimators have shown that they both perform well over different levels of censoring […], and both are considered reasonable approaches for dealing with censoring.” One difference between the two is that the KMSA, unlike the IPW, is not appropriate for dealing with censoring due to attrition unless the attrition is MCAR (and it almost never is), because the KM estimator, and by extension the KMSA estimator, assumes that censoring is independent of the event of interest.
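The book does not spell out the IPW estimator in the parts I have quoted, so the following is only a rough sketch of the general idea as I understand it: each patient with complete follow-up is weighted by the inverse of the estimated probability of not having been censored before their endpoint, with that probability estimated by a Kaplan-Meier curve for the censoring times. The function name `ipw_mean_cost`, the toy data, and the simplifying assumption that no censoring time ties with an endpoint time are all mine.

```python
def ipw_mean_cost(times, observed, costs):
    """Inverse-probability-weighted mean cost under censoring.

    times[i]    : follow-up time of patient i
    observed[i] : True if the endpoint (e.g. death) was observed,
                  False if the patient was censored at times[i]
    costs[i]    : cost accumulated by patient i up to times[i]

    Assumes (for simplicity) that censoring times and endpoint times
    never coincide.
    """
    n = len(times)
    censor_times = sorted({t for t, obs in zip(times, observed) if not obs})

    def prob_uncensored_before(t):
        # Kaplan-Meier estimate for the censoring distribution:
        # product over censoring times u < t of (1 - censored_at_u / at_risk_at_u).
        k = 1.0
        for u in censor_times:
            if u >= t:
                break
            at_risk = sum(1 for x in times if x >= u)
            censored = sum(
                1 for x, obs in zip(times, observed) if x == u and not obs
            )
            k *= 1 - censored / at_risk
        return k

    total = 0.0
    for t, obs, c in zip(times, observed, costs):
        if obs:  # only complete cases contribute, but with weight 1 / K(t)
            total += c / prob_uncensored_before(t)
    return total / n


# Hypothetical toy data: the second patient drops out at time 2.
times = [1, 2, 3, 4]
observed = [True, False, True, True]
costs = [100.0, 50.0, 200.0, 300.0]

ipw_estimate = ipw_mean_cost(times, observed, costs)
naive_estimate = sum(costs) / len(costs)  # ignores censoring entirely
print(ipw_estimate, naive_estimate)
```

In this toy example the naive average is pulled down by the censored patient's truncated cost, whereas the weighted estimate upweights the two patients who were still at risk when the dropout happened.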
The focus in chapter 8 is on decision tree models, and I decided to skip that chapter as most of it is known stuff which I felt no need to review here (do remember that I to a large extent use this blog as an extended memory, so I’m not only(/mainly?) writing this stuff for other people..). Chapter 9 deals with Markov models, and I’ll talk a little bit about those in the following.
“Markov models analyse uncertain processes over time. They are suited to decisions where the timing of events is important and when events may happen more than once, and therefore they are appropriate where the strategies being evaluated are of a sequential or repetitive nature. Whereas decision trees model uncertain events at chance nodes, Markov models differ in modelling uncertain events as transitions between health states. In particular, Markov models are suited to modelling long-term outcomes, where costs and effects are spread over a long period of time. Therefore Markov models are particularly suited to chronic diseases or situations where events are likely to recur over time […] Over the last decade there has been an increase in the use of Markov models for conducting economic evaluations in a health-care setting […]
A Markov model comprises a finite set of health states in which an individual can be found. The states are such that in any given time interval, the individual will be in only one health state. All individuals in a particular health state have identical characteristics. The number and nature of the states are governed by the decisions problem. […] Markov models are concerned with transitions during a series of cycles consisting of short time intervals. The model is run for several cycles, and patients move between states or remain in the same state between cycles […] Movements between states are defined by transition probabilities which can be time dependent or constant over time. All individuals within a given health state are assumed to be identical, and this leads to a limitation of Markov models in that the transition probabilities only depend on the current health state and not on past health states […the process is memoryless…] – this is known as the Markovian assumption”.
They note that in order to build and analyze a Markov model, you need to do the following: *define states and allowable transitions [for example from ‘non-dead’ to ‘dead’ is okay, but going the other way is, well… For a Markov process to end, you need at least one state that cannot be left after it has been reached, and those states are termed ‘absorbing states’], *specify initial conditions in terms of starting probabilities/initial distribution of patients, *specify transition probabilities, *specify a cycle length, *set a stopping rule, *determine rewards, *implement discounting if required, *analyse and evaluate the model, and *explore uncertainties. They talk about each step in more detail in the book, but I won’t go too much into this.
Markov models may be governed by transitions that are either constant over time or time-dependent. In a Markov chain transition probabilities are constant over time, whereas in a Markov process transition probabilities vary over time (/from cycle to cycle). In a simple Markov model the baseline assumption is that transitions only occur once in each cycle and usually the transition is modelled as taking place either at the beginning or the end of cycles, but in reality transitions can take place at any point in time during the cycle. One way to deal with the problem of misidentification (people assumed to be in one health state throughout the cycle even though they’ve transferred to another health state during the cycle) is to use half-cycle corrections, in which an assumption is made that on average state transitions occur halfway through the cycle, instead of at the beginning or the end of a cycle. They note that: “the important principle with the half-cycle correction is not when the transitions occur, but when state membership (i.e. the proportion of the cohort in that state) is counted. The longer the cycle length, the more important it may be to use half-cycle corrections.” When state transitions are assumed to take place may influence factors such as cost discounting (if the cycle is long, it can be important to get the state transition timing reasonably right).
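The steps listed above (states, initial distribution, transition probabilities, cycle length, stopping rule, rewards, discounting) can be sketched in a few lines. This is my own toy illustration, not a model from the book: the three states, every number in it, the one-year cycle, and the convention of discounting each cycle at its start are all assumptions. The half-cycle correction is implemented in the sense of the quote, i.e. by counting state membership as the average of the start-of-cycle and end-of-cycle distributions.

```python
# Hypothetical three-state cohort model; all values are illustrative.
STATES = ["well", "sick", "dead"]  # "dead" is the absorbing state
P = [
    [0.85, 0.10, 0.05],  # transitions from "well"
    [0.00, 0.80, 0.20],  # transitions from "sick"
    [0.00, 0.00, 1.00],  # transitions from "dead" (absorbing)
]
COST = [100.0, 2000.0, 0.0]  # cost per cycle spent in each state
UTILITY = [0.9, 0.6, 0.0]    # QALYs per one-year cycle in each state
DISCOUNT = 0.035             # annual discount rate


def step(dist, matrix):
    """Advance the cohort distribution by one cycle."""
    n = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def run_cohort(start, matrix, n_cycles, half_cycle=True):
    """Run a constant-probability Markov chain for a fixed number of cycles."""
    trace = [start]
    for _ in range(n_cycles):  # stopping rule: fixed time horizon
        trace.append(step(trace[-1], matrix))

    total_cost = 0.0
    total_qaly = 0.0
    for t in range(n_cycles):
        if half_cycle:
            # Count membership halfway through the cycle.
            membership = [(a + b) / 2 for a, b in zip(trace[t], trace[t + 1])]
        else:
            membership = trace[t]  # count membership at the start of the cycle
        discount_factor = 1 / (1 + DISCOUNT) ** t
        total_cost += discount_factor * sum(m * c for m, c in zip(membership, COST))
        total_qaly += discount_factor * sum(m * u for m, u in zip(membership, UTILITY))
    return trace, total_cost, total_qaly


start = [1.0, 0.0, 0.0]  # everyone starts in "well"
trace, cost_hc, qaly_hc = run_cohort(start, P, n_cycles=20)
_, cost_start, qaly_start = run_cohort(start, P, n_cycles=20, half_cycle=False)
print(f"with half-cycle correction:    cost={cost_hc:.0f}, QALYs={qaly_hc:.2f}")
print(f"without half-cycle correction: cost={cost_start:.0f}, QALYs={qaly_start:.2f}")
```

Replacing the fixed matrix `P` with a function of the cycle number would turn this into the time-dependent case; dependence on time spent in a state is harder and typically requires extra (tunnel-like) states.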
When time dependency is introduced into the model, there are in general two types of time dependencies that impact on transition probabilities in the models. One is time dependency depending on the number of cycles since the start of the model (this is e.g. dealing with how transition probabilities depend on factors like age), whereas the other, which is more difficult to implement, deals with state dependence (curiously they don’t use these two words, but I’ve worked with state dependence models before in labour economics and this is what we’re dealing with here); i.e. here the transition probability will depend upon how long you’ve been in a given state.
Below I mostly discuss stuff covered in chapter 10, however I also include a few observations from the final chapter, chapter 11 (on ‘Presenting cost-effectiveness results’). Chapter 10 deals with how to represent uncertainty in decision analytic models. This is an important topic because as noted later in the book, “The primary objective of economic evaluation should not be hypothesis testing, but rather the estimation of the central parameter of interest—the incremental cost-effectiveness ratio—along with appropriate representation of the uncertainty surrounding that estimate.” In chapter 10 a distinction is made between variability, heterogeneity, and uncertainty. Variability has also been termed first-order uncertainty or stochastic uncertainty, and pertains to variation observed when recording information on resource use or outcomes within a homogenous sample of individuals. Heterogeneity relates to differences between patients which can be explained, at least in part. They distinguish between two types of uncertainty, structural uncertainty – dealing with decisions and assumptions made about the structure of the model – and parameter uncertainty, which of course relates to the precision of the parameters estimated. After briefly talking about ways to deal with these, they talk about sensitivity analysis.
“Sensitivity analysis involves varying parameter estimates across a range and seeing how this impacts on the model’s results. […] The simplest form is a one-way analysis where each parameter estimate is varied independently and singly to observe the impact on the model results. […] One-way sensitivity analysis can give some insight into the factors influencing the results, and may provide a validity check to assess what happens when particular variables take extreme values. However, it is likely to grossly underestimate overall uncertainty, and ignores correlation between parameters.”
Multi-way sensitivity analysis is a more refined approach, in which more than one parameter estimate is varied – this is sometimes termed scenario analysis. A different approach is threshold analysis, where one attempts to identify the critical value of one or more variables at which the conclusion/decision changes. All of these approaches are deterministic, and they are not without problems. “They fail to take account of the joint parameter uncertainty and correlation between parameters, and rather than providing the decision-maker with a useful indication of the likelihood of a result, they simply provide a range of results associated with varying one or more input estimates.” So of course an alternative has been developed, namely probabilistic sensitivity analysis (PSA), which started to be used in health economic decision analyses as early as the mid-80s.
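As a small illustration of the deterministic approaches (one-way and threshold analysis), here is a sketch using a made-up decision model of my own; none of the parameter names or values come from the book. The model compares a treatment that reduces the risk of a costly event with no treatment and reports the incremental cost-effectiveness ratio (ICER). The willingness-to-pay value of 20,000 per QALY is likewise just an assumption.

```python
def icer(p_event=0.20, rel_risk=0.75, cost_event=10000.0,
         cost_treatment=1000.0, qaly_loss=0.30):
    """ICER of a hypothetical treatment versus no treatment."""
    events_avoided = p_event * (1 - rel_risk)
    incremental_cost = cost_treatment - events_avoided * cost_event
    incremental_qaly = events_avoided * qaly_loss
    return incremental_cost / incremental_qaly


def threshold_value(f, lo, hi, target, tol=1e-6):
    """Bisection search for x in [lo, hi] with f(x) == target (f increasing)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# One-way sensitivity analysis: vary one parameter, hold the rest at base case.
one_way = {c: icer(cost_treatment=c) for c in (600.0, 800.0, 1000.0, 1200.0, 1400.0)}
for c, value in one_way.items():
    print(f"treatment cost {c:7.0f} -> ICER {value:10.0f}")

# Threshold analysis: at what treatment cost does the ICER hit 20,000 per QALY?
WILLINGNESS_TO_PAY = 20000.0  # hypothetical decision threshold
critical_cost = threshold_value(
    lambda c: icer(cost_treatment=c), 0.0, 5000.0, WILLINGNESS_TO_PAY
)
print(f"critical treatment cost: {critical_cost:.2f}")
```

Note that this gives a range of results and a break-even point, but, as the quote says, nothing about how likely any of those results are.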
“PSA permits the joint uncertainty across all the parameters in the model to be addressed at the same time. It involves sampling model parameter values from distributions imposed on variables in the model. […] The types of distribution imposed are dependent on the nature of the input parameters [but] decision analytic models for the purpose of economic evaluation tend to use homogenous types of input parameters, namely costs, life-years, QALYs, probabilities, and relative treatment effects, and consequently the number of distributions that are frequently used, such as the beta, gamma, and log-normal distributions, is relatively small. […] Uncertainty is then propagated through the model by randomly selecting values from these distributions for each model parameter using Monte Carlo simulation”.
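A minimal sketch of what this looks like in practice, under assumptions of my own: a made-up model with beta distributions on probabilities and QALY losses, gamma distributions on costs, and a log-normal distribution on the relative risk. All distribution parameters, the willingness-to-pay threshold, and the use of net monetary benefit to summarise each draw are my choices for illustration, not the book's.

```python
import random


def simulate_psa(n_draws=5000, threshold=20000.0, seed=42):
    """Monte Carlo PSA for a hypothetical treatment-versus-nothing model."""
    rng = random.Random(seed)
    n_cost_effective = 0
    sum_inc_cost = 0.0
    sum_inc_qaly = 0.0
    for _ in range(n_draws):
        # Sample every parameter from its (assumed) distribution at once.
        p_event = rng.betavariate(20, 80)           # baseline event probability
        rel_risk = rng.lognormvariate(-0.3, 0.1)    # relative risk on treatment
        cost_event = rng.gammavariate(25, 400)      # shape, scale: mean 10,000
        cost_treatment = rng.gammavariate(100, 10)  # shape, scale: mean 1,000
        qaly_loss = rng.betavariate(30, 70)         # QALYs lost per event

        # Propagate the sampled values through the model.
        events_avoided = p_event * (1 - rel_risk)
        inc_cost = cost_treatment - events_avoided * cost_event
        inc_qaly = events_avoided * qaly_loss

        # Net monetary benefit > 0 means cost-effective at this threshold.
        if threshold * inc_qaly - inc_cost > 0:
            n_cost_effective += 1
        sum_inc_cost += inc_cost
        sum_inc_qaly += inc_qaly

    return {
        "prob_cost_effective": n_cost_effective / n_draws,
        "mean_inc_cost": sum_inc_cost / n_draws,
        "mean_inc_qaly": sum_inc_qaly / n_draws,
    }


result = simulate_psa()
print(result)
```

Unlike the deterministic analyses, the output here includes a probability statement (the share of draws in which the treatment is cost-effective), which is the kind of likelihood information the quote says the deterministic approaches cannot provide. The sketch samples parameters independently, so it does not capture correlation between them.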
“This comprehensive new Handbook explores the significance and nature of armed intrastate conflict and civil war in the modern world.
Civil wars and intrastate conflict represent the principal form of organised violence since the end of World War II, and certainly in the contemporary era. These conflicts have a huge impact and drive major political change within the societies in which they occur, as well as on an international scale. The global importance of recent intrastate and regional conflicts in Afghanistan, Pakistan, Iraq, Somalia, Nepal, Côte d’Ivoire, Syria and Libya – amongst others – has served to refocus academic and policy interest upon civil war. […] This volume will be of much interest to students of civil wars and intrastate conflict, ethnic conflict, political violence, peace and conflict studies, security studies and IR in general.”
I’m currently reading this handbook. One observation I’ll make here before moving on to the main coverage is that although I’ve read more than 100 pages and although every single one of the conflicts argued in the introduction above to be motivating study into these topics aside from one (the exception being Nepal) involve muslims, the word ‘islam’ has been mentioned exactly once in the coverage so far (an updated list would arguably include yet another muslim country, Yemen, as well). I noted while doing the text search that they seem to take up the topic of religion and religious motivation later on, so I sort of want to withhold judgment for now, but if they don’t deal more seriously with this topic later on than they have so far, I’ll have great difficulties giving this book a high rating, despite the coverage being in general actually quite interesting, detailed and well written so far – chapter 7, on so-called ‘critical perspectives’ is in my opinion a load of crap [a few illustrative quotes/words/concepts from that chapter: “Frankfurt School-inspired Critical Theory”, “approaches such as critical constructivism, post-structuralism, feminism, post-colonialism”, “an openly ethical–normative commitment to human rights, progressive politics”, “labelling”, “dialectical”, “power–knowledge structures”, “conflict discourses”, “Foucault”, “an abiding commitment to being aware of, and trying to overcome, the Eurocentric, Orientalist and patriarchal forms of knowledge often prevalent within civil war studies”, “questioning both morally and intellectually the dominant paradigm”… I read the chapter very fast, to the point of almost only skimming it, and I have not quoted from that chapter in my coverage below, for reasons which should be obvious – I was reminded of Poe’s Corollary while reading the chapter as I briefly started wondering along the way if the chapter was an elaborate joke which had somehow made it into the publication, and I also briefly was reminded of the Sokal affair, mostly because of the unbelievable amount of meaningless buzzwords], but that’s just one chapter and most of the others so far have been quite okay. A few of the points in the problematic chapter are actually arguably worth having in mind, but there’s so much bullshit included as well that you’re having a really hard time taking any of it seriously.
Some observations from the first 100 pages:
“There are wide differences of opinion across the broad field of scholars who work on civil war regarding the basis of legitimate and scientific knowledge in this area, on whether cross-national studies can generate reliable findings, and on whether objective, value-free analysis of armed conflict is possible. All too often – and perhaps increasingly so, with the rise in interest in econometric approaches – scholars interested in civil war from different methodological traditions are isolated from each other. […] even within the more narrowly defined empirical approaches to civil war studies there are major disagreements regarding the most fundamental questions relating to contemporary civil wars, such as the trends in numbers of armed conflicts, whether civil wars are changing in nature, whether and how international actors can have a role in preventing, containing and ending civil wars, and the significance of [various] factors”.
“In simplest terms civil war is a violent conflict between a government and an organized rebel group, although some scholars also include armed conflicts primarily between non-state actors within their study. The definition of a civil war, and the analytical means of differentiating a civil war from other forms of large-scale violence, has been controversial […] The Uppsala Conflict Data Program (UCDP) uses 25 battle-related deaths per year as the threshold to be classified as armed conflict, and – in common with other datasets such as the Correlates of War (COW) – a threshold of 1,000 battle-related deaths for a civil war. While this is now widely endorsed, debate remains regarding the rigor of this definition […] differences between two of the main quantitative conflict datasets – the UCDP and the COW – in terms of the measurement of armed conflict result in significant differences in interpreting patterns of conflict. This has led to conflicting findings not only about absolute numbers of civil wars, but also regarding trends in the numbers of such conflicts. […] According to the UCDP/PRIO data, from 1946 to 2011 a total of 102 countries experienced civil wars. Africa witnessed the most with 40 countries experiencing civil wars between 1946 and 2011. During this period 20 countries in the Americas experienced civil war, 18 in Asia, 13 in Europe, and 11 in the Middle East […]. There were 367 episodes (episodes in this case being separated by at least one year without at least 25 battle-related deaths) of civil wars from 1946 to 2009 […]. The number of active civil wars generally increased from the end of the Cold War to around 1992 […]. Since then the number has been in decline, although whether this is likely to be sustained is debatable. In terms of onset of first episode by region from 1946 to 2011, Africa leads the way with 75, followed by Asia with 67, the Western Hemisphere with 33, the Middle East with 29, and Europe with 25 […]. 
As Walter (2011) has observed, armed conflicts are increasingly concentrated in poor countries. […] UCDP reports 137 armed conflicts for the period 1989–2011. For the overlapping period 1946–2007, COW reports 179 wars, while UCDP records 244 armed conflicts. As most of these conflicts have been fought over disagreements relating to conditions within a state, it means that civil war has been the most common experience of war throughout this period.”
“There were 3 million deaths from civil wars with no international intervention between 1946 and 2008. There were 1.5 million deaths in wars where intervention occurred. […] In terms of region, there were approximately 350,000 civil war-related deaths in both Europe and the Middle East from the years 1946 to 2008. There were 467,000 deaths in the Western Hemisphere, 1.2 million in Africa, and 3.1 million in Asia for the same period […] In terms of historical patterns of civil wars and intrastate armed conflict more broadly, the most conspicuous trend in recent decades is an apparent decline in absolute numbers, magnitude, and impact of armed conflicts, including civil wars. While there is wide – but not total – agreement regarding this, the explanations for this downward trend are contested. […] the decline seems mainly due not to a dramatic decline of civil war onsets, but rather because armed conflicts are becoming shorter in duration and they are less likely to recur. While this is undoubtedly welcome – and so is the tendency of civil wars to be generally smaller in magnitude – it should not obscure the fact that civil wars are still breaking out at a rate that has been fairly static in recent decades.”
“there is growing consensus on a number of findings. For example, intrastate armed conflict is more likely to occur in poor, developing countries with weak state structures. In situations of weak states the presence of lootable natural resources and oil increase the likelihood of experiencing armed conflict. Dependency upon the export of primary commodities is also a vulnerability factor, especially in conjunction with drastic fluctuations in international market prices which can result in economic shocks and social dislocation. State weakness is relevant to this – and to most of the theories regarding armed conflict proneness – because such states are less able to cushion the impact of economic shocks. […] Authoritarian regimes as well as entrenched democracies are less likely to experience civil war than societies in-between […] Situations of partial or weak democracy (anocracy) and political transition, particularly a movement towards democracy in volatile or divided societies, are also strongly correlated to conflict onset. The location of a society – especially if it has other vulnerability factors – in a region which has contiguous neighbors which are experiencing or have experienced armed conflict is also an armed conflict risk.”
“Military intervention aimed at supporting a protagonist or influencing the outcome of a conflict tends to increase the intensity of civil wars and increase their duration […] It is commonly argued that wars ending with military victory are less likely to recur […]. In these terminations one side no longer exists as a fighting force. Negotiated settlements, on the other hand, are often unstable […] The World Development Report 2011 notes that 90 percent of the countries with armed conflicts taking place in the first decade of the 2000s also had a major armed conflict in the preceding 30 years […] of the 137 armed conflicts that were fought after 1989 100 had ended by 2011, while 37 were still ongoing”
“Cross-national, aggregated, analysis has played a leading role in strengthening the academic and policy impact of conflict research through the production of rigorous research findings. However, the […] aggregation of complex variables has resulted in parsimonious findings which arguably neglect the complexity of armed conflict; simultaneously, differences in the codification and definition of key concepts result in contradictory findings. The growing popularity of micro-studies is therefore an important development in the field of civil war studies, and one that responds to the demand for more nuanced analysis of the dynamics of conflict at the local level.”
“Jason Quinn, University of Notre Dame, has calculated that the number of scholarly articles on the onset of civil wars published in the first decade of the twenty-first century is larger than the previous five decades combined”.
“One of the most challenging aspects of quantitative analysis is transforming social concepts into numerical values. This difficulty means that many of the variables used to capture theoretical constructs represent crude indicators of the real concept […] econometric studies of civil war must account for the endogenising effect of civil war on other variables. Civil war commonly lowers institutional capacity and reduces economic growth, two of the primary conditions that are consistently shown to motivate civil violence. Scholars have grown more capable of modelling this process […], but still too frequently fail to capture the endogenising effect of civil conflict on other variables […] the problems associated with the rare nature of civil conflict can [also] cause serious problems in a number of econometric models […] Case-based analysis commonly suffers from two fundamental problems: non-generalisability and selection bias. […] Combining research methods can help to enhance the validity of both quantitative and qualitative research. […] the combination of methods can help quantitative researchers address measurement issues, assess outliers, discuss variables omitted from the large-N analysis, and examine cases incorrectly predicted by econometric models […] The benefits of mixed methods research designs have been clearly illustrated in a number of prominent studies of civil war […] Yet unfortunately the bifurcation of conflict studies into qualitative and quantitative branches makes this practice less common than is desirable.”
“Ethnography has elicited a lively critique from within and without anthropology. […] Ethnographers stand accused of argument by ostension (pointing at particular instances as indicative of a general trend). The instances may not even be true. This is one of the reasons that the economist Paul Collier rejected ethnographic data as a source of insight into the causes of civil wars (Collier 2000b). According to Collier, the ethnographer builds on anecdotal evidence offered by people with good reasons to fabricate their accounts. […] The story fits the fact. But so might other stories. […] [It might be categorized as] a discipline that still combines a mix of painstaking ethnographic documentation with brilliant flights of fancy, and largely leaves numbers on one side.”
“While macro-historical accounts convincingly argue for the centrality of the state to the incidence and intensity of civil war, there is a radical spatial unevenness to violence in civil wars that defies explanation at the national level. Villages only a few miles apart can have sharply contrasting experiences of conflict and in most civil wars large swathes of territory remain largely unaffected by violence. This unevenness presents a challenge to explanations of conflict that treat states or societies as the primary unit of analysis. […] A range of databases of disaggregated data on incidences of violence have recently been established and a lively publication programme has begun to explore sub-national patterns of distribution and diffusion of violence […] All of these developments testify to a growing recognition across the social sciences that spatial variation, territorial boundaries and bounding processes are properly located at the heart of any understanding of the causes of civil war. It suggests too that sub-national boundaries in their various forms – whether regional or local boundaries, lines of control established by rebels or no-go areas for state security forces – need to be analysed alongside national borders and in a geopolitical context. […] In both violent and non-violent contention local ‘safe territories’ of one kind or another are crucial to the exercise of power by challengers […] the generation of violence by insurgents is critically affected by logistics (e.g. roads), but also shelter (e.g. forests) […] Schutte and Weidmann (2011) offer a […] dynamic perspective on the diffusion of insurgent violence. Two types of diffusion are discussed; relocation diffusion occurs when the conflict zone is shifted to new locations, whereas escalation diffusion corresponds to an expansion of the conflict zone. 
They argue that the former should be a feature of conventional civil wars with clear frontlines, whereas the latter should be observed in irregular wars, an expectation that is borne out by the data.”
“Research on the motivation of armed militants in social movement scholarship emphasises the importance of affective ties, of friendship and kin networks and of emotion […] Sageman’s (2004, 2008) meticulous work on Salafist-inspired militants emphasises that mobilisation is a collective rather than individual process and highlights the importance of inter-personal ties, networks of friendship, family and neighbours. That said, it is clear that there is a variety of pathways to armed action on the part of individuals rather than one single dominant motivation”.
“While it is often difficult to conduct real experiments in the study of civil war, the micro study of violence has seen a strong adoption of quasi-experimental designs and in general, a more careful thinking about causal identification”.
“Condra and Shapiro (2012) present one of the first studies to examine the effects of civilian targeting in a micro-level study. […] they show that insurgent violence increases as a result of civilian casualties caused by counterinsurgent forces. Similarly, casualties inflicted by the insurgents have a dampening effect on insurgent effectiveness. […] The conventional wisdom in the civil war literature has it that indiscriminate violence by counterinsurgent forces plays into the hands of the insurgents. After being targeted collectively, the aggrieved population will support the insurgency even more, which should result in increased insurgent effectiveness. Lyall (2009) conducts a test of this relationship by examining the random shelling of villages from Russian bases in Chechnya. He matches shelled villages with those that have similar histories of violence, and examines the difference in insurgent violence between treatment and control villages after an artillery strike. The results clearly disprove conventional wisdom and show that shelling reduces subsequent insurgent violence. […] Other research in this area has looked at alternative counterinsurgency techniques, such as aerial bombings. In an analysis that uses micro-level data on airstrikes and insurgent violence, Kocher et al. (2011) show that, counter to Lyall’s (2009) findings, indiscriminate violence in the form of airstrikes against villages in the Vietnam war was counterproductive […] Data availability […] partly dictates what micro-level questions we can answer about civil war. […] not many conflicts have datasets on bombing sorties, such as the one used by Kocher et al. (2011) for the Vietnam war.”
I haven’t really blogged this book in anywhere near the amount of detail it deserves even though my first post about the book actually had a few quotes illustrating how much different stuff is covered in the book.
This book is technical, and even if I’m trying to make it less technical by omitting the math in this post it may be a good idea to reread the first post about the book before reading this post to refresh your knowledge of these things.
Quotes and comments below – most of the coverage here focuses on stuff covered in chapters 3 and 4 in the book.
“Tests of null hypotheses and information-theoretic approaches should not be used together; they are very different analysis paradigms. A very common mistake seen in the applied literature is to use AIC to rank the candidate models and then “test” to see whether the best model (the alternative hypothesis) is “significantly better” than the second-best model (the null hypothesis). This procedure is flawed, and we strongly recommend against it […] the primary emphasis should be on the size of the treatment effects and their precision; too often we find a statement regarding “significance,” while the treatment and control means are not even presented. Nearly all statisticians are calling for estimates of effect size and associated precision, rather than test statistics, P-values, and “significance.” [Borenstein & Hedges certainly did as well in their book (written much later), and this was not an issue I omitted to talk about in my coverage of their book…] […] Information-theoretic criteria such as AIC, AICc, and QAICc are not a “test” in any sense, and there are no associated concepts such as test power or P-values or α-levels. Statistical hypothesis testing represents a very different, and generally inferior, paradigm for the analysis of data in complex settings. It seems best to avoid use of the word “significant” in reporting research results under an information-theoretic paradigm. […] AIC allows a ranking of models and the identification of models that are nearly equally useful versus those that are clearly poor explanations for the data at hand […]. Hypothesis testing provides no general way to rank models, even for models that are nested. […] In general, we recommend strongly against the use of null hypothesis testing in model selection.”
“The bootstrap is a type of Monte Carlo method used frequently in applied statistics. This computer-intensive approach is based on resampling of the observed data […] The fundamental idea of the model-based sampling theory approach to statistical inference is that the data arise as a sample from some conceptual probability distribution f. Uncertainties of our inferences can be measured if we can estimate f. The bootstrap method allows the computation of measures of our inference uncertainty by having a simple empirical estimate of f and sampling from this estimated distribution. In practical application, the empirical bootstrap means using some form of resampling with replacement from the actual data x to generate B (e.g., B = 1,000 or 10,000) bootstrap samples […] The set of B bootstrap samples is a proxy for a set of B independent real samples from f (in reality we have only one actual sample of data). Properties expected from replicate real samples are inferred from the bootstrap samples by analyzing each bootstrap sample exactly as we first analyzed the real data sample. From the set of results of sample size B we measure our inference uncertainties from sample to (conceptual) population […] For many applications it has been theoretically shown […] that the bootstrap can work well for large sample sizes (n), but it is not generally reliable for small n […], regardless of how many bootstrap samples B are used. […] Just as the analysis of a single data set can have many objectives, the bootstrap can be used to provide insight into a host of questions. For example, for each bootstrap sample one could compute and store the conditional variance–covariance matrix, goodness-of-fit values, the estimated variance inflation factor, the model selected, confidence interval width, and other quantities. Inference can be made concerning these quantities, based on summaries over the B bootstrap samples.”
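To make the resampling idea in the quote above a bit more concrete, here is a minimal sketch in Python. It is not taken from the book: the data are simulated, the helper names (`bootstrap_replicates`, `percentile_interval`) are made up for illustration, and the percentile interval is only one of several ways to turn bootstrap replicates into an interval.

```python
import random
import statistics

def bootstrap_replicates(data, stat=statistics.mean, B=1000, seed=0):
    """Resample `data` with replacement B times and compute `stat` each time."""
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(B):
        # One bootstrap sample: n draws with replacement from the observed data.
        sample = [data[rng.randrange(n)] for _ in range(n)]
        # Analyse the bootstrap sample exactly as we would the real sample.
        replicates.append(stat(sample))
    return replicates

def percentile_interval(replicates, level=0.95):
    """Simple percentile interval read off the sorted bootstrap replicates."""
    ordered = sorted(replicates)
    lo_idx = int((1 - level) / 2 * len(ordered))
    hi_idx = int((1 + level) / 2 * len(ordered)) - 1
    return ordered[lo_idx], ordered[hi_idx]

# Simulated, right-skewed "observed" data (an assumption for illustration only).
data_rng = random.Random(42)
data = [data_rng.expovariate(1.0) for _ in range(200)]

reps = bootstrap_replicates(data)
boot_se = statistics.stdev(reps)       # bootstrap standard error of the mean
low, high = percentile_interval(reps)  # rough 95% interval for the mean
print(f"mean={statistics.mean(data):.3f} se={boot_se:.3f} interval=({low:.3f}, {high:.3f})")
```

Anything else one might want to study (the model selected, a goodness-of-fit value, interval width) could be stored per bootstrap sample in the same loop by swapping out `stat`.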
“Information criteria attempt only to select the best model from the candidate models available; if a better model exists, but is not offered as a candidate, then the information-theoretic approach cannot be expected to identify this new model. Adjusted R2 […] are useful as a measure of the proportion of the variation “explained,” [but] are not useful in model selection […] adjusted R2 is poor in model selection; its usefulness should be restricted to description.”
“As we have struggled to understand the larger issues, it has become clear to us that inference based on only a single best model is often relatively poor for a wide variety of substantive reasons. Instead, we increasingly favor multimodel inference: procedures to allow formal statistical inference from all the models in the set. […] Such multimodel inference includes model averaging, incorporating model selection uncertainty into estimates of precision, confidence sets on models, and simple ways to assess the relative importance of variables.”
“If sample size is small, one must realize that relatively little information is probably contained in the data (unless the effect size is very substantial), and the data may provide few insights of much interest or use. Researchers routinely err by building models that are far too complex for the (often meager) data at hand. They do not realize how little structure can be reliably supported by small amounts of data that are typically “noisy.””
“Sometimes, the selected model [when applying an information criterion] contains a parameter that is constant over time, or areas, or age classes […]. This result should not imply that there is no variation in this parameter, rather that parsimony and its bias/variance tradeoff finds the actual variation in the parameter to be relatively small in relation to the information contained in the sample data. It “costs” too much in lost precision to add estimates of all of the individual θi. As the sample size increases, then at some point a model with estimates of the individual parameters would likely be favored. Just because a parsimonious model contains a parameter that is constant across strata does not mean that there is no variation in that process across the strata.”
“[In a significance testing context,] a significant test result does not relate directly to the issue of what approximating model is best to use for inference. One model selection strategy that has often been used in the past is to do likelihood ratio tests of each structural factor […] and then use a model with all the factors that were “significant” at, say, α = 0.05. However, there is no theory that would suggest that this strategy would lead to a model with good inferential properties (i.e., small bias, good precision, and achieved confidence interval coverage at the nominal level). […] The purpose of the analysis of empirical data is not to find the “true model”— not at all. Instead, we wish to find a best approximating model, based on the data, and then develop statistical inferences from this model. […] We search […] not for a “true model,” but rather for a parsimonious model giving an accurate approximation to the interpretable information in the data at hand. Data analysis involves the question, “What level of model complexity will the data support?” and both under- and overfitting are to be avoided. Larger data sets tend to support more complex models, and the selection of the size of the model represents a tradeoff between bias and variance.”
“The easy part of the information-theoretic approaches includes both the computational aspects and the clear understanding of these results […]. The hard part, and the one where training has been so poor, is the a priori thinking about the science of the matter before data analysis — even before data collection. It has been too easy to collect data on a large number of variables in the hope that a fast computer and sophisticated software will sort out the important things — the “significant” ones […]. Instead, a major effort should be mounted to understand the nature of the problem by critical examination of the literature, talking with others working on the general problem, and thinking deeply about alternative hypotheses. Rather than “test” dozens of trivial matters (is the correlation zero? is the effect of the lead treatment zero? are ravens pink?, Anderson et al. 2000), there must be a more concerted effort to provide evidence on meaningful questions that are important to a discipline. This is the critical point: the common failure to address important science questions in a fully competent fashion. […] “Let the computer find out” is a poor strategy for researchers who do not bother to think clearly about the problem of interest and its scientific setting. The sterile analysis of “just the numbers” will continue to be a poor strategy for progress in the sciences.
Researchers often resort to using a computer program that will examine all possible models and variables automatically. Here, the hope is that the computer will discover the important variables and relationships […] The primary mistake here is a common one: the failure to posit a small set of a priori models, each representing a plausible research hypothesis.”
“Model selection is most often thought of as a way to select just the best model, then inference is conditional on that model. However, information-theoretic approaches are more general than this simplistic concept of model selection. Given a set of models, specified independently of the sample data, we can make formal inferences based on the entire set of models. […] Part of multimodel inference includes ranking the fitted models from best to worst […] and then scaling to obtain the relative plausibility of each fitted model (gi) by a weight of evidence (wi) relative to the selected best model. Using the conditional sampling variance […] from each model and the Akaike weights […], unconditional inferences about precision can be made over the entire set of models. Model-averaged parameter estimates and estimates of unconditional sampling variances can be easily computed. Model selection uncertainty is a substantial subject in its own right, well beyond just the issue of determining the best model.”
“There are three general approaches to assessing model selection uncertainty: (1) theoretical studies, mostly using Monte Carlo simulation methods; (2) the bootstrap applied to a given set of data; and (3) utilizing the set of AIC differences (i.e., ∆i) and model weights wi from the set of models fit to data.”
“Statistical science should emphasize estimation of parameters and associated measures of estimator uncertainty. Given a correct model […], an MLE is reliable, and we can compute a reliable estimate of its sampling variance and a reliable confidence interval […]. If the model is selected entirely independently of the data at hand, and is a good approximating model, and if n is large, then the estimated sampling variance is essentially unbiased, and any appropriate confidence interval will essentially achieve its nominal coverage. This would be the case if we used only one model, decided on a priori, and it was a good model, g, of the data generated under truth, f. However, even when we do objective, data-based model selection (which we are advocating here), the [model] selection process is expected to introduce an added component of sampling uncertainty into any estimated parameter; hence classical theoretical sampling variances are too small: They are conditional on the model and do not reflect model selection uncertainty. One result is that conditional confidence intervals can be expected to have less than nominal coverage.”
“Data analysis is sometimes focused on the variables to include versus exclude in the selected model (e.g., important vs. unimportant). Variable selection is often the focus of model selection for linear or logistic regression models. Often, an investigator uses stepwise analysis to arrive at a final model, and from this a conclusion is drawn that the variables in this model are important, whereas the other variables are not important. While common, this is poor practice and, among other issues, fails to fully consider model selection uncertainty. […] Estimates of the relative importance of predictor variables xj can best be made by summing the Akaike weights across all the models in the set where variable j occurs. Thus, the relative importance of variable j is reflected in the sum w+ (j). The larger the w+ (j) the more important variable j is, relative to the other variables. Using the w+ (j), all the variables can be ranked in their importance. […] This idea extends to subsets of variables. For example, we can judge the importance of a pair of variables, as a pair, by the sum of the Akaike weights of all models that include the pair of variables. […] To summarize, in many contexts the AIC selected best model will include some variables and exclude others. Yet this inclusion or exclusion by itself does not distinguish differential evidence for the importance of a variable in the model. The model weights […] summed over all models that include a given variable provide a better weight of evidence for the importance of that variable in the context of the set of models considered.” [The reason why I’m not telling you how to calculate Akaike weights is that I don’t want to bother with math formulas in wordpress – but I guess all you need to know is that these are not hard to calculate. It should perhaps be added that one can also use bootstrapping methods to obtain relevant model weights to apply in a multimodel inference context.]
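For readers who would like to see the calculation anyway, here is a small sketch. It uses the standard definition of Akaike weights, w_i = exp(−Δ_i/2) / Σ_r exp(−Δ_r/2), with Δ_i the difference between model i's AIC and the smallest AIC in the set. The four candidate models and their AIC values are invented purely for illustration, and the function names are my own.

```python
import math

def akaike_weights(aic_values):
    """Akaike weights from a list of AIC (or AICc) values."""
    best = min(aic_values)
    deltas = [a - best for a in aic_values]           # AIC differences
    rel_lik = [math.exp(-0.5 * d) for d in deltas]    # relative likelihoods
    total = sum(rel_lik)
    return [r / total for r in rel_lik]

def variable_importance(models, weights):
    """Sum the weights over all models in which each variable occurs."""
    importance = {}
    for variables, w in zip(models, weights):
        for v in variables:
            importance[v] = importance.get(v, 0.0) + w
    return importance

# Hypothetical candidate set: which predictors each model contains, and its AIC.
models = [("x1",), ("x1", "x2"), ("x2", "x3"), ("x1", "x2", "x3")]
aic = [102.0, 100.0, 104.0, 101.5]

weights = akaike_weights(aic)
importance = variable_importance(models, weights)
for name in sorted(importance, key=importance.get, reverse=True):
    print(name, round(importance[name], 3))
```

Note that the summed weights are only comparable across variables if the variables appear in a reasonably balanced number of candidate models, which is one more reason to think carefully about the model set beforehand.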
“If data analysis relies on model selection, then inferences should acknowledge model selection uncertainty. If the goal is to get the best estimates of a set of parameters in common to all models (this includes prediction), model averaging is recommended. If the models have definite, and differing, interpretations as regards understanding relationships among variables, and it is such understanding that is sought, then one wants to identify the best model and make inferences based on that model. […] The bootstrap provides direct, robust estimates of model selection probabilities πi , but we have no reason now to think that use of bootstrap estimates of model selection probabilities rather than use of the Akaike weights will lead to superior unconditional sampling variances or model-averaged parameter estimators. […] Be mindful of possible model redundancy. A carefully thought-out set of a priori models should eliminate model redundancy problems and is a central part of a sound strategy for obtaining reliable inferences. […] Results are sensitive to having demonstrably poor models in the set of models considered; thus it is very important to exclude models that are a priori poor. […] The importance of a small number (R) of candidate models, defined prior to detailed analysis of the data, cannot be overstated. […] One should have R much smaller than n. MMI [Multi-Model Inference] approaches become increasingly important in cases where there are many models to consider.”
“In general there is a substantial amount of model selection uncertainty in many practical problems […]. Such uncertainty about what model structure (and associated parameter values) is the K-L [Kullback–Leibler] best approximating model applies whether one uses hypothesis testing, information-theoretic criteria, dimension-consistent criteria, cross-validation, or various Bayesian methods. Often, there is a nonnegligible variance component for estimated parameters (this includes prediction) due to uncertainty about what model to use, and this component should be included in estimates of precision. […] we recommend assessing model selection uncertainty rather than ignoring the matter. […] It is […] not a sound idea to pick a single model and unquestioningly base extrapolated predictions on it when there is model uncertainty.”
This is a neat little book in the Springer Briefs in Statistics series. The author is David J Bartholomew, a former statistics professor at the LSE. I wrote a brief goodreads review, but I thought that I might as well also add a post about the book here. The book covers topics such as the EM algorithm, Gibbs sampling, the Metropolis–Hastings algorithm and the Rasch model, and it assumes you’re familiar with stuff like how to do ML estimation, among many other things. I had some passing familiarity with many of the topics he talks about in the book, but I’m sure I’d have benefited from knowing more about some of the specific topics covered. Because large parts of the book are basically unreadable by people without a stats background I wasn’t sure how much of it it made sense to cover here, but I decided to talk a bit about a few of the things which I believe don’t require you to know a whole lot about this area.
“Modern statistics is built on the idea of models—probability models in particular. [While I was rereading this part, I was reminded of this quote which I came across while finishing my most recent quotes post: “No scientist is as model minded as is the statistician; in no other branch of science is the word model as often and consciously used as in statistics.” Hans Freudenthal.] The standard approach to any new problem is to identify the sources of variation, to describe those sources by probability distributions and then to use the model thus created to estimate, predict or test hypotheses about the undetermined parts of that model. […] A statistical model involves the identification of those elements of our problem which are subject to uncontrolled variation and a specification of that variation in terms of probability distributions. Therein lies the strength of the statistical approach and the source of many misunderstandings. Paradoxically, misunderstandings arise both from the lack of an adequate model and from over reliance on a model. […] At one level is the failure to recognise that there are many aspects of a model which cannot be tested empirically. At a higher level is the failure to recognise that any model is, necessarily, an assumption in itself. The model is not the real world itself but a representation of that world as perceived by ourselves. This point is emphasised when, as may easily happen, two or more models make exactly the same predictions about the data. Even worse, two models may make predictions which are so close that no data we are ever likely to have can ever distinguish between them. […] All model-dependent inference is necessarily conditional on the model. This stricture needs, especially, to be borne in mind when using Bayesian methods. Such methods are totally model-dependent and thus all are vulnerable to this criticism.
The problem can apparently be circumvented, of course, by embedding the model in a larger model in which any uncertainties are, themselves, expressed in probability distributions. However, in doing this we are embarking on a potentially infinite regress which quickly gets lost in a fog of uncertainty.”
“Mixtures of distributions play a fundamental role in the study of unobserved variables […] The two important questions which arise in the analysis of mixtures concern how to identify whether or not a given distribution could be a mixture and, if so, to estimate the components. […] Mixtures arise in practice because of failure to recognise that samples are drawn from several populations. If, for example, we measure the heights of men and women without distinction the overall distribution will be a mixture. It is relevant to know this because women tend to be shorter than men. […] It is often not at all obvious whether a given distribution could be a mixture […] even a two-component mixture of normals, has 5 unknown parameters. As further components are added the estimation problems become formidable. If there are many components, separation may be difficult or impossible […] [To add to the problem,] the form of the distribution is unaffected by the mixing [in the case of the mixing of normals]. Thus there is no way that we can recognise that mixing has taken place by inspecting the form of the resulting distribution alone. Any given normal distribution could have arisen naturally or be the result of normal mixing […] if f(x) is normal, there is no way of knowing whether it is the result of mixing and hence, if it is, what the mixing distribution might be.”
“Even if there is close agreement between a model and the data it does not follow that the model provides a true account of how the data arose. It may be that several models explain the data equally well. When this happens there is said to be a lack of identifiability. Failure to take full account of this fact, especially in the social sciences, has led to many over-confident claims about the nature of social reality. Lack of identifiability within a class of models may arise because different values of their parameters provide equally good fits. Or, more seriously, models with quite different characteristics may make identical predictions. […] If we start with a model we can predict, albeit uncertainly, what data it should generate. But if we are given a set of data we cannot necessarily infer that it was generated by a particular model. In some cases it may, of course, be possible to achieve identifiability by increasing the sample size but there are cases in which, no matter how large the sample size, no separation is possible. […] Identifiability matters can be considered under three headings. First there is lack of parameter identifiability which is the most common use of the term. This refers to the situation where there is more than one value of a parameter in a given model each of which gives an equally good account of the data. […] Secondly there is what we shall call lack of model identifiability which occurs when two or more models make exactly the same data predictions. […] The third type of identifiability is actually the combination of the foregoing types.
Mathematical statistics is not well-equipped to cope with situations where models are practically, but not precisely, indistinguishable because it typically deals with things which can only be expressed in unambiguously stated theorems. Of necessity, these make clear-cut distinctions which do not always correspond with practical realities. For example, there are theorems concerning such things as sufficiency and admissibility. According to such theorems, for example, a proposed statistic is either sufficient or not sufficient for some parameter. If it is sufficient it contains all the information, in a precisely defined sense, about that parameter. But in practice we may be much more interested in what we might call ‘near sufficiency’ in some more vaguely defined sense. Because we cannot give a precise mathematical definition to what we mean by this, the practical importance of the notion is easily overlooked. The same kind of fuzziness arises with what are called structural equation models (or structural relations models) which have played a very important role in the social sciences. […] we shall argue that structural equation models are almost always unidentifiable in the broader sense of which we are speaking here. […] [our results] constitute a formidable argument against the careless use of structural relations models. […] In brief, the valid use of a structural equations model requires us to lean very heavily upon assumptions about which we may not be very sure. It is undoubtedly true that if such a model provides a good fit to the data, then it provides a possible account of how the data might have arisen. It says nothing about what other models might provide an equally good, or even better fit. As a tool of inductive inference designed to tell us something about the social world, linear structural relations modelling has very little to offer.”
“It is very common for data to be missing and this introduces a risk of bias if inferences are drawn from incomplete samples. However, we are not usually interested in the missing data themselves but in the population characteristics to whose estimation those values were intended to contribute. […] A very longstanding way of dealing with missing data is to fill in the gaps by some means or other and then carry out the standard analysis on the completed data set. This procedure is known as imputation. […] In its simplest form, each missing data point is replaced by a single value. Because there is, inevitably, uncertainty about what the imputed values should be, one can do better by substituting a range of plausible values and comparing the results in each case. This is known as multiple imputation. […] missing values may occur anywhere and in any number. They may occur haphazardly or in some pattern. In the latter case, the pattern may provide a clue to the mechanism underlying the loss of data and so suggest a method for dealing with it. The conditional distribution which we have supposed might be the basis of imputation depends, of course, on the mechanism behind the loss of data. From a practical point of view the detailed information necessary to determine this may not be readily obtainable or, even, necessary. Nevertheless, it is useful to clarify some of the issues by introducing the idea of a probability mechanism governing the loss of data. This will enable us to classify the problems which would have to be faced in a more comprehensive treatment. The simplest, if least realistic approach, is to assume that the chance of being missing is the same for all elements of the data matrix. In that case, we can, in effect, ignore the missing values […] Such situations are designated as MCAR which is an acronym for Missing Completely at Random. […] In the smoking example we have supposed that men are more likely to refuse [to answer] than women. 
If we go further and assume that there are no other biasing factors we are, in effect, assuming that ‘missingness’ is completely at random for men and women, separately. This would be an example of what is known as Missing at Random (MAR) […] which means that the missing mechanism depends on the observed variables but not on those that are missing. The final category is Missing Not at Random (MNAR) which is a residual category covering all other possibilities. This is difficult to deal with in practice unless one has an unusually complete knowledge of the missing mechanism.
Another term used in the theory of missing data is that of ignorability. The conditional distribution of y given x will, in general, depend on any parameters of the distribution of M [the variable we use to describe the mechanism governing the loss of observations] yet these are unlikely to be of any practical interest. It would be convenient if this distribution could be ignored for the purposes of inference about the parameters of the distribution of x. If this is the case the mechanism of loss is said to be ignorable. In practice it is acceptable to assume that the concept of ignorability is equivalent to that of MAR.”
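A small simulation may help illustrate the MAR idea from the smoking example. This is my own sketch, not something from the book: the smoking rates, refusal rates and sample size are all invented, and the stratified estimate below is just one simple way of exploiting the fact that the missingness depends only on an observed variable (sex).

```python
import random

rng = random.Random(1)
N = 20000

# Each row: (is_male, smokes, refused_to_answer). All rates are made up.
rows = []
for _ in range(N):
    male = rng.random() < 0.5
    smokes = rng.random() < (0.4 if male else 0.2)
    # MAR: the chance of refusing depends on sex (observed), not on smoking itself.
    refused = rng.random() < (0.6 if male else 0.1)
    rows.append((male, smokes, refused))

true_rate = sum(s for _, s, _ in rows) / N

# Complete-case analysis: simply ignore everyone who refused.
observed = [(m, s) for m, s, r in rows if not r]
naive_rate = sum(s for _, s in observed) / len(observed)

def rate_within(sex):
    """Smoking rate among respondents of the given sex."""
    answers = [s for m, s in observed if m == sex]
    return sum(answers) / len(answers)

# Sex is known for everyone, so reweight the within-sex rates by the full-sample shares.
share_male = sum(m for m, _, _ in rows) / N
adjusted_rate = share_male * rate_within(True) + (1 - share_male) * rate_within(False)

print(f"true={true_rate:.3f} naive={naive_rate:.3f} adjusted={adjusted_rate:.3f}")
```

Under these assumptions ignoring the missing answers under-represents men and so understates the smoking rate, whereas conditioning on sex roughly recovers it. If refusal also depended on smoking status itself (MNAR), this simple adjustment would no longer be enough.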
(No, not that type of modelling! – I was rather thinking about the type below…)
Anyway, I assume not all readers are equally familiar with this stuff, which I’ve incidentally written about before e.g. here. Some of you will know all this stuff already and you do not need to read on (well, maybe you do – in order to realize that you do not..). Some of it is recap, some of it I don’t think I’ve written about before. Anyway.
i. So, a model is a representation of the world. It’s a simplified version of it, which helps us think about the matters at hand.
ii. Models always have a lot of assumptions. A perhaps surprising observation is that, from a certain point of view, models which might be categorized as more ‘simple’ (few explicit assumptions) can be said to make as many assumptions as do more ‘complex’ models (many explicit assumptions); it’s just that the underlying assumptions are different. To illustrate this, let’s have a look at two different models, model 1 and model 2. Model 1 is a model which states that ‘Y = aX’. Model 2 is a model which states that ‘Y = aX + bZ’.
Model 1 assumes b is equal to 0 so that Z is not a relevant variable to include, whereas model 2 assumes b is not zero – but both models make assumptions about this variable ‘Z’ (and the parameter ‘b’). Models will often differ along such lines, making different assumptions about variables and how they interact (incidentally here we’re implicitly assuming in both models that X and Z are independent). A ‘simple’ model does make fewer (explicit) assumptions about the world than does a ‘complex’ model – but that question is different from the question of which restrictions the two models impose on the data. And thinking in binary terms when we ask ourselves the question, ‘Are we making an assumption about this variable or this relationship?’, then the answer will always be ‘yes’ either way. Does the variable Z contribute information relevant to Y? Does it interact with other variables in the model? Both the simple model and the complex model include assumptions about this stuff. At every branching point where the complex model departs from the simple one, you have one assumption in one model (‘the distinction between f and g matters’, ‘alpha is non-zero’) and another assumption in the other (‘the distinction between f and g doesn’t matter’, ‘alpha is zero’). You always make assumptions, it’s just that the assumptions are different. In simple models assumptions are often not spelled out, which is presumably part of why some of the assumptions made in such models are easy to overlook; it makes sense that they’re not, incidentally, because there’s an infinite number of ways to make adjustments to a model. 
It’s true that branching out does take place in some complex models in ways that do not occur in simple models, and once you’re more than one branching point away from the departure point where the two models first differ then the behaviour of the complex model may start to be determined by additional new assumptions where on the other hand the behaviour of the simple model might still rely on the same assumption that determined the behaviour at the first departure point – so the number of explicit assumptions will be different, but an assumption is made in either case at every junction.
As might be inferred from the comments above, ‘the simple model’ will usually be the one with the more restrictive assumptions, in terms of what the data is ‘allowed’ to do. Fewer assumptions usually means stronger assumptions. It’s a much stronger assumption to assume that e.g. males and females are identical than is the alternative that they are not; there are many ways they could be not identical but only one way in which they can be identical. The restrictiveness of a model does not equal the number of assumptions (explicitly) made. No, on a general note it is rather the case that more assumptions mean that your model becomes less restrictive, because additional assumptions allow for more stuff to vary – this is indeed a big part of why model-builders generally don’t just stick to very simple models; if you do that, you don’t get the details right. Adding more assumptions may allow you to make a more correct model that better explains the data. It is my experience (not that I have much of it, but..) that people who’re unfamiliar with modelling think of additional assumptions as somehow ‘problematic’ – ‘more stuff can go wrong if you add more assumptions, the more assumptions you have the more likely it is that one of them is violated’. The problem is that not making assumptions is not really an option; you’ll basically assume something no matter what you do. ‘That variable/distinction/connection is irrelevant’, which is often the default assumption, is also just that – an assumption. If you do modelling you don’t ever get to not make assumptions, they’re always there lurking in the background whether you like it or not.
iii. A big problem is that we don’t know a priori which assumptions are correct before we’ve actually tested the models – indeed, we often make models mainly in order to figure out which assumptions are correct. (Sometimes we can’t even test the assumptions we’re making in a model, but let’s ignore this problem here…). A more complex model will not always be more correct or perform better. Sometimes it’ll actually do a worse job at explaining the variation in the data than a simple one would have done. When you add more variables to a model, you also add more uncertainty because of things like measurement error. Sometimes it’s worth it, because the new variable explains a lot of the variation in the data. Sometimes it’s not – sometimes the noise you add matters far more than the additional information the variable contributes about how the data behaves.
There are various ways to try to figure out if the amount of noise added from an additional variable is too high for it to be a good idea to include the variable in a model, but they’re not perfect and you always have tradeoffs. There are many different methods to estimate which model performs better, and the different methods apply different criteria – so you can easily get into a situation where the choice of which variable to include in your ‘best model’ depends on e.g. which information criterion you choose to apply.
Anyway the key point is this: You can’t just add everything (all possible variables you could imagine play a role) and assume you’ll be able to explain everything that way – adding another variable may indeed sometimes be a very bad idea.
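To make the remark about information criteria a bit more concrete, here is a small sketch of how two common criteria, AIC and BIC, can end up picking different models. The log-likelihoods, parameter counts and sample size are hypothetical numbers I made up for illustration; only the two formulas are standard.

```python
import math


def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return k * math.log(n) - 2 * log_likelihood


# Hypothetical fits of two nested models to the same n = 200 observations.
n = 200
models = {
    "small": {"log_likelihood": -100.0, "k": 3},
    "large": {"log_likelihood": -96.0, "k": 5},
}

best_by_aic = min(models, key=lambda m: aic(models[m]["log_likelihood"], models[m]["k"]))
best_by_bic = min(models, key=lambda m: bic(models[m]["log_likelihood"], models[m]["k"], n))

if __name__ == "__main__":
    print("AIC prefers:", best_by_aic)
    print("BIC prefers:", best_by_bic)
```

With these made-up numbers AIC prefers the larger model and BIC the smaller one, because BIC penalizes each extra parameter by ln(n) rather than by 2.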
iv. If you test a lot of hypotheses simultaneously, which all have some positive probability of being evaluated as correct, then as you add more variables to your model it becomes more and more likely that at least one of those hypotheses will be evaluated as being correct (relevant link) unless you somehow adjust the probability of a given hypothesis being evaluated as correct as you add more hypotheses along the way. This is another reason adding more variables to a model can sometimes be problematic. There are ways around this particular problem, but if they are not used, which they often are not, then you need to be careful.
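To put rough numbers on this, here is a minimal sketch. It assumes the tests are independent and all use the same per-test false positive rate alpha, which is a simplification of mine; the function names are made up for illustration. The Bonferroni adjustment shown is just one of the possible ways around the problem.

```python
def familywise_error_rate(alpha, m):
    """Probability of at least one false positive among m independent tests."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_alpha(alpha, m):
    """Per-test level that keeps the familywise rate at or below alpha."""
    return alpha / m


if __name__ == "__main__":
    for m in (1, 5, 20, 100):
        unadjusted = familywise_error_rate(0.05, m)
        adjusted = familywise_error_rate(bonferroni_alpha(0.05, m), m)
        print(f"m={m:3d}  unadjusted={unadjusted:.3f}  bonferroni={adjusted:.3f}")
```

Under these assumptions, testing 20 true null hypotheses at the 5% level already gives you a better than even chance of at least one spurious ‘finding’.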
v. Adding more variables is not always preferable, but then what about throwing more data at the problem by adding to the sample size? Surely if you add more data to the sample that should increase your confidence in the model results, right? Well… No – bigger is actually not always better. This is related to the concept of consistency in statistics. “A consistent estimator is one for which, when the estimate is considered as a random variable indexed by the number n of items in the data set, as n increases the estimates converge to the value that the estimator is designed to estimate,” as the wiki article puts it. You can imagine that consistency is one of the key assumptions underlying statistical models – it really is, we care a lot about consistency, and all else equal you should always prefer a consistent estimator to an inconsistent one (however it should be noted that all else is not always equal; a consistent estimator may have larger variance than an inconsistent estimator in a finite sample, which means that we may actually sometimes prefer the latter to the former in specific situations). But the thing is, not all estimators are consistent. There are always some critical assumptions which need to be satisfied in order for the consistency requirement to be met, and in a bad model these requirements will not be met. If you have a bad model, for example if you’ve incorrectly specified the relationships between the variables or included the wrong variables in your model, then increasing the sample size will do nothing to help you – additional data will not somehow magically make the estimates more reliable ‘because of asymptotics’. In fact if your model’s performance is very sensitive to the sample size to which you apply it, it may well indicate that there’s a problem with the model, i.e. that the model is misspecified (see e.g. this).
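A small simulation may make the point about misspecification clearer. This is only a sketch under assumptions I’ve chosen myself: the true effect of x on y is 1.0, there is a second variable z which is correlated with x and also affects y, and the model wrongly leaves z out. The function names are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope from a simple regression of y on a constant and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_short_regression(n, seed=0):
    """Estimate the effect of x on y while wrongly leaving out z."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [0.5 * xi + rng.gauss(0, 1) for xi in x]  # z is correlated with x
    y = [xi + zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]  # true effect of x is 1.0
    return ols_slope(x, y)


if __name__ == "__main__":
    for n in (100, 10_000, 200_000):
        print(n, round(simulate_short_regression(n), 3))
```

In this setup the estimate settles down around 1.5 rather than the true 1.0 as n grows; more data only makes the estimate more precisely wrong.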
vi. Not all model assumptions are equal – some assumptions will usually be much more critical than others. As already mentioned consistency of the estimators is very important, and here it is important to note that not all model assumption violations will lead to inconsistent estimators. An example of a violation which does not is a violation of the homoskedasticity assumption (see also this) in regression analysis. Here you can actually find yourself in a situation where you deliberately apply a model where you know that one of your assumptions about how the data behaves is violated, yet this is not a problem at all because you can correct for the violation separately, so that it is of no practical importance. As already mentioned in the beginning most models will be simplified versions of the stuff that goes on in the real world, so you’ll expect to see some ‘violations’ here and there – the key question to ask here is then, is the violation important and which consequences does it have for the estimates we’ve obtained? If you do not ask yourself such questions when evaluating a model, you may easily end up quibbling about details which don’t really matter. And remember that the assumptions made in a model are not always all spelled out, and that some of the important ones may have been overlooked.
vii. Which causal inferences to make from the model? Correlation != causation. To some extent, the question of how far the statistical link is causal relates to questions pertaining to whether we’ve picked the right variables and the right way to relate them to each other. But as I’ve remarked upon before some model types are better suited for establishing causal links than are others – there are good ways and bad ways to get at the heart of the matter (one application here, I believe I’ve linked to this before). Different fields will often have developed different approaches, see e.g. this, this and this. Correlation on its own will probably tell you next to nothing about anything you might be interested in; as I believe my stats prof put it last semester, ‘we don’t care about correlation, correlation means nothing’. Randomization schemes with treatment groups and control groups are great. If we can’t do those, we can still try to make models to get around the problems. Those models make assumptions, but so do the other models you’re comparing them with and in order to properly evaluate them you need to be explicit about the assumptions made by the competing models as well.
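Here is a rough sketch of the homoskedasticity example. The data-generating process is my own invention (the error spread grows with |x|, the true slope is 2.0), and the correction used is the White (HC0) standard error, which is one common way of dealing with the violation; the function names are hypothetical.

```python
import math
import random


def ols_with_standard_errors(x, y):
    """Return (slope, conventional SE, heteroskedasticity-robust HC0 SE)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (b - my) for d, b in zip(dx, y)) / sxx
    intercept = my - slope * mx
    resid = [b - intercept - slope * a for a, b in zip(x, y)]

    # Conventional formula: assumes one common error variance.
    s2 = sum(e * e for e in resid) / (n - 2)
    se_conventional = math.sqrt(s2 / sxx)

    # White/HC0 formula: lets the error variance differ across observations.
    se_robust = math.sqrt(sum((d * e) ** 2 for d, e in zip(dx, resid))) / sxx
    return slope, se_conventional, se_robust


def simulate(n=20_000, seed=1):
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    # Error spread grows with |x|, so homoskedasticity is violated by design.
    y = [2.0 * xi + abs(xi) * rng.gauss(0, 1) for xi in x]
    return ols_with_standard_errors(x, y)


if __name__ == "__main__":
    slope, se_conv, se_rob = simulate()
    print(f"slope={slope:.3f}  conventional SE={se_conv:.4f}  robust SE={se_rob:.4f}")
```

The slope estimate still lands close to the true value despite the violation; what changes is the standard error, where the robust version comes out noticeably larger than the conventional one in this setup.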
It takes way more time to cover this stuff in detail here than I’m willing to spend on it, but here are a few relevant links to stuff I’m working on/with at the moment:
iii. Kolmogorov–Smirnov test.
iv. Chow test.
vi. Education and health: Evaluating Theories and Evidence, by Cutler & Muney.
vii. Education, Health and Mortality: Evidence from a Social Experiment, by Meghir, Palme & Simeonova.
i. Econometric methods for causal evaluation of education policies and practices: a non-technical guide. This one is ‘work-related’; in one of my courses I’m writing a paper and this working paper is one (of many) of the sources I’m planning on using. Most of the papers I work with are unfortunately not freely available online, which is part of why I haven’t linked to them here on the blog.
I should note that there are no equations in this paper, so you should focus on the words ‘a non-technical guide’ rather than the words ‘econometric methods’ in the title – I think this is a very readable paper for the non-expert as well. I should of course also note that I have worked with most of these methods in a lot more detail, and that without the math it’s very hard to understand the details and really know what’s going on e.g. when applying such methods – or related methods such as IV methods on panel data, a topic which was covered in another class just a few weeks ago but which is not covered in this paper.
This is a place to start if you want to know something about applied econometric methods, particularly if you want to know how they’re used in the field of educational economics, and especially if you don’t have a strong background in stats or math. It should be noted that some of the methods covered see widespread use in other areas of economics as well; IV is widely used, and the difference-in-differences estimator has seen a lot of applications in health economics.
ii. Regulating the Way to Obesity: Unintended Consequences of Limiting Sugary Drink Sizes. The law of unintended consequences strikes again.
You could argue with some of the assumptions made here (e.g. that prices (/oz) remain constant) but I’m not sure the findings are that sensitive to that assumption, and without an explicit model of the pricing mechanism at work it’s mostly guesswork anyway.
iii. A discussion about the neurobiology of memory. Razib Khan posted a short part of the video recently, so I decided to watch it today. A few relevant wikipedia links: Memory, Dead reckoning, Hebbian theory, Caenorhabditis elegans. I’m skeptical, but I agree with one commenter who put it this way: “I know darn well I’m too ignorant to decide whether Randy is possibly right, or almost certainly wrong — yet I found this interesting all the way through.” I also agree with another commenter who mentioned that it’d have been useful for Gallistel to go into details about the differences between short term and long term memory and how these differences relate to the problem at hand.
“An extensive body of prior research indicates an association between emotion and moral judgment. In the present study, we characterized the predictive power of specific aspects of emotional processing (e.g., empathic concern versus personal distress) for different kinds of moral responders (e.g., utilitarian versus non-utilitarian). Across three large independent participant samples, using three distinct pairs of moral scenarios, we observed a highly specific and consistent pattern of effects. First, moral judgment was uniquely associated with a measure of empathy but unrelated to any of the demographic or cultural variables tested, including age, gender, education, as well as differences in “moral knowledge” and religiosity. Second, within the complex domain of empathy, utilitarian judgment was consistently predicted only by empathic concern, an emotional component of empathic responding. In particular, participants who consistently delivered utilitarian responses for both personal and impersonal dilemmas showed significantly reduced empathic concern, relative to participants who delivered non-utilitarian responses for one or both dilemmas. By contrast, participants who consistently delivered non-utilitarian responses on both dilemmas did not score especially high on empathic concern or any other aspect of empathic responding.”
In case you were wondering, the difference hasn’t got anything to do with a difference in the ability to ‘see things from the other guy’s point of view’: “the current study demonstrates that utilitarian responders may be as capable at perspective taking as non-utilitarian responders. As such, utilitarian moral judgment appears to be specifically associated with a diminished affective reactivity to the emotions of others (empathic concern) that is independent of one’s ability for perspective taking”.
On a small sidenote, I’m not really sure I get the authors at all – one of the questions they ask in the paper’s last part is whether ‘utilitarians are simply antisocial?’ This is such a stupid way to frame this I don’t even know how to begin to respond; I mean, utilitarians make better decisions that save more lives, and that’s consistent with them being antisocial? I should think the ‘social’ thing to do would be to save as many lives as possible. Dead people aren’t very social, and when your actions cause more people to die they also decrease the scope for future social interaction.
v. Lastly, some Khan Academy videos:
(This one may be very hard to understand if you haven’t covered this stuff before, but I figured I might as well post it here. If you don’t know e.g. what myosin and actin are you probably won’t get much out of this video. If you don’t watch it, this part of what’s covered is probably the most important part to take away from it.)
It’s been a long time since I checked out the Brit Cruise information theory playlist, and I was happy to learn that he’s updated it and added some more stuff. I like the way he combines historical stuff with a ‘how does it actually work, and how did people realize that’s how it works’ approach – learning how people figured out stuff is to me sometimes just as fascinating as learning what they figured out:
(Relevant wikipedia links: Leyden jar, Electrostatic generator, Semaphore line. Cruise’s play with the cat and the amber may look funny, but there’s a point to it: “The Greek word for amber is ηλεκτρον (“elektron”) and is the origin of the word “electricity”.” – from the first link).
i. Aedes albopictus.
“The Tiger mosquito or forest day mosquito, Aedes albopictus (Stegomyia albopicta), from the mosquito (Culicidae) family, is characterized by its black and white striped legs, and small black and white striped body. It is native to the tropical and subtropical areas of Southeast Asia; however, in the past couple of decades this species has invaded many countries throughout the world through the transport of goods and increasing international travel. This mosquito has become a significant pest in many communities because it closely associates with humans (rather than living in wetlands), and typically flies and feeds in the daytime in addition to at dusk and dawn. The insect is called a tiger mosquito because its striped appearance is similar to a tiger. Aedes albopictus is an epidemiologically important vector for the transmission of many viral pathogens, including the West Nile virus, Yellow fever virus, St. Louis encephalitis, dengue fever, and Chikungunya fever, as well as several filarial nematodes such as Dirofilaria immitis. […]
Aedes albopictus also bites other mammals besides humans and they also bite birds. They are always on the search for a host and are both persistent and cautious when it comes to their blood meal and host location. Their blood meal is often broken off short without enough blood ingested for the development of their eggs. This is why Asian tiger mosquitoes bite multiple hosts during their development cycle of the egg, making them particularly efficient at transmitting diseases. The mannerism of biting diverse host species enables the Asian tiger mosquito to be a potential bridge vector for certain pathogens, for example, the West Nile virus that can jump species boundaries. […]
The Asian tiger mosquito originally came from Southeast Asia. In 1966, parts of Asia and the island worlds of India and the Pacific Ocean were denoted as the area of circulation for the Asian tiger mosquito. Since then, it has spread to Europe, the Americas, the Caribbean, Africa and the Middle East. Aedes albopictus is one of the 100 world’s worst invasive species according to the Global Invasive Species Database. […]
In Europe, the Asian tiger mosquito apparently covers an extensive new niche. This means that there are no native, long-established species that conflict with the dispersal of Aedes albopictus. […]
The Asian tiger mosquito was responsible for the Chikungunya epidemic on the French Island La Réunion in 2005–2006. By September 2006, there were an estimated 266,000 people infected with the virus, and 248 fatalities on the island. The Asian tiger mosquito was also the transmitter of the virus in the first and only outbreak of Chikungunya fever on the European continent. […]
Aedes albopictus has proven to be very difficult to suppress or to control due to their remarkable ability to adapt to various environments, their close contact with humans, and their reproductive biology.”
In case you were wondering, the word Aedes comes from the Greek word for “unpleasant”. So, yeah…
ii. Orbital resonance.
“In celestial mechanics, an orbital resonance occurs when two orbiting bodies exert a regular, periodic gravitational influence on each other, usually due to their orbital periods being related by a ratio of two small integers. The physics principle behind orbital resonance is similar in concept to pushing a child on a swing, where the orbit and the swing both have a natural frequency, and the other body doing the “pushing” will act in periodic repetition to have a cumulative effect on the motion. Orbital resonances greatly enhance the mutual gravitational influence of the bodies, i.e., their ability to alter or constrain each other’s orbits. In most cases, this results in an unstable interaction, in which the bodies exchange momentum and shift orbits until the resonance no longer exists. Under some circumstances, a resonant system can be stable and self-correcting, so that the bodies remain in resonance. Examples are the 1:2:4 resonance of Jupiter‘s moons Ganymede, Europa and Io, and the 2:3 resonance between Pluto and Neptune. Unstable resonances with Saturn‘s inner moons give rise to gaps in the rings of Saturn. The special case of 1:1 resonance (between bodies with similar orbital radii) causes large Solar System bodies to eject most other bodies sharing their orbits; this is part of the much more extensive process of clearing the neighbourhood, an effect that is used in the current definition of a planet.”
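A quick way to see the Jupiter example in numbers is to compare orbital periods. The periods below are approximate values I’ve filled in myself (rounded, in days), not figures from the article, and the helper name is made up.

```python
# Approximate orbital periods in days (my own rounded figures).
PERIODS_DAYS = {"Io": 1.769, "Europa": 3.551, "Ganymede": 7.155}


def period_ratios(periods, reference):
    """Each body's orbital period divided by the reference body's period."""
    base = periods[reference]
    return {name: p / base for name, p in periods.items()}


ratios = period_ratios(PERIODS_DAYS, "Io")

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print(f"{name:9s} period / Io period = {ratio:.3f}")
```

The ratios come out close to 1, 2 and 4, i.e. Ganymede completes roughly one orbit in the time Europa completes two and Io four.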
iii. Some ‘work-blog related links’: Local regression, Quasi-experiment, Nonparametric regression, Regression discontinuity design, Kaplan–Meier estimator, Law of total expectation, Slutsky’s theorem, Difference in differences, Panel analysis.
v. Hill sphere.
“An astronomical body‘s Hill sphere is the region in which it dominates the attraction of satellites. To be retained by a planet, a moon must have an orbit that lies within the planet’s Hill sphere. That moon would, in turn, have a Hill sphere of its own. Any object within that distance would tend to become a satellite of the moon, rather than of the planet itself.
In more precise terms, the Hill sphere approximates the gravitational sphere of influence of a smaller body in the face of perturbations from a more massive body. It was defined by the American astronomer George William Hill, based upon the work of the French astronomer Édouard Roche. For this reason, it is also known as the Roche sphere (not to be confused with the Roche limit). The Hill sphere extends between the Lagrangian points L1 and L2, which lie along the line of centers of the two bodies. The region of influence of the second body is shortest in that direction, and so it acts as the limiting factor for the size of the Hill sphere. Beyond that distance, a third object in orbit around the second (e.g. Jupiter) would spend at least part of its orbit outside the Hill sphere, and would be progressively perturbed by the tidal forces of the central body (e.g. the Sun), eventually ending up orbiting the latter. […]
The Hill sphere is only an approximation, and other forces (such as radiation pressure or the Yarkovsky effect) can eventually perturb an object out of the sphere. This third object should also be of small enough mass that it introduces no additional complications through its own gravity. Detailed numerical calculations show that orbits at or just within the Hill sphere are not stable in the long term; it appears that stable satellite orbits exist only inside 1/2 to 1/3 of the Hill radius.”
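For a sense of scale, the Hill radius is usually approximated (for a near-circular orbit) as a · (m / 3M)^(1/3), where a is the orbital distance, m the mass of the smaller body and M the mass of the larger one. That formula is not part of the excerpt above, and the masses and distances below are rounded values I’ve filled in myself, so treat this as a back-of-the-envelope sketch.

```python
def hill_radius(a, m_small, m_large):
    """Approximate Hill radius for a near-circular orbit, in the units of a."""
    return a * (m_small / (3.0 * m_large)) ** (1.0 / 3.0)


# Rounded values (assumptions for illustration).
EARTH_SUN_DISTANCE_KM = 1.496e8
EARTH_MASS_KG = 5.972e24
SUN_MASS_KG = 1.989e30
MOON_DISTANCE_KM = 384_400

earth_hill_km = hill_radius(EARTH_SUN_DISTANCE_KM, EARTH_MASS_KG, SUN_MASS_KG)
moon_fraction = MOON_DISTANCE_KM / earth_hill_km

if __name__ == "__main__":
    print(f"Earth's Hill radius: about {earth_hill_km:,.0f} km")
    print(f"Moon's distance as a fraction of it: {moon_fraction:.2f}")
```

With these numbers the Earth’s Hill radius comes out at roughly 1.5 million km, and the Moon sits well within the inner third of it, which fits the stability remark at the end of the excerpt.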
I found myself looking up quite a few other astronomy-related articles when I was reading Formation and Evolution of Exoplanets (technically the link is to the 2010 version whereas I was reading the 2008 version, but it doesn’t look as if a whole lot of stuff’s been changed and I can’t find a link to the 2008 version). I haven’t mentioned the book here because I basically gave up reading it midway into the second chapter. The book didn’t try to hide that I probably wasn’t in the intended target group but I decided to give it a try anyway: “This book is intended to suit a readership with a wide range of previous knowledge of planetary science, astrophysics, and scientific programming. Expertise in these fields should not be required to grasp the key concepts presented in the forthcoming chapters, although a reasonable grasp of basic physics is probably essential.” I figured I could grasp the key concepts even though I’d lose out on a lot of details, but the math started getting ugly quite fast, and as I have plenty of ugly math to avoid as it is I decided to give the book a miss (though I did read the first 50 pages or so).
vi. Grover Cleveland (featured).
“Stephen Grover Cleveland (March 18, 1837 – June 24, 1908) was the 22nd and 24th President of the United States. Cleveland is the only president to serve two non-consecutive terms (1885–1889 and 1893–1897) and therefore is the only individual to be counted twice in the numbering of the presidents. He was the winner of the popular vote for president three times—in 1884, 1888, and 1892—and was the only Democrat elected to the presidency in the era of Republican political domination that lasted from 1861 to 1913.
Cleveland was the leader of the pro-business Bourbon Democrats who opposed high tariffs, Free Silver, inflation, imperialism and subsidies to business, farmers or veterans. His battles for political reform and fiscal conservatism made him an icon for American conservatives of the era. Cleveland won praise for his honesty, independence, integrity, and commitment to the principles of classical liberalism. Cleveland relentlessly fought political corruption, patronage, and bossism. Indeed, as a reformer his prestige was so strong that the reform wing of the Republican Party, called “Mugwumps“, largely bolted the GOP ticket and swung to his support in 1884. […]
Cleveland took strong positions and was heavily criticized. His intervention in the Pullman Strike of 1894 to keep the railroads moving angered labor unions nationwide and angered the party in Illinois; his support of the gold standard and opposition to Free Silver alienated the agrarian wing of the Democratic Party. Furthermore, critics complained that he had little imagination and seemed overwhelmed by the nation’s economic disasters—depressions and strikes—in his second term. Even so, his reputation for honesty and good character survived the troubles of his second term. […]
Cleveland’s term as mayor was spent fighting the entrenched interests of the party machines. Among the acts that established his reputation was a veto of the street-cleaning bill passed by the Common Council. The street-cleaning contract was open for bids, and the Council selected the highest bidder, rather than the lowest, because of the political connections of the bidder. While this sort of bipartisan graft had previously been tolerated in Buffalo, Mayor Cleveland would have none of it, and replied with a stinging veto message: “I regard it as the culmination of a most bare-faced, impudent, and shameless scheme to betray the interests of the people, and to worse than squander the public money”. The Council reversed themselves and awarded the contract to the lowest bidder. For this, and several other acts to safeguard the public funds, Cleveland’s reputation as an honest politician began to spread beyond Erie County. […] [As a president…] Cleveland used the veto far more often than any president up to that time. […]
In a 1905 article in The Ladies Home Journal, Cleveland weighed in on the women’s suffrage movement, writing that “sensible and responsible women do not want to vote. The relative positions to be assumed by men and women in the working out of our civilization were assigned long ago by a higher intelligence.””
Here’s what his second cabinet looked like – this was what a presidential cabinet looked like 120 years ago (as always you can click the image to see it in a higher resolution – and just in case you were in doubt: Cleveland is the old white man in the picture…):
vii. Boeing B-52 Stratofortress (‘good article’).
“The Boeing B-52 Stratofortress is a long-range, subsonic, jet-powered strategic bomber. The B-52 was designed and built by Boeing, which has continued to provide support and upgrades. It has been operated by the United States Air Force (USAF) since the 1950s. The bomber carries up to 70,000 pounds (32,000 kg) of weapons.
Beginning with the successful contract bid in June 1946, the B-52 design evolved from a straight-wing aircraft powered by six turboprop engines to the final prototype YB-52 with eight turbojet engines and swept wings. The B-52 took its maiden flight in April 1952. Built to carry nuclear weapons for Cold War-era deterrence missions, the B-52 Stratofortress replaced the Convair B-36. Although a veteran of several wars, the Stratofortress has dropped only conventional munitions in combat. Its Stratofortress name is rarely used outside of official contexts; it has been referred to by Air Force personnel as the BUFF (Big Ugly Fat/Flying Fucker/Fellow). […]
Superior performance at high subsonic speeds and relatively low operating costs have kept the B-52 in service despite the advent of later aircraft, including the cancelled Mach 3 North American XB-70 Valkyrie, the variable-geometry Rockwell B-1B Lancer, and the stealthy Northrop Grumman B-2 Spirit. The B-52 marked its 50th anniversary of continuous service with its original operator in 2005 and after being upgraded between 2013 and 2015 it will serve into the 2040s. […]
B-52 strikes were an important part of Operation Desert Storm. With about 1,620 sorties flown, B-52s delivered 40% of the weapons dropped by coalition forces while suffering only one non-combat aircraft loss, with several receiving minor damage from enemy action. […]
The USAF continues to rely on the B-52 because it remains an effective and economical heavy bomber, particularly in the type of missions that have been conducted since the end of the Cold War against nations that have limited air defense capabilities. The B-52 has the capacity to “loiter” for extended periods over (or even well outside) the battlefield, and deliver precision standoff and direct fire munitions. It has been a valuable asset in supporting ground operations during conflicts such as Operation Iraqi Freedom. The B-52 had the highest mission capable rate of the three types of heavy bombers operated by the USAF in 2001. The B-1 averaged a 53.7% ready rate and the Northrop Grumman B-2 Spirit achieved 30.3%, while the B-52 averaged 80.5% during the 2000–2001 period. The B-52’s $72,000 cost per hour of flight is more than the $63,000 for the B-1B but almost half of the $135,000 of the B-2.”
I’ll just repeat that: $72,000/hour of flight. And the B-2 is at $135,000/hour. War is expensive.
I’ve not had lectures for the last two weeks, but tomorrow the new semester starts.
Like last semester I’ll try to ‘work-blog’ some stuff along the way – hopefully I’ll do it more often than I did, but it’s hard to say if that’s realistic at this point.
I bought the only book I’m required to acquire this semester earlier today:
…and having had a brief look at it I’m already starting to wonder if it was even a good idea to take that course. I’ve been told it’s a very useful course, but I have a nagging suspicion that it may also be quite hard. Here are some of the reasons (click to view in a higher resolution):
I don’t think it’s particularly likely that I’ll cover stuff from that particular course in work-blogs, for perhaps obvious reasons. One problem is the math; WordPress doesn’t handle math very well. Another problem is that most readers would be unlikely to benefit much from such posts unless I were to spend a lot more time on them than I’d like to do. But it’s not my only course this semester. We’ll see how it goes.
“…it’s just a matter of estimating the hazard functions…”
Or something like that. The instructor actually said the words in the post title, but I believe his voice sort of trailed off as he finished the sentence. All the stuff above is from today’s lecture notes, click to enlarge. The quote is from the last part of the lecture, after he’d gone through that stuff.
In the last slide, it should “of course” be ‘Oaxaca Blinder decomposition’, rather than ‘Oaxaca-Bilder’.
What we’re covering right now in class is not something I’ll cover here in detail – it’s very technical stuff. A few excerpts from today’s lecture notes (click to view full size):
Stuff like this is why I actually get a bit annoyed by people who state that their impression is that economics is a relatively ‘soft’ science, and ask questions like ‘the math you guys make use of isn’t all that hard, is it?’ (I’ve been asked this question a few times in the past). It’s actually true that a lot of it isn’t – we spend a lot of time calculating derivatives and finding the signs of those derivatives and similar stuff. And economics is a reasonably heterogeneous field, so surely there’s a lot of variation – for example, in Denmark business graduates often call themselves economists too even though a business graduate’s background, in terms of what we’ve learned during our education, would most often be reasonably different from e.g. my own.
What I’ll just say here is that the statistics stuff generally is not easy (if you think it is, you’ve spent way too little time on that stuff*). And yeah, the above excerpt is from what I consider my ‘easy course’ this semester – most of it is not like that, but some of it sure is.
Incidentally I should just comment in advance here, before people start talking about physics envy (mostly related to macro, IMO (and remember again the field heterogeneity; many, perhaps a majority of, economists don’t specialize in that stuff and don’t really know all that much about it…)), that the complexity economists deal with when they work with statistics – which is also economics – is the same kind of complexity that’s dealt with in all other subject areas where people need to analyze data to reach conclusions about what the data can tell us. Much of the complexity is in the data – the complexity relates to the fact that the real world is complex, and if we want to model it right and get results that make sense, we need to think very hard about which tools to use and how we use them. The economists who decide to work with that kind of stuff, more than they absolutely have to in order to get their degrees that is, are economists who are taught how to analyze data and do it the right way, and how what is the right way may depend upon what kind of data you’re working with and the questions you want to answer. This also involves learning what an Epanechnikov kernel is and what it implies that the error terms of a model are m-dependent.
(*…or (Plamus?) way too much time…)
i. Proportional hazards models. (work-related)
“Proportional hazards models are a class of survival models in statistics. Survival models relate the time that passes before some event occurs to one or more covariates that may be associated with that quantity. In a proportional hazards model, the unique effect of a unit increase in a covariate is multiplicative with respect to the hazard rate. For example, taking a drug may halve one’s hazard rate for a stroke occurring, or, changing the material from which a manufactured component is constructed may double its hazard rate for failure. Other types of survival models such as accelerated failure time models do not exhibit proportional hazards. These models could describe a situation such as a drug that reduces a subject’s immediate risk of having a stroke, but where there is no reduction in the hazard rate after one year for subjects who do not have a stroke in the first year of analysis.”
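The ‘multiplicative with respect to the hazard rate’ part can be sketched in a few lines. The baseline hazard below is a hypothetical one I made up (rising linearly with time), and the coefficient is simply chosen so that the drug halves the hazard, as in the article’s example; nothing here is estimated from data.

```python
import math


def baseline_hazard(t):
    """Hypothetical baseline hazard that rises linearly with time."""
    return 0.1 * t


def hazard(t, x, beta):
    """Proportional hazards form: baseline hazard scaled by exp(beta * x)."""
    return baseline_hazard(t) * math.exp(beta * x)


beta_drug = math.log(0.5)  # chosen so that taking the drug (x = 1) halves the hazard

if __name__ == "__main__":
    for t in (1.0, 5.0, 10.0):
        ratio = hazard(t, 1, beta_drug) / hazard(t, 0, beta_drug)
        print(f"t={t:4.1f}  hazard ratio treated/untreated = {ratio:.2f}")
```

The hazard itself changes over time, but the ratio between treated and untreated stays at 0.5 at every t, which is what ‘proportional’ refers to.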
ii. Radioisotope thermoelectric generator.
“A radioisotope thermoelectric generator (RTG, RITEG) is an electrical generator that obtains its power from radioactive decay. In such a device, the heat released by the decay of a suitable radioactive material is converted into electricity by the Seebeck effect using an array of thermocouples.
RTGs have been used as power sources in satellites, space probes and unmanned remote facilities, such as a series of lighthouses built by the former Soviet Union inside the Arctic Circle. RTGs are usually the most desirable power source for robotic or unmaintained situations needing a few hundred watts (or less) of power for durations too long for fuel cells, batteries, or generators to provide economically, and in places where solar cells are not practical. Safe use of RTGs requires containment of the radioisotopes long after the productive life of the unit. […]
In addition to spacecraft, the Soviet Union constructed many unmanned lighthouses and navigation beacons powered by RTGs. Powered by strontium-90 (90Sr), they are very reliable and provide a steady source of power. Critics argue that they could cause environmental and security problems as leakage or theft of the radioactive material could pass unnoticed for years, particularly as the locations of some of these lighthouses are no longer known due to poor record keeping. In one instance, the radioactive compartments were opened by a thief. In another case, three woodsmen in Georgia came across two ceramic RTG heat sources that had been stripped of their shielding. Two of the three were later hospitalized with severe radiation burns after carrying the sources on their backs. The units were eventually recovered and isolated.
There are approximately 1,000 such RTGs in Russia. All of them have long exhausted their 10-year engineered life spans. They are likely no longer functional, and may be in need of dismantling. Some of them have become the prey of metal hunters, who strip the RTGs’ metal casings, regardless of the risk of radioactive contamination.”
iii. List of unusual deaths. A lot of awesome stuff here. A few examples from the article:
- 1814: London Beer Flood, seven people were killed (some drowned, some died from injuries, and one succumbed to alcohol poisoning) when 323,000 imperial gallons (388,000 US gal; 1,468,000 L) of beer in the Meux and Company Brewery burst out of its vats and gushed into the streets.
- 1912: Franz Reichelt, tailor, fell to his death off the first deck of the Eiffel Tower while testing his invention, the overcoat parachute. It was his first ever attempt with the parachute.
- 1940: Marcus Garvey died due to two strokes after reading a negative premature obituary of himself.
- 1974: Basil Brown, a 48-year-old health food advocate from Croydon, drank himself to death with carrot juice.
- 2007: Jennifer Strange, a 28-year-old woman from Sacramento, California, died of water intoxication while trying to win a Nintendo Wii console in a KDND 107.9 “The End” radio station’s “Hold Your Wee for a Wii” contest, which involved drinking large quantities of water without urinating.
iv. Limnic eruption.
“A limnic eruption, also referred to as a lake overturn, is a rare type of natural disaster in which dissolved carbon dioxide (CO2) suddenly erupts from deep lake water, suffocating wildlife, livestock and humans. Such an eruption may also cause tsunamis in the lake as the rising CO2 displaces water. Scientists believe landslides, volcanic activity, or explosions can trigger such an eruption. Lakes in which such activity occurs may be known as limnically active lakes or exploding lakes.”
v. HeLa. The woman died more than 60 years ago, but some of the descendants of the cancer cells that killed her survive to this day:
“A HeLa cell /ˈhiːlɑː/, also Hela or hela cell, is a cell type in an immortal cell line used in scientific research. It is the oldest and most commonly used human cell line. The line was derived from cervical cancer cells taken on February 8, 1951 from Henrietta Lacks, a patient who eventually died of her cancer on October 4, 1951. The cell line was found to be remarkably durable and prolific as illustrated by its contamination of many other cell lines used in research. […]
HeLa cells, like other cell lines, are termed “immortal” in that they can divide an unlimited number of times in a laboratory cell culture plate as long as fundamental cell survival conditions are met (i.e. being maintained and sustained in a suitable environment). There are many strains of HeLa cells as they continue to evolve in cell cultures, but all HeLa cells are descended from the same tumor cells removed from Mrs. Lacks. It has been estimated that the total number of HeLa cells that have been propagated in cell culture far exceeds the total number of cells that were in Henrietta Lacks’s body. […]
HeLa cells were used by Jonas Salk to test the first polio vaccine in the 1950s. Since that time, HeLa cells have been used for “research into cancer, AIDS, the effects of radiation and toxic substances, gene mapping, and many other scientific pursuits”. According to author Rebecca Skloot, by 2009, “more than 60,000 scientific articles had been published about research done on HeLa, and that number was increasing steadily at a rate of more than 300 papers each month.””
“The result of over 50 years of experiments in the Soviet Union and Russia, the breeding project was set up in 1959 by Soviet scientist Dmitri Belyaev. It continues today at The Institute of Cytology and Genetics at Novosibirsk, under the supervision of Lyudmila Trut. […]
Belyaev believed that the key factor selected for in the domestication of dogs was not size or reproduction, but behavior; specifically, amenability to domestication, or tameability. He selected for low flight distance, that is, the distance one can approach the animal before it runs away. Selecting this behavior mimics the natural selection that must have occurred in the ancestral past of dogs. More than any other quality, Belyaev believed, tameability must have determined how well an animal would adapt to life among humans. Since behavior is rooted in biology, selecting for tameness and against aggression means selecting for physiological changes in the systems that govern the body’s hormones and neurochemicals. Belyaev decided to test his theory by domesticating foxes; in particular, the silver fox, a dark color form of the red fox. He placed a population of them in the same process of domestication, and he decided to submit this population to strong selection pressure for inherent tameness.
The result is that Russian scientists now have a number of domesticated foxes that are fundamentally different in temperament and behavior from their wild forebears. Some important changes in physiology and morphology are now visible, such as mottled or spotted colored fur. Many scientists believe that these changes related to selection for tameness are caused by lower adrenaline production in the new breed, causing physiological changes in very few generations and thus yielding genetic combinations not present in the original species. This indicates that selection for tameness (i.e. low flight distance) produces changes that are also influential on the emergence of other “dog-like” traits, such as raised tail and coming into heat every six months rather than annually.”
vi. Attalus I (featured).
“Attalus I (Greek: Ἄτταλος), surnamed Soter (Greek: Σωτὴρ, “Savior”; 269 BC – 197 BC) ruled Pergamon, an Ionian Greek polis (what is now Bergama, Turkey), first as dynast, later as king, from 241 BC to 197 BC. He was the second cousin and the adoptive son of Eumenes I, whom he succeeded, and was the first of the Attalid dynasty to assume the title of king in 238 BC. He was the son of Attalus and his wife Antiochis.
Attalus won an important victory over the Galatians, newly arrived Celtic tribes from Thrace, who had been, for more than a generation, plundering and exacting tribute throughout most of Asia Minor without any serious check. This victory, celebrated by the triumphal monument at Pergamon (famous for its Dying Gaul) and the liberation from the Gallic “terror” which it represented, earned for Attalus the name of “Soter”, and the title of “king”. A courageous and capable general and loyal ally of Rome, he played a significant role in the first and second Macedonian Wars, waged against Philip V of Macedon. He conducted numerous naval operations, harassing Macedonian interests throughout the Aegean, winning honors, collecting spoils, and gaining for Pergamon possession of the Greek islands of Aegina during the first war, and Andros during the second, twice narrowly escaping capture at the hands of Philip.
Attalus was a protector of the Greek cities of Anatolia and viewed himself as the champion of Greeks against barbarians. During his reign he established Pergamon as a considerable power in the Greek East. He died in 197 BC, shortly before the end of the second war, at the age of 72, having suffered an apparent stroke while addressing a Boeotian war council some months before.”
vii. East African Campaign (World War I).
“The East African Campaign was a series of battles and guerrilla actions which started in German East Africa and ultimately affected portions of Mozambique, Northern Rhodesia, British East Africa, Uganda, and the Belgian Congo. The campaign was effectively ended in November 1917. However, the Germans entered Portuguese East Africa and continued the campaign living off Portuguese supplies.
The strategy of the German colonial forces, led by Lieutenant Colonel (later Generalmajor) Paul Emil von Lettow-Vorbeck, was to drain and divert forces from the Western Front to Africa. His strategy failed to achieve these results after 1916, as mainly Indian and South African forces, which were prevented by colonial policy from deploying to Europe, conducted the rest of the campaign. […]
In this campaign, disease killed or incapacitated 30 men for every man killed in battle on the British side.”
viii. European bison (Wisent). I had never heard about those. Here’s what they look like:
“The European bison (Bison bonasus), also known as wisent ( /ˈviːzənt/ or /ˈwiːzənt/) or the European wood bison, is a Eurasian species of bison. It is the heaviest surviving wild land animal in Europe; a typical European bison is about 2.1 to 3.5 m (7 to 10 ft) long, not counting a tail of 30 to 60 cm (12 to 24 in) long, and 1.6 to 2 m (5 to 7 ft) tall. Weight typically can range from 300 to 920 kg (660 to 2,000 lb), with an occasional big bull to 1,000 kg (2,200 lb) or more. On average, it is slightly lighter in body mass and yet taller at the shoulder than the American bison (Bison bison). Compared to the American species, the Wisent has shorter hair on the neck, head and forequarters, but longer tail and horns.
European bison were hunted to extinction in the wild, with the last wild animals being shot in the Białowieża Forest in Eastern Poland in 1919 and in the Western Caucasus in 1927, but have since been reintroduced from captivity into several countries in Europe, all descendants of the Białowieża or lowland European bison. They are now forest-dwelling. They have few predators (besides humans), with only scattered reports from the 19th century of wolf and bear predation. […]
Historically, the lowland European bison’s range encompassed all lowlands of Europe, extending from the Massif Central to the Volga River and the Caucasus. It may have once lived in the Asiatic part of what is now the Russian Federation. Its range decreased as human populations expanded cutting down forests. The first population to be extirpated was that of Gaul in the 8th century AD. The European bison became extinct in southern Sweden in the 11th century, and southern England in the 12th. The species survived in the Ardennes and the Vosges until the 15th century. In the early middle ages, the wisent apparently still occurred in the forest steppes east of the Ural, in the Altay Mountains and seems to have reached Lake Baikal in the east. The northern boundary in the Holocene was probably around 60°N in Finland.
European bison survived in a few natural forests in Europe but its numbers dwindled. The last European bison in Transylvania died in 1790. In Poland, European bison in the Białowieża Forest were legally the property of the Polish kings until the Third partition of Poland. Wild European bison herds also existed in the forest until the mid-17th century. Polish kings took measures to protect the bison. King Sigismund II Augustus instituted the death penalty for poaching a European bison in Białowieża in the mid-16th century. In the early 19th century, Russian czars retained old Polish laws protecting the European bison herd in Białowieża. Despite these measures and others, the European bison population continued to decline over the following century, with only Białowieża and Northern Caucasus populations surviving into the 20th century.
During World War I, occupying German troops killed 600 of the European bison in the Białowieża Forest for sport, meat, hides, and horns. A German scientist informed army officers that the European bison were facing imminent extinction, but at the very end of the war, retreating German soldiers shot all but 9 animals. The last wild European bison in Poland was killed in 1919, and the last wild European bison in the world was killed by poachers in 1927 in the western Caucasus. By that year fewer than 50 remained, all in zoos.”
Mostly to make clear that even though a low posting frequency often means that I feel less well than I sometimes do, that is not the reason for this last week’s low posting frequency. I’m simply too busy to blog much or do stuff that’s blog-worthy. Didn’t really have a weekend this week at all.
Some random stuff/links:
2. How to mate with King vs King + 2 bishops:
3. Ever wondered what a Vickrey auction is and what the optimal bidding strategy in such an auction is? No? Now you know.
4. How long can people hold their breath under water? (and many other things. The answer of course is: ‘It depends…’)
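On the Vickrey auction in item 3: the short version is that the highest bidder wins but pays the second-highest bid, and bidding your true valuation is then (weakly) the best you can do. Below is a small sketch of my own with made-up valuations and bids; the function names are just for illustration and nothing here is taken from the link.

```python
def vickrey_outcome(bids):
    """Return (winner_index, price): the highest bidder wins and pays the second-highest bid."""
    order = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    winner, runner_up = order[0], order[1]
    return winner, bids[runner_up]

def payoff(my_value, my_bid, other_bids):
    """Payoff to 'me' (bidder 0) with valuation my_value when bidding my_bid."""
    winner, price = vickrey_outcome([my_bid] + list(other_bids))
    return my_value - price if winner == 0 else 0

my_value = 10  # hypothetical valuation
for others in ([7, 4], [12, 4]):  # hypothetical rival bids
    for my_bid in (6, 10, 15):    # underbid, truthful bid, overbid
        print(f"rivals={others} my_bid={my_bid} payoff={payoff(my_value, my_bid, others)}")
```

In both scenarios the truthful bid of 10 does at least as well as the alternatives: underbidding can lose an auction that would have been profitable, and overbidding can win one at a price above what the item is worth to you.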
Or a sample that’s arguably closer than yesterday’s to the kind of stuff I’m actually working with. The pics are from my textbook. Click to view in higher res.
In a couple of months, I’ll probably say that (‘stuff like this’) looks worse than it is. Some of it is quite a bit simpler than it looks, but in general I don’t feel that way right now. Even though we made some progress today there’s still a long way to go.
Stopped working half an hour ago, basically because I couldn’t think straight anymore, not because I wouldn’t like to keep working. On my way to bed. We’re in time trouble and I probably won’t do anything but work and sleep until Friday (not that I’ve been doing all that much else so far); anyway, don’t expect any updates until Friday evening or some time Saturday.
I’ve kept the links somewhat general in order not to give any hints to fellow students finding this blogpost via google (none of them relates to the breakthroughs mentioned below), but these links are a good sample of the kind of stuff I’ve been working with today: 1, 2, 3 (notice how big that file is. We frequently look up stuff here), 4, 5. I’ve chosen links with some degree of formalization, though most of them of course don’t go into all that much detail. Our curriculum in this course consists of a few hundred pages like those.
I’ve just parted ways with my study group (until tomorrow morning) after approximately 12 hours of (almost) completely uninterrupted work. Hopefully we just made two major breakthroughs. We work with (think about, manipulate, program with…) equations such as those in the links (and the related concepts) all the time and we’ve done it for days on end already.
This exam is very hard and I’m very tired. The tired part is not because of lack of sleep, that’s not an issue (yet). It’s because thinking is hard. Also, it’s depressing working with this stuff because I’m pretty sure that for a guy with an IQ of 150-160, most of this stuff is simply just a walk in the park. Right now I kinda feel like the stupid kid in primary school.
1. Trajan. Roman Emperor from 98 AD to 117 AD. This is what the Roman Empire looked like at the end of his reign:
You can file this one under: ‘Yet more stuff I should have learned something about when I was younger.’ Before I started at the university, I learned a lot of the stuff the various schools I was enrolled in had to offer – but I didn’t learn much outside school. I really dislike now that I wasted so much time back then. I still do, btw., i.e. waste a lot of time – old habits die hard, but it’s better than it used to be. No, it’s not that I consider all the time that is spent not collecting knowledge like this wasted, no way; I just don’t have all that many better things to be doing with my time when I’m not doing the stuff I have to do, like studying the stuff that’s actually related to my exams, so my tradeoffs don’t look quite like those of a more ordinary person – who might have, say, a lot of what might be termed ‘social obligations’. I think of reading stuff like this as somehow more virtuous than reading tv-tropes or kibitzing a game of chess between two GMs and most certainly more virtuous than watching an episode of House, which I also happen to be doing every now and then.
Robin Lane Fox did include Trajan’s ruling period in his book but it’s been a while since I read that anyway and there wasn’t a lot of stuff about that guy in there. Here’s one sentence, perhaps not exactly displaying Trajan in the best possible light: “Between May 107 and November 109 Trajan celebrated his conquest of Dacia (modern Romania) with more than twenty weeks of blood sports, showing more than 5,500 pairs of gladiators and killing over 11,000 animals.” Though it should probably also be noted that such ‘blood sports’ were quite popular among the populace as well back then. (how much did I actually quote from that book here on the blog back when I’d read it? I now think perhaps my coverage of the book back then was somewhat lacking, perhaps I should have included more stuff? Well, it’s not too late, if I get ’round to it, maybe..).
2. Ants. File under: ‘These guys are pretty amazing’. There are more than four times as many estimated ant species (22,000) as there are mammal species (5,400) – more than 12,500 ant species have already been classified. They’ve been around for more than 100 million years:
“Ants evolved from a lineage within the vespoid wasps. Phylogenetic analysis suggests that ants arose in the mid-Cretaceous period about 110 to 130 million years ago. After the rise of flowering plants about 100 million years ago they diversified and assumed ecological dominance around 60 million years ago.”
According to one of the sources cited in the article:
“Ants are arguably the greatest success story in the history of terrestrial metazoa. On average, ants monopolize 15–20% of the terrestrial animal biomass, and in tropical regions where ants are especially abundant, they monopolize 25% or more.”
4. Autoregressive model. ‘The type of stuff people like me work with on a near-daily basis’. [‘economics? That’s a bit like philosophy, right?’ – I got that comment once not long ago out in the Real World. In some ways it kinda is, sort of, or there are at least some elements the two systems have in common within relevant subsystems; but if you actually ask a question like that the answer will always be ‘No’.]
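For anyone who hasn't met one of these: below is a minimal sketch of the simplest case, an AR(1) model y_t = c + φ·y_{t-1} + ε_t, first simulated and then estimated by ordinary least squares. The parameter values, the seed and the function names are all made up for illustration; this is not code from the article or from my coursework.

```python
import random

def simulate_ar1(phi, c, sigma, n, seed=0):
    """Simulate n observations of y_t = c + phi * y_{t-1} + eps_t, eps_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    y = [c / (1 - phi)]  # start at the unconditional mean (requires |phi| < 1)
    for _ in range(n - 1):
        y.append(c + phi * y[-1] + rng.gauss(0.0, sigma))
    return y

def estimate_ar1(y):
    """OLS regression of y_t on a constant and y_{t-1}; returns (c_hat, phi_hat)."""
    lagged, current = y[:-1], y[1:]
    mean_lag = sum(lagged) / len(lagged)
    mean_cur = sum(current) / len(current)
    sxx = sum((a - mean_lag) ** 2 for a in lagged)
    sxy = sum((a - mean_lag) * (b - mean_cur) for a, b in zip(lagged, current))
    phi_hat = sxy / sxx
    return mean_cur - phi_hat * mean_lag, phi_hat

y = simulate_ar1(phi=0.7, c=1.0, sigma=1.0, n=5000)
c_hat, phi_hat = estimate_ar1(y)
print(f"c_hat={c_hat:.3f} phi_hat={phi_hat:.3f}")
```

With a few thousand observations the estimates should land close to the values used in the simulation; the real work in a course like this is about what happens when the assumptions behind that simple regression fail.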
5. International Space Station. A featured article. Some stats:
Mass: 369,914 kg
Length: 51 m
Width: 109 m
“The cost of the station has been estimated by ESA as €100 billion over 30 years, and, although estimates range from 35 to 160 billion US dollars, the ISS is believed to be the most expensive object ever constructed.”
The link in the article states that: “The European share, at around 8 billion Euros spread over the whole programme, amounts to just one Euro spent by every European every year…”
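A quick back-of-the-envelope check of that 'one Euro per European per year' claim. The 8 billion figure is from the quote; the 30-year programme length is my assumption, borrowed from the ESA cost estimate quoted above, and the helper function is just for illustration.

```python
def implied_population(total_cost_eur, programme_years, eur_per_person_per_year=1.0):
    """Number of people needed for the total to equal the stated per-person yearly cost."""
    cost_per_year = total_cost_eur / programme_years
    return cost_per_year / eur_per_person_per_year

european_share_eur = 8e9  # "around 8 billion Euros" (from the quote)
programme_years = 30      # ASSUMPTION: taken from the "over 30 years" estimate above

people = implied_population(european_share_eur, programme_years)
print(f"Implied number of Europeans: {people / 1e6:.0f} million")
```

Under that assumption the claim implies a paying population in the high hundreds of millions, which is at least the right order of magnitude for the ESA member states; a shorter programme length would push the implied number up.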
One of the great benefits of experimental research is that, in principle, we can repeat the experiment and generate a fresh set of data. While this is impossible for many questions in social science, at a minimum one would hope that we could replicate our original results using the same dataset. As many students in Gov 2001 can tell you, however, social science often fails to clear even that low bar.
Of course, even this type of replication is impossible if someone else has changed the dataset since the original analysis was conducted. But that would never happen, right?