This will be my last post about the book. Below I have included some observations from the last 100 pages.
“A central theme in this volume is the fact that we usually prefer to work with effect sizes, rather than p-values. [...] While we would argue that researchers should shift their focus to effect sizes even when working entirely with primary studies, the shift is absolutely critical when our goal is to synthesize data from multiple studies. A narrative reviewer who works with p-values (or with reports that were based on p-values) and uses these as the basis for a synthesis, is facing an impossible task. Where people tend to misinterpret a single p-value, the problem is much worse when they need to compare a series of p-values. [...] the p-value is often misinterpreted. Because researchers care about the effect size, they tend to take whatever information they have and press it into service as an indicator of effect size. A statistically significant p-value is assumed to reflect a clinically important effect, and a nonsignificant p-value is assumed to reflect a trivial (or zero) effect. However, these interpretations are not necessarily correct. [...] The narrative review typically works with p-values (or with conclusions that are based on p-values), and therefore lends itself to [...] mistakes. p-values that differ are assumed to reflect different effect sizes but may not [...], p-values that are the same are assumed to reflect similar effect sizes but may not [...], and a more significant p-value is assumed to reflect a larger effect size when it may actually be based on a smaller effect size [...]. By contrast, the meta-analysis works with effect sizes. As such it not only focuses on the question of interest (what is the size of the effect) but allows us to compare the effect size from study to study.”
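To make the quoted point concrete, here's a quick Python sketch (made-up numbers, normal-theory p-values) showing that the same effect size can yield very different p-values depending on precision, and that a smaller effect can produce a *more* significant p-value than a larger one:

```python
from math import erfc, sqrt

def two_sided_p(effect, se):
    """Two-sided p-value for a normal test statistic (effect / standard error)."""
    return erfc(abs(effect) / se / sqrt(2))

# Same effect size, different precision -> wildly different p-values.
p_precise   = two_sided_p(0.5, 0.1)   # z = 5.0, p ~ 6e-7
p_imprecise = two_sided_p(0.5, 0.3)   # z ~ 1.67, p ~ 0.10 (nonsignificant)

# A smaller effect can nevertheless have the more significant p-value.
p_small_effect = two_sided_p(0.3, 0.1)   # z = 3.0, p ~ 0.003
p_large_effect = two_sided_p(1.0, 0.6)   # z ~ 1.67, p ~ 0.10
```

Note that `p_imprecise` and `p_large_effect` are identical (same z-score) despite the underlying effects being 0.5 and 1.0 – exactly the kind of comparison a p-value-based narrative review gets wrong.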
“To compute the summary effect in a meta-analysis we compute an effect size for each study and then combine these effect sizes, rather than pooling the data directly. [...] This approach allows us to study the dispersion of effects before proceeding to the summary effect. For a random-effects model this approach also allows us to incorporate the between-studies dispersion into the weights. There is one additional reason for using this approach [...]. The reason is to ensure that each effect size is based on the comparison of a group with its own control group, and thus avoid a problem known as Simpson’s paradox. In some cases, particularly when we are working with observational studies, this is a critically important feature. [...] The term paradox refers to the fact that one group can do better in every one of the included studies, but still do worse when the raw data are pooled. The problem is not limited to studies that use proportions, but can exist also in studies that use means or other indices. The problem exists only when the base rate (or mean) varies from study to study and the proportion of participants from each group varies as well. For this reason, the problem is generally limited to observational studies, although it can exist in randomized trials when allocation ratios vary from study to study.” [See the wiki article for more]
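The paradox is easy to reproduce. Here's a small Python illustration using the classic kidney-stone data (Charig et al., 1986), treating the two stone-size strata as two 'studies': treatment A wins within each study yet loses when the raw counts are pooled, because the base rates and the allocation ratios both vary across studies:

```python
# (events_A, n_A, events_B, n_B) for each "study"; treatment A has the
# higher success rate within both, but draws most of its patients from
# the harder (low base-rate) stratum.
studies = [
    (81, 87, 234, 270),   # small stones: A 93% vs B 87%
    (192, 263, 55, 80),   # large stones: A 73% vs B 69%
]

within_study_wins = all(ea / na > eb / nb for ea, na, eb, nb in studies)

# Naive pooling of the raw data reverses the comparison.
pooled_a = sum(s[0] for s in studies) / sum(s[1] for s in studies)  # 273/350 = 0.78
pooled_b = sum(s[2] for s in studies) / sum(s[3] for s in studies)  # 289/350 ~ 0.83
```

Computing an effect size per study and then combining those, as a meta-analysis does, keeps each group compared against its own control and avoids the reversal.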
“When studies are addressing the same outcome, measured in the same way, using the same approach to analysis, but presenting results in different ways, then the only obstacles to meta-analysis are practical. If sufficient information is available to estimate the effect size of interest, then a meta-analysis is possible. [...]
When studies are addressing the same outcome, measured in the same way, but using different approaches to analysis, then the possibility of a meta-analysis depends on both statistical and practical considerations. One important point is that all studies in a meta-analysis must use essentially the same index of treatment effect. For example, we cannot combine a risk difference with a risk ratio. Rather, we would need to use the summary data to compute the same index for all studies.
There are some indices that are similar, if not exactly the same, and judgments are required as to whether it is acceptable to combine them. One example is odds ratios and risk ratios. When the event is rare, then these are approximately equal and can readily be combined. As the event gets more common the two diverge and should not be combined. Other indices that are similar to risk ratios are hazard ratios and rate ratios. Some people decide these are similar enough to combine; others do not. The judgment of the meta-analyst in the context of the aims of the meta-analysis will be required to make such decisions on a case by case basis.
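The rare-event point is easy to verify numerically; a quick sketch with invented risks:

```python
def risk_ratio(p1, p2):
    return p1 / p2

def odds_ratio(p1, p2):
    return (p1 / (1 - p1)) / (p2 / (1 - p2))

# Rare event: the odds ratio is a fine stand-in for the risk ratio.
rr_rare = risk_ratio(0.02, 0.01)    # 2.0
or_rare = odds_ratio(0.02, 0.01)    # ~2.02

# Common event: same risk ratio, but the odds ratio diverges badly.
rr_common = risk_ratio(0.60, 0.30)  # 2.0
or_common = odds_ratio(0.60, 0.30)  # 3.5
```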
When studies are addressing the same outcome measured in different ways, or different outcomes altogether, then the suitability of a meta-analysis depends mainly on substantive considerations. The researcher will have to decide whether a combined analysis would have a meaningful interpretation. [...] There is a useful class of indices that are, perhaps surprisingly, combinable under some simple transformations. In particular, formulas are available to convert standardized mean differences, odds ratios and correlations to a common metric [I should note that the book covers these data transformations, but I decided early on not to talk about that kind of stuff in my posts because it’s highly technical and difficult to blog] [...] These kinds of conversions require some assumptions about the underlying nature of the data, and violations of these assumptions can have an impact on the validity of the process. [...] A report should state the computational model used in the analysis and explain why this model was selected. A common mistake is to use the fixed-effect model on the basis that there is no evidence of heterogeneity. As [already] explained [...], the decision to use one model or the other should depend on the nature of the studies, and not on the significance of this test [because the test will often have low power anyway]. [...] The report of a meta-analysis should generally include a forest plot.”
“The issues addressed by a sensitivity analysis for a systematic review are similar to those that might be addressed by a sensitivity analysis for a primary study. That is, the focus is on the extent to which the results are (or are not) robust to assumptions and decisions that were made when carrying out the synthesis. The kinds of issues that need to be included in a sensitivity analysis will vary from one synthesis to the next. [...] One kind of sensitivity analysis is concerned with the impact of decisions that lead to different data being used in the analysis. A common example of sensitivity analysis is to ask how results might have changed if different study inclusion rules had been used. [...] Another kind of sensitivity analysis is concerned with the impact of the statistical methods used [...] For example one might ask whether the conclusions would have been different if a different effect size measure had been used [...] Alternatively, one might ask whether the conclusions would be the same if fixed-effect versus random-effects methods had been used. [...] Yet another kind of sensitivity analysis is concerned with how we addressed missing data [...] A very important form of missing data is the missing data on effect sizes that may result from incomplete reporting or selective reporting of statistical results within studies. When data are selectively reported in a way that is related to the magnitude of the effect size (e.g., when results are only reported when they are statistically significant), such missing data can have biasing effects similar to publication bias on entire studies. In either case, we need to ask how the results would have changed if we had dealt with missing data in another way.”
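One of the sensitivity checks mentioned – fixed-effect versus random-effects – is simple to run yourself. Here's a minimal sketch using the standard DerSimonian–Laird estimator of the between-studies variance, with invented effect sizes and standard errors; when the studies are heterogeneous, the random-effects weights pull weight away from the most precise study and the two summary estimates can differ substantially:

```python
def summary_effect(studies, tau2=0.0):
    """Inverse-variance weighted mean of (effect, se) pairs;
    tau2 > 0 gives random-effects weights."""
    w = [1.0 / (se ** 2 + tau2) for _, se in studies]
    return sum(wi * e for wi, (e, _) in zip(w, studies)) / sum(w)

def dl_tau2(studies):
    """DerSimonian-Laird estimate of the between-studies variance tau^2."""
    w = [1.0 / se ** 2 for _, se in studies]
    fixed = summary_effect(studies)
    q = sum(wi * (e - fixed) ** 2 for wi, (e, _) in zip(w, studies))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    return max(0.0, (q - (len(studies) - 1)) / c)

# Invented, deliberately heterogeneous studies: (effect, standard error).
studies = [(0.1, 0.05), (0.5, 0.15), (0.9, 0.20)]
tau2 = dl_tau2(studies)                     # ~0.15 -> real heterogeneity
fixed_est = summary_effect(studies)         # ~0.18, dominated by the precise study
random_est = summary_effect(studies, tau2)  # ~0.47, weights far more even
```

If the conclusions survive both sets of weights, that is reassuring; if not, the discrepancy itself is worth reporting.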
“A cumulative meta-analysis is a meta-analysis that is performed first with one study, then with two studies, and so on, until all relevant studies have been included in the analysis. As such, a cumulative analysis is not a different analytic method than a standard analysis, but simply a mechanism for displaying a series of separate analyses in one table or plot. When the series are sorted into a sequence based on some factor, the display shows how our estimate of the effect size (and its precision) shifts as a function of this factor. When the studies are sorted chronologically, the display shows how the evidence accumulated, and how the conclusions may have shifted, over a period of time.”
“While cumulative analyses are most often used to display the pattern of the evidence over time, the same technique can be used for other purposes as well. Rather than sort the data chronologically, we can sort it by any variable, and then display the pattern of effect sizes. For example, assume that we have 100 studies that looked at the impact of homeopathic medicines, and we think that the effect is related to the quality of the blinding process. We anticipate that studies with complete blinding will show no effect, those with lower quality blinding will show a minor effect, those that blind only some people will show a larger effect, and so on. We could sort the studies based on the quality of the blinding (from high to low), and then perform a cumulative analysis. [...] Similarly, we could use cumulative analyses to display the possible impact of publication bias. [...] large studies are assumed to be unbiased, but the smaller studies may tend to over-estimate the effect size. We could perform a cumulative analysis, entering the larger studies at the top and adding the smaller studies at the bottom. If the effect was initially small when the large (nonbiased) studies were included, and then increased as the smaller studies were added, we would indeed be concerned that the effect size was related to sample size. A benefit of the cumulative analysis is that it displays not only if there is a shift in effect size, but also the magnitude of the shift. [...] It is important to recognize that cumulative meta-analysis is a mechanism for display, rather than analysis. [...] These kinds of displays are compelling and can serve an important function. However, if our goal is actually to examine the relationship between a factor and effect size, then the appropriate analysis is a meta-regression”
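A cumulative display of exactly this small-study pattern takes only a few lines. The numbers below are invented so that the small (high-standard-error) studies report the biggest effects; entering studies from largest to smallest makes the cumulative fixed-effect estimate drift upward, which is the warning sign described above:

```python
def summary_effect(studies):
    """Fixed-effect (inverse-variance weighted) summary of (effect, se) pairs."""
    w = [1.0 / se ** 2 for _, se in studies]
    return sum(wi * e for wi, (e, _) in zip(w, studies)) / sum(w)

# Invented studies sorted large -> small (standard error grows as n shrinks);
# the small studies conveniently report the biggest effects.
studies = [(0.10, 0.05), (0.12, 0.08), (0.25, 0.15), (0.40, 0.20), (0.55, 0.25)]

cumulative = [summary_effect(studies[:k + 1]) for k in range(len(studies))]
# cumulative rises steadily from 0.10 towards ~0.14 as the small studies enter
```

As the book notes, a display like this is suggestive rather than conclusive; a meta-regression of effect size on precision would be the actual analysis.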
“John C. Bailar, in an editorial for the New England Journal of Medicine (Bailar, 1997), [wrote] that mistakes [...] are common in meta-analysis. He argues that a meta-analysis is inherently so complicated that mistakes by the persons performing the analysis are all but inevitable. He also argues that journal editors are unlikely to uncover all of these mistakes. [...] The specific points made by Bailar about problems with meta-analysis are entirely reasonable. He is correct that many meta-analyses contain errors, some of them important ones. His list of potential (and common) problems can serve as a bullet list of mistakes to avoid when performing a meta-analysis. However, the mistakes cited by Bailar are flaws in the application of the method, rather than problems with the method itself. Many primary studies suffer from flaws in the design, analyses, and conclusions. In fact, some serious kinds of problems are endemic in the literature. The response of the research community is to locate these flaws, consider their impact for the study in question, and (hopefully) take steps to avoid similar mistakes in the future. In the case of meta-analysis, as in the case of primary studies, we cannot condemn a method because some people have used that method improperly. [...] In his editorial Bailar concludes that, until such time as the quality of meta-analyses is improved, he would prefer to work with the traditional narrative reviews [...] We disagree with the conclusion that narrative reviews are preferable to systematic reviews, and that meta-analyses should be avoided. The narrative review suffers from every one of the problems cited for the systematic review. The only difference is that, in the narrative review, these problems are less obvious. [...] the key advantage of the systematic approach of a meta-analysis is that all steps are clearly described so that the process is transparent.”
It’s been a long time since I had one of these. Questions? Comments? Random observations?
I hate posting posts devoid of content, so here’s some random stuff:
If you think the stuff above is all fun and games, I should note that the topic of chirality, which is one of the things talked about in the lecture above, was actually covered in some detail in Gale’s book, which is hardly a book that spends a great deal of time on esoteric mathematical concepts. On a related note, the main reason why I have not blogged that book is that I lost all the notes and highlights I’d made in the first 200 pages when my computer broke down, and I just can’t face reading the book again simply in order to blog it. It’s a good book with interesting stuff, and I may decide to blog it later, but I don’t feel like doing it at the moment; without highlights and notes it’s a real pain to blog a book, and right now it’s just not worth it to reread it. Rereading books can be fun – I’ve incidentally been rereading Darwin lately and may decide to blog that book soon; I imagine I might also choose to reread some of Asimov’s books before long – but it’s not much fun when you find yourself having to do it simply because the computer deleted your work.
Here’s the abstract:
“Statistical power analysis provides the conventional approach to assess error rates when designing a research study. However, power analysis is flawed in that a narrow emphasis on statistical significance is placed as the primary focus of study design. In noisy, small-sample settings, statistically significant results can often be misleading. To help researchers address this problem in the context of their own studies, we recommend design calculations in which (a) the probability of an estimate being in the wrong direction (Type S [sign] error) and (b) the factor by which the magnitude of an effect might be overestimated (Type M [magnitude] error or exaggeration ratio) are estimated. We illustrate with examples from recent published research and discuss the largest challenge in a design calculation: coming up with reasonable estimates of plausible effect sizes based on external information.”
If a study has low power, you can get into a lot of trouble. Some problems are well known, others probably aren’t. A bit more from the paper:
“design calculations can reveal three problems:
1. Most obvious, a study with low power is unlikely to “succeed” in the sense of yielding a statistically significant result.
2. It is quite possible for a result to be significant at the 5% level — with a 95% confidence interval that entirely excludes zero — and for there to be a high chance, sometimes 40% or more, that this interval is on the wrong side of zero. Even sophisticated users of statistics can be unaware of this point — that the probability of a Type S error is not the same as the p value or significance level.
3. Using statistical significance as a screener can lead researchers to drastically overestimate the magnitude of an effect (Button et al., 2013).
Design analysis can provide a clue about the importance of these problems in any particular case.”
“Statistics textbooks commonly give the advice that statistical significance is not the same as practical significance, often with examples in which an effect is clearly demonstrated but is very small [...]. In many studies in psychology and medicine, however, the problem is the opposite: an estimate that is statistically significant but with such a large uncertainty that it provides essentially no information about the phenomenon of interest. [...] There is a range of evidence to demonstrate that it remains the case that too many small studies are done and preferentially published when “significant.” We suggest that one reason for the continuing lack of real movement on this problem is the historic focus on power as a lever for ensuring statistical significance, with inadequate attention being paid to the difficulties of interpreting statistical significance in underpowered studies. Because insufficient attention has been paid to these issues, we believe that too many small studies are done and preferentially published when “significant.” There is a common misconception that if you happen to obtain statistical significance with low power, then you have achieved a particularly impressive feat, obtaining scientific success under difficult conditions.
However, that is incorrect if the goal is scientific understanding rather than (say) publication in a top journal. In fact, statistically significant results in a noisy setting are highly likely to be in the wrong direction and invariably overestimate the absolute values of any actual effect sizes, often by a substantial factor.”
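The Type S / Type M point is easy to check by simulation. Below is a Monte-Carlo sketch (not the paper’s closed-form ‘retrodesign’ calculation, and the numbers are invented) of a noisy setting: a true effect of 0.1 measured with a standard error of 0.5, i.e. a badly underpowered study:

```python
import random

def retrodesign_sim(true_effect, se, z_crit=1.96, n_sims=100_000, seed=1):
    """Simulate estimates ~ N(true_effect, se) and summarize the statistically
    significant ones: power, Type S (wrong sign) rate, exaggeration ratio."""
    rng = random.Random(seed)
    sig = [est for est in (rng.gauss(true_effect, se) for _ in range(n_sims))
           if abs(est) / se > z_crit]
    power = len(sig) / n_sims
    type_s = sum(est * true_effect < 0 for est in sig) / len(sig)   # wrong sign
    type_m = sum(abs(est) for est in sig) / len(sig) / abs(true_effect)
    return power, type_s, type_m

power, type_s, type_m = retrodesign_sim(0.1, 0.5)
# power ~5%; roughly a quarter to a third of the significant results have
# the wrong sign, and on average they overstate the true effect ~tenfold
```

This is the paper’s argument in miniature: conditioning on significance in a noisy setting guarantees exaggeration, and makes sign errors disturbingly likely.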
iii. I’m sure most people who might be interested in following the match are already well aware that Anand and Carlsen are currently competing for the world chess championship, and I’m not going to talk about that match here. However I do want to mention to people interested in improving their chess that I recently came across this site, and that I quite like it. It only deals with endgames, but endgames are really important. If you don’t know much about endgames you may find the videos available here, here and here to be helpful.
iv. A link: Cross Validated: “Cross Validated is a question and answer site for people interested in statistics, machine learning, data analysis, data mining, and data visualization.”
A friend recently told me about this resource. I knew about the existence of StackExchange, but I haven’t really spent much time there. These days I mostly stick to books and a few sites I already know about; I rarely look for new interesting stuff online. This also means you should not automatically assume I surely already know about X when you’re considering whether to tell me about X in an Open Thread.
Female Infidelity and Paternal Uncertainty – Evolutionary Perspectives on Male Anti-Cuckoldry Tactics
“A couple of chapters were really nice, but the authors repeat themselves *a lot* throughout the book and some chapters are really weak. I was probably at three stars after approximately 100 pages, but the book in my opinion lost steam after that. A couple of chapters are in my opinion really poor – basically they’re just a jumble of data-poor theorizing which is most likely just plain wrong. A main hypothesis presented in one of the chapters is frankly blatantly at odds with a lot of other evidence, some of which is even covered earlier in the same work, but the authors don’t even mention this in the coverage.
I don’t regret reading the book, but it’s not that great.”
Let’s say you have a book which covers Hrdy’s idea that it has long been in the interest of human females to confuse paternity by various means, e.g. through extra-pair copulations, because such behaviour reduces the risk of infanticide (I’ve talked about these things before here on the blog – if you’re unfamiliar with this work and haven’t read my posts on the topic, see for example this post), and which also covers various other reasons why females may choose to engage in extra-pair copulations (e.g. ‘genetic benefits’).

Let’s say that in another, problematic, chapter of said book, a theory is proposed that ‘unfamiliar sperm’ (sperm from an individual the female has not had regular sex with before) leading to pregnancy is more likely to lead to preeclampsia, a pregnancy complication which, if untreated, will often lead to the abortion of the fetus. Let’s say the authors claim in that chapter that the reason females are more likely to develop preeclampsia in a pregnancy involving unfamiliar sperm is that such a pregnancy is likely to be the result of rape, and that the physiological mechanism leading to the pregnancy complication is an evolved strategy on the part of the female, aimed at enabling her to exercise (post-copulatory) mate choice and reduce the negative fitness consequences of the rape.

Let’s say the authors of the preeclampsia chapter don’t talk at all about e.g. genetic benefits derived from extra-pair copulations which are not caused by rape but are engaged in willingly by the female because it’s in her reproductive interests, and that the presumably common evolutionary female strategy of finding a semi-decent provider male as a long-term partner while occasionally sleeping around with high-quality males (and low-quality males – but only when not fertile, e.g. when pregnant…) and having their children without the provider male knowing about it is not even mentioned.
Assume the authors of the chapter seem to assume that getting a child by a male with unfamiliar sperm is always a bad idea.
Yeah, the above is what happened in this book, and it’s part of why it only gets two stars. These people are way too busy theorizing, and that specific theory is really poor – or at least the coverage of it is, as the authors don’t address obvious issues which people who read the other chapters would have no trouble spotting. Kappeler et al. is a much better book, and it turns out that there was much less new stuff in this book than I’d thought – a lot of ‘the good stuff’ is also covered there.
It doesn’t help that many of the authors systematically overestimate the extra-pair paternity rate by relying on samples/studies which are obviously deeply suspect due to selection bias. Not all of them go overboard and claim the number is 10% or something like that, but many of them do – ‘the number is between 1-30%, with best estimates around 10%’ is a conclusion drawn in at least a couple of chapters. This is wrong. Only one contributor discussing these numbers comes to the conclusion that the average number is likely to have been less than 5% in an evolutionary context (“Only very tentative conclusions about typical EPP [extra-pair paternity] rates throughout recent human history (e.g. in the past 50 000 years) can be drawn [...] It seems reasonable to suggest that rates have typically been less than 10% and perhaps in most cases less than 5%. It also seems reasonable to suggest that they have probably also been variable across time and place, with some populations characterized by rates of 10% or higher.”). An idea worth mentioning in this context is that human behaviour can easily have been dramatically impacted by things which rarely happen now, because the reason those things are rare may well be that a lot of behaviour is aimed at making sure they stay rare – this idea should be well known to people familiar with Hrdy’s thesis, and it seems to me to apply to cuckoldry as well; cuckoldry may happen relatively infrequently, but perhaps the reason is that human males with female partners are really careful not to allow their partners to sleep around quite as much as their genetic code might like them to. I mentioned in the coverage of Kappeler et al. that female sexual preferences change over the course of the menstrual cycle – they also talk about this in this book, but a related observation made here is that males seem to be more vigilant and to intensify their mate guarding when their partner is ovulating. There’s probably a lot of stuff going on ‘behind the scenes’ which we humans are not aware of. Human behaviour is really complicated.
All these things said, there’s some really nice stuff in the book as well. The basic idea behind much of the coverage is that whereas females always know that their children are their children, males can never know for sure – and in a context where males may derive a fitness benefit from contributing to their offspring and a fitness loss from contributing to another male’s child, this uncertainty is highly relevant to how they might choose to behave in many contexts related to partnership dynamics. Many different aspects of the behaviour of human males are to some extent directed towards minimizing the risk of being cuckolded and/or the risk that a partner in whom they have invested will leave them. They may choose to hide the female partner from competitors, e.g. by monopolizing her time or by using violence to keep her from interacting with male competitors; they may signal to competitors that she is taken and/or that it may be costly to try to have sex with her (threats to other males, violence directed towards the competitor rather than the partner); they may try to isolate her socially by badmouthing her to potential competitors (e.g. male friends and acquaintances). On a more positive note, males may also choose to do ‘nice things’ to keep the partner from leaving, like ‘giving in to sexual requests’ and ‘performing sexual favors to keep her around’ (in at least one study, “men partnered to women who [were] more likely to be sexually unfaithful [were] also more likely to perform sexual inducements to retain their partners” – but before women reading this conclude that their incentives may look rather different from what they thought, it’s probably worth noting that the risk of abuse also goes up when the male thinks the partner might be unfaithful (see below)).
If the first anti-cuckoldry approach, the mate-guarding strategy of trying to keep her from having sex with others, fails, then the male has additional options – one conceptualization in the book splits the strategy choices into three groups: mate-guarding strategies, intra-vaginal strategies and post-partum strategies (another chapter distinguishes among “preventative tactics, designed to minimize female infidelity; sperm-competition tactics, designed to minimize conception in the event of female infidelity; and differential paternal investment” – but the overall picture is reasonably similar). Intra-vaginal strategies relate to sperm competition – for example, the observation that a male may try to minimize the risk of being cuckolded after having been separated from his partner by having sex with her soon after they meet up again. A male may also increase the amount of sperm deposited during intercourse in such a context, compared to normal, and ‘sexual mechanics’ may also change as a function of cuckoldry risk (deeper thrusts and longer duration of sex if they’ve been separated for a while). There are five chapters on this stuff in the book, but I’ve limited my coverage because I don’t think it’s particularly interesting. Post-partum strategies naturally relate to strategies employed after the child has been born. Here the father may observe the child, try to figure out whether it looks like him/his family, and then adjust investment in the child based on how certain he is that he’s actually the father:
“There is growing evidence that human males are [...] affected by [...] evolutionary pressures to invest in offspring as a function of paternal certainty”, and “Burch and Gallup (2000) have shown that males spend less time with, invest fewer resources in, and are more likely to abuse ostensibly unrelated children than children they assume to be their genetic offspring. They also found that the less a male thinks a child (unrelated or genetic) looks like him, the worse he treats the child and the worse he views the relationship with that child.”
It’s worth mentioning that dividing the strategy set into three, and exactly three, overall categories seems to me slightly artificial, also because some relevant behaviours may not fit very well into any of them; to take an example, “There is growing evidence that males who question their partner’s fidelity show an increase in spouse abuse during pregnancy, and the abuse is often directed toward the female’s abdomen” – this behavioural pattern relates to none of the three strategy categories mentioned, but also seems ‘relevant’. In general it’s important to observe that the employment of one type of tactic does not necessarily preclude the employment of others as well – as pointed out in the book:
“A male’s best strategy is to prevent female infidelity and, if he is unsuccessful in preventing female infidelity, he would benefit by attempting to prevent conception by a rival male. If he is unsuccessful in preventing conception by a rival male, he would benefit by adjusting paternal effort according to available paternity cues. The performance of one tactic does not necessitate the neglect of another tactic; indeed, a reproductively wise strategy would be to perform all three categories of anti-cuckoldry tactics”
There’s a lot of food for thought in the book. I’ve included some more detailed observations from it below – in particular I’ve added some stuff closely related to what people might normally term ‘red flags’ in a relationship context. I’d say that enough research has been done on this kind of stuff for it to make a lot of sense for women to read some of it – in light of the evidence, there are certain types of male behaviour which should most definitely be considered strong warning signs that it may be a bad idea to engage with the individual in question. (I was annoyed that the book only dealt with male abuse, as there are quite a few female abusers as well, but I can’t really fault the authors for limiting coverage to male behaviours.)
“Paternal investment in humans and many other species is facultatively expressed: it often benefits offspring but is not always necessary for their survival and thus the quantity and quality of human paternal investment often varies with proximate conditions [...] The facultative expression of male parenting reflects the [...] cost–benefit trade-offs as these relate to the current social and ecological contexts in which the male is situated. The degree of male investment (1) increases with increases in the likelihood that investment will be provided to his own offspring (i.e. paternity certainty), (2) increases when investment increases the survival and later reproductive prospects of offspring, and (3) decreases when there are opportunities to mate with multiple females. [...] the conditional benefits of paternal investment in these species results in simultaneous cost–benefit trade-offs in females. Sometimes it is in the females’ best interest (e.g. when paired with an unhealthy male) to cuckold their partner and mate with higher-quality males [...] As a result, women must balance the costs of reduced paternal investment or male retaliation against the benefits of cuckoldry; that is, having their children sired by a more fit man while having their social partner assist in the rearing of these children.”
“In several large but unrepresentative samples, 20–25% of adult women reported having had at least one extra-pair sexual relationship during their marriage [...] Using a nationally representative sample in the USA, Wiederman (1997) found that 12% of adult women reported at least one extra-pair sexual relationship during their marriage, and about 2% reported such a relationship during the past 12 months; Treas and Giesen (2000) found similar percentages for another nationally representative sample. These may be underestimates, given that people are reluctant to admit to extra-pair relationships. In any case, the results indicate that some women develop simultaneous and multiple opposite-sex relationships, many of which become sexual and are unknown to their social partner [...] The dynamics of these extra-pair relationships are likely to involve a mix of implicit (i.e. unconscious) and explicit (i.e. conscious) psychological processes (e.g. attention to symmetric facial features) and social strategies. [...] the finding that attraction to extra-pair partners is influenced by hormonal fluctuations points to the importance of implicit mechanisms. [...] The emerging picture is one in which women appear to have an evolved sensitivity to the proximate cues of men’s fitness, a sensitivity that largely operates automatically and implicitly and peaks around the time women ovulate. The implicit operation of these mechanisms enables women to assess the fitness of potential extra-pair partners without a full awareness that they are doing so. In this way, women are psychologically and socially attentive to the relationship with their primary partner and most of the time have no explicit motive to cuckold this partner. If their social partners monitor for indications of attraction to extra-pair men, which they often do [...], then these cues are only emitted during a short time frame. 
Moreover, given that attraction to a potential extra-pair partner is influenced by hormonal mechanisms, often combined with some level of pre-existing and non-sexual emotional intimacy with the extra-pair male [...], many of these women may have no intention of an extra-pair sexual relationship before it is initiated. Under these conditions, the dynamics of cuckoldry may involve some level of self deception on women’s part, a mechanism that facilitates their ability to keep the extra-pair relationship hidden from their social partners. [...] As with women, men’s anti-cuckoldry biases almost certainly involve a mix of implicit processes and explicit behavioral strategies that can be directed toward their mates, toward potential rivals, and toward the evaluation of the likely paternity of children born to their partners”
“Males have evolved psychological adaptations that produce mate guarding and jealousy [...] to reduce or to prevent a mate from being inseminated by another male. Recent evidence suggests that males maximize the utility of their mate-guarding strategies by implementing them at ovulation, a key reproductive time in a female’s menstrual cycle [...]. Further, jealousy appears to fluctuate with a man’s mate value and, hence, risk of cuckoldry. Brown and Moore (2003), for example, found that males who were less symmetrical were significantly more jealous. These and other data suggest that jealousy has evolved as a means by which males can attempt to deter extra-pair copulations [...] When triggered, jealousy often results in a variety of behavioral responses, including male-on-female aggression [...], divorce [...], the monitoring and attempted control of the social and sexual behavior of their partners [...], enhancement of their attractiveness as a mate [...], and the monitoring of and aggression toward actual or perceived sexual rivals [...]. In total, these behaviors encompass tactics that function to ensure, through coercion or enticement, that their reproductive investment and that of their mate is directed toward the man’s biological children. [...] One of the more common behavioral responses to relationship jealousy is mate guarding. For men this involves reducing their partner’s opportunity to mate with other men.”
“Cuckoldry is a reproductive cost inflicted on a man by a woman’s sexual infidelity or temporary defection from her regular long-term relationship. Ancestral men also would have incurred reproductive costs by a long-term partner’s permanent defection from the relationship. These costs include loss of the time, effort, and resources the man has spent attracting his partner, the potential misdirection of his resources to a rival’s offspring, and the loss of his mate’s investment in offspring he may have had with her in the future [...] Expressions of male sexual jealousy historically may have been functional in deterring rivals from mate poaching [...] and deterring a mate from a sexual infidelity or outright departure from the relationship [...] Buss (1988) categorized the behavioral output of jealousy into different ‘‘mate-retention’’ tactics, ranging from vigilance over a partner’s whereabouts to violence against rivals [...] Performance of these tactics is assessed by the Mate Retention Inventory (MRI[)] [...] Buss’s taxonomy (1988) partitioned the tactics into two general categories: intersexual manipulations and intrasexual manipulations. Intersexual manipulations include behaviors directed toward one’s partner, and intrasexual manipulations include behaviors directed toward same-sex rivals. Intersexual manipulations include direct guarding, negative inducements, and positive inducements. Intrasexual manipulations include public signals of possession. [...] Unfortunately, little is known about which specific acts and tactics of men’s mate-retention efforts are linked with violence. The primary exception is the study by Wilson, Johnson, and Daly (1995), which identified several predictors of partner violence – notably, verbal derogation of the mate and attempts at sequestration such as limiting access to family, friends, and income.”
“Tactics within the direct guarding category of the MRI include vigilance, concealment of mate, and monopolization of time. An exemplary act for each tactic is, respectively, ‘‘He dropped by unexpectedly to see what she was doing,’’ ‘‘He refused to introduce her to his same-sex friends,’’ and ‘‘He monopolized her time at the social gathering.’’ Each of these tactics implicates what Wilson and Daly (1992) term ‘‘male sexual proprietariness,’’ which refers to the sense of entitlement men sometimes feel that they have over their partners [...] Wilson et al. (1995) demonstrated that violence against women is linked closely to their partners’ autonomy-limiting behaviors. Women who affirmed items such as ‘‘He is jealous and doesn’t want you to talk to other men,’’ were more than twice as likely to have experienced serious violence by their partners.” [What was the base rate? I find myself asking. But it’s still relevant knowledge.] [...] Not all mate-retention tactics are expected to predict positively violence toward partners. Some of these tactics include behaviors that are not in conflict with a romantic partner’s interests and, indeed, may be encouraged and welcomed by a partner [...] Holding his partner’s hand in public, for example, may signal to a woman her partner’s commitment and devotion to her. [...] Tactics within the public signals of possession category include verbal possession signals (e.g. ‘‘He mentioned to other males that she was taken’’), physical possession signals (e.g. ‘‘He held her hand when other guys were around’’), and possessive ornamentation (e.g. ‘‘He hung up a picture of her so others would know she was taken’’).”
“The current studies examined how mate-retention tactics are related to violence in romantic relationships, using the reports of independent samples of several hundred men and women in committed, romantic relationships [...], and using the reports of 107 married men and women [...] With few exceptions, we found the same pattern of results using three independent samples. Moreover, these samples were not just independent, but provided different perspectives (the male perpetrator’s, the female victim’s, and a combination of the two) on the same behaviors – men’s mate-retention behaviors and men’s violence against their partners. We identified overlap between the best predictors of violence across the studies. For example, men’s use of emotional manipulation, monopolization of time, and punish mate’s infidelity threat are among the best predictors of female-directed violence, according to independent reports provided by men and women, and according to reports provided by husbands and their wives. The three perspectives also converged on which tactics are the weakest predictors of relationship violence. For example, love and care and resource display are among the weakest predictors of female-directed violence. [...] The tactic of emotional manipulation was the highest-ranking predictor of violence in romantic relationships in study 1, and the second highest-ranking predictor in studies 2 and 3. The items that comprise the emotional manipulation tactic include, ‘‘He told her he would ‘die’ if she ever left,’’ and ‘‘He pleaded that he could not live without her.’’ Such acts seem far removed from those that might presage violence. [...] Monopolization of time also ranked as a strong predictor of violence across the three studies. Example acts included in this tactic are ‘‘He spent all his free time with her so that she could not meet anyone else’’ and ‘‘He would not let her go out without him.’’ [...] 
The acts ‘‘Dropped by unexpectedly to see what my partner was doing’’ and ‘‘Called to make sure my partner was where she said she would be’’ are the third and fifth highest-ranking predictors of violence, respectively. These acts are included in the tactic of vigilance, which is the highest-ranking tactic-level predictor of violence in study 3. Given that (1) two of the top five act-level predictors of violence are acts of vigilance, (2) the numerically best tactic-level predictor of violence is vigilance, and (3) seven of the nine acts included within the vigilance tactic are correlated significantly with violence [...], a man’s vigilance over his partner’s whereabouts is likely to be a key signal of his partner-directed violence. [...] Wilson et al. (1995) found that 40% of women who affirmed the statement ‘‘He insists on knowing who you are with and where you are at all times’’ reported experiencing serious violence at the hands of their husbands.”
“Relative to women’s reports of their partners’ behavior, men self-reported more frequent use of intersexual negative inducements, positive inducements, and controlling behavior. Although not anticipated, the sex difference in reported frequency of controlling behaviors is not surprising upon examination of the acts included in the CBI [Controlling Behavior Index]. More than half of the acts do not require the woman’s physical presence or knowledge, for example ‘‘Deliberately keep her short of money’’ and ‘‘Check her movements.’’ In addition, such acts might be more effective if the woman is not aware of their occurrence. [...] Increased effort devoted to mate retention is predicted to occur when the adaptive problems it was designed to solve are most likely to be encountered – when a mate is particularly desirable, when there exist mate poachers, when there is a mate-value discrepancy, and when the partner displays cues to infidelity or defection”
“Although sometimes referred to as marital rape, spouse rape, or wife rape, we use the term forced in-pair copulation (FIPC) to refer to the forceful act of sexual intercourse by a man against his partner’s will. [...] FIPC is not performed randomly [...] FIPC reliably occurs immediately after extra-pair copulations, intrusions by rival males, and female absence in many species of waterfowl [...] and other avian species [...] FIPC in humans often follow[s] accusations of female infidelity”
i. “While we stop to think, we often miss our opportunity.” (Publilius Syrus)
ii. “The civility which money will purchase, is rarely extended to those who have none.” (Charles Dickens)
iii. “Grief can take care of itself; but to get the full value of a joy you must have somebody to divide it with.” (Mark Twain)
iv. “Long books, when read, are usually overpraised, because the reader wants to convince others and himself that he has not wasted his time.” (E. M. Forster)
v. “I do not envy people who think they have a complete explanation of the world, for the simple reason that they are obviously wrong.” (Salman Rushdie)
vi. “To be conscious that you are ignorant of the facts is a great step to knowledge.” (Benjamin Disraeli)
vii. “To hear, one must be silent.” (Ursula K. Le Guin)
viii. “The danger in trying to do good is that the mind comes to confuse the intent of goodness with the act of doing things well.” (-ll-)
ix. “Things don’t have purposes, as if the universe were a machine, where every part has a useful function. What’s the function of a galaxy? I don’t know if our life has a purpose and I don’t see that it matters.” (-ll-)
x. “In the land of the blind, the one-eyed man will poke out his eye to fit in.” (Caitlín R. Kiernan)
xi. “The greatest happiness you can have is knowing that you do not necessarily require happiness.” (William Saroyan)
xii. “An original idea. That can’t be too hard. The library must be full of them.” (Stephen Fry)
xiii. “It is a cliché that most clichés are true, but then like most clichés, that cliché is untrue.” (-ll-)
xiv. “Of what use is freedom of speech to those who fear to offend?” (Roger Ebert)
xv. “The assumption that anything true is knowable is the grandfather of paradoxes.” (William Poundstone)
xvi. “Approved attributes and their relation to face make every man his own jailer; this is a fundamental social constraint even though each man may like his cell.” (Erving Goffman)
xvii. “There may be no good reason for things to be the way they are.” (Alain de Botton)
xviii. “It is striking how much more seriously we are likely to be taken after we have been dead a few centuries.” (-ll-)
xix. “Deciding to avoid other people does not necessarily equate with having no desire whatsoever for company; it may simply reflect a dissatisfaction with what — or who — is available.” (-ll-)
xx. “We are able to breathe, drink, and eat in comfort because millions of organisms and hundreds of processes are operating to maintain a liveable environment, but we tend to take nature’s services for granted because we don’t pay money for most of them.” (Eugene Odum)
Here’s my first post about the book. I was disappointed by some of the chapters in the second half of the book and I think a few of them were quite poor. I have been wondering what to cover from the second half, in part because some of the authors seem to proceed as if e.g. the work of these authors does not exist (key quote: Our findings do not support continued widespread efforts to boost self-esteem in the hope that it will by itself foster improved outcomes) – I was thinking this about the authors of the last chapter, on ‘Changing self-esteem through competence and worthiness training’, in particular; their basic argument seems to be that since CWT (Competence and Worthiness Training) has been shown to improve self-esteem, ‘good things will follow’ people who make use of such programs. Never mind the fact that causal pathways between self-esteem and life outcomes are incredibly unclear, never mind that self-esteem is not the relevant outcome measure (and studies with good outcome measures do not exist), and never mind that effect persistence over time is unknown, to take but three of many problems with the research. They argue/conclude in the chapter that CWT is ‘empirically validated’, an observation which almost made me laugh. I’m in a way slightly puzzled that whereas doctors contributing to Springer publications and similar are always supposed to disclose conflicts of interest in the publications, no similar demands are made in the context of the psychological literature; these people obviously make money off of these things, and yet they’re the ones evaluating the few poor studies that have been done, often by themselves, while pretending to be unbiased observers with no financial interests in whether the methods are ‘validated’ or not. Oh well.
Although some chapters are poor (‘data-poor and theory-rich’ might not be a bad way to describe them – note that the ‘data poor’ part relates both to low amounts of data and the use of data of questionable quality; I’m thinking specifically about the use of measures of ‘implicit self-esteem’ in chapter 6 – the authors seem confused about the pattern of results and seem to have a hard time making sense of them (they seem to keep having to make up new ad-hoc explanations for why ‘this makes sense in context’), but I don’t think the results are necessarily that confusing; the variables probably aren’t measuring what they think they’re measuring, not even close, and the two different types of measures probably aren’t remotely measuring anything similar (I have a really hard time figuring out why anyone would ever think that they do), so it makes good sense that findings are all over the place…), chapter 8, on ‘Self-esteem as an interpersonal signal’, was however really great and I thought I should share some observations from that chapter here – I have done this below. Interestingly, people who read the first post about the book would in light of the stuff included in that chapter do well to forget my personal comments in the first post about me having low self-esteem; interpersonal outcomes seem to be likely to be better if you think the people with whom you interact have high self-esteem (there are exceptions, but none of them seem relevant in this context), whether or not that’s actually true. Of course the level of ‘interaction’ going on here on the blog is very low, but even so… (I may be making a similar type of mistake the authors make in the last chapter here, by making unwarranted assumptions, but anyway…).
Before moving on, I should perhaps point out that I just finished the short Springer publication Appointment Planning in Outpatient Clinics and Diagnostic Facilities. I’m not going to blog this book separately as there frankly isn’t enough stuff in there for it to make sense to devote an entire blog post to it, but I thought I might as well add a few remarks here before moving on. The book contains a good introduction to some basic queueing theory, and quite a few important concepts are covered which people working with those kinds of things ought to know about (also, if you’ve ever had discussions about waiting lists and how ‘it’s terrible that people have to wait so long’ and ‘something has to be done’, the discussion would have had a higher quality if you’d read this book first). Some chapters of the book are quite technical – here are a few illustrative/relevant links dealing with stuff covered in the book: Pollaczek–Khinchine formula, Little’s Law, the Erlang C formula, the Erlang B formula, Laplace–Stieltjes transform. The main thing I took away from this book was that this stuff is a lot more complicated than I’d thought. I’m not sure how much the average nurse would get out of this book, but I’m also not sure how much influence the average nurse has on planning decisions such as those described in this book – little, I hope. Sometimes a book contains a few really important observations and you sort of want to recommend the book based simply on these observations, because a lot of people would benefit from knowing exactly those things; this book is like that, as planners on many different decision-making levels would benefit from knowing the ‘golden rules’ included in section 7.1. When things go wrong due to mismanagement and very long waiting lists develop, it’s obvious that however you look at it, if people had paid more attention to those aspects, this would probably not have happened.
An observation which is critical to include in the coverage of a book like this is that it may be quite difficult for an outside observer (e.g. a person visiting a health clinic) to evaluate the optimality of scheduling procedures except in very obvious cases of inefficiently long queues. Especially in the case of excess capacity most outsiders do not know enough to evaluate these systems fairly; what may look like excess capacity to the outsider may well be a necessary buffer included in the planning schedule to keep waiting times from exploding at other points in time, and it’s really hard to tell those apart if you don’t have access to relevant data. Even if you do, things can be complicated (see the links above).
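To give a feel for the kind of computation the linked formulas involve, here is a minimal Python sketch of the Erlang B and Erlang C formulas, tied together with Little’s Law. The clinic numbers are entirely made up for illustration, and a real planner would of course use purpose-built software rather than this toy:

```python
def erlang_b(servers: int, load: float) -> float:
    """Blocking probability in an M/M/c/c loss system (Erlang B),
    computed with the standard numerically stable recursion."""
    b = 1.0
    for n in range(1, servers + 1):
        b = load * b / (n + load * b)
    return b

def erlang_c(servers: int, load: float) -> float:
    """Probability that an arriving customer must wait in an
    M/M/c queue (Erlang C), derived from Erlang B."""
    if load >= servers:
        return 1.0  # unstable queue: in the long run everyone waits
    b = erlang_b(servers, load)
    return servers * b / (servers - load * (1 - b))

# Made-up example: a clinic with 3 rooms, 10 arrivals/hour,
# each consultation taking 15 minutes on average (rate 4/hour).
lam, mu, c = 10.0, 4.0, 3
load = lam / mu                 # offered load a = lambda/mu = 2.5 Erlangs
p_wait = erlang_c(c, load)      # probability an arriving patient waits
w_q = p_wait / (c * mu - lam)   # mean wait in queue (hours), M/M/c
l_q = lam * w_q                 # mean queue length, by Little's Law L = lambda * W
```

Even this toy model shows the buffer point made above: with an offered load of 2.5 on 3 rooms the system looks under-used most of the time, yet the waiting probability is already substantial, and pushing utilization closer to 1 makes the expected wait blow up.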
Okay, back to the self-esteem text – some observations from the second half of the book below…
“low self-esteem is listed as either a diagnostic criterion or associated feature of at least 24 mental disorders in the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV-TR). Low self-esteem and an insufficient ability to experience self-relevant positive emotions such as pride is particularly strongly linked to depression, to such a degree that some even suggest conceptualizing self-esteem and depression as opposing end points of a bipolar continuum […] The phenomenology of low self-esteem – feeling incompetent and unworthy, unfit for life – inevitably translates into experiencing existence as frightening and futile. This turns life for the person lacking in self-esteem into a chronic emergency: that person is psychologically in a constant state of danger, surrounded by a feeling of impending disaster and a sense of helplessness. Suffering from low self-esteem thus involves having one’s consciousness ruled by fear, which sabotages clarity and efficiency (Branden, 1985). The main goal for such a person is to keep the anxieties, insecurities, and self-doubts at bay, at whatever cost that may come. On the other hand, a person with a satisfying degree of self-respect, whose central motivation is not fear, can afford to rejoice in being alive, and view existence as a more exciting than threatening affair.” [from chapter 7, on ‘Existential perspective on self-esteem’ – I didn’t particularly like that chapter and I’m not sure to which extent I agree with the observations included, but I thought I should add the above to illustrate which kind of stuff is also included in the book.]
“Although past research has emphasized how social environments are internalized to shape self-views, researchers are increasingly interested in how self-views are externalized to shape one’s social environment. From the externalized perspective, people will use information about another’s self-esteem as a gauge of that person’s worth [...] self-esteem serves a “status-signaling” function that complements the status-tracking function [...] From this perspective, self-esteem influences one’s self-presentational behavior, which in turn influences how others view the self. This status-signaling system in humans should work much like the status-signaling models developed in non-human animals [Aureli et al. and Kappeler et al. are examples of places to go if you’re interested in knowing more about this stuff] [...] Ultimately, these status signals have important evolutionary outcomes, such as access to mates and consequent reproductive success. In essence, self-esteem signals important status-related information to others in one’s social world. [...] the basic notion here is that conveying high (or low) self-esteem provides social information to others.”
“In an effort to understand their social world, people form lay theories about the world around them. These lay theories consist of information about how characteristics covary within individuals [...] Research on the status-signaling function of self-esteem [...] and on self-esteem stereotypes [...] report a consistent positive bias in the impressions formed about high self-esteem individuals and a consistent negative bias about those with low self-esteem. In several studies conducted by Cameron and her colleagues [...], when Canadian and American participants were asked to rate how the average person would describe a high self-esteem individual, they universally reported that higher self-esteem people were attractive, intelligent, warm, competent, emotionally stable, extraverted, open to experience, conscientious, and agreeable. Basically, on all characteristics in the rating list, high self-esteem people were described as superior. [...] Whereas people sing the praises of high self-esteem, low self-esteem is viewed as a “fatal flaw.” In the same set of studies, Cameron and her colleagues [...] found that participants attributed negative characteristics to low self-esteem individuals. Across all of the characteristics assessed, low self-esteem people were seen as inferior. They were described as less attractive, less intelligent, less warm, less competent, less sociable, and so forth. The only time that the stereotypes of low self-esteem individuals were rated as “more” than the group of high self-esteem individuals was on negative characteristics, such as experiencing more negative moods and possessing more interpersonally disadvantageous characteristics (e.g., jealousy). [...] low self-esteem individuals were seen just as negatively as welfare recipients and mentally ill people on most characteristics [...] All cultures do not view self-esteem in the same way. [...] 
There is some evidence to suggest that East Asian cultures link high self-esteem with more negative qualities”
“Zeigler-Hill and his colleagues [...] presented participants with a single target, identified as low self-esteem or high self-esteem, and asked for their evaluations of the target. Whether the target was identified as low self-esteem by an explicit label (Study 3), a self-deprecating slogan on a T-shirt (Study 4), or their email address (Study 5, e.g., sadeyes@), participants rated an opposite-sex low self-esteem target as less romantically desirable than a high self-esteem target [...]. However, ascribing negative characteristics to low self-esteem individuals is not just limited to decisions about an opposite-sex target. Zeigler-Hill and colleagues demonstrated that, regardless of match or mismatch of perceiver-target gender, when people thought a target had lower self-esteem they were more likely to ascribe negative traits to him or her, such as being lower in conscientiousness [...] Overall, people are apt to assume that people with low self-esteem possess negative characteristics, whereas those with high self-esteem possess positive characteristics. Such assumptions are made at the group level [...] and at the individual level [...] According to Cameron and colleagues [...], fewer than 1% of the sample ascribed any positive characteristics to people with low self-esteem when asked to give open-ended descriptions. Furthermore, on the overwhelming majority of characteristics assessed, low self-esteem individuals were rated more negatively than high self-esteem individuals”
“Although for the most part it is low self-esteem that people associate with negative qualities, there is a dark side to being labeled as having high self-esteem. People who are believed to have high self-esteem are seen as more narcissistic [...], self-absorbed, and egotistical [...] than those believed to possess low self-esteem. Moreover, the benefits of being seen as high self-esteem may be moderated by gender. When rating an opposite-sex target, men were often more positive toward female targets with moderate self-esteem than those with high self-esteem”
“Not only might perceptions of others’ self-esteem influence interactions among relative strangers, but they may also be particularly important in close relationships. Ample evidence demonstrates that a friend or partner’s self-esteem can have actual relational consequences [...]. Relationships involving low self-esteem people tend to be less satisfying and less committed [...], due at least in part to low self-esteem people’s tendency to engage in defensive, self-protective behavior and their enhanced expectations of rejection [...]. Mounting evidence suggests that people can intuit these disadvantages, and thus use self-esteem as an interpersonal signal. [...] Research by MacGregor and Holmes (2007) suggests that people expect to be less satisfied in a romantic relationship with a low self-esteem partner than a high self-esteem partner, directly blaming low self-esteem individuals for relationship mishaps [...] it appears that people use self-esteem as a signal to indicate desirability as a mate: People report themselves as less likely to date or have sex with those explicitly labeled as having “low self-esteem” compared to those labeled as having “high self-esteem” [...] Even when considering friendships, low self-esteem individuals are rated less socially appealing [...] In general, it appears that low self-esteem individuals are viewed as less-than-ideal relationship partners.”
“Despite people’s explicit aversion to forming social bonds with low self-esteem individuals, those with low self-esteem do form close relationships. Nevertheless, even these established relationships may suffer when one person detects another’s low self-esteem. For example, people believe that interactions with low self-esteem friends or family members are more exhausting and require more work than interactions with high self-esteem friends and family [...]. In the context of romantic relationships, Lemay and Dudley’s (2011) findings confirm the notion that relationships with low self-esteem individuals require extra relationship maintenance (or “work”) as people attempt to “regulate” their romantic partner’s insecurities. Specifically, participants who detected their partner’s low self-esteem tended to exaggerate affection for their partner and conceal negative sentiments, likely in an effort to maintain harmony in their relationship. Unfortunately, this inauthenticity was actually associated with decreased relationship satisfaction for the regulating partner over time. [...] MacGregor and colleagues [...] have explored a different type of communication in close relationships. Their focus was on capitalization, which is the disclosure of positive personal experiences to others [...]. In two experiments [...], participants who were led to believe that their close other had low self-esteem capitalized less positively (i.e., enthusiastically) compared to control participants. [...] Moreover, in a study involving friend dyads, participants reported capitalizing less frequently with their friend to the extent they perceived him or her as having low self-esteem [...] low self-esteem individuals are actually no less responsive to others’ capitalization attempts than are high self-esteem partners. 
Despite this fact, MacGregor and Holmes (2011) found that people are reluctant to capitalize with low self-esteem individuals precisely because they expect them to be less responsive than high self-esteem partners. Thus people appear to be holding back from low self-esteem individuals unnecessarily. Nevertheless, the consequences may be very real given that capitalization is a process associated with personal and interpersonal benefits”
“Cameron (2010) asked participants to indicate how much they tried to conceal or reveal their self-feelings and insecurities with significant others (best friends, romantic partners, and parents). Those with lower self-esteem reported attempting to conceal their insecurities and self-doubts to a greater degree than those with higher self-esteem. Thus, even in close relationships, low self-esteem individuals appear to see the benefit of hiding their self-esteem. Cameron, Hole, and Cornelius (2012) further investigated whether concealing self-esteem was linked with relational benefits for those with low self-esteem. In several studies, participants were asked to report their own self-esteem and then to provide their “self-esteem image”, or what level of self-esteem they thought they had conveyed to their significant others. Participants then indicated their relationship quality (e.g., satisfaction, commitment, trust). Across all studies and across all relationship types studied (friends, romantic partners, and parents), people reporting a higher self-esteem image, regardless of their own self-esteem level, reported greater relationship quality. [...] both low and high self-esteem individuals benefit from believing that a high self-esteem image has been conveyed, though this experience may feel “inauthentic” for low self-esteem people. [...] both low and high self-esteem individuals may hope to be seen as they truly are by their close others. [...] In a recent meta-analysis, Kwang and Swann (2010) proposed that individuals desire verification unless there is a high risk for rejection. Thus, those with negative self-views may desire to be viewed positively, but only if being seen negatively jeopardizes their relationship. From this perspective, romantic partners should signal high self-esteem during courtship, job applicants should signal high self-esteem to potential bosses, and politicians should signal high self-esteem to their voters. 
Once the relationship has been cemented (and the potential for rejection has been reduced), however, people should desire to be seen as they are. Importantly, the results of the meta-analysis supported this proposal. While this boundary condition has shed some light on this debate, more research is needed to understand fully under what contexts people are motivated to communicate either positive or negative self-views.”
“it appears that people’s judgments of others’ self-esteem are partly well informed, yet also based on inaccurate stereotypes about characteristics not actually linked to self-esteem. [...] Traits that do not readily manifest in behavior, or are low in observability, should be more difficult to detect accurately (see Funder & Dobroth, 1987). Self-esteem is one of these “low-observability” traits [...] Although the operationalization of accuracy is tricky [...], it does appear that people are somewhat accurate in their impressions of self-esteem [...] research from various laboratories indicates that both friends [...] and romantic partners [...] are fairly accurate in judging each other’s self-esteem. [...] However, people may also use information that has nothing to do with the appearances or behaviors of target. Instead, people may make judgements about another’s personality traits based on how they perceive their own traits [...] people tend to project their own characteristics onto others [...] People’s ratings of others’ self-esteem tend to be correlated with their own, be it for friends or romantic partners”
You can read my first post about the book here. Some parts of the book are fairly technical, so I decided to skip some chapters in the coverage below, simply because I could see no good way to cover that material on a wordpress blog (which, as mentioned many times before, is not ideal for math coverage) without spending far more time on it than I wanted to. If you’re a new reader and/or you don’t know what a meta-analysis is, I highly recommend you read my first post about the book before moving on to the coverage below (and/or you can watch this brief video on the topic).
Below I have added some more quotes and observations from the book.
“In primary studies we use regression, or multiple regression, to assess the relationship between one or more covariates (moderators) and a dependent variable. Essentially the same approach can be used with meta-analysis, except that the covariates are at the level of the study rather than the level of the subject, and the dependent variable is the effect size in the studies rather than subject scores. We use the term meta-regression to refer to these procedures when they are used in a meta-analysis.
The differences that we need to address as we move from primary studies to meta-analysis for regression are similar to those we needed to address as we moved from primary studies to meta-analysis for subgroup analyses. These include the need to assign a weight to each study and the need to select the appropriate model (fixed versus random effects). Also, as was true for subgroup analyses, the R2 index, which is used to quantify the proportion of variance explained by the covariates, must be modified for use in meta-analysis.
With these modifications, however, the full arsenal of procedures that fall under the heading of multiple regression becomes available to the meta-analyst. [...] As is true in primary studies, where we need an appropriately large ratio of subjects to covariates in order for the analysis to be meaningful, in meta-analysis we need an appropriately large ratio of studies to covariates. Therefore, the use of meta-regression, especially with multiple covariates, is not a recommended option when the number of studies is small.”
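To make the mechanics concrete, here is a minimal sketch of a fixed-effect meta-regression with a single covariate, done as closed-form weighted least squares with inverse-variance weights. All effect sizes, variances, and covariate values below are made up purely for illustration:

```python
# Fixed-effect meta-regression: weighted least squares with weights 1/v_i.
# All numbers are hypothetical, purely for illustration.
import math

y = [0.20, 0.35, 0.40, 0.55]   # study effect sizes
v = [0.04, 0.02, 0.03, 0.05]   # within-study variances
x = [1.0, 2.0, 3.0, 4.0]       # study-level covariate (e.g. dose)
w = [1 / vi for vi in v]       # inverse-variance weights

Sw   = sum(w)
Swx  = sum(wi * xi for wi, xi in zip(w, x))
Swy  = sum(wi * yi for wi, yi in zip(w, y))
Swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
Swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))

det   = Sw * Swxx - Swx ** 2
slope = (Sw * Swxy - Swx * Swy) / det   # change in effect size per unit of x
inter = (Swy - slope * Swx) / Sw
se_slope = math.sqrt(Sw / det)          # SE of the slope under the fixed-effect model
z = slope / se_slope                    # here |z| < 1.96: covariate not significant
```

With only four (hypothetical) studies the slope’s standard error is large and the covariate test comes out nonsignificant, which echoes the book’s warning about running meta-regression when the number of studies is small.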
“Power depends on the size of the effect and the precision with which we measure the effect. For subgroup analysis this means that power will increase as the difference between (or among) subgroup means increases, and/or the standard error within subgroups decreases. For meta-regression this means that power will increase as the magnitude of the relationship between the covariate and effect size increases, and/or the precision of the estimate increases. In both cases, a key factor driving the precision of the estimate will be the total number of individual subjects across all studies and (for random effects) the total number of studies. [...] While there is a general perception that power for testing the main effect is consistently high in meta-analysis, this perception is not correct [...] and certainly does not extend to tests of subgroup differences or to meta-regression. [...] Statistical power for detecting a difference among subgroups, or for detecting the relationship between a covariate and effect size, is often low [and] failure to obtain a statistically significant difference among subgroups should never be interpreted as evidence that the effect is the same across subgroups. Similarly, failure to obtain a statistically significant effect for a covariate should never be interpreted as evidence that there is no relationship between the covariate and the effect size.”
“When we have effect sizes for more than one outcome (or time-point) within a study, based on the same participants, the information for the different effects is not independent and we need to take account of this in the analysis. [...] When we are working with different outcomes at a single point in time, the plausible range of correlations [between outcomes] will depend on the similarity of the outcomes. When we are working with the same outcome at multiple time-points, the plausible range of correlations will depend on such factors as the time elapsed between assessments and the stability of the relative scores over this time period. [...] Researchers who do not know the correlation between outcomes sometimes fall back on either of two ‘default’ positions. Some will include both [outcome variables] in the analysis and treat them as independent. Others would use the average of the [variances of the two outcomes]. It is instructive, therefore, to consider the practical impact of these choices. [...] In effect, [...] researchers who adopt either of these positions as a way of bypassing the need to specify a correlation, are actually adopting a correlation, albeit implicitly. And, the correlation that they adopt falls at either extreme of the possible range (either zero or 1.0). The first approach is almost certain to underestimate the variance and overestimate the precision. The second approach is almost certain to overestimate the variance and underestimate the precision.” [A good example of a more general point in the context of statistical/mathematical modelling: Sometimes it’s really hard not to make assumptions, and trying to get around such problems by ‘ignoring them’ may sometimes lead to the implicit adoption of assumptions which are highly questionable as well.]
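The arithmetic behind that passage is easy to sketch. Under the usual formula for the variance of the mean of two correlated effect sizes measured on the same participants, var = (v1 + v2 + 2·r·√v1·√v2)/4, and plugging in the two ‘default’ correlations shows how far apart the implicit assumptions sit (the variances here are hypothetical):

```python
import math

def composite_variance(v1, v2, r):
    # Variance of the mean of two effect sizes based on the same
    # participants, with correlation r between the two outcomes.
    return 0.25 * (v1 + v2 + 2 * r * math.sqrt(v1) * math.sqrt(v2))

v1 = v2 = 0.04                          # hypothetical within-study variances
lo  = composite_variance(v1, v2, 0.0)   # 'treat as independent' -> 0.02
hi  = composite_variance(v1, v2, 1.0)   # 'average the variances' -> 0.04
mid = composite_variance(v1, v2, 0.5)   # a plausible middle ground -> 0.03
```

Treating the outcomes as independent halves the variance (overstating precision), while averaging the variances amounts to assuming a correlation of 1.0 (understating it); any defensible choice of r lands somewhere in between.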
“Vote counting is the name used to describe the idea of seeing how many studies yielded a significant result, and how many did not. [...] narrative reviewers often resort to [vote counting] [...] In some cases this process has been formalized, such that one actually counts the number of significant and non-significant p-values and picks the winner. In some variants, the reviewer would look for a clear majority rather than a simple majority. [...] One might think that summarizing p-values through a vote-counting procedure would yield more accurate decisions than any one of the single significance tests being summarized. This is not generally the case, however. In fact, Hedges and Olkin (1980) showed that the power of vote-counting, considered as a statistical decision procedure, can not only be lower than that of the studies on which it is based; it can tend toward zero as the number of studies increases. [...] the idea of vote counting is fundamentally flawed and the variants on this process are equally flawed (and perhaps even more dangerous, since the basic flaw is less obvious when hidden behind a more complicated algorithm or is one step removed from the p-value). [...] The logic of vote counting says that a significant finding is evidence that an effect exists, while a non-significant finding is evidence that an effect is absent. While the first statement is true, the second is not. While a nonsignificant finding could be due to the fact that the true effect is nil, it can also be due simply to low statistical power. Put simply, the p-value reported for any study is a function of the observed effect size and the sample size. Even if the observed effect is substantial, the p-value will not be significant unless the sample size is adequate. In other words, as most of us learned in our first statistics course, the absence of a statistically significant effect is not evidence that an effect is absent.”
“While the term vote counting is associated with narrative reviews it can also be applied to the single study, where a significant p-value is taken as evidence that an effect exists, and a nonsignificant p-value is taken as evidence that an effect does not exist. Numerous surveys in a wide variety of substantive fields have repeatedly documented the ubiquitous nature of this mistake. [...] When we are working with a single study and we have a nonsignificant result we don’t have any way of knowing whether or not the effect is real. The nonsignificant p-value could reflect either the fact that the true effect is nil or the fact that our study had low power. While we caution against accepting the former (that the true effect is nil) we cannot rule it out. By contrast, when we use meta-analysis to synthesize the data from a series of studies we can often identify the true effect. And in many cases (for example if the true effect is substantial and is consistent across studies) we can assert that the nonsignificant p-value in the separate studies was due to low power rather than the absence of an effect. [...] vote counting is never a valid approach.”
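Hedges and Olkin’s point is easy to reproduce numerically. If each study’s power is below 50%, the probability that a majority of studies reaches significance shrinks toward zero as studies accumulate, even though the effect is real. The effect size and sample sizes below are hypothetical:

```python
import math

def norm_cdf(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def study_power(d, n_per_group, z_crit=1.96):
    # Approximate power of a two-sided z-test on a standardized
    # mean difference d with n_per_group subjects per arm.
    se = math.sqrt(2 / n_per_group)
    return norm_cdf(d / se - z_crit) + norm_cdf(-d / se - z_crit)

def p_majority_significant(k, p):
    # Probability that more than half of k independent studies are significant.
    return sum(math.comb(k, i) * p**i * (1 - p) ** (k - i)
               for i in range(k // 2 + 1, k + 1))

p = study_power(0.3, 20)   # each study has power of roughly 0.16
votes = [p_majority_significant(k, p) for k in (5, 25, 101)]
# the vote count 'detects' the (real) effect ever more rarely as k grows
```

With a true effect of d = 0.3 and 20 subjects per arm, each study is badly underpowered, and the chance that a simple-majority vote count declares the effect real drops from a few percent at k = 5 to essentially zero at k = 101: exactly the pathology described in the quote.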
“The fact that a meta-analysis will often [but not always] have high power is important because [...] primary studies often suffer from low power. While researchers are encouraged to design studies with power of at least 80%, this goal is often elusive. Many studies in medicine, psychology, education and an array of other fields have power substantially lower than 80% to detect large effects, and substantially lower than 50% to detect smaller effects that are still important enough to be of theoretical or practical importance. By contrast, a meta-analysis based on multiple studies will have a higher total sample size than any of the separate studies and the increase in power can be substantial. The problem of low power in the primary studies is especially acute when looking for adverse events. The problem here is that studies to test new drugs are powered to find a treatment effect for the drug, and do not have adequate power to detect side effects (which have a much lower event rate, and therefore lower power).”
“Assuming a nontrivial effect size, power is primarily a function of the precision [...] When we are working with a fixed-effect analysis, precision for the summary effect is always higher than it is for any of the included studies. Under the fixed-effect analysis precision is largely determined by the total sample size [...], and it follows that the total sample size will be higher across studies than within studies. [...] in a random-effects meta-analysis, power depends on within-study error and between-studies variation […if you don’t recall the difference between fixed-effects models and random effects models, see the previous post]. If the effect sizes are reasonably consistent from study to study, and/or if the analysis includes a substantial number of studies, then the second of these will tend to be small, and power will be driven by the cumulative sample size. In this case the meta-analysis will tend to have higher power than any of the included studies. [...] However, if the effect size varies substantially from study to study, and the analysis includes only a few studies, then this second aspect will limit the potential power of the meta-analysis. In this case, power could be limited to some low value even if the analysis includes tens of thousands of persons. [...] The Cochrane Database of Systematic Reviews is a database of systematic reviews, primarily of randomized trials, for medical interventions in all areas of healthcare, and currently includes over 3000 reviews. In this database, the median number of trials included in a review is six. When a review includes only six studies, power to detect even a moderately large effect, let alone a small one, can be well under 80%. While the median number of studies in a review differs by the field of research, in almost any field we do find some reviews based on a small number of studies, and so we cannot simply assume that power is high. [...] 
Even when power to test the main effect is high, many meta-analyses are not concerned with the main effect at all, but are performed solely to assess the impact of covariates (or moderator variables). [...] The question to be addressed is not whether the treatment works, but whether one variant of the treatment is more effective than another variant. The test of a moderator variable in a meta-analysis is akin to the test of an interaction in a primary study, and both suffer from the same factors that tend to decrease power. First, the effect size is actually the difference between the two effect sizes and so is almost invariably smaller than the main effect size. Second, the sample size within groups is (by definition) smaller than the total sample size. Therefore, power for testing the moderator will often be very low (Hedges and Pigott, 2004).”
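The ‘tens of thousands of persons’ point can be checked with a few lines. For k equally precise studies under a random-effects model, the summary effect’s variance is roughly (v_within + τ²)/k, so between-studies variance puts a floor on precision that extra subjects per study cannot remove. All numbers below are hypothetical:

```python
import math

def norm_cdf(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def summary_power(delta, k, v_within, tau2, z_crit=1.96):
    # Approximate power to detect a true summary effect delta from
    # k equally precise studies under a random-effects model.
    se = math.sqrt((v_within + tau2) / k)
    return norm_cdf(delta / se - z_crit) + norm_cdf(-delta / se - z_crit)

# six hypothetical trials of ~5,000 subjects each (~30,000 in total)
homog  = summary_power(0.2, k=6, v_within=0.0008, tau2=0.0)    # near 1.0
hetero = summary_power(0.2, k=6, v_within=0.0008, tau2=0.05)   # around 0.6
```

With homogeneous effects, power is essentially 1; with substantial heterogeneity (τ² = 0.05) the same ~30,000 subjects spread over six studies yield power of only about 0.6, because the six studies, not the thousands of subjects, determine how well τ² averages out.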
“It is important to understand that the fixed-effect model and random-effects model address different hypotheses, and that they use different estimates of the variance because they make different assumptions about the nature of the distribution of effects across studies [...]. Researchers sometimes remark that power is lower under the random-effects model than for the fixed-effect model. While this statement may be true, it misses the larger point: it is not meaningful to compare power for fixed- and random-effects analyses since the two values of power are not addressing the same question. [...] Many meta-analyses include a test of homogeneity, which asks whether or not the between-studies dispersion is more than would be expected by chance. The test of significance is [...] based on Q, the sum of the squared deviations of each study’s effect size estimate (Yi) from the summary effect (M), with each deviation weighted by the inverse of that study’s variance. [...] Power for this test depends on three factors. The larger the ratio of between-studies to within-studies variance, the larger the number of studies, and the more liberal the criterion for significance, the higher the power.”
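The Q statistic described in that passage is simple to compute by hand, and with it one can also get the usual DerSimonian–Laird estimate of the between-studies variance τ². The effect sizes and variances below are hypothetical:

```python
# Hypothetical effect size estimates and within-study variances.
y = [0.60, 0.10, 0.45, -0.05]
v = [0.04, 0.02, 0.09, 0.03]
w = [1 / vi for vi in v]                             # inverse-variance weights

M = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)    # fixed-effect summary
Q = sum(wi * (yi - M) ** 2 for wi, yi in zip(w, y))  # weighted squared deviations
df = len(y) - 1                                      # E[Q] = df under homogeneity

# DerSimonian-Laird estimate of the between-studies variance tau^2
C = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
tau2 = max(0.0, (Q - df) / C)
```

Here Q comes out around 7.3 against 3 degrees of freedom, i.e. more dispersion than chance alone would be expected to produce, which translates into a τ² estimate of roughly 0.05.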
“While a meta-analysis will yield a mathematically accurate synthesis of the studies included in the analysis, if these studies are a biased sample of all relevant studies, then the mean effect computed by the meta-analysis will reflect this bias. Several lines of evidence show that studies that report relatively high effect sizes are more likely to be published than studies that report lower effect sizes. Since published studies are more likely to find their way into a meta-analysis, any bias in the literature is likely to be reflected in the meta-analysis as well. This issue is generally known as publication bias. The problem of publication bias is not unique to systematic reviews. It affects the researcher who writes a narrative review and even the clinician who is searching a database for primary papers. [...] Other factors that can lead to an upward bias in effect size and are included under the umbrella of publication bias are the following. Language bias (English-language databases and journals are more likely to be searched, which leads to an oversampling of statistically significant studies) [...]; availability bias (selective inclusion of studies that are easily accessible to the researcher); cost bias (selective inclusion of studies that are available free or at low cost); familiarity bias (selective inclusion of studies only from one’s own discipline); duplication bias (studies with statistically significant results are more likely to be published more than once [...]) and citation bias (whereby studies with statistically significant results are more likely to be cited by others and therefore easier to identify [...]). [...] 
If persons performing a systematic review were able to locate studies that had been published in the grey literature (any literature produced in electronic or print format that is not controlled by commercial publishers, such as technical reports and similar sources), then the fact that the studies with higher effects are more likely to be published in the more mainstream publications would not be a problem for meta-analysis. In fact, though, this is not usually the case.
While a systematic review should include a thorough search for all relevant studies, the actual amount of grey/unpublished literature included, and the types, varies considerably across meta-analyses.”
“In sum, it is possible that the studies in a meta-analysis may overestimate the true effect size because they are based on a biased sample of the target population of studies. But how do we deal with this concern? The only true test for publication bias is to compare effects in the published studies formally with effects in the unpublished studies. This requires access to the unpublished studies, and if we had that we would no longer be concerned. Nevertheless, the best approach would be for the reviewer to perform a truly comprehensive search of the literature, in hopes of minimizing the bias. In fact, there is evidence that this approach is somewhat effective. Cochrane reviews tend to include more studies and to report a smaller effect size than similar reviews published in medical journals. Serious efforts to find unpublished and difficult-to-find studies, typical of Cochrane reviews, may therefore reduce some of the effects of publication bias. Despite the increased resources that are needed to locate and retrieve data from sources such as dissertations, theses, conference papers, government and technical reports and the like, it is generally indefensible to conduct a synthesis that categorically excludes these types of research reports. Potential benefits and costs of grey literature searches must be balanced against each other.”
“Since we cannot be certain that we have avoided bias, researchers have developed methods intended to assess its potential impact on any given meta-analysis. These methods address the following questions:
*Is there evidence of any bias?
*Is it possible that the entire effect is an artifact of bias?
*How much of an impact might the bias have? [...]
Methods developed to address publication bias require us to make many assumptions, including the assumption that the pattern of results is due to bias, and that this bias follows a certain model. [...] In order to gauge the impact of publication bias we need a model that tells us which studies are likely to be missing. The model that is generally used [...] makes the following assumptions: (a) Large studies are likely to be published regardless of statistical significance because these involve large commitments of time and resources. (b) Moderately sized studies are at risk for being lost, but with a moderate sample size even modest effects will be significant, and so only some studies are lost here. (c) Small studies are at greatest risk for being lost. Because of the small sample size, only the largest effects are likely to be significant, with the small and moderate effects likely to be unpublished.
The combined result of these three items is that we expect the bias to increase as the sample size goes down, and the methods described [...] are all based on this model. [...] [One problem is however that] when there is clear evidence of asymmetry, we cannot assume that this reflects publication bias. The effect size may be larger in small studies because we retrieved a biased sample of the smaller studies, but it is also possible that the effect size really is larger in smaller studies for entirely unrelated reasons. For example, the small studies may have been performed using patients who were quite ill, and therefore more likely to benefit from the drug (as is sometimes the case in early trials of a new compound). Or, the small studies may have been performed with better (or worse) quality control than the larger ones. Sterne et al. (2001) use the term small-study effect to describe a pattern where the effect is larger in small studies, and to highlight the fact that the mechanism for this effect is not known.”
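One standard diagnostic for such small-study effects (not named in the quote, but widely used alongside the model it describes) is Egger’s regression test: regress each study’s standardized effect on its precision, and an intercept far from zero signals funnel-plot asymmetry. The data below are hypothetical and deliberately constructed so that the small studies show inflated effects:

```python
# Egger's regression test for funnel-plot asymmetry (hypothetical data).
# Small studies (large SE) report larger effects here, mimicking bias.
effects = [0.10, 0.15, 0.25, 0.35, 0.45]
ses     = [0.05, 0.10, 0.20, 0.30, 0.40]

z    = [e / s for e, s in zip(effects, ses)]   # standardized effects
prec = [1 / s for s in ses]                    # precisions

n = len(z)
mz, mp = sum(z) / n, sum(prec) / n
slope = (sum((p - mp) * (zi - mz) for p, zi in zip(prec, z))
         / sum((p - mp) ** 2 for p in prec))   # estimates the underlying effect
intercept = mz - slope * mp                    # Egger intercept: 0 if symmetric
```

With these made-up numbers the intercept comes out at 1.0, well away from zero, which is the pattern a biased sample of small studies would produce; though, as the quote stresses, asymmetry alone does not prove publication bias, since genuine small-study effects can produce the same picture.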
“It is almost always important to include an assessment of publication bias in relation to a meta-analysis. It will either assure the reviewer that the results are robust, or alert them that the results are suspect.”
I’m currently reading this book. I’ve written about this kind of stuff before here on the blog, so there are some observations from the book which I’ve decided not to repeat here even if it’s stuff that’s nice to know – instead I refer to these posts on the topic (I should perhaps clarify that a few of the observations made in those posts are observations I’d have liked the authors to include in the book as well, though they decided not to). It’s worth mentioning that many other psychology-related posts in the archives also deal with stuff covered in the book, though the focus has often been different – one example would be this post, but there are lots of others as well.
I like the book and it’s certainly worth reading, even if at times it’s been somewhat speculative and I have disagreed with a few of the authors about the interpretation of the research results they’ve presented. Reading a book like this one may easily make you question what you know about yourself and how you think about yourself, which in my case is probably a very good thing. People reading along here may not know this, but I have quite low self-esteem – along the way, however, I got curious about where I’d actually score on the metrics usually applied in the literature, so I decided to have a go at the Rosenberg self-esteem scale before publishing this post. You should check it out if you’re the least bit interested; there are only 10 questions and it shouldn’t take you very long to answer them. My score was 6. It’s best thought of as a state estimate, not a trait estimate, but I doubt I’d ever score anywhere near 15 if not on illegal drugs or severely intoxicated by alcohol. It’s perhaps noteworthy in the context of this test that “average scores for most self-esteem instruments are well above the midpoint of their response scales (more than one standard deviation in many cases)”.
The book is not a self-help book, it’s a research book dealing with (parts of) the psychological literature written on the topic (a lot of stuff has been written on this topic: “Self-esteem is clearly one of the most popular topics in modern psychology, with more than 35,000 publications on the subject of this construct”). That said, a few tentative conclusions about how ‘healthy self-esteem’ may be different from ‘unhealthy self-esteem’ can be drawn from the literature – or anyway have been drawn from the literature by some of the authors of the book, whether or not they should have been drawn – and I’ve included some of these in the post below. An important general point in that context is that self-esteem is a complex trait; “there is far more to self-esteem than simply whether global self-esteem is high or low”. I’ve added a few comments about some key moderating variables below.
The book has a lot of good stuff and unlike a few of the books I’ve recently read it’s relatively easy to blog, in the sense that a somewhat high proportion of the total content would be stuff I could justify including in a post like this. If you like what I’ve included in the post below, I think it’s quite likely you’ll like the book.
With those things out of the way, below some observations from the first half of the book.
“self-esteem is generally considered to be the evaluative aspect of self-knowledge that reflects the extent to which people like themselves and believe they are competent [...]. High self-esteem refers to a highly favorable view of the self, whereas low self-esteem refers to evaluations of the self that are either uncertain or outright negative [...] self-esteem reflects perception rather than reality.
Self-esteem is considered to be a relatively enduring characteristic that possesses both motivational and cognitive components [...]. Individuals tend to show a desire for high levels of self-esteem and engage in a variety of strategies to maintain or enhance their feelings of self-worth [...] Individuals with different levels of self-esteem tend to adopt different strategies to regulate their feelings of self-worth, such that those with high self-esteem are more likely to focus their efforts on further increasing their feelings of self-worth (i.e., self-enhancement), whereas those with low self-esteem are primarily concerned with not losing the limited self-esteem resources they already possess (i.e., self-protection[)] [...] In contrast to the self-enhancing tendencies exhibited by those with high self-esteem, individuals with low levels of self-esteem are more likely to employ self-protective strategies characterized by a reluctance to call attention to themselves, attempts to prevent their bad qualities from being noticed, and an aversion to risk. In essence, individuals with low self-esteem tend to behave in a manner that is generally cautious and conservative [...] the risks taken by individuals with low self-esteem appear to have a greater potential cost for them than for those with high self-esteem because those with low self-esteem lack the evaluative resources necessary to buffer themselves from the self-esteem threats that accompany negative experiences such as failure and rejection.”
“According to the sociometer model, self-esteem has a status-tracking property such that the feelings of self-worth possessed by an individual depend on the level of relational value that the individual believes he or she possesses [...] In essence, the sociometer model suggests that self-esteem is analogous to a gauge that tracks gains in perceived relational value (accompanied by increases in self-esteem) as well as losses in perceived value (accompanied by decreases in self-esteem). [...] Although the sociometer model has been extremely influential, it may provide only a partial representation of the way this information is transferred between the individual and the social environment. That is, status-tracking models of self-esteem have focused exclusively on the influence that perceived standing has on feelings of self-worth [...] without addressing the possibility that self-esteem also influences how others perceive the individual [...] The status-signaling model of self-esteem [...] provides a complement to the sociometer model by addressing the possibility that self-esteem influences how individuals present themselves to others and alters how those individuals are perceived by their social environment. [...] The existing data has supported this basic idea”
“A wide array of studies have shown clear and consistent evidence that individuals who report more positive feelings of self-worth are also more emotionally stable and less prone to psychological distress than those who do not feel as good about themselves [...] There is little debate that self-esteem is positively associated with outcomes such as self-reported happiness [...] and overall life satisfaction [...] Although there is a clear link between low self-esteem and psychopathology, the reason for this connection [however] remains unclear.”
“The model of evaluative self-organization measures the distribution of positively and negatively valenced self-beliefs across self-aspects (i.e., contexts). This model highlights individual differences in the organization of positive and negative beliefs into same- or mixed-valenced self-aspects, labeled compartmentalization and integration, respectively [...] the basic model outlines two types of self-organizations: Evaluative compartmentalization, wherein individuals separate their positive and negative self-beliefs into distinct self-aspects, and evaluative integration, wherein individuals intermix positive and negative self-beliefs in each of their multiple self-aspects [...] Compartmentalized selves are highly differentiated. [...] This suggests that compartmentalized individuals have [relatively high] affect intensity [...] with evaluative integration, there appears to be lower affective intensity. Trait self-esteem is gauged less heavily on affect, but more by cognitive features. Integratives possibly weigh their positive and negative beliefs by more objective standards, such as overall social position. Moreover, state self-esteem is often consistent with trait self-esteem because the situation does not often change the qualities of self-beliefs [...] A second important feature of evaluative self-organization is differential importance [...]. Some selves are considered subjectively more important than others and, naturally, these important selves weigh heavily in self-esteem judgments.”
“Individuals whose positive self-aspects are more important than their negatives (differential importance) are referred to as positively compartmentalized or positively integrative; and those whose negative selves are most important are referred to as negatively compartmentalized or negative integrative. [...] Both negatively compartmentalized and integrative individuals feel as though acceptance from others is beyond their control, but their reactions and approaches to life may differ dramatically. [...] negatively compartmentalized individuals strive to obtain belongingness in much the same way as their positively compartmentalized counterparts, but they fail in their efforts to live up to contingencies. This likely makes social acceptance particularly desirable and the inability to achieve it all the more frustrating, culminating in a despairing form of low self-esteem (a judgment they might arrive at reluctantly). On the other hand, negative integratives’ response to rejection seems more accepting, as if they can simply conclude that they are not worthy of acceptance and concede that their needs are unlikely ever to be met: a defeated form of low self-esteem.”
“People with HSE [high self-esteem] vs LSE [low self-esteem] have very different ways of orienting to their social worlds and regulating feelings of safety and security in response to self-esteem threats. HSE people’s self-confidence and interpersonal security motivates them to strive for positive end states (e.g., positive affect, social rewards) more than avoiding negative end states (e.g., loss of self-esteem, rejection). For example, following self-esteem threats, HSEs are quicker to access their strengths relative to weaknesses, are likely to dismiss the validity of negative feedback, derogate out-group members, make self-serving biases, and express increased zeal about value-laden opinions and ideologies [...] In relational contexts, HSEs often show approach-motivated responses by drawing closer to their relationship partners following threat. [...] In close relationships, LSEs [on the other hand] typically adopt avoidance strategies following threat, such as distancing from their romantic partner and devaluing their relationship [...] When LSEs fail, they feel ashamed and humiliated [...], generalize the failure to other aspects of themselves [...], have difficulty accessing positive thoughts about themselves [...], are less likely to show self-serving biases [...], and are thought to possess fewer positive aspects of their self-image with which to affirm themselves [...] Such findings support the idea that people who feel relationally devalued prioritize self-protection goals over self-enhancement goals or relationship-promotion goals, especially under conditions of heightened threat or perceived risk [...] [various] findings support the idea that whereas HSEs regulate their responses to threat by defending and maintaining their favorable self-views, LSEs regulate their emotional reactions by withdrawing from the situation [...] to avoid further loss of self-esteem. [...] 
Overall, then, these studies suggest that interpersonally motivated responses – to draw closer or distance from others following threat – depends on one’s global level of self-esteem, the degree to which self-worth is invested in a domain, and whether or not one experiences a threat to that domain.”
“We define consistency in this chapter as rank-order stability, which is typically assessed using test-retest correlations [...] The degree of rank-order stability is an important consideration when evaluating whether the construct of self-esteem is more state- or trait-like [...]. Psychological traits such as the Big Five typically exhibit high stability over time, whereas mood and other states tend to exhibit lower levels of stability [...]. Although debate persists [...], we believe that the evidence now supports the conclusion that self-esteem is best conceptualized as a stable trait. Most notably, Trzesniewski and colleagues (2003) examined the rank-order stability of self-esteem using data from 50 published articles (N = 29,839). They found that test-retest correlations are moderate in magnitude and comparable to those found for personality traits [...] The rank-order stability of self-esteem showed a robust curvilinear trend [as a function of age] [...] Recent studies have replicated the curvilinear trend for the Big Five personality domains [...], suggesting that the pattern observed for self-esteem may reflect a more general developmental process. [...] The increasing consistency of self-esteem from childhood to mid-life conforms well to the cumulative continuity principle [...], which states that psychological traits become more consistent as individuals mature into adulthood. [...] After decades of contentious debate [...], research accumulating over the past several years suggests that there are reliable age differences in self-esteem across the life span [...]. A broad generalization is that levels of self-esteem decline from childhood to adolescence, increase during the transition to adulthood, reach a peak sometime in middle adulthood, and decrease in old age [...] there is [...] a biological component to self-esteem that is being increasingly recognized. 
Twin studies indicated that genetic factors account for about 40% of the observed variability in self-esteem (Neiss, Sedikides, & Stevenson, 2002). The relatively high heritability of self-esteem [...] approaches that found for basic personality traits”
“Perceptions by others may shape self-esteem and these reflected appraisals have long been implicated in self-esteem development [...] perceiving one’s partner as supportive and loving leads to greater self-esteem over time, and perceiving a partner to view one less positively leads to diminished self-esteem over time. Similarly, being viewed as competent and liked by peers may promote self-esteem, whereas being viewed as incompetent and disliked by peers may diminish self-esteem [...]. One complicated issue [...] is that self-esteem may actually shape how individuals perceive the world [...]. Individuals with low self-esteem may perceive peer rejection and negativity even when peers do not actually harbor such perspectives. These self-perceptions – whether true or not – may reinforce levels of self-esteem. [...] An emerging body of evidence suggests that people with low self-esteem elicit particular responses from the social environment. [...] For example, people report being more interested in voting for presidential candidates that are perceived as having higher self-esteem and people perceived as having higher self-esteem are thought to make more desirable relationship partners, particularly when the high self-esteem target is male [...] In many cases, the environmental stimuli evoked by self-esteem seem to follow the “corresponsive principle” [...] of personality development, the idea that life experiences accentuate the characteristics that were initially responsible for the environmental experiences in the first place. For instance, when low self-esteem invites victimization, it is likely that peer victimization will further depress self-esteem. Similarly, Holmes and Wood (2009) found that individuals with low self-esteem are less disclosing in interpersonal settings, which tends to hamper the development of close relationships.
As they note, “the avoidance of risk is self-defeating, resulting in lost social opportunities, the very lack of close connection that [individuals with low self-esteem] fear, and the perpetuation of their low self-esteem””
“differences in traits motivate individuals to select certain situations over others. These processes will also facilitate continuity by the corresponsive principle [...] The idea that self-esteem is related to the kinds of environments individuals select for themselves is also consistent with Swann’s self-verification theory, which proposes that individuals are motivated to confirm their pre-existing self-views [...]. That is, individuals with low self-esteem seek contexts that confirm and maintain their low self-regard whereas individuals with high self-esteem seek contexts that promote their high self-regard. This process might explain why individuals with low self-esteem prefer certain kinds of relationships that can involve negative feedback [...]. An important caveat is that individuals with low self-esteem will prefer negative feedback in relationship contexts with a low risk of rejection. The gist is that a romantic partner can be negative but not rejecting. Nonetheless, the upshot of self-verification motives is that they tend to promote the consistency of self-views.”
“Direct measures of self-esteem do a good job of assessing one’s overall level of self-esteem (i.e., global self-evaluation) but individuals with both secure and fragile forms of high self-esteem will report feeling good about themselves and they should score equally high on these measures of self-esteem. Consequently, researchers have begun considering factors beyond self-esteem level in order to identify individuals with the fragile form of high self-esteem. The three main approaches have been to consider whether high self-esteem is contingent, unstable, or accompanied by low implicit self-esteem [...] Contingent self-esteem refers to self-evaluations that depend on meeting standards of performance, approval, or acceptance in order to be maintained. This is a fragile form of high self-esteem because individuals only feel good about themselves when they are able to meet these standards [...] Deci and Ryan [...] argued in their self-determination theory that some people regulate their behavior and goals based on introjected standards – most typically they internalize significant others’ conditional standards of approval – which causes them to develop contingent self-esteem. These individuals become preoccupied with meeting externally derived standards or expectations in order to maintain their feelings of self-worth. As a result, their self-esteem is continually “on the line.” [...] Tellingly, drops in self-esteem that result from failures in contingent domains are often greater in magnitude than boosts that result from successes [...]. This asymmetry may contribute to the overall fragility of contingent self-esteem.
As a result of their positive self-views being continuously on the line, individuals with contingent high self-esteem may go to great lengths to guard against threatening information by engaging in practices such as blaming others for their failures, derogating people who criticize them, or distorting or denying information that reflects poorly on them [...] Taken together, this evidence suggests that contingent high self-esteem is fragile and related to defensiveness.”
“Another indicator of fragile self-esteem is self-esteem instability [...] Unstable high self-esteem is considered to be a form of fragile high self-esteem because these fluctuations in feelings of self-worth suggest that the positive attitudes these individuals hold about themselves are vulnerable to challenges or threats. Contingent self-esteem can contribute to self-esteem instability. [...] Some self-esteem contingencies are more likely to induce self-esteem instability than others. For example, those contingencies of self-worth that have been identified by Crocker and her colleagues [...] as external [...] appear to be more closely associated with self-esteem instability than those contingencies that are internal [...] individuals with unstable high self-esteem are more self-aggrandizing and indicate they are more likely to boast to their friends about their successes than individuals with stable high self-esteem [...] When they actually do perform well (e.g., on an exam), individuals with unstable high self-esteem are more likely to claim that they did so in spite of performance-inhibiting factors [...] The final major indicator of fragile self-esteem is high self-esteem that is accompanied by low levels of implicit self-esteem [...]. A number of longstanding theories suggest that some individuals with high self-esteem are defensive, aggressive, and boastful because they harbor negative self-feelings at less conscious levels [...] We can now offer a provisional answer to the question of what constitutes “healthy” self-esteem: it is high self-esteem that is relatively non-contingent, stable, and accompanied by high implicit self-esteem. [...] The key to healthy self-esteem may [...] be to take one’s focus away from self-esteem. Self-determination theory posits that secure (or “true”) high self-esteem arises from acting in accordance with one’s authentic values and interests rather than regulating behavior around self-esteem concerns [...] 
Somewhat ironically, the most effective way to cultivate healthy self-esteem may be to worry less about having high self-esteem.”
I finished this book earlier today. This is a really hard book to blog, and I ended up concluding that I should just cover the material here by adding some links to relevant wiki articles dealing with stuff also covered in the book. It’s presumably far from obvious to someone who has not read the book how the different links are related to each other, but adding details about this as well would mean rewriting the book. Consider the articles samples of the kind of stuff covered in the book.
It should be noted that the book has very few illustrations and a lot of (/nothing but) formulas, theorems, definitions, and examples. Lots of proofs, naturally. I must admit I found the second half of the book unnecessarily hard to follow because, unlike the first half, it included no images or illustrations at all (like the ones in the various wiki articles below) – you don’t need an illustration on every second page to follow this stuff, but in my opinion you do need some occasionally to help you imagine what’s actually going on; this stuff is abstract enough to begin with without removing all ‘visual aids’. Some of the material was review, but a lot of it was new to me, and I did have a few ‘huh, I never really realized you could think about [X] that way‘-experiences along the way. Part of what made me want to read the book in the first place was that although I’d encountered some of the concepts before – to give an example, the axiom of choice and various properties of sets, such as countability, came up during a micro course last year – I’d never really read a book about this material, so it has felt somewhat disconnected and confusing because I was missing the big picture. It was of course too much to expect that this book alone would give me the big picture, but reading it has certainly helped a bit.
I incidentally did not find it too surprising that the stuff in this book overlaps a bit with some stuff I’ve previously encountered in micro. In a book like this one there are a lot of thoughts and ideas about how to set up a mathematical system: you start out at a very simple level and then gradually add more details along the way when you encounter problems or situations which cannot be handled with the tools/theorems/axioms already at your disposal. This seems to me to be a conceptually similar approach to numbers and mathematics as the approach to modelling preferences and behaviour applied in e.g. Mas-Colell et al.‘s microeconomics textbook, so it makes sense that some similar questions and considerations pop up. A related point is that some of the problems the two analytical approaches are trying to solve are really quite similar: a microeconomist wants to know how best to define preferences and then use the relevant definitions to compare people’s preferences, e.g. by trying to order them in various ways, and he wants to understand which ways of mathematically manipulating the terms used are permissible and which are not – activities which, at least from a certain point of view, are not that different from those of the set theorists who wrote this book.
I decided not to rate the book, in part because I don’t think it’s really fair to rate a math book when you’re not going to have a go at the exercises. I wanted an overview; I don’t care enough about this stuff to dive in deep and start proving things on my own. I was, as mentioned, annoyed by the lack of illustrations in the second part, and occasionally it was really infuriating to encounter the standard ‘…and so it’s obvious that…’ comments you can find in most math textbooks, but I don’t know how much these things should really matter. Another reason I decided not to rate the book is that I’ve been suffering from some pretty bad noise pollution over the last few days, which has meant that I’ve occasionally had a really hard time concentrating on the mathematics (‘if I’d just had some peace and quiet I might have liked the book much better than I did, and I might have done some of the exercises as well’ is my working hypothesis).
Okay, as mentioned I’ve added some relevant links below – check them out if you want to know more about what this book’s about and what kind of mathematics is covered in there. I haven’t read all of those links, but I’ve read about those things by reading the book. I should note, in case it’s not clear from the links, that some of this stuff is quite weird.
Bijection, injection and surjection.
Family of sets.
Axiom of choice.
Axiom of infinity.
Cantor’s diagonal argument.
Cantor’s continuum hypothesis.
Below some observations from Holmes et al.‘s chapters about the sexually transmitted bacterial infections chlamydia and gonorrhea. A few of these chapters covered some really complicated stuff, but I’ve tried to keep the coverage reasonably readable by avoiding many of the technical details. I’ve also tried to make the excerpts easier to read by adding relevant links and by adding brief explanations of specific terms in brackets where this approach seemed like it might be helpful.
“Since the early 1970s, Chlamydia trachomatis has been recognized as a genital pathogen responsible for an increasing variety of clinical syndromes, many closely resembling infections caused by Neisseria gonorrhoeae [...]. Because many practitioners have lacked access to facilities for laboratory testing for chlamydia, these infections often have been diagnosed and treated without benefit of microbiological confirmation. Newer, molecular diagnostic tests have in part now addressed this problem [...] Unfortunately, many chlamydial infections, particularly in women, are difficult to diagnose clinically and elude detection because they produce few or no symptoms and because the symptoms and signs they do produce are nonspecific. [...] chlamydial infections tend to follow a fairly self-limited acute course, resolving into a low-grade persistent infection which may last for years. [...] The disease process and clinical manifestations of chlamydial infections probably represent the combined effects of tissue damage from chlamydial replication and inflammatory responses to chlamydiae and the necrotic material from destroyed host cells. There is an abundant immune response to chlamydial infection (in terms of circulating antibodies or cell-mediated responses), and there is evidence that chlamydial diseases are diseases of immunopathology. [...] A common pathologic end point of chlamydial infection is scarring of the affected mucous membranes. This is what ultimately leads to blindness in trachoma and to infertility and ectopic pregnancy after acute salpingitis. There is epidemiologic evidence that repeated infection results in higher rates of sequelae.”
“The prevalence of chlamydial urethral infection has been assessed in populations of men attending general medical clinics, STD clinics, adolescent medicine clinics, and student health centers and ranges from 3–5% of asymptomatic men seen in general medical settings to 15–20% of all men seen in STD clinics. [...] The overall incidence of C. trachomatis infection in men has not been well defined, since in most countries these infections are not officially reported, are not microbiologically confirmed, and often may be asymptomatic, thus escaping detection. [...] The prevalence of chlamydial infection has been studied in pregnant women, in women attending gynecology or family planning clinics, in women attending STD clinics, in college students, in women attending general medicine or family practice clinics, in school-based clinics, and more recently in population-based studies. Prevalence of infection in these studies has ranged widely from 3% in asymptomatic women in community-based surveys to over 20% in women seen in STD clinics.[31–53] During pregnancy, 3–7% of women generally have been chlamydia positive [...] Several studies in the United States indicate that approximately 5% of neonates acquire chlamydial infection perinatally, yet antibody prevalence in later childhood before onset of sexual activity may exceed 20%.”
“Clinically, chlamydia-positive and chlamydia-negative NGU [Non-Gonococcal Urethritis] cannot be differentiated on the basis of signs or symptoms. Both usually present after a 7–21-day incubation period with dysuria and mild-to-moderate whitish or clear urethral discharge. Examination reveals no abnormalities other than the discharge in most cases [...] Clinical recognition of chlamydial cervicitis depends on a high index of suspicion and a careful cervical examination. There are no genital symptoms that are specifically correlated with chlamydial cervical infection. [...] Although urethral symptoms may develop in some women with chlamydial infection, the majority of female STD clinic patients with urethral chlamydial infection do not have dysuria or frequency. [...] the majority of women with chlamydial infection cannot be distinguished from uninfected women either by clinical examination or by [...] simple tests and thus require the use of specific diagnostic testing. [...] Since many chlamydial infections are asymptomatic, it has become clear that effective control must involve periodic testing of individuals at risk. As the cost of extensive screening may be prohibitive, various approaches to defining target populations at increased risk of infection have been evaluated. One strategy has been to designate patients attending specific high prevalence clinic populations for universal testing. Such clinics would include STD, juvenile detention, and some family planning clinics. This approach, however, fails to account for the majority of asymptomatic infections, since attendees at high prevalence clinics often attend because of symptoms or suspicion of infection. Consequently, selective screening criteria have been developed for use in various clinical settings.[204–208] Among women, young age (generally, <24 years) is a critical risk factor for chlamydial infection in almost all studies. [...] 
The practical implementation of screening programs in settings with low-to-moderate chlamydia prevalence requires that the prevalence at which selective screening becomes cost effective relative to universal screening must be defined. Toward this end, a number of investigators have undertaken cost-effectiveness analyses. Most of these analyses have concluded that universal screening is preferred in settings with chlamydia prevalence above 3–7%.”
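The cost-effectiveness comparison described above can be sketched with some back-of-the-envelope arithmetic. This is my illustration, not an analysis from the book: all the numbers (test cost, sensitivity, the fraction of the population meeting selective criteria, and the fraction of infections that fraction captures) are invented for the example.

```python
# Hedged sketch: cost per infection detected under universal vs. selective
# screening, per 1,000 people in a target population. All parameter values
# below are hypothetical, chosen only to show the shape of the trade-off.

def cost_per_case_detected(prevalence, test_cost, test_sensitivity,
                           fraction_screened=1.0, fraction_of_cases_captured=1.0):
    """Expected screening cost divided by expected number of infections found."""
    n = 1000
    tested = n * fraction_screened
    cases_found = n * prevalence * fraction_of_cases_captured * test_sensitivity
    return (tested * test_cost) / cases_found

# Universal screening: everyone is tested, so all cases are reachable.
universal = cost_per_case_detected(prevalence=0.04, test_cost=20.0,
                                   test_sensitivity=0.9)

# Selective screening (e.g., restrict to age < 24): suppose 40% of the
# population meets the criteria and they account for 70% of all infections.
selective = cost_per_case_detected(prevalence=0.04, test_cost=20.0,
                                   test_sensitivity=0.9,
                                   fraction_screened=0.4,
                                   fraction_of_cases_captured=0.7)

print(f"universal: ${universal:.0f} per case detected")
print(f"selective: ${selective:.0f} per case detected")
```

Selective screening wins on cost per case here, but misses the 30% of infections occurring outside the criteria; the cost-effectiveness analyses cited in the chapter weigh exactly this trade-off, and as prevalence rises the per-case cost of universal screening falls until it becomes preferred.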
Even if you’re a woman who’s decided not to have children and so isn’t terribly worried about infertility, it should be emphasized that untreated chlamydia can cause other really unpleasant things as well, like chronic pelvic pain from pelvic inflammatory disease, or ectopic pregnancy, which may be life-threatening. This is the sort of infection you’ll want to get treated even if you’re not bothered by symptoms.
“Neisseria gonorrhoeae (gonococci) is the etiologic agent of gonorrhea and its related clinical syndromes (urethritis, cervicitis, salpingitis, bacteremia, arthritis, and others). It is closely related to Neisseria meningitidis (meningococci), the etiologic agent of one form of bacterial meningitis, and relatively closely to Neisseria lactamica, an occasional human pathogen. The genus Neisseria includes a variety of other relatively or completely nonpathogenic organisms that are principally important because of their occasional diagnostic confusion with gonococci and meningococci. [...] Many dozens of specific serovars have been defined [...] By a combination of auxotyping and serotyping [...] gonococci can be divided into over 70 different strains; the number may turn out to be much larger.”
“Humans are the only natural host for gonococci. Gonococci survive only a short time outside the human body. Although gonococci can be cultured from a dried environment such as a toilet seat up to 24 hours after being artificially inoculated in large numbers onto such a surface, there is virtually no evidence that natural transmission occurs from toilet seats or similar objects. Gonorrhea is a classic example of an infection spread by contact: immediate physical contact with the mucosal surfaces of an infected person, usually a sexual partner, is required for transmission. [...] Infection most often remains localized to initial sites of inoculation. Ascending genital infections (salpingitis, epididymitis) and bacteremia, however, are relatively common and account for most of the serious morbidity due to gonorrhea.”
“Consideration of clinical manifestations of gonorrhea suggests many facets of the pathogenesis of the infection. Since gonococci persist in the male urethra despite hydrodynamic forces that would tend to wash the organisms from the mucosal surface, they must be able to adhere effectively to mucosal surfaces. Similarly, since gonococci survive in the urethra despite close attachment to large numbers of neutrophils, they must have mechanisms that help them to survive interactions with polymorphonuclear neutrophils. Since some gonococci are able to invade and persist in the bloodstream for many days at least, they must be able to evade killing by normal defense mechanisms of plasma [...] Invasion of the bloodstream also implies that gonococci are able to invade mucosal barriers in order to gain access to the bloodstream. Repeated reinfections of the same patient by one strain strongly suggest that gonococci are able to change surface antigens frequently and/or to escape local immune mechanisms [...] The considerable tissue damage of fallopian tubes consequent to gonococcal salpingitis suggests that gonococci make at least one tissue toxin or gonococci trigger an immune response that results in damage to host tissues. There is evidence to support many of these inferences. [...] Since the mid-1960s, knowledge of the molecular basis of gonococcal–host interactions and of gonococcal epidemiology has increased to the point where it is amongst the best described of all microbial pathogens. [...] Studies of pathogenesis are [however] complicated by the absence of a suitable animal model. A variety of animal models have been developed, each of which has certain utility, but no animal model faithfully reproduces the full spectrum of naturally acquired disease of humans.”
“Gonococci are inherently quite sensitive to antimicrobial agents, compared with many other gram-negative bacteria. However, there has been a gradual selection for antibiotic-resistant mutants in clinical practice over the past several decades [...] The consequence of these events has been to make penicillin and tetracycline therapy ineffective in most areas. Antibiotics such as spectinomycin, ciprofloxacin, and ceftriaxone generally are effective but more expensive than penicillin G and tetracycline. Resistance to ciprofloxacin emerged in SE Asia and Africa in the past decade and has spread gradually throughout much of the world [...] Streptomycin (Str) is not frequently used for therapy of gonorrhea at present, but many gonococci exhibit high-level resistance to Str. [...] Resistance to fluoroquinolones is increasing, and now has become a general problem in many areas of the world.”
“The efficiency of gonorrhea transmission depends on anatomic sites infected and exposed as well as the number of exposures. The risk of acquiring urethral infection for a man following a single episode of vaginal intercourse with an infected woman is estimated to be 20%, rising to an estimated 60–80% following four exposures. The prevalence of infection in women named as secondary sexual contacts of men with gonococcal urethritis has been reported to be 50–90%,[16,17] but no published studies have carefully controlled for number of exposures. It is likely that the single-exposure transmission rate from male to female is higher than that from female to male [...] Previous reports saying that 80% of women with gonorrhea were asymptomatic were most often based on studies of women who were examined in screening surveys or referred to STD clinics because of sexual contact with infected men. Symptomatic infected women who sought medical attention were thus often excluded from such surveys. However [...] more than 75% of women with gonorrhea attending acute care facilities such as hospital emergency rooms are symptomatic. The true proportion of infected women who remain asymptomatic undoubtedly lies between these extremes [...] Asymptomatic infections occur in men as well as women [...] Asymptomatically infected males and females contribute disproportionately to gonorrhea transmission, because symptomatic individuals are more likely to cease sexual activity and seek medical care.”
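The single-exposure and multi-exposure figures quoted above are roughly what a simple independence model predicts: if each exposure independently carries probability p of transmission, the cumulative risk after n exposures is 1 − (1 − p)^n. A minimal sketch (my illustration, not the chapter's calculation):

```python
# Cumulative transmission risk under the (simplifying) assumption that each
# exposure is an independent Bernoulli trial with per-exposure probability p.

def cumulative_risk(p_per_exposure, n_exposures):
    return 1 - (1 - p_per_exposure) ** n_exposures

# With the chapter's ~20% single-exposure risk for men, four exposures give:
risk = cumulative_risk(0.20, 4)
print(f"{risk:.1%}")  # about 59%, near the low end of the quoted 60-80% range
```

The fact that the observed multi-exposure estimate sits at or above the independence prediction is consistent with the chapter's caveat that per-exposure transmission probabilities vary across partnerships and by direction of transmission.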
“the incidence of asymptomatic urethral gonococcal infection in the general population also has been estimated at approximately 1–3%. The prevalence of asymptomatic infection may be much higher, approaching 5% in some studies, because untreated asymptomatic infections may persist for considerable periods. [...] The prevalence of gonorrhea within communities tends to be dynamic, fluctuating over time, and influenced by a number of interactive factors. Mathematical models for gonorrhea within communities suggest that gonorrhea prevalence is sustained not only through continued transmission by asymptomatically infected patients but also by “core group” transmitters who are more likely than members of the general population to become infected and transmit gonorrhea to their sex partners. [...] At present, gonorrhea prevention and control efforts are heavily invested in the concept of vigorous pursuit and treatment of infected core-group members and asymptomatically infected individuals.”
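The "core group" idea mentioned above can be illustrated with a toy two-group SIS (susceptible–infected–susceptible) model. This is my own crude sketch, not a model from the book: the mixing structure is simplified and all parameter values are invented, but it shows the qualitative point that a small high-activity group can sustain transmission that the general population alone could not.

```python
# Toy two-group SIS model (hypothetical parameters, crude Euler integration).
# A small "core" group (2% of the population) with a high contact/transmission
# rate can keep infection endemic even when the general population's rate is
# too low to sustain it on its own.

def simulate(beta_core, beta_gen, recovery, steps=5000, dt=0.01):
    """Returns final infected fractions (core group, general population)."""
    i_core, i_gen = 0.01, 0.01     # initial infected fractions in each group
    n_core, n_gen = 0.02, 0.98     # group sizes as fractions of the population
    for _ in range(steps):
        # shared "force of infection" generated by both groups
        foi = beta_core * n_core * i_core + beta_gen * n_gen * i_gen
        d_core = beta_core * foi * (1 - i_core) - recovery * i_core
        d_gen = beta_gen * foi * (1 - i_gen) - recovery * i_gen
        i_core += dt * d_core
        i_gen += dt * d_gen
    return i_core, i_gen

# With a high-activity core group, infection persists in both groups:
core, gen = simulate(beta_core=30.0, beta_gen=1.0, recovery=1.0)
print(f"core prevalence ~ {core:.2f}, general prevalence ~ {gen:.2f}")

# Without a core group (everyone at the low rate), infection dies out:
core0, gen0 = simulate(beta_core=0.5, beta_gen=0.5, recovery=1.0)
print(f"no core group: prevalences ~ {core0:.4f}, {gen0:.4f}")
```

This is the intuition behind targeting control efforts at core-group members: removing the core's contribution to the force of infection can tip the whole community below the epidemic threshold.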
“Relatively large numbers (>50) of gonococcal A/S [auxotype/serotype] classes usually are present in most communities simultaneously [...] and new strains can be detected over time. The distribution of isolates within A/S classes tends to be uneven, with a few A/S classes contributing disproportionately to the total number of isolates. These predominant A/S classes generally persist within communities for months or years. [...] Interviews of the patients infected by [a specific] strain early in [an] outbreak identified one infected female who acknowledged over 100 different sexual partners over the preceding 2 months, suggesting that she may have played an important role in the introduction and establishment of this gonococcal strain in the community. Thus the Proto/IB-3 strain may have become common in Seattle not because of specific biologic factors but because of its chance of transmission to members of a core population by a high-frequency transmitter.” [100+ partners over a 2-month period! I was completely dumbstruck when I read that.]
“clinical gonorrhea is manifested by a broad spectrum of clinical presentations including asymptomatic and symptomatic local infections, local complicated infections, and systemic dissemination. [...] Acute anterior urethritis is the most common manifestation of gonococcal infection in men. The incubation period ranges from 1 to 14 days or even longer; however, the majority of men develop symptoms within 2–5 days [...] The predominant symptoms are urethral discharge or dysuria [pain on urination]. [...] Without treatment, the usual course of gonococcal urethritis is spontaneous resolution over a period of several weeks, and before the development of effective antimicrobial therapy, 95% of untreated patients became asymptomatic within 6 months. [...] The incubation period for urogenital gonorrhea in women is less certain and probably more variable than in men, but most who develop local symptoms apparently do so within 10 days of infection.[51,52] The most common symptoms are those of most lower genital tract infections in women [...] and include increased vaginal discharge, dysuria, intermenstrual uterine bleeding, and menorrhagia [abnormally heavy and prolonged menstrual period], each of which may occur alone or in combination and may range in intensity from minimal to severe. [...] The clinical assessment of women for gonorrhea is often confounded [...] by the nonspecificity of these signs and symptoms and by the high prevalence of coexisting cervical or vaginal infections with Chlamydia trachomatis, Trichomonas vaginalis, Candida albicans, herpes simplex virus, and a variety of other organisms [...] Among coinfecting agents for patients with gonorrhea in the United States, C. trachomatis [chlamydia] is preeminent. Up to 10–20% of men and 20–30% of women with acute urogenital gonorrhea are coinfected with C. trachomatis.[10,46,76,139–141] In addition, substantial numbers of women with acute gonococcal infection have simultaneous T. vaginalis infections.”
“Among patients with gonorrhea, pharyngeal infection occurs in 3–7% of heterosexual men, 10–20% of heterosexual women, and 10–25% of homosexually active men. [...] Gonococcal infection is transmitted to the pharynx by orogenital sexual contact and is more efficiently acquired by fellatio than by cunnilingus.”
“In men, the most common local complication of gonococcal urethritis is epididymitis [...], a syndrome that occurred in up to 20% of infected patients prior to the availability of modern antimicrobial therapy. [...] Postinflammatory urethral strictures were common complications of untreated gonorrhea in the preantibiotic era but are now rare [...] In acute PID [pelvic inflammatory disease], the clinical syndrome comprised primarily of salpingitis, and frequently including endometritis, tubo-ovarian abscess, or pelvic peritonitis is the most common complication of gonorrhea in women, occurring in an estimated 10–20% of those with acute gonococcal infection.[75,76] PID is the most common of all complications of gonorrhea, as well as the most important in terms of public-health impact, because of both its acute manifestations and its long-term sequelae (infertility, ectopic pregnancy, and chronic pelvic pain).”
“A major impediment to use of culture for gonorrhea diagnosis in many clinical settings is the time, expense, and logistical limitations such as specimen transport to laboratories for testing, a process that may take several days and result in temperature variation or other circumstances that can jeopardize culture viability. In recent years, reliable nonculture assays for gonorrhea detection have become available and are being used increasingly. [...] recently, nucleic acid amplification tests (NAATs) for gonorrhea diagnosis have become widely available.[116,117] Assays based on polymerase chain reaction (PCR), transcription-mediated amplification (TMA), and other nucleic acid amplification technologies have been developed. As a group, commercially available NAATs are more sensitive than culture for gonorrhea diagnosis and specificities are nearly as high as for culture. [...] Emerging data suggest that most currently available NAATs are substantially more sensitive for gonorrhea detection than conventional culture.”
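The remark about NAAT specificity being "nearly as high" as culture matters more than it might sound, particularly when such tests are used for screening low-prevalence populations: by Bayes' theorem, even a small false-positive rate erodes the positive predictive value as prevalence falls. A quick sketch with hypothetical sensitivity/specificity values (my numbers, not the chapter's):

```python
# Positive predictive value (PPV) of a test as a function of prevalence.
# Sensitivity and specificity below are illustrative, not quoted figures.

def ppv(prevalence, sensitivity, specificity):
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)

for prev in (0.20, 0.05, 0.01):
    print(f"prevalence {prev:4.0%}: PPV = {ppv(prev, 0.95, 0.99):.2f}")
```

With these numbers, a positive result in an STD-clinic population (20% prevalence) is almost certainly a true positive, while in a 1%-prevalence community survey roughly half of positives would be false, which is why confirmatory strategies get discussed for low-prevalence screening.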
“Prior to the mid-1930s, when sulfanilamide was introduced, gonorrhea therapy involved local genital irrigation with antiseptic solutions such as silver nitrate [...] By 1944 [...] many gonococci had become sulfanilamide resistant [...] Fortunately, in 1943 the first reports of the near 100% utility of penicillin for gonorrhea therapy were published, and by the end of World War II, as penicillin became available to the general public, it quickly became the therapy of choice. Since then, continuing development of antimicrobial resistance by N. gonorrhoeae[128,129] led to regular revisions of recommended gonorrhea therapy. From the 1950s until the mid-1970s, gradually increasing chromosomal penicillin resistance led to periodic increases in the amount of penicillin required for reliable therapy. [...] by the late 1980s, penicillins and tetracyclines were no longer recommended for gonorrhea therapy.
In addition to resistance to penicillin, tetracyclines, and erythromycin, in 1987, clinically significant chromosomally mediated resistance to spectinomycin — another drug recommended for gonorrhea therapy — was described in U.S. military personnel in Korea. In Korea, because of the high prevalence of PPNG [penicillinase-producing Neisseria gonorrhoeae, i.e., penicillin-resistant strains], in 1981, spectinomycin had been adopted as the drug of choice for gonorrhea therapy. By 1983, however, spectinomycin treatment failures were beginning to occur in patients with gonorrhea [...] Following recognition of the outbreak of spectinomycin-resistant gonococci in Korea, ceftriaxone became the drug of choice for treatment of gonorrhea in U.S. military personnel in that country. [...] Beginning in 1993, fluoroquinolone antibiotics were recommended for therapy of uncomplicated gonorrhea in the United States [...] [However] in 2007 the CDC opted to no longer recommend fluoroquinolone antibiotics for therapy of uncomplicated gonorrhea. This change meant that ceftriaxone and other cephalosporin antibiotics had become the sole class of antibiotics recommended as first-line therapy for gonorrhea. [...] For over two decades, ceftriaxone — a third-generation cephalosporin — has been the most reliable single-dose regimen used for gonorrhea worldwide. [...] there are currently few well-studied therapeutic alternatives to ceftriaxone for gonorrhea treatment.”
“Since meta-analysis is a relatively new field, many people, including those who actually use meta-analysis in their work, have not had the opportunity to learn about it systematically. We hope that this volume will provide a framework that allows them to understand the logic of meta-analysis, as well as how to apply and interpret meta-analytic procedures properly.
This book is aimed at researchers, clinicians, and statisticians. Our approach is primarily conceptual. The reader will be able to skip the formulas and still understand, for example, the differences between fixed-effect and random-effects analysis, and the mechanisms used to assess the dispersion in effects from study to study. However, for those with a statistical orientation, we include all the relevant formulas, along with worked examples. [...] This volume is intended for readers from various substantive fields, including medicine, epidemiology, social science, business, ecology, and others. While we have included examples from many of these disciplines, the more important message is that meta-analytic methods that may have developed in any one of these fields have application to all of them.”
I’ve been reading this book and I like it so far – I’ve read about the topic before, but I’ve been missing a textbook treatment, and this one is quite good (I’m roughly halfway through at this point). Below I have added some observations from the first thirteen chapters of the book:
“Meta-analysis refers to the statistical synthesis of results from a series of studies. While the statistical procedures used in a meta-analysis can be applied to any set of data, the synthesis will be meaningful only if the studies have been collected systematically. This could be in the context of a systematic review, the process of systematically locating, appraising, and then synthesizing data from a large number of sources. Or, it could be in the context of synthesizing data from a select group of studies, such as those conducted by a pharmaceutical company to assess the efficacy of a new drug. If a treatment effect (or effect size) is consistent across the series of studies, these procedures enable us to report that the effect is robust across the kinds of populations sampled, and also to estimate the magnitude of the effect more precisely than we could with any of the studies alone. If the treatment effect varies across the series of studies, these procedures enable us to report on the range of effects, and may enable us to identify factors associated with the magnitude of the effect size.”
“For systematic reviews, a clear set of rules is used to search for studies, and then to determine which studies will be included in or excluded from the analysis. Since there is an element of subjectivity in setting these criteria, as well as in the conclusions drawn from the meta-analysis, we cannot say that the systematic review is entirely objective. However, because all of the decisions are specified clearly, the mechanisms are transparent. A key element in most systematic reviews is the statistical synthesis of the data, or the meta-analysis. Unlike the narrative review, where reviewers implicitly assign some level of importance to each study, in meta-analysis the weights assigned to each study are based on mathematical criteria that are specified in advance. While the reviewers and readers may still differ on the substantive meaning of the results (as they might for a primary study), the statistical analysis provides a transparent, objective, and replicable framework for this discussion. [...] If the entire review is performed properly, so that the search strategy matches the research question, and yields a reasonably complete and unbiased collection of the relevant studies, then (providing that the included studies are themselves valid) the meta-analysis will also be addressing the intended question. On the other hand, if the search strategy is flawed in concept or execution, or if the studies are providing biased results, then problems exist in the review that the meta-analysis cannot correct.”
“Meta-analyses are conducted for a variety of reasons [...] The purpose of the meta-analysis, or more generally, the purpose of any research synthesis has implications for when it should be performed, what model should be used to analyze the data, what sensitivity analyses should be undertaken, and how the results should be interpreted. Losing sight of the fact that meta-analysis is a tool with multiple applications causes confusion and leads to pointless discussions about what is the right way to perform a research synthesis, when there is no single right way. It all depends on the purpose of the synthesis, and the data that are available.”
“The effect size, a value which reflects the magnitude of the treatment effect or (more generally) the strength of a relationship between two variables, is the unit of currency in a meta-analysis. We compute the effect size for each study, and then work with the effect sizes to assess the consistency of the effect across studies and to compute a summary effect. [...] The summary effect is nothing more than the weighted mean of the individual effects. However, the mechanism used to assign the weights (and therefore the meaning of the summary effect) depends on our assumptions about the distribution of effect sizes from which the studies were sampled. Under the fixed-effect model, we assume that all studies in the analysis share the same true effect size, and the summary effect is our estimate of this common effect size. Under the random-effects model, we assume that the true effect size varies from study to study, and the summary effect is our estimate of the mean of the distribution of effect sizes. [...] A key theme in this volume is the importance of assessing the dispersion of effect sizes from study to study, and then taking this into account when interpreting the data. If the effect size is consistent, then we will usually focus on the summary effect, and note that this effect is robust across the domain of studies included in the analysis. If the effect size varies modestly, then we might still report the summary effect but note that the true effect in any given study could be somewhat lower or higher than this value. If the effect varies substantially from one study to the next, our attention will shift from the summary effect to the dispersion itself.”
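To make the weighting idea above concrete, here is a small Python sketch of my own (not from the book) of the fixed-effect summary as an inverse-variance weighted mean; the effect sizes and within-study variances are made-up illustrative numbers:

```python
# Fixed-effect summary effect: an inverse-variance weighted mean.
# Effect sizes and within-study variances are made-up illustrative values.
effects = [0.30, 0.45, 0.20, 0.35]
variances = [0.04, 0.01, 0.09, 0.02]

weights = [1.0 / v for v in variances]   # more precise studies get more weight
summary = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
summary_variance = 1.0 / sum(weights)    # variance of the summary effect

print(round(summary, 4), round(summary_variance, 4))
```

Note how the summary (about 0.39 here) sits closest to the estimate from the most precise study, and how its variance is smaller than any single study’s variance – the sense in which the synthesis estimates the effect more precisely than any study alone.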
“During the time period beginning in 1959 and ending in 1988 (a span of nearly 30 years) there were a total of 33 randomized trials performed to assess the ability of streptokinase to prevent death following a heart attack. [...] The trials varied substantially in size. [...] Of the 33 studies, six were statistically significant while the other 27 were not, leading to the perception that the studies yielded conflicting results. [...] In 1992 Lau et al. published a meta-analysis that synthesized the results from the 33 studies. [...] [They found that] the treatment reduces the risk of death by some 21%. And, this effect was reasonably consistent across all studies in the analysis. [...] The narrative review has no mechanism for synthesizing the p-values from the different studies, and must deal with them as discrete pieces of data. In this example six of the studies were statistically significant while the other 27 were not, which led some to conclude that there was evidence against an effect, or that the results were inconsistent [...] By contrast, the meta-analysis allows us to combine the effects and evaluate the statistical significance of the summary effect. The p-value for the summary effect [was] p=0.0000008. [...] While one might assume that 27 studies failed to reach statistical significance because they reported small effects, it is clear [...] that this is not the case. In fact, the treatment effect in many of these studies was actually larger than the treatment effect in the six studies that were statistically significant. Rather, the reason that 82% of the studies were not statistically significant is that these studies had small sample sizes and low statistical power.”
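The power point here is easy to verify numerically. A quick sketch (my own toy numbers, not the Lau et al. data): if ten small studies each estimate the same effect with the same standard error, none of them is individually significant, yet the pooled estimate clearly is, because the pooled z-statistic grows with the square root of the number of studies:

```python
import math

# Toy illustration: identical effects in ten underpowered studies.
effect = 0.25            # common effect estimate (e.g. a log risk ratio)
se = 0.20                # standard error within each small study
z_single = effect / se   # 1.25 -> not significant in any single study

k = 10
weights = [1.0 / se**2] * k
pooled_se = math.sqrt(1.0 / sum(weights))   # shrinks as k grows
z_pooled = effect / pooled_se               # 1.25 * sqrt(10), about 3.95

print(z_single < 1.96, z_pooled > 1.96)
```

So a reviewer counting “6 significant vs 27 not” is effectively counting sample sizes, not effects – exactly the mistake the authors describe.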
“the [narrative] review will often focus on the question of whether or not the body of evidence allows us to reject the null hypothesis. There is no good mechanism for discussing the magnitude of the effect. By contrast, the meta-analytic approaches discussed in this volume allow us to compute an estimate of the effect size for each study, and these effect sizes fall at the core of the analysis. This is important because the effect size is what we care about. If a clinician or patient needs to make a decision about whether or not to employ a treatment, they want to know if the treatment reduces the risk of death by 5% or 10% or 20%, and this is the information carried by the effect size. [...] The p-value can tell us only that the effect is not zero, and to report simply that the effect is not zero is to miss the point. [...] The narrative review has no good mechanism for assessing the consistency of effects. The narrative review starts with p-values, and because the p-value is driven by the size of a study as well as the effect in that study, the fact that one study reported a p-value of 0.001 and another reported a p-value of 0.50 does not mean that the effect was larger in the former. The p-value of 0.001 could reflect a large effect size but it could also reflect a moderate or small effect in a large study [...] The p-value of 0.50 could reflect a small (or nil) effect size but could also reflect a large effect in a small study [...] This point is often missed in narrative reviews. Often, researchers interpret a nonsignificant result to mean that there is no effect. If some studies are statistically significant while others are not, the reviewers see the results as conflicting. This problem runs through many fields of research. [...] By contrast, meta-analysis completely changes the landscape. First, we work with effect sizes (not p-values) to determine whether or not the effect size is consistent across studies. 
Additionally, we apply methods based on statistical theory to allow that some (or all) of the observed dispersion is due to random sampling variation rather than differences in the true effect sizes. Then, we apply formulas to partition the variance into random error versus real variance, to quantify the true differences among studies, and to consider the implications of this variance.”
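The variance-partitioning step the quote describes can be sketched in a few lines. This uses the standard method-of-moments (DerSimonian–Laird) estimator the book covers: Q measures total dispersion on a standardized scale, the degrees of freedom are what sampling error alone would produce, and the excess is attributed to true between-studies variance, tau². The numbers are illustrative, not from the book:

```python
# Partitioning dispersion into sampling error vs true heterogeneity.
# Effect sizes and within-study variances are made-up illustrative values.
effects = [0.10, 0.30, 0.35, 0.60, 0.45]
variances = [0.03, 0.02, 0.05, 0.01, 0.04]

w = [1.0 / v for v in variances]
m_fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

Q = sum(wi * (yi - m_fixed) ** 2 for wi, yi in zip(w, effects))  # total dispersion
df = len(effects) - 1                    # expected under sampling error alone
C = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
tau2 = max(0.0, (Q - df) / C)            # between-studies variance estimate

print(round(Q, 3), round(tau2, 4))
```

If Q is no larger than its degrees of freedom, all the observed dispersion is consistent with sampling error and tau² is estimated as zero; here Q exceeds df, so some real variation in true effects is inferred.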
“Consider [...] the case where some studies report a difference in means, which is used to compute a standardized mean difference. Others report a difference in proportions which is used to compute an odds ratio. And others report a correlation. All the studies address the same broad question, and we want to include them in one meta-analysis. [...] we are now dealing with different indices, and we need to convert them to a common index before we can proceed. The question of whether or not it is appropriate to combine effect sizes from studies that used different metrics must be considered on a case by case basis. The key issue is that it only makes sense to compute a summary effect from studies that we judge to be comparable in relevant ways. If we would be comfortable combining these studies if they had used the same metric, then the fact that they used different metrics should not be an impediment. [...] When some studies use means, others use binary data, and others use correlational data, we can apply formulas to convert among effect sizes. [...] When we convert between different measures we make certain assumptions about the nature of the underlying traits or effects. Even if these assumptions do not hold exactly, the decision to use these conversions is often better than the alternative, which is to simply omit the studies that happened to use an alternate metric. This would involve loss of information, and possibly the systematic loss of information, resulting in a biased sample of studies. A sensitivity analysis to compare the meta-analysis results with and without the converted studies would be important. [...] Studies that used different measures may [however] differ from each other in substantive ways, and we need to consider this possibility when deciding if it makes sense to include the various studies in the same analysis.”
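The conversions mentioned above follow standard formulas (the book gives them in its chapter on converting among effect sizes). A small sketch – note that the d-to-r conversion below assumes two equal-sized groups, and the input value is made up:

```python
import math

# Standard conversions among effect-size indices.
def d_to_log_odds_ratio(d):
    # Assumes a logistic distribution for the underlying trait.
    return d * math.pi / math.sqrt(3)

def log_odds_ratio_to_d(log_or):
    return log_or * math.sqrt(3) / math.pi

def d_to_r(d, a=4.0):
    # a = 4 corresponds to two equal-sized groups.
    return d / math.sqrt(d * d + a)

d = 0.5  # illustrative standardized mean difference
log_or = d_to_log_odds_ratio(d)
print(round(log_or, 4), round(d_to_r(d), 4))

# Round-tripping recovers the original index:
assert abs(log_odds_ratio_to_d(log_or) - d) < 1e-12
```

As the quote stresses, these conversions rest on distributional assumptions, so comparing results with and without the converted studies (a sensitivity analysis) is the prudent companion step.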
“The precision with which we estimate an effect size can be expressed as a standard error or confidence interval [...] or as a variance [...] The precision is driven primarily by the sample size, with larger studies yielding more precise estimates of the effect size. [...] Other factors affecting precision include the study design, with matched groups yielding more precise estimates (as compared with independent groups) and clustered groups yielding less precise estimates. In addition to these general factors, there are unique factors that affect the precision for each effect size index. [...] Studies that yield more precise estimates of the effect size carry more information and are assigned more weight in the meta-analysis.”
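The three interchangeable expressions of precision mentioned here – variance, standard error, and confidence interval – relate to each other and to the study’s weight as follows (a minimal sketch with made-up numbers):

```python
import math

# Precision three ways, for a made-up effect estimate.
effect = 0.40
variance = 0.0225

se = math.sqrt(variance)                      # standard error = sqrt(variance)
ci = (effect - 1.96 * se, effect + 1.96 * se)  # 95% confidence interval
weight = 1.0 / variance                        # inverse-variance weight

print(round(se, 3), tuple(round(x, 3) for x in ci), round(weight, 2))
```

A larger study has a smaller variance, hence a narrower interval and a larger weight – which is precisely why it “carries more information” in the meta-analysis.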
“Under the fixed-effect model we assume that all studies in the meta-analysis share a common (true) effect size. [...] However, in many systematic reviews this assumption is implausible. When we decide to incorporate a group of studies in a meta-analysis, we assume that the studies have enough in common that it makes sense to synthesize the information, but there is generally no reason to assume that they are identical in the sense that the true effect size is exactly the same in all the studies. [...] Because studies will differ in the mixes of participants and in the implementations of interventions, among other reasons, there may be different effect sizes underlying different studies. [...] One way to address this variation across studies is to perform a random-effects meta-analysis. In a random-effects meta-analysis we usually assume that the true effects are normally distributed. [...] Since our goal is to estimate the mean of the distribution, we need to take account of two sources of variance. First, there is within-study error in estimating the effect in each study. Second (even if we knew the true mean for each of our studies), there is variation in the true effects across studies. Study weights are assigned with the goal of minimizing both sources of variance.”
“Under the fixed-effect model we assume that the true effect size for all studies is identical, and the only reason the effect size varies between studies is sampling error (error in estimating the effect size). Therefore, when assigning weights to the different studies we can largely ignore the information in the smaller studies since we have better information about the same effect size in the larger studies. By contrast, under the random-effects model the goal is not to estimate one true effect, but to estimate the mean of a distribution of effects. Since each study provides information about a different effect size, we want to be sure that all these effect sizes are represented in the summary estimate. This means that we cannot discount a small study by giving it a very small weight (the way we would in a fixed-effect analysis). The estimate provided by that study may be imprecise, but it is information about an effect that no other study has estimated. By the same logic we cannot give too much weight to a very large study (the way we might in a fixed-effect analysis). [...] Under the fixed-effect model there is a wide range of weights [...] whereas under the random-effects model the weights fall in a relatively narrow range. [...] the relative weights assigned under random effects will be more balanced than those assigned under fixed effects. As we move from fixed effect to random effects, extreme studies will lose influence if they are large, and will gain influence if they are small. [...] Under the fixed-effect model the only source of uncertainty is the within-study (sampling or estimation) error. Under the random-effects model there is this same source of uncertainty plus an additional source (between-studies variance). It follows that the variance, standard error, and confidence interval for the summary effect will always be larger (or wider) under the random-effects model than under the fixed-effect model [...] 
Under the fixed-effect model the null hypothesis being tested is that there is zero effect in every study. Under the random-effects model the null hypothesis being tested is that the mean effect is zero. Although some may treat these hypotheses as interchangeable, they are in fact different”
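The weight-balancing behavior described in this quote is easy to demonstrate. In the sketch below (my own illustration; tau² is simply an assumed value), the random-effects weights add the between-studies variance to each study’s own variance, which compresses the range of relative weights:

```python
# Fixed-effect weights (1/v) vs random-effects weights (1/(v + tau^2)).
variances = [0.01, 0.04, 0.10, 0.25]   # from one large study to one small study
tau2 = 0.08                            # assumed between-studies variance

w_fixed = [1.0 / v for v in variances]
w_random = [1.0 / (v + tau2) for v in variances]

def relative(ws):
    total = sum(ws)
    return [w / total for w in ws]

rel_fixed, rel_random = relative(w_fixed), relative(w_random)

print([round(w, 3) for w in rel_fixed])   # wide range of weights
print([round(w, 3) for w in rel_random])  # noticeably narrower range
```

Exactly as the quote says: moving from fixed effect to random effects, the largest study loses relative influence, the smallest gains it, and the spread of weights narrows.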
“It makes sense to use the fixed-effect model if two conditions are met. First, we believe that all the studies included in the analysis are functionally identical. Second, our goal is to compute the common effect size for the identified population, and not to generalize to other populations. [...] this situation is relatively rare. [...] By contrast, when the researcher is accumulating data from a series of studies that had been performed by researchers operating independently, it would be unlikely that all the studies were functionally equivalent. Typically, the subjects or interventions in these studies would have differed in ways that would have impacted on the results, and therefore we should not assume a common effect size. Therefore, in these cases the random-effects model is more easily justified than the fixed-effect model. [...] There is one caveat to the above. If the number of studies is very small, then the estimate of the between-studies variance [...] will have poor precision. While the random-effects model is still the appropriate model, we lack the information needed to apply it correctly. In this case the reviewer may choose among several options, each of them problematic [and one of which is to apply a fixed effects framework].”
i. “If people spent as much time studying as they spent hating, I’d be writing this from a goddamn moon-base.” (Zach Weiner)
ii. “Experience comprises illusions lost, rather than wisdom gained.” (Joseph Roux)
iii. “The happiness which is lacking makes one think even the happiness one has unbearable.” (-ll-)
iv. “Men are more apt to be mistaken in their generalizations than in their particular observations.” (Niccolo Machiavelli)
v. “A good reputation is more valuable than money.” (Publilius Syrus)
vi. “Many receive advice, few profit by it.” (-ll-)
vii. “Anyone can hold the helm when the sea is calm.” (-ll-)
viii. “To forget the wrongs you receive, is to remedy them.” (-ll-)
ix. “It is a bad plan that admits of no modification.” (-ll-)
x. “No one knows what he can do till he tries.” (-ll-)
xi. “Everything is worth what its purchaser will pay for it.” (-ll-)
xii. “I have often regretted my speech, never my silence.” (-ll-)
xiii. “We give to necessity the praise of virtue.” (Quintilian)
xiv. “Those who wish to appear wise among fools, among the wise seem foolish.” (-ll-)
xv. “Shared danger is the strongest of bonds; it will keep men united in spite of mutual dislike and suspicion.” (Titus Livius Patavinus)
xvi. “Favor and honor sometimes fall more fitly on those who do not desire them.” (-ll-)
xvii. “Men are only too clever at shifting blame from their own shoulders to those of others.” (-ll-)
xviii. “Men are slower to recognise blessings than misfortunes.” (-ll-)
xix. “It is easier to criticize than to correct our past errors.” (-ll-)
xx. “There is an old saying which, from its truth, has become proverbial, that friendships should be immortal, enmities mortal.” (-ll-)
(A minor note: These days when I’m randomly browsing wikipedia and not just looking up concepts or terms found in the books I read, I’m mostly browsing the featured content on wikipedia. There’s a lot of featured stuff, and on average such articles are more interesting than random articles. As a result of this approach, all articles covered below are featured articles. A related consequence of this shift is that I may cover fewer articles in future wikipedia posts than I have in the past; this post only contains five articles, which I believe is slightly fewer than usual for these posts – a big reason being that it sometimes takes a lot of time to read a featured article.)
i. Woolly mammoth.
“The woolly mammoth (Mammuthus primigenius) was a species of mammoth, the common name for the extinct elephant genus Mammuthus. The woolly mammoth was one of the last in a line of mammoth species, beginning with Mammuthus subplanifrons in the early Pliocene. M. primigenius diverged from the steppe mammoth, M. trogontherii, about 200,000 years ago in eastern Asia. Its closest extant relative is the Asian elephant. [...] The earliest known proboscideans, the clade which contains elephants, existed about 55 million years ago around the Tethys Sea. [...] The family Elephantidae existed six million years ago in Africa and includes the modern elephants and the mammoths. Among many now extinct clades, the mastodon is only a distant relative of the mammoths, and part of the separate Mammutidae family, which diverged 25 million years before the mammoths evolved. [...] The woolly mammoth coexisted with early humans, who used its bones and tusks for making art, tools, and dwellings, and the species was also hunted for food. It disappeared from its mainland range at the end of the Pleistocene 10,000 years ago, most likely through a combination of climate change, consequent disappearance of its habitat, and hunting by humans, though the significance of these factors is disputed. Isolated populations survived on Wrangel Island until 4,000 years ago, and on St. Paul Island until 6,400 years ago.”
“The appearance and behaviour of this species are among the best studied of any prehistoric animal due to the discovery of frozen carcasses in Siberia and Alaska, as well as skeletons, teeth, stomach contents, dung, and depiction from life in prehistoric cave paintings. [...] Fully grown males reached shoulder heights between 2.7 and 3.4 m (9 and 11 ft) and weighed up to 6 tonnes (6.6 short tons). This is almost as large as extant male African elephants, which commonly reach 3–3.4 m (9.8–11.2 ft), and is less than the size of the earlier mammoth species M. meridionalis and M. trogontherii, and the contemporary M. columbi. [...] Woolly mammoths had several adaptations to the cold, most noticeably the layer of fur covering all parts of the body. Other adaptations to cold weather include ears that are far smaller than those of modern elephants [...] The small ears reduced heat loss and frostbite, and the tail was short for the same reason [...] They had a layer of fat up to 10 cm (3.9 in) thick under the skin, which helped to keep them warm. [...] The coat consisted of an outer layer of long, coarse “guard hair”, which was 30 cm (12 in) on the upper part of the body, up to 90 cm (35 in) in length on the flanks and underside, and 0.5 mm (0.020 in) in diameter, and a denser inner layer of shorter, slightly curly under-wool, up to 8 cm (3.1 in) long and 0.05 mm (0.0020 in) in diameter. The hairs on the upper leg were up to 38 cm (15 in) long, and those of the feet were 15 cm (5.9 in) long, reaching the toes. The hairs on the head were relatively short, but longer on the underside and the sides of the trunk. The tail was extended by coarse hairs up to 60 cm (24 in) long, which were thicker than the guard hairs. It is likely that the woolly mammoth moulted seasonally, and that the heaviest fur was shed during spring.”
“Woolly mammoths had very long tusks, which were more curved than those of modern elephants. The largest known male tusk is 4.2 m (14 ft) long and weighs 91 kg (201 lb), but 2.4–2.7 m (7.9–8.9 ft) and 45 kg (99 lb) was a more typical size. Female tusks averaged at 1.5–1.8 m (4.9–5.9 ft) and weighed 9 kg (20 lb). About a quarter of the length was inside the sockets. The tusks grew spirally in opposite directions from the base and continued in a curve until the tips pointed towards each other. In this way, most of the weight would have been close to the skull, and there would be less torque than with straight tusks. The tusks were usually asymmetrical and showed considerable variation, with some tusks curving down instead of outwards and some being shorter due to breakage.”
“Woolly mammoths needed a varied diet to support their growth, like modern elephants. An adult of six tonnes would need to eat 180 kg (397 lb) daily, and may have foraged as long as twenty hours every day. [...] Woolly mammoths continued growing past adulthood, like other elephants. Unfused limb bones show that males grew until they reached the age of 40, and females grew until they were 25. The frozen calf “Dima” was 90 cm (35 in) tall when it died at the age of 6–12 months. At this age, the second set of molars would be in the process of erupting, and the first set would be worn out at 18 months of age. The third set of molars lasted for ten years, and this process was repeated until the final, sixth set emerged when the animal was 30 years old. A woolly mammoth could probably reach the age of 60, like modern elephants of the same size. By then the last set of molars would be worn out, the animal would be unable to chew and feed, and it would die of starvation.“
“The habitat of the woolly mammoth is known as “mammoth steppe” or “tundra steppe”. This environment stretched across northern Asia, many parts of Europe, and the northern part of North America during the last ice age. It was similar to the grassy steppes of modern Russia, but the flora was more diverse, abundant, and grew faster. Grasses, sedges, shrubs, and herbaceous plants were present, and scattered trees were mainly found in southern regions. This habitat was not dominated by ice and snow, as is popularly believed, since these regions are thought to have been high-pressure areas at the time. The habitat of the woolly mammoth also supported other grazing herbivores such as the woolly rhinoceros, wild horses and bison. [...] A 2008 study estimated that changes in climate shrank suitable mammoth habitat from 7,700,000 km2 (3,000,000 sq mi) 42,000 years ago to 800,000 km2 (310,000 sq mi) 6,000 years ago. Woolly mammoths survived an even greater loss of habitat at the end of the Saale glaciation 125,000 years ago, and it is likely that humans hunted the remaining populations to extinction at the end of the last glacial period. [...] Several woolly mammoth specimens show evidence of being butchered by humans, which is indicated by breaks, cut-marks, and associated stone tools. It is not known how much prehistoric humans relied on woolly mammoth meat, since there were many other large herbivores available. Many mammoth carcasses may have been scavenged by humans rather than hunted. Some cave paintings show woolly mammoths in structures interpreted as pitfall traps. Few specimens show direct, unambiguous evidence of having been hunted by humans.”
“While frozen woolly mammoth carcasses had been excavated by Europeans as early as 1728, the first fully documented specimen was discovered near the delta of the Lena River in 1799 by Ossip Schumachov, a Siberian hunter. Schumachov let it thaw until he could retrieve the tusks for sale to the ivory trade. [Aargh!] [...] The 1901 excavation of the “Berezovka mammoth” is the best documented of the early finds. It was discovered by the Berezovka River, and the Russian authorities financed its excavation. Its head was exposed, and the flesh had been scavenged. The animal still had grass between its teeth and on the tongue, showing that it had died suddenly. [...] By 1929, the remains of 34 mammoths with frozen soft tissues (skin, flesh, or organs) had been documented. Only four of them were relatively complete. Since then, about that many more have been found.”
ii. Daniel Lambert.
“Daniel Lambert (13 March 1770 – 21 June 1809) was a gaol keeper[n 1] and animal breeder from Leicester, England, famous for his unusually large size. After serving four years as an apprentice at an engraving and die casting works in Birmingham, he returned to Leicester around 1788 and succeeded his father as keeper of Leicester’s gaol. [...] At the time of Lambert’s return to Leicester, his weight began to increase steadily, even though he was athletically active and, by his own account, abstained from drinking alcohol and did not eat unusual amounts of food. In 1805, Lambert’s gaol closed. By this time, he weighed 50 stone (700 lb; 318 kg), and had become the heaviest authenticated person up to that point in recorded history. Unemployable and sensitive about his bulk, Lambert became a recluse.
In 1806, poverty forced Lambert to put himself on exhibition to raise money. In April 1806, he took up residence in London, charging spectators to enter his apartments to meet him. Visitors were impressed by his intelligence and personality, and visiting him became highly fashionable. After some months on public display, Lambert grew tired of exhibiting himself, and in September 1806, he returned, wealthy, to Leicester, where he bred sporting dogs and regularly attended sporting events. Between 1806 and 1809, he made a further series of short fundraising tours.
In June 1809, he died suddenly in Stamford. At the time of his death, he weighed 52 stone 11 lb (739 lb; 335 kg), and his coffin required 112 square feet (10.4 m2) of wood. Despite the coffin being built with wheels to allow easy transport, and a sloping approach being dug to the grave, it took 20 men almost half an hour to drag his casket into the trench, in a newly opened burial ground to the rear of St Martin’s Church.”
“Sensitive about his weight, Daniel Lambert refused to allow himself to be weighed, but sometime around 1805, some friends persuaded him to come with them to a cock fight in Loughborough. Once he had squeezed his way into their carriage, the rest of the party drove the carriage onto a large scale and jumped out. After deducting the weight of the (previously weighed) empty carriage, they calculated that Lambert’s weight was now 50 stone (700 lb; 318 kg), and that he had thus overtaken Edward Bright, the 616-pound (279 kg) “Fat Man of Maldon”, as the heaviest authenticated person in recorded history.
Despite his shyness, Lambert badly needed to earn money, and saw no alternative to putting himself on display, and charging his spectators. On 4 April 1806, he boarded a specially built carriage and travelled from Leicester to his new home at 53 Piccadilly, then near the western edge of London. For five hours each day, he welcomed visitors into his home, charging each a shilling (about £3.5 as of 2014). [...] Lambert shared his interests and knowledge of sports, dogs and animal husbandry with London’s middle and upper classes, and it soon became highly fashionable to visit him, or become his friend. Many called repeatedly; one banker made 20 visits, paying the admission fee on each occasion. [...] His business venture was immediately successful, drawing around 400 paying visitors per day. [...] People would travel long distances to see him (on one occasion, a party of 14 travelled to London from Guernsey),[n 5] and many would spend hours speaking with him on animal breeding.”
“After some months in London, Lambert was visited by Józef Boruwłaski, a 3-foot 3-inch (99 cm) dwarf then in his seventies. Born in 1739 to a poor family in rural Pokuttya, Boruwłaski was generally considered to be the last of Europe’s court dwarfs. He was introduced to the Empress Maria Theresa in 1754, and after a short time residing with deposed Polish king Stanisław Leszczyński, he exhibited himself around Europe, thus becoming a wealthy man. At age 60, he retired to Durham, where he became such a popular figure that the City of Durham paid him to live there and he became one of its most prominent citizens [...] The meeting of Lambert and Boruwłaski, the largest and smallest men in the country, was the subject of enormous public interest”
“There was no autopsy, and the cause of Lambert’s death is unknown. While many sources say that he died of a fatty degeneration of the heart or of stress on his heart caused by his bulk, his behaviour in the period leading to his death does not match that of someone suffering from cardiac insufficiency; witnesses agree that on the morning of his death he appeared well, before he became short of breath and collapsed. Bondeson (2006) speculates that the most consistent explanation of his death, given his symptoms and medical history, is that he had a sudden pulmonary embolism.”
“The exposed geology of the Capitol Reef area presents a record of mostly Mesozoic-aged sedimentation in an area of North America in and around Capitol Reef National Park, on the Colorado Plateau in southeastern Utah.
Nearly 10,000 feet (3,000 m) of sedimentary strata are found in the Capitol Reef area, representing nearly 200 million years of geologic history of the south-central part of the U.S. state of Utah. These rocks range in age from Permian (as old as 270 million years old) to Cretaceous (as young as 80 million years old.) Rock layers in the area reveal ancient climates as varied as rivers and swamps (Chinle Formation), Sahara-like deserts (Navajo Sandstone), and shallow ocean (Mancos Shale).
The area’s first known sediments were laid down as a shallow sea invaded the land in the Permian. At first sandstone was deposited but limestone followed as the sea deepened. After the sea retreated in the Triassic, streams deposited silt before the area was uplifted and underwent erosion. Conglomerate followed by logs, sand, mud and wind-transported volcanic ash were later added. Mid to Late Triassic time saw increasing aridity, during which vast amounts of sandstone were laid down along with some deposits from slow-moving streams. As another sea started to return it periodically flooded the area and left evaporite deposits. Barrier islands, sand bars and later, tidal flats, contributed sand for sandstone, followed by cobbles for conglomerate and mud for shale. The sea retreated, leaving streams, lakes and swampy plains to become the resting place for sediments. Another sea, the Western Interior Seaway, returned in the Cretaceous and left more sandstone and shale only to disappear in the early Cenozoic.”
“The Laramide orogeny compacted the region from about 70 million to 50 million years ago and in the process created the Rocky Mountains. Many monoclines (a type of gentle upward fold in rock strata) were also formed by the deep compressive forces of the Laramide. One of those monoclines, called the Waterpocket Fold, is the major geographic feature of the park. The 100 mile (160 km) long fold has a north-south alignment with a steeply east-dipping side. The rock layers on the west side of the Waterpocket Fold have been lifted more than 7,000 feet (2,100 m) higher than the layers on the east. Thus older rocks are exposed on the western part of the fold and younger rocks on the eastern part. This particular fold may have been created due to movement along a fault in the Precambrian basement rocks hidden well below any exposed formations. Small earthquakes centered below the fold in 1979 may be from such a fault. [...] Ten to fifteen million years ago the entire region was uplifted several thousand feet (well over a kilometer) by the creation of the Colorado Plateaus. This time the uplift was more even, leaving the overall orientation of the formations mostly intact. Most of the erosion that carved today’s landscape occurred after the uplift of the Colorado Plateau with much of the major canyon cutting probably occurring between 1 and 6 million years ago.”
Apollonius of Perga (ca. 262 BC – ca. 190 BC) posed and solved this famous problem in his work Ἐπαφαί (Epaphaí, “Tangencies”); this work has been lost, but a 4th-century report of his results by Pappus of Alexandria has survived. Three given circles generically have eight different circles that are tangent to them [...] and each solution circle encloses or excludes the three given circles in a different way [...] The general statement of Apollonius’ problem is to construct one or more circles that are tangent to three given objects in a plane, where an object may be a line, a point or a circle of any size. These objects may be arranged in any way and may cross one another; however, they are usually taken to be distinct, meaning that they do not coincide. Solutions to Apollonius’ problem are sometimes called Apollonius circles, although the term is also used for other types of circles associated with Apollonius. [...] A rich repertoire of geometrical and algebraic methods has been developed to solve Apollonius’ problem, which has been called “the most famous of all” geometry problems.”
v. Globular cluster.
“A globular cluster is a spherical collection of stars that orbits a galactic core as a satellite. Globular clusters are very tightly bound by gravity, which gives them their spherical shapes and relatively high stellar densities toward their centers. The name of this category of star cluster is derived from the Latin globulus—a small sphere. A globular cluster is sometimes known more simply as a globular.
Globular clusters, which are found in the halo of a galaxy, contain considerably more stars and are much older than the less dense galactic, or open clusters, which are found in the disk. Globular clusters are fairly common; there are about 150 to 158 currently known globular clusters in the Milky Way, with perhaps 10 to 20 more still undiscovered. Large galaxies can have more: Andromeda, for instance, may have as many as 500. [...]
Every galaxy of sufficient mass in the Local Group has an associated group of globular clusters, and almost every large galaxy surveyed has been found to possess a system of globular clusters. The Sagittarius Dwarf galaxy and the disputed Canis Major Dwarf galaxy appear to be in the process of donating their associated globular clusters (such as Palomar 12) to the Milky Way. This demonstrates how many of this galaxy’s globular clusters might have been acquired in the past.
Although it appears that globular clusters contain some of the first stars to be produced in the galaxy, their origins and their role in galactic evolution are still unclear.”
Before I started reading the book I wondered whether it’d be worth it, as a book like this might have little to offer someone with my background – I’ve had a few stats courses at this point, and it’s not as if the specific topic of medical statistics is completely unknown to me; I read an epidemiology textbook just last year, for example, and Hill and Glied and Smith covered related topics as well. It wasn’t that I thought there isn’t a lot of medical statistics I don’t already know – there is – it was more a concern that this specific (type of) book might not be the one to read if I wanted to learn a lot of new stuff in this area.
Disregarding the specific medical context of the book, I already knew a lot about many of the topics covered. To take an example, Bartholomew’s book devoted a lot of pages to the question of how to handle missing data in a sample, a question this book devotes 5 sentences to. A lot of details are missing here, and the coverage is not very deep. As I hint at in the goodreads review, I think the approach applied in the book is to some extent simply mistaken; I don’t think this (many chapters on different topics, each chapter 2-3 pages long) is a good way to write a statistics textbook. The many chapters on a wide variety of topics give you the impression that the authors have tried to maximize the number of people who might get something out of the book, which may have ended up meaning that few people will actually get much out of it. On the plus side there are illustrated examples of many of the statistical methods used in the book, and you also get (some of) the relevant formulas for calculating e.g. specific statistics – but you get little understanding of the details of why a method works, when it doesn’t, and what happens when it doesn’t. I already mentioned Bartholomew’s book – entire textbooks have been written about many of the topics this book covers in its two- or three-page chapters; examples include publications such as this, this and this.
Given the way the book starts out (which different types of data exist? How do you calculate an average, and what is a standard deviation?), I think the people most likely to read a book like this are people with very limited knowledge of statistics and data analysis – and when people like that read stats books, you need to be very careful with your wording and assumptions. Maybe I’m just a grumpy old man, but I’m not sure the authors are careful enough. A couple of examples:
“Statistical modelling includes the use of simple and multiple linear regression, polynomial regression, logistic regression and methods that deal with survival data. All these methods rely on generating the mathematical model that describes the relationship between two or more variables. In general, any model can be expressed in the form:
g(Y) = a + b₁x₁ + b₂x₂ + … + bₖxₖ
where Y is the fitted value of the dependent variable, g(.) is some optional transformation of it (for example, the logit transformation), x₁, . . . , xₖ are the predictor or explanatory variables”
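To make the quoted model form concrete, here’s a quick Python sketch of the simplest case – the identity transformation g and a single predictor, fitted by least squares. This is my own illustration, not from the book, and the data are made up:

```python
def fit_simple_linear(xs, ys):
    """Least-squares fit of y = a + b*x: the quoted form with the
    identity transformation g and a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b

# Made-up data, roughly following y = 2 + 3x
xs = [1, 2, 3, 4, 5]
ys = [5.1, 7.9, 11.2, 13.8, 17.0]
a, b = fit_simple_linear(xs, ys)
```

For logistic regression, g would instead be the logit transformation, log(p/(1 − p)), and the fitting would be done by maximum likelihood rather than least squares.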
(In case you were wondering, it took me 20 minutes to find out how to lower those 1′s and 2′s, because it’s not a standard wordpress function and you need to really want to find out how to do this in order to do it. The k’s still look like crap, but I’m not going to spend more time trying to make this look neat. I could of course not copy the book’s formula into the post, or I would have done that. As I’ve pointed out many times, it’s a nightmare to cover mathematical topics on a blog like this. Yeah, I know Terry Tao also blogs on wordpress, but presumably he writes his posts in a different program – I’ve been very much against the idea of doing this myself, even if situations like these sometimes make me seriously reconsider.)
Let’s look closer at this part again: “In general, any model can be expressed…”
This choice of words, and the specific example, is the sort of thing I have in mind. If you don’t know a lot about data analysis and you read a statement like this literally – which is the sort of thing I, for one, am wont to do – you’ll conclude that there’s no such thing as a model which is non-linear in its parameters. But there are lots of such models. Imprecise language like this is incredibly frustrating because it leads either to confusion later on or, if people never read another book on any of these topics, to severe overconfidence and mistaken beliefs due to hidden assumptions.
Here’s another example from chapter 28, on ‘Performing a linear regression analysis’:
“Checking the assumptions
For each observed value of x, the residual is the observed y minus the corresponding fitted Y. Each residual may be either positive or negative. We can use the residuals to check the following assumptions underlying linear regression.
1 There is a linear relationship between x and y: Either plot y against x (the data should approximate a straight line), or plot the residuals against x (we should observe a random scatter of points rather than any systematic pattern).
2 The observations are independent: the observations are independent if there is no more than one pair of observations on each individual.”
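Assumption 1 above can be made concrete with a small sketch (mine, not the book’s; the data are invented and deliberately non-linear): fit the line, compute the residuals, and look for a systematic pattern.

```python
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b  # intercept a, slope b

def residuals(xs, ys, a, b):
    """Observed y minus fitted Y = a + b*x, for each observation."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# A deliberately non-linear relationship: y = x**2
xs = list(range(-3, 4))
ys = [x ** 2 for x in xs]
a, b = fit_line(xs, ys)
res = residuals(xs, ys, a, b)
# Plotted against x these residuals form a systematic U-shape
# (positive at the ends of the x-range, negative in the middle)
# rather than a random scatter: the linearity assumption is violated.
```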
This is not good. Arguably the independence assumption is, in some contexts, best conceived of as untestable in practice, but regardless of whether it ‘really’ is or not, there are a lot of ways in which this assumption may be violated – repeated measurements over time, clustering by family or clinic, and so on – and having no more than one observation per individual is not a sufficient condition for independence. Assuming otherwise is potentially really problematic.
Here’s another example:
“Some words of comfort
Do not worry if you find the theory underlying probability distributions complex. Our experience demonstrates that you want to know only when and how to use these distributions. We have therefore outlined the essentials, and omitted the equations that define the probability distributions. You will find that you only need to be familiar with the basic ideas, the terminology and, perhaps (although infrequently in this computer age), know how to refer to the tables.”
I found this part problematic. If you want to do hypothesis testing using things like the Chi-squared distribution or the F-test (both ‘covered’, sort of, in the book), you need to be really careful about details like the relevant degrees of freedom and how these may depend on what you’re doing with the data. Stuff like this is sometimes not obvious, not even to people who’ve worked with the equations (well, sometimes it is obvious, but it’s easy to forget to correct for estimated parameters, and you can’t always expect the program to do this for you, especially not in more complex model frameworks). My position is that if you’ve never even seen the relevant equations, you have no business conducting anything but the most basic of analyses involving these distributions. Of course a person who’s only read this book would not be able to do more than that anyway, but even so, instead of ‘some words of comfort’ I’d much rather have seen ‘some words of caution’.
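To illustrate the kind of detail I mean (this example is mine, not the book’s): in a chi-squared goodness-of-fit test, the degrees of freedom depend not just on the number of cells but on how many parameters you estimated from the same data, and software won’t always make that correction for you.

```python
def chi_square_stat(observed, expected):
    """Sum of (O - E)^2 / E over the cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def goodness_of_fit_df(n_cells, n_estimated_params):
    """Degrees of freedom: number of cells, minus 1 because the totals
    agree by construction, minus one further df for each parameter
    estimated from the same data."""
    return n_cells - 1 - n_estimated_params

# E.g. testing observed counts in 6 cells against a Poisson whose mean
# was itself estimated from the data: df = 6 - 1 - 1 = 4, not 5.
df = goodness_of_fit_df(6, 1)
```

Referring the statistic to a chi-squared distribution with 5 rather than 4 degrees of freedom would make the test conservative in the wrong direction – exactly the sort of mistake that’s easy to make if you’ve never seen where the df come from.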
One last one:
“* Categorical data – It is relatively easy to check categorical data, as the responses for each variable can only take one of a number of limited values. Therefore, values that are not allowable must be errors.”
Nothing else is said about error checking of categorical data in this specific context, so it would be natural to conclude from reading this that simply checking whether values are ‘allowable’ or not is sufficient to catch all the errors. But the statement is completely uninformative, as a key term remains undefined: the real issue is how to define (observation-specific) ‘allowability’ in the first place. A proper error-finding algorithm has to apply a precise and unambiguous definition of that term, and constructing and applying such a definition will sometimes be quite hard, especially when multiple categories are used and allowed and the category dimension in question is hard to cross-check against other variables. Reading the above sequence, it’d be easy for the reader to assume that all this is very simple and easy.
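A minimal sketch (my own, with entirely hypothetical variable names and rules) of what an actual check has to look like – the real work is in writing down the allowability rules, including the observation-specific cross-checks:

```python
# All names and rules below are hypothetical, purely for illustration.
ALLOWED = {"sex": {"M", "F"}, "smoker": {"yes", "no", "ex"}}

def find_errors(record):
    """Flag values outside the allowed sets, plus observation-specific
    cross-checks -- the part the quoted passage glosses over."""
    errors = [k for k, v in record.items()
              if k in ALLOWED and v not in ALLOWED[k]]
    # Hypothetical cross-check: an ex-smoker needs a plausible
    # years_since_quitting value, so 'ex' alone is not enough.
    if record.get("smoker") == "ex" and record.get("years_since_quitting", 0) <= 0:
        errors.append("smoker/years_since_quitting")
    return errors
```

The second rule is the point: ‘ex’ is a perfectly allowable value for the smoker variable on its own, yet the record can still be erroneous once you cross-check it against another variable.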
Oh well, all this said, the book did have some good stuff as well. I’ve added some further comments and observations from the book below, with which I did not ‘disagree’ (to the extent that this is even possible). It should be noted that the book has a lot of focus on hypothesis testing and (/how to conduct) different statistical tests, and very little on statistical modelling. Many different tests are mentioned and/or explicitly covered, which aside from e.g. standard z-, t- and F-tests also include things like McNemar’s test, Bartlett’s test, the sign test, and the Wilcoxon rank-sum test. Most of these were covered – I realized after having read the book – in the last part of the first statistics text I read, a part I was not required to study and so technically hadn’t read; so I did come across some new stuff while reading the book. Those parts were actually among the parts of the book I liked best, because they contained stuff I didn’t already know, rather than just stuff I used to know but had forgotten about. The few additional quotes added below do to some extent illustrate what the book is like, but it should be kept in mind that they’re perhaps not completely ‘fair’, in the sense of providing a balanced and representative sample of the kind of stuff included in the publication; there are many (but perhaps not enough..) equations along the way (which I’m not going to blog, for reasons already mentioned), and the book includes detailed explanations and illustrations of how to conduct specific tests – it’s quite ‘hands-on’ in some respects, and a lot of tools will be added to the toolbox of someone who’s not read a similar publication before.
“Generally, we make comparisons between individuals in different groups. For example, most clinical trials (Topic 14) are parallel trials, in which each patient receives one of the two (or occasionally more) treatments that are being compared, i.e. they result in between-individual comparisons.
Because there is usually less variation in a measurement within an individual than between different individuals (Topic 6), in some situations it may be preferable to consider using each individual as his/her own control. These within-individual comparisons provide more precise comparisons than those from between-individual designs, and fewer individuals are required for the study to achieve the same level of precision. In a clinical trial setting, the crossover design is an example of a within-individual comparison; if there are two treatments, every individual gets each treatment, one after the other in a random order to eliminate any effect of calendar time. The treatment periods are separated by a washout period, which allows any residual effects (carry-over) of the previous treatment to dissipate. We analyse the difference in the responses on the two treatments for each individual. This design can only be used when the treatment temporarily alleviates symptoms rather than provides a cure, and the response time is not prolonged.”
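The within-individual analysis described above boils down to analysing the per-person differences. A minimal sketch of my own, with made-up numbers:

```python
import math
import statistics

def within_individual_analysis(on_treatment_a, on_treatment_b):
    """Analyse the per-individual differences in response between the
    two treatment periods of a crossover trial."""
    diffs = [a - b for a, b in zip(on_treatment_a, on_treatment_b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    se_d = statistics.stdev(diffs) / math.sqrt(n)
    # mean difference, its standard error, and the paired t statistic
    return mean_d, se_d, mean_d / se_d

# Made-up responses for four individuals on each treatment
mean_d, se_d, t = within_individual_analysis([5, 6, 7, 8], [4, 5, 5, 7])
```

The t statistic would then be referred to a t distribution with n − 1 degrees of freedom.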
“A cohort study takes a group of individuals and usually follows them forward in time, the aim being to study whether exposure to a particular aetiological factor will affect the incidence of a disease outcome in the future [...]
Advantages of cohort studies
*The time sequence of events can be assessed.
*They can provide information on a wide range of outcomes.
*It is possible to measure the incidence/risk of disease directly.
*It is possible to collect very detailed information on exposure to a wide range of factors.
*It is possible to study exposure to factors that are rare.
*Exposure can be measured at a number of time points, so that changes in exposure over time can be studied.
*There is reduced recall and selection bias compared with case-control studies (Topic 16).
Disadvantages of cohort studies
*In general, cohort studies follow individuals for long periods of time, and are therefore costly to perform.
*Where the outcome of interest is rare, a very large sample size is needed.
*As follow-up increases, there is often increased loss of patients as they migrate or leave the study, leading to biased results.
*As a consequence of the long time-scale, it is often difficult to maintain consistency of measurements and outcomes over time. [...]
*It is possible that disease outcomes and their probabilities, or the aetiology of disease itself, may change over time.”
“A case-control study compares the characteristics of a group of patients with a particular disease outcome (the cases) to a group of individuals without a disease outcome (the controls), to see whether any factors occurred more or less frequently in the cases than the controls [...] Many case-control studies are matched in order to select cases and controls who are as similar as possible. In general, it is useful to sex-match individuals (i.e. if the case is male, the control should also be male), and, sometimes, patients will be age-matched. However, it is important not to match on the basis of the risk factor of interest, or on any factor that falls within the causal pathway of the disease, as this will remove the ability of the study to assess any relationship between the risk factor and the disease. Unfortunately, matching [means] that the effect on disease of the variables that have been used for matching cannot be studied.”
“Advantages of case-control studies
quick, cheap and easy [...] particularly suitable for rare diseases. [...] A wide range of risk factors can be investigated. [...] no loss to follow-up.
Disadvantages of case-control studies
Recall bias, when cases have a differential ability to remember certain details about their histories, is a potential problem. For example, a lung cancer patient may well remember the occasional period when he/she smoked, whereas a control may not remember a similar period. [...] If the onset of disease preceded exposure to the risk factor, causation cannot be inferred. [...] Case-control studies are not suitable when exposures to the risk factor are rare.”
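The standard effect measure in a case-control study is the odds ratio from the 2×2 exposure-by-disease table; a quick sketch of my own, with made-up counts:

```python
import math

def odds_ratio(a, b, c, d):
    """2x2 table: a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls."""
    return (a * d) / (b * c)

def log_or_se(a, b, c, d):
    """Woolf's approximate standard error of log(odds ratio)."""
    return math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

# Hypothetical counts: 20 of 100 cases exposed, 10 of 100 controls exposed
or_hat = odds_ratio(20, 80, 10, 90)   # 2.25
```

A 95% confidence interval would then be exp(log(OR) ± 1.96 × SE); note that because cases are sampled by outcome, the odds ratio is the quantity that can be estimated here, not the risk itself.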
“The P-value is the probability of obtaining our results, or something more extreme, if the null hypothesis is true. The null hypothesis relates to the population of interest, rather than the sample. Therefore, the null hypothesis is either true or false and we cannot interpret the P-value as the probability that the null hypothesis is true.”
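The definition in the quote can be illustrated with a small example of my own – a two-sided test of whether a coin is fair, where the p-value is literally the probability, under the null, of any outcome as likely or less likely than the one observed:

```python
import math

def two_sided_p_coin(heads, n):
    """P-value for H0: the coin is fair, given `heads` heads in `n` tosses:
    the probability under H0 of any outcome as likely or less likely
    than the one observed."""
    def pmf(k):
        return math.comb(n, k) * 0.5 ** n
    observed = pmf(heads)
    # small tolerance so outcomes exactly as likely as the observed one count
    return sum(pmf(k) for k in range(n + 1) if pmf(k) <= observed + 1e-12)
```

For 9 heads in 10 tosses this gives 22/1024 ≈ 0.021 – small, but note that it says nothing directly about how biased the coin actually is, which is the point the quote is making.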
“Hypothesis tests which are based on knowledge of the probability distributions that the data follow are known as parametric tests. Often data do not conform to the assumptions that underlie these methods (Topic 32). In these instances we can use non-parametric tests (sometimes referred to as distribution-free tests, or rank methods). [...] Non-parametric tests are particularly useful when the sample size is small [...], and when the data are measured on a categorical scale. However, non-parametric tests are generally wasteful of information; consequently they have less power [...] A number of factors have a direct bearing on power for a given test.
*The sample size: power increases with increasing sample size. [...]
*The variability of the observations: power increases as the variability of the observations decreases [...]
*The effect of interest: the power of the test is greater for larger effects. A hypothesis test thus has a greater chance of detecting a large real effect than a small one.
*The significance level: the power is greater if the significance level is larger”
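The four bullets above can all be read off the standard power formula for a one-sided z-test; a sketch of my own (the standard normal quantiles are hardcoded for the two significance levels used):

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

# Upper quantiles of the standard normal for the significance levels used here
Z = {0.05: 1.6449, 0.10: 1.2816}

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a one-sided z-test of H0: mu = 0 against a true mean
    `effect` > 0, with known standard deviation `sigma` and sample size `n`."""
    return normal_cdf(effect * math.sqrt(n) / sigma - Z[alpha])
```

Power increases with n and with the effect size, decreases as the variability σ grows, and a larger significance level lowers the hurdle and so raises power – exactly the four factors listed.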
“The statistical use of the word ‘regression’ derives from a phenomenon known as regression to the mean, attributed to Sir Francis Galton in 1889. He demonstrated that although tall fathers tend to have tall sons, the average height of the sons is less than that of their tall fathers. The average height of the sons has ‘regressed’ or ‘gone back’ towards the mean height of all the fathers in the population. So, on average, tall fathers have shorter (but still tall) sons and short fathers have taller (but still short) sons.
We observe regression to the mean in screening and in clinical trials, when a subgroup of patients may be selected for treatment because their levels of a certain variable, say cholesterol, are extremely high (or low). If the measurement is repeated some time later, the average value for the second reading for the subgroup is usually less than that of the first reading, tending towards (i.e. regressing to) the average of the age- and sex-matched population, irrespective of any treatment they may have received. Patients recruited into a clinical trial on the basis of a high cholesterol level on their first examination are thus likely to show a drop in cholesterol levels on average at their second examination, even if they remain untreated during this period.”
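The phenomenon is easy to reproduce by simulation; here’s a sketch of my own, with invented numbers, in which nobody is treated and the ‘improvement’ in the selected subgroup is due to measurement noise alone:

```python
import random

random.seed(1)

# Each person's observed cholesterol = stable true level + measurement noise
true_levels = [random.gauss(5.5, 0.8) for _ in range(10000)]
first = [t + random.gauss(0, 0.8) for t in true_levels]
second = [t + random.gauss(0, 0.8) for t in true_levels]  # no treatment given

# Recruit the subgroup whose *first* reading was extreme
subgroup = [i for i, x in enumerate(first) if x > 7.0]
mean_first = sum(first[i] for i in subgroup) / len(subgroup)
mean_second = sum(second[i] for i in subgroup) / len(subgroup)
# mean_second falls back towards the population mean (5.5) even though
# nobody was treated: part of each extreme first reading was just noise.
```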
“A systematic review is a formalized and stringent process of combining the information from all relevant studies (both published and unpublished) of the same health condition; these studies are usually clinical trials [...] of the same or similar treatments but may be observational studies [...] a meta-analysis, because of its inflated sample size, is able to detect treatment effects with greater power and estimate these effects with greater precision than any single study. Its advantages, together with the introduction of meta-analysis software, have led meta-analyses to proliferate. However, improper use can lead to erroneous conclusions regarding treatment efficacy. The following principal problems should be thoroughly investigated and resolved before a meta-analysis is performed.
*Publication bias - the tendency to include in the analysis only the results from published papers; these favour statistically significant findings.
*Clinical heterogeneity - in which differences in the patient population, outcome measures, definition of variables, and/or duration of follow-up of the studies included in the analysis create problems of non-compatibility.
*Quality differences - the design and conduct of the studies may vary in their quality. Although giving more weight to the better studies is one solution to this dilemma, any weighting system can be criticized on the grounds that it is arbitrary.
*Dependence - the results from studies included in the analysis may not be independent, e.g. when results from a study are published on more than one occasion.”
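For reference, the basic fixed-effect summary underlying most meta-analyses is just an inverse-variance weighted average of the study effect sizes; a minimal sketch of my own, with made-up numbers:

```python
def fixed_effect_summary(effects, variances):
    """Inverse-variance weighted summary effect (fixed-effect model):
    each study contributes its effect size, weighted by 1/variance."""
    weights = [1 / v for v in variances]
    summary = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return summary, 1 / sum(weights)  # summary effect and its variance

# Made-up effect sizes and variances from three studies
summary, summary_var = fixed_effect_summary([0.3, 0.5, 0.4], [0.04, 0.02, 0.08])
```

A random-effects model would additionally add an estimated between-studies variance component to each study’s variance before weighting, which is how the between-studies dispersion mentioned in the first quote at the top of this post enters the weights.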
“Mathematical models underpin much ecological theory, [...] [y]et most students of ecology and environmental science receive much less formal training in mathematics than their counterparts in other scientific disciplines. Motivating both graduate and undergraduate students to study ecological dynamics thus requires an introduction which is initially accessible with limited mathematical and computational skill, and yet offers glimpses of the state of the art in at least some areas. This volume represents our attempt to reconcile these conflicting demands [...] Ecology is the branch of biology that deals with the interaction of living organisms with their environment. [...] The primary aim of this book is to develop general theory for describing ecological dynamics. Given this aspiration, it is useful to identify questions that will be relevant to a wide range of organisms and/or habitats. We shall distinguish questions relating to individuals, populations, communities, and ecosystems. A population is all the organisms of a particular species in a given region. A community is all the populations in a given region. An ecosystem is a community related to its physical and chemical environment. [...] Just as the physical and chemical properties of materials are the result of interactions involving individual atoms and molecules, so the dynamics of populations and communities can be interpreted as the combined effects of properties of many individuals [...] All models are (at best) approximations to the truth so, given data of sufficient quality and diversity, all models will turn out to be false. The key to understanding the role of models in most ecological applications is to recognise that models exist to answer questions. A model may provide a good description of nature in one context but be woefully inadequate in another. [...] Ecology is no different from other disciplines in its reliance on simple models to underpin understanding of complex phenomena. [...] 
the real world, with all its complexity, is initially interpreted through comparison with the simplistic situations described by the models. The inevitable deviations from the model predictions [then] become the starting point for the development of more specific theory.”
I haven’t blogged this book yet even though it’s been a while since I finished it, and I figured I ought to talk a little bit about it now. As pointed out on goodreads, I really liked the book. It’s basically a math textbook for biologists which deals with how to set up models in a specific context, namely questions pertaining to ecological dynamics; having read the above quote you should at this point at least have some idea of the kind of stuff this field deals with. Here are a few links to examples of applications mentioned/covered in the book which may give you a better idea of the kinds of things covered.
There are 9 chapters in the book, and only the introductory chapter has fewer than 50 ‘named’ equations – most have around 70-80 equations, and 3 of them have more than 100. I have tried to avoid equations in this post, in part because it’s hell to deal with them in wordpress, so I’ll be leaving out a lot of stuff in my coverage. Large chunks of the coverage were to some extent review, but there was also some new stuff in there. The book covers material intended both for undergraduates and graduates, and even if it is presumably aimed at biology majors, many of the ideas can also be ‘transferred’ to other contexts where the same types of modelling frameworks might be applied; for example there are some differences between discrete-time models and continuous-time models, and those differences apply regardless of whether you’re modelling animal behaviour or, say, human behaviour. A local stability analysis looks quite similar in the contexts of an economic model and an ecological model. Etc. I’ve tried to mostly talk about rather ‘general stuff’ in this coverage, i.e. model concepts and key ideas covered in the book which might be applicable in other fields of research as well. I’ve tried to keep things reasonably simple in this post, and I’ve only talked about stuff from the first three chapters.
“The simplest ecological models, called deterministic models, make the assumption that if we know the present condition of a system, we can predict its future. Before we can begin to formulate such a model, we must decide what quantities, known as state variables, we shall use to describe the current condition of the system. This choice always involves a subtle balance of biological realism (or at least plausibility) against mathematical complexity. [...] The first requirement in formulating a usable model is [...] to decide which characteristics are dynamically important in the context of the questions the model seeks to answer. [...] The diversity of individual characteristics and behaviours implies that without considerable effort at simplification, a change of focus towards communities will be accompanied by an explosive increase in model complexity. [...] A dynamical model is a mathematical statement of the rules governing change. The majority of models express these rules either as an update rule, specifying the relationship between the current and future state of the system, or as a differential equation, specifying the rate of change of the state variables. [...] A system with [the] property [that the update rule does not depend on time] is said to be autonomous. [...] [If the update rule depends on time, the models are called non-autonomous].”
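An update rule in the sense described above is easy to write down; here’s a sketch of my own using discrete logistic growth as the example:

```python
def iterate(update, x0, steps):
    """Repeatedly apply an update rule x_{t+1} = f(x_t).
    The rule does not depend on t, so the system is autonomous."""
    xs = [x0]
    for _ in range(steps):
        xs.append(update(xs[-1]))
    return xs

# Example update rule: discrete logistic growth with growth rate r = 2
def logistic_update(x, r=2.0):
    return r * x * (1 - x)

trajectory = iterate(logistic_update, 0.1, 20)
# for r = 2 the state variable settles at the equilibrium 1 - 1/r = 0.5
```

A non-autonomous version would simply pass the time step t into the update rule as well, e.g. to represent seasonal variation in the growth rate.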
“Formulation of a dynamic model always starts by identifying the fundamental processes in the system under investigation and then setting out, in mathematical language, the statement that changes in system state can only result from the operation of these processes. The “bookkeeping” framework which expresses this insight is often called a conservation equation or a balance equation. [...] Writing down balance equations is just the first step in formulating an ecological model, since only in the most restrictive circumstances do balance equations on their own contain enough information to allow prediction of future values of state variables. In general, [deterministic] model formulation involves three distinct steps: *choose state variables, *derive balance equations, *make model-specific assumptions.
Selection of state variables involves biological or ecological judgment [...] Deriving balance equations involves both ecological choices (what processes to include) and mathematical reasoning. The final step, the selection of assumptions particular to any one model, is left to last in order to facilitate model refinement. For example, if a model makes predictions that are at variance with observation, we may wish to change one of the model assumptions, while still retaining the same state variables and processes in the balance equations. [...] a remarkably good approximation to [...] stochastic dynamics is often obtained by regarding the dynamics as ‘perturbations’ of a non-autonomous, deterministic system. [...] although randomness is ubiquitous, deterministic models are an appropriate starting point for much ecological modelling. [...] even where deterministic models are inadequate, an essential prerequisite to the formulation and analysis of many complex, stochastic models is a good understanding of a deterministic representation of the system under investigation.”
“Faced with an update rule or a balance equation describing an ecological system, what do we do? The most obvious line of attack is to attempt to find an analytical solution [...] However, except for the simplest models, analytical solutions tend to be impossible to derive or to involve formulae so complex as to be completely unhelpful. In other situations, an explicit solution can be calculated numerically. A numerical solution of a difference equation is a table of values of the state variable (or variables) at successive time steps, obtained by repeated application of the update rule [...] Numerical solutions of differential equations are more tricky [but sophisticated methods for finding them do exist] [...] for simple systems it is possible to obtain considerable insight by ‘numerical experiments’ involving solutions for a number of parameter values and/or initial conditions. For more complex models, numerical analysis is typically the only approach available. But the unpleasant reality is that in the vast majority of investigations it proves impossible to obtain complete or near-complete information about a dynamical system, either by deriving analytical solutions or by numerical experimentation. It is therefore reassuring that over the past century or so, mathematicians have developed methods of determining the qualitative properties of the solutions of dynamic equations, and thus answering many questions [...] without explicitly solving the equations concerned.”
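To make the idea of a numerical solution concrete: here is a minimal sketch (my own, not from the book) of “repeated application of the update rule”, using the discrete logistic map as an illustrative update rule with an assumed parameter value:

```python
def iterate_map(update, x0, steps):
    """Tabulate a numerical solution of a difference equation by
    repeatedly applying the update rule to the current state."""
    xs = [x0]
    for _ in range(steps):
        xs.append(update(xs[-1]))
    return xs

def logistic_update(x, r=2.5):
    # Discrete logistic map; r = 2.5 is an assumed illustrative value.
    return r * x * (1.0 - x)

trajectory = iterate_map(logistic_update, x0=0.1, steps=50)
# With r = 2.5, initial conditions in (0, 1) settle on the
# equilibrium 1 - 1/r = 0.6.
print(round(trajectory[-1], 6))   # 0.6
```

The returned table of state values at successive time steps is exactly what the authors mean by a “numerical solution of a difference equation”.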
“[If] the long-term behaviour of the state variable is independent of the initial condition [...] the ‘end state’ [...] is known as an attractor. [...] Equilibrium states need not be attractors; they can be repellers [as well] [...] if a dynamical system has an equilibrium state, any initial condition other than the exact equilibrium value may lead to the state variable converging towards the equilibrium or diverging away from it. We characterize such equilibria as stable and unstable respectively. In some models all initial conditions result in the state variable eventually converging towards a single equilibrium value. We characterize such equilibria as globally stable. An equilibrium that is approached only from a subset of all possible initial conditions (often those close to the equilibrium itself) is said to be locally stable. [...] The combination of non-periodic solutions and sensitive dependence on initial conditions is the signature of the pattern of behaviour known to mathematicians as chaos.”
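The ‘sensitive dependence on initial conditions’ mentioned at the end can be seen directly in a few lines; this is my sketch, and the chaotic logistic map (r = 4) and the size of the initial perturbation are illustrative choices, not the book’s:

```python
def logistic(x, r=4.0):
    """One step of the discrete logistic map; r = 4 is in the chaotic regime."""
    return r * x * (1.0 - x)

# Two trajectories whose initial conditions differ by one part in a billion.
a, b = 0.2, 0.2 + 1e-9
diffs = []
for _ in range(60):
    a, b = logistic(a), logistic(b)
    diffs.append(abs(a - b))

# Sensitive dependence: the microscopic initial difference is amplified
# step by step until the two trajectories bear no resemblance to each other.
print(diffs[0] < 1e-8, max(diffs) > 0.1)   # True True
```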
“Most variables and parameters in models have units. [...] However, the behaviour of a natural system cannot be affected by the units in which we choose to measure the quantities we use to describe it. This implies that it should be possible to write down the defining equations of a model in a form independent of the units we use. For any dynamical equation to be valid, the quantities being equated must be measured in the same units. How then do we restate such an equation in a form which is unaffected by our choice of units? The answer lies in identifying a natural scale or base unit for each quantity in the equations and then using the ratio of each variable to its natural scale in our dynamic description. Since such ratios are pure numbers, we say that they are dimensionless. If a dynamic equation couched in terms of dimensionless variables is to be valid, then both sides of any equality must likewise be dimensionless. [...] the process of non-dimensionalisation, which we call dimensional analysis, can [...] yield information on system dynamics. [...] Since there is no unique dimensionless form for any set of dynamical equations, it is tempting to cut short the scaling process by ‘setting some parameter(s) equal to one’. Even experienced modellers make embarrassing blunders doing this, and we strongly recommend a systematic [...] approach [...] The key element in the scaling process is the selection of appropriate base units – the optimal choice being dependent on the questions motivating our study.”
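A concrete example of the scaling process (a standard textbook one of my own choosing, not worked in this excerpt): for logistic growth, natural base units are the carrying capacity K for population and 1/r for time.

```latex
% Logistic growth: N in individuals, t in time, r in 1/time, K in individuals.
\frac{dN}{dt} = rN\left(1 - \frac{N}{K}\right)
% Define the dimensionless variables n = N/K and \tau = rt. Then
\frac{dn}{d\tau} = n(1 - n)
```

Both sides of the scaled equation are dimensionless, and the two parameters have been absorbed into the base units: every logistic population, whatever its r and K, traces the same dimensionless curve.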
“The starting point for selecting the appropriate formalism [in the context of the time dimension] must [...] be recognition that real ecological processes operate in continuous time. Discrete-time models make some approximation to the outcome of these processes over a finite time interval, and should thus be interpreted with care. This caution is particularly important as difference equations are intuitively appealing and computationally simple. [...] incautious empirical modelling with difference equations can have surprising (adverse) consequences. [...] where the time increment of a discrete-time model is an arbitrary modelling choice, model predictions should be shown to be robust against changes in the value chosen.”
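The recommended robustness check, showing that predictions are stable under changes in the (arbitrary) time increment, can be illustrated with a quick numerical experiment. This is my own sketch, using the logistic equation as the continuous-time ‘truth’ and parameter values of my own choosing:

```python
import math

def euler_logistic(n0, r, k, dt, t_end):
    """Discrete-time (Euler) approximation to dN/dt = r*N*(1 - N/K)."""
    n = n0
    for _ in range(int(round(t_end / dt))):
        n += dt * r * n * (1.0 - n / k)
    return n

def exact_logistic(n0, r, k, t):
    """Analytical solution of the logistic ODE, for comparison."""
    return k / (1.0 + (k / n0 - 1.0) * math.exp(-r * t))

# Illustrative parameter values, not from the book.
n0, r, k, t_end = 10.0, 0.5, 100.0, 10.0
exact = exact_logistic(n0, r, k, t_end)
coarse = abs(euler_logistic(n0, r, k, dt=1.0, t_end=t_end) - exact)
fine = abs(euler_logistic(n0, r, k, dt=0.01, t_end=t_end) - exact)
# Shrinking the time increment and checking that the predictions barely
# move is the robustness check the authors recommend.
print(fine < coarse)   # True
```

With dt = 1 the discrete model visibly lags the continuous process it approximates; with dt = 0.01 the two are nearly indistinguishable.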
“Of the almost limitless range of relations between population flux and local density, we shall discuss only two extreme possibilities. Advection occurs when an external physical flow (such as an ocean current) transports all the members of the population past the point, x [in a spatially one-dimensional model], with essentially the same velocity, v. [...] Diffusion occurs when the members of the population move at random. [...] This leads to a net flow rate which is proportional to the spatial gradient of population density, with a constant of proportionality D, which we call the diffusion constant. [...] the net flow [in this case] takes individuals from regions of high density to regions of low density” [...] […some remarks about reaction-diffusion models, which I’d initially thought I’d cover here but which turned out to be too much work to deal with (the coverage is highly context-dependent)].
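In standard notation (my rendering, not copied from the book; v, D, N and x are as in the quote), the two flux rules, substituted into the one-dimensional balance equation, give:

```latex
% Advective flux, and the balance equation it produces:
J_{\text{adv}} = vN
\quad\Longrightarrow\quad
\frac{\partial N}{\partial t} = -v\,\frac{\partial N}{\partial x},
% Diffusive flux, and the balance equation it produces:
\qquad
J_{\text{dif}} = -D\,\frac{\partial N}{\partial x}
\quad\Longrightarrow\quad
\frac{\partial N}{\partial t} = D\,\frac{\partial^2 N}{\partial x^2}.
```

The minus sign in the diffusive flux is what sends individuals from regions of high density to regions of low density.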
This is a neat little book in the Springer Briefs in Statistics series. The author is David J. Bartholomew, a former statistics professor at the LSE. I wrote a brief goodreads review, but I thought that I might as well also add a post about the book here. The book covers topics such as the EM algorithm, Gibbs sampling, the Metropolis–Hastings algorithm and the Rasch model, and it assumes you’re familiar with stuff like how to do ML estimation, among many other things. I had some passing familiarity with many of the topics he talks about in the book, but I’m sure I’d have benefited from knowing more about some of the specific topics covered. Because large parts of the book are basically unreadable for people without a stats background, I wasn’t sure how much of it made sense to cover here, but I decided to talk a bit about a few of the things which I believe don’t require you to know a whole lot about this area.
“Modern statistics is built on the idea of models—probability models in particular. [While I was rereading this part, I was reminded of this quote which I came across while finishing my most recent quotes post: “No scientist is as model minded as is the statistician; in no other branch of science is the word model as often and consciously used as in statistics.” Hans Freudenthal.] The standard approach to any new problem is to identify the sources of variation, to describe those sources by probability distributions and then to use the model thus created to estimate, predict or test hypotheses about the undetermined parts of that model. [...] A statistical model involves the identification of those elements of our problem which are subject to uncontrolled variation and a specification of that variation in terms of probability distributions. Therein lies the strength of the statistical approach and the source of many misunderstandings. Paradoxically, misunderstandings arise both from the lack of an adequate model and from over-reliance on a model. [...] At one level is the failure to recognise that there are many aspects of a model which cannot be tested empirically. At a higher level is the failure to recognise that any model is, necessarily, an assumption in itself. The model is not the real world itself but a representation of that world as perceived by ourselves. This point is emphasised when, as may easily happen, two or more models make exactly the same predictions about the data. Even worse, two models may make predictions which are so close that no data we are ever likely to have can ever distinguish between them. [...] All model-dependent inference is necessarily conditional on the model. This stricture needs, especially, to be borne in mind when using Bayesian methods. Such methods are totally model-dependent and thus all are vulnerable to this criticism.
The problem can apparently be circumvented, of course, by embedding the model in a larger model in which any uncertainties are, themselves, expressed in probability distributions. However, in doing this we are embarking on a potentially infinite regress which quickly gets lost in a fog of uncertainty.”
“Mixtures of distributions play a fundamental role in the study of unobserved variables [...] The two important questions which arise in the analysis of mixtures concern how to identify whether or not a given distribution could be a mixture and, if so, to estimate the components. [...] Mixtures arise in practice because of failure to recognise that samples are drawn from several populations. If, for example, we measure the heights of men and women without distinction the overall distribution will be a mixture. It is relevant to know this because women tend to be shorter than men. [...] It is often not at all obvious whether a given distribution could be a mixture [...] even a two-component mixture of normals, has 5 unknown parameters. As further components are added the estimation problems become formidable. If there are many components, separation may be difficult or impossible [...] [To add to the problem,] the form of the distribution is unaffected by the mixing [in the case of the mixing of normals]. Thus there is no way that we can recognise that mixing has taken place by inspecting the form of the resulting distribution alone. Any given normal distribution could have arisen naturally or be the result of normal mixing [...] if f(x) is normal, there is no way of knowing whether it is the result of mixing and hence, if it is, what the mixing distribution might be.”
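The last claim, that normal mixing leaves the form of the distribution unchanged, refers to mixing over the mean with a normal mixing distribution, and it is easy to check by simulation. Here is a minimal sketch (the parameter values are mine, not the book’s): drawing the mean itself from a normal distribution and then sampling around it produces something indistinguishable from a single normal with variance σ² + τ².

```python
import random, statistics

random.seed(0)

# Compound sampling: mu ~ N(m, tau), then x | mu ~ N(mu, sigma).
# The claim is that x is then exactly N(m, sqrt(sigma**2 + tau**2)).
m, tau, sigma = 170.0, 6.0, 4.0
xs = []
for _ in range(100_000):
    mu = random.gauss(m, tau)
    xs.append(random.gauss(mu, sigma))

print(round(statistics.mean(xs), 1))   # close to 170.0
print(round(statistics.stdev(xs), 1))  # close to sqrt(36 + 16), about 7.2
```

Nothing in the shape of the resulting sample betrays that any mixing took place, which is exactly the identification problem the quote describes.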
“Even if there is close agreement between a model and the data it does not follow that the model provides a true account of how the data arose. It may be that several models explain the data equally well. When this happens there is said to be a lack of identifiability. Failure to take full account of this fact, especially in the social sciences, has led to many over-confident claims about the nature of social reality. Lack of identifiability within a class of models may arise because different values of their parameters provide equally good fits. Or, more seriously, models with quite different characteristics may make identical predictions. [...] If we start with a model we can predict, albeit uncertainly, what data it should generate. But if we are given a set of data we cannot necessarily infer that it was generated by a particular model. In some cases it may, of course, be possible to achieve identifiability by increasing the sample size but there are cases in which, no matter how large the sample size, no separation is possible. [...] Identifiability matters can be considered under three headings. First there is lack of parameter identifiability which is the most common use of the term. This refers to the situation where there is more than one value of a parameter in a given model each of which gives an equally good account of the data. [...] Secondly there is what we shall call lack of model identifiability which occurs when two or more models make exactly the same data predictions. [...] The third type of identifiability is actually the combination of the foregoing types.
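A minimal illustration of the first heading, lack of parameter identifiability (my own example, not the book’s): in a two-component normal mixture, swapping the component labels gives a different parameter vector that implies exactly the same distribution, so no data set of any size can distinguish the two.

```python
import math

def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def mixture_pdf(x, p, mu1, sd1, mu2, sd2):
    return p * normal_pdf(x, mu1, sd1) + (1 - p) * normal_pdf(x, mu2, sd2)

# Two different parameter vectors... (illustrative values)
theta_a = dict(p=0.3, mu1=-1.0, sd1=1.0, mu2=2.0, sd2=0.5)
theta_b = dict(p=0.7, mu1=2.0, sd1=0.5, mu2=-1.0, sd2=1.0)  # labels swapped

# ...that assign the same density to every point, hence fit any
# conceivable data equally well ('label switching').
same = all(
    abs(mixture_pdf(x, **theta_a) - mixture_pdf(x, **theta_b)) < 1e-12
    for x in [i / 10 for i in range(-50, 51)]
)
print(same)   # True
```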
Mathematical statistics is not well-equipped to cope with situations where models are practically, but not precisely, indistinguishable because it typically deals with things which can only be expressed in unambiguously stated theorems. Of necessity, these make clear-cut distinctions which do not always correspond with practical realities. For example, there are theorems concerning such things as sufficiency and admissibility. According to such theorems, for example, a proposed statistic is either sufficient or not sufficient for some parameter. If it is sufficient it contains all the information, in a precisely defined sense, about that parameter. But in practice we may be much more interested in what we might call ‘near sufficiency’ in some more vaguely defined sense. Because we cannot give a precise mathematical definition to what we mean by this, the practical importance of the notion is easily overlooked. The same kind of fuzziness arises with what are called structural equation models (or structural relations models) which have played a very important role in the social sciences. [...] we shall argue that structural equation models are almost always unidentifiable in the broader sense of which we are speaking here. [...] [our results] constitute a formidable argument against the careless use of structural relations models. [...] In brief, the valid use of a structural equations model requires us to lean very heavily upon assumptions about which we may not be very sure. It is undoubtedly true that if such a model provides a good fit to the data, then it provides a possible account of how the data might have arisen. It says nothing about what other models might provide an equally good, or even better fit. As a tool of inductive inference designed to tell us something about the social world, linear structural relations modelling has very little to offer.”
“It is very common for data to be missing and this introduces a risk of bias if inferences are drawn from incomplete samples. However, we are not usually interested in the missing data themselves but in the population characteristics to whose estimation those values were intended to contribute. [...] A very longstanding way of dealing with missing data is to fill in the gaps by some means or other and then carry out the standard analysis on the completed data set. This procedure is known as imputation. [...] In its simplest form, each missing data point is replaced by a single value. Because there is, inevitably, uncertainty about what the imputed values should be, one can do better by substituting a range of plausible values and comparing the results in each case. This is known as multiple imputation. [...] missing values may occur anywhere and in any number. They may occur haphazardly or in some pattern. In the latter case, the pattern may provide a clue to the mechanism underlying the loss of data and so suggest a method for dealing with it. The conditional distribution which we have supposed might be the basis of imputation depends, of course, on the mechanism behind the loss of data. From a practical point of view the detailed information necessary to determine this may not be readily obtainable or, even, necessary. Nevertheless, it is useful to clarify some of the issues by introducing the idea of a probability mechanism governing the loss of data. This will enable us to classify the problems which would have to be faced in a more comprehensive treatment. The simplest, if least realistic approach, is to assume that the chance of being missing is the same for all elements of the data matrix. In that case, we can, in effect, ignore the missing values [...] Such situations are designated as MCAR which is an acronym for Missing Completely at Random. [...] In the smoking example we have supposed that men are more likely to refuse [to answer] than women. 
If we go further and assume that there are no other biasing factors we are, in effect, assuming that ‘missingness’ is completely at random for men and women, separately. This would be an example of what is known as Missing at Random (MAR) [...] which means that the missing mechanism depends on the observed variables but not on those that are missing. The final category is Missing Not at Random (MNAR) which is a residual category covering all other possibilities. This is difficult to deal with in practice unless one has an unusually complete knowledge of the missing mechanism.
Another term used in the theory of missing data is that of ignorability. The conditional distribution of y given x will, in general, depend on any parameters of the distribution of M [the variable we use to describe the mechanism governing the loss of observations] yet these are unlikely to be of any practical interest. It would be convenient if this distribution could be ignored for the purposes of inference about the parameters of the distribution of x. If this is the case the mechanism of loss is said to be ignorable. In practice it is acceptable to assume that the concept of ignorability is equivalent to that of MAR.”
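The MCAR/MAR distinction can be made concrete with a small simulation. This is a toy version of the smoking example with numbers of my own invention; only the structure (men refuse to answer more often) follows the book:

```python
import random

random.seed(1)

# Missingness depends on sex (which is observed) but not on smoking
# status itself, i.e. the data are MAR but not MCAR.
# Assumed rates: 40% of men smoke, 20% of women; men refuse to answer
# with probability 0.5, women with probability 0.1.
population = []
for _ in range(200_000):
    male = random.random() < 0.5
    smokes = random.random() < (0.4 if male else 0.2)
    answered = random.random() >= (0.5 if male else 0.1)
    population.append((male, smokes, answered))

def smoking_rate(rows):
    return sum(smokes for _, smokes, _ in rows) / len(rows)

complete_cases = [row for row in population if row[2]]
men_answered = [row for row in complete_cases if row[0]]

print(round(smoking_rate(population), 2))      # ~0.30 (true overall rate)
print(round(smoking_rate(complete_cases), 2))  # ~0.27 (biased: men missing)
print(round(smoking_rate(men_answered), 2))    # ~0.40 (within-sex: unbiased)
```

Under MAR the loss mechanism is ignorable once we condition on the observed variable: estimating the rate within each sex and recombining with the known sex proportions recovers the true overall rate, whereas the naive complete-case estimate does not.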
i. “A slave dreams of freedom, a free man dreams of wealth, the wealthy dream of power, and the powerful dream of freedom.” (Andrzej Majewski)
ii. “The tragedy of a thoughtless man is not that he doesn’t think, but that he thinks that he’s thinking.” (-ll-)
iii. “Money is the necessity that frees us from necessity.” (W. H. Auden)
iv. “Young people, who are still uncertain of their identity, often try on a succession of masks in the hope of finding the one which suits them — the one, in fact, which is not a mask.” (-ll-)
v. “The aphorist does not argue or explain, he asserts; and implicit in his assertion is a conviction that he is wiser and more intelligent than his readers.” (-ll-)
vi. “none become at once completely vile.” (William Gifford)
vii. “It is by a wise economy of nature that those who suffer without change, and whom no one can help, become uninteresting. Yet so it may happen that those who need sympathy the most often attract it the least.” (F. H. Bradley)
viii. “He who has imagination without learning has wings but no feet.” (Joseph Joubert)
ix. “It is better to debate a question without settling it than to settle a question without debating it.” (-ll-)
x. “The aim of an argument or discussion should not be victory, but progress.” (-ll-)
xi. “Are you listening to the ones who keep quiet?” (-ll-)
xii. “Writing is closer to thinking than to speaking.” (-ll-)
xiii. “Misery is almost always the result of thinking.” (-ll-)
xiv. “The great inconvenience of new books is that they prevent us from reading the old ones.” (-ll-)
xv. “A good listener is one who helps us overhear ourselves.” (Yahia Lababidi)
xvi. “To suppose, as we all suppose, that we could be rich and not behave as the rich behave, is like supposing that we could drink all day and keep absolutely sober.” (Logan Pearsall Smith)
xvii. “People say that life is the thing, but I prefer reading.” (-ll-)
xviii. “Most men give advice by the bucket, but take it by the grain.” (William Alger)
xix. “There is an instinct that leads a listener to be very sparing of credence when a fact is communicated [...] But give him a fable fresh from the mint of the Mendacity Society [...] and he will not only make affidavit of its truth, but will call any man out who ventures to dispute its authenticity.” (Samuel Blanchard)
xx. “Experience leaves fools as foolish as ever.” (-ll-)
“In a variety of mammals and a few birds, newly immigrated or newly dominant males are known to attack and kill dependent infants [...]. Hrdy (1974) was the first to suggest that this bizarre behaviour was the product of sexual selection: by killing infants they did not sire, these males advanced the timing of the mother’s next oestrus and, owing to their new social position, would have a reasonable probability of siring this female’s next infant. [...] Although this interpretation, and indeed the phenomenon itself, has been hotly debated for decades [...], on balance, this hypothesis provides a far better fit with the observations on primates than any of the alternatives [...] several large-scale studies have estimated that the time gained by the infanticidal male amounts to [25-32] per cent of the mean interbirth interval [...] Because males rarely, if ever, suffer injuries during infanticidal attacks, and because there is no evidence that committing infanticide leads to reduced tenure length, one can safely conclude that, on average, infanticide is an adaptive male strategy. [...] Infanticide often happens when the former dominant male, the most likely sire of most infants even in multi-male groups [...], is eliminated or incapacitated. [...] dominant males are effective protectors of infants as long as they are not ousted or incapacitated.”
“Conceptually, we can distinguish two kinds of mating by females that may reduce the risk of infanticide. First, by mating polyandrously in potentially fertile periods, females can reduce the concentration of paternity in the dominant male, and spread some of it to other males, so that long-term average paternity probabilities will be somewhat below 1 for the dominant male and somewhat above 0 for the subordinates. Second, by mating during periods of non-fertility [...], a female may be able to manipulate the assessment by the various males of their paternity chances, although she obviously cannot change the actual paternity values allocated to the various males. [...] The basic prediction is that females that are vulnerable to infanticide by males should be actively polyandrous whenever potentially infanticidal males are present in the mating pool (i.e. the sexually mature males in the social unit or nearby with which the female can mate, in principle). There is ample evidence that primate females in vulnerable species actively pursue polyandrous matings and that they often engage in matings when fertilisation is unlikely or impossible [...]. Indeed, females often target low-ranking or peripheral males reluctant to mate in the presence of the dominant males, especially during pregnancy. [...] In species vulnerable to infanticide, females often respond to changes in the male cohort of a group with immediate proceptivity, and effectively solicit matings with the new (or newly dominant) male [...] It is in the female’s interest to keep individual males guessing as to the extent to which other males have also mated with her [...] Hence, females should be likely to mate discreetly, especially with subordinate males. [...] We [expect] that matings between females and subordinate males tend to take place out of sight of the dominant male, e.g. at the periphery and away from the group [...] 
it has been noted for several species that matings between females and subordinate males [do] tend to occur rather surreptitiously”
“Even though most primates have concealed ovulations, there is evidence that they use various pre-copulatory mechanisms, such as friendships [...] or increased proximity [...] with favoured males, copulation calls that are likely to attract particular males [...], active solicitation of copulations around the likely conception date [...], as well as changes in chemical signals [...]; unique vocalizations [...]; sexual swellings [...] and increased frequencies of particular behaviour patterns during the peri-ovulatory phase [...] to signal impending ovulation and/or to increase the chances of fertilization by favoured males.” [Recall from the previous post also in this context that which males are actually ‘favoured’ changes significantly during the cycle].
“Thornhill (1983) suggested that females might exhibit what he called ‘cryptic female choice’ – the differential utilisation of sperm from different males. The term ‘cryptic’ referred to the fact that this choice took place out of sight, inside the female reproductive tract. [...] Cryptic female choice is difficult to demonstrate [as] one has to control for all male effects, such as sperm numbers or differential fertilising ability [...] Cryptic female choice in primates is poorly documented, even though there are theoretical reasons to expect it to be common. [...] The strongest indirect evidence for a mechanism of cryptic female choice in primates is provided by the observation that females of several species of anthropoids (mostly macaques, baboons and chimpanzees) exhibit orgasm [...] Physiological measures during artificially induced orgasms [have] demonstrated the occurrence of the same vaginal and uterine contractions that also characterise human orgasm [...] and are thought to accelerate and facilitate sperm transport towards the cervix and ovaries [...] female orgasm was observed more often in macaque pairs including high-ranking males (Troisi & Carosi, 1998). A comparable effect of male social status on female orgasm rates has also been reported for humans [...]. Orgasm therefore has the potential to be used selectively by females to facilitate fertilisation of their eggs by particular males [...] This hypothesis is indirectly supported by the observation that female orgasm apparently does not occur among prosimians [...], but rather among Old World primates, where the potential for coercive matings by multiple males is highest [...]. Seen this way, female primate orgasm may therefore represent an evolutionary response to male sexual coercion that provided females with an edge in the dynamic competition over the control of fertilisation” [Miller’s account/explanation was quite different. I think both explanations are rather speculative at this point.
Speculative, but interesting.]
“It has long been an established fact in ethology that interactions with social partners influence an individual’s motivational state and vice versa, and, through interactions, its physiological development and condition. For example, the suppression of reproductive processes by the presence of a same-sex conspecific has been documented for many species, including primates. [...] The existence of a conditional [male mating] strategy with different tactics has been demonstrated in several species of mammals. To mention but one clear example: in savannah baboons, a male may decide what tactic to follow in its relationships with females after assessing what others do. Smuts (1985) has shown that dominant males follow a sexual tactic in which they monopolise access to fertile females by contest competition. A subordinate male may use another tactic. He may persuade a female to choose him for mating by rendering services to the female (e.g. protecting her in between-female competition) and thus forming a ‘friendship’ with the female. Similar variation in tactics has been found in other primates (e.g. in rhesus macaques, Berard et al., 1994).”
…And there you probably have at least part of the explanation for why millions of romantically frustrated (…‘pathetic’?) human males waste significant parts of their (reproductive) lives catering to the needs of women who already have a sexual partner and are not sexually interested in them – they might not even have been born were it not for the successful application of this kind of sit-and-wait strategy on the part of some of their ancestors.
The chapter in question has a lot of stuff about male orangutans, and although it’s quite interesting I won’t go much into the details here. I should note however that I think most females will probably prefer the above-mentioned ‘sneaky’ male tactic (in terms of the ‘sneakiness’ of mating strategies, females do pretty well for themselves too – indeed, in this specific setting it’s not unlikely that it’s actually the females who initiate in a substantial number of cases; see above) to the mating tactic of unflanged orangutans, which basically amounts to walking around looking for a female unprotected by a flanged male and then raping her when one is found. In one sample included in the book, of orangutan matings taking place in Tanjung Puting national park (Indonesia), only 1 or 2 of the roughly 20 recorded matings by unflanged males (it’s a bar graph) did not involve the female resisting. These guys are great, and apparently really sexy to the opposite sex… The ratio of resisting to non-resisting females in the case of matings involving flanged males was pretty much the reverse: a couple of forced matings and ~18–19 unforced mating events. It should be noted that the numbers of matings achieved by flanged and unflanged males were roughly similar, so judging from these data approximately half of all the matings these female orangutans experience during their lives are forced.
“Especially in long-lived organisms such as primates, a male’s success in competing for mates and protecting his offspring should be affected by the nature of major social decisions, such as whether and when to transfer to other groups or to challenge dominants. Several studies indicate dependence of male decisions about transfer and acquisition of rank on age and local demography [...]. Likewise, our work on male long-tailed macaques [...] indicated a remarkably tight fit between the behavioural decisions of males and expectations based on known determinants of success [...], suggesting that natural selection has endowed males with rules that, on average, produce optimal life-history trajectories (or careers) for a given set of conditions. [...] Most non-human primates live in groups with continuous male-female association ["Only a minority of about 10 per cent of primate species live in pairs" - from a previous chapter], in which group membership of reproductively active (usually non-natal) males can last many years. For a male living in such a mixed-sex group, dominance rank reflects his relative power in excluding others from resources. However, the impact of dominance on mating success is variable [...] Although rank acquisition is usually considered separately from transfer behaviour and mating success, the hypothesis examined here is that they are interdependent [...]. We predict that the degree of paternity concentration in the dominant male, determined by his ability to exclude other males from mating, determines the relative benefits of various modes of acquisition of top rank [...], and that these together determine patterns of male transfer”
“the cost of inbreeding may cause females to avoid mating with male relatives [...]. This tendency has been invoked to explain an apparent female preference for novel (recently immigrated) males”
“a male can attain top rank in a mixed-sex group in three different ways. First, he can defeat the current dominant male during an aggressive challenge [...] Second, he can attain top rank during the formation of a new group[...] A third way to achieve top rank is by default, or through ‘succession’, after the departure or death of the previous top-ranking male, not preceded by challenges from other males”
The chapter which included the above quotes is quite interesting, but in a way it is also difficult to quote from, given the way it is written. The authors talk about multiple variables which may affect how likely a male is to leave the group in which he was born (for example, if there are fewer females in the group, all else equal he’s more likely to leave); which mechanism he’s likely to employ in order to try to achieve top rank in his group, if that’s indeed an option; and when he’s likely to act. In small groups males always fight for the top spot, and the dominant male will take a very dim view of other mature males trying to encroach upon his territory, whereas in large groups the dominant male is more tolerant of competitors and males are much less likely to settle things by fighting. The reason fighting is less common in large groups is probably that the dominant male is in general unable to monopolize access to the females there, so a male to some extent ‘gains less’ by achieving alpha status. Timing matters too: a young male is stronger than an old male and can also expect to maintain his tenure as the top male for a longer period of time, so males who try to achieve top rank by fighting for it are likely to be young, whereas males who achieve top rank by other means tend to be older. Whether or not females reproduce in a seasonal pattern also matters. It’s obvious from the data that it’s far from random how, and at which point during their lives, males make their transfer decisions, and how they settle conflicts about who should get the top spot. The approach in that chapter reminded me a bit of optimal foraging theory, but they didn’t talk about that kind of stuff at all in the chapter. Here’s what they concluded from the data they presented in the chapter:
“We found not only variation between species but also remarkable variation within species, or even populations, in the effect of group size on paternity concentration and thus transfer decisions, as well as mode of rank acquisition and likelihood of natal transfer. This variability suggests that a primate male’s behaviour is guided by a set of conditional rules that allow him to respond to a variety of local situations. [...] Primate males appear to have a set of conditional rules that allow them to respond flexibly to variation in the potential for paternity concentration. Before mounting a challenge, they assess the situation in their current group, and before making their transfer decisions they monitor the situation in multiple potential-target groups, where this is possible.”
A friend pointed me to a Danish article talking about this. I pointed out a few problems and reasons to be skeptical to my friend, and I figured I might as well share a few thoughts on these matters here as well. I do not have access to my library at the present point in time, so this post will be less well sourced than most posts I’ve written on related topics in the past.
i. I’ve had diabetes for over 25 years. A cure for type 1 diabetes has been just around the corner for decades. This is not a great argument for assuming that a cure will not be developed in a few years’ time, but you do at some point become a bit skeptical.
ii. The type of ‘mouse diabetes’ people use when they’re doing research on animal models such as e.g. NOD mice, from which many such ‘breakthroughs’ are derived, is different from ‘human diabetes’. As pointed out in the reddit thread, “Doug’s group alone has cured diabetes in mice nearly a dozen times”. This may or may not be true, but I’m pretty sure that at the present point in time my probability of being cured of diabetes would be significantly higher if I happened to be one of those lab mice.
iii. A major related point often overlooked in contexts like these is that type 1 diabetes is not one disease – it is a group of different disorders, all sharing the feature that the disease process involved leads to destruction of the pancreatic beta-cells. At least this is not a bad way to think about it. This potentially important, often neglected heterogeneity is worth mentioning when we’re talking about cures. To talk about ‘type 1 diabetes’ as if it’s just one disease is a gross simplification, as multiple different, if similar, disease processes are at work in different patients; some people with ‘the disease’ get sick in days or weeks, while in others it takes years before symptoms develop. Multiple different gene complexes are involved. Prognosis – both regarding the risk of diabetes-related organ damage and the risk of developing ‘other’ autoimmune conditions (‘other’ because it may be the same disease process causing the ‘other’ diseases as well), such as Hashimoto’s thyroiditis – depends to some extent on the mutations involved. This stuff also relates to the question of what we mean by the word ‘cure’ – more on this below. You might argue that although diabetics differ from each other and vary in a lot of ways, the same could be said about the sufferers of all kinds of other diseases, such as, say, prostate cancer – so maybe heterogeneity within this particular patient population is not that important. But the point remains that we don’t treat all prostate cancer patients the same way, and that some are much easier to cure than others.
iv. The distinction between types (type 1, type 2) makes it easy to overlook the fact that there are significant within-group heterogeneities, as mentioned above. But the complexity of the processes involved is perhaps even better illustrated by pointing out that even between-group distinctions can sometimes be quite complicated. The distinction between type 1 and type 2 diabetes is a case in point; usually people say only type 1 is auto-immune, but it was made clear in Sperling et al.’s textbook that that’s not really true; in a minority of type 2 diabetics autoimmune processes are also clearly involved – and this is actually highly relevant, as this subgroup of patients has a much worse prognosis than type 2 diabetics without autoantibody markers; they’ll on average progress to insulin-dependent disease (uncontrollable by e.g. insulin-sensitizers) much faster than people without an auto-immune disease process. In my experience most people who talk about diabetes online, including well-informed people in e.g. reddit/askscience threads, are not (even?) aware of this. I mention it because it’s one obvious example of how hidden within-group heterogeneities can have huge relevance for which treatment modalities are desirable or useful. You’d expect type 2′s with auto-immune processes involved to need a different sort of ‘cure’ than ‘ordinary type 2′s’. For a little more on different ‘varieties’ of diabetes, see also this and this.
There are, as already mentioned, also big differences in outcomes between subgroups within the type 1 group; some people with type 1 diabetes will end up with three or four ‘different’(?) auto-immune diseases, whereas others will get lucky and ‘only’ ever get type 1 diabetes. Not only that, we also know that differences in glycemic control between those groups do not account for all of the between-group differences in outcomes in terms of diabetes-related complications; type 1 diabetics hit by ‘other’ auto-immune processes (e.g. Graves’ disease) tend to be more likely to develop complications to their diabetes than the rest, regardless of glycemic control. Would successful beta-cell transplants, assuming these at some point become feasible, and achieved euglycemia in that patient population still prevent thyroid failure later on? Would the people more severely affected, e.g. people with multiple autoimmune conditions, still develop some of the diabetes-related complications, such as cardiovascular complications, even if they had functional beta cells and were to achieve euglycemia, because those problems may be caused by disease aspects – like accelerated atherosclerosis – to some extent perhaps unrelated to glycemic control? These are things we really don’t know. It’s very important in that context to note that most diabetics, both type 1 and type 2, die from cardiovascular disease, and that the link between glycemic control and cardiovascular outcomes is much weaker than the one between glycemic control and microvascular complications (e.g. eye disease, kidney disease). There may be reasons why we do not yet have a good picture of just how important euglycemia really is, e.g. because glucose variability, and not just average glucose levels, may be important in terms of outcomes (I recall seeing this emphasized recently in a paper, but I’m not going to look for a source) – and HbA1c only captures the latter.
So maybe it does all come back to glycemic control, and we just don’t have the full picture yet. Maybe. But to the extent that e.g. cardiovascular outcomes – or other complications in diabetics – are unrelated to glycemic control, beta-cell transplants may not improve cardiovascular outcomes at all. One potential ‘cure’ might be one where diabetics get beta-cell transplants, achieve euglycemia and are able to drop the insulin injections – yet still die too soon from heart disease because other aspects of the disease process have not been addressed by the ‘cure’. I don’t think we currently know enough about these diseases to judge whether a hypothetical diabetic with functional transplanted beta cells might not still, to some extent, be ‘sick’.
v. If your cure requires active suppression of the immune system, not much will really be gained. A fact that may surprise some people is that we already know how to do ‘curative’ pancreas transplants in diabetics, and these are sometimes done in diabetic patients with kidney failure (“In most cases, pancreas transplantation is performed on individuals with type 1 diabetes with end-stage renal disease, brittle diabetes [poor glycemic control, US] and hypoglycaemia unawareness. The majority of pancreas transplantation (>90%) are simultaneous pancreas-kidney transplantation.” – link). These people would usually be dead without a kidney transplant, and as they already have to suffer through all the negative transplant-related effects of immune suppression and so on, the idea is that you might as well switch both defective organs while you’re at it, if they’re both available. But immune suppression sucks, and these patients do not have great prognoses, so this is not a good way to deal with diabetes in a ‘healthy diabetic’; if rejection problems are not addressed in a much better manner than is currently possible in whole-organ-transplant cases, the attractiveness of any such type of intervention/‘cure’ goes down a lot. In the study they tried to engineer their way around this issue, but whether they’ve been successful in any meaningful way is open to discussion – I share ‘SirT6’s skepticism at the original reddit link. I’d have to see something like this working in humans for some years before I get too optimistic.
vi. One final aspect is perhaps worth noting. Even a Complete and Ideal Cure involving beta-cell transplants, in a setting where it turns out that everything that goes wrong with all diabetics really is blood-glucose related, is not going to repair the damage that’s already been done. Such aspects will of course matter much more to some people than to others.
Okay, here’s the short version: This book is awesome – I gave it five stars and added it to my list of favourites on goodreads.
It’s the second primatology text I’ve read this year – the first one was Aureli et al.; my coverage of that book can be found here, here and here. I’ve also recently read a few other texts which have touched upon arguably semi-related themes; books such as Herrera et al., Gurney and Nisbet, Whitmore and Whitmore, Okasha, Miller, and Bobbi Low. Some of the stuff covered in Holmes et al. turned out to be relevant as well. I mention these books because this book is aimed at graduates in the field (“Sexual Selection in Primates is aimed at graduates and researchers in primatology, animal behaviour, evolutionary biology and comparative psychology“), and although my background is different I have, as indicated, read some stuff about these kinds of things before – if you know nothing about this stuff, reading the book may be a bit more work for you than it was for me. I still think you should read it, though, as this is the sort of book everybody should read. If they did, people’s opinions about extra-marital sex might change; their understanding of the behavioural strategies people employ when they go about being unfaithful might increase; single moms would find it easier to understand why their dating value is lower than that of their competitors without children; and new dimensions of friendship dynamics – both those involving same-sex individuals and those involving individuals of both sexes – might enter people’s mental models and provide additional angles to help explain why they, or other people, behave the way they do. To take a few examples.
Most humans are probably aware that many males in primate species quite closely related to us habitually engage in activities like baby-killing or rape, and that they do this because such behavioural strategies lead to them being more successful in the fitness context. However, they may not be aware that females of those species have developed behavioural counterstrategies; for example, females may furtively sleep around with different males in order to confuse the males about who the real father of their offspring is (you don’t want to kill your own baby), or they may band up with other females, and/or perhaps a strong male, in order to obtain protection from potential rapists. I mention this in part because a related observation is that it should be clear from observing humans in their natural habitat that most human males are not baby-killers or rapists, and such an observation might easily lead people with some passing familiarity with the field to think that a lot of the stuff included in a book like this one is irrelevant to human behaviour; a single mom is unlikely to hook up with a guy who kills her infant, so this kind of stuff is probably irrelevant to humans – we are different. I think this is the wrong conclusion to draw. What’s particularly important to note in this context is that counterstrategies are reasonably effective in many primate species, meaning for example that although infanticide does take place in wild primate species, it doesn’t happen that often. We’ve in some respects come a bit further than other species in terms of limiting such behaviours, but in more than a few areas of social behaviour humans actually seem to act in a rather similar manner to those baby-killing rapists and their victims.
It’s also really important to observe that sexual conflict is but one of several types of conflict which organisms such as mammals face, and that the dynamics of such conflicts – and aspects like how they are resolved – have many cross-species similarities; see Aureli et al. for an overview. It’s difficult and expensive to observe primates in the wild, but when you do, it’s not actually that hard to spot many precursors of, or animal equivalents of, various behaviours that humans engage in as well. Some animals are more like us than people like to think, and the common idea that humans are really special and unique on account of our large brains may to some extent be the result of a lack of knowledge about how animals actually behave. Yep, we are different, but perhaps not quite as different as people like to think. Some of the behaviours we like to think of as somehow ‘irreducible’ probably aren’t.
Observations included in a book like this one may well change how you think about many things humans do, at least a little. Humans who are not sexually active have the same evolutionary past as those who are, which means that their behaviours are likely to be, and to have been, shaped by similar mechanisms – an important point being that if even someone like me, who at the moment considers it a likely outcome that I’ll never have sex during my lifetime, is capable of finding stuff covered in a book such as this one relevant and useful, there are probably very few people who wouldn’t find some of the stuff in there relevant and useful to some extent. Modern humans face different decision variables and constraints than our ancestors did, but the brains we’re walking around with are to a significant extent best thought of as the brains of our ancestors – they really haven’t changed that much in, say, the last 100,000 years, and some parts of the ‘code’ we walk around with are literally millions of years old. You need to remember to account for stuff like birth control, ‘culture’ and institutions when you’re dealing with human sexual behaviours today, but a lot of other stuff should be included as well, and books like this one will give you another piece of the puzzle. An important piece, I think.
Although there’s a limited amount of mathematics in this book (mostly limited to an infanticide model in chapter 8), as you can imagine given the target audience the book is really quite dense. There’s way too much good stuff in this book for me to cover all of it here, and I don’t know at this point how detailed my coverage of the book will end up being. A lot of details will be left out, regardless of how many posts I decide to give this book – more than a few chapters are of such high quality that I could easily devote an entire post to each of them. If the stuff I include in my posts sparks your interest, you’ll probably want to read the rest of the book as well.
“In this review I have emphasised five points that modern students of sexual selection ought to keep in mind. First, the list of mechanisms of sexual selection is longer than just the two most famous examples of male-male combat and female choice. Male mate choice and female-female competition are two frequently noted possibilities. Other between-sex social interactions that can result in sexual selection include male coercion of females [...] and female resistance to male coercion or manipulation [...] sexual selection among females should be as important as male sexual selection to dynamical interactions between the sexes. Sexual selection among females will favour resistance to male attempts to manipulate and control them [...] Second, even when a mechanism of intersexual selection depends on interactions between members of opposite sexes, the important thing for selection is the variance in reproductive success among members of one sex. Think about female mate choice for a moment. Whenever choosers discriminate, mate choice may cause variation among the chosen in mating and reproductive success [...] Thus, mate choice is a mechanism of sexual selection because it theoretically results in variance among individuals of the chosen sex in mating success and perhaps other components of fitness. [...] Third, sexual selection can result in individual tradeoffs among the components of fitness [...] Fourth, for a trait to be under selection, there must be variation in the trait. For sexual selection to operate the trait variation must be among individuals of the same sex. [...] To argue that an opportunity for sexual selection exists, variation among same-sex individuals in reproductive success must exist. Fifth, between-sex variances in reproductive success alone are [...] an insufficient basis for the conclusion that sexual selection operates [...], as within-sex variances may arise because of random, non-heritable factors”
“In summary, sex roles fixed by past selection from anisogamy or from parental investment patterns so that females are choosy and males indiscriminate are currently questionable for many species. The factors that determine whether individuals are choosy or indiscriminate seem relatively under-investigated.” (One factor which does seem to be important is the encounter frequency with potentially mating opposite-sex individuals; this variable (how often do you meet a potential partner?) has been shown to affect the sexual behaviours of individuals in species as diverse as fruit flies, fish and butterflies).
“Because most primates live in stable, long-lasting social groups, pressures for direct sexually selected communication cues may be less than in species with ephemeral mating groups or frequent pairings. Primates are likely to accumulate information about competitors and mates from many sources over a longer time frame. [...] Although there do appear to be some communication signals that may be sexually selected, it may be best to consider these signals as biasing factors rather than the determinants of mate choice. For primates, human and non-human, as well as for Japanese quails, gerbils, rats and blue gouramis, there is more to successful reproduction than simply responding to a sexually selected cue. Although I might be initially attracted to a woman with the ‘correct’ breast-to-waist-to-hip ratios, a symmetric face and all of the other hypothesised sexually selected cues, I will quickly learn if she is intelligent or not, if she is emotionally stable, and many other things that should be more important in my reproductive decisions than mere appearance. It is important to keep this in mind in any discussion of sexual selection. [...] The strongest evidence, so far, for intersexual selection of traits is observed in female primates, suggesting that male mate choice and female competition may be as important as male competition and female mate choice. [...] The data suggest that intersexual selection is as strong if not stronger on female primates than on males.” [As should be very clear at this point, male primates do have standards, despite what the third cartoon at the beginning of this post would have you believe…]
“One form of polyandry that has received much attention is extra-pair copulation (EPC) – sex that a female with a social mate has with a male who is not the social mate. [...] Because an evolved adaptation is a product of past direct selection for a function, the question of whether EPC by women is currently adaptive or currently advances women’s reproductive success (RS) is a distinct one. An evolved adaptation may be currently non-adaptive and even maladaptive because the current ecological setting in which it occurs is different from the evolutionary historical setting that was the selection favouring it [...] Female EPC is not a rare occurrence in humans. [...] Female EPC may be a relatively common occurrence now. But was it sufficiently common in small ancestral populations of humans or pre-hominid primates to be an effective selective force of evolution? Evidence suggests yes, and perhaps the best evidence comes from design features of men rather than women. Men, but not women, can be duped about parentage as a result of EPC, leading to the unknowing investment in another man’s offspring. Men show a rich diversity of mate guarding and anti-cuckoldry tactics ranging from sexual jealousy, vigilance, monopolising a mate’s time, pampering a mate, threatening a mate with harm if she shows interest in other men, and adjusting ejaculate size to defend against the mate’s insemination by a competitor [...] Some mate guarding tactics appear to be conditional, such that men guard mates of high fertility status (young or not pregnant) more intensely than ones of low-fertility status (older or pregnant) [...] and hence appear not to be caused by general male-male competitive strivings but rather concern for fidelity of a primary social mate [...] We [...] asked women in [a] study to report their primary mate’s mate-retention tactics. Our questionnaire measures two major dimensions, ‘proprietariness’ and ‘attentiveness’. Women reported their partners to be higher on both when fertile [i.e., mid-cycle].”
“Women’s preferences shift across the [menstrual] cycle in a number of ways. They particularly prefer the scent and faces of more symmetrical men when fertile. The face they find most attractive when fertile is more masculine than the face they most prefer when not fertile. They prefer more assertive, intrasexually competitive displays when fertile than when not. [An example: “The behaviours of men being interviewed by women for a lunch date were coded for a host of verbal and non-verbal qualities [by Gangestad et al.]. Through principal components analysis of these codes, two major dimensions along which men’s performance varied were identified; ‘social presence’, marked by a man’s composure, his direct eye contact and lack of downward gaze, as well as a lack of self-deprecation, and emphasis that he’s a ‘nice guy’; and ‘direct intrasexual competitiveness’, marked by a man’s explicit derogation of his competitor and statements to the effect that he is the better choice, as well as not being obviously agreeable.”] Furthermore, evidence indicates that their preferences when evaluating men as sex partners (i.e. their sexiness) is particularly affected; evidence shows that their evaluations of men as long-term partners shift little, if at all. [...] symmetrical men appear to invest less time in and are less faithful to their primary relationship partners [...] [The] pattern of findings suggests that it is not simply the case that all traits preferred by females are particularly preferred mid-cycle; that fertility status simply enhances existing preferences. Rather, it appears that only specific preferences are enhanced – perhaps those for features that ancestrally were indicators of genetic benefits. Preferences for features particularly important in long-term investing mates may actually be more prominent outside the fertile period.”
“STDs typically have been viewed as a curious group of parasites rather than established entities with important selective effects on their hosts [...]. In recent decades, this view has changed, primarily through our increased understanding of HIV [...] [There are] at least three major costs of STDs: (1) A large proportion of STDs increase the risk of sterility in males and females. (2) STDs commonly exhibit vertical transmission, with severe consequences for offspring health [see also this – Holmes et al. covers this stuff in some detail and actually the authors refer to an older version of that book in this context]. (3) Relative to infectious disease transmitted by non-sexual contact, STDs commonly exhibit long infectious periods with low host recovery, failure to clear infectious organisms following recovery, or limited immunity to reinfection. [...] Many negative consequences of STD infection probably provide benefits to the parasites themselves, increasing the likelihood of invasion, transmission and persistence [...] In mammals, for example, host infertility is likely to result in repeated cycling by females and may consequently increase their number of sexual contacts. [Mind blown! I’d never even thought about this.] Primates offer an important opportunity to test this hypothesis, because the frequency of infertile females within wild groups may exceed 10 per cent [...]. Similarly, STDs that increase host mortality or possess short infectious periods are less likely to survive until the next breeding season, when contact is established with new, uninfected hosts [...] Thus, in addition to long infectious periods, STDs tend to produce less disease-induced mortality relative to other infectious diseases”
“Because sexual reproduction offers an important mechanism for disease spread and may even be influenced by infection status, it is pertinent to ask whether animals can identify infected individuals and avoid mating with them. Symptoms such as visible lesions, sores, discharge around the genitalia or olfactory cues may provide evidence of infection. [...] many human STDs are [...] characterized by limited symptoms or, in the case of viruses, asymptomatic shedding [...] reproductive success of an STD is correlated with partner exchange and successful matings of infected hosts. Therefore, virulent parasites that produce outward signs of infection will experience decreased transmission because they provide conspicuous cues for choosy members of the opposite sex to avoid infected mates. [...] A parasite faces two main barriers, or defences, imposed by the host: behavioural counter-strategies to avoid exposure, and physical or immune defences [...]. The order of events can vary, but behavioural mechanisms commonly are viewed as the first line of defence. An important point we wish to emphasise is that host behaviour to avoid exposure prior to mating is likely to have other reproductive costs, and these costs may outweigh their benefits. [...] male and female behaviour indicates that STD risk is of secondary importance relative to other selective pressures operating on mating success. Females mate polyandrously to reduce infanticide risk [...] and, for similar reasons, they prefer novel males, though risking infection with STDs acquired from other social groups. Males prefer females of intermediate age that have already produced offspring, as these females have high reproductive value [...]. Both sets of decisions by males and females are expected to increase exposure to STDs by increasing the number of partners and mating events.”