## Biodemography of aging (IV)

My working assumption as I was reading part two of the book was that I would not be covering that part in much detail here, because it would simply be too much work to make such posts legible to the readership of this blog. However, while writing this post I had the thought that given that almost nobody reads along here anyway (I’m not complaining, mind you – this is how I like it these days), the main beneficiary of my blog posts will always be myself, which led to the related observation that I should not limit my coverage of interesting stuff here simply because some hypothetical and probably nonexistent readership out there might not be able to follow it. So when I started writing this post I assumed it would be my last post about the book, but I now feel sure that if I find the time I’ll add at least one more post about the book’s statistics coverage. On a related note, I'll state explicitly that this post was written for *my* benefit, not yours. You can read it if you like, or not, but it was not really written for you.

I have added bold in a few places to emphasize key concepts and observations from the quoted paragraphs and to make the post easier for me to navigate later (all the italics below are, on the other hand, those of the authors of the book).

…

“**Biodemography** is a multidisciplinary branch of science that unites under its umbrella various analytic approaches aimed at integrating biological knowledge and methods and traditional demographic analyses to shed more light on variability in mortality and health across populations and between individuals.

**_Biodemography of aging_ is a special subfield of biodemography that focuses on understanding the impact of processes related to aging on health and longevity.**”

“Mortality rates as a function of age are a cornerstone of many demographic analyses. The longitudinal **age trajectories of biomarkers** add a new dimension to the traditional demographic analyses: the mortality rate becomes a function of not only age but also of these biomarkers (with additional dependence on a set of sociodemographic variables). Such analyses should incorporate dynamic characteristics of trajectories of biomarkers to evaluate their impact on mortality or other outcomes of interest. Traditional analyses using baseline values of biomarkers (e.g., Cox proportional hazards or logistic regression models) do not take into account these dynamics. One approach to the evaluation of the impact of biomarkers on mortality rates is to use the Cox proportional hazards model with time-dependent covariates; this approach is used extensively in various applications and is available in all popular statistical packages. In such a model, the biomarker is considered a time-dependent covariate of the hazard rate and the corresponding regression parameter is estimated along with standard errors to make statistical inference on the direction and the significance of the effect of the biomarker on the outcome of interest (e.g., mortality). However, **the choice of the analytic approach should not be governed exclusively by its simplicity or convenience of application. It is essential to consider whether the method gives meaningful and interpretable results relevant to the research agenda. In the particular case of biodemographic analyses, the Cox proportional hazards model with time-dependent covariates is not the best choice.**”
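To make the time-dependent covariate setup concrete for myself: such models are usually fed data in a (start, stop] interval layout, with each biomarker value carried forward until the next exam. The sketch below is not from the book; the function name `to_start_stop` and the example numbers are made up for illustration.

```python
def to_start_stop(exam_ages, values, exit_age, died):
    """Turn one person's exam records into (start, stop, value, event) rows.

    Each biomarker value is carried forward until the next exam (or exit),
    which is a common way to encode a time-dependent covariate.
    """
    if len(exam_ages) != len(values) or not exam_ages:
        raise ValueError("need one value per exam and at least one exam")
    rows = []
    for k, (start, value) in enumerate(zip(exam_ages, values)):
        last = k == len(exam_ages) - 1
        stop = exit_age if last else exam_ages[k + 1]
        if stop <= start:
            raise ValueError("ages must increase and exit must follow the last exam")
        # Only the final interval can end in the event.
        rows.append((start, stop, value, int(died and last)))
    return rows


# Hypothetical person: exams at ages 50, 54, 58; death at age 60.
rows = to_start_stop([50, 54, 58], [120.0, 131.0, 128.0], exit_age=60, died=True)
for row in rows:
    print(row)
```

Note that the value measured at age 58 stands in for the whole interval up to death at 60. The layout simply assumes the biomarker is constant and error-free between exams.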

“Longitudinal studies of aging present special methodological challenges due to inherent characteristics of the data that need to be addressed in order to avoid biased inference. The challenges are related to the fact that the populations under study (aging individuals) experience substantial **dropout rates** related to death or poor health and often have co-morbid conditions related to the disease of interest. The standard assumption made in longitudinal analyses (although usually not explicitly mentioned in publications) is that dropout (e.g., death) is not associated with the outcome of interest. While this can be safely assumed in many general longitudinal studies (where, e.g., the main causes of dropout might be the administrative end of the study or moving out of the study area, which are presumably not related to the studied outcomes), the very nature of the longitudinal outcomes (e.g., measurements of some physiological biomarkers) analyzed in a longitudinal study of aging assumes that they are (at least hypothetically) related to the process of aging. Because the process of aging leads to the development of diseases and, eventually, death, in longitudinal studies of aging an assumption of non-association of the reason for dropout and the outcome of interest is, at best, risky, and usually is wrong. As an illustration, we found that the average trajectories of different physiological indices of individuals dying at earlier ages markedly deviate from those of long-lived individuals, both in the entire Framingham original cohort […] and also among carriers of specific alleles […] In such a situation, **panel compositional changes due to attrition** affect the averaging procedure and modify the averages in the total sample. Furthermore, biomarkers are subject to **measurement error** and random biological variability. They are usually collected intermittently at examination times which may be sparse and typically biomarkers are not observed at event times. 
It is well known in the statistical literature that ignoring measurement errors and biological variation in such variables and using their observed “raw” values as time-dependent covariates in a Cox regression model may lead to biased estimates and incorrect inferences […] **Standard methods of survival analysis such as the Cox proportional hazards model** (Cox 1972) **with time-dependent covariates should be avoided in analyses of biomarkers measured with errors** because they can lead to biased estimates.”
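A small simulation illustrates the measurement-error point. To keep it dependency-free it swaps the Cox model for a plain exponential regression with a baseline covariate and no censoring. This is my simplification, not the authors' setup, and all names and parameter values are assumptions.

```python
import math
import random


def fit_exponential_regression(x, t, iterations=25):
    """Newton-Raphson MLE for hazard = exp(a + b * x), no censoring."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for xi, ti in zip(x, t):
            w = math.exp(a + b * xi) * ti
            g_a += 1.0 - w            # gradient w.r.t. a
            g_b += xi * (1.0 - w)     # gradient w.r.t. b
            h_aa += w                 # entries of the negative Hessian
            h_ab += w * xi
            h_bb += w * xi * xi
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step: solve the 2x2 system by hand.
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b


rng = random.Random(1)
n, true_b = 4000, 0.5
true_x = [rng.gauss(0.0, 1.0) for _ in range(n)]
times = [rng.expovariate(math.exp(true_b * xi)) for xi in true_x]
# "Raw" observed biomarker: true value plus measurement error of equal variance.
noisy_x = [xi + rng.gauss(0.0, 1.0) for xi in true_x]

_, b_clean = fit_exponential_regression(true_x, times)
_, b_noisy = fit_exponential_regression(noisy_x, times)
print(f"true effect {true_b}, clean covariate {b_clean:.2f}, noisy covariate {b_noisy:.2f}")
```

The estimate based on the noisy covariate is pulled toward zero, which is the bias the quoted passage warns about.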

“Statistical methods aimed at analyses of time-to-event data jointly with longitudinal measurements have become known in the mainstream biostatistical literature as “**joint models for longitudinal and time-to-event data**” (“survival” or “failure time” are often used interchangeably with “time-to-event”) or simply “**joint models**.” This is an active and fruitful area of biostatistics with an explosive growth in recent years. […] The standard joint model consists of two parts, the first representing the dynamics of longitudinal data (which is referred to as the “longitudinal sub-model”) and the second one modeling survival or, generally, time-to-event data (which is referred to as the “survival sub-model”). […] Numerous extensions of this basic model have appeared in the joint modeling literature in recent decades, providing great flexibility in applications to a wide range of practical problems. […] The standard parameterization of the joint model (11.2) assumes that the risk of the event at age t depends on the current “true” value of the longitudinal biomarker at this age. While this is a reasonable assumption in general, it may be argued that additional dynamic characteristics of the longitudinal trajectory can also play a role in the risk of death or onset of a disease. For example, if two individuals at the same age have exactly the same level of some biomarker at this age, but the trajectory for the first individual increases faster with age than that of the second one, then the first individual can have worse survival chances for subsequent years. […] Therefore, extensions of the basic parameterization of joint models allowing for dependence of the risk of an event on such dynamic characteristics of the longitudinal trajectory can provide additional opportunities for comprehensive analyses of relationships between the risks and longitudinal trajectories. Several authors have considered such extended models. 
[…] **joint models are computationally intensive** and are sometimes prone to convergence problems [however such] models provide more efficient estimates of the effect of a covariate […] on the time-to-event outcome in the case in which there is […] an effect of the covariate on the longitudinal trajectory of a biomarker. This means that **analyses of longitudinal and time-to-event data in joint models may require smaller sample sizes to achieve comparable statistical power with analyses based on time-to-event data alone** (Chen et al. 2011).”
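The book's equation (11.2) is not reproduced in the quotes above, so what follows is only a sketch of a commonly used form of the standard joint model from the joint modeling literature. The notation is my assumption and may differ from the authors'. Here $y_i(t)$ is the observed biomarker, $m_i(t)$ its "true" value, $b_i$ are random effects, and $\alpha$ links the biomarker to the hazard.

```latex
\begin{align}
  % Longitudinal sub-model: observed value = true trajectory + measurement error
  y_i(t_{ij}) &= m_i(t_{ij}) + \varepsilon_{ij},
      \qquad \varepsilon_{ij} \sim N(0, \sigma^2), \\
  m_i(t) &= x_i(t)^{\top}\beta + z_i(t)^{\top} b_i,
      \qquad b_i \sim N(0, D), \\
  % Survival sub-model: hazard depends on the current true value
  h_i(t) &= h_0(t)\,\exp\!\bigl(\gamma^{\top} w_i + \alpha\, m_i(t)\bigr), \\
  % Extended parameterization: hazard also depends on the slope of the trajectory
  h_i(t) &= h_0(t)\,\exp\!\bigl(\gamma^{\top} w_i + \alpha_1\, m_i(t)
      + \alpha_2\, \tfrac{d}{dt} m_i(t)\bigr).
\end{align}
```

The last line corresponds to the example of two individuals with the same current level but different rates of increase: with $\alpha_2 > 0$, the faster-increasing trajectory carries the higher risk.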

“To be useful as a tool for biodemographers and gerontologists who seek biological explanations for observed processes, models of longitudinal data should be based on realistic assumptions and reflect relevant knowledge accumulated in the field. An example is the shape of the risk functions. Epidemiological studies show that **the conditional hazards of health and survival events considered as functions of risk factors often have U- or J-shapes** […], so a model of aging-related changes should incorporate this information. In addition, risk variables, and, what is very important, their effects on the risks of corresponding health and survival events, experience aging-related changes and these can differ among individuals. […] An important class of models for joint analyses of longitudinal and time-to-event data incorporating a stochastic process for description of longitudinal measurements uses an epidemiologically-justified assumption of a quadratic hazard (i.e., U-shaped in general and J-shaped for variables that can take values only on one side of the U-curve) considered as a function of physiological variables. **Quadratic hazard models** have been developed and intensively applied in studies of human longitudinal data”.
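A minimal sketch of what a quadratic hazard looks like for a single physiological variable. The Gompertz baseline, the constant optimum and all parameter values are assumptions chosen for illustration, not estimates from the book.

```python
import math


def quadratic_hazard(age, y, optimum=lambda age: 0.0, a=1e-4, b=0.085, q=5e-3):
    """Mortality rate = Gompertz baseline + q * (squared deviation from the optimum).

    `optimum(age)` is the value of the physiological variable that minimizes
    risk at a given age; here it defaults to a constant (an assumption).
    """
    baseline = a * math.exp(b * age)
    return baseline + q * (y - optimum(age)) ** 2


# U-shape at age 70: risk rises on both sides of the optimum.
for y in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(y, round(quadratic_hazard(70.0, y), 4))
```

If the variable can only take values on one side of the optimum (say `y >= 0` with the optimum at 0), the same function traces out a J-shape rather than a U-shape.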

“Various approaches to statistical model building and data analysis that incorporate unobserved heterogeneity are ubiquitous in different scientific disciplines. **Unobserved heterogeneity** in models of health and survival outcomes can arise because there may be relevant risk factors affecting an outcome of interest that are either unknown or not measured in the data. Frailty models introduce the concept of unobserved heterogeneity in survival analysis for time-to-event data. […] Individual age trajectories of biomarkers can differ due to various observed as well as unobserved (and unknown) factors and such individual differences propagate to differences in risks of related time-to-event outcomes such as the onset of a disease or death. […] The joint analysis of longitudinal and time-to-event data is the realm of a special area of biostatistics named “joint models for longitudinal and time-to-event data” or simply “joint models” […] Approaches that incorporate heterogeneity in populations through random variables with continuous distributions (as in the standard joint models and their extensions […]) assume that the risks of events and longitudinal trajectories follow similar patterns for all individuals in a population (e.g., that biomarkers change linearly with age for all individuals). Although such homogeneity in patterns can be justifiable for some applications, generally this is a rather strict assumption […] A population under study may consist of subpopulations with distinct patterns of longitudinal trajectories of biomarkers that can also have different effects on the time-to-event outcome in each subpopulation. When such subpopulations can be defined on the base of observed covariate(s), one can perform stratified analyses applying different models for each subpopulation. 
However, observed covariates may not capture the entire heterogeneity in the population in which case it may be useful to conceive of the population as consisting of *latent* subpopulations defined by unobserved characteristics. Special methodological approaches are necessary to accommodate such hidden heterogeneity. Within the joint modeling framework, a special class of models, **joint latent class models**, was developed to account for such heterogeneity […] The joint latent class model has three components. First, it is assumed that a population consists of a fixed number of (latent) subpopulations. The latent class indicator represents the latent class membership and the probability of belonging to the latent class is specified by a multinomial logistic regression function of observed covariates. It is assumed that individuals from different latent classes have different patterns of longitudinal trajectories of biomarkers and different risks of event. The key assumption of the model is conditional independence of the biomarker and the time-to-events given the latent classes. Then the class-specific models for the longitudinal and time-to-event outcomes constitute the second and third component of the model thus completing its specification. […] **the latent class stochastic process model** […] provides a useful tool for dealing with unobserved heterogeneity in joint analyses of longitudinal and time-to-event outcomes and taking into account hidden components of aging in their joint influence on health and longevity. This approach is also helpful for sensitivity analyses in applications of the original stochastic process model. We recommend starting the analyses with the original stochastic process model and estimating the model ignoring possible hidden heterogeneity in the population. 
Then the latent class stochastic process model can be applied to test hypotheses about the presence of hidden heterogeneity in the data in order to appropriately adjust the conclusions if a latent structure is revealed.”
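The first component of the joint latent class model, membership probabilities from a multinomial logistic regression, can be sketched as below, together with the population-level survival it implies. The class-specific longitudinal component is left out, the class hazards are taken to be constant, and all names and numbers are assumptions for illustration.

```python
import math


def class_probabilities(x, coefs):
    """Multinomial logistic (softmax) membership probabilities.

    coefs[g] holds (intercept, slope_1, ...) for class g; the first class
    is the reference and should have all-zero coefficients.
    """
    scores = [c[0] + sum(cj * xj for cj, xj in zip(c[1:], x)) for c in coefs]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def population_survival(t, x, coefs, class_hazards):
    """Marginal survival: class-specific survival averaged over latent classes."""
    probs = class_probabilities(x, coefs)
    return sum(p * math.exp(-h * t) for p, h in zip(probs, class_hazards))


# Two hypothetical latent classes and one observed covariate.
coefs = [(0.0, 0.0), (-1.0, 0.8)]
class_hazards = [0.02, 0.06]
probs = class_probabilities([1.5], coefs)
print(probs, population_survival(10.0, [1.5], coefs, class_hazards))
```

Class membership is never observed, so in an actual analysis these probabilities enter the likelihood as mixture weights and are not fixed inputs.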

“**The longitudinal genetic-demographic model** (or the genetic-demographic model for longitudinal data) […] combines three sources of information in the likelihood function: (1) follow-up data on survival (or, generally, on some time-to-event) for genotyped individuals; (2) (cross-sectional) information on ages at biospecimen collection for genotyped individuals; and (3) follow-up data on survival for non-genotyped individuals. […] Such joint analyses of genotyped and non-genotyped individuals can result in substantial improvements in statistical power and accuracy of estimates compared to analyses of the genotyped subsample alone if the proportion of non-genotyped participants is large. Situations in which genetic information cannot be collected for all participants of longitudinal studies are not uncommon. They can arise for several reasons: (1) the longitudinal study may have started some time before genotyping was added to the study design so that some initially participating individuals dropped out of the study (i.e., died or were lost to follow-up) by the time of genetic data collection; (2) budget constraints prohibit obtaining genetic information for the entire sample; (3) some participants refuse to provide samples for genetic analyses. Nevertheless, even when genotyped individuals constitute a majority of the sample or the entire sample, application of such an approach is still beneficial […] **The genetic stochastic process model** […] adds a new dimension to genetic biodemographic analyses, combining information on longitudinal measurements of biomarkers available for participants of a longitudinal study with follow-up data and genetic information. 
Such **joint analyses of different sources of information** collected in both genotyped and non-genotyped individuals allow for more efficient use of the research potential of longitudinal data which otherwise remains underused when only genotyped individuals or only subsets of available information (e.g., only follow-up data on genotyped individuals) are involved in analyses. Similar to the longitudinal genetic-demographic model […], **the benefits of combining data** on genotyped and non-genotyped individuals in the genetic SPM come from the presence of common parameters describing characteristics of the model for genotyped and non-genotyped subsamples of the data. This takes into account the knowledge that the non-genotyped subsample is a mixture of carriers and non-carriers of the same alleles or genotypes represented in the genotyped subsample and applies the ideas of heterogeneity analyses […] When the non-genotyped subsample is substantially larger than the genotyped subsample, these joint analyses can lead to a noticeable increase in the power of statistical estimates of genetic parameters compared to estimates based only on information from the genotyped subsample. **This approach is applicable not only to genetic data but to any discrete time-independent variable that is observed only for a subsample of individuals in a longitudinal study.**”
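The "common parameters" idea can be sketched as a likelihood in which genotyped individuals contribute under their known genotype and non-genotyped individuals contribute as a mixture of carriers and non-carriers. This is a stripped-down illustration of my own: it assumes constant hazards, ignores the information on ages at biospecimen collection that the book's model also uses, and every name in it is hypothetical.

```python
import math


def log_likelihood(p_carrier, haz_carrier, haz_noncarrier, genotyped, non_genotyped):
    """Log-likelihood pooling genotyped and non-genotyped follow-up data.

    genotyped:     list of (time, died, is_carrier)
    non_genotyped: list of (time, died)
    Hazards are constant (exponential survival) purely to keep the sketch short.
    """
    def contribution(time, died, hazard):
        # Density if the death was observed, survival probability if censored.
        return (hazard if died else 1.0) * math.exp(-hazard * time)

    total = 0.0
    for time, died, is_carrier in genotyped:
        hazard = haz_carrier if is_carrier else haz_noncarrier
        prior = p_carrier if is_carrier else 1.0 - p_carrier
        total += math.log(prior * contribution(time, died, hazard))
    for time, died in non_genotyped:
        # Genotype unknown: mixture of carriers and non-carriers, same parameters.
        mix = (p_carrier * contribution(time, died, haz_carrier)
               + (1.0 - p_carrier) * contribution(time, died, haz_noncarrier))
        total += math.log(mix)
    return total


# Tiny made-up data set: one genotyped carrier who died, one censored non-genotyped person.
value = log_likelihood(0.3, 0.10, 0.05,
                       genotyped=[(10.0, True, True)],
                       non_genotyped=[(5.0, False)])
print(value)
```

Because the same three parameters appear in both loops, the non-genotyped follow-up data help pin them down, which is where the gain in power comes from.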

“Despite an existing tradition of interpreting differences in the shapes or parameters of the mortality rates (survival functions) resulting from the effects of exposure to different conditions or other interventions in terms of characteristics of individual aging, this practice has to be used with care. This is because such characteristics are difficult to interpret in terms of properties of external and internal processes affecting the chances of death. An important question then is: What kind of mortality model has to be developed to obtain parameters that are biologically interpretable? The purpose of this chapter is to describe an approach to mortality modeling that represents mortality rates in terms of parameters of physiological changes and declining health status accompanying the process of aging in humans. […] **A traditional (demographic) description of changes in individual health/survival status is performed using a continuous-time random Markov process** with a finite number of states, and age-dependent transition intensity functions (transitions rates). Transitions to the absorbing state are associated with death, and the corresponding transition intensity is a mortality rate. Although such a description characterizes connections between health and mortality, it does not allow for studying factors and mechanisms involved in the aging-related health decline. Numerous epidemiological studies provide compelling evidence that health transition rates are influenced by a number of factors. Some of them are fixed at the time of birth […]. 
Others experience stochastic changes over the life course […] **The presence of** such **randomly changing influential factors violates the Markov assumption, and makes the description of aging-related changes in health status more complicated.** […] The age dynamics of influential factors (e.g., physiological variables) in connection with mortality risks has been described using a stochastic process model of human mortality and aging […]. Recent extensions of this model have been used in analyses of longitudinal data on aging, health, and longevity, collected in the Framingham Heart Study […] This model and its extensions are described in terms of **a Markov stochastic process satisfying a diffusion-type stochastic differential equation.** The stochastic process is stopped at random times associated with individuals’ deaths. […] When an individual’s health status is taken into account, the coefficients of the stochastic differential equations become dependent on values of the **jumping process.** This dependence violates the Markov assumption and renders the conditional Gaussian property invalid. So the description of this (continuously changing) component of aging-related changes in the body also becomes more complicated. Since studying age trajectories of physiological states in connection with changes in health status and mortality would provide more realistic scenarios for analyses of available longitudinal data, it would be a good idea to find an appropriate mathematical description of the joint evolution of these interdependent processes in aging organisms. For this purpose, **we propose a comprehensive model of human aging, health, and mortality in which the Markov assumption is fulfilled by a two-component stochastic process consisting of jumping and continuously changing processes.
The jumping component is used to describe relatively fast changes in health status occurring at random times, and the continuous component describes relatively slow stochastic age-related changes of individual physiological states.** […] The use of stochastic differential equations for random continuously changing covariates has been studied intensively in the analysis of longitudinal data […] Such a description is convenient since it captures the feedback mechanism typical of biological systems reflecting regular aging-related changes and takes into account the presence of random noise affecting individual trajectories. It also captures the dynamic connections between aging-related changes in health and physiological states, which are important in many applications.”
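To get a feel for the two-component idea, here is a toy Euler-type simulation of one life course: a mean-reverting diffusion for a physiological state `y`, and a jumping health state that moves from 0 (healthy) to 1 (ill). This is not the authors' model. The functional forms, the irreversible illness and every parameter value are assumptions made for illustration.

```python
import math
import random


def simulate_life(seed, age0=40.0, max_age=110.0, dt=0.1):
    """Simulate (age, y, health) until death or max_age; returns the path."""
    rng = random.Random(seed)
    age, y, health = age0, 0.0, 0
    path = [(age, y, health)]
    while age < max_age:
        # Continuous component: feedback toward a target level plus random noise.
        # Its coefficients depend on the current health state.
        pull = 0.3 if health == 0 else 0.1
        target = 0.0 if health == 0 else 1.0
        y += pull * (target - y) * dt + 0.5 * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        age += dt
        # Jumping component: disease onset rate depends on age and on y.
        onset_rate = 0.002 * math.exp(0.06 * (age - age0)) * (1.0 + y * y)
        if health == 0 and rng.random() < 1.0 - math.exp(-onset_rate * dt):
            health = 1
        # Mortality depends on both components (quadratic in y).
        death_rate = (0.001 * math.exp(0.08 * (age - age0)) * (1.0 + 3.0 * health)
                      + 0.002 * y * y)
        died = rng.random() < 1.0 - math.exp(-death_rate * dt)
        path.append((age, y, health))
        if died:
            break
    return path


path = simulate_life(seed=1)
final_age, final_y, final_health = path[-1]
print(f"path ends at age {final_age:.1f}, ill={bool(final_health)}")
```

Each step uses only the current values of `age`, `y` and `health`, so the pair (`y`, `health`) taken together is Markov even though the coefficients of the continuous part change when the health state jumps.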
