# Econstudentlog

## Hypotheses

(link). Some people would say that you should formulate the hypothesis before you start gathering data – and that’s what I’ll do now.

I guess this post is mostly for people like Plamus, but other people are very welcome to read along as well. I'll start out with some introductory remarks. I have an account on a chess website – playchess.com. It's a neat site, I like it. They've recently introduced a new feature: a so-called 'tactics trainer'. The tactics trainer works by means of tactics sessions. Each tactics session features a number of chess problems you need to solve under a time constraint. You'll never run out of problems; each time you've solved one problem (or answered incorrectly) a new one will pop up. Each session lasts about 6 minutes – some problems can be solved in a second or two, others might take more than a minute. The outcome of a session will depend upon the number of problems solved correctly, the 'toughness' of the problems solved or not solved, and probably various other factors as well. Once you've finished a session, you'll get statistics on the number of correctly and incorrectly solved problems and the average time spent on each problem, along with the corresponding tactics performance rating. The performance rating will impact your combined tactics rating, which is a result of all previous sessions (it's like a standard Elo rating system with frequent updating).

But why is the tactics trainer worth blogging about? Well, here's the thing: Solving tactics problems is hard and it's a cognitively demanding task. It takes brain power, and if your brain isn't working at 100% you'll do worse than if it were. I have often thought about how to model the effects of blood glucose variations on cognitive performance. I've thought about it because I know that blood glucose variation impacts my performance in various areas – it's obviously the case, and in extreme cases it's extremely obvious. But what about the non-extreme cases? Blood glucose fluctuates a lot over the course of a day, and it's not unlikely that such fluctuations also impact performance. But can those effects be quantified? So far it's been difficult for me to figure out how one would go about doing that – one approach I've contemplated in the past was to use IQ tests to measure performance as a function of blood glucose, but that idea was basically dead in the water in terms of getting the kind of results I'd like: an IQ test takes a lot of time, it's not always easy to compare scores across tests, and you can't do the same test over and over because of the way the test is designed – the validity of the results will be impacted if you repeat the test. Another problem is that the blood glucose level wouldn't even be exogenous – being in a state of deep concentration for a long time under stressful circumstances impacts blood glucose. What would be much better would be a shorter version of the test – a relatively short test where a high level of concentration is required to perform well and where even small differences in performance as a result of blood glucose fluctuations can be measured and quantified. Remember the tactics trainer I was talking about? Yeah…

It seemed to me that using the tactics trainer sessions to gauge ‘mental ability’ as a function of blood glucose actually makes a lot of sense; it’s possible to run a lot of sessions over time, so n can potentially become large enough to actually make room for some non-silly results. There are always new and different problems available, and the comparability issue across tests disappears completely. Blood glucose values can be taken as exogenous as the sessions last only a very short amount of time. Performances are precisely measured.

I should make it clear from the start that the effect of blood glucose on performance is non-linear. Extremely low values impact performance, as do extremely high values – so in theory some kind of semi-inverse-U-shaped pattern should probably be expected. The actual relationship would not look very much like an inverse U, both because the scales are asymmetric in terms of symptoms/(mmol/l deviation from the desired level) – a blood glucose of somewhere between 4-10 mmol/l is often considered 'desirable', but whereas a value of 0 will mean that you're dead, a value of 14 will for many diabetics probably often not give any symptoms at all – and because the left-hand side is truncated (as mentioned) whereas in practice the right-hand side is not for well-treated patients.

I will make a simplifying assumption here that will save me a lot of work and arguably will not be all that problematic when interpreting the results. I'll disregard the non-linearities in the data by removing all data problems related to performance effects to the left of the lower bound of the 'desirable level', and by assuming that the 'true' non-linear relationship between performance and blood glucose on the right-hand side of the distribution can be approximated by a linear function without this causing too many problems. The way to deal with the "data problems related to performance effects to the left of the lower bound of the 'desirable level'" will be to exclude from the sample all observations with a measured blood glucose below 4.0 mmol/l. My motivation for removing the lowest values is that it will always become obvious to me within a very short amount of time, when my blood glucose is that low, that there's a significant performance effect. I know those effects very well, and I know that it's a bad idea to delay treatment – blood glucose levels below that can quickly turn into a medical emergency. When thinking about performance effects here, it seems to me to make a lot of sense to implicitly employ a two-state model framework and then use separate models to analyze what's going on in the two states. State one is quite simple: that's the hypoglycemia scenario mentioned. To 'model' this state is easy: the effects are almost universally real and significant, to an extent where even measuring them in the manner described here becomes borderline dangerous. State two: euglycemia or hyperglycemia. In this state, performance is likely to be at least somewhat close to linearly decreasing in the blood glucose level.
I'm mostly interested in performance effects which are not obvious to me, and so that makes state two the more interesting state to consider; it's also a lot more interesting because state one is relatively (though not that…) rare, whereas state two is the default state in which I spend most of my time. Regarding using a linear approximation to model the relationship in state two rather than the 'true' non-linear function: This may be problematic, but I know myself well enough to know that I don't want to bother with non-linear models when I look at this stuff later; it's a poor and underspecified model to begin with. The kind of question I'm asking here is far more along the lines of 'does it even make sense to assume that your cognitive profile is affected by blood glucose variation?' than it is a question along the lines of 'how will a 2.6 mmol/l difference impact your likelihood of getting an A when taking an exam in course X?'

When it comes to the specifics of the data gathering process, I’ll do it this way: Unless I have symptoms of hypoglycemia – in which case I’ll not do the session in question, but rather treat the hypoglycemia – I’ll only measure the blood glucose after I’ve finished the session. If the blood glucose is below 4.0 mmol/l the results will not be included in the sample. For all other observations, I will list the performance rating of the tactics session and the blood glucose level.

I intend to test the hypothesis that the measured blood glucose level has a significant negative effect on performance (higher blood glucose level -> lower performance rating).
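Testing this hypothesis amounts to estimating the slope of a simple linear regression of session performance rating on measured blood glucose, after dropping the observations below 4.0 mmol/l. A minimal sketch in Python – all numbers are invented purely for illustration, since no real data have been collected yet:

```python
# Sketch of the planned test, using made-up illustrative numbers.
# Each observation is a (blood glucose in mmol/l, performance rating) pair.
observations = [
    (3.2, 1480),  # below 4.0 mmol/l -> excluded per the protocol
    (5.1, 1625), (6.8, 1610), (8.4, 1590), (10.2, 1570),
    (12.5, 1555), (15.1, 1530), (18.0, 1510),
]

# Exclude hypoglycemic readings, as described in the post.
sample = [(g, r) for g, r in observations if g >= 4.0]

n = len(sample)
mean_g = sum(g for g, _ in sample) / n
mean_r = sum(r for _, r in sample) / n

# Ordinary least squares slope: cov(glucose, rating) / var(glucose).
slope = (sum((g - mean_g) * (r - mean_r) for g, r in sample)
         / sum((g - mean_g) ** 2 for g, _ in sample))
intercept = mean_r - slope * mean_g

print(f"estimated slope: {slope:.2f} rating points per mmol/l")
```

With the real data, the interesting questions would be the sign of the estimated slope and whether it differs significantly from zero.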

If I get around to it, it might also be interesting to see if there are threshold effects at play. One threshold to consider might be a blood glucose level of 15.0 mmol/l. The precise cut-off is semi-arbitrary, but not completely; this is close to the point where you start to be able to measure beginning ketonuria, and it's probably also around this point where symptoms start to (maybe) appear. I write 'maybe' because the symptoms of high blood glucose are far more unreliable than the symptoms of low blood glucose, which is also why I'm interested in the related performance effects; when I have symptoms I know I'm not 'at my best', but diabetics are often not 'at their best' without getting any signals from the body to that effect. A threshold effect also makes sense to include because it's far from likely that a linear model will catch all the stuff that's going on here.
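One simple way to look for such a threshold effect is to compare mean performance on either side of the cut-off (a fuller version would add a dummy variable for readings at or above 15.0 mmol/l to the regression). A sketch – the 15.0 mmol/l cut-off is from the post, but the data are invented for illustration:

```python
# Minimal threshold check: difference in group means around 15.0 mmol/l.
# (glucose in mmol/l, performance rating) pairs, all invented numbers.
data = [(5.1, 1625), (6.8, 1610), (8.4, 1590), (10.2, 1570),
        (12.5, 1555), (15.1, 1500), (18.0, 1480), (21.3, 1460)]

THRESHOLD = 15.0  # the semi-arbitrary cut-off discussed in the post
below = [r for g, r in data if g < THRESHOLD]
above = [r for g, r in data if g >= THRESHOLD]

mean_below = sum(below) / len(below)
mean_above = sum(above) / len(above)
print(f"mean rating below {THRESHOLD} mmol/l: {mean_below:.0f}")
print(f"mean rating at/above {THRESHOLD} mmol/l: {mean_above:.0f}")
print(f"gap: {mean_below - mean_above:.0f} rating points")
```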

As a starting point, my stopping rule will be that I'll stop collecting data once I have 300 observations. This is completely arbitrary, but you should always have a stopping rule. I take in the neighbourhood of 8 blood tests a day, and some of them aren't taken when I sit at my computer doing chess tactics exercises. If half of them are, however, I will have 300 observations in 2.5 months, i.e. around New Year (this is close to my exams, so I'll surely not want to do a lot of non-work statistical modelling at that point – so it will be kept simple). Maybe it will be worth considering doing more than one session per blood test, in which case the data can be gathered a lot faster than that, but then problems related to blood glucose exogeneity may start to pop up. I haven't done multiple sessions after each other before, so I don't know if such an approach will impact the performance rating; it might, and if it seems to do that I'll probably disregard such 'shortcuts'.

Potentially I might improve my tactics abilities during the survey period (in this specific setting that would be a bad thing, because the parameters would then no longer be constant over time) but unless such an effect is very noticeable early on I’ll proceed as if my skill does not improve during the survey period. I’ll write down the starting tactics rating (which is sort of ‘an average of recent past performances’) as well as the tactics rating at the end of the project and compare the difference between the two with the estimated standard deviation of the observations to at least get an idea if there’s a potential big problem here; I don’t know if I’ll really care if a big problem turns up, but I should at least pretend to care about this ‘risk’ of getting better over time (and as an added bonus this is also a simple way to try to establish if doing tactics exercises helps you improve your tactics abilities significantly). The reason why I assume the ‘improvement over time’-effect to be minor here is mostly that I’m actually a reasonably strong player by now so the learning curve is presumably a lot flatter than it was in the past, meaning that exercises like these should not be expected to have that big an effect on my performance.
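The drift check described above is straightforward to compute: compare the start-to-end change in the combined rating with the standard deviation of the individual session ratings. A sketch with hypothetical numbers:

```python
# Compare rating drift over the period with the spread of session ratings.
# All numbers are hypothetical, for illustration only.
session_ratings = [1610, 1595, 1630, 1580, 1605, 1620, 1590, 1615]
rating_start, rating_end = 1600, 1612  # combined tactics rating, start/end

n = len(session_ratings)
mean = sum(session_ratings) / n
# Sample standard deviation of the session performance ratings.
sd = (sum((x - mean) ** 2 for x in session_ratings) / (n - 1)) ** 0.5

drift = rating_end - rating_start
print(f"drift: {drift}, session-rating sd: {sd:.1f}")
# If |drift| is small relative to sd, treat skill as roughly constant.
print("skill roughly constant" if abs(drift) < sd else "possible drift")
```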

Yes, I did consider including other variables in the model (number of unsolved problems, time spent/problem), but a) they don't add much additional information, b) they're strongly correlated with the rating variable (so I would not be comfortable including them in the same model as the rating variable), and c) the more data I need to write down, the more this will feel like work, and I don't want it to feel like work. So there'll also be no controls included; this is all just a 'fun (not quick) and dirty' project to have running for a while. I'll release the (limited) data afterwards and let people play around with it if they'd like to.

Ideas and suggestions (which do not involve me doing a lot of extra work), as well as questions, are of course most welcome.

Incidentally, if you want to know if you’re good at figuring out how smart people are based on how they look, here’s another small-scale project you may be interested in (I have nothing to do with it as such, but I know the guy behind it).

October 17, 2012 - Posted by | blogging, Data, Diabetes, Personal, Random stuff

1. This sounds both fun and interesting – I’ll follow and be happy to play with the data once you release them. In fact, I think you may have the makings of an interesting paper.

Let me see if I understand the situation, since my knowledge about blood sugar is so limited.

You say a blood sugar level of zero means you're dead, and expected intellectual performance drops to zero. Same with a blood sugar of 100% – glucose is not a good substitute for blood. Of course, with the human body as fond of homeostasis as it is, our interval of interest is much narrower, but we obviously have something similar to a Laffer curve – zeroes at both ends, and non-zero levels in between. It seems to me that your lower-bound cutoff of 4 mmol/l is clearly to the left of the peak of the distribution. Thus, a linear approximation may not be good for your whole range of data – it should work well for the data to the right of the peak/mode (it'll be hilarious if you get some kind of multi-modal distribution). In other words, for example, if your performance peaks at 5.5 mmol/l, I'd use a linear approximation for glucose levels above that level. Other avenues of analysis may also work – say binning the results into blood glucose intervals, taking averages for the bins, and running t-tests between adjacent intervals; maybe even k-means, if you get distinct clusters. A look at the variances can also be informative – maybe (over some interval?) higher blood sugar does not lower average performance, but makes you less consistent (higher variance).
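The binning-plus-t-tests idea sketched above could look roughly like this; the bins and ratings are invented, and only Welch's t statistic is computed (a p-value would additionally require the t distribution's CDF):

```python
import statistics as st

# Group hypothetical observations into blood-glucose bins, then compute
# a Welch t-statistic between adjacent bins.
bins = {
    "4-8 mmol/l":  [1620, 1605, 1630, 1598, 1612],
    "8-12 mmol/l": [1585, 1570, 1592, 1560, 1578],
}

def welch_t(a, b):
    """Welch's t statistic for two independent samples of unequal variance."""
    ma, mb = st.mean(a), st.mean(b)
    va, vb = st.variance(a), st.variance(b)  # sample variances (n-1)
    return (ma - mb) / (va / len(a) + vb / len(b)) ** 0.5

t = welch_t(bins["4-8 mmol/l"], bins["8-12 mmol/l"])
print(f"Welch t between adjacent bins: {t:.2f}")
# With these toy numbers |t| is well above ~2, suggesting the bin means differ.
```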

It’s a very good idea to measure blood sugar after each session, so that the knowledge does not influence your performance – anxiety, etc.

Since you'll have a very respectable number of data points, I'd also record the time of day you take the observations, for a potential look at performance during, say, morning, afternoon, and evening sessions. What is leading me to this suggestion is the variability of blood glucose levels through the day, especially around meals. Thus, it may make a significant difference whether your blood glucose level, for example, was recently spiked because of a meal and is now dropping precipitously. Of course, one can get fancy here, and recording the times of your meals is probably too much extra effort, but the simple time of day should be an almost "free" data category 🙂

If you also care to log the dates of data points, you can then also break the data set into several time intervals. This will give you an extra check for robustness, and will also let you detect the potential effect of an improvement in your skill level (which you already planned to check for). Time series are always fun to work with *rubs hands*.

Best of luck, and I hope you don’t wait till the end to release data – even with 100 observations one can get a good preliminary look.

Comment by Plamus | October 19, 2012 | Reply

• i. I’d love for you to work with the data – it saves me some of the trouble and this is stuff you’re more skilled at than me.

ii. It's great that you ask about the blood glucose stuff. Thinking in terms of 0% and 100% is the wrong way to think about blood sugar variation. You'll note in the post that I used mmol/l – millimoles per litre – not percent. The brief version is this: Below 4 mmol/l, most diabetics will get symptoms. Obvious impairment will often be there by 3 mmol/l, but not always. Whether you develop symptoms depends upon multiple factors; for instance, if the blood glucose is dropping rapidly because you've just taken insulin to correct a high blood glucose or because you've just run 5 km in 20 minutes, symptoms may start before your values actually should give you cause for alarm; 4 to 10 is considered the standard acceptable range, but I've had symptoms of hypoglycemia when measuring values in the lower end of that range before. What's 'an acceptable level' incidentally depends greatly upon what I'm about to do or the time of day; most values are taken before a meal, which is what makes the 4-10 range ideal. On the other hand, if I'm about to head out for a run, a level below 10 is totally unacceptable (as would be a level above 15).

The bodies of diabetics to some extent get used to the values they experience; if you often have high or low values, you'll get less sensitive to them and less likely to get symptoms. This may work in a sinister way too; if you often have too high values, not only will your body get accustomed to the high values so that you'll get fewer symptoms of hyperglycemia (the cell damage caused by the disease is independent of the symptoms caused by the disease, so it's a very bad idea not to get the levels under control despite the fact that they do not cause a lot of symptoms) – but you'll also be more likely to get symptoms of too low blood glucose when the blood glucose level is still within the normal range (and vice versa). This makes it harder to regain control. It's also a major problem for some diabetics, especially those who also have some nerve involvement, because without symptoms life gets a lot harder: you have to take a lot of tests in order not to get into trouble. When I was a teenager, I couldn't tell when my blood glucose got low, so I know some of these problems first-hand. Frequent hypoglycemias can also become a problem because people with frequent hypoglycemias will have bodies that are less well prepared for the next hypoglycemia than is the case for those who don't have such problems; the body mainly uses the liver to counteract the effects of insulin by converting stored glycogen into glucose (via the glucagon peptide hormone), and if you have frequent hypoglycemias the stores of glycogen may become depleted, removing one of the body's main lines of defence (this is also why diabetics who have just had a hypoglycemic episode are often advised not to immediately treat a hyperglycemia which is a direct result of the treatment of a previous hypoglycemia, as higher-than-normal blood glucose levels after a hypoglycemic episode work to restore the glycogen deposits).
In general, when I'm stressed, symptoms are much more unreliable than they otherwise would be.

Symptoms of high blood glucose will rarely pop up before 15-16 mmol/l, and for me not consistently before 18-19 mmol/l. It's physically impossible to get to a hundred mmol/l; you'll be dead long before then. Diabetic ketoacidosis – the complication that used to kill all type 1 diabetics before insulin was discovered in the 20th century – is certain to develop when values go above 20 for a sustained period of time (measured in hours, not days). In terms of what the potential variation looks like, my blood glucose monitor cannot measure values below 1.1 mmol/l or above 33.3. I have not had a 'too high to measure' value with this glucose monitor, but I think I've had a 'too low to measure' value; I have had some of those in the past anyway. It's unlikely that the 'true value' has ever been that low when I've taken it; the at-home blood glucose monitors rely on capillary blood, not plasma glucose, so they're not 100% reliable even though they work well enough for the diabetic to manage his disease (whether a blood glucose is 2 mmol/l or 2.5 mmol/l is irrelevant to me – when I see a value like that I panic and start loading up on glucose). We measure the blood glucose level 'in the finger' even though what we're actually interested in, at least when it comes to hypoglycemia, is mostly the level in the brain.

iii. I have not noted the time of day of data points so far, but it's actually an interesting idea to include that variable as well – I hadn't considered it. However, do note that the 'natural variation in blood glucose' is, to a significant extent at least, irrelevant for type 1 diabetics; our 'natural variation' in blood glucose levels goes in one direction only: up, until we are dead. Every blood glucose value and test which does not spell the words 'die, die, die!' is a complex result of the interaction of medicine, food, exercise, stress, etc. – there's nothing 'natural' about it.

Anyway, all tests in this ‘study’ are taken before a meal – I guess I should have specified that. That’s the point in time most diabetics take most of their tests, because that’s the point where you decide what to eat (and how much), and how much insulin to take to counteract the effects of the carbohydrates in the food. If you’d like me to, I’ll add the hour of the day to the dataset for the remaining observations – you’re right it’s not exactly a lot of work.

iv. I’ll try to remember to post preliminary data when I’m at, say, 100 observations.

Comment by US | October 19, 2012 | Reply