Econstudentlog

On models

I’ve written a lot of stuff about models on this blog in the past, so some of the stuff I’m writing now I’ve probably covered before. I thought it was worth revisiting the subject anyway.

First off, one way to think about a mental model is to consider it a way of thinking about a problem. This also implies that if there’s a problem of some sort, you can construct a model. And thus, from a certain point of view (…the point of view of mathematicians, economists, engineers, or…), there’s always a model. It can be implicit, it can be explicit – but it’s there somewhere. A model is an explanation, and it’s always possible to come up with an explanation. So when you see a model you don’t like, it’s not very helpful to say that ‘it’s only a model’. What else would it be? Whatever alternative you have in mind is, from a certain point of view, also a model. If the model presented is an inaccurate representation of the problem at hand, then it’s the inaccuracy that should be the subject of criticism, not the fact that it’s a model.

Most people dislike formal models that are very specific and give very precise estimates. They know instinctively that these models are simplistic and that the real world is much more complicated than the models – so the perceived over-precise estimates may be way off and may even seem downright silly. Skepticism is warranted, surely. But the precision is also a very helpful aspect of such models, because precision allows us to be demonstrably wrong about something. I’d argue that this is also an important part of why such models are disliked. Many people who’ve worked a bit with models have a quite low regard for formal models because they know the assumptions are driving many of the results. They are skeptical and prefer the models in their own minds. Those ‘mind models’ are much less specific, much more flexible and much less likely to actually generate testable hypotheses. It’s not that they are necessarily wrong – it’s more that they’re unlikely to ever be proven wrong. People who’ve not worked with models are also skeptical of models, and their mind models are even less specific and testable than those of people who have.

Here’s the thing: If you think that it makes good sense to be skeptical of models where assumptions are clearly stated beforehand, where parameters/parameter estimates are generated through a clear and transparent process and where limitations are addressed, then you should be a lot more skeptical of models where these conditions are not met.

Most people prefer vague models because they are more convenient. You’re less likely to be proven wrong; you’re less likely to take a stance that is at odds with the tribe; and if the model is general enough it will be able to predict anything, making you think that you’re always right. They’re also often less computationally expensive to formulate.

Here’s one hypothesis from a model: ‘Immigrants from country X are 2.5 times as likely to have a criminal record as non-immigrants.’
Here’s another hypothesis: ‘Immigrants from country X are more likely to have a criminal record than are non-immigrants.’
Here’s a third hypothesis: ‘Some immigrants from country X have a criminal record.’
Here’s a fourth hypothesis: ‘Some people commit crime.’

Which one of these hypotheses has the greatest information potential, that is, the potential to tell us the most about the world? The first one, given that the other three are also true if that one is. Which one is most likely to be considered correct when evaluated against the evidence? The last one.
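
To make the difference concrete, here is a minimal Python sketch – with made-up counts, purely for illustration – of how the first, precise hypothesis can actually be confronted with data, while the last two are nearly impossible to get wrong:

```python
import math

# Made-up counts, purely for illustration -- not real data.
immigrant_crime, immigrant_total = 150, 2_000
native_crime, native_total = 600, 20_000

p_imm = immigrant_crime / immigrant_total   # 0.075
p_nat = native_crime / native_total         # 0.030
rr = p_imm / p_nat                          # observed relative risk: 2.5

# Approximate 95% confidence interval for the relative risk (log-RR method).
se_log_rr = math.sqrt(1 / immigrant_crime - 1 / immigrant_total
                      + 1 / native_crime - 1 / native_total)
rr_lo = math.exp(math.log(rr) - 1.96 * se_log_rr)
rr_hi = math.exp(math.log(rr) + 1.96 * se_log_rr)

print(f"Hypothesis 1 ('2.5 times as likely'): observed RR {rr:.2f}, 95% CI ({rr_lo:.2f}, {rr_hi:.2f})")
print(f"Hypothesis 2 ('more likely'): supported only if the whole CI lies above 1 -> {rr_lo > 1}")
print("Hypotheses 3 and 4: true as soon as a single relevant case exists")
```

If the interval had come out at, say, (1.2, 1.6), the first hypothesis would be rejected while the second survived – which is exactly the sense in which the precise claim risks more and can therefore tell us more.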

From an information processing point of view, having nothing but correct beliefs you are certain about is not a good thing. That’s a sign that your models are very poor and don’t contain a lot of information. If you never seem to be (/realize you’re) wrong, that’s a sign that you’re doing things wrong.

Sometimes the ‘models’ we make use of when evaluating evidence are of the variety: ‘I’d like X to be true (because Y, Z), so obviously X is true.’ Sometimes that’s the model you use when you reject the presented formal model with a beta-estimate of 0.21 and a standard deviation of 0.06. This is worth having in mind.
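
As an aside, part of what makes such a reported estimate worth engaging with is that the numbers can be checked directly. A quick back-of-the-envelope sketch – assuming, and this is my assumption, that the 0.06 is the standard error of the estimate:

```python
beta, se = 0.21, 0.06   # reported coefficient and (assumed) standard error

z = beta / se                                 # 3.5
ci_95 = (beta - 1.96 * se, beta + 1.96 * se)  # roughly (0.09, 0.33)

print(f"z-statistic: {z:.1f}")
print(f"approximate 95% CI: ({ci_95[0]:.2f}, {ci_95[1]:.2f})")  # interval excludes zero
```

Whether you then accept or reject the estimate, you are at least arguing with a specific, checkable number rather than with the ‘I’d like X to be true’ model.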

On a related note, of course not all models are about generating hypotheses and testing them – some of them are rather meant to be used to illustrate certain aspects of a problem at hand in a simple and transparent manner. It’s always important to have in mind what the model is trying to achieve. That goes for the ‘mind models’ too. Are you trying to learn new stuff about the world, or are you just trying to be right?

Here’s a related LW-post from the past.

November 3, 2012 - Posted by | rambling nonsense, random stuff, Rationality

5 Comments

  1. I hope we can exchange a few posts on this, since it’s a topic I am very interested in, and you are knowledgeable of, so it’d be great to bounce some ideas back and forth.

    Yudkowsky’s post is great. Falsifiability and what he calls “floating” models are the gist of my beef with modern macro-economics (and sociology). It seems to me that engineering and physics (excluding cosmology and string theory) are mostly free of floatability. There are other tricky points, of course.

    For example, as Yudkowsky’s koan about the falling tree and the sound illustrates, language/definitions often make models all but useless. To use your proposed model set, “We want to study the relationship between immigration and crime rate” sounds like a commendable goal. However, no scientist these days can do that. You can study, for example, the crime rates of immigrants from a number of countries into Denmark – probably because Denmark makes the data public and convenient to use (hello, selection bias, how are you?). So, you are already studying not immigration and crime, but immigration and crime in Denmark. Perhaps useful for Denmark, but does it tell you (translate for falsifiability: can you predict) anything about the US or Argentina? Your data set also uses crime reported to and investigated by Danish police. Thus, you are not studying crime and immigration in Denmark, but reported crime there, omitting the black eye that the girl from Bejikistan got from her father for trying to suggest she does not want to marry her knuckle-dragging first cousin back home in order to bring him into the country. Furthermore, as you have noted here, in Denmark a definition of “immigrant” is used that, while very reasonable, may not be the same as the one in Sweden. “Crime” is an equally malleable concept – for example, in US federal prisons (in 2004), over 12% of prisoners were held for marijuana violations. Would they even be in the stats in Denmark, or in Portugal? This can go on, but even at this point, is the study worth the paper it’s printed on? The hypothesis is so narrow and parochial as to justify, IMHO, the suspicion of being self-serving.

    If this is too abstruse, how about this: if in testing your hypothesis, you find it’s true while excluding a few outliers from your data, and it’s not true, when those outliers are left in, have you learned anything? Other than, of course, that your model works when it does and does not when it does not… (Disclaimer: personal pet peeve. I run a lot of investment models, and weeks with less than 4 days of trading are a major PITA. 9/11 was one; now Sandy added another.) How many models do not adjust for outliers?

    A more general question that I would like your opinion on is: are there situations where having no model is better than having a model? I am tempted to say yes, and to add that they do not merely exist, but are numerous, although I can be persuaded otherwise. Example: I am a business owner in the US, and I want to hire an employee. I have two candidates – one black, and one Asian. They are identical in all other individual respects. Flipping a coin is a model, I suppose, but kind of a primitive one. Meanwhile, all studies/models suggest that, for the unobservable variable I care about (future performance in the work place), Asians are superior to blacks – more diligent, less litigious, better team-players, etc. Perhaps I should pick the Asian fellow (remember, ceteris paribus). You may argue that these are just bad models, and can be improved (as many on the left do) by controlling for various cultural attributes, history, etc. Fair enough – then I end up discriminating against the Asian fellow for the fact that his ancestors did not work in the cotton plantations, but in the rice paddies in Asia. Am I better off with these models, or without them? Is society at large better off with them or without them?

    To synthesize my argument a bit: I claim that most macro-economic and sociological models are akin to taking one or a handful of partial derivatives on a problem with, at best, hundreds of variables – often many more. Can it help you to follow the implications of such a model? Sure. Can it hurt you? Sure. How do you do the cost/benefit analysis? You can’t, unless you incorporate in the models variables you may have no data for, or variables that are so numerous that the model is computationally intractable. Add here cautions about time series, where we not only lack enough history, but conceptually never can have it. GDP in the 1940s does not compare very well with GDP today, with services and information being different from manufactured goods in pretty fundamental ways, and wartime GDP does not compare well with peace-time GDP. The US and Germany both manufacturing an anti-tank round adds to both their GDPs. The US and Germany both blowing up a tank does not count as a decline in GDP. GDP ≠ welfare.

    You say: “And thus, from a certain point of view (…the point of view of mathematicians, economists, engineers, or…), there’s always a model.” But is that always (or even most of the time) a good thing? Is a bad model better than no model at all? Is solving the problem of winning a horse race for a spherical horse in vacuum helpful in any way (it can clearly hurt, when some genius puts actual horses in vacuum)? Most models do not even start from the full model (reality) and take away a few features. They instead start by considering a few features and omitting everything else. Modeling a human being with all its parts in place and functioning except, for example, cell communication over long or short distances, leaves you with a human being without hormones or neurotransmitters, and thus not viable. How many macro models model interactions between you and me over the internet? How about a human being missing only the feedback between the muscles and the brain? Result: paralysis and death. How many macro models (to parallel the analogy) model tax rates (output of the gov’t/brain) with regard to pleading by special interests and regulatory capture, and how many of those that actually do (I guess all three of them) don’t also assume the body has no pain sensitivity and adaptive abilities?

    Apologies for the long and disjointed post. I went over it a second time, tried to add some structure, and gave up – not much improvement, and too much temptation to add more.

    Look forward to your thoughts.

    Comment by Plamus | November 5, 2012

  2. i. Specific hypotheses like the ones you discuss in the first part of your comment (‘Danish-Turkish immigrants’ reported crime rates related to non-drug related infractions during the years 1996-2004…’) are the result of one type of model; a type of model that, perhaps quite often due to the limitations of the available data, has a scope that is too narrow to make it all that useful for many of the people who would like to use it to support a specific agenda they favour. As I said, ‘It’s always important to have in mind what the model is trying to achieve.’ If you use the wrong type of model to think about a problem or if you don’t fully understand the model you’d like to use to help you make better decisions (which you don’t, if you try to use a Danish model with Danish data to decide on optimal US immigration policy recommendations), you can get into trouble – sure. That’s an argument for thinking about what you’re doing, and what the model is doing, not an argument for not using models.

    Just because concepts like ‘crime’ and ‘immigrant’ are malleable when people make formal models doesn’t mean they aren’t when formal models aren’t applied. They’re much more malleable in the latter case; when people make a formal model and test a hypothesis, they define their terms first. So, in the given context, you know what an immigrant is (‘had Turkish citizenship before 1970…’ or whatever), at least according to the person doing the modelling. If the model doesn’t really do a good job explaining the variation in the data, it can be reformulated until it does a better job – if, say, illegal immigrants are important then you can include estimates of them as well. If you’re (/not) interested in drug-crimes you can reformulate the model. If you want to know if there’s a problem with poor Bejikistani girls with a black eye, you can look at estimates of unreported crimes and include that type of stuff in the model. Every time you come across a bad model where you can spot problems, you also automatically stumble upon ways to, or at least ideas about how to, improve the model. This is part of what’s great about models; if you don’t discuss problems in terms of models, such conceptual improvement mechanisms are much more elusive and harder to implement. As for the “if in testing your hypothesis, you find it’s true while excluding a few outliers from your data, and it’s not true, when those outliers are left in, have you learned anything?” – I’d say that yes, you’ve learned something; you’ve learned that the stuff you’ve included in your model does not explain all the variation in the data, that it would probably be a good idea to see if you can somehow find additional stuff to include in your model that can help explain the outliers, and that you should be very cautious about drawing (some types of) conclusions from the model.

    ii. I’m not sure I understand the Asian/black-example; if previous analyses that have tried to take into account unobserved heterogeneities have found that Asian applicants are more likely to be good job candidates, you should pick the Asian applicant if the two candidates are otherwise identical. If all else isn’t equal, and it isn’t if such unobserved heterogeneities are important, you shouldn’t use a model with that assumption. If you’re an employer seeking to hire someone, you don’t have time to waste on controlling for ‘cultural attributes, history,…’ – again, ask yourself what your model is trying to achieve. You don’t want a completely accurate model here; perfect accuracy is way too expensive. You just want to find the guy who’s most likely to do a good job, and you’re solving this problem subject to time- and budget constraints that make elaborate models unlikely to be optimal to implement. More generally, of course one should only use formalized models to the extent that the benefits from doing so exceed the costs, and as you touch upon yourself some types of models perform better – as in, are more helpful when it comes to facilitating decision-making and drawing conclusions from the data – in some settings than in others. Again, this is from my perspective an argument for thinking about what you’re doing and which models to apply to which types of problems; it’s not a strong argument for giving up on models.

    iii. I use models to help me think about stuff, and I sometimes find that it helps me to formalize stuff a bit (otherwise I wouldn’t be doing it). Most people are different. Anyway, to formulate a model, even a simple and semi-implicit model, tends to help me to identify key variables, help me think about how they are likely to relate to one another, help me figure out which tradeoffs are important, and similar stuff. The alternative isn’t ‘the full model’ (knowing everything, including where all the blood vessels go and which neurons send signals when and how..); the alternative is to have not even the slightest clue what’s going on. This is also part of why I often use the blog to write about social interaction stuff in a more formal way than most people probably do; I do it because if I don’t think about that stuff in a semi-systematic way, I don’t really feel that I understand what’s going on.

    Maybe the stuff I think up when I engage in that type of thinking/modeling is wrong, because the model is poor, but then I can reformulate the model and obtain new insights. A flawed model is useful because it can be improved; the alternative will often be an uncoordinated muddle of unstructured thoughts, and separating the good ideas from the bad in such a mix is very hard.

    iv. Economic models including special interests, regulatory capture stuff, growth rate sensitivity to tax rates, and similar stuff you can find in the public choice/new political economy literature. I recommend Mueller as a place to start – I read it cover to cover last year and though I’ve probably forgotten most of it by now, I remember it as reasonably interesting stuff.

    (v. “[a topic] you are knowledgeable of” – what on Earth gave you that idea?)

    Comment by US | November 5, 2012

  3. Thanks for your response. I’ll try to respond to some specific points of yours – not trying to fisk your post, it just makes mine more structured.

    “If you use the wrong type of model to think about a problem or if you don’t fully understand the model you’d like to use to help you make better decisions (which you don’t, if you try to use a Danish model with Danish data to decide on optimal US immigration policy recommendations), you can get into trouble – sure.” – That’s where, though, we need to define who “you” is. If “you” is indeed you, and the model is for your own consumption, that’s dandy. But what if “you” is a generic third party – the public, policy-makers, special interests? What if you know that your model, which is very good for its limited purposes, will be misunderstood and misapplied? The model as an abstract concept may be immaculate, but what happens when it is combined with an ignorant and/or partial user?

    “If the model doesn’t really do a good job explaining the variation in the data, it can be reformulated until it does a better job – if, say, illegal immigrants are important then you can include estimates of them as well.” – Two problems. One, the green jelly beans cause acne problem. Two, let’s assume I have the significant and robust model that tells me immigrants commit more crime. I strongly suspect (gut feeling, anecdata, etc.) that illegal immigrants commit the extra crime, and that legal ones are indistinguishable from the native population in that aspect. I have no data on legal/illegal immigrant crime. Do I use the model as is, or do I discard it? My Asian/black example was meant to illustrate a similar situation. If I apply the model for hiring, I may end up penalizing the black fellow for circumstances beyond his control – that people of his skin color tend to do or not do certain things. As the Keynes vs Hayek video puts it, “That simple equation… too much aggregation… ignores human action and motivation.” I am no fan of arguments like “institutionalized racism/sexism/homophobia”, but… if a similar model is widely used, I can see exactly such an outcome arising.

    “The alternative isn’t ‘the full model’ (knowing everything, including where all the blood vessels go and which neurons send signals when and how..); the alternative is to have not even the slightest clue what’s going on.” – But, again, is that a bad thing? Hubris and arrogance are all too human, as are self-delusion and impulsiveness. Ancient and medieval doctors (up until the late 19th century) had a “model” that bloodletting helped fever patients, based on all kinds of “humoral balance” twaddle. The practice probably killed millions of human beings throughout history. 2000 years of practice did not discredit it. Would it not have been better for Herophillus to keep his mouth shut about his “model”, rather than unleash it on a world unable to accept it critically and test it appropriately? History shows that time and again bad models take on a life of their own and wreak untold damage. When humans do not understand something, they tend to tiptoe around it, fear it, and leave it alone. As soon as they think they understand it, they want to change it and bend it to their preferences. Complex systems seem to respond much better to an individual approach and a very light touch than to models that claim to describe them. Modelling the behavior of a particle in a uniform gas seems to be a good target for modelling; economies, human bodies, societies, human interactions do not seem to be.

    “Economic models including special interests, regulatory capture stuff, growth rate sensitivity to tax rates, and similar stuff you can find in the public choice/new political economy literature.” – I have done some (probably insufficient) reading on public choice, and found it too abstruse. The conclusions are almost always along the lines of “well, yeah, this and this will happen, and it’s not good, so we can limit it a bit on the margin if we tweak this parameter”. If the conclusion is “holy mother of FSM, the whole system is f**ked up, we gotta tear it apart and start anew”, the researcher is branded as a loony radical, and his findings collect dust on a shelf in some university basement. Is a model any good if it’s correct, but its prescriptions lack path dependency, i.e. how do we get from here to there in a realistic way? It may help understanding for those few who bother to try and have no vested interest in keeping the status quo – but that’s the kind of understanding that breeds (well, in my case, at least) resignation, despair, and cynicism.

    “[a topic] you are knowledgeable of” – what on Earth gave you that idea?” – Oh, I dunno, the fact that you study economic models on an almost daily basis🙂. When I encounter them in econ papers, I do not normally try to work through the implications, test the sensitivity to the assumptions, and all that good stuff – I have neither the time, nor (any longer) the analytical tool-set to do it. The last time I derived the full set of partial derivatives of a non-trivial function was probably 7 or 8 years ago.

    Comment by Plamus | November 6, 2012

  4. i. There are different types of models, and of course it depends on the type of model to what extent ‘I’ can be held accountable for how ‘other people’ use ‘my’ model. Models dealing with stuff like politics, resource allocation and similar stuff are always going to be abused, but does that mean we’re better off without models? I don’t see how you’ve made the case for that at this point – when evaluating the performance of models you also need to figure out what would happen in their absence. Without formal models in those areas people would just make use of other conceptual devices to try to get their way; lawyers (and politicians) don’t use mathematical models very much, but they’re great at obfuscating matters and manipulating agendas even so.

    ii. a) ‘Green jelly beans cause acne’ stuff, publication bias and all sorts of similar concerns are legitimate problems a person will confront when relying on models to make sense of the world. But again, people who know a bit about this stuff should know better than to put much trust into studies with such shaky foundations, and people who don’t should be very skeptical about the conclusions anyway because they probably don’t fully understand how they came about. Of course then you might add stuff like Dunning-Kruger and the ‘shoulds’ become less relevant, but if you assume the non-model alternative (model) performs better it also needs to address such questions. And again, without the models it’s harder to update/correct mistaken beliefs.

    b) “I strongly suspect (gut feeling, anecdata, etc.) that illegal immigrants commit the extra crime, and that legal ones are indistinguishable from the native population in that aspect. I have no data on legal/illegal immigrant crime. Do I use the model as is, or do I discard it?” – Okay, first of all the model has made it clear that the combined group of legal and illegal immigrants commits more crime. This is what the model can tell you, and if it doesn’t include data about which sub-groups commit that excess crime, and indeed no such data even exist, you should probably try not to assume anything (as such beliefs are likely not well justified). In that hypothetical I’d probably encourage you to examine where your ‘gut feeling’ is coming from (i.e. ‘model it’), because it can help you establish whether your intuition makes sense or not, and help you reformulate the model so that the problem you’re interested in is addressed. For example, if the gut feeling is the result of a difference between the two groups when it comes to other types of data which also correlate with crime, your hypothesis/gut feeling is probably more likely to be true than if you try to justify it to yourself by arguing that you saw two illegal immigrants get arrested yesterday.

    “I may end up penalizing the black fellow for circumstances beyond his control” + the institutional racism stuff: That stuff is a lot older than are formalized models used by employers to find good employees. And you need to again think about what’s going on when people use these models: When employers systematically use data to put into models in order to figure out whom to hire, they do it because the model will give them a better chance at picking the best guy for the job than they would have if they didn’t rely on the model – that’s why they use the models in the first place. If a different model, say one that assumed that blacks don’t do that badly, performed better in terms of matching jobs and workers, employers would have an incentive to implement it. This is, I believe, one of the standard arguments against the institutional racism equilibrium; but whether that equilibrium is plausible or not, an argument can certainly be made that if it is, it’s not the result of the employers’ search models.

    iii) Instead of the medicine example you could also have brought up Marx’ model of economic development. I bring this up so that we’re clear that I can see the downsides. But as I mentioned earlier, you haven’t done much to explain to me what the alternative scenario looks like; if humans don’t rely on formal models, informal and badly specified models take over and they are worse. Funnily enough, you could again use my example with Marx here – you could surely find people on both sides when it comes to an argument over whether Marxism is best thought of as a ‘model’ or not, and/or which kind of model it is/isn’t. Anyway, knowledge is better than ignorance and striving for knowledge is better than accepting defeat. I refuse to accept that we should just give up on modeling stuff we don’t understand. We should try to model the stuff we want to understand, and if the models don’t help us we should try out other models, try to seek out different angles or related problems of a magnitude that we might be able to handle. Keeping in mind the limitations of the models we use to understand the world is much, much easier if those models are formalized and explicit.

    To make it perfectly clear, I feel confident we won’t ever get the kind of ‘Model’ of ‘The Economy’ that some macroeconomists have (wet?) dreams about. Certainly not one that’s ‘controllable‘ to the extent they’d like it to be. So there are types of models we are unlikely to have success with. The fact that some models we’d like to get our hands on remain elusive to us (quantum gravity would be another example) doesn’t mean we should stop asking questions, stop looking at the related problems that we can address, test hypotheses on a smaller scale, and then try to add the stuff we learn from these endeavours together to draw new parallels and obtain new insights… No, this is precisely what we should be doing. Because it adds to our knowledge of the world.

    Shorter me: Using bad (explicit) models may sometimes make us worse off than not using (explicit) models at all, but if you use good models the right way you beat both of the alternatives. Using the right kind of models the right way is hard, but that’s what we should be trying to do.

    “the fact that you study economic models on an almost daily basis” is a fact that’s easy to overlook in that context when your lecturers are mostly people who’ve been doing that for decades – I consider them to be the knowledgeable guys when it comes to that stuff, not me. I incidentally rarely read economics stuff ‘outside of work’ – there’s a lot of much more interesting stuff out there that I’d rather spend my time on.

    Comment by US | November 6, 2012

  5. “… but does that mean we’re better off without models?” – I did not mean to make such a bald (and stupid) claim, and I apologize if I left the impression that I did. I fully agree that most of the time models are very helpful. I do, however, claim that there exist situations, systems and circumstances where having the best model possible (because of data constraints, computing power, complexity of the modeled system, etc.) is inferior to defaulting to a trivial model – coin flip, play it by ear, wait and see, trial and error. Here’s my modified version of the Serenity Prayer: “FSM, grant me the serenity to accept the things I cannot model, the courage to model the things I can, and the wisdom to know the difference.”

    “The fact that some models we’d like to get our hands on remain elusive to us (quantum gravity would be another example) doesn’t mean we should stop asking questions, stop looking at the related problems that we can address, test hypotheses on a smaller scale, and then try to add the stuff we learn from these endeavours together to draw new parallels and obtain new insights… No, this is precisely what we should be doing. Because it adds to our knowledge of the world. ” – Amen, brother. That’s how science should be done. More experimental economics and less macro, more trial-and-error and less philosopher-king edicts, spend more money on researching nuclear fusion and less on fine-tuning the Arrow-Debreu model.

    I think we are in general agreement, only your advice (fully correct) was focused on what we should be doing, and mine more on what to do when we cannot do what we should be doing.

    Comment by Plamus | November 7, 2012

