How couples meet

The figure is from Searching for a Mate: The Rise of the Internet as a Social Intermediary, by Rosenfeld and Thomas.

“we show that gays, lesbians, and middle aged heterosexuals- three groups who inhabit thin markets for romantic partners- are particularly likely to have found their partners online. Individuals are in a thin market for potential partners when the cost of identifying multiple potential partners who meet minimum criteria may be large enough to present a barrier to relationship formation. We propose that for single adults in thin dating markets, improvements in the efficiency of Internet search may be especially useful and important. Conversely, single people (college students, for example) who are fortunate enough to inhabit an environment full of eligible potential partners may not need to actively search for partners at all.”

The last sentence of that passage had me laughing, but it’s an interesting paper. Of course, in general they’re probably right – in the discussion they note that:

“Young heterosexual adults, who we presume to be among the most technologically savvy people in society, are among the least likely to meet partners online. Young adults have single others all around them which renders the search advantages of the Internet mostly irrelevant. In environments rich with potential partners, old fashioned face-to-face socializing still trumps online search.”

Here’s another interesting observation:

“Searching the personal advertisements in the pre-Internet era meant thumbing through the newspaper classified section by hand. Print advertisements could only be examined one issue at a time. Perhaps that is why only 4 out of 3,009 couples in the dataset reported meeting through the newspaper classifieds (even though a majority of the sample met before the Internet era).”

Lastly, some tables from the paper:

(there’s basically no difference)

Note that there’s again pretty much no difference. Only the ‘met-through-friends’-variable was significant for the adjusted odds ratio measure and maybe that’s just a fluke. The raw ‘met-in-church’ odds ratio is highly significant, but once you control for relationship duration, children, race, religion and other stuff, the effect disappears completely.

July 4, 2012 - Posted by | Data, dating, Demographics, Papers


  1. Mmmm, data 🙂 Let’s see what we have here.

    Some observations:

    1) What’s with the temporary blip in “Met Online” for heterosexual couples in Figure 1 in the early to mid-80’s? Not big in absolute terms, to be sure, but given the good sample size (N=2462), I’d guess it’s highly statistically significant. It also does not seem to be there for same-sex couples – although it’s hard to judge, since that data series begins right around that time.

    2) Hmmm, wait a second. Since 2006 (seems to me to be the last year of growth for “Met Online”):
    “Met Online” is flat
    “Met through Friends” is down 4%
    “Coworkers” is down 4-5%
    “Met in College” is down 2%
    “Neighbors” is down 1-2%
    “Family” is down 3-4%
    “Met in … School” is down 2%
    “Met in Church” is down 3%
    So, a decline in the sum total of these categories of 19-22% is offset by a gain of only 4-5% in “Bar/Restaurant”? Even if my eyeballing is terrible, and I am off by a full percentage point on all of these, that’s still 12-15%, offset by 5%. Something does not add up here. It’s most likely (as they mention) the pesky “more than one category can apply” – but that in and of itself is an indicator of a crappy dataset. Since categories overlap, you must be damn sure you pose questions that make overlaps clear and accountable. If you (US) tell me about a girl in Denmark who would totally dig a weirdo like me and give me her Facebook page (I do not and will never use FB, but bear with me), and I chat with her there, and then fly to Denmark to meet her in a nice restaurant over some delicious Flæskesteg… did I meet the fine lady through a friend, online, or in a restaurant? Or any two? Or all three?
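The totals in point 2 can be re-added in a few lines of Python. This is only a sketch: the (low, high) ranges are the commenter's eyeballed readings of Figure 1, not values published in the paper, and the variable names are made up for illustration.

```python
# Eyeballed declines since 2006, in percentage points, as (low, high) ranges.
# These are the commenter's readings of the figure, not data from the paper.
declines = {
    "Met through Friends": (4, 4),
    "Coworkers": (4, 5),
    "Met in College": (2, 2),
    "Neighbors": (1, 2),
    "Family": (3, 4),
    "Met in ... School": (2, 2),
    "Met in Church": (3, 3),
}

low = sum(lo for lo, _ in declines.values())
high = sum(hi for _, hi in declines.values())

# Allow for an eyeballing error of one full point on every category.
adj_low = low - len(declines)
adj_high = high - len(declines)

print(f"total decline: {low}-{high} points")
print(f"after shaving one point per category: {adj_low}-{adj_high} points")
```

Either way, the summed decline stays well above the 4-5 point gain read off for “Bar/Restaurant”.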

    3) Ah, smoothing… economists love it for a reason – it can hide a whole lot of things you do not want to have to explain. Whenever smoothed data is used, when there is no compelling reason for that (such as seasonality… and even then!), you should ask “why?” The flattening “Met Online” for heterosexual couples is… strange. It’s the only “sharp” turning point in the whole chart. I admit I have not used LOWESS smoothing, so have only passing theoretical familiarity with it. But for “Met Online” they used a 5-yr moving average, and that I know well. For this to turn sharply flat, you need a significant drop in the “raw” data. I tried to simulate their data, and came up with something like this: 5,7,9,11,13,15,17,19,21,23,25,21,20,20,23,25,21… which gives you a 5-yr moving average of 9,11,13,15,17,19,21,21.8,22,21.8,21.8,21.8,21.8. I am sure you notice the big drop in the first (raw data) mock series from 25 to 21 to make the MA flatten out at 22-ish. What the hell happened in 2006?
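The mock series in point 3 is easy to reproduce. The sketch below assumes a simple 5-value window slid along the series; whether the paper's moving average is trailing or centered is not stated here, and that choice only changes which year each value is attached to, not the values themselves. The `moving_average` helper is illustrative, not the authors' code.

```python
def moving_average(values, window=5):
    """Mean of each run of `window` consecutive values, rounded to one decimal."""
    return [
        round(sum(values[i:i + window]) / window, 1)
        for i in range(len(values) - window + 1)
    ]

# The commenter's made-up raw series: steady growth, then a drop from 25 to 21.
raw = [5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 21, 20, 20, 23, 25, 21]

smoothed = moving_average(raw)
print(smoothed)
# [9.0, 11.0, 13.0, 15.0, 17.0, 19.0, 21.0, 21.8, 22.0, 21.8, 21.8, 21.8, 21.8]
```

The smoothed series climbs by 2 per step and then sits at about 22, which is the sharp flattening described; it takes the fall to around 20-21 in the raw series to produce it.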

    Thanks for a stimulating post, US.

    Comment by Plamus | July 5, 2012 | Reply

  2. And thanks for a stimulating comment.

    1) I noticed the spike in the 80s too, but I had no idea what was going on, so I decided not to comment on it.

    2) FWIW, I’m about as excited about Facebook as you are, so you’ll never get a Facebook link from me. Another kind of link perhaps. But Denmark is probably not the best place to look (I may touch upon this matter in more detail in a post later on; I have been thinking about writing a post about that a couple of times over the last few days). I was thinking about whether it made sense to say that the number of category overlaps has risen over time, simply by virtue of the way the internet works, but I’m not sure where I was going with that anyway.

    3) I didn’t think about the implied 2006 drop; this is a good observation. I was only thinking about the last period as a whole and how the pattern observed made good sense if we’ve just moved to a new equilibrium.

    Comment by US | July 5, 2012 | Reply

  3. I gave this some further thought, and now I am ashamed – I think we both should be, US. We let something glaring slip by us. In my previous comment I said “Whenever smoothed data is used, when there is no compelling reason for that (such as seasonality… and even then!), you should ask “why?””. Great intuition, but I never followed through on it. It should have been obvious.

    Their sample sizes (2462 and 462) only look adequate in aggregate, but they are in fact woefully small for the stratified analysis they try to run. This N=2462 for heterosexual couples is spread over 74 years – 1936 must be their starting year so that they have a 5-yr moving average for 1940. That’s on average only ~33 observations per year. For the less popular categories (college, church) that means only 2 or 3 responses per year initially; and given the declines in the last few years (remember, smoothing!), I bet they dropped to zero. No wonder they smoothed – for one, plotting the raw data would give you very step-wise plots that would beg the question of why so few observations; for another, the relative inter-year variance would be huge; and for yet another, bye-bye significance on category level. It’s even worse for the homosexual couples analysis – N=462, 30 years, that’s 15.4 observations per year. For example, the raw data for “Coworkers” for homosexual couples, which starts at 18% and ends up at 6% (and declining rapidly!), most likely starts at .18*15.4=2.772 on average for 1980 through 1985… let’s be generous and call it 4,4,3,2,2 observations per year, and then goes something like a few 3’s, a lot more 2’s, then some 2’s and a lot more 1’s, and lately about even 1’s and 0’s.
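The per-year arithmetic above can be checked directly. The sketch below assumes, as the comment does, that observations are spread evenly across years (74 years for heterosexual couples, 30 for same-sex couples). The standard-error line is an addition for illustration: it uses the textbook formula for a proportion under simple random sampling, which is an assumption about the data and not something taken from the paper.

```python
import math

# Sample sizes and year spans as given in the comment.
hetero_n, hetero_years = 2462, 74
same_sex_n, same_sex_years = 462, 30

hetero_per_year = hetero_n / hetero_years        # about 33 couples per year
same_sex_per_year = same_sex_n / same_sex_years  # 15.4 couples per year

# "Coworkers" share for same-sex couples at the start of the series (eyeballed 18%).
coworkers_share = 0.18
coworkers_per_year = coworkers_share * same_sex_per_year  # about 2.8 couples

# Rough standard error of an 18% share estimated from one year's sample,
# assuming simple random sampling (illustrative assumption).
se = math.sqrt(coworkers_share * (1 - coworkers_share) / same_sex_per_year)

print(f"heterosexual couples per year: {hetero_per_year:.1f}")
print(f"same-sex couples per year:     {same_sex_per_year:.1f}")
print(f"'Coworkers' couples per year:  {coworkers_per_year:.2f}")
print(f"standard error of the share:   {100 * se:.0f} percentage points")
```

Under these assumptions a single year's 18% share carries a standard error of roughly 10 percentage points, which is why unsmoothed yearly values would jump around so much.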

    This is just bad, nay, horrible science. A sample size 10 times what they have for heterosexual couples, and 20 times what they have for homosexual couples would be barely adequate for their analysis. Given their use of LOWESS, I hesitate to accuse them of statistical ineptitude – so the obvious explanation, unless I am missing something, is obfuscation – something along the lines of “We have this neat data set… It’s small, but we can make a nice publishable paper out of it. Crap, these plots look BAD – how can we “massage” them to look less suspicious and yet seem to support an interesting enough thesis?”. To note, “sample size” is only mentioned twice in the paper, and one of those is in the explanation under Figure 1. Totally uncool, dudes.

    Again, thanks for a great post, although great not so much for the insight into mate selection as for keeping an eye out for bad statistics.

    Comment by Plamus | July 6, 2012 | Reply

    • Awesome comment – the small effective sample sizes are probably also behind the spike in the number of couples having met online in the 80s. When I first looked at the figure, it seemed to me that a small n was the only kind of explanation that made any kind of sense, but then I disregarded that explanation even though it was staring me right in the face; I reasoned that with a sample size in the thousands, even if they asked a couple of people working at CERN at the time, it shouldn’t make such a difference given that they smoothed the data. Of course the truth of the matter is that smoothing worked precisely the opposite way; it probably emphasized to a significant degree what should have been thought of as a completely negligible data point. They didn’t need to ask a big number of such couples to get that kind of effect – what they needed was one couple, one of those years, and the smoothing would take care of the rest, making it look like a ‘mid-80s spike’. With a sample size that small, they would be very ‘lucky’ to have such a couple in their data set in the first place, but it could happen, and by now it seems to me by far the most likely explanation.

      ‘Totally uncool’ indeed.

      Comment by US | July 6, 2012 | Reply
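The one-couple explanation can be sketched numerically. Everything here is hypothetical: a flat 33 heterosexual couples per year (the rough average worked out in the comments), exactly one couple who met in 1985 reporting “met online”, and a plain 5-year moving average standing in for whatever smoothing the authors applied.

```python
def moving_average(values, window=5):
    """Mean of each run of `window` consecutive values."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

couples_per_year = 33            # assumption: rough yearly average from the comments
years = list(range(1980, 1991))  # 1980 through 1990

# Hypothetical raw counts: a single "met online" couple, in 1985.
online_counts = [1 if year == 1985 else 0 for year in years]
raw_share = [100 * count / couples_per_year for count in online_counts]

smoothed = moving_average(raw_share)
nonzero = [round(value, 2) for value in smoothed if value > 0]

print([round(value, 2) for value in raw_share])
print(nonzero)
```

In this toy setup the single couple shows up as a one-year value of about 3% in the raw shares, and the moving average turns it into five consecutive years at about 0.6%. A lone data point thus reads as a multi-year blip instead of an obvious outlier.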
