How Big Data Informs Economics

In A Fistful of Dollars, Clint Eastwood challenges Gian Maria Volonte with the words, “When a man with a .45 meets a man with a rifle, you said, the man with a pistol’s a dead man. Let’s see if that’s true. Go ahead, load up and shoot.”

Those are the right words to challenge big data, which recently reappeared in economics debates (Noah Smith, Chris House via Mark Thoma). Big data is a rifle, but not necessarily the winning weapon. Economists must have special reasons to abandon small datasets and start messing with more numbers.

Unlike business, which only recently discovered the sexiest job of the future, economists have been doing analytics for the last 150 years. They have dealt with “big data” for half of that period (I count from 1940, when the CPS started). So, how can the new big data be useful to them?

Let’s find out what big data offers. First of all, more information, of course. Notable cases include predicting the present with Google and Joshua Blumenstock’s use of mobile phones in development economics. Less notable cases encounter the same problem: a decline in the quality of data. Compare the long surveys that development economists collect when they run experiments with what Facebook dares to ask its most loyal users. Despite Facebook having 1.5 bn. observations, economists end up with much better evidence. That’s not about depth alone. Social scientists ask clearer questions, find representative respondents, and take nonresponses seriously. If you do a responsible job, you have to construct smaller but better samples like this.

Second, big data comes with its own tools, which, like econometrics, are deeply rooted in statistics but ignorant of causation:

Big data tools

The slogan is: to predict and to classify. But economics does care about cause and effect relations. Data scientists dispense with these relations because the professional penalty for misidentification is lower than in economics. And, honestly, at this stage, they have more important problems to solve. For example, much time still goes into capacity building and data wrangling.

Hal Varian shows a few compelling technical examples in his 2014 paper. One example comes from Kaggle’s Titanic competition:

Varian - 2014 - Big Data New Tricks for Econometrics

The task requires predicting whether a passenger survived the sinking or not. The chart says that children were more likely to survive than older passengers, while for the rest age didn’t matter. A regression tree captures this nonlinearity in age, while a logit regression that is linear in age does not. Hence, the big data tool does better than the economics tool.
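To make the tree-versus-logit point concrete, here is a minimal sketch in plain Python. It does not use the Titanic data or Varian's code: the passengers are made up (20 per age, a 70% survival rate under age 10 and 35% otherwise, both numbers being assumptions chosen for illustration), the "tree" is a single split on age, and the logit is fit by plain gradient descent. All function and variable names are hypothetical.

```python
import math

# Synthetic, made-up data (NOT the Titanic data): 20 passengers at each age,
# 14 of 20 survive below age 10, 7 of 20 survive at every other age.
ages = list(range(1, 71))
n_per_age = 20
survivors = {a: (14 if a < 10 else 7) for a in ages}
total = n_per_age * len(ages)


def log_loss(prob_by_age):
    """Average negative log-likelihood given a survival probability per age."""
    loss = 0.0
    for a in ages:
        k, p = survivors[a], prob_by_age[a]
        loss -= k * math.log(p) + (n_per_age - k) * math.log(1 - p)
    return loss / total


def fit_stump():
    """One-split tree on age: try every threshold, keep the lowest log loss."""
    best = None
    for t in ages[1:]:
        left = [a for a in ages if a < t]
        right = [a for a in ages if a >= t]
        p_left = sum(survivors[a] for a in left) / (n_per_age * len(left))
        p_right = sum(survivors[a] for a in right) / (n_per_age * len(right))
        probs = {a: (p_left if a < t else p_right) for a in ages}
        loss = log_loss(probs)
        if best is None or loss < best[0]:
            best = (loss, t, probs)
    return best


def fit_logit(steps=5000, lr=0.5):
    """Logit with age as the only regressor, fit by gradient descent."""
    mean = sum(ages) / len(ages)
    sd = math.sqrt(sum((a - mean) ** 2 for a in ages) / len(ages))
    b0 = b1 = 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for a in ages:
            x = (a - mean) / sd  # standardized age keeps the descent stable
            p = 1 / (1 + math.exp(-(b0 + b1 * x)))
            err = n_per_age * p - survivors[a]
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / total
        b1 -= lr * g1 / total
    probs = {a: 1 / (1 + math.exp(-(b0 + b1 * (a - mean) / sd))) for a in ages}
    return log_loss(probs), probs


tree_loss, split_age, tree_probs = fit_stump()
logit_loss, logit_probs = fit_logit()
print(f"tree splits at age {split_age}; log loss {tree_loss:.4f}")
print(f"logit log loss {logit_loss:.4f}")
```

On this toy data the split recovers the step in survival rates exactly, while the logit has to spread a smooth, monotone curve over all ages, so its log loss is higher. The gap comes from the functional form, not from the amount of data.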

But an economist who remembers to “always plot the data” is ready for this. As with other big data tools, it’s useful to know the trees, but something similar is already available on the econometrics workbench.

There’s nothing ideological in these comments on big data. More data potentially available for research is better than less data. And data scientists do things economists can’t. The objection is the following. Economists mostly deal with two types of problems. Type One, figuring out how n big variables, like inflation and unemployment, interact with each other. Type Two, making practical policy recommendations for people who typically read nothing more than executive summaries. While big data can inform top-notch economics research, these two problems are easier to solve with simple models and small data. So, a pistol turns out to be better than a rifle.

2 thoughts on “How Big Data Informs Economics”

  1. Anton, thanks for a thoughtful piece. Big Data does have the potential to give economists a valuable new tool in the form of real-time or near real-time sentiment analysis by tracking new channels such as social media. Alan Greenspan and other economic thought leaders believe the 2008 crisis taught us the importance of measuring “animal spirits” in managing economic health. If policy makers can better predict a plunge in market sentiment, such as that caused by the collapse of Lehman Brothers, they can better respond to or even prevent the ensuing damage.
    @KevinPetrieTech, http://www.attunity.com
    https://www.linkedin.com/pulse/three-reasons-data-scientists-might-prevent-next-market-kevin-petrie?trk=mp-reader-card

    1. Thanks, Kevin. Stock markets themselves reflect sentiment, but the track record of predicting markets hasn’t been impressive. In addition, some studies have looked into the relation between markets and, say, Twitter sentiment. They found no particular success and only a very short prediction horizon (minutes, hours).

