Machine Learning in the Social Sciences: How Increased Processing Power Is Predicting Economies and Bringing Counterfactuals to Life

The social sciences are by nature research-driven fields; gaining new insights into human behavior requires figuring out how best to generalize measurable phenomena into numerical models.

Economists use this approach to understand and stabilize the mechanisms that drive markets, so that modern society remains a stable place in which to live. This is why many economists and data scientists are bringing machine learning techniques into their research. Computers can process far more data in a fraction of the time and find correlations across many more dimensions than humans can perceive, revealing patterns a person would be unlikely to spot. Machine learning plays a different role in each of two settings: finding correlations among economic indicators, an approach that does not require a hypothesis, and sociological experiments, in which the experimenters enter a trial with a definite hypothesis.

In economics, many studies use machine learning algorithms to predict macroeconomic conditions from early indicators in time-series datasets. These studies have great potential for predicting market trends. One such study, “Evaluating the Performance of a EuroDivisia Index Using Artificial Intelligence Techniques,” uses trends in a Divisia index to predict future economic health within the eurozone, treating the index as an indicator of region-wide economic health.

A Divisia index is a theoretical construct for creating an index number series from continuous-time data on the prices and quantities of goods exchanged. It is designed to incorporate quantity and price changes over time using subcomponents that are measured in different units, such as labor hours, equipment investment and materials purchases, and it summarizes them in a single time series that reflects the changes in quantities and/or prices. This is a popular way to compile data into a single index, with no fixed unit of its own, that is comparable across long time series.
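Because a Divisia index is defined in continuous time, it has to be approximated when data arrive in discrete periods. The sketch below assumes the commonly used Törnqvist approximation, chosen here for illustration and not necessarily the construction used in the study discussed in this article. The function name and the sample figures are hypothetical.

```python
import math

def tornqvist_quantity_index(prices, quantities, base=100.0):
    """Discrete-time (Tornqvist) approximation of a Divisia quantity index.

    prices[t][i] and quantities[t][i] are the price and quantity of
    component i in period t. Returns one index value per period.
    """
    index = [base]
    for t in range(1, len(quantities)):
        # Spending on each component in the previous and current period.
        spend_prev = [p * q for p, q in zip(prices[t - 1], quantities[t - 1])]
        spend_curr = [p * q for p, q in zip(prices[t], quantities[t])]
        total_prev, total_curr = sum(spend_prev), sum(spend_curr)
        log_growth = 0.0
        for i in range(len(spend_curr)):
            # Weight each component's log quantity growth by its average
            # expenditure share, so differing units drop out of the result.
            avg_share = 0.5 * (spend_prev[i] / total_prev + spend_curr[i] / total_curr)
            log_growth += avg_share * math.log(quantities[t][i] / quantities[t - 1][i])
        index.append(index[-1] * math.exp(log_growth))
    return index

# Illustrative figures only: labor hours, equipment units, materials (kg).
example_prices = [[20.0, 500.0, 3.0], [21.0, 510.0, 3.2]]
example_quantities = [[1000.0, 10.0, 4000.0], [1050.0, 11.0, 3900.0]]
print(tornqvist_quantity_index(example_prices, example_quantities))
```

Only growth rates and expenditure shares enter the calculation, which is why components measured in hours, units and kilograms can be combined into one series.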

The study focused primarily on determining which of its two machine learning algorithms produced the better macroeconomic predictions for the eurozone. However, it also found evidence that the Divisia measure of money is a better predictive tool than a simple-sum measure.

The use of machine learning to study the treatment effect of policies in sociological experiments is different. As Cynthia Rudin states, typically “in machine learning, one starts with a dataset to build a hypothesis, whereas in sociology, one often starts with a hypothesis.” Sociological experiments are typically hypothesis-driven: they observe how outcomes in a population sample that received a treatment differ from those in a group that did not receive it (the control group).

Machine learning comes into play with the matrix completion technique, which addresses some problems in sociological experiments. For example, an individual can only be placed in one group (treatment or control). For some individuals we only observe the response to the treatment, and for the rest we only observe the outcome without treatment. By organizing a large amount of personal data (e.g. height, weight, sex, income, marital status) into a data matrix with a row for each individual, an algorithm can estimate the counterfactual outcome of an individual from the experiment.
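To make the idea of matrix completion concrete, here is a toy sketch. It assumes the data matrix is approximately rank one, so that every entry is the product of a per-person factor and a per-column factor, and fits those factors to the observed entries by alternating least squares. Real applications use richer low-rank models; the function name, the column layout and the numbers are all hypothetical.

```python
def complete_rank_one(matrix, iterations=500):
    """Fill None entries of `matrix`, assuming it is approximately rank one.

    Fits matrix[i][j] ~ u[i] * v[j] on the observed entries by alternating
    least squares, then predicts each missing entry as u[i] * v[j].
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    u = [1.0] * n_rows
    v = [1.0] * n_cols
    for _ in range(iterations):
        # Update each row factor using only that row's observed entries.
        for i in range(n_rows):
            seen = [j for j in range(n_cols) if matrix[i][j] is not None]
            den = sum(v[j] ** 2 for j in seen)
            if den > 0:
                u[i] = sum(matrix[i][j] * v[j] for j in seen) / den
        # Update each column factor using only that column's observed entries.
        for j in range(n_cols):
            seen = [i for i in range(n_rows) if matrix[i][j] is not None]
            den = sum(u[i] ** 2 for i in seen)
            if den > 0:
                v[j] = sum(matrix[i][j] * u[i] for i in seen) / den
    # Keep observed values as they are; fill only the gaps.
    return [[matrix[i][j] if matrix[i][j] is not None else u[i] * v[j]
             for j in range(n_cols)] for i in range(n_rows)]

# Hypothetical columns: [covariate score, outcome untreated, outcome treated].
# Each person has exactly one observed outcome; the other is the counterfactual.
observed = [
    [1.0, 2.0, None],
    [2.0, None, 6.0],
    [3.0, 6.0, None],
    [4.0, None, 12.0],
]
for row in complete_rank_one(observed):
    print(row)
```

The covariate column matters here: with only the two outcome columns, no row would link the treated and untreated outcomes, and the missing entries could not be pinned down.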

In other words, if person A receives treatment, their outcome is known. However, to calculate the true effect of the treatment on person A, one would need to know person A’s outcome had they not received treatment. The algorithm can find characteristically similar individuals in the control group to estimate person A’s control outcome, so that the treatment effect for person A can be estimated. That means we no longer have to rely only on the difference between the mean outcomes of the control group and the treatment group. Averaging the differences between each individual’s observed outcome and their estimated counterfactual outcome is a step towards addressing this foundational problem in social experiments.
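The matching idea described above can be sketched in a few lines. This is a minimal illustration that assumes one-nearest-neighbor matching on Euclidean distance between covariates, which is only one of several ways to define "characteristically similar". The function names and data are hypothetical, and in practice covariates would usually be rescaled so that no single one dominates the distance.

```python
import math

def nearest_outcome(covariates, pool):
    """Outcome of the unit in `pool` whose covariates are closest."""
    closest = min(pool, key=lambda unit: math.dist(covariates, unit[0]))
    return closest[1]

def matched_average_effect(treated, control):
    """Average of individual effects using matched counterfactual outcomes.

    Each unit is a (covariates, outcome) pair, with covariates a tuple of
    numbers of the same length for every unit.
    """
    effects = []
    for covariates, outcome in treated:
        # Observed treated outcome minus the estimated untreated outcome.
        effects.append(outcome - nearest_outcome(covariates, control))
    for covariates, outcome in control:
        # Estimated treated outcome minus the observed untreated outcome.
        effects.append(nearest_outcome(covariates, treated) - outcome)
    return sum(effects) / len(effects)

# Illustrative data: covariates are (age, income in thousands).
treated_units = [((30.0, 40.0), 7.5), ((45.0, 60.0), 9.0)]
control_units = [((31.0, 42.0), 6.0), ((44.0, 58.0), 8.0), ((60.0, 30.0), 5.0)]
print(matched_average_effect(treated_units, control_units))
```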

Note: I’d like to thank Emory QTM and Political Science Professor Pablo Montagnes for sharing insights from personal research experience to help with my understanding of how machine learning is being utilized in the social sciences.

Written by Andrew McArthur

Edited by Urvi Argrawal

Reference:

Binner, J.M., Gazely, A.M. & Kendall, G. Evaluating the performance of a EuroDivisia index using artificial intelligence techniques. Int. J. Autom. Comput. 5, 58–62 (2008). https://doi.org/10.1007/s11633-008-0058-3

In this study, one method uses a standard backpropagation neural network and the other uses an evolutionary approach, in which the network weights and the network architecture are evolved. Results indicated that backpropagation produced superior results. However, the evolving network still produced reasonable results, with the advantage that the experimental set-up is minimal.

Rudin, Cynthia. “Can Machine Learning Be Useful for Social Science?” The Cities Papers, The Social Science Research Council, 21 Sept. 2015, citiespapers.ssrc.org/can-machine-learning-be-useful-for-social-science/.
