Skip to main content

How many subjects in an economic experiment?

How many subjects should there be in an economic experiment? One answer to that question would be to draw on power rules for statistical significance. In short, you need enough subjects to be able to reasonably reject the null hypothesis you are testing. This approach, though, has never really been standard in experimental economics. There are two basic reasons for this - practical and theoretical. 

From a practical point of view the power rules may end up suggesting you need a lot of subjects. Suppose, for instance, you want to test cooperation within groups of 5 people. Then the unit of observation is the group. So, you need 5 subjects for 1 data point. Let's suppose that you determine you need 30 observations for sufficient power (which is a relatively low estimate). That is 30 x 5 = 150 subjects per treatment. If you want to compare 4 treatments that means 600 subjects. This is a lot of money (at least $10,000) and also a lot of subjects to recruit to a lab. In simple terms, it is not going to happen.  

That my appear to be sloppy science but there is a valid get-out clause. Most of experimental economics is about testing a theoretical model. This allows for a Bayesian mindset in which you have a prior belief about the validity of the theory and the experimental data allows you to update that belief. The more subjects and observations you have the more opportunity to update your beliefs. But even a small number of subjects is useful in updating your beliefs. Indeed, some of the classic papers in experimental and behavioral economics have remarkably few subjects. For instance, the famous Tversky and Kahneman (1992) paper on prospect theory had only 25 subjects. That did not stop the paper becoming a classic.

Personally I am a fan of the Bayesian mindset. This mindset doesn't, though, fit comfortably with how economic research is typically judged. What we should be doing is focusing on a body of work in which we have an accumulation of evidence, over time, for or against a particular theory. In practice research is all too often judged at the level of a single paper. That incentivizes the push towards low p values and an over-claiming of the significance of a specific experiment. 

Which brings us on to the replication crisis in economics and other disciplines. A knee-jerk reaction to the crisis is to say we need ever-bigger sample sizes. But, that kind of misses the point. A particular experiment is only one data point because it is run with a specific subject pool using a specific protocol. Adding more subjects does not solve that. Instead we need replication with different subject pools under different protocols - the slow accumulation of knowledge. And we need to carefully document research protocols.

My anecdotal impression is that journal editors and referees are upping the ante on how many subjects it takes to get an experiment published (without moving things forward much in terms of documenting protocols). To put that theory to a not-at-all scientific test I have compared the papers that appeared in the journal Experimental Economics in its first year (1998) and most recent edition (March 2018). Let me emphasize that the numbers here are rough-and-ready and may well have several inaccuracies. If anyone wants to do a more scientific comparison I would be very keen to see it. 

Anyway, what do we find? In 1998 the average number of subjects was 187, which includes the study of Cubbit, Starmer and Sugden where half the population of Norwich seemingly took part. In 2018 the average is 383. So, we see an increase. Indeed, only the studies of Cubbit et al. and Isaac and Walker are above the minimum in 2018. The number of observations per treatment are also notably higher in 2018 at 46 compared to 1998 when it was 16. Again, those numbers are almost certainly wrong (for instance the number of independent observations in Kirchler and Palan is open to interpretation). The direction of travel, though, seems clear enough. (It is also noticeable that around half of the papers in 1998 were survey papers or papers reinterpreting old data sets. Not in 2018.)  


At face value we should surely welcome an increase in the number of observations? Yes, but only if it does not come at the expense of other things. First we need to still encourage replication and the accumulation of knowledge. Experiments with a small number of subjects can still be useful. And, we also do not want to create barriers to entry. At the top labs running an experiment is relatively simple - the money, subject pool, programmers, lab assistants, expertise etc. are there and waiting. For others it is not so simple. The more constraints we impose for an experiment to count as 'well-run' the more experimental economics may potentially become 'controlled' by the big labs. If nothing else, that poses a potential problem in terms of variation in subject pool. Big is, therefore, not necessarily better. 

Comments

Popular posts from this blog

Revealed preference, WARP, SARP and GARP

The basic idea behind revealed preference is incredibly simple: we try to infer something useful about a person's preferences by observing the choices they make. The topic, however, confuses many a student and academic alike, particularly when we get on to WARP, SARP and GARP. So, let us see if we can make some sense of it all.           In trying to explain revealed preference I want to draw on a  study  by James Andreoni and John Miller published in Econometrica . They look at people's willingness to share money with another person. Specifically subjects were given questions like:  Q1. Divide 60 tokens: Hold _____ at $1 each and Pass _____ at $1 each.  In this case there were 60 tokens to split and each token was worth $1. So, for example, if they held 40 tokens and passed 20 then they would get $40 and the other person $20. Consider another question: Q2. D...

Nash bargaining solution

Following the tragic death of John Nash in May I thought it would be good to explain some of his main contributions to game theory. Where better to start than the Nash bargaining solution. This is surely one of the most beautiful results in game theory and was completely unprecedented. All the more remarkable that Nash came up with the idea at the start of his graduate studies!          The Nash solution is a 'solution' to a two-person bargaining problem . To illustrate, suppose we have Adam and Beth bargaining over how to split some surplus. If they fail to reach agreement they get payoffs €a and €b respectively. The pair (a, b) is called the disagreement point . If they agree then they can achieve any pair of payoffs within some set F of feasible payoff points . I'll give some examples later. For the problem to be interesting we need there to be some point (A, B) in F such that A > a and B > b. In...

Prisoners dilemma or stag hunt

Over Christmas I had chance to read The Stag Hunt and the Evolution of Social Structure by Brian Skyrms. A nice read, very interesting and thought provoking. There’s a couple of things in the book that prompt further discussion. The one I want to focus on in this post is the distinction between the stag hunt game and the prisoners dilemma game.    To be sure what we are talking about, here is a specific version of both type of game. Adam and Eve independently need to decide whether to cooperate or defect. The payoff matrix details their payoff for any combination of choices, where the first number is the payoff of Adam and the second number the payoff of Eve. For example, in the Prisoners Dilemma, if Adam cooperates and Eve defects then Adam gets 65 and Eve gets 165. Prisoners Dilemma Eve Cooperate Defect Adam Cooperate 140, 140 65, 165 Defect 165,...