Monday, 24 February 2014

Curling and the minimax theorem

Every four years, as the Winter Olympics hits town, a sizable proportion of the British population falls in love with the sport of curling. And that offers the chance to look afresh at game theory's 'first big result' - the Minimax Theorem. 
       Curling is often called 'chess on ice'. But that analogy only goes so far because chess is a game of complete information and curling is not: if a chess player intends to move a bishop to E4 then we can be pretty sure he will move it to E4. He is not going to mistakenly move it to D3, and a gust of wind is not going to move it to F5. Curling, by contrast, involves both skill and luck. Skill is required to put the stone where it was intended. And luck is needed because debris on the ice can deflect a stone, and so on. So, while chess is a pure game of strategy, curling is a game of strategy, skill and luck. 
          The fact that chess is a pure game of strategy makes it relatively easy to analyze. It is no surprise, therefore, that chess has played an integral role in the development of game theory. But the fact that chess is a pure game of strategy also makes it a relatively boring game to watch! The Minimax Theorem helps 'formalize' this latter point. One element of the Theorem says that in a two-player zero-sum game (of which chess is an example) every Nash equilibrium yields the same payoffs to the players. In principle, this means that if both players behave optimally the outcome - white wins, a draw, or black wins - is predictable before the players sit down to begin their game. And that does not make for much excitement! 
         To see this in practice consider a simple game like tic-tac-toe (or noughts and crosses). It does not take much time to realize that the outcome if both players behave optimally is a draw. And once you know that, there is not much fun in playing. With chess things are not quite so simple, because the game is complex enough that we do not know the optimal strategy. That's why grandmasters still compete. Grandmasters will, however, concede a game well before the game is technically over because the outcome is clear. And this level of predictability does not lend itself to much excitement for people watching. 
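For anyone who would rather check the tic-tac-toe claim than take my word for it, here is a minimal sketch of a brute-force minimax search in Python. The board encoding (a nine-character string) and the function names `winner` and `value` are my own choices for illustration, not anything standard. The search simply assumes X always picks the move that is best for X and O always picks the move that is worst for X.

```python
from functools import lru_cache

# The eight winning lines on a 3x3 board, with squares indexed 0..8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board, player):
    """Payoff to X (+1 win, 0 draw, -1 loss) when `player` moves next
    and both sides play optimally from here on."""
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    if " " not in board:
        return 0  # board is full with no winner: a draw
    other = "O" if player == "X" else "X"
    outcomes = [value(board[:i] + player + board[i + 1:], other)
                for i, square in enumerate(board) if square == " "]
    # X maximizes its own payoff; O minimizes X's payoff (zero-sum).
    return max(outcomes) if player == "X" else min(outcomes)


print(value(" " * 9, "X"))  # 0, i.e. a draw under optimal play
```

The same search would, in principle, settle chess too. The only obstacle is that the game tree is far too large to enumerate, which is exactly why the optimal strategy remains unknown.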
       The skill and luck involved in curling make the outcome much less clear. That can generate more excitement. Unpredictability can also, however, lead to a more fundamental difference than that generated by pure random chance. That's because of the way it changes the optimal strategy. Let me explain: At first sight the 'optimal strategy' in a game will typically be a 'boring' one. In curling, for instance, once a team is ahead in the match they can play a 'boring' strategy of keeping the house clean in order to maintain their advantage. In a game of complete information, like chess or tic-tac-toe, there is no point in doing anything other than the boring strategy because you can be sure it will succeed. In a game of incomplete information, however, the boring strategy is risky. It is risky because it might not work, and losing with a boring strategy does not go down well.
        To illustrate the point let me contrast two curling matches from the Winter Olympics. In the men's semi-final between Britain and Sweden, Britain had the chance to employ a boring strategy and chose not to do so. It nearly cost them, but they hung on and are currently the toast of Britain. In the women's final between Canada and Sweden, Canada did opt for a boring strategy. As soon as they did so the commentators became critical and many were hoping it would fail. It did not fail, but the point is still made. If Canada had lost that game playing such a strategy they would have been roundly criticized by everyone. The possibility of such an outcome may mean the boring strategy was not optimal. 
        My conjecture, therefore, is that in games where skill and luck play a role, players will have a tendency to shun 'boring' strategies. If a boring strategy is no guarantee of success then it is just too risky. This makes for a more exciting game. And we don't have to stick with curling to see this in practice. Football and cricket teams that lose with a 'boring' strategy, for example, never get much praise.

Monday, 3 February 2014

Testing kids: Are tests for four year old children a good idea?

The last few decades have seen a huge rise in testing and performance monitoring within the English education system. The latest installment is a call for testing of four-year-olds when they enter the school system. The objectives of such policies are fine enough - this one, for example, will supposedly allow teaching to be more tailored to students' needs. There are, however, two big problems with testing in schools. In short, these are that: (i) Tests are often poor measures of what is being assessed. (ii) Tests change incentives. Let me elaborate on each of these problems in turn.
       How can you measure ability? Exams and tests provide a simple-to-administer measure. They provide, however, a very, very noisy measure - in other words they often give the wrong impression. And the earlier one does testing the noisier it is surely going to be because children naturally develop at different speeds. The fact, though, that tests can be wrong is not, in itself, a problem. The problem is that we are biased towards underestimating how wrong tests can be. In particular, the law of small numbers and confirmatory bias kick in and mislead. 
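To get a feel for how wrong a single noisy test can be, here is a rough back-of-the-envelope sketch in Python. Everything in it is an assumption made purely for illustration: that a mark equals true ability plus independent, normally distributed noise, and that the gap in ability (5 points) is small relative to the noise (standard deviation of 15 points). The function name `prob_misranked` is hypothetical, and none of the numbers come from real test data.

```python
from math import sqrt
from statistics import NormalDist


def prob_misranked(ability_gap, noise_sd):
    """Chance that the more able of two children gets the lower mark
    on a single test.

    Illustrative assumption: each mark = true ability + independent
    normal noise with standard deviation `noise_sd`.
    """
    # The difference between two independent noisy marks is normal with
    # mean `ability_gap` and standard deviation noise_sd * sqrt(2).
    difference = NormalDist(mu=ability_gap, sigma=noise_sd * sqrt(2))
    # The children are misranked when that difference falls below zero.
    return difference.cdf(0)


print(round(prob_misranked(ability_gap=5, noise_sd=15), 2))  # roughly 0.41
```

Under these made-up numbers the test ranks the two children the wrong way round about four times in ten, which is not much better than tossing a coin.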
        The law of small numbers says that we tend to infer a lot from a little. So, if we see Sarah doing badly in a test and John doing well we infer too extreme a difference in their relative ability. Confirmatory bias then means that we tend to see events as confirming our initial beliefs. Basically, Sarah would get a lower mark than John, for subsequent work, even if they write exactly the same thing, because John gets the benefit of the doubt when Sarah does not. As a lecturer I have learnt over the years the power of the law of small numbers and confirmatory bias - and it can be very powerful indeed. That is why I prefer anonymous marking and always try to mark work without reading who wrote it. ('Good' students, notice, have an incentive to make sure I do see their name!)
        Given the power of the law of small numbers and confirmatory bias we should avoid noisy measures of ability. That, for me, suggests we should avoid tests that are not absolutely necessary. Or, at least, if we are going to have tests we should view them more as formative and part of the learning process, rather than important measures of ability. Otherwise, we are in danger of children being labelled into self-fulfilling prophecies of success and failure as a result of some meaningless test.
        All of this is compounded by the fact that tests change incentives. The more importance we place on tests the more incentive the child, parent and teacher have to get a 'good' mark. This is obvious - but all too often overlooked. One only has to go to the average school and listen to the parents talking at pick up time to realize how competitive parents can be. Parents, understandably, want their child to be best. And that clearly means they will happily coach their child for a test. Indeed, given the law of small numbers and confirmatory bias this is exactly what a parent should be doing! This is one reason that tests are highly noisy measures of ability.
        To criticize testing is easy enough. But are there alternatives? Measuring ability and performance is crucial. The current trend, however, is towards tests and performance measures that give 'simple numbers' that can easily be put in charts, league tables and the like. Such convenience is misleading because ability and performance are not easily measured and compared. More nuanced and rounded measures are, therefore, to be preferred. Teachers clearly do form an overall picture of a student. The schools inspectorate Ofsted forms an overall assessment of a school. These kinds of measures have far more chance of being closer to the truth.