If a fair bet on a horse or football team is paying $4.00, that implies a
probability of winning of 1/4. That’s because for every 4 such bets you take at
$1.00 each, you’ll expect to win one. So the general principle in an unbiased
bet is that the probability of the outcome is 1 / the payout. Alternatively,
the payout should be 1 / the probability of the outcome.

But bookmakers always pay less than this. That’s how they make a profit: by biasing
the wager in their favour. Equivalently, the probabilities implied by
bookmakers’ payouts always add up to more than 1.

The simplest example is the points start bet between two football teams. It’s
always set such that the implied chances of either outcome are 50/50. But the
payout is somewhere in the range $1.90 to $1.92 per $1 bet, depending on the
bookmaker. In such cases, even though the implied probabilities (1/1.9 each, say) add to more
than 1, it is clear from the equal payoffs that the true implied probabilities
are 50% for each team. Note that points starts are always halves, e.g. +3½, so
there is no third possibility of a draw.

An asymmetric example is a bet on the winner of a football game where one team is
expected to win. They might be paying $1.50, their opposition $2.50, with $15
for a draw. The implied probabilities here are 1/1.5 = 2/3, 1/2.5 = 2/5 and
1/15. Notice that they add to 17/15. To get the true probabilities implied by
the payoffs, we could naively just multiply by 15/17, so the actual probability of a win
for the favourite is 10/17, their opposition 6/17 and a draw 1/17.

Another example might be a horse race in which the runners are paying $3, $4, $6, $8,
$8, $12, $24. If you add the implied probabilities, you’ll get 9/8. So, we could
just multiply them all by 8/9 to get the actual probabilities implied by the
bookie’s odds: 8/27, 2/9, 4/27, 1/9, 1/9, 2/27, 1/27.

Seems easy. As in the football game, there’s no overwhelming favourite, so it is
plausible that the true probabilities are all in the ratio implied by the
bookmaker’s payouts.
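
The simple rescaling used in the football and horse race examples above can be sketched in Python. This is only an illustration: the function name `normalise` is my own, and exact fractions are used so the results can be compared directly with the figures quoted.

```python
from fractions import Fraction

def normalise(payouts):
    """Return the book total and the implied probabilities rescaled to sum to 1."""
    # Fraction(str(p)) keeps decimal payouts such as 1.5 exact.
    implied = [1 / Fraction(str(p)) for p in payouts]
    total = sum(implied)  # exceeds 1 whenever the bookmaker has a margin
    return total, [q / total for q in implied]

# Football game: favourite, opposition, draw
football_total, football_probs = normalise([1.5, 2.5, 15])

# Seven-horse race
race_total, race_probs = normalise([3, 4, 6, 8, 8, 12, 24])

print(football_total, [str(p) for p in football_probs])
print(race_total, [str(p) for p in race_probs])
```

The football book totals 17/15 and rescales to 10/17, 6/17 and 1/17; the horse race totals 9/8 and rescales to the seven fractions listed above.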

Now consider the case of a two horse race with an overwhelming favourite, for
example betting on the election result in a safe seat. Let’s suppose the
favourite is paying $1.01 and the challenger $20. The implied probabilities are
100/101 and 1/20. If you look at betting on the federal election on Sportsbet or TAB, you’ll see the challenger is
typically paying even less than this.

Let’s try our previous method. The implied probabilities add to 2101/2020, so
multiply each by 2020/2101 to get 0.9519 and 0.0481. By this method, the outsider’s true
probability is roughly the 5% implied by the $20 payout, but the favourite’s
true probability is estimated at just over 95%, despite the odds implying a
chance of winning of approximately 99%.

Is this a reasonable outcome? No. The reason is that the $20 payout figure is
essentially made up. To protect
the bookmaker against a freak loss, this payout is well under the payout
implied by the true probability of the outsider winning. There is a big
difference between a 95% chance and a 99% chance. The odds of the favourite
winning really are much closer to the latter.

What this situation is really telling us is that, unlike in the football game or
horse race, the two payout values do not contain the same amount of
information. In a fair bet on a two horse race, the probability of one outcome
contains all the information about both, since the probabilities add to 1.
However, bookmakers’ payouts in a two horse race contain independent components
of information, since their implied probabilities add to more than 1. We need
to adjust the improper, payout-implied probabilities to reflect the fact that
the $1.01 payout contains more information as to the true probabilities than
the $20 payout, i.e. adjust the favourite’s probability by a little and the
outsider’s by relatively more.

One way to do this is to use the log likelihood function. It is the log of the
product of the probabilities of all the possible outcomes.

If the payout-implied, improper probabilities are q_1, q_2, …, then the
(improper) log likelihood function is

    L_imp = Σ_k log q_k

If we assume the log likelihood function of the true probabilities is a constant
times the above function:

    L = C · Σ_k log q_k

this implies the true probabilities are p_k = q_k^C, with the constraint

    Σ_k q_k^C = 1

We can solve this equation for C and thus determine the true probabilities,
assuming that larger implied probabilities contain more information.

Applying the method to our {$1.01, $20} race, we obtain C = 1.4234 and actual
probabilities of 0.986 and 0.014. Notice that the favourite’s probability is
still close to the bookmaker implied 100/101, but the outsider’s estimated
probability is now approximately 1/70.
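
The post does not say how the constraint is solved for C, so here is one minimal sketch using bisection. The function name `solve_exponent` is hypothetical, and the code assumes every payout is greater than $1 and that the implied probabilities add to at least 1, so that the root lies at or above C = 1.

```python
def solve_exponent(payouts, iterations=200):
    """Find C such that the sum of (1/payout)**C equals 1, by bisection."""
    q = [1.0 / p for p in payouts]  # improper, payout-implied probabilities

    def excess(c):
        # Positive while the adjusted probabilities still sum to more than 1.
        return sum(x ** c for x in q) - 1.0

    lo, hi = 1.0, 2.0
    while excess(hi) > 0:  # widen the bracket until it contains the root
        hi *= 2.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

two_horse = [1.01, 20]
c_two_horse = solve_exponent(two_horse)
probs_two_horse = [(1.0 / p) ** c_two_horse for p in two_horse]
print(round(c_two_horse, 4), [round(p, 3) for p in probs_two_horse])
```

Bisection is used because the sum decreases steadily as C grows, so the root is unique; the result should land close to the C = 1.4234 quoted above.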

Let’s take payouts for the perfidious Tony Windsor’s electorate of New England,
almost certain to be regained for the Nationals by his arch nemesis and inveterate
dill, Barnaby Joyce. They are $1.01 (Joyce), $13 (ALP), $41, $51, $81 for the rest. The
implied probabilities add to 1.1234, but simply dividing by this value
estimates a true probability of 88% of Joyce winning the seat. This is clearly
wrong: he is almost certain to win.

Using the log likelihood method, we obtain C = 1.6936 and fair payout values of
$1.017, $77, $540, $780, $1700, with the actual probabilities being the
reciprocals of these amounts.

Sportsbet has payouts of $1.001, $15, $26, $34 for New England
(the $26 being for any candidate but Joyce, the ALP and Palmer United).
Applying the log likelihood method to these gives fair payout values of
$1.0024, $600, $2195, $4135, much more realistic, given the bookie’s payout on
Joyce of 1 cent per $10 bet.
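
As a rough check on the two New England books, the fair payouts can be recomputed with a short Newton iteration on the constraint. This is a sketch under my own assumptions: the function name `fair_payouts` is hypothetical, Newton's method is my choice of solver rather than anything stated in the post, and every payout is assumed to be greater than $1.

```python
import math

def fair_payouts(payouts, iterations=50):
    """Solve sum(q_k**C) = 1 by Newton's method and return (C, payout_k**C)."""
    log_q = [-math.log(p) for p in payouts]  # log of each implied probability
    c = 1.0  # start from the unadjusted book
    for _ in range(iterations):
        terms = [math.exp(c * lq) for lq in log_q]  # q_k**C
        f = sum(terms) - 1.0  # excess probability at this C
        df = sum(t * lq for t, lq in zip(terms, log_q))  # derivative w.r.t. C
        c -= f / df
    # The fair payout is 1/p_k = q_k**(-C) = payout_k**C.
    return c, [math.exp(-c * lq) for lq in log_q]

c_first, fair_first = fair_payouts([1.01, 13, 41, 51, 81])
c_sportsbet, fair_sportsbet = fair_payouts([1.001, 15, 26, 34])

print(round(c_first, 4), [round(x, 3) for x in fair_first])
print(round(c_sportsbet, 4), [round(x, 4) for x in fair_sportsbet])
```

Starting from C = 1 the iteration approaches the root from below, because the sum is convex and decreasing in C. The outputs should be close to the rounded fair payouts quoted above for both books.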

The main point here is that dividing the improper, payout implied probabilities by
their sum to obtain the actual implied probabilities (as in our first few
examples) only works as a reasonable approximation when there is no clear separation of the outcomes into two
groups such that the winner is overwhelmingly likely to come from one group.
The case of a single, overwhelming favourite and a group of also rans is the
obvious example. In such cases, the log likelihood method works well. It
properly uses the information in the favourite payout and more realistically
estimates the total probability of the outsiders.

In cases where there are multiple favourites and the remainder are long outsiders,
the accuracy of the simple division method is less clear cut. Consider the
example of three equal favourites, each paying $3 and the remainder paying long
odds, say $25, $50 and $100.

In this example, the improper probabilities add to 1.07. Dividing by this, we
obtain actual probabilities of 100/321 for the favourites and 4/107, 2/107,
1/107 for the others. This gives fair payouts of $3.21, $3.21, $3.21, $26.75,
$53.50 and $107; not much different to the originals. At a glance, it’s not obvious
there is anything wrong, perhaps because the long odds payouts really are accurate
representations of the relative likelihoods of the outsiders versus the
favourites winning.

But perhaps they are not. Perhaps the chance of the winner coming from one of the 3
favourites is very high, say 99%. There is simply insufficient information in
the payouts to differentiate approximately accurate payouts for outsiders from
unrepresentative ones.

For cases with multiple favourites and the remainder long outsiders, the value of C
in the log likelihood method is not much greater than 1. Since p_k = q_k^C and each
q_k is less than 1, the estimates of the actual probabilities of the outside chances
are therefore not much less than 1 / the payout values.

In the 3 favourites case above, the log likelihood method gives C = 1.054 and
revised payouts of $3.18, $3.18, $3.18, $29.75, $61.75 and $128. With one
overwhelming favourite, the method tacitly assumes its odds are approximately
correct and decreases the probabilities of all other outcomes. With multiple
favourites, there is no evidence to support this and so the actual
probabilities are close to those obtained by the simple division method.

Such cases require a decision: do we believe the ratio of the long to
short odds is approximately correct, or does the bulk of the
excess probability in the improper prior come from understating the outsiders’
payoffs?

If the latter, we need to estimate the tail of the distribution separately, i.e.
choose a threshold payout beyond which the information content reduces rapidly
and adjust the payouts upward to decrease the size of the tail prior to
applying the log likelihood method. Such a procedure is systematic, but
necessarily subjective, i.e. not derivable from a priori principles.

Suppose we choose some threshold payout Q and multiply each payout by
max(1, payout/Q), so that payouts up to Q are unchanged and longer ones are
stretched. Note that this function is subjective and requires
calibration to beliefs about the true tail probabilities, or equivalently, the
true chance of the winner coming from the group of favourites. Other functions
will achieve qualitatively the same result.

For example, let Q = $10 in the 3 favourites case above. The favourites’ payouts are
unchanged. However, the others become $62.50, $250 and $1000. Applying the log
likelihood method then gives fair payouts of $3.06, $3.06, $3.06, $67, $275 and
$1130. These are commensurate with the belief that the winner will almost
certainly be one of the 3 favourites.
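
The two steps in this example, stretching the long payouts and then applying the log likelihood method, can be sketched as follows. The function names `stretch_tail` and `log_likelihood_payouts` are hypothetical, the bisection solver is my own choice, and the code assumes every payout is greater than $1 and that the book still sums to at least 1 after stretching.

```python
def stretch_tail(payouts, threshold):
    """Multiply each payout by max(1, payout / threshold)."""
    return [p * max(1.0, p / threshold) for p in payouts]

def log_likelihood_payouts(payouts, iterations=200):
    """Solve sum((1/payout)**C) = 1 by bisection and return (C, payout**C)."""
    q = [1.0 / p for p in payouts]
    lo, hi = 1.0, 2.0
    while sum(x ** hi for x in q) > 1.0:  # widen the bracket if needed
        hi *= 2.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if sum(x ** mid for x in q) > 1.0:
            lo = mid
        else:
            hi = mid
    c = (lo + hi) / 2.0
    return c, [p ** c for p in payouts]

original = [3, 3, 3, 25, 50, 100]
stretched = stretch_tail(original, threshold=10)
c_stretched, fair_stretched = log_likelihood_payouts(stretched)

print(stretched)
print(round(c_stretched, 4), [round(x, 2) for x in fair_stretched])
```

With Q = $10 the three $3 payouts pass through unchanged and the outsiders become $62.50, $250 and $1000, after which the solver should return fair payouts close to the rounded figures in the text.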

Note that this payout threshold transformation can be applied prior to using the log
likelihood method in the case of a single favourite if there is a firm belief
the long odds have been understated. The effect is not so great, however. In
the case of the odds for the federal seat of New England,
firstly applying the threshold multiplication with Q = 10, the result after the
log likelihood method is $1.015, $71, $2250, $4340 and $17500. Interestingly,
the probability of the ALP candidate winning increases from 1/77 to 1/71. This
typically happens: the second shortest priced candidate comes in a little if
the transformation is made. This illustrates the importance of only applying
such subjective transformations if there is some external evidence to support
doing so.

Stay tuned … the reason I’ve given this topic so much thought is that I’m about to
apply it to sports betting odds on individual seats in the upcoming
federal election. This will allow a simulation of the overall election and an
estimation of the chances of various outcomes, including the parliamentary majority
gained by the winner.