Mean Madness

I played basketball in high school, and I was terrible at it. But my company hosted a March Madness pool, and I love math. So I started thinking about my bracket…

Imagine that there are four teams in a tournament: 1, 2, 3, and 4. If you were able to enter the gambling contest as many times you want, just select all possible permutations of four items:
1234 1243 1423 4123
1324 1342 1432 4132
2134 2143 2413 4213
2314 2341 2431 4231
3124 3142 3412 4313
3214 3241 3421 4321

You would be sure to win! But what if each entry costs you a hundred dollars? Then, you may want to enter only a few times. But which permutations are the best?

In the example in my drawing, four teams have a percentage change of winning, as well as a standard deviation (a measure of the uncertainty of that average “mean” possibility). The “best” permutation to choose them would be to pick the order of probability highest first:
First guess: 1234

But which permutation should you pick next? My theory is to expand a range round the mean, like a bubble. Start with a fraction of the standard deviation (+/- σ/n for large n) and make it bigger and bigger around each mean until two ranges intersect (i.e. decrease n). Then, flip those two teams and add them to the pool of guesses. So, look in the current pool of guesses and flip every “23” to “32” since the bubble around 2 would hit the bubble around 3 very quickly.
Second guess: 1324

Now, note that the standard distribution for team 3 is twice that of team 2 (20 vs. 10, in fact). So, the bubble around the mean for team 3 will actually hit team 1 before the bubble around 2 hits team 1, believe it or not. I am suggesting that 3124 is a better guess than 2134. Swap teams 1 and 3, and also swap them in any teams already in the pool. (13 -> 31)
Third guess: 3124

So, we’ve swapped 23->32 and 13->31. But my idea is that when you finally swap teams 1 and 2, you need to look at every pick in the pool and flip those, too. Start at the top of the pool list and look for any (12) combination and turn that into (21). So, 1234 becomes 2134, and 3124 becomes 3214.
Fourth and fifth guesses: 2134 and 3214, in that order

Note that the last guess of the top three teams, 2314, is the least likely, in my opinion. For my algorithm, every time you find two teams to swap, you need to integrate through the pool over and over again until it “settles down”.

So, take the first swap we found (23 which turns into 32) and look for any (23) guess in our existing pool. There aren’t any, so look for the next swap (13->31). There is one guess in the pool 2134 which will turns into 2314 which is the sixth guess we should pick.
Sixth guess: 2314

The “bubble” around team four will eventually expand to hit teams 3, 2, and 1 in that order. So the next swaps are (34->43, 24->42, and 14->41).
Final guesses: 1243, 2143, 1342, 3142, 3241, and 2341 (in that order)

However, I also need to have correct means and standard deviations, which would involve getting hundreds of brackets and averaging them. Finally, the probabilities of a given team winning March Madness is not a normal distribution. You can’t have a negative probability of winning. Or over 100%. Maybe it’s a Poisson distribution? I don’t know… it’s a big flaw in my big idea.

In any case, it’s fun to think about how I would pick the order of all 64 teams, and how I could model “long shots” as increased std devs. To add another twist, I wonder if I could figure out the probability of each guess winning, so I could know when to stop. If the prize to guess all 64 teams correct was a billion dollars and I had to pay one dollar for each guess… how many guesses should I make before the expected return is less than my sunk cost?

(My proposed algorithm)

Find the value x where:

m1 + (σ1/x) = m2 + (σ2/x)

xm1 + σ1 = xm2 – σ2

x = ((σ1+σ2)/(m2-m1))

The two teams that are very close and have std devs that will cause them to swap quickly will have a large “x” value

Calculate X for all the pairs:

Pairs [1,2] [1,3] [1,4] … [2,3]
X 100 80 70 … 90

Then, sort this array in descending order:

Pairs [1,2] [6,7] [2,3] [1,3] [1,4] …
X 100 95 90 80 70 …

Start with the initial ranking and call it the solution pool:
[1, 2, 3, 4, 5, 6, 7]

Loop through the solution pool, do swaps if that pair is found, and then:

New solution pool:
[1, 2, 3, 4, 5, 6, 7]
[2, 1, 3, 4, 5, 6, 7]

The do this again for all the pairs.
Next pair = [6,7]

[1, 2, 3, 4, 5, 6, 7]
[2, 1, 3, 4, 5, 6, 7]
[1, 2, 3, 4, 5, 7, 6]
[2, 1, 3, 4, 5, 7, 6]