CALCULATING THE ODDS OF PAI GOW POKER

by running thousands of simulations. Also what is the ‘house edge’?

Published

December 17, 2022

robot dealer by Dall-E 2

Background

Every time I play the tables at a casino, I always get the itch to go back to my hotel room and run simulations of the game on my laptop to calculate the ~odds of winning. I know most of these stats are already published but hey that’s no fun.

A buddy of mine introduced me to Pai Gow Poker on a recent trip and I dug the slow pace with most hands seemingly ending in pushes (break even) affording me plenty of time to sip on 🍹🍹

Let’s explore the odds of winning the main bet and the House Edge of Pai Gow Poker.

Basics

But first let’s talk about the methodology. How do we even go about calculating odds by running simulations in the first place?

Let’s start with something simpler, coin flips. Say you walk into a casino and there’s a coin flipping table where they payout $1 for a dollar bet on heads and you lose your bet on tails. What are our odds of winning?

Clearly the odds are 1/2 or .5 or 50% of winning or losing but why don’t we run a bunch of simulated coin flips to validate this? We’ll do 1000 coin flips 500 times for a total of half a million flips.

Set up the coin flip logic

import random
from collections import Counter

# Method to run the coin flip simulation 1000 times by default (adjustable)
def flip(runs=1000):
    # track the number of heads vs tails
    c = Counter()
    
    for run in range(0,runs):
        coin=('heads','tails')
        # randomly pick a coin side and then append it to the counter above
        c[random.choice(coin)] += 1
    
    return c

Run the 1000-flip simulation 500 times (half a million flips in total) and save the results

flips = []

for i in range(0,500):
    flips.append(flip())

Save the results in a pandas dataframe

import pandas as pd

flips_df = pd.DataFrame(flips)
flips_df

	tails	heads
0	499	501
1	520	480
2	530	470
3	457	543
4	510	490
...	...	...
495	525	475
496	505	495
497	529	471
498	496	504
499	519	481

500 rows × 2 columns

Redistribute (melt) the data into two columns

# combine 'heads' and 'tails' into a new column called 'side'
# This will make it easier to graph with Altair
# ref: https://altair-viz.github.io/user_guide/data.html#converting-between-long-form-and-wide-form-pandas

flips_melt = flips_df.melt(var_name='side', value_name='count')
flips_melt

	side	count
0	tails	499
1	tails	520
2	tails	530
3	tails	457
4	tails	510
...	...	...
995	heads	475
996	heads	495
997	heads	471
998	heads	504
999	heads	481

1000 rows × 2 columns

Analyze the data

So we have a list of 500 runs of 1000 coin flips comprised of the number of heads and tails in each run. Let’s graph it with a histogram to see what the results look like.

Set graphing defaults

# Set graphing defaults for Altair (larger fonts, fixed chart size)
import altair as alt

def my_theme(*args, **kwargs):
    config = {
        'config': {
            'header': {
                'titleFontSize': 20,
                'labelFontSize': 20
            },
            'view': {
                'continuousWidth': 400,
                'continuousHeight': 300
            },
            'axis': {
                'titleFontSize': 18,
                'labelFontSize': 18
            },
            "title": {
                "font": "Arial",
                "fontSize": 24,
                "anchor": "center",
            }
        }
    }
    return config
alt.themes.register('my-chart', my_theme)
alt.themes.enable('my-chart')

ThemeRegistry.enable('my-chart')

Graph the results

# We'll use Altair (altair-viz.github.io) to graph
import altair as alt

# Setup the base graph referencing our flips dataframe from above
base = alt.Chart(flips_melt)

# Create a histogram (mark_bar + binning) to show the count of result outcomes
hist = base.mark_bar(
    opacity=0.8,
    binSpacing=0
).encode(
    x=alt.X('count:Q', bin=alt.Bin(maxbins=40), axis=alt.Axis(title='Occurrences out of 1000 flips')),
    y=alt.Y('count()', axis=alt.Axis(title='Count'))
)

# Draw a vertical line on the graph showing the average # of outcomes (out of 1000 flips)
rule = base.mark_rule(
    color='red',
    size=2
).encode(
    x=alt.X('mean(count):Q', axis=alt.Axis(title=''))
)

# Break out the charts into heads and tails (via faceting)
alt.layer(hist, rule).facet(
    'side:N'
).properties(
    title='HISTOGRAM OF 1000 COIN FLIPS .. 500 TIMES'
).configure_header(
    title=None
)


Confirming our hunch

So as we can see above, across many rounds of 1000 coin flips, heads or tails comes up on average (red lines) about 500 times out of 1000, or 50%. This should not come as a surprise. We can confirm this by calculating the mean in the data frame as well:

flips_df.mean()

tails    500.028
heads    499.972
dtype: float64
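The spread in the histogram is predictable too. As a rough cross-check (not part of the original notebook), the number of heads in 1000 fair flips follows a binomial distribution, so its mean and standard deviation can be computed directly. The names `n` and `p` below just restate the experiment's parameters.

```python
import math

n = 1000   # flips per run
p = 0.5    # probability of heads for a fair coin

mean = n * p                          # expected heads per run
std_dev = math.sqrt(n * p * (1 - p))  # binomial standard deviation

# Averaging 500 runs shrinks the uncertainty by sqrt(500)
std_err = std_dev / math.sqrt(500)

print(f"expected heads per run: {mean:.0f}")
print(f"standard deviation:     {std_dev:.1f}")
print(f"standard error of the mean over 500 runs: {std_err:.2f}")
```

A run-to-run standard deviation of about 16 fits the table above, where counts like 457 and 543 are uncommon but possible, and the observed means of 500.028 and 499.972 sit well inside one standard error of 500.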

Factoring in $$$ and the House Edge

What if we incorporated betting into the above scenario? Say this was a casino game and they paid you 1:1. Bet a dollar, win or lose a dollar. How good or bad of a bet would this be? Well, casinos make money by taking advantage of something called the HOUSE EDGE. It puts the odds in their favor and over a long enough timeline means the player will always lose more than they’ll win.

There’s an equation for the House Edge actually. It computes the player’s average result per bet, so a negative value means the house has the edge:

$\text{House Edge} = (\text{Payout of a Win} * \text{Odds of Winning}) + (\text{Amount of a Loss} * \text{Odds of Losing})$

So in our coin flip scenario, it would be:

$\text{House Edge} = (\text{\$1.00} * \frac{500}{1000}) + (\text{-\$1.00} * \frac{500}{1000})$

$\text{House Edge} = (\text{\$1.00} * \frac{1}{2}) + (\text{-\$1.00} * \frac{1}{2})$

$\text{House Edge} = (\text{\$.50}) + (\text{-\$.50})$

$\text{House Edge} = 0$

Normally measured in percent, a regular coin flip would give the casino a house edge of 0%, which is why it ain’t a Vegas game.
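Here is a minimal sketch of that equation as a function. The name `expected_value` and the rigged 49% coin are made up for illustration; only the fair coin comes from the scenario above.

```python
def expected_value(win_payout, p_win, loss_amount, p_lose):
    """Player's average result per bet; a negative value favors the house."""
    return win_payout * p_win + loss_amount * p_lose

# The fair coin from the example above
fair_coin = expected_value(1.00, 0.5, -1.00, 0.5)

# Hypothetical rigged coin that lands heads only 49% of the time
rigged_coin = expected_value(1.00, 0.49, -1.00, 0.51)

print(f"fair coin:   {fair_coin:+.2%}")
print(f"rigged coin: {rigged_coin:+.2%}")
```

The fair coin comes out at zero, while the rigged coin costs the player about 2 cents per dollar bet, which would be a 2% house edge.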

Applying the above to Pai Gow

Ok, so now we have a methodology for calculating odds and betting so let’s see how we can apply it to Pai Gow. But first, what is (Casino) Pai Gow Poker?

(paraphrased from Wikipedia)
Pai Gow Poker is based on the traditional Chinese domino game Pai Gow. A banker (usually the dealer) and 6 players are dealt 7 cards each. Each hand must be split into a five-card and a two-card poker hand, and both must beat the banker’s for the main bet to pay out. The five-card hand must outrank the two-card hand; otherwise it is considered a foul. The game uses a standard 52-card deck plus 1 joker, which can be used to complete a five-card straight or flush and otherwise counts as an Ace.

For more details you can watch this video.

from IPython.lib.display import YouTubeVideo
YouTubeVideo('ujnUctzNrjc')

Running the Pai Gow simulations

So in order to run Pai Gow simulations, you need to be able to emulate, among other things:

- dealing the cards out to 6 players and the banker
- splitting the cards in an optimal manner to produce strong 2-card (top) and 5-card (bottom) hands
- determining whether the player or banker won, or if it’s a push

Lucky for me someone on the internet already tackled this. I had to modify some code to get this fully working but basically was able to get it to accomplish all of the above. You should check out the source though if you’re curious about the inner workings.
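To make the win/lose/push decision concrete, here is a minimal sketch of the main-bet rule described above. It is not the library's code: `main_bet_outcome` is a hypothetical helper, hand strengths are plain made-up numbers, and it assumes a tied hand goes to the dealer. It also ignores any variant-specific rules.

```python
def main_bet_outcome(player, dealer):
    """Compare two split hands given as dicts of numeric strengths.

    Assumption: a tie on either hand counts for the dealer.
    """
    hands_won = sum(player[part] > dealer[part] for part in ("high", "low"))
    if hands_won == 2:
        return "Player Wins"   # player beat both of the banker's hands
    if hands_won == 0:
        return "Dealer Wins"   # player beat neither hand
    return "Push"              # one hand each, the bet is returned

# Hypothetical strengths, higher is better
dealer = {"high": 13.0, "low": 10.0}
print(main_bet_outcome({"high": 20.0, "low": 14.0}, dealer))
print(main_bet_outcome({"high": 20.0, "low": 9.0}, dealer))
print(main_bet_outcome({"high": 12.0, "low": 10.0}, dealer))
```

Needing to win both hands is what makes pushes so common: splitting one hand each is a very frequent result.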

Game Demo 🃝

Let’s see how a game actually works using the library.

The initial deal

# Import the library that handles dealing, core game logic, poker hand optimization & selection
import common

g = common.deal_game()
g

[['10s', '09h', '03s', '13s', '06h', '07c', '02d'],
 ['06c', '04s', '09s', '14h', '06s', '07h', '11d'],
 ['08c', '13d', '02h', '07d', '10c', '02c', '12d'],
 ['11c', '08d', '05s', '06d', '03d', '14s', '08h'],
 ['02s', '04d', '13h', '14c', '07s', '11s', '04h'],
 ['03c', '03h', '12h', '08s', '05d', '11h', '05c'],
 ['JKR', '09d', '05h', '14d', '10d', '10h', '09c']]

Cards are represented by a string containing:

- a two-digit rank: 02-10 for the number cards, then 11-14 for Jack through Ace
- a suit letter (d: diamond, c: club, h: heart, s: spade)

The joker is the string 'JKR'.

The first row is the banker’s (dealer’s) cards and the subsequent rows represent the 6 players at the table.
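As an illustration of that card format, here is a small sketch that decodes the strings. `parse_card` and `pretty` are hypothetical helpers written for this post, not functions from the library, whose own formatting may differ.

```python
SUIT_SYMBOLS = {"d": "♦", "c": "♣", "h": "♥", "s": "♠"}
FACE_NAMES = {11: "J", 12: "Q", 13: "K", 14: "A"}

def parse_card(card):
    """Split a card string like '13s' into (rank, suit); the joker is (None, None)."""
    if card == "JKR":
        return None, None
    return int(card[:2]), card[2]

def pretty(card):
    """Render a card string with a face letter and a suit symbol."""
    rank, suit = parse_card(card)
    if rank is None:
        return "JKR"
    return f"{FACE_NAMES.get(rank, rank)}{SUIT_SYMBOLS[suit]}"

print([pretty(c) for c in ["10s", "09h", "13s", "14d", "JKR"]])
```

Keeping the rank as a zero-padded two-digit number means the raw strings sort correctly by rank even before they are parsed.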

Sorting the hands

# Use a list comprehension to sort each hand
g = [ common.sort_hand(hand) for hand in g ]
# Format the output so it's easier to read
[ common.format_hand(hand) for hand in g ]

[[' 2♦', ' 3♠', ' 6♥', ' 7♣', ' 9♥', '10♠', ' K♠'],
 [' 4♠', ' 6♣', ' 6♠', ' 7♥', ' 9♠', ' J♦', ' A♥'],
 [' 2♥', ' 2♣', ' 7♦', ' 8♣', '10♣', ' Q♦', ' K♦'],
 [' 3♦', ' 5♠', ' 6♦', ' 8♦', ' 8♥', ' J♣', ' A♠'],
 [' 2♠', ' 4♦', ' 4♥', ' 7♠', ' J♠', ' K♥', ' A♣'],
 [' 3♥', ' 3♣', ' 5♦', ' 5♣', ' 8♠', ' J♥', ' Q♥'],
 [' 5♥', ' 9♦', ' 9♣', '10♦', '10♥', ' A♦', 'JKR']]

How did the dealer do?

Split hands are represented by a dict which also contains metadata about specific ranks.

# Next two libs deal with splitting the 7 cards into 5 & 2. House_strat uses a pre-determined set of rules and the player methodology
# actually starts with the house and then just runs through every single possibility to see if there's anything else that might beat
# the banker's hand, since it's an open face game
from split_strategies import house_strat
from split_strategies import player_strat

# g[0] is the dealer (referencing above)
dealer = house_strat.house_strat(g[0])
dealer

{'high': {'hand': ['02d', '03s', '06h', '07c', '13s'],
  'rank': 'K High',
  'rank_points': 13.07060302,
  'has_joker': False,
  'high_card_order': ['02d', '03s', '06h', '07c', '13s'],
  'multiples': {},
  'multiples_keys': [],
  'straights': [],
  'flushes': [],
  'straight_flushes': [],
  'seven_card_straight_flush': []},
 'low': {'hand': ['09h', '10s'],
  'rank': '10 High',
  'rank_points': 10.09,
  'has_joker': False,
  'high_card_order': ['09h', '10s'],
  'multiples': {},
  'multiples_keys': [],
  'straights': [],
  'flushes': [],
  'straight_flushes': [],
  'seven_card_straight_flush': []}}

The dealer’s hand is pretty weak. Looks promising for the table.

King high on the high hand and 10 high on the low hand.

Ok, let’s see how everyone made out

Loop through all the hands and evaluate how they did using a Pandas dataframe

# Outcome determination (winner or push)
from game_variants import face_up

# Let's create a pandas dataframe to display the results
demo_game_df = pd.DataFrame(columns=['Seat', 'High hand', 'Low hand', 'Ranks', 'Outcome'])

# First append the dealer's hand and rank metadata
demo_game_df = pd.concat([
    demo_game_df,
    pd.DataFrame([[
        'Dealer',
        ' '.join(common.format_hand(dealer['high']['hand'])),
        ' '.join(common.format_hand(dealer['low']['hand'])),
        ' & '.join({ i:j['rank'] for (i,j) in dealer.items() }.values()),
        ''
    ]], columns=demo_game_df.columns)
],
    ignore_index=True
)

# Append the other players' hands, ranks, and outcomes
for seat, hand in enumerate(g[1:]):
    split = player_strat.player_strat(hand, house_strat.house_strat(g[0]))
    high_hand = ' '.join(common.format_hand(split['high']['hand']))
    low_hand = ' '.join(common.format_hand(split['low']['hand']))
    ranks = " & ".join({ i:j['rank'] for (i,j) in split.items() }.values())
    outcome = face_up.determine_winner(split, dealer)
    demo_game_df = pd.concat([
        demo_game_df,
        pd.DataFrame([[
            f'Player {seat+1}',
            high_hand,
            low_hand,
            ranks,
            outcome
        ]], columns=demo_game_df.columns)
    ],
        ignore_index=True
    )
            
# Styling the dataframe output

# Is it me or is it kind of ridiculous how much effort it
# takes to stylize a pandas dataframe? and how syntactically
# awkward it is? All this just to left align everything and
# increase font sizes

# ref:
# https://stackoverflow.com/questions/58801176/display-pandas-dataframe-with-larger-font-in-jupyter-notebook
# https://www.geeksforgeeks.org/align-columns-to-left-in-pandas-python/

left_aligned_df = demo_game_df.style.set_properties(
    **{'text-align': 'left'}
)

left_aligned_df = left_aligned_df.set_table_styles(
    [dict(
        selector = 'th',
        props=[('text-align', 'left'), ('font-size', '20px')]
    ),dict(
        selector="td", props=[('font-size', '18px')])
    ]
)

display(left_aligned_df)

	Seat	High hand	Low hand	Ranks	Outcome
0	Dealer	2♦ 3♠ 6♥ 7♣ K♠	9♥ 10♠	K High & 10 High
1	Player 1	4♠ 6♣ 6♠ 7♥ 9♠	J♦ A♥	Pair & A High	Player Wins
2	Player 2	2♥ 2♣ 7♦ 8♣ 10♣	Q♦ K♦	Pair & K High	Player Wins
3	Player 3	3♦ 5♠ 6♦ 8♦ 8♥	J♣ A♠	Pair & A High	Player Wins
4	Player 4	2♠ 4♦ 4♥ 7♠ J♠	K♥ A♣	Pair & A High	Player Wins
5	Player 5	3♥ 3♣ 5♦ 5♣ 8♠	J♥ Q♥	Two Pair & Q High	Player Wins
6	Player 6	5♥ 9♦ 9♣ 10♦ 10♥	A♦ JKR	Two Pair & Pair	Player Wins

Running the simulation to derive the odds

Ok, let’s run a bunch of games and evaluate the outcomes to see if we can figure out the ~probability of winning the main bet in Pai Gow Poker and find the house edge as well.

Import the modules and set up the simulation function

from collections import Counter

# By default run 1000 games
def run_sim(games=1000):
    results = Counter()

    for run in range(games):
        game = common.deal_game()
        game = [common.sort_hand(hand) for hand in game]

        # The dealer's split is the same for every player, so compute it once per game
        dealer = house_strat.house_strat(game[0])

        for hand in game[1:]:
            # Split each player's hand against the dealer's, then tally the outcome
            player = player_strat.player_strat(hand, dealer)
            results[face_up.determine_winner(player, dealer)] += 1
    return results

Run the 1000 game simulation .. 1000 times (total 1M)

Parallelize the work and save the results to pool_outputs

# Refer to https://stackoverflow.com/questions/5364050/reloading-submodules-in-ipython
# on how to properly parallel process in Jupyter

from multiprocess import Pool
# TQDM is an awesome progress bar module
from tqdm import tqdm
import run_sim

runs = [1000]*1000

# Wrap the parallel processing with tqdm to track progress
with Pool(12) as p:
    pool_outputs = list(
        tqdm(
            p.imap(run_sim.run_sim, runs),
            total=len(runs)
        )
    )

100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [09:00<00:00,  1.85it/s]

Save the results to a pandas dataframe for further analysis and visualization

The result is a tally of each outcome (Push, Dealer win, Player win, Dealer Ace High (push)). Each row adds up to 6000 because we simulated 1000 games with 6 players each (vs the banker in the 7th seat).

import pandas as pd
outcomes = pd.DataFrame(pool_outputs)
outcomes

	Push	Dealer Wins	Player Wins	Push - A High Pai Gow
0	2442	1571	1519	468
1	2450	1605	1393	552
2	2385	1543	1472	600
3	2272	1699	1405	624
4	2290	1521	1589	600
...	...	...	...	...
995	2325	1577	1510	588
996	2355	1528	1487	630
997	2305	1651	1522	522
998	2295	1621	1568	516
999	2354	1682	1424	540

1000 rows × 4 columns
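A quick sanity check on the table above, sketched with plain Counters so it runs without the simulation: the first three rows are retyped by hand, each one should total 6000, and adding Counters together gives the combined tally.

```python
from collections import Counter

# First three rows of the results table, retyped by hand
sample_runs = [
    Counter({"Push": 2442, "Dealer Wins": 1571, "Player Wins": 1519, "Push - A High Pai Gow": 468}),
    Counter({"Push": 2450, "Dealer Wins": 1605, "Player Wins": 1393, "Push - A High Pai Gow": 552}),
    Counter({"Push": 2385, "Dealer Wins": 1543, "Player Wins": 1472, "Push - A High Pai Gow": 600}),
]

# Every run covers 1000 games x 6 players
row_totals = [sum(run.values()) for run in sample_runs]
print(row_totals)

# Counters add element-wise, so summing them merges the runs
total = sum(sample_runs, Counter())
print(total)
```

The same `sum(..., Counter())` trick would work on the full `pool_outputs` list if you wanted one grand tally instead of a dataframe.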

Visualize the Pai Gow simulation results

We’ll look at the counts of outcomes using histograms with vertical lines highlighting the averages

base = alt.Chart(outcomes).transform_fold(
    ['Push','Dealer Wins','Player Wins','Push - A High Pai Gow'],
    as_=['Outcome','Occurrences out of 6000']
).transform_bin(
    field='Occurrences out of 6000',
    as_='bin_meas',
    bin=alt.Bin(maxbins=200)
).encode(
    color='Outcome:N'
).properties(
    title='HISTOGRAM OF 1000 PAI GOW GAMES, 1000 TIMES'
)

hist = base.mark_bar(
    opacity=0.3,
    binSpacing=0
).encode(
    alt.X('bin_meas:Q', axis=alt.Axis(title='Occurrences out of 6000')),
    alt.Y('count()', axis=alt.Axis(title='Count'), stack=None)
)

rule = base.mark_rule(
    size=2
).encode(
    alt.X('mean(Occurrences out of 6000):Q')
)

hist + rule

A pattern emerges

Clearly we can see the various outcome frequencies peaking at certain values. Let’s calculate the mean of each to see precisely where these land.

# bear in mind this is out of 6000 (6 players x 1000 games)
outcomes.mean()

Push                     2348.917
Dealer Wins              1600.198
Player Wins              1489.123
Push - A High Pai Gow     561.762
dtype: float64

Normalize

outcomes.mean()/6000

Push                     0.391486
Dealer Wins              0.266700
Player Wins              0.248187
Push - A High Pai Gow    0.093627
dtype: float64

So we can see that a push will occur roughly half the time (about 39% regular pushes plus about 9% Ace-high Pai Gow pushes), with the rest split between the player (about 25%) and the dealer (about 27%), who has a slight advantage.

Calculating the House Edge for Pai Gow Poker

Recall the equation for the House Edge:

$\text{House Edge} = (\text{Payout of a Win} * \text{Odds of Winning}) + (\text{Amount of a Loss} * \text{Odds of Losing})$

So pulling in the actual outcomes, we have:

(Going with $1 as a sample bet since it’s even money between a win and a loss. I’ve read that some casinos take a 5% commission on wins, which obviously affects the house edge.)

$\text{House Edge} = (\text{\$1.00} * \frac{1489.123}{6000}) + (\text{-\$1.00} * \frac{1600.198}{6000})$

$\text{House Edge} = \frac{1489.123}{6000} + \frac{-1600.198}{6000}$

$\text{House Edge} = \frac{-111.075}{6000}$

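$\text{House Edge} = -.0185125$

The result is negative from the player’s side: on average about 1.85 cents of every dollar bet goes to the house.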

    =======================================
    ✨✨✨ $\text{House Edge} = 1.85\%$ ✨✨✨
    =======================================
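The same arithmetic as a short sketch, using the mean counts from the simulation. The `house_edge` helper and the 5% commission what-if are illustrative only. The simulation above uses the library's face_up variant, so the commission line demonstrates the formula rather than estimating the edge at any real table.

```python
# Mean counts per 6000 hands, copied from the simulation output
mean_counts = {
    "Push": 2348.917,
    "Dealer Wins": 1600.198,
    "Player Wins": 1489.123,
    "Push - A High Pai Gow": 561.762,
}
hands = sum(mean_counts.values())

p_win = mean_counts["Player Wins"] / hands
p_lose = mean_counts["Dealer Wins"] / hands

def house_edge(win_payout=1.00, loss_amount=-1.00):
    """Negative of the player's expected result; pushes pay 0 and drop out."""
    player_ev = win_payout * p_win + loss_amount * p_lose
    return -player_ev

print(f"even money:         {house_edge():.2%}")
# Hypothetical what-if: a 5% commission on wins, same frequencies
print(f"with 5% commission: {house_edge(win_payout=0.95):.2%}")
```

Under these frequencies a commission on wins would push the edge from roughly 1.85% to a little over 3%, which shows how sensitive the number is to the payout rules.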

Conclusion

So there you have it. The odds of winning the main bet in Pai Gow Poker:

$\frac{24.8}{100}$

but that’s including pushes, which is basically 0 action so if we excluded them we’re looking at:

$1600\ losing+1489\ winning = 3089$ outcomes

~= 1489 / 3089

a $48.2\%$ chance of winning

and the house edge of the same bet? a measly

1.85%

Bear in mind, this was completely unscientific, and the actual edge according to LasVegasAdvisor is 1.81% so I was off by .04%. Not bad for some putzing around.
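For a rough sense of how much of that gap could be plain sampling noise, here is a back-of-the-envelope standard error. It treats every hand as an independent +1, -1 or 0 result, which is an assumption: the six players at a table all face the same dealer hand, so the true uncertainty is likely somewhat larger.

```python
import math

p_win = 1489.123 / 6000
p_lose = 1600.198 / 6000
n_hands = 6000 * 1000  # 6 players x 1000 games x 1000 runs

# Per-hand player result: +1 (win), -1 (loss), 0 (push)
mean = p_win - p_lose
variance = (p_win + p_lose) - mean ** 2
std_err = math.sqrt(variance / n_hands)

print(f"estimated edge: {-mean:.4%}")
print(f"standard error: {std_err:.4%}")
```

The standard error comes out at roughly 0.03 percentage points, so being off by .04% is about what you'd expect from this many simulated hands.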

Overall, compared to other games, Pai Gow Poker offers a very low house edge and a relaxed, slow pace of play. One commenter noted “It’s the slow machine of table games.”