[Figure: illustration by Cmglee, CC BY-SA 4.0, via Wikimedia Commons]
Optimal stopping problems present a stochastic sequence one item at a time and force an immediate accept-or-reject at each step, with no going back. The classic example is the secretary problem, whose optimal policy is a single-number threshold: reject the first $r-1$ items to calibrate, then accept the first running maximum after that. The work is choosing $r$. Write the success probability as a sum over the position of the prize, $P(r) = \frac{r-1}{n} \sum_{j=r}^{n} \frac{1}{j-1}$, and maximize over $r$. The asymptotic answer $r \approx n/e$ gives success probability $1/e \approx 0.368$, the most-quoted constant in the optimal-stopping literature.
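As a sanity check on the asymptotics, a short sweep over $r$ for a moderately large $n$ confirms that the maximizer sits near $n/e$ with success probability near $1/e$ (a sketch; the helper name `success_prob` is ours):

```python
import math

def success_prob(n, r):
    # P(hire the best) = ((r-1)/n) * sum_{j=r}^{n} 1/(j-1); r = 1 means "take the first"
    if r == 1:
        return 1 / n
    return (r - 1) / n * sum(1 / (j - 1) for j in range(r, n + 1))

n = 1000
best_r = max(range(1, n + 1), key=lambda r: success_prob(n, r))
print(best_r, success_prob(n, best_r))  # best_r lands within a few of n/e ~ 368
```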
Try first (productive failure)
Before the worked example: spend 60 seconds taking your best shot at this. A guess is fine; being briefly wrong about a problem makes the explanation land harder when you read it. This appears once per tutorial; skip it if you already know the trick.
Worked example
You will interview 10 candidates for a job, in a uniformly random order. After each interview you must immediately accept or reject the candidate, with no going back. You want to maximize the probability of hiring the single best candidate. The policy is a threshold rule: reject the first $r - 1$ candidates outright, then hire the first one who is better than everyone you’ve seen so far. With the optimal $r$, what is the probability you hire the best candidate? (Round to 4 decimal places.)
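The success-probability formula from the introduction can be evaluated directly for $n = 10$ to find the optimal threshold and the answer (a sketch; `success_prob` is our own helper):

```python
def success_prob(n, r):
    # P(hire the best) = ((r-1)/n) * sum_{j=r}^{n} 1/(j-1); r = 1 means "take the first"
    if r == 1:
        return 1 / n
    return (r - 1) / n * sum(1 / (j - 1) for j in range(r, n + 1))

n = 10
best_r = max(range(1, n + 1), key=lambda r: success_prob(n, r))
print(best_r, round(success_prob(n, best_r), 4))  # r = 4, probability 0.3987
```

Note that for $n = 10$ the optimal cutoff $r = 4$ already sits close to $n/e \approx 3.7$.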
Practice 1 of 3
More examples
A handful of harder problems on the same theme. Click any prompt to reveal the solution sketch.
Dice-with-bust game. A casino offers a dice game: you may roll as many times as you like, but each roll has rules. If the die shows 1–5, you pocket the face value and may either stop or roll again; if the die shows 6, all winnings accumulated so far are wiped out and the game ends. After each non-6 roll, you decide whether to keep what you have or risk another roll. If you are risk-neutral, how much would you pay to play once?
Optimal stopping threshold: cash out once your accumulated total is at least $\$15$. Reasoning: at accumulated total $n$, the expected gain from one more roll is $\tfrac{5}{6} \cdot 3 - \tfrac{1}{6} n = \tfrac{5}{2} - \tfrac{n}{6}$, which turns negative for $n > 15$; since this gain only shrinks as $n$ grows, the myopic one-step rule is optimal. Solving the Bellman equation $V(n) = \max\bigl(n,\ \tfrac{1}{6}\sum_{k=1}^{5} V(n+k)\bigr)$ backward from $V(n) = n$ for $n \ge 15$ gives an entry value of approximately $\$6.15$.
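The Bellman recursion for this game is small enough to evaluate directly (a sketch; `value` is our own helper, and the base case encodes the stop-at-$\ge 15$ threshold):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def value(n):
    # Expected final winnings from accumulated total n under optimal play.
    # A 6 (probability 1/6) wipes everything, so it contributes 0 to the average.
    if n >= 15:
        return float(n)  # past the threshold, stopping is (weakly) optimal
    roll = sum(value(n + k) for k in range(1, 6)) / 6
    return max(n, roll)  # stop and keep n, or roll again

print(round(value(0), 2))  # fair entry price, about 6.15
```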
Dynamic card stopping. A casino deals from a standard 52-card deck face-up, one card at a time. Each red card pays you $1; each black card costs you $1. You may stop the dealer at any moment and pocket your net. What is the optimal stopping rule, and how much would you pay (risk-neutral) to play?
Let $f(r, b)$ be the expected additional profit with $r$ red and $b$ black cards remaining under optimal play; stopping on the spot is worth $0$ more, so $f(r, b) = \max\bigl(0,\ \tfrac{r}{r+b}\,(1 + f(r-1, b)) + \tfrac{b}{r+b}\,(f(r, b-1) - 1)\bigr)$ with $f(0, 0) = 0$. The optimal rule is to stop as soon as the expected gain from continuing is non-positive; there is no simple closed form. Numerically, $f(26, 26) \approx 2.62$, so a risk-neutral entry price is about $\$2.62$.
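That recurrence is easy to memoize (a sketch; `f` mirrors the definition above):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def f(r, b):
    # Expected additional profit with r red and b black cards left, playing
    # optimally; stopping immediately is worth 0 more, hence the max with 0.
    if r == 0 and b == 0:
        return 0.0
    cont = 0.0
    if r:
        cont += r / (r + b) * (1 + f(r - 1, b))  # draw red: +1, then recurse
    if b:
        cont += b / (r + b) * (f(r, b - 1) - 1)  # draw black: -1, then recurse
    return max(0.0, cont)

print(round(f(26, 26), 2))  # fair entry price, about 2.62
```

A quick consistency check: with one red and one black card left, $f(1,1) = \tfrac{1}{2}$, since you draw once and stop if it was red.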
Reflection
When you saw a sequential-decision problem with no recall, what made you reach for a threshold rule rather than try to compute a per-step decision? Where does the trade-off between “learn more” and “risk losing the best” show up in the formula?