By Sean McAlevey
THE SHADE OF BILL MURRAY’S UMBRELLA HAT, USA, Earth — Yesterday I rediscovered the timeless Bill James “safe lead” formula for college basketball, which calculates exactly when a given lead is so large that it’s effectively insurmountable. As I was reading through it, I thought, Why not adapt the formula for Major League Baseball? It is the first week of the season after all.
Why bother calculating when the lead’s safe, you ask? “With apologies to the Sage of St. Louis,” says James, “there comes a time when it ain’t over, but … it’s over.” There comes a point when we all want to say, “OK, the Cubs are up four runs in the middle of the eighth at home” – obvious fantasy by the way, THE CUBS BLOW LOLOLOLZ – “am I allowed to dig into my celebratory bratwurst yet?” Even more importantly, though, this knowledge is useful for managers: put in the lights-out closer or save him for tomorrow? Rest your superstar’s legs by pinch hitting or keep him in and hope for a comeback?
Bill James and his quantitative obsession with baseball.
So where exactly is that point? When can we exhale with a four-run lead? Obviously, the team with a ten-run lead in the ninth has a safe lead – they’ve started singing “Dirty Water” at Fenway. Conversely, the team with a one-run lead in the bottom of the first isn’t anywhere close to having a safe lead. Those are easy examples though. A harder one: The Twins lead the Tigers 8-2 with a runner on second and one out in the top of the sixth. Safe? Not nearly as cut-and-dried a situation. (Spoiler: The lead is safe according to the formula below. But how’s some schmuck color analyst going to know that? “Ehh I don’t know, Joe, I think this game’s still a toss-up. Sure, the Twins are in a dominant position, but I like the Tigers to mount a comeback in the next few innings; you never know.” Not happening, you moron; I can tell you for a (quasi-)fact that this game is over, period.)
The Jamesian heuristic I adapted for baseball makes calculating when your favorite team’s lead is safe – and if not, how close – an easy task. (A heuristic is defined by James as “a mathematical rule that works even though no licensed mathematician would be caught dead associating with it.”) So without further ado, the DynamicPicks MLB Safe Lead Formula:
1) Take the leading team’s run advantage and subtract one run. (Team A is up 4 runs, 6-2, at home against Team B going into the bottom of the eighth – so Team A has 3 “safe runs.”)
2*) Subtract one run for every opposing runner on base. Add 0.5 runs for each of the leading team’s baserunners. (No runners; Team A is still at 3 safe runs.)
3*) An additional half-inning is worth 0.5 safe runs (about 0.17 per out). If, as in our example, the leading team is at home and up to bat with no outs, they have a three-out advantage, which is worth an additional 0.5 safe runs. (Three extra outs for Team A, for a total of 3.5 safe runs.)
4) Square the total and divide by two. (Team A now has 6.13 safe runs.)
5) When that number is greater than or equal to the number of outs remaining for the trailing team, the lead is safe – the outcome is 99%+ certain. (Team A has 6.13 “safe runs” against Team B’s three remaining outs. The lead’s safe.)
*Note, however, that in the above formula, rules 2 and 3 provide nearly all of the apparent complexity. To streamline the formula, only calculate between innings, using rules 1, 4, and 5. For example, take Team A’s four-run lead, but instead of going into the bottom of the eighth – let’s say they go three up, three down – the game’s headed into the ninth inning. Subtract one: Three safe runs. Square that: Nine. Divide by two: Four and a half. Three outs remaining. Team A’s lead is still safe.
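For anyone who would rather let a computer do the squaring, here is a minimal Python sketch of rules 1 through 5. The function and parameter names are my own invention, and the handling of a zero-or-negative running total is an assumption the formula itself doesn't spell out.

```python
def safe_runs(lead, leader_on_base=0, trailer_on_base=0, out_advantage=0):
    """Rules 1-4: the squared-and-halved "safe runs" total."""
    total = lead - 1                      # rule 1: spot the trailing team one run
    total -= 1.0 * trailer_on_base        # rule 2: each opposing runner costs a run
    total += 0.5 * leader_on_base         # rule 2: each of the leader's runners adds half
    total += out_advantage * 0.5 / 3      # rule 3: 0.5 per half-inning, ~0.17 per out
    if total <= 0:
        # Assumption (not in the formula as stated): a zero or negative total
        # stays at zero, since squaring a negative would wrongly look safe.
        return 0.0
    return total ** 2 / 2                 # rule 4


def is_safe(trailer_outs_left, **situation):
    """Rule 5: safe once safe runs cover the trailing team's remaining outs."""
    return safe_runs(**situation) >= trailer_outs_left


# Team A, up 6-2 at home going into the bottom of the eighth.
print(safe_runs(lead=4, out_advantage=3))        # 6.125
print(is_safe(3, lead=4, out_advantage=3))       # True
# Streamlined check: same four-run lead, now heading into the ninth.
print(safe_runs(lead=4), is_safe(3, lead=4))     # 4.5 True
```

The per-out value is computed as 0.5 / 3 rather than the rounded 0.17, which changes nothing at two decimal places.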
James’ quantitative analysis helped the Red Sox win their first World Series in over eight decades in 2004.
The most fascinating thing about a safe lead is that once it’s safe, it’s always safe. It’s like a law of nature in that sense, except not at all: once a lead’s safe, the outcome is (almost) as good as a fact. Even if a team that was once leading by 14 runs in the sixth inning serves up two grand salamis and a bases-clearing double to their opponent in the eighth, cutting their lead to three, they still have a safe lead. “Once a lead is safe, it’s permanently safe,” explains James. “The theory of a safe lead is that to overcome it requires a series of events so improbable as to be essentially impossible.” Pretty sweet, eh? Go ahead – walk under the ladder in your garage; find a black cat and toss it in your path; place any ridiculous wager you want – “Yo, brah, if we lose, I’ll dress as a gimp for my grad ceremony.” Do anything you want, really, because it’s over; the fridge is closed, the lights are out, the butter’s getting hard, and the jello’s a-jigglin’.
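The "once safe, always safe" rule amounts to a latch: the verdict can flip from unsafe to safe, but never back. A rough Python sketch of that idea follows, using only the streamlined check (rules 1, 4, and 5). The function name and the snapshot values are made up for illustration, and leads are assumed to be measured from the same team's side throughout.

```python
def latch_safe(snapshots):
    """Return the running 'safe' verdict for each (lead, outs_left) snapshot."""
    safe = False
    verdicts = []
    for lead, outs_left in snapshots:
        # Streamlined check (rules 1, 4, 5), from the leading team's side.
        safe_now = lead >= 1 and (lead - 1) ** 2 / 2 >= outs_left
        safe = safe or safe_now           # once True, never reset
        verdicts.append(safe)
    return verdicts


# Made-up snapshots: up 14 with 12 outs to go, then up 3 with 4 outs to go.
print(latch_safe([(14, 12), (3, 4)]))     # [True, True]
# The same three-run lead with no safe history behind it.
print(latch_safe([(3, 4)]))               # [False]
```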
Why don’t we take the formula for a spin in more complex situations? Is a one-run lead with the bases loaded and no outs in the top of the ninth safe? The leading team, Team A, has a one-run lead, which amounts to zero safe runs after subtracting the initial run. However, Team A has three runners on base: +1.5 safe runs (3 x 0.5 each). There isn’t an out advantage, as both teams have three remaining apiece. The square of 1.5 is 2.25. Divide that by two and you’re left with about 1.13 safe runs (1.125, to be exact), which means that Team A’s one-run lead with the bases loaded in the top of the ninth and no outs is far from a safe lead (1.13 safe runs < 3 outs remaining; only about 37.5% safe).
What about Team Z’s seven-run lead going into the fourth inning? A seven-run lead amounts to six safe runs. There are no runners on, and there isn’t an out advantage, so Team Z still has six safe runs. The square of six is 36, divided by two is 18. Thus, Team Z’s seven-run lead going into the fourth (18 outs remaining for Team Z’s opponent) is just enough of a lead to be considered safe; they should win that game 99%+ of the time. That might be a somewhat shocking conclusion to some. Most if not all people would be hesitant to call any game after only three innings, let alone one with only a seven-run lead. At first glance it seems like that lead isn’t all that safe, but it is, however unintuitive it may sound. Team Z is winning that game whether you and I like it or not.
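Flip the question around and you get a cheat sheet: how big does a lead have to be at the start of each inning? The sketch below applies only the streamlined rules (1, 4, and 5) and assumes a regulation nine-inning game checked at the top of an inning, so the trailing team has 3 x (10 - inning) outs left and nobody has an out advantage. The helper name is hypothetical.

```python
def min_safe_lead(outs_left):
    """Smallest whole-run lead with (lead - 1)^2 / 2 >= outs_left."""
    lead = 1
    while (lead - 1) ** 2 / 2 < outs_left:
        lead += 1
    return lead


# Assumption: nine innings, checked at the top of each inning.
for inning in range(1, 10):
    outs_left = 3 * (10 - inning)
    print(f"entering inning {inning}: {outs_left} outs left, "
          f"safe at +{min_safe_lead(outs_left)}")
```

Entering the fourth it comes out to +7, matching Team Z, and entering the ninth it comes out to +4.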
Think about it another way. James’ widely popular Pythagorean expectation formula – Runs^2/(Runs^2+OppRuns^2) – says that if Team Q is up by seven runs on the road in the top of the first with no outs, 7-0, they have an expected win probability of approximately 88.3%, which is nearly a safe lead – and an out’s yet to be recorded! In Team Z’s case, the game is already in the fourth inning when they have their seven-run lead; a third of the game is no más. With three fewer innings to mount a comeback, Team Z’s opponent is in significantly worse shape than their counterparts, Team Q’s opponent, down seven with no outs and none on in the top of the first. In fact, in Team Q’s case the lead is only 66.67% (2/3rds) safe. Those nine outs make all the difference, turning a comfortable lead into a safe one.
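Here is a small Python sketch of the two numbers being compared. One caveat: plugging a literal 7 and 0 into the Pythagorean formula returns 100%, so the 88.3% figure must rest on projected final scores. Penciling in four more runs for each side (11 vs. 4) happens to reproduce it, but that is my guess, not something stated above. The "percent safe" ratio (safe runs divided by the trailing team's outs, capped at 100%) is likewise inferred from how the percentages in these examples work out.

```python
def pythagorean(runs, opp_runs):
    """Pythagorean expectation with exponent 2 (undefined at 0-0)."""
    return runs ** 2 / (runs ** 2 + opp_runs ** 2)


def percent_safe(lead, outs_left):
    """Streamlined safe runs as a share of the trailing team's outs."""
    return min(1.0, (lead - 1) ** 2 / 2 / outs_left)


print(pythagorean(7, 0))                  # 1.0 (the raw 7-0 score)
print(round(pythagorean(11, 4), 3))       # 0.883 (assumed 11-4 projection)
print(round(percent_safe(7, 27), 4))      # 0.6667 (Team Q, 27 outs left)
print(percent_safe(7, 18))                # 1.0 (Team Z, 18 outs left)
```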
James, who originally created the Safe Lead formula for college basketball, sits at a SABERmetrics conference.
So when Craig Kimbrel comes jogging in from the ‘pen to close out the top of the ninth in Atlanta for a Braves squad sitting on a three-run lead against the rival Nationals, Braves fans shouldn’t crack that celebratory brew just yet; the game’s still not over. The Safe Lead Calculator says that the Braves’ lead is only 66.67% (2/3rds) safe. But after Kimbrel gets the first batter to chase an eye-level fastball in a 1-2 count for a strikeout, the Braves’ lead is then safe. A three-run lead with no runners on and a one-out advantage (if necessary, the Braves still have all three of their outs in the bottom half) is worth 2.35 safe runs. Against only two outs remaining, that’s a safe lead.
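To watch the lead tip over from unsafe to safe, the sketch below walks the top of the ninth out by out. It assumes the home team is leading, nobody reaches base, and each recorded out widens the home team's out advantage by one (worth 0.5 / 3 safe runs before squaring). The function name is hypothetical.

```python
def walk_final_frame(lead, outs_in_frame=3):
    """(outs recorded, safe runs, safe?) for each out count in the top of the ninth."""
    rows = []
    for outs_recorded in range(outs_in_frame):
        outs_left = outs_in_frame - outs_recorded
        # Rule 1, plus rule 3 for the home team's growing out advantage.
        total = (lead - 1) + outs_recorded * 0.5 / 3
        safe = total ** 2 / 2             # rule 4
        rows.append((outs_recorded, round(safe, 2), safe >= outs_left))
    return rows


for row in walk_final_frame(3):
    print(row)
# (0, 2.0, False)
# (1, 2.35, True)
# (2, 2.72, True)
```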
This formula does not, however, take into consideration relative team strengths. But that’s an unproblematic fact. Even though a seven-run Tigers lead is safer than a seven-run ‘Stros lead, per se, the difference is marginal. The greatest teams in the best of circumstances (ace vs. fifth starter, at home, etc.) win no more than 75% of their games, and rarely more than 70%. Moreover, we’re (usually) dealing with small fractions of games (12 outs left, 7 outs left, 2 outs left) and many times it’s the mediocre, 4.3+ ERA relievers that dominate the hill in the later stages of games, muddling relative team strength and reducing the home-field advantage factor to such a degree that taking them into consideration wouldn’t be worth the effort. Most importantly, if a team is winning by enough to be wondering if their lead is safe, they’re probably a pretty good team in the first place.
Check that Craig Kimbrel example, by the way. I doubt there’s ever been an unsafe lead with someone as dominant as Kimbrel on the mound.