Summary
I really enjoy rewatching games during the offseason, so I built an index that helps me choose which game to rewatch without knowing anything other than the teams and the date. Using Python and Savant data, I put together an algorithm that tracks changes in win probability during the game to surface exciting games to watch.
The code also takes into account good pitching, no-hitter situations, good defensive outfield plays and rivalry, to get a mix of different kinds of interesting games. BWRI ranges from 0 to 100, but it’s not a percentile score. BWRI covers seasons from 2016 onward. You can browse BWRI in list mode if you want, but my suggestion is to use the random mode, filtering for games with a high score. BWRI doesn’t take season context into account, though you can choose to show only postseason games.
Since the 2024 season, the project has moved from the R Project and Retrosheet to Python and Savant. The reason is that Retrosheet only updates its play-by-play data at the end of the season, while with Savant the updates can be made almost live.
Extended version
Offseasons are too long. I’m not much into the hot stove thing, so I usually spend the time watching games from the past season, eager to discover relievers or just to have fun with exciting games. It’s not half the fun if you know the outcome of the game in advance, but avoiding that is not hard, since there are more than 2,400 games in a normal MLB regular season. Sometimes I watched random games, but then I found baseballrewatch.com, and that saved me during the spring lockdown. Unfortunately, the website hasn’t been updated recently.
It was then that I wondered whether it would be possible to build an index that evaluates how much a game is worth rewatching, using only the play-by-play stats of the game. With Savant data and some know-how, I could easily prepare the basic tool for calculating the index: play-by-play Win Probability Added (WPA), which is how the probability of winning a game changes from one play to the next. That’s the main tool used to create what I call the Baseball Worth Rewatch Index (BWRI), though there are other things I took into account.
Total WPA
My first thought was: if I add up the absolute WPA values of every play in a game, the highest totals will point me to exciting games, the ones that passed from the hands of one team to the other several times. Drama, high-leverage situations, and entertainment, especially in the late innings, when a change on the scoreboard carries a larger WPA value. So the first factor of BWRI is Total WPA.
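A minimal sketch of the Total WPA idea, not the actual BWRI code. It assumes each game is available as a list of per-play changes in win probability; the function name and the sample numbers are made up for illustration.

```python
def total_wpa(win_prob_changes):
    """Add up the absolute win probability change of every play in a game."""
    return sum(abs(delta) for delta in win_prob_changes)


# Hypothetical per-play changes in home win probability (made-up numbers).
blowout = [0.10, 0.08, 0.05, 0.02]          # one team pulls away early
seesaw = [0.15, -0.30, 0.25, -0.20, 0.35]   # the lead keeps changing hands

blowout_total = total_wpa(blowout)
seesaw_total = total_wpa(seesaw)
print(round(blowout_total, 2), round(seesaw_total, 2))
```

Taking the absolute value is what makes a back-and-forth game score high: swings in favor of either team add to the total instead of cancelling each other out.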
Pitching
It’s not all about runs and action; well-pitched games are really enjoyable too. Pitching here is evaluated in two simple ways: how many strikeouts (Ks) per inning there are in a game, and how close the game comes to a no-hitter. Everything from games that reach the 7th inning with a no-hitter intact to complete no-hitters gets extra points.
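The two pitching checks could be sketched roughly like this. Everything here is an assumption for illustration: the function names, the input (hits allowed per inning by one team's pitchers), and the reading of "reaching the 7th with a no-hitter" as six complete hitless innings.

```python
def strikeouts_per_inning(strikeouts, innings):
    """Strikeouts per inning for the whole game (0.0 if no innings)."""
    return strikeouts / innings if innings else 0.0


def hitless_innings(hits_allowed_by_inning):
    """Count complete innings pitched before the first hit is allowed."""
    count = 0
    for hits in hits_allowed_by_inning:
        if hits > 0:
            break
        count += 1
    return count


def no_hitter_status(hits_allowed_by_inning, threshold=6):
    """Classify a pitching line as 'complete', 'late bid' or 'none'.

    threshold=6 is an assumption: six hitless innings means the no-hitter
    was still alive when the 7th inning started.
    """
    clean = hitless_innings(hits_allowed_by_inning)
    total = len(hits_allowed_by_inning)
    if total >= 9 and clean == total:
        return "complete"
    if clean >= threshold:
        return "late bid"
    return "none"


print(no_hitter_status([0] * 9))                # no hits in nine innings
print(no_hitter_status([0] * 7 + [1, 0]))       # first hit in the 8th
print(strikeouts_per_inning(18, 9))
```

A real implementation would run this once per team, since either pitching staff can carry the no-hit bid.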
Good defensive outfield plays
Good defensive plays are really enjoyable too. I use catch probability data to reward games with good defensive plays, which here basically means games with 4, 5 and 6 stars catches according to Savant.
Rivalry
Finally, I added some extra points for games between rivals; for that purpose I use data from knowrivalry.com.
Other features
To get the final score, I normalize every feature of the index and give each one a weight according to the importance I decided it should have, with total WPA and complete no-hitters being the main factors.
Finally, to keep extra-inning games from taking over the whole top of the list, I decided to trim those games: the final score is multiplied by 0.95 for 10-inning games, by 0.9 for 11-inning games, and by 0.85 for games of 12 innings or more.
To evaluate win probability play by play, I used to employ the method suggested by Max Marchi, Jim Albert and Benjamin S. Baumer back in the R Project days. Nowadays, I just use Savant’s approach.
Thanks for reading. I hope you enjoy using BWRI and rewatching baseball. Don’t hesitate to leave a comment or get in touch with any questions or suggestions.
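As a rough illustration of the normalize-and-weight step, here is a sketch under several assumptions: min-max scaling stands in for whatever normalization BWRI really uses, and the feature names and weights are hypothetical placeholders, chosen only so that total WPA and no-hitters weigh the most.

```python
# Hypothetical weights (not the real BWRI values); they add up to 1.
WEIGHTS = {
    "total_wpa": 0.5,
    "no_hitter": 0.2,
    "strikeouts": 0.1,
    "defense": 0.1,
    "rivalry": 0.1,
}


def min_max(values):
    """Scale a list of raw values to the 0-1 range (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def score_games(features):
    """features maps each feature name to a list of raw values, one per game."""
    normalized = {name: min_max(features[name]) for name in WEIGHTS}
    n_games = len(features["total_wpa"])
    return [
        100 * sum(WEIGHTS[name] * normalized[name][i] for name in WEIGHTS)
        for i in range(n_games)
    ]


# Three made-up games: dull, average, and a classic.
features = {
    "total_wpa": [1.0, 2.5, 6.0],
    "no_hitter": [0, 0, 1],
    "strikeouts": [0.8, 1.5, 2.2],
    "defense": [0, 1, 3],
    "rivalry": [0, 0.5, 1],
}
scores = score_games(features)
print([round(s, 1) for s in scores])
```

With min-max scaling, a game that leads every feature lands at 100 and one that trails every feature lands at 0, which matches the 0 to 100 range without being a percentile.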
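The extra-innings trim is simple enough to write down directly. The multipliers below are the ones stated above; the function names are just for illustration.

```python
def extra_innings_multiplier(innings):
    """Return the score multiplier for a game of the given length."""
    if innings >= 12:
        return 0.85
    if innings == 11:
        return 0.90
    if innings == 10:
        return 0.95
    return 1.0  # games of nine innings or fewer are left untouched


def trimmed_score(score, innings):
    """Apply the extra-innings trim to a final score."""
    return score * extra_innings_multiplier(innings)


for innings in (9, 10, 11, 14):
    print(innings, round(trimmed_score(80, innings), 1))
```

Longer games pile up more plays and therefore more total WPA, so the trim keeps marathon games from crowding out tight nine-inning ones.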