Happy holidays to everyone!
Today's entry looks at whether, in a five-game match, the winner of the fifth game can be predicted from the pattern of results in the first four games. For example, if one team wins the first two games, but the other team wins the third and fourth games, one might expect the latter team to win the fifth game, owing to its momentum from Games 3 and 4. This line of reasoning led many to expect a Stanford win in Game 5 of the recent NCAA women's final against Penn State, but it was the Nittany Lions who prevailed.
In the analysis below, I looked at 2007 within-conference matches from four major women's conferences: the Big 10, Big 12, Pac 10, and SEC. As can be seen, in the 34 total matches in which one team (represented by "A") had won the first two games and the other team ("B") had rebounded to win Games 3 and 4, Team A won Game 5 -- and the match -- somewhat more often than did Team B, 19 compared to 15.
Another scenario involves single-game alternation, that is, Team A wins a game, then Team B, then A, and then B. In this case, Teams A and B had virtually identical probabilities of winning the decisive fifth game.
Lastly, a situation can arise in which Team A wins Games 1 and 4, and Team B wins Games 2 and 3. Here, Team A was nearly twice as likely to win Game 5 (15 occurrences) as was Team B (8 occurrences). If the underlying probability of either team winning the fifth game were .50, the probability of a given team winning 15 (or more) times out of 23 would be .105 (a one-tailed figure), assuming independence of observations, like coin-flipping (see here for an online binomial calculator). A 15-of-23 split is thus fairly unlikely to arise from an underlying 50/50 distribution, but it does not reach the conventional .05 statistical significance level needed to reject the null hypothesis of a 50/50 underlying distribution.
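For readers who would rather check the coin-flipping arithmetic themselves than rely on an online calculator, here is a minimal Python sketch. The function name upper_tail_fair is my own label for illustration, and the calculation assumes independent matches with a true 50/50 chance, as described above.

```python
from math import comb

def upper_tail_fair(wins, n):
    """P(X >= wins) when X counts heads in n independent fair coin flips."""
    # Count the equally likely win/loss sequences with at least `wins` wins,
    # then divide by the 2**n possible sequences.
    favorable = sum(comb(n, k) for k in range(wins, n + 1))
    return favorable / 2**n

# ABBA pattern in the volleyball sample: Team A won Game 5 in 15 of 23 matches.
p_one_sided = upper_tail_fair(15, 23)
print(round(p_one_sided, 3))      # 0.105
print(round(2 * p_one_sided, 3))  # 0.21 (two-sided: either team winning 15 or more)
```

The doubled value is what one would get by asking about a lopsided split in either direction, which moves the result even further from the .05 threshold.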
The number of matches studied above comprised a relatively small sample, so additional data from the upcoming men's collegiate season and from next year's women's season will be useful for strengthening the analyses.
Another place to look is men's professional tennis, the major tournaments of which use a 3-out-of-5-set format. There are differences, to be sure, between tennis and volleyball, including one being an individual sport and the other, a team sport. Still, momentum-related phenomena may transcend particular sports.
I found an online article that looked at all matches from 1995-2004 in the four Grand Slam tournaments (Australian Open, French Open, Wimbledon, and the U.S. Open).
The tennis article used a different notation than I did, but the formats are analogous. Listed below is the number of occurrences of each outcome:
WWLLW (like AABB-A) 151
LLWWW (like AABB-B) 188
p = .025
The first of the three comparisons was significant, leading us to reject the null hypothesis of a 50/50 distribution of fifth-set outcomes, when one player has won the first two sets and the other, the next two. The player coming back from 0-2 won five-set matches significantly more than 50% of the time. This result is consistent with the "Stanford momentum" line of thinking in the context of the NCAA volleyball final.
WLWLW (like ABAB-A) 135
LWLWW (like ABAB-B) 156
p = .120
Under the single-set alternation scenario, we cannot reject a 50/50 distribution of outcomes, as was the case for volleyball.
WLLWW (like ABBA-A) 186
LWWLW (like ABBA-B) 138
p = .004
As with the volleyball analysis, in tennis the player who has won the first and fourth sets is substantially more likely to take the fifth than is the player who has won the second and third sets.
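The three tennis comparisons can be rechecked in the same coin-flipping spirit. The sketch below is my own reconstruction, not the tennis article's method: it assumes each p-value is a one-sided exact binomial probability that the more frequent outcome occurs at least as often as observed under a 50/50 split. The names tennis_counts and tennis_p_values are hypothetical labels, and the counts are the ones listed above.

```python
from math import comb

def upper_tail_fair(wins, n):
    """P(X >= wins) when X counts heads in n independent fair coin flips."""
    favorable = sum(comb(n, k) for k in range(wins, n + 1))
    return favorable / 2**n

# (matches won by "A", matches won by "B") for each set pattern, from the lists above.
tennis_counts = {
    "AABB": (151, 188),
    "ABAB": (135, 156),
    "ABBA": (186, 138),
}

def tennis_p_values(counts):
    """One-sided p-value for the more frequent fifth-set outcome in each pattern."""
    results = {}
    for pattern, (a_wins, b_wins) in counts.items():
        # Assumption: test the larger count against a fair-coin null hypothesis.
        results[pattern] = upper_tail_fair(max(a_wins, b_wins), a_wins + b_wins)
    return results

for pattern, p in tennis_p_values(tennis_counts).items():
    print(pattern, round(p, 3))
```

If the one-sided assumption is right, the printed values should land close to the reported p-values; a two-sided test would roughly double each of them.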
I don't know about anyone else, but staring at these notations makes me want to listen to some music by the group ABBA.
UPDATE December 28: After I published the above write-up, I posted a message at the VolleyTalk discussion site to let people know about my analysis. Among the string of messages, a few VolleyTalk readers posted the results of volleyball analyses they had done previously. The following tabulation, by "p-dub," was most on-point:
The chart shows what percent of the time Team A wins, under each of the three distributions of wins in Games 1-4. If Team A's winning proportion under a given configuration is p, then Team B's is (1 - p).
Although the deviations from .500 are small in p-dub's extensive samples, once again the "A" team in the "ABBA" sequence has an increased chance of winning.