Friday, August 24, 2012

Year-to-Year Consistency of Hitting Percentage in Top Women's College Spikers

It's time to put the Olympics behind us and start thinking about the new season of women's indoor college volleyball, which begins this weekend. In this posting, I attempt to answer, for women's college volleyball, a question raised in the book Stumbling on Wins by economists David Berri and Martin Schmidt.

Berri and Schmidt, drawing upon the earlier writings of J.C. Bradbury, argue that there are two dimensions on which to evaluate the importance of skills in sports:
  • Do players tend to exhibit them with consistency? In baseball, for example, are pitchers who lead their league one year in the proportion of opposing batters they strike out likely also to lead the league in this category the next year? In football, are running backs who amass a high yards-per-carry average one year likely to do the same the next year? As Berri and Schmidt characterize Bradbury's original point, "a measure that's consistent over time is probably measuring a skill. In contrast, inconsistent metrics are probably capturing luck or the impact of teammates" (p. 34).
  • Do skills tend to correlate with winning? Do basketball teams with high three-point shooting percentages win more than teams low on this metric? Do tennis players who record aces on high percentages of their serve attempts win matches more often than do players with low ace rates?
(As an aside, readers with some background in social-science research may recognize the parallel between the above two criteria and the terms reliability and validity, respectively.)

The present entry will concentrate only on the first issue, that of consistency, as applied to hitting percentage in volleyball. I have looked at the second question, that of connection to winning, previously (here and here). In a nutshell, I've selected a group of hitters (middle and outside/opposite), observed their hitting percentages in 2010 and 2011, and conducted analyses of correlation between the two. In other words, did the players with the highest hitting percentages in 2010 also rank highly on this measure in 2011, and did those with low hitting percentages in 2010 also exhibit relatively low proficiency in 2011?

The players in the analyses are not a random sampling of all hitters in women's collegiate volleyball. Rather, they are leading hitters (in terms of hitting percentages and share of their team's spike attempts taken) I featured a year ago in my 2011 previews of the Big 10, Big 12, Pac 12, and other conferences.

The data set gleaned from these previews originally consisted of 87 players. Five apparently played very little or not at all in 2011, due to injury, coach's decision, or player's decision. Many players were listed on their respective teams' rosters as playing both middle-blocker and outside/opposite hitter. I examined game articles involving these players to see if they were identified with one position more than another. For seven players, the predominant position could not be determined, so they were omitted from analyses comparing middle and outside/opposite hitters. Players listed as setters were classified as outside/opposite hitters.

Before we look at the results, let's review the correlation coefficient statistic, which measures how well two variables (in this case, hitting percentage in 2010 and in 2011) follow the same trend. A positive correlation refers to higher scores on one variable going along with higher scores on the other, and lower scores on one going along with lower scores on the other (i.e., like with like). The maximum value for a positive correlation is +1.00. A correlation of .00 reflects absolutely no relationship between the two variables; if someone has a high score on the first variable, it tells us nothing about whether that person scores high or low on the second variable.
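For readers who would like to see the calculation spelled out, here is a minimal sketch of the correlation coefficient in Python. It assumes the statistic is the standard Pearson correlation. The function name pearson_r and the six pairs of hitting percentages are made up for illustration; they are not the players or numbers from my data set.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of each player's deviations from the two means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each season.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical hitting percentages for six made-up players (illustration only).
hit_2010 = [0.210, 0.250, 0.280, 0.300, 0.330, 0.360]
hit_2011 = [0.200, 0.270, 0.240, 0.310, 0.300, 0.350]

r = pearson_r(hit_2010, hit_2011)
print(f"year-to-year correlation: {r:.2f}")
```

Because the made-up players who hit well in the first season mostly hit well in the second, the result is positive but below the maximum of +1.00.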

When each spiker's hitting percentages for 2010 and 2011 are plotted against each other (with each player represented by a dot), a positive correlation will be revealed by an upwardly trending "best fit line" (the straight line that comes as close as possible to the data points overall).
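As a rough illustration of how such a line can be computed, the sketch below uses ordinary least squares, which is the usual meaning of "best fit" but is an assumption about the graphing software on my part. The function name best_fit_line and the data are hypothetical, not taken from the actual sample.

```python
def best_fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line y = slope*x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    if ss_x == 0:
        raise ValueError("slope is undefined when all x values are identical")
    # Slope: cross-products of deviations divided by squared deviations in x.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / ss_x
    # The least-squares line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical hitting percentages for six made-up players (illustration only).
hit_2010 = [0.210, 0.250, 0.280, 0.300, 0.330, 0.360]
hit_2011 = [0.200, 0.270, 0.240, 0.310, 0.300, 0.350]

slope, intercept = best_fit_line(hit_2010, hit_2011)
print(f"predicted 2011 pct = {slope:.3f} * (2010 pct) + {intercept:.3f}")
```

A positive slope corresponds to the upward trend described above; a slope near zero would mean that a player's 2010 figure says little about her 2011 figure.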

(There is also such a thing as a negative correlation, where high scores on the first variable are associated systematically with low scores on the second, and vice-versa. There may be a small number of volleyball spikers with extremely high hitting percentages one year and low percentages the other year, but it is unlikely such a trend would broadly characterize the entire sample of players.)

First, let's look at separate graphs for hitters whose teams had different vs. the same setters in 2010 and 2011 (if a team used a two-setter offense and only one setter returned, such a team was classified as having the same setter). I expected the correlation (i.e., year-to-year continuity of hitting percentages) to be lower when hitters played with different setters in the two years than when they played with the same setter. The former situation would require an adjustment period for hitters to get used to how the new setter delivered the ball, whereas the latter would not.

As shown in the graphs below, hitters who faced a change in setters from 2010 to 2011 (left graph) exhibited a slightly weaker correlation, with a slightly flatter best-fit line, than hitters who had the benefit of the same setter in both years (right graph). Numerically, the correlation between hitting percentage in 2010 and in 2011 was .58 with different setters and .65 when each hitter had the same setter in both years.
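One common way to get a feel for the size of these correlations is to square them, which gives the proportion of variation in 2011 hitting percentage that is statistically associated with 2010 hitting percentage. This step is an addition for illustration rather than part of the original analysis, and the variable names are hypothetical; only the two correlation values come from the results above.

```python
# Correlations reported above for the two setter situations.
correlations = {"different setters": 0.58, "same setter": 0.65}

# Squaring a correlation gives the proportion of shared variance (r squared).
shared_variance = {label: r ** 2 for label, r in correlations.items()}

for label, r2 in shared_variance.items():
    print(f"{label}: r = {correlations[label]:.2f}, r^2 = {r2:.2f}")
```

By this yardstick, roughly a third of the variation is shared across seasons for hitters with a new setter (about .34), compared with a little over two-fifths for hitters with a returning setter (about .42).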

The data were also split by the hitters' position. Before looking at year-to-year continuity, it should be noted that middle-blockers tend to have higher hitting percentages than outside and opposite hitters (who are positioned on the left- and right-hand sides of the front row on the court, respectively). The conventional wisdom is that outside hitters get a lot of desperation sets, whereas middle-blockers are set more as part of structured plays. In the present sample, middles hit better on average than outsides in both 2010 (.325 vs. .265) and 2011 (.308 vs. .254). Because of these mean differences, we see in the graphs below that the data points for middles (left graph) are farther along the horizontal and vertical axes than is the case for the outside hitters (right graph).

Still, the upward slopes of the best-fitting lines are very similar in the two graphs. The correlations between 2010 and 2011 hitting percentages were .50 for middle-blockers and .57 for outside/opposite hitters. Among the outside/opposite hitters, the data points for virtually all players were close to the best-fit line, with the exception of Sha'Dare McNeal (Texas), who followed up her .300 hitting percentage in 2010 with a .425 in 2011. Such an improvement thus exceeded what would be typical for outside/opposite hitters.

Putting aside the statistical calculations for the moment, practical implications can be gleaned simply from the graphs. For the outside/opposite hitters in the analysis, we can say that knowing a player's hitting percentage one year (2010) tells us within a fairly narrow range the hitting percentage the player is likely to achieve the next year (2011). For example, outside/opposite hitters who recorded a hitting percentage of approximately .200 in 2010 all hit between roughly .150-.250 the next year. Those who hit around .300 in 2010 nearly all hit between .200-.325 the next year.

For middle-blockers, for whatever reason, 2010 hitting percentage narrowed down the likely 2011 hitting percentage less precisely. For example, a middle-blocker who hit around .300 in 2010 would have been expected from the graph to hit somewhere between .200-.400 in 2011.

The sample sizes for these analyses were small, of course, so additional research with larger samples is necessary to corroborate these findings. The present analyses at least provide estimates of the range of possible hitting percentages a player is likely to attain in an upcoming season, based on what she hit the previous season.


Here is a link to the 2012 AVCA preseason coaches' poll. Defending NCAA champion UCLA received the overwhelming share of the first-place votes. Following in positions 2 through 5 are "usual suspects" Texas, Penn State, Nebraska, and USC.

The marquee match of this opening weekend will take place Saturday night, with the Cornhuskers hosting the Bruins.
