Tuesday, October 30, 2007

NESSIS Presentation on Volleyball

Just about a month ago, Harvard hosted the inaugural New England Symposium on Statistics in Sports (official site, news release).

In perusing the abstracts of conference papers (which can be accessed through the conference website, where it says "Program"), I came across a study (presented in poster format) entitled, "Skill Importance in BYU Women’s Volleyball: A Bayesian Approach." The authors were BYU statistics graduate student Lindsay Florence and professor Gilbert Fellingham.

The beginning of the abstract gives the basics of the study:

The BYU womens volleyball team recorded all skills (pass, serve-receive, set, etc.), rated each skill, and recorded rally outcomes (point for BYU, rally continues, point for opposition) for the entire 2006 home volleyball season. Only sequences of events occurring on BYU's side of the net were considered.

Florence and Fellingham were nice enough to e-mail me a PDF of their poster. It conveys some basic statistics in the form of two-way cross-tabulated tables, along with more complex, Bayesian analyses.

Each of the simpler, two-way tables shows the relationship between a type of skill performance and outcome of the "possession" (BYU point, continuation of rally, or loss of point). A 6 X 3 table, for example, examines six grades of setting (from "perfect set" down through "set not by setter") in relation to the three possible outcomes.

Three types of sets ("perfect," .53; "low and inside," .52; and "outside and low," .51) were associated with winning the point a little over 50% of the time. Two other types of sets ("high and outside," .47; and "inside and high," .46) were associated with winning the point at a little below a 50% rate, and if the set was not by the setter, BYU won the point only 39% of the time.
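For readers who want to see how such a table turns into win rates, here is a minimal Python sketch that computes the share of rallies won for each set grade from rally-level records. The records are invented for illustration and are not the BYU data; the function name `win_rate_by_grade` and the outcome labels are my own assumptions.

```python
from collections import Counter

def win_rate_by_grade(records):
    """Return {grade: share of rallies won} from (grade, outcome) pairs.

    Outcomes are "win", "continue", or "loss", mirroring the three
    columns of the poster's two-way tables.
    """
    totals = Counter(grade for grade, _ in records)
    wins = Counter(grade for grade, outcome in records if outcome == "win")
    return {grade: wins[grade] / totals[grade] for grade in totals}

# Made-up records, NOT the BYU data: 4 perfect sets, 5 sets not by the setter.
records = [
    ("perfect", "win"), ("perfect", "win"),
    ("perfect", "continue"), ("perfect", "loss"),
    ("not by setter", "win"), ("not by setter", "loss"),
    ("not by setter", "loss"), ("not by setter", "continue"),
    ("not by setter", "continue"),
]

rates = win_rate_by_grade(records)
for grade, rate in rates.items():
    print(f"{grade}: {rate:.2f}")
```

Each rate is simply the "point for BYU" cell divided by the row total, which is how a figure such as .53 for perfect sets would be read off a table of counts.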

Other skills, in the domains of hitting and passing, showed similar results: As long as the task was accomplished adequately (i.e., mid to high grades), the Cougars had around a 50% chance of winning the point. But if the task was performed well below optimally, BYU had only roughly a 40% chance of winning the point.

As shown on Fellingham's CV on his website (linked above), he has a fairly large portfolio of research that would likely be of interest to quantitatively minded volleyball fans. This includes the following article, co-authored with (now retired) BYU men's volleyball coach Carl McGown:

Fellingham, G.W., Collings, B.J., & McGown, C. (1994). Developing an optimal scoring system with a special emphasis on volleyball. Research Quarterly for Exercise and Sport, 65, 237-243.

If you'd like a copy of the Florence-Fellingham New England poster, just e-mail me via my faculty webpage (link in the upper-right).

Tuesday, October 23, 2007

Overview of Defense: Blocking and Digging

Continuing our series on the different skills and facets of volleyball, our topic today is defense against opponents' spike attempts, namely blocking and digging. As always, it's a good idea to look at the formal definitions of these plays, for statistical purposes.

According to AVCA guidelines, blocks "are awarded when a player blocks the ball to the opposition's court leading directly to a point without a successful dig." As elaborated in the guidelines, blocks are credited as solo or assist, according to certain criteria. Also, the hitter who is blocked in the manner described above receives an attack error.

A dig "is awarded when a defensive player keeps a bona fide attack in play with a pass."

The central inquiry motivating this blog, of course, is what can be learned through measurement and statistics that tells us about winning matches. Given the suggestion in an earlier posting that a team's hitting percentage seems to be a good marker for general success, it seems plausible that holding down opponents' hitting percentage might also be associated with winning.

Opponents' hitting percentage is not among the statistics displayed on the NCAA statistics page, but it is kept for four major women's volleyball conferences -- Big Ten, Big 12, Pac-10, and Southeastern Conference -- that I looked at recently (see links section on the right).

In looking at team-level defensive statistics, my interest was two-fold: first, what is the correlation (overlap) between teams' blocking and digging statistics and their opposition hitting percentages (OpHP); and second, how do these three variables relate to winning percentage?

I thus computed some correlation coefficients between the four variables, separately for each conference (the data were as of yesterday). To avoid discrepancies within a conference in schedule difficulty due to non-conference schedules, I used statistics only from within-conference games. The sample size (number of teams) for each analysis is small, but the replication of results over the different conferences can be instructive.

A positive correlation simply means that both variables travel in the same direction -- as one goes up, so does the other. A negative correlation indicates an inverse or opposite relation -- as one variable goes up, the other goes down. One should not read a "bad" connotation into the word "negative" in this context; positive and negative simply convey patterns of relationships. A positive correlation approaching its maximum of 1.00 and a negative correlation approaching its minimum of -1.00 (the maximum in absolute value) each convey a powerful relationship.
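As a rough illustration of what a correlation coefficient measures, here is a minimal Python sketch of the Pearson correlation, which I am assuming is the kind of coefficient meant here. The five "teams" and their numbers are hypothetical, chosen only so that more blocking goes with a lower opponent hitting percentage; they are not the conference data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical teams: blocks per game and opponents' hitting percentage.
blocks_per_game = [3.1, 2.8, 2.5, 2.2, 1.9]
opp_hitting_pct = [0.150, 0.180, 0.190, 0.230, 0.240]

r = pearson_r(blocks_per_game, opp_hitting_pct)
print(f"r = {r:.2f}")
```

Because the made-up opponent hitting percentages fall steadily as blocks rise, the coefficient comes out strongly negative. The sketch does not guard against a variable with no variation at all, which would cause a division by zero.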

The results are shown in the following table (which you can click to enlarge). I just cannot resist the graphical embellishments of PowerPoint!


An "official" block (as defined above) gives the opponent a hitting error. A block that does not immediately rocket to the floor for a defensive point, but is instead played by the original hitting team to prolong the rally, also dilutes the opponent's hitting percentage by adding an attempt without a kill. I thus expected to see negative correlations between blocking and opponent hitting percentage (as one goes up, the other goes down). I didn't know how strong the relationship would be, however.
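The arithmetic behind both effects can be sketched in a few lines of Python. The opponent's stat line below (10 kills, 4 errors, 30 attempts) is hypothetical; the formula is the usual hitting percentage, (kills - errors) / attempts.

```python
def hitting_pct(kills, errors, attempts):
    """(kills - errors) / attempts, the standard hitting percentage."""
    return (kills - errors) / attempts

# Hypothetical opponent line before the next swing.
kills, errors, attempts = 10, 4, 30
baseline = hitting_pct(kills, errors, attempts)

# Case 1: the swing is stuffed for a point.
# The hitter is charged one more attempt AND one more error.
after_stuff_block = hitting_pct(kills, errors + 1, attempts + 1)

# Case 2: the block is touched but the hitting team keeps the rally alive.
# One more attempt, with no kill and no error.
after_soft_block = hitting_pct(kills, errors, attempts + 1)

print(f"baseline:         {baseline:.3f}")
print(f"after stuff block: {after_stuff_block:.3f}")
print(f"after soft block:  {after_soft_block:.3f}")
```

Both cases pull the percentage below the baseline; the stuff block hurts more because it subtracts from the numerator as well as adding to the denominator.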

As shown in the above table, these correlations were quite strong, ranging from -.68 to -.87 in the four conferences (these were all statistically significant, even with the small sample sizes). Digs also detract from OpHP, but the correlations were only moderately negative, at best, and not statistically significant (i.e., not reliably different from zero).

How might digs and blocks be correlated? One might expect an inverse (negative) relation, as "airtight" blocking would preclude the need for digs. On the other hand, if a team has a high skill level in general, it should excel at both of these (and other) facets of the game, leading to positive correlations.

In fact, these correlations ranged from moderately negative to moderately positive, with none significant. The one negative correlation, for the Big Ten, may well stem from the fact that Michigan (my graduate school alma mater) was leading the conference in digs, but was last in blocks.

Next, for the second part of our inquiry, which defensive element -- OpHP, blocks, or digs -- is most strongly associated with conference winning percentages? In each conference, opponent hitting percentage edged blocks in absolute strength, but both were potent. OpHP is negatively related to own winning percentage because the lower the hitting percentage to which Team A holds Team B, the more likely Team A is to win. Blocks are positively related to winning, as higher numbers of blocks are associated with better winning percentages.

In doing my research to prepare for this entry, I came across two additional sources:

One is a study comparing two blocking strategies: "commit" vs. "read and react." According to this FIVB document:

Teams usually opt for a 'read and react' block (whereby they try to react to the ball leaving the setter's hands) or for a 'commit' block (whereby they decide before the point whether to jump on the quick middle balls).

In their article, "Relationship between the use of commit-block and the numbers of blockers and block effectiveness," researchers J. Afonso, I. Mesquita, and J.M. Palao analyzed four men's national teams in 2001. Quoting from the abstract of their article in the International Journal of Performance Analysis in Sport:

The results show that the use of the commit block [makes?] difficult the formation of double and triple blocks in the wings and does not increase the block effectiveness or the opponent's error in spike.

(As can be seen, a word was omitted from the original version of the online abstract. Based on context clues, my guess is that the word is "makes," but I've added the question mark to denote the uncertainty.)

The other source is an article from Gold Medal Squared by Carl McGown, a highly successful men's coach, on liberos' passing vs. digging. Like my four-conference analysis, the statistical analyses in McGown's article also highlight the uncertain role of digs in winning and losing.

Thursday, October 18, 2007

JQAS Article on Serve Reception, Setting, and Attack

A new issue of the online publication, the Journal of Quantitative Analysis in Sports, was announced today. Among the articles was one on volleyball by researchers from Greece, entitled "Does Effectiveness of Skill in Complex I Predict Win in Men’s Olympic Volleyball Games?"

The authors made a terminological distinction between "complex I (serve reception, setting, attack)" and "complex II (serve, block/defense, counterattack)" sequences, and focused on analyzing the former. Raters evaluated videotaped game footage with a software system, issuing grades (on a 0-4 scale) on serve reception and first attack (setting was not graded). Not surprisingly, high-level execution of both reception and attack was associated with winning. The authors used discriminant analysis, which is among the more complex techniques in the data analyst's arsenal. I would have liked to see more basic statistics, such as means and frequencies with, respectively, t-tests and chi-squares to distinguish winning and losing teams.

The article is available at: http://www.bepress.com/jqas/vol3/iss4/3. The journal requires a subscription, although free "guest" privileges are available to view a single article.

Zetou, Eleni; Moustakidis, Athanasios; Tsigilis, Nikolaos; and Komninakidou, Andromahi (2007) "Does Effectiveness of Skill in Complex I Predict Win in Men’s Olympic Volleyball Games?," Journal of Quantitative Analysis in Sports: Vol. 3 : Iss. 4, Article 3.

Tuesday, October 16, 2007

Overview of Serving and Serve Receipt

Today, let's take up serving and serve receiving, which appear to be two sides of the same coin. Box-score statistics tend to be quite limited, generally reporting only service aces and errors, and serve reception errors. Jim Coleman's chapter in Shondell and Reynaud's Volleyball Coaching Bible (which I've referenced previously) summarizes some schemes for grading serves and serve reception.

The schemes appear to have both a spatial component -- with short serves, in the center of the receivers' court on the left-right dimension, being considered poor for the server and advantageous for the receiver, and deep serves the opposite -- and a component for how likely the receiving team would be to generate an attack for a side-out, given the placement of the ball.

Consistent with calls for better statistical graphics in volleyball, I had been thinking of serve placement/receipt charts, modeled after shot charts in basketball (see examples here and here).

After searching Google Video with the keyword "volleyball," I found an archived full-length video of a 2006 women's Pac-10 match between Arizona and Oregon, from the Ducks' "O-Zone" broadcasts (video, box score). As shown below, I came up with a coding system, which I applied to Oregon's serve receptions in Game 1 (the availability of a freeze-frame option unquestionably increased the accuracy of my plottings). You can click on the graphic to enlarge it...


It would have been good to add the uniform number of the receiver to each little circle, but the resolution of the video clip wasn't sharp enough for me to see the numbers. Adding the server's number might also be helpful. It wouldn't surprise me if some software packages could generate plots similar to what I've done, but I'm not aware of any.

Going back to the Oregon-Arizona chart, the lack of deep serves stood out to me. This pattern may stem, in part at least, from rule changes in recent years that now allow serve receipt with a setting motion (back in junior high in the mid-1970s, I first learned that setting an incoming serve was a no-no). If, as before, a receiver could only field a serve from a digging position, serves presumably would travel farther back in the court, as they could not be cut off with a set at a higher point in their trajectories.

Monday, October 8, 2007

Overview of Setting

Following up on the previous entry about hitting, we now take up another indispensable part of the offense, the setting.

This Daily Californian article from about a year ago describes the setter's role through the eyes of Cal-Berkeley setter Samantha Carter, who at the time was finishing up her four-year career leading the Golden Bears' offense. The following excerpts give an idea of what being a setter entails:

“You have to be one of the better athletes — you’re doing more running and jumping than anyone else on the team,” says Bears coach Rich Feller... “You have to be a sponge and be able to absorb other players’ mistakes and take it upon yourself to make things better.”

...

Before each play, Carter will make eye contact with all of her hitters and signal to them to designate where they’ll each be going and what the play is. Throughout the play, she vocally communicates with her teammates on the court.

“First thing, when I give my calls I basically try to think, ‘Who do I want to set?’ and ‘How can I best get them the ball?’ Then I conjugate the plan and give the signals,” explains Carter. “I will look where the blockers are going and try to set away from them. It’s all about baiting the blocker and try to make them bite the hook.”

The playbook of a setter is extensive. The sets range in height and tempo: A one set is a fast ball to the middle, normally coming off of good passes; a four set is a high ball set for the outside. The list goes on.

“There are endless options of what I could run,” says Carter. “There are so many options, and I have so many hitters, that it makes my job a little bit tougher.”

The toughest part, of course, comes from reacting to unexpected and difficult attacks. A setter, more than anyone else on the team, must react immediately to anything from a bad pass to a surprising move by an opponent and make adjustments to the play accordingly.


The richness of the setter's role, as illustrated in these excerpts, stands in stark contrast to the paucity of quantitative metrics related to the position. The only setting statistic that appears to be widely available in box scores and NCAA compilations is the assist. According to the AVCA statistical definitions, an assist is "awarded to the player who passes the ball to a teammate who attacks the ball for a kill."

As evidenced by this definition -- and also through common sense -- the setter's assist and the hitter's kill are heavily intertwined. A good set, right in the hitter's "wheelhouse," enhances the likelihood of a kill, whereas the presence of hitters who pulverize the ball and have the savvy to overcome the block will increase setters' assist totals. Statistically, a team's number of kills and of assists will be virtually identical; the only discrepancies would occur when a kill came off of something other than a set, such as an overpass by the opposing team.

Within the 2002 book The Volleyball Coaching Bible (edited by Shondell & Reynaud), the chapter on "Scouting Opponents and Evaluating Team Performance" by the late Jim Coleman offers some interesting observations.

Coleman cites the need to go beyond box-score assist statistics and compile one's own, more elaborate, set of ratings (e.g., perfect set, mediocre set, set leading to free ball to opponent, and set giving opponent a direct point or rally). Even these more elaborate stats are not free of problems, however. Among the complications cited by Coleman are, "The perfect set to one spiker is not perfect for another spiker," and "The statistician is often influenced by whether the attacker kills the ball rather than the absolute quality of the set." He also notes that setter ratings are only weakly correlated with winning.

More generally, Coleman feels that, "A statistical system for setting is probably the most difficult system to create," and "The evaluation of setting seems to be more of an art than a science."

A couple of ideas I have for studying setters are as follows:

1. Similar to measuring fielders' range in reaching batted balls in baseball -- which some sabermetricians are interested in -- setters' range in chasing down passes gone awry could also be assessed. Further, their ability to put up serviceable sets on the run could be evaluated.

2. On an historical basis, collegiate setters' contributions to teams' offensive prowess could be estimated by looking at situations in which most of a team's top hitters return from one season to the next, but with different setters each year. Assuming relative constancy of hitters, difficulty of schedule, etc., from year to year, any difference in a team's aggregate hitting percentage might be attributable to the setter. Year-to-year improvement in hitting ability -- if there is any -- would have to be taken into account.

Monday, October 1, 2007

Overview of Hitting Percentage

For my next few postings, I would like to provide initial examinations of the major volleyball statistics, to try to get a feel for them. Let's start with hitting percentage (also called attack percentage), which can be computed for either individual players or teams.

For a player or a team, one totals the number of kills (i.e., successfully putting the ball away on an attempted attack), then subtracts the number of hitting errors (e.g., attack attempts hit out of bounds or into the net, or that get blocked back into the hitter's face for an opponent's point). The difference is then divided by total attack attempts. These terms are defined rigorously in this document from the American Volleyball Coaches Association.

Imagine the following different hypothetical performances. One player, whom we might call "Dana Devastator," has the ball set up for her 10 times and successfully puts it away all 10 times. That would be a 1.000 performance ([10-0]/10). Another player, "Patty Powerless," might be set up for 7 attempts, but only once get a successful kill, her other 6 spikes being fielded by the other team. That would yield a percentage of .143 ([1-0]/7). Then there's "Erin Erratic," who twice scores a kill on her 8 attempts, but also blasts the ball out of bounds 4 times. Because this player has done more harm than good, her hitting percentage enters negative territory, namely -.250 ([2-4]/8).
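For anyone who would like to check the arithmetic, the three hypothetical performances can be reproduced with a short Python sketch. The function name `hitting_pct` is my own; the formula is the one described above, kills minus errors, divided by total attempts.

```python
def hitting_pct(kills, errors, attempts):
    """Hitting percentage: (kills - errors) / total attack attempts."""
    if attempts <= 0:
        raise ValueError("hitting percentage is undefined with no attempts")
    return (kills - errors) / attempts

# (kills, errors, attempts) for the three hypothetical players in the text.
players = {
    "Dana Devastator": (10, 0, 10),
    "Patty Powerless": (1, 0, 7),
    "Erin Erratic": (2, 4, 8),
}

results = {name: hitting_pct(*line) for name, line in players.items()}
for name, pct in results.items():
    print(f"{name}: {pct:+.3f}")
```

Note that Patty's six spikes that were dug by the other team count as neither kills nor errors; they only enlarge the denominator.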

The hitting percentage formulation, which has been around as long as I can remember, is something I like. The ultimate goal of a team is to win games and matches, which requires getting points for one's own team and denying points to the opponent. Hitting percentage essentially weighs a player or team's balance between kills, which by definition result in points, and hitting errors, which lose you points. Poor efficiency, in terms of hitting a lot of balls that can be played by the opponent, serves to dilute a player or team's percentage.

Viewed from this perspective, it is not surprising that there is tremendous overlap between the nation's "best" NCAA Division I women's teams (as per the September 24 CSTV/AVCA Poll) and the top teams in hitting percentage (as of September 23). As I'll discuss in future postings, other NCAA statistics do not dovetail so well with the poll rankings.

Shown below for the top 10 poll-ranked teams are (left to right) their poll ranking, hitting percentage, and hitting percentage rank:

Nebraska____ 1 __ .338 __ 3

Stanford____ 2 __ .316 __ 4

Penn St.____ 3 __ .347 __ 1

USC_________ 4 __ .287 __ 12

UCLA________ 5 __ .255 __ 37

Florida_____ 6 __ .292 __ 10

Texas_______ 7 __ .298 __ 7

Washington__ 8 __ .341 __ 2

Wisconsin___ 9 __ .284 __ 13

California__ 10 _ .298 __ 6

As can be seen, 9 of the top 10 poll-ranked teams are in the top 13 of team hitting percentage. From the opposite perspective, 7 of the top 10 hitting-percentage teams are in the top 10 of the poll rankings.
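These two counts can be verified directly from the table. The following Python sketch simply re-enters the ten rows listed above and tallies them; the variable names are my own.

```python
# (team, poll rank, hitting percentage, hitting-percentage rank),
# copied from the table of the top 10 poll-ranked teams.
teams = [
    ("Nebraska", 1, 0.338, 3),
    ("Stanford", 2, 0.316, 4),
    ("Penn St.", 3, 0.347, 1),
    ("USC", 4, 0.287, 12),
    ("UCLA", 5, 0.255, 37),
    ("Florida", 6, 0.292, 10),
    ("Texas", 7, 0.298, 7),
    ("Washington", 8, 0.341, 2),
    ("Wisconsin", 9, 0.284, 13),
    ("California", 10, 0.298, 6),
]

# Poll top-10 teams that are also in the top 13 in hitting percentage.
in_top_13 = sum(1 for _, _, _, hp_rank in teams if hp_rank <= 13)

# Poll top-10 teams that are also in the top 10 in hitting percentage,
# which equals the number of top-10 hitting teams ranked in the poll's top 10.
in_top_10 = sum(1 for _, _, _, hp_rank in teams if hp_rank <= 10)

print(f"In top 13 of hitting percentage: {in_top_13} of {len(teams)}")
print(f"In top 10 of hitting percentage: {in_top_10} of {len(teams)}")
```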

There are some anomalies, though. UCLA, ranked as the nation's fifth-best team in the poll, is only 37th in hitting percentage. Texas A&M, on the other hand, ranks 9th in hitting percentage (.293), but is unranked in the poll (not only are the Aggies absent from the top 25, but they're not even among the additional 13 teams receiving some votes to be in the top 25).