Saturday, November 21, 2009

The October/November issue of the AVCA's Coaching Volleyball magazine features an in-depth article by Iowa State women's coach Christy Johnson, entitled "Taking Your Setter from Good to Great: Seven Qualities for Which to Strive." Many of the ideas in Johnson's article are amenable to data-recording and quantification, and some teams may already be tabulating such measures for their internal statistical purposes.

Here are some examples, quoting from Johnson, that I think could inspire objective grading systems for evaluating setters.

"Great footwork allows your setter to contact the ball high in the middle of the forehead every time."

"If a hitter can take a great swing at the ball, then the setter has done her job..."

"To train this concept, I’ll toss balls all around the court. I’ll ask my setter to set a quickset, for example, on every set she can, unless she feels she can’t put up a good ball, in which case she should then set the outside set (or backrow set or backset, whatever she can do at that point). I want her to learn her range, and even though we’ll continue to work on expanding that range, she needs to understand what she is physically capable of."

"If our outside is hitting a high ball, the setter can afford to set her off the net a little bit. She’ll have time to adjust, and there will likely be two blockers waiting for her anyway, so better to keep her off the net. Middles need a ball that is traveling towards the net. They don’t have time to adjust to an off set, and they are often going against only one blocker, so we can keep their sets a little tighter."


Setting does not, of course, occur in a vacuum. Accomplishment of some of the above tasks will depend on the quality of passes a setter receives and the hitting abilities of the players she sets. If there were some way to reliably record the locations of suboptimal passes on serve-receipt, then different setters could be compared on the percentage of the time they put up a hittable set from a given location. Such attempts to quantify setters' range for "rescuing" errantly passed balls would parallel efforts among baseball analysts to quantify fielders' defensive ranges (see here and here for baseball examples).

Friday, October 23, 2009


Your intrepid VolleyMetrics correspondent was in Ann Arbor, Michigan last Saturday night for the match between Ohio State and U of M. Actually, I was in town to attend an academic conference and visit my graduate-school alma mater, and as a bonus the volleyball match fit my schedule.

The night before, the Wolverines had taken two-time defending NCAA champion Penn State to five games. I knew the Penn State match had been sold out, but when I got to the arena after flying all day Saturday, I was amazed to see the Ohio State match was too (notice the crowd-control grate in the lobby in the pictures above). I was on the outside looking in until "halftime" (between Games 2 and 3), when a number of seated spectators left and hangers-on were let in.

The statistical angle I pursued (starting with Game 3) followed up on my immediately prior posting (below), namely what happens on spike attempts where the hitter neither achieves a kill nor commits a hitting error (i.e., a "non-terminal" shot where the ball remains in play). If the non-kill hit attempt still renders the defense unable to launch its own return attack, then the original hit attempt will have achieved some measure of success. On the other hand, as I quoted Tristan Burton in my earlier posting, "An in-swing that the opponent converts for a kill is no different than a hitting error as far as the score is concerned."

I focused only on Michigan, and only two Wolverine hitters, Alex Hunt and Juliana Paz, received enough sets during Games 3 and 4 for me to compile statistics. While I was there, Hunt had 10 non-terminal hits (three in Game 3 and seven in Game 4); of these 10, the Wolverines and Buckeyes each ultimately won five of the points. Hunt, a left-hander hitting on her far left-hand side of the court, seemed to aim straight (i.e., down the sideline) a great deal of the time, as opposed to cross-court, and the Buckeyes dug her well. Paz had six non-terminal hits (two in Game 3 and four in Game 4) and OSU ultimately won four of these points.

Hunt hit .244 against Ohio State, racking up 15 kills and 5 hitting errors (for a net positive of 10) on 41 spike attempts (box score). Paz was in negative hitting territory for the evening (-.059), based on 8 kills, 10 errors, and 34 attempts. Looking at the Wolverines' seasonal statistics (through games of October 21, at this writing), Hunt's hitting percentage was only .215, based on 203 kills, 80 errors, and 573 attempts; clearly, a great many of her spike attempts remain in play. Paz does better at .281 (265/91/619), and even higher is Veronica Rood at .371 (142/33/294).
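For readers who want to check the arithmetic, here is a minimal Python sketch of the standard hitting-percentage formula, (kills - errors) / attempts, applied to the lines quoted above. The function names `hitting_pct` and `in_play` are my own labels for illustration, not anything from the box score.

```python
def hitting_pct(kills, errors, attempts):
    """Hitting percentage: (kills - errors) / total attempts."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return (kills - errors) / attempts


def in_play(kills, errors, attempts):
    """Non-terminal swings: attempts that were neither kills nor errors."""
    return attempts - kills - errors


# (kills, errors, attempts) as quoted in the text above
lines = {
    "Hunt vs. OSU": (15, 5, 41),
    "Paz vs. OSU": (8, 10, 34),
    "Hunt season": (203, 80, 573),
    "Paz season": (265, 91, 619),
    "Rood season": (142, 33, 294),
}

for label, (k, e, a) in lines.items():
    print(f"{label}: {hitting_pct(k, e, a):.3f} "
          f"({in_play(k, e, a)} of {a} swings stayed in play)")
```

By this count, 290 of Hunt's 573 season attempts (about half) were neither kills nor errors.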

Michigan has made the round of 16 in each of the last two years' NCAA tournaments. The Wolverines seem to have the potential to advance further this year, but to do so, they'll probably have to become more proficient at putting balls away.

Saturday, September 26, 2009

Over at the VolleyTalk discussion site, frequent contributor "P-Dub" raises an interesting question about hitting percentage, defined as: (kills-errors)/total attacks.

Player A: 6/3/15
Player B: 3/0/15

Both players have hit .200 (the lines are listed as kills/errors/total attacks), but the first has done it with more kills and more errors. Which of these contributions is better?


To answer the question -- in theory, if not in practice -- P-Dub suggests looking at what the defensive team does with the balls the offensive team has neither put away (kills) nor failed to place in-bounds on the other side of the net (errors); in other words, what happens to the balls that remain in play?

For example, if a team is really good at converting opponents' non-kills into its own kills, then the aforementioned Player B's 3/0/15 line isn't good, because it gives the other team 12 opportunities to produce its own kills. This seems like a productive line of thinking, but it would be good to add some actual data to the debate.

The full discussion thread, which has now reached three pages, can be accessed here.

UPDATE (9/28): Tristan Burton, whose work has been cited before on this blog, sent me the following comment on evaluating hitting performances (with his permission to reproduce it).

I just saw your post about hitting efficiency. My paper defines "hitting effectiveness", which includes the outcome of any non-terminal swings by a hitter. So if I attack and the opponent digs me and then immediately gets a kill of their own then it counts against my hitting effectiveness. In English (instead of mathspeak), hitting effectiveness is hitting efficiency minus (the fraction of my swings on which the opponent gets their own attack) times (their hitting efficiency on those attacks). Usually, hitting effectiveness is lower than hitting efficiency and for those hitters who make a living just putting the ball in play it might be substantially lower (depending on whether or not the opponent is good at converting). I've looked at data for Pac-10 women where two OH's had virtually the same hitting efficiency but drastically different hitting effectiveness numbers because one of the players was aggressive and had a higher kill% and higher error% but the opponent could not easily convert her "in-swings" while the other player was putting more balls in play and the opponent was converting easily. There seems to be a focus in the volleyball community on minimizing errors but that's not what you are really trying to do, you're actually trying to increase the score difference between yourself and your opponent as much as possible with every swing (this is what hitting effectiveness actually tells you). An in-swing that the opponent converts for a kill is no different than a hitting error as far as the score is concerned. As with so many things in life, there's a risk vs. reward relationship that needs to be considered.
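To make Burton's verbal definition concrete, here is a rough Python sketch that follows his "in English" description word for word. The function and parameter names are mine, and his paper may handle details differently. The opponent counter-attack numbers for Players A and B are invented purely for illustration; nobody has actual data on these two hypothetical hitters.

```python
def hitting_efficiency(kills, errors, attempts):
    """Ordinary hitting percentage: (kills - errors) / attempts."""
    return (kills - errors) / attempts


def hitting_effectiveness(kills, errors, attempts,
                          opp_attacks, opp_kills, opp_errors):
    """Efficiency minus (fraction of swings the opponent attacks off of)
    times (the opponent's hitting efficiency on those attacks)."""
    efficiency = hitting_efficiency(kills, errors, attempts)
    if opp_attacks == 0:
        return efficiency  # opponent never got a swing of its own
    fraction_attacked = opp_attacks / attempts
    opp_efficiency = (opp_kills - opp_errors) / opp_attacks
    return efficiency - fraction_attacked * opp_efficiency


# Player A: 6/3/15, so 6 balls stay in play.
# HYPOTHETICAL: opponent attacks off 5 of them, with 1 kill and 1 error.
player_a = hitting_effectiveness(6, 3, 15, opp_attacks=5, opp_kills=1, opp_errors=1)

# Player B: 3/0/15, so 12 balls stay in play.
# HYPOTHETICAL: opponent attacks off 10 of them, with 5 kills and 1 error.
player_b = hitting_effectiveness(3, 0, 15, opp_attacks=10, opp_kills=5, opp_errors=1)

print(f"Player A effectiveness: {player_a:.3f}")
print(f"Player B effectiveness: {player_b:.3f}")
```

Under these made-up conversion numbers, two hitters with identical .200 efficiencies end up far apart in effectiveness, which is the pattern Burton describes for the two Pac-10 outside hitters.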

Sunday, September 20, 2009

Yesterday was the home opener at my university, Texas Tech, as the Red Raiders hosted Texas A&M in Big 12 play. It was also the home debut for new Tech coach Trish Knight, who faces an enormous rebuilding job. Prior to Knight's arrival, Tech had lost 39 straight conference matches. After yesterday's 25-15, 25-11, 25-17 shellacking by the Aggies, the streak is now at 41.

With pencil, paper, and camera in hand, I decided to focus my statistical analysis yesterday on the serve-receipt success of Texas Tech's six rotations. I took the following picture (which you can click on to enlarge) during Game 3. We see that for the Red Raiders (near court), No. 11 (Amanda Dowdy) is front left, No. 4 (setter Caroline Witte) is front center, No. 13 (Barbara Conceicao) is front right, No. 1 (Hayley Ball) is back right, No. 10 (Aleah Hayes) is back center (her number doesn't show in the picture, but I got it from my notes), and No. 9 (libero Jenn [Harrell] Goehry) is back left. Once the ball is served, players can shift laterally; as shown in the photo, the setter Witte (No. 4) is getting ready to move to the right, to leave Conceicao (No. 13) in her natural position of middle-blocker.


Shown next is a chart of Tech's six rotations in Games 1 and 3 (the rotation with the court depicted in yellow is the one in the photograph). In a few cases, I was unsure about a uniform number and/or positioning, but I've re-created the rotations to be as logically coherent as possible (e.g., if a given player were in the front left position in one rotation, she should be in the front center position in the next rotation). Between the libero role and just ordinary substitution, charting a team's rotations was not nearly as easy as I thought it might be.


As it turns out, you don't really need advanced statistical methods to see which rotations did better or worse on serve-receipt in Game 1 (I did not keep statistics when Tech served, but by locating server names in the play-by-play sheet and consulting the Red Raider roster, it should be possible to determine the success of the different rotations on serve).

The ideal for a serve-receipt opportunity would, of course, be a successful First-Ball Attack (FBA). In other words, the served ball would be dug, set, and spiked for an immediate kill. Texas Tech's starting rotation (the top-most in the left-hand column) achieved this ideal both times it had the chance. Starting setter Karlyn Meyers (No. 3) was in the back row, meaning that she had three front-row attackers at her disposal.

The Raiders' weakest rotation was clearly the third one down (one of three rotations in which the setter is in the front row, thus leaving only two eligible front-row attackers). In this rotation, Tech exhibited just about every problem in the book. Mostly, the Red Raiders mounted an FBA where the hit was Not Put Away (NPA), leading to a rally that the Aggies eventually won. Tech also failed to get its FBA onto the Aggie side of the court inbounds (twice), sent over a free ball, and made an overpass.
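For anyone who wants to keep this kind of tally without special software, here is a rough Python sketch of one way to do it. The outcome labels and function names are my own, and the sample records are invented for illustration; they are NOT my actual chart from the match.

```python
from collections import Counter, defaultdict

# Outcome labels for one serve-receipt opportunity (names are illustrative).
FBA_KILL = "FBA kill"          # first-ball attack put away
NPA_WON = "NPA, rally won"     # attack not put away, receiving team won the rally
NPA_LOST = "NPA, rally lost"   # attack not put away, receiving team lost the rally
FBA_ERROR = "FBA error"        # attack failed to land inbounds on the other side
FREE_BALL = "free ball"
OVERPASS = "overpass"


def tally_by_rotation(records):
    """Count outcomes per rotation from (rotation_number, outcome) pairs."""
    tallies = defaultdict(Counter)
    for rotation, outcome in records:
        tallies[rotation][outcome] += 1
    return tallies


def rate(counter, outcome):
    """Share of a rotation's serve-receipt chances that ended in `outcome`."""
    total = sum(counter.values())
    return counter[outcome] / total if total else 0.0


# Invented sample records, NOT the real match data.
sample = [
    (1, FBA_KILL), (1, FBA_KILL),
    (3, NPA_LOST), (3, NPA_LOST), (3, FBA_ERROR), (3, FREE_BALL), (3, OVERPASS),
]

tallies = tally_by_rotation(sample)
for rotation in sorted(tallies):
    counts = tallies[rotation]
    print(f"Rotation {rotation}: {dict(counts)}, "
          f"FBA kill rate {rate(counts, FBA_KILL):.0%}")
```

Keeping the NPA outcomes split by who eventually won the rally preserves the information needed to look at non-terminal swings later.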

I assume that all teams keep their own statistics of this type. Further, there appear to be computer software packages available to assist with such data-collection efforts (just do a Google search with keywords such as: computer software volleyball rotation).

Sunday, September 6, 2009

I recently discovered that the American Volleyball Coaches Association (AVCA) makes its bimonthly magazine, Coaching Volleyball, free online. Naturally, I reviewed the last several issues in search of any statistically oriented articles, and I hit paydirt.

UC San Diego assistant men's coach Tristan Burton, who earned a Ph.D. in mechanical engineering from Stanford with a 2003 dissertation entitled "Fully Resolved Simulations of Particle-Turbulence Interaction," contributed an article to the latest (August/September 2009) issue of Coaching Volleyball.

The title of Burton's AVCA article says it all: "A Comprehensive Statistics System for Volleyball Match Analysis." Whether using the game (set) or match as the unit of analysis, the system decomposes the final total point difference between the teams into seven categories.

As a concrete example, Burton uses the 2008 Olympic men's semifinal between the U.S. and Russia. With the U.S. winning 25-22, 25-21, 25-27, 22-25, 15-13, the Americans garnered 112 total points to the Russians' 108. The Americans' +4 overall differential could then be broken down into the following components (where PD = Point Difference):

Service (SPD)
1st Ball Attack (1PD)
Transition Attack (TPD)
Opponent Terminal Serve (OTSPD)
Opponent Giveaway Transition Attack (OGTPD)
Opponent Block and Cover Transition Attack (OBACTPD)
Miscellaneous (MPD)

As Burton notes, "Given that the service line is not an advantageous location from which to attack, [one's own service performance] is usually a negative number, i.e. on average a team loses points when they serve" (p. 17). In the example Olympic match, the U.S. had an SPD of -51, whereas for Russia it was -58.

The Americans' seven component scores, listed in the same order in which the terms appear above, were -51, 40, 9, 15, -6, -3, and 0, which sum to +4 (corresponding to the U.S. team's winning four more total points in the match than the Russians, as detailed above). Russia's sum would naturally come out to -4 (-58, 35, 17, 11, -6, -3, 0). I have not explained how each of these component scores is obtained; these procedures are fairly complicated, so interested readers will need to look at Burton's original article to see how everything works.
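As a simple bookkeeping check, the short Python sketch below confirms that the seven components quoted above add up to each team's overall point differential. It only verifies the sums; it does not attempt to reproduce how Burton derives each component, and the variable names are my own.

```python
# Component abbreviations, in the order listed above.
COMPONENTS = ["SPD", "1PD", "TPD", "OTSPD", "OGTPD", "OBACTPD", "MPD"]

usa = dict(zip(COMPONENTS, [-51, 40, 9, 15, -6, -3, 0]))
russia = dict(zip(COMPONENTS, [-58, 35, 17, 11, -6, -3, 0]))

# Game scores as (U.S., Russia) from the 2008 Olympic semifinal.
game_scores = [(25, 22), (25, 21), (25, 27), (22, 25), (15, 13)]
usa_points = sum(us for us, _ in game_scores)
russia_points = sum(ru for _, ru in game_scores)

print("Total points:", usa_points, "to", russia_points)
print("U.S. component sum:", sum(usa.values()))
print("Russia component sum:", sum(russia.values()))

# The decomposition should reproduce the overall differential for each team.
assert sum(usa.values()) == usa_points - russia_points
assert sum(russia.values()) == russia_points - usa_points
```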

As Burton advises, "In addition to looking at these statistics for the entire team, it is also possible to look at them for individual players or individual rotations in order to identify more specific areas for improvement" (p. 18).

Burton's system is not for the faint of heart. It requires extensive manual record-keeping during a match and the use of computer software to calculate the various parameters. The article has so many variables and abbreviations that it will almost certainly leave any reader's head spinning (it did mine, and as a professor who teaches statistics, I'm usually quite comfortable with numbers and formulas).

Another potential use of Burton's article would be to select a few relatively straightforward tabulations to use for one's team, instead of immersing oneself in the full system. One statistic in the article that caught my eye is the following: "Russia was able to respond to slightly more (73.9% vs. 73.4%) serves with a 1st ball attack" (p. 17). I would have thought such elite teams would have more of a tendency to mount an attack directly off of serve receipt, but by the same token, I guess, elite teams would also be delivering a lot of tough serves!

ADDENDUM/CLARIFICATION: Dr. Burton and I have exchanged e-mails, in an attempt to clarify the statistic in the paragraph immediately above regarding teams' mounting a 1st ball attack only around 73-74% of the time. These figures include opponents' serving errors as non-1st ball attacks. Dr. Burton was kind enough to run some new numbers for readers of the blog. Limiting the situation to when a receiving team faced an in-play serve, how often did the receiving team successfully set up a spike attempt as a first response, as opposed to being aced or sending a feeble (i.e., freeball) response back to the serving team? The answer is generally around 90%, both from some Olympic men's and Pac-10 women's matches Dr. Burton analyzed.

Saturday, July 25, 2009

In the latest issue of the online Journal of Quantitative Analysis in Sports, Balazs Kovacs presents an article entitled "The Effect of the Scoring System Changes in Volleyball: A Model and an Empirical Test" (the journal requires subscriptions, but free guest privileges are available). The article focuses on the change, implemented about a decade ago in many different levels of volleyball competition, from server-only scoring (with side-outs) to rally scoring.

Back when only the serving team could score, matches could drag on indefinitely if the receiving team kept winning rallies (i.e., siding-out); several plays would go by and the score would remain unchanged. Rally scoring was not necessarily adopted to make matches end more quickly, as the number of points needed to win a set (also known as a game) was increased from 15 to either 25 or 30 (depending on the league) when rally scoring was introduced (except for the fifth game of a match). Rather, the change was intended to narrow the range of how long matches took to play (by eliminating the kinds of long scoreless periods alluded to), which could be helpful for television programming.

Kovacs provided a number of computer simulations of matches, but also presented analyses from actual games (using women's play in the NCAA Division II Northern Sun Intercollegiate Conference), before and after the switch to rally scoring. The key results, comparing server-only to rally scoring, were as follows: "the average match length increased from 92.5 minutes to 99.8 minutes. The variance of the match length has decreased from 27.82 in 2000 to 22.56 in 2001" (p. 8). For readers more familiar with the standard deviation as a measure of spread, the variance is simply the squared SD.
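For anyone who prefers to think in standard deviations, the conversion is just a square root, as in this small Python sketch. I am taking the two quoted variances at face value; the article excerpt does not state their units, so reading the results as minutes is my assumption.

```python
import math

# Variances of match length as quoted from Kovacs (p. 8).
variance_2000 = 27.82  # server-only scoring
variance_2001 = 22.56  # rally scoring

# SD is the square root of the variance (units assumed, not stated in the quote).
sd_2000 = math.sqrt(variance_2000)
sd_2001 = math.sqrt(variance_2001)

print(f"SD in 2000: {sd_2000:.2f}")
print(f"SD in 2001: {sd_2001:.2f}")
```

The two standard deviations come out to roughly 5.27 and 4.75.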

Thus, from this one conference at least, rally scoring appeared to accomplish its aim of providing more regularity to the length of matches. Going beyond the scope of Kovacs's article, a concern I've always had about rally scoring is that it may impair teams' ability to stage comebacks. Hypothetically, take a team that's serving while trailing 14-10 in a game to 15. Under the old system, the trailing team could not lose, as long as it was serving. The leading team would first have to win a side-out (which receiving teams are well positioned to do, as they have first crack at running their offense) and then win a point on serve (which is harder, for the same reason). Under rally scoring, however, the leading team could win the game merely with a side-out.
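To put a rough number on this concern, here is a back-of-the-envelope Python sketch comparing the trailing team's chances of getting from 10-14 back to 14-14 under the two systems. It rests on simplifying assumptions that are entirely mine: every rally is independent, both teams win a rally on their own serve with the same fixed probability `p_serve`, and the value 0.4 is a hypothetical figure chosen only because serving is generally a disadvantage.

```python
def p_server_scores_next(p_serve):
    """Side-out scoring: chance that the team now serving scores the next point.

    The server either wins the rally (and scores), or loses it, after which
    the other team must also lose on its own serve for the situation to reset.
    Solving x = p + (1 - p)**2 * x gives the expression below.
    """
    if not 0 < p_serve <= 1:
        raise ValueError("p_serve must be in (0, 1]")
    return p_serve / (1 - (1 - p_serve) ** 2)


def p_tie_from_10_14(p_serve, rally_scoring):
    """Chance the trailing team, serving at 10-14, scores four straight points."""
    if rally_scoring:
        per_point = p_serve  # any lost rally ends the game immediately
    else:
        per_point = p_server_scores_next(p_serve)  # lost rallies can be recovered
    return per_point ** 4


P_SERVE = 0.4  # HYPOTHETICAL probability of winning a rally on one's own serve

print(f"Side-out scoring: {p_tie_from_10_14(P_SERVE, rally_scoring=False):.3f}")
print(f"Rally scoring:    {p_tie_from_10_14(P_SERVE, rally_scoring=True):.3f}")
```

With these assumed inputs, the comeback to 14-14 is about six times as likely under the old system (roughly 15% versus under 3%). Real rallies are not independent and teams differ in strength, so this is only an illustration of the direction of the effect.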

Friday, July 3, 2009

A while back, I joined a Yahoo discussion group called VolleyStats. As a result, I've been receiving e-mails from the group that fall into two categories: junk messages and serious reports of statistical analysis from someone named Leo van Hal. Because van Hal's reports are written in Dutch, I really didn't learn anything more about volleyball from them than I did from the junk e-mails.

That situation recently changed, however. I found a new Google application website (new to me, at least) called Google Translate, which is quite easy to use. You simply type (or copy and paste) text from the originating language into a box, select the "from" and "to" languages, and click! Because van Hal's reports often contain graphs, I don't copy and paste an entire paper into Google Translate; instead, I do it a few paragraphs at a time, avoiding the graphs. The translations aren't always perfect, sometimes leaving me with odd English constructions. With a little inference ("reading between the lines"), however, I'm confident that I'm picking up the key points.

One message I received was titled "[volleystats] Digest Number 447 [2 Attachments]," dated May 17, 2009. The two attached analyses were titled "Punten verdeling" (Distribution of points) and "Winst Kansen" (Profit Opportunities). Both reports have to do with testing different probabilities of teams' winning points on their serves, and how this affects the probability of different point totals in the games and teams' probabilities of winning the games. (In the English translations, the Dutch word "opslag" is rendered as "storage," but apparently "service" is another meaning; the passages will read much more easily if you substitute "service" for "storage.")

If you would like English-translated copies of these two van Hal reports, please e-mail me via my faculty webpage in the upper-right portion of this page.