Friday, October 23, 2009


Your intrepid VolleyMetrics correspondent was in Ann Arbor, Michigan last Saturday night for the match between Ohio State and U of M. Actually, I was in town to attend an academic conference and visit my graduate-school alma mater, and as a bonus the volleyball match fit my schedule.

The night before, the Wolverines had taken two-time defending NCAA champion Penn State to five games. I knew the Penn State match had been sold out, but when I got to the arena after flying all day Saturday, I was amazed to see the Ohio State match was too (notice the crowd-control grate in the lobby in the pictures above). I was on the outside looking in until "halftime" (between Games 2 and 3), when a number of seated spectators left and hangers-on were let in.

The statistical angle I pursued (starting with Game 3) followed up on my immediately prior posting (below), namely what happens on spike attempts where the hitter neither achieves a kill nor commits a hitting error (i.e., a "non-terminal" shot where the ball remains in play). If the non-kill hit attempt still renders the defense unable to launch its own return attack, then the original hit attempt will have achieved some measure of success. On the other hand, as I quoted Tristan Burton in my earlier posting, "An in-swing that the opponent converts for a kill is no different than a hitting error as far as the score is concerned."

I focused only on Michigan, and only two Wolverine hitters, Alex Hunt and Juliana Paz, received enough sets during Games 3 and 4 for me to compile statistics. While I was there, Hunt had 10 non-terminal hits (three in Game 3 and seven in Game 4); of these 10, the Wolverines and Buckeyes each ultimately won five of the points. Hunt, a left-hander hitting on her far left-hand side of the court, seemed to aim straight (i.e., down the sideline) a great deal of the time, as opposed to cross-court, and the Buckeyes dug her well. Paz had six non-terminal hits (two in Game 3 and four in Game 4), and OSU ultimately won four of these points.

Hunt hit .244 against Ohio State, racking up 15 kills and 5 hitting errors (for a net positive of 10) on 41 spike attempts (box score). Paz was in negative hitting territory for the evening (-.059), based on 8 kills, 10 errors, and 34 attempts. Looking at the Wolverines' seasonal statistics (through games of October 21, at this writing), Hunt's hitting percentage was only .215, based on 203 kills, 80 errors, and 573 attempts; clearly, a great many of her spike attempts remain in play. Paz does better at .281 (265/91/619), and even higher is Veronica Rood at .371 (142/33/294).
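For readers who want to check the arithmetic, here is a minimal Python sketch that recomputes the seasonal hitting percentages quoted above, using the usual (kills - errors) / attempts formula, and adds the share of swings that stayed in play. The function names and the `season` dictionary are my own labels for illustration; the kills/errors/attempts figures are the ones cited in this post.

```python
def hitting_pct(kills, errors, attempts):
    """Hitting percentage: (kills - errors) / total attempts."""
    return (kills - errors) / attempts


def in_play_share(kills, errors, attempts):
    """Fraction of swings that were neither a kill nor an error."""
    return (attempts - kills - errors) / attempts


# Seasonal lines (kills, errors, attempts) as quoted in the post.
season = {
    "Hunt": (203, 80, 573),
    "Paz": (265, 91, 619),
    "Rood": (142, 33, 294),
}

for name, line in season.items():
    print(f"{name}: hitting pct {hitting_pct(*line):.3f}, "
          f"in play {in_play_share(*line):.1%}")
```

By this count, 290 of Hunt's 573 seasonal swings (about half) were neither kills nor errors, which is the sense in which a great many of her attempts remain in play.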

Michigan has made the round of 16 in each of the last two years' NCAA tournaments. The Wolverines seem to have the potential to advance further this year, but to do so, they'll probably have to become more proficient at putting balls away.

Saturday, September 26, 2009

Over at the VolleyTalk discussion site, frequent contributor "P-Dub" raises an interesting question about hitting percentage, defined as: (kills-errors)/total attacks.

Player A: 6/3/15
Player B: 3/0/15

Both players have hit .200, but the first has done it with more kills and more errors. Which of these contributions is better?


To answer the question -- in theory, if not in practice -- P-Dub suggests looking at what the defensive team does with the balls the offensive team has neither put away (kills) nor failed to place in-bounds on the other side of the net (errors); in other words, what happens to the balls that remain in play?

For example, if a team is really good at converting opponents' non-kills into its own kills, then the aforementioned Player B's 3/0/15 line isn't good, because it gives the other team 12 opportunities to produce its own kills. This seems like a productive line of thinking, but it would be good to add some actual data to the debate.

The full discussion thread, which has now reached three pages, can be accessed here.

UPDATE (9/28): Tristan Burton, whose work has been cited before on this blog, sent me the following comment on evaluating hitting performances (with his permission to reproduce it).

I just saw your post about hitting efficiency. My paper defines "hitting effectiveness", which includes the outcome of any non-terminal swings by a hitter. So if I attack and the opponent digs me and then immediately gets a kill of their own then it counts against my hitting effectiveness. In English (instead of mathspeak), hitting effectiveness is hitting efficiency minus (the fraction of my swings on which the opponent gets their own attack) times (their hitting efficiency on those attacks). Usually, hitting effectiveness is lower than hitting efficiency and for those hitters who make a living just putting the ball in play it might be substantially lower (depending on whether or not the opponent is good at converting). I've looked at data for Pac-10 women where two OH's had virtually the same hitting efficiency but drastically different hitting effectiveness numbers because one of the players was aggressive and had a higher kill% and higher error% but the opponent could not easily convert her "in-swings" while the other player was putting more balls in play and the opponent was converting easily. There seems to be a focus in the volleyball community on minimizing errors but that's not what you are really trying to do, you're actually trying to increase the score difference between yourself and your opponent as much as possible with every swing (this is what hitting effectiveness actually tells you). An in-swing that the opponent converts for a kill is no different than a hitting error as far as the score is concerned. As with so many things in life, there's a risk vs. reward relationship that needs to be considered.
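To make Burton's verbal definition concrete, here is a rough Python sketch applied to P-Dub's Players A and B. It follows only the "In English" description above, not the formal definition in Burton's paper, which may differ in detail. Two inputs are purely hypothetical and chosen for illustration: that the opponent gets its own attack on every ball left in play, and that the opponent hits .250 on those attacks.

```python
def hitting_efficiency(kills, errors, attempts):
    """Conventional hitting percentage: (kills - errors) / attempts."""
    return (kills - errors) / attempts


def hitting_effectiveness(kills, errors, attempts, opp_attacks, opp_efficiency):
    """Efficiency minus (share of swings the opponent attacks off of)
    times (the opponent's hitting efficiency on those attacks)."""
    counter_share = opp_attacks / attempts
    return hitting_efficiency(kills, errors, attempts) - counter_share * opp_efficiency


OPP_EFFICIENCY = 0.250  # hypothetical opponent efficiency, not real data

players = {"A": (6, 3, 15), "B": (3, 0, 15)}
for name, (kills, errors, attempts) in players.items():
    in_play = attempts - kills - errors  # assumed: opponent attacks off every one
    eff = hitting_efficiency(kills, errors, attempts)
    effv = hitting_effectiveness(kills, errors, attempts, in_play, OPP_EFFICIENCY)
    print(f"Player {name}: efficiency {eff:.3f}, effectiveness {effv:.3f}")
```

Under these made-up inputs, the two players share a .200 efficiency, but Player A's effectiveness works out to .100 and Player B's to .000, because B leaves twice as many balls (12 versus 6) for the opponent to attack. A weaker-converting opponent would narrow the gap.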

Sunday, September 20, 2009

Yesterday was the home opener at my university, Texas Tech, as the Red Raiders hosted Texas A&M in Big 12 play. It was also the home debut for new Tech coach Trish Knight, who faces an enormous rebuilding job. Prior to Knight's arrival, Tech had lost 39 straight conference matches. After yesterday's 25-15, 25-11, 25-17 shellacking by the Aggies, the streak is now at 41.

With pencil, paper, and camera in hand, I decided to focus my statistical analysis yesterday on the serve-receipt success of Texas Tech's six rotations. I took the following picture (which you can click on to enlarge) during Game 3. We see that for the Red Raiders (near court), No. 11 (Amanda Dowdy) is front left, No. 4 (setter Caroline Witte) is front center, No. 13 (Barbara Conceicao) is front right, No. 1 (Hayley Ball) is back right, No. 10 (Aleah Hayes) is back center (her number doesn't show in the picture, but I got it from my notes), and No. 9 (libero Jenn [Harrell] Goehry) is back left. Once the ball is served, players can shift laterally; as shown in the photo, the setter Witte (No. 4) is getting ready to move to the right, to leave Conceicao (No. 13) in her natural position of middle-blocker.


Shown next is a chart of Tech's six rotations in Games 1 and 3 (the rotation with the court depicted in yellow is the one in the photograph). In a few cases, I was unsure about a uniform number and/or positioning, but I've re-created the rotations to be as logically coherent as possible (e.g., if a given player were in the front left position in one rotation, she should be in the front center position in the next rotation). Between the libero role and just ordinary substitution, charting a team's rotations was not nearly as easy as I thought it might be.


As it turns out, you don't really need advanced statistical methods to see which rotations did better or worse on serve-receipt in Game 1 (I did not keep statistics when Tech served, but by locating server names in the play-by-play sheet and consulting the Red Raider roster, one should be able to determine how successful the different rotations were on serve).

The ideal for a serve-receipt opportunity would, of course, be to have a successful First-Ball Attack (FBA). In other words, the served ball would be dug, set, and spiked for an immediate kill. Texas Tech's starting rotation (the top-most in the left-hand column) achieved this ideal both times it had the chance. Starting setter Karlyn Meyers (No. 3) was in the back row, meaning that she had three front-row attackers at her disposal.

The Raiders' weakest rotation was clearly the third one down (one of three rotations in which the setter is in the front row, thus leaving only two eligible front-row attackers). In this rotation, Tech exhibited just about every problem in the book. Mostly, the Red Raiders mounted an FBA where the hit was Not Put Away (NPA), leading to a rally that the Aggies eventually won. Tech also failed to get its FBA onto the Aggie side of the court inbounds (twice), sent over a free ball, and made an overpass.

I assume that all teams keep their own statistics of this type. Further, there appear to be computer software packages available to assist with such data-collection efforts (just do a Google search with keywords such as: computer software volleyball rotation).

Sunday, September 6, 2009

I recently discovered that the American Volleyball Coaches Association (AVCA) makes its bimonthly magazine, Coaching Volleyball, free online. Naturally, I reviewed the last several issues in search of any statistically oriented articles and I hit paydirt.

UC San Diego assistant men's coach Tristan Burton, who earned a Ph.D. in mechanical engineering from Stanford with a 2003 dissertation entitled "Fully Resolved Simulations of Particle-Turbulence Interaction," contributed an article to the latest (August/September 2009) issue of Coaching Volleyball.

The title of Burton's AVCA article says it all: "A Comprehensive Statistics System for Volleyball Match Analysis." Whether using the game (set) or match as the unit of analysis, the system decomposes the final total point difference between the teams into seven categories.

As a concrete example, Burton uses the 2008 Olympic men's semifinal between the U.S. and Russia. With the U.S. winning 25-22, 25-21, 25-27, 22-25, 15-13, the Americans garnered 112 total points to the Russians' 108. The Americans' +4 overall differential could then be broken down into the following components (where PD = Point Difference):

Service (SPD)
1st Ball Attack (1PD)
Transition Attack (TPD)
Opponent Terminal Serve (OTSPD)
Opponent Giveaway Transition Attack (OGTPD)
Opponent Block and Cover Transition Attack (OBACTPD)
Miscellaneous (MPD)

As Burton notes, "Given that the service line is not an advantageous location from which to attack, [one's own service performance] is usually a negative number, i.e. on average a team loses points when they serve" (p. 17). In the example Olympic match, the U.S. had an SPD of -51, whereas for Russia it was -58.

The Americans' seven component scores, listed in the same order in which the terms appear above, were -51, 40, 9, 15, -6, -3, and 0, which sums to +4 (corresponding to the U.S. team's winning four more total points in the match than the Russians, as detailed above). Russia's sum would naturally come out to -4 (-58, 35, 17, 11, -6, -3, 0). I have not explained how each of these component scores is obtained; these procedures are fairly complicated, so interested readers will need to look at Burton's original article to see how everything works.
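As a quick check that the seven components really do add up to the match point differential, here is a short Python sketch. The dictionary keys are Burton's abbreviations from the list above and the values are the ones quoted in this post; the variable names `usa` and `russia` are just my labels.

```python
# Component point differences, in the order listed in the post.
COMPONENTS = ["SPD", "1PD", "TPD", "OTSPD", "OGTPD", "OBACTPD", "MPD"]

usa = dict(zip(COMPONENTS, [-51, 40, 9, 15, -6, -3, 0]))
russia = dict(zip(COMPONENTS, [-58, 35, 17, 11, -6, -3, 0]))

# Total points from the set scores 25-22, 25-21, 25-27, 22-25, 15-13.
usa_points = 25 + 25 + 25 + 22 + 15
russia_points = 22 + 21 + 27 + 25 + 13

print("U.S. components sum to", sum(usa.values()))
print("Russia components sum to", sum(russia.values()))
print("Match point differential (U.S. minus Russia):", usa_points - russia_points)
```

The component sums (+4 and -4) match the 112-108 total-point margin, as they should if the seven categories partition every point played.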

As Burton advises, "In addition to looking at these statistics for the entire team, it is also possible to look at them for individual players or individual rotations in order to identify more specific areas for improvement" (p. 18).

Burton's system is not for the faint-of-heart. It requires extensive manual record-keeping during a match and the use of computer software to calculate the various parameters. The article has so many variables and abbreviations that it will almost certainly leave any reader's head spinning (it did mine, and as a professor who teaches statistics, I'm usually quite comfortable with numbers and formulas).

Another potential use of Burton's article would be to select a few relatively straightforward tabulations to use for one's team, instead of immersing oneself in the full system. One statistic in the article that caught my eye is the following: "Russia was able to respond to slightly more (73.9% vs. 73.4%) serves with a 1st ball attack" (p. 17). I would have thought such elite teams would have more of a tendency to mount an attack directly off of serve receipt, but by the same token, I guess, elite teams would also be delivering a lot of tough serves!

ADDENDUM/CLARIFICATION: Dr. Burton and I have exchanged e-mails, in an attempt to clarify the statistic in the paragraph immediately above regarding teams' mounting a 1st ball attack only around 73-74% of the time. These figures include opponents' serving errors as non-1st ball attacks. Dr. Burton was kind enough to run some new numbers for readers of the blog. Limiting the situation to when a receiving team faced an in-play serve, how often did the receiving team successfully set up a spike attempt as a first response, as opposed to being aced or sending a feeble (i.e., freeball) response back to the serving team? The answer is generally around 90%, both from some Olympic men's and Pac-10 women's matches Dr. Burton analyzed.

Saturday, July 25, 2009

In the latest issue of the online Journal of Quantitative Analysis in Sports, Balazs Kovacs presents an article entitled "The Effect of the Scoring System Changes in Volleyball: A Model and an Empirical Test" (the journal requires subscriptions, but free guest privileges are available). The article focuses on the change, implemented about a decade ago in many different levels of volleyball competition, from server-only scoring (with side-outs) to rally scoring.

Back when only the serving team could score, matches could drag on indefinitely if the receiving team kept winning rallies (i.e., siding-out); several plays would go by and the score would remain unchanged. Rally scoring was not necessarily adopted to make matches end more quickly, as the number of points needed to win a set (also known as a game) was increased from 15 to either 25 or 30 (depending on league, and except for fifth games of a match) when rally scoring was introduced. Rather, the change was intended to narrow the range of how long matches took to play (by eliminating the kinds of long scoreless periods alluded to), which could be helpful for television programming.

Kovacs provided a number of computer simulations of matches, but also presented analyses from actual games (using women's play in the NCAA Division II Northern Sun Intercollegiate Conference), before and after the switch to rally scoring. The key results, comparing server-only to rally scoring, were as follows: "the average match length increased from 92.5 minutes to 99.8 minutes. The variance of the match length has decreased from 27.82 in 2000 to 22.56 in 2001" (p. 8). For readers more familiar with the standard deviation as a measure of spread, the variance is simply the squared SD.
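For anyone who wants the spread in more familiar units, the conversion is a one-liner in Python. This sketch assumes the quoted variances are in squared minutes, which the quoted passage does not state outright; the variable names are mine.

```python
import math

# Variances of match length as quoted from Kovacs (p. 8).
variance_2000 = 27.82  # last season of server-only scoring
variance_2001 = 22.56  # first season of rally scoring

# The standard deviation is the square root of the variance.
sd_2000 = math.sqrt(variance_2000)
sd_2001 = math.sqrt(variance_2001)

print(f"SD in 2000: {sd_2000:.2f}")
print(f"SD in 2001: {sd_2001:.2f}")
```

The square roots come to roughly 5.27 and 4.75, so on the SD scale the drop in spread looks more modest than the raw variances suggest.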

Thus, from this one conference at least, rally scoring appeared to accomplish its aim of providing more regularity to the length of matches. Going beyond the scope of Kovacs's article, a concern I've always had about rally scoring is that it may impair teams' ability to stage comebacks. Hypothetically, take a team that's serving while trailing 14-10 in a game to 15. Under the old system, the trailing team could not lose, as long as it was serving. The leading team would first have to win a side-out (which receiving teams are well positioned to do, as they have first crack at running their offense) and then win a point on serve (which is harder, for the same reason). Under rally scoring, however, the leading team could win the game merely with a side-out.
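The 14-10 hypothetical can be put into rough numbers with a small Python sketch. This is my own back-of-the-envelope model, not anything from Kovacs's article. It assumes every rally is independent, that the receiving team wins any given rally with a fixed probability (`sideout_prob`, set to a hypothetical 0.6 below), and that the first team to 15 wins with no win-by-two requirement.

```python
from functools import lru_cache

TARGET = 15  # points to win; win-by-two is ignored (simplifying assumption)


def comeback_probability(trail, lead, sideout_prob, rally_scoring):
    """Chance that the trailing team, which is serving, wins the game.

    sideout_prob is the probability that the receiving team wins a rally.
    """
    p = sideout_prob

    @lru_cache(maxsize=None)
    def win(t, l, t_serving):
        # Probability the trailing team wins from score t-l.
        if t >= TARGET:
            return 1.0
        if l >= TARGET:
            return 0.0
        if rally_scoring:
            # Every rally produces a point; the rally winner serves next.
            if t_serving:
                return (1 - p) * win(t + 1, l, True) + p * win(t, l + 1, False)
            return p * win(t + 1, l, True) + (1 - p) * win(t, l + 1, False)
        # Side-out scoring: only the server scores. Serve can change hands
        # repeatedly with no score change, so solve the two-equation loop
        # in closed form rather than recursing forever.
        x = win(t + 1, l, True)   # trailing team scores next (on its serve)
        y = win(t, l + 1, False)  # leading team scores next (on its serve)
        if t_serving:
            return (x + p * y) / (1 + p)
        return (p * x + y) / (1 + p)

    return win(trail, lead, True)


old = comeback_probability(10, 14, sideout_prob=0.6, rally_scoring=False)
new = comeback_probability(10, 14, sideout_prob=0.6, rally_scoring=True)
print(f"Side-out scoring: {old:.4f}")
print(f"Rally scoring:    {new:.4f}")
```

With these assumptions, the trailing team's chances are about 9.5% under side-out scoring (0.625 to the fifth power) versus about 1% under rally scoring (0.4 to the fifth power). The exact figures depend entirely on the assumed side-out rate, but the direction of the gap is the point.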

Friday, July 3, 2009

A while back, I joined a Yahoo discussion group called VolleyStats. As a result, I've been receiving e-mails from the group that fall into two categories: junk messages and serious reports of statistical analysis from someone named Leo van Hal. Because van Hal's reports are written in Dutch, I really didn't learn anything more about volleyball from them than I did from the junk e-mails.

That situation recently changed, however. I found a new Google application website (new to me, at least) called Google Translate, which is quite easy to use. You simply type (or copy and paste) text from the originating language into a box, select the "from" and "to" languages, and click! Van Hal's reports often contain graphs, so rather than copy and paste his entire paper into Google Translate, I do it a few paragraphs at a time, avoiding the graphs. The translations aren't always perfect, sometimes leaving me with odd English constructions. With a little inference ("reading between the lines"), however, I'm confident that I'm picking up the key points.

One message I received was titled "[volleystats] Digest Number 447 [2 Attachments]," dated May 17, 2009. The two attached analyses were titled "Punten verdeling" (Distribution of points) and "Winst Kansen" (Profit Opportunities). Both reports have to do with testing different probabilities of teams' winning points on their serves, and how this affects the probability of different point totals in the games and teams' probabilities of winning the games. (In the English translations, the Dutch word "opslag" appears in English as "storage," but apparently "service" is another synonym; the passages will read much more easily in English if you substitute "service" for "storage.")

If you would like English-translated copies of these two van Hal reports, please e-mail me via my faculty webpage in the upper-right portion of this page.

Sunday, May 10, 2009

As most people looking at this blog would already be aware, the University of California, Irvine (UCI) defeated the University of Southern California (USC) in an exciting five-game NCAA men's championship match last night. Adding to the drama and excitement was the story of the Trojans' late-season turnaround, from a fifth-place finish in the Mountain Pacific Sports Federation, to a juggernaut that swept through the MPSF postseason tournament to capture an automatic bid in the four-team NCAA tournament and overpowered Penn State in the national semifinals.

The reason for USC's recent success was no secret -- it was killer hitting, especially by 6-foot-8 sophomore Murphy Troy. Whether Troy was in the front row or back row (where a player can still hit, as long as he takes his jump from behind the 10-foot line), USC would frequently set the ball to him, and Troy would deliver. Troy's hitting statistics in SC's last five matches, in terms of number of kills and hitting percentage, were as follows: 28/.380 vs. Stanford, MPSF; 19/.600 vs. UCI, MPSF; 23/.220 vs. Pepperdine, MPSF; 24/.595 vs. Penn State, NCAA; and 26/.367 vs. UCI, NCAA.

Among the tactics an opposing team could try against a hot-hitting team, two would be tough serving (to disrupt the passing to the setter, and perhaps the sets to the hitters) and finding a way to block better than you ever have, by getting two (or even three) players up against the opponent's top hitters.

As shown in the following chart of key statistics from USC's postseason matches (which you can click to enlarge), the high rate of service errors by Trojan opponents suggests that they may have been opting for the aggressive serving tactic. Penn State, with 21 such errors in only four games, stands out in this regard (although the elevated altitude in Provo, Utah, which would cause the ball to carry farther, also could have been a factor).


In the national championship match, UCI had only 11 service errors (very low, considering the match went five games). More importantly, though, the Anteaters greatly increased their blocking productivity.

I've read several online articles this afternoon about last night's match, but I haven't been able to find any quotes from UCI Coach John Speraw regarding strategies he employed so that his team's block could be so effective against USC. The best I could find was: "We did a much better job of taking away their tendencies" (from a list of postgame quotes on the UCI athletics site). Comments from any "X's and O's" people would be welcome!