Sunday, December 21, 2008

Penn State Women Complete Undefeated Season with Title-Match Sweep of Stanford

As virtually all college-volleyball fans would know by now, Penn State has successfully defended its NCAA women's title and gone undefeated (38-0) in the process, sweeping Stanford in three games in the final. The scores were 25-20, 26-24, and 25-23. Here are a few brief observations on the match...

Penn State outhit Stanford in Games 1 (.257-.167) and 3 (.196-.109). The Cardinal outhit the Nittany Lions in Game 2, .159-.102. As I showed in analyses of the earlier rounds, a team will usually win the game when it outhits its opponent by this much. In this instance, though, it didn't turn out that way for Stanford.

On the ESPN2 telecast of the final, color commentator Karch Kiraly added a statistical flavor to the proceedings, with his periodic evaluations of the teams' serve-receipt/passing effectiveness on a 0-3 scale.

Happy holidays to everyone! VolleyMetrics will be back in the new year to focus on men's college volleyball.

Saturday, December 20, 2008

Pre-Match Analysis of Penn State-Stanford Women's Final

With tonight's NCAA championship match between Penn State and Stanford just hours away, I wanted to provide some pregame statistical analysis. In the aftermath of Penn State's dramatic semifinal victory over Nebraska, one of the most frequent observations among discussants at the VolleyTalk site was how the Nittany Lions appeared to stray from attacking the middle against the Cornhuskers. In order to break things down scientifically, I created the following graph based on Penn State's NCAA tournament matches thus far this season.

Indeed, it appears that the Nittany Lions' three middle-hitters (Christa Harmotto, Blair Brown, and Arielle Wilson) have been getting a declining proportion of the team's hitting attempts in recent matches.

Meanwhile, outside-hitter Nicole Fawcett has, to an increasing extent, been getting the "lion's share" of the hit attempts. Penn State's other main outside-hitter, Megan Hodge, has consistently been getting between 25% and 35% of the team's hitting attempts in the tournament; thus, the degree to which she was set against the Cornhuskers was well within her normal range.

From Stanford's perspective, one of the main ways in which the Cardinal has succeeded this season is by limiting opponents' offensive prowess. When that fails, however, Stanford seems to be able to lift its own offense to a higher level. This game article from Stanford's semifinal victory over Texas notes that, "The Longhorns were the first team all year to hit above .300 against the Cardinal..."

In fact, even with the Longhorns hitting .438 and .381, respectively, in Games 4 and 5, the Cardinal was able to prevail in both games by hitting an astronomical .439 and .500 (box score). As discussed in the article, it was Stanford's "Big Three" doing the damage:

It took all that Stanford's three All-American hitters could muster, but in the end Alix Klineman, Cynthia Barboza and Foluke Akinradewo played one of the best matches of their careers. Klineman paced the team with 20 kills, while Akinradewo slammed 17 on .452 hitting and provided six critical blocks. It was Barboza, however, who stole the show in the improbable comeback, recording 15 of her 19 kills in the final three sets.

In conclusion, Stanford appears to have two options: slowing down Penn State's offensive attack or, failing that, prevailing in a slugfest. Neither seems too likely to me.

Wednesday, December 17, 2008

Measuring Serving Effectiveness: The Length of Average Serving Stint (LASS)

For the last few weeks, I've been trying to think of new ways to measure serving effectiveness. Box scores and statistical summaries typically report only service aces and errors (example). My concerns are that aces occur infrequently (limiting their statistical usefulness), and that focusing on aces does not take into account how even serves that are picked up by the receiving team can still be advantageous to the serving team (e.g., by preventing the receiving team from setting up its top available hitting threat).

Alternatively, one can obtain detailed statistics by observing and classifying the receiving outcomes of serves into micro-level categories, such as whether a serve disrupted the receiving team's ability to "mak[e] the first tempo attack," as reported in this article. If a team has the staffpower resources to record such statistics, that's great, but not everyone can.

What I've been conceiving of, therefore, is some sort of "middle ground" statistic -- something easily derivable from online play-by-play sheets (as can be accessed, for example, via this NCAA interactive bracket by clicking on particular games), but that goes beyond just service aces and errors. What I've come up with is the Length of Average Serving Stint or LASS.

The longer a serving stint, the more points the serving team is racking up. If a stint lasts for one serve, the serving team has not received a point (i.e., the other team has sided-out). If a serving stint lasts two serves, the serving team has garnered one point. If a stint lasts three serves, the serving team has accumulated two points, etc. Thus, longer serving stints appear to capture -- indirectly, at least -- effective serving.
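For readers who would like to tally stints by computer rather than by hand, here is a minimal sketch in Python. It assumes the play-by-play has already been reduced to a rally-by-rally list of who served; the player labels and the sample list are made up for illustration, and the function names are my own, not part of any official scoring software.

```python
from itertools import groupby

def serving_stints(servers):
    """Collapse a rally-by-rally list of servers into (server, stint_length) pairs."""
    return [(name, sum(1 for _ in run)) for name, run in groupby(servers)]

def lass(servers, player):
    """Length of Average Serving Stint: the mean length of one player's stints."""
    lengths = [n for name, n in serving_stints(servers) if name == player]
    return sum(lengths) / len(lengths) if lengths else 0.0

def points_from_stint(length):
    """A stint of k serves yields k - 1 points, since the last serve is the side-out.

    Caveat: a stint that ends because the game ended is not covered by this rule.
    """
    return max(length - 1, 0)

# Hypothetical sequence: "A" and "B" are teammates, "X" is any opposing server.
rallies = ["A", "A", "A", "X", "B", "X", "X", "A", "A"]
stints = serving_stints(rallies)
lass_a = lass(rallies, "A")
```

In this made-up sequence, Player A has stints of three and two serves, for a LASS of 2.5.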

Before anyone starts sending me e-mails of complaint, I am aware that the identity of a server is systematically connected to (or "confounded" with) the serving team's front-court line-up, due to the rotation. Thus, if Player A tends to have long serving stints, some (or even most) of the credit might be due to the team's having Players B, C, and D in the front court, rather than Player A's vicious serving. I never claimed that my new statistic was perfect!

Also, I suspect that many coaches already chart their teams' success at winning points and siding-out, by rotation, which is very similar to my scheme. The difference would just be a matter of focus, as I'm interested in who is serving.

What the LASS does have going for it, however, is relative ease of compilation. One can simply look at a play-by-play sheet and see how many plays in a row somebody served. A sample chart of LASS statistics is shown below, for the University of Texas in its recent NCAA Elite Eight match-up against Iowa St. Shown in each box is the length of a given serving stint; going down the first column shows you each player's first stint (in the order they served), the second column shows each player's second stint, etc. You can click on these graphics to enlarge them.

I've gone ahead and calculated LASS statistics for all regular players from the Final Four teams that will be playing Thursday night, based on each team's two games in last weekend's regionals.

If one were going to adopt the LASS, it would be best to use a much larger database than just two matches; my initial calculations were purely for illustrative purposes.

I'm interested in what readers think are the pros and cons of the LASS. I invite you to use the Comments section to provide feedback. Now enjoy the Final Four!

Thursday, December 11, 2008

Looking Back on Opening Weekend (First Two Rounds) of 2008 NCAA Women's Tourney

I've put together a bunch of statistics on last weekend's opening two rounds of play in the NCAA Division I women's volleyball tournament. Forty-eight matches were played (32 in the first round and 16 in the second), comprising roughly three-fourths of the tournament's total matches (63 are played, in all).

In these 48 matches, 179 total games (sets) were played. The type of result (i.e., sweeps, four- and five-game matches) broke down as follows:

3-0: 24
3-1: 13
3-2: 11
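As a quick consistency check, the breakdown above can be tallied to confirm the match and game totals. This is just a sketch of the arithmetic, with variable names of my own choosing.

```python
# Number of matches ending with each (games won, games lost) result, from the list above.
results = {(3, 0): 24, (3, 1): 13, (3, 2): 11}

total_matches = sum(results.values())
total_games = sum((won + lost) * count for (won, lost), count in results.items())
```

The totals come out to 48 matches and 179 games, matching the figures quoted above.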

The closeness of many matches is illustrated by looking more closely at the five-game tilts. Five were decided by the minimum two points, another three were decided by the score of 15-12, and only three were decided by 5 or more points.

Regular readers of this site know that I consider hitting percentage to be a very important statistic. For each of the 179 individual games played over the first weekend, I examined each team's hitting percentage in relation to who won the game. In only 19 games (11%) did the lower-hitting team win the game.

The following chart shows the relationship between the margin by which the higher hitting team in a game outhit the lower hitting team (horizontal axis) and the probability of the higher hitting team winning the game (vertical axis).

Starting at the left, when a team outhit its opponent by a very small amount (.001-.049), it had about a 62% chance of winning the game (16/26, which is not significantly above a 50/50 chance probability). If a team outhit its opponent by a somewhat larger margin (.050-.099), it had a 78% chance of winning the game (25/32, which is significantly beyond chance).

The remaining bars in the graph tell us that, if one team's hitting percentage in a game is greater than its opponent's by .100 or more, the higher-hitting team was virtually certain to win the game. In fact, from this point onward, there were only two cases (out of 121 possible) where a team outhit its opponent and lost.
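For anyone who wants to check the "beyond chance" statements, here is a rough sketch using an exact two-sided binomial test against a 50/50 null. That choice of test is my assumption for illustration; other tests would give somewhat different p-values. The counts are the ones reported above (16 of 26, 25 of 32, and 119 of 121).

```python
from math import comb

def binom_two_sided_p(wins, n):
    """Exact two-sided p-value against a 50/50 null.

    The null distribution is symmetric, so the upper tail is simply doubled.
    """
    k = max(wins, n - wins)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)

# (games won by the higher-hitting team, games in the bin), from the text above.
bins = {".001-.049": (16, 26), ".050-.099": (25, 32), ".100+": (119, 121)}

win_rates = {label: wins / n for label, (wins, n) in bins.items()}
p_values = {label: binom_two_sided_p(wins, n) for label, (wins, n) in bins.items()}
```

Under this test, the smallest margin bin is not distinguishable from a coin flip at the conventional .05 level, whereas the two larger bins are.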

One of these instances occurred in Game 1 of the Illinois-Cincinnati match in the second round. Cincinnati recorded the better hitting percentage (.294 vs. .189, a difference of .105), but Illinois prevailed 26-24.

An even more extreme anomaly occurred in Game 3 of the second-round match between Florida and Colorado State. The Gators were victorious, 25-23, despite being outhit by the substantial margin of .226 (UF .107, CSU .333). For this one, I had to see what happened, so I consulted the online play-by-play sheet. The apparent reason why the Rams lost this game despite a hefty hitting advantage is that they made EIGHT service errors.

I also looked at some miscellaneous hitting statistics. Two teams stood out as super-consistent in particular matches, their game-to-game hitting percentages staying within a band of .100 throughout five games.

In a first-round win over San Francisco, Duke recorded the following hitting percentages in the five games: .255, .243, .243, .182, and .273 (box score).

Also in the opening round, Purdue hit for the following percentages in defeating Louisville: .333, .290, .333, .321, and .294 (box score).

I hope these statistics will give you something to think about as you await the next round, beginning Friday.

Wednesday, December 3, 2008

Kansas State at Texas Tech Regular-Season Finale

As we await the start of the women's NCAA tournament, I thought I'd share some photos from last Saturday night's match featuring Kansas State at Texas Tech (you can click on the collage to enlarge it).

It was Senior Night for three Red Raider players, Michelle Flores, Brandi Hood, and Amanda Sbragia. Sadly for the seniors (and the broader Texas Tech volleyball community), the team lost to Kansas State to finish 0-20 in Big 12 play this season; going back to last season, Tech has lost 39 straight conference matches. Not surprisingly, Saturday night's match was the last for Coach Nancy Todd, ending her six-year stint with the Raiders.

Shown on top are some shots I took during the warm-ups (I wouldn't want to risk distracting the players with flash photography during actual game action). In the lower right-hand corner, your trusty analyst is shown, serving the ball during a contest between Games 2 and 3 that was open to all members of the audience. A dozen or so pizza boxes are placed on the court on the other side of the net, and anyone whose serve lands on a box wins a pizza or other prize. I didn't hit a pizza box, but I'm proud to say my serve landed in-bounds!

The season isn't over for Kansas State, however, as the Wildcats will be participating in the NCAA tourney.

Monday, December 1, 2008

Correlation Between Seed Number and Making the NCAA Women's Sweet Sixteen

The brackets have been announced for this year's women's NCAA Division I championships. A couple of matches will be played Thursday, but most of the action in the 64-team field gets underway Friday. At each of this weekend's sites, the second round will be played the night after the first-round matches. Tonight, and during the next three weeks, VolleyMetrics will be exploring various statistical aspects of the NCAA's "December Madness."

Tonight, let's start with something very basic, namely the record of No. 1-through-No. 16-seeded teams over the past five tournaments (2003-2007) in making the Sweet Sixteen. That is, of course, the immediate goal of all teams playing this weekend. Unlike the NCAA basketball brackets in which all 64 teams are seeded (i.e., each of the four regions has its teams seeded 1-16), the women's volleyball bracket only seeds 16 teams (explicitly) total.

Thus, for example, among the 16 teams vying to make the four-team regional to be hosted by Penn State, one can see teams labeled with the No. 1, 8, 9, and 16 seeds nationally, and the remaining 12 teams have no seed number by their names. Each of the other three regions also has four seeded teams. The NCAA committee may well, of course, have ranked all 64 teams so that the No. 1 seed gets easier early-round opposition than does, say, the No. 16 seed, but such rankings are not shown explicitly in the brackets released publicly.

At this point, it should be mentioned that all of the 16 explicitly seeded teams are expected to make the "Sweet Sixteen" (i.e., the four four-team regionals). But do things actually work out that way? As stated above, I have examined the past five years' tournaments to look at whether highly seeded teams have a better track record of making the Sweet Sixteen (regionals) than do lower seeds, among the 16 seeded teams. To increase the sample sizes and thus reduce chance fluctuation, I have grouped together the No. 1-through-4 seeds, 5-8 seeds, 9-12 seeds, and 13-16 seeds. Here are the results...

Teams seeded No. 1-4 are a perfect 20-of-20 in making the Sweet Sixteen during the past five years.

Teams seeded 5-8 have had a little bit of turbulence, successfully making the Sweet Sixteen 17-of-20 times (85%). Two of the exceptions occurred a year ago, with No. 6 Washington losing in the second round (round of 32) to BYU, and No. 7 Wisconsin losing in the same round to Iowa St. In 2005, it was No. 5 Stanford losing in the second round to Santa Clara.

Likewise, teams seeded 9-12 have had a 17-of-20 success rate. Last year's wacky brackets also saw the premature exits of No. 9 Kansas St. (to Oregon in the second round) and No. 11 Hawaii (to Middle Tennessee in the second round). The third upset involved No. 12 USC losing to Pepperdine in 2005, again in the second round.

Finally, as might be expected, teams seeded 13-16 have the lowest success rate of making the regionals, 14-of-20 (70%). I won't bother to list all six of the upset losers.
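The success rates for the four seed groups can be recomputed with a few lines of Python. This is only a sketch of the arithmetic, using the counts given above; the dictionary and variable names are mine.

```python
# Sweet Sixteen appearances per seed group, 2003-2007, from the counts above.
made_sweet_16 = {"1-4": 20, "5-8": 17, "9-12": 17, "13-16": 14}
CHANCES = 20  # four seeds per group times five tournaments

rates = {group: made / CHANCES for group, made in made_sweet_16.items()}
total_upsets = sum(CHANCES - made for made in made_sweet_16.values())
```

Across the five tournaments, that works out to 12 seeded teams in all that failed to reach the regionals.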

Interestingly, only one year of the last five, 2003, has seen all 16 national seeds move on to the Sweet Sixteen. In 2006, 15 of the 16 did. It thus seems likely that at least one of this year's 16 national seeds will be gone by this weekend. Whether things will be as wild as last year, when six of the 16 seeds failed to make it to regionals, remains to be seen.

Sunday, October 26, 2008

Side-Out Statistics

Getting a side-out (or siding-out) refers to winning a rally on the opponent's serve. Back when the rules specified that a team could score a point only on its own serve, the importance of a side-out was that it earned a team the right to serve. Then, with the switch several years ago to rally scoring -- where the winning team of each rally earns a point, regardless of who served -- siding out became worth an immediate point, in addition to earning the right to serve.

Aside from the scoring aspect, success at siding-out can also be seen as a marker of a team's proficiency at running its offense, which the receiving team gets the first chance to do. Side-out rate can tell us how well, by and large, a team receives serve and passes the ball to the setter so that he or she can make a good set to the chosen hitter, and with what success the team's hitters put the ball away. I say "by and large," because other factors will affect a team's side-out rate, such as the other team's rate of service errors and the receiving team's ability to win long rallies beyond the initial attempted attack.
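To make the definition concrete, here is a minimal sketch of how a side-out rate could be computed from a play-by-play sheet. It assumes each rally has been recorded as a (server, winner) pair; the team labels and the six-rally sample are hypothetical, not taken from an actual match.

```python
def side_out_rate(rallies, team):
    """Share of rallies on the opponent's serve that `team` won.

    `rallies` is a list of (server, winner) pairs, one pair per rally.
    """
    received = [winner for server, winner in rallies if server != team]
    if not received:
        return 0.0
    return sum(1 for winner in received if winner == team) / len(received)

# Hypothetical sequence; the winner of each rally serves the next one.
sample = [
    ("MICH", "OPP"),   # OPP sides out
    ("OPP", "OPP"),    # OPP scores on its own serve
    ("OPP", "MICH"),   # MICH sides out
    ("MICH", "MICH"),  # MICH scores on its own serve
    ("MICH", "OPP"),   # OPP sides out
    ("OPP", "MICH"),   # MICH sides out
]
mich_rate = side_out_rate(sample, "MICH")
opp_rate = side_out_rate(sample, "OPP")
```

In this made-up snippet, each team receives serve three times and sides out twice, so both rates equal two-thirds.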

Further, looking at the side-out success achieved by one's opponents may shed light on the effectiveness of a focal team's ability to block against the opponents' serve-receipt attack.

Tracking teams' side-out rates (and the side-out rates they allow their opponents) may therefore provide a valuable perspective in performance evaluation. I have gone ahead and charted the side-out statistics for one team in particular, the University of Michigan. I have been a Wolverine volleyball fan since my graduate-school days at the U-of-M in the mid-late 1980s, though in recent decades I've been following the team much more over the Internet than in person. With Coach Mark Rosen's squad having just completed the first half of its Big Ten schedule with a 6-4 record (18-4 overall at this point), now seemed a good time to probe the matter of side-out rates (click here for the Wolverines' schedule/results page).

In the plot below (which you can click to enlarge), each W or L indicates whether or not the Wolverines were victorious in a particular game (also known as a set), with the W/L located at the intersection of Michigan's side-out success rate and the opponent's. The color codes represent the opponent. Of necessity, the team that achieves a higher side-out rate in a game will win the game.

Various nuggets can be gleaned from this grid. For example, when Michigan has sided-out with 65-69% success, it has won 6 of 7 games (the only exception being when Wisconsin sided out at a 70% clip to Michigan's 65% in Game 3 of the teams' match in Madison).

If you would like to conduct a similar analysis for your own favorite team, please e-mail me via my faculty webpage (see link in the upper-right corner) and I'll send you my PowerPoint template.

Sunday, October 5, 2008

Pac-10 Competitive Balance Increases

Competitive balance continues to grow within Pac-10 women's volleyball. Stanford, Washington, UCLA, and USC have been national powers over the last several seasons and beyond. Cal made the Final Four last season and is doing well this season, and Arizona has made some noise in the past.

This weekend, the two Oregon schools gave notice that they shouldn't be overlooked, either. The University of Oregon knocked off both UCLA (Friday) and USC (Saturday) in Eugene, allowing only a single game (or as they now call it, "set") in the two matches combined. Oregon State, playing in Corvallis, likewise beat USC, but lost to UCLA, albeit in five games.

What really caught my attention for purposes of this blog, however, is the statistical inclination of the person who writes about volleyball for the UO's athletics website. As seen in this article on the Oregon-USC match, the writer zeroes in on the huge difference between the teams' hitting percentages and uses comparative statistics from the rest of the season to put last night's figures in perspective.

Monday, September 15, 2008

What Predicts Early-Season Poll Rankings in Women's College Volleyball?

The women's college volleyball season has now been going on for a few weeks, so it's time to jump in with some statistical analysis. To mark the occasion, I've scanned some schedule posters I've collected and displayed in my office over the years and edited them into a collage. I hope you like it!

Competition thus far has been exclusively of the nonconference variety, so lacking conference standings, we have only the national polls to judge which teams are doing well.

Today's entry seeks to get inside the heads -- indirectly, of course -- of voters in the September 8 poll of the American Volleyball Coaches Association. The poll presents a Top 25, but also reports voting points (i.e., 25 points for a first-place vote, 24 for second, etc.) for additional teams ("Others Receiving Votes and appearing on two or more ballots").

We thus have voting point totals for 37 teams, from No. 1 Penn State's 1,500 points down to unranked Georgia Tech and Arizona, each with 3 points. What factors might voters be using when they submit their ballots? Using the technique of multiple regression, I examined how well the poll rankings could be reproduced from knowledge of three factors:

*Success in last year's NCAA tournament (1 point for each match won, from 0 for a team that lost in the first round to 6 for the championship team, Penn State; teams that did not make the tournament last year received a -1). Many voters may subscribe to the idea that a championship team (or other historical powerhouses) should continue to be ranked highly until displaced by other teams. Using last year's NCAA success as a predictor reflects, in part, this philosophy.

*Number of returning starters. To the extent that a team has solid, experienced players, that is a plus. I should note that on the volleyball pages of many schools' athletic websites, the reported number of returning starters was not that easy to interpret. Whether a team's libero should be counted (I did not count them) and what to do about teams that may have had more than six players start last year are the main ambiguities. One policy I adopted was not to give a team a value greater than six returning starters.

*Number of wins vs. Top 25 opponents this current season. One obvious way to attain (or retain) a high position in the rankings is to play against and defeat top competition. This measure is somewhat imprecise, as a win over the No. 5 club (for example) is treated the same as a win over the No. 23 squad. Still, wins over Top 25 teams constitute a simple measure, whose usefulness will be determined by the analysis. Anecdotally, St. Louis University's win over then-No. 3 Stanford has propelled the Billikens from 7 voting points in the September 1 AVCA poll to 134 points and just outside the Top 25 in the September 8 poll.

Multiple regression provides several pieces of information. First, the three independent variables of 2007 NCAA wins, number of returning starters, and number of 2008 wins over Top 25 opponents, collectively accounted for 70% of what's known as the variance in the dependent variable of voter points in the September 8, 2008 AVCA poll (R-square = .699, adjusted R-square = .672). In other words, to the extent the teams vary in their vote totals, from Penn State down to Georgia Tech and Arizona, 70% of the amount of difference can be accounted for by the present statistical model.
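For readers curious how the adjusted R-square relates to the unadjusted one, the standard adjustment penalizes for the number of predictors relative to the number of cases. A minimal sketch, plugging in the 37 teams and three predictors from this analysis (the function name is my own):

```python
def adjusted_r_squared(r_squared, n_cases, n_predictors):
    """Standard adjustment: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r_squared) * (n_cases - 1) / (n_cases - n_predictors - 1)

# Values from the text: R-square = .699, 37 teams, 3 independent variables.
adj = adjusted_r_squared(0.699, 37, 3)
```

Rounded to three decimals, this reproduces the .672 reported above.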

We can also look at how useful each of the independent variables was individually in helping reproduce the points in the ranking poll. Two variables were statistically significant (i.e., substantial enough in their relationship to ranking points that the results would be unlikely to be due to chance, with p < .001 in both cases).

Because of the small sample size and crudeness of some of the measures, I wouldn't take the following numbers from the regression equation overly seriously, but here they are. For each round a team advanced in last year's NCAA tournament, it would receive about 157 extra points in the ranking-poll voting. Further, for each win over a Top 25 opponent, a team would receive an added 285 points. For those readers with a background in regression analysis, the standardized Betas were .547 for 2007 NCAA tournament matches won, and .460 for number of 2008 wins against Top 25 opponents.

Number of returning starters was not a significant predictor, so we cannot reject the null hypothesis of zero impact for that factor. Because most of the teams returned most or all of last year's starters, there was little variation on this item, which reduces its predictiveness.

If you're a fan of a particular team, there's nothing you can do about last year's NCAA tournament. However, with all appropriate cautions about drawing causal inferences, you should try to root your team on to victories over Top 25 opponents over the next couple of months, as that is likely to move your team up in the national rankings and keep it there.

Monday, August 18, 2008

Men's 2008 Olympic (Indoor) Pool Play

I've added a chart (on which you can click to enlarge) providing a statistical summary of the just-concluded men's indoor Olympic pool play, to go along with the one for women's pool play (entry immediately below).

Complicating matters, there was a three-way tie for first place in men's Group B. Looking at Group A, hitting and blocking appeared to be the key performance indicators. The higher a team finished in the win-loss standings, the better it tended to do in hitting and blocking (deviations from a perfect one-to-one relationship are shown in color fonts).

It should also be acknowledged that the meanings of some of the statistics are ambiguous. One example is digging. As I noted in a posting last October, the AVCA definition of a dig is "when a defensive player keeps a bona fide attack in play with a pass." Digging an opponent's spike attempt is good, but it may reflect some weakness in the defensive team's blocking, as balls would be getting through to the backcourt.

Sunday, August 17, 2008

Women's 2008 Olympic (Indoor) Pool Play

With pool play now complete in the Olympic women's (indoor) volleyball competition, I've created a table (below) to let readers see how the final standings in each of the two pools (first through sixth place) track with how the teams have ranked thus far (before medal play) in six statistical performance areas. (You can click on the table to enlarge it.)

A perfect correlation is represented by all of the numerals appearing in black font, such as Group B's rankings on hitting percentage. Brazil, which finished first in its pool with a 5-0 record, also had the highest team hitting percentage in the pool; Italy (4-1), which finished second in the pool, also had the second-highest hitting percentage; the trend continued all the way down to sixth-place finisher Algeria, which was also sixth in hitting percentage.

Discrepancies are shown in color fonts (the specific colors don't really mean anything, they're just used to draw attention to the numbers). For Group A's win-loss standings and hitting percentages, the discrepancies were relatively minor. As shown in blue, the U.S. finished ahead of China in the standings (second vs. third), but the Americans had a slightly lower hitting percentage than did the host nation. Japan and Poland (in red) also exhibited a similar reversal.
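One way to put a number on how closely a statistical ranking tracks the standings is Spearman's rank correlation. I did not compute it for the table, so the following is only a sketch under that assumption. The first comparison mirrors the perfect Group B hitting-percentage case; the second uses a made-up ranking with two adjacent teams reversed, purely to show the effect of a single small discrepancy.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two untied rankings of the same teams."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

standings = [1, 2, 3, 4, 5, 6]        # pool finish, first through sixth
perfect_match = [1, 2, 3, 4, 5, 6]    # e.g., Group B hitting percentage
one_reversal = [1, 3, 2, 4, 5, 6]     # hypothetical: second and third swapped

rho_perfect = spearman_rho(standings, perfect_match)
rho_reversal = spearman_rho(standings, one_reversal)
```

A single adjacent reversal among six teams still leaves the correlation above .94, which fits the description of such discrepancies as relatively minor.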

Serve-reception effectiveness, in Group B, was also a perfect discriminator of placing in the standings. Other statistics ranged from pretty good -- blocking in Group A, and digging in Group B -- to a chaotic lack of association with placement in the standings.

It looks to me like China's third-place finish in the Group A standings may be somewhat of an underperformance, relative to the team's rankings in the statistical categories. It should be noted, however, that China's two losses were each in tight, five-game matches. Eight teams make the medal playoffs, so China will still have a chance to win a medal.

One final note on data quality: Though the availability of statistics seems to have improved over time, there are still serious shortcomings. For some statistics on the NBC website, a team's line of data is missing (although I was able to use other statistics to derive the missing ones). There are also plenty of apparent typos. For example, in serve-receipt, the categories of "success," "faults," and "continuation of play," do not sum to the number of attempts, for some teams.

Saturday, August 16, 2008

NBC Olympics Website for Volleyball Stats

The NBC Olympics website is now providing extensive (indoor) volleyball statistics -- hitting, blocking, serving, digging, setting, and receiving. Further, hitting percentage is now being calculated correctly (subtracting errors from successful kills, before dividing by attempts). Clicking here leads you to men's statistics at the team level, but from that page you can click on a heading to get to women's statistics, as well as varying whether you get team or individual statistics. Pool play ends this weekend for both men and women; upon conclusion of this stage of the competition, I'll present some analyses of how the pool winners did in the various statistical categories.

Statistics are available in a similar format for men's and women's beach volleyball, but there seem to be a lot of holes in the numerical information provided. There's already been a lot of excitement for the U.S. teams in the sand, as the legendary women's duo of Misty May-Treanor and Kerri Walsh had a first-game scare on Thursday against a Belgian team, whereas the men's pair of Phil Dalhausser and Todd Rogers eked out a three-game victory over a Swiss combo last night (U.S. time).

Tuesday, August 12, 2008

NBC Olympic Announcers Sunderland and Barnett Informative on Strategy and Stats

NBC's late-night Tuesday/early-morning Wednesday Olympic coverage featured a women's (indoor) volleyball match between the U.S. and Venezuela, ultimately won 3-1 by the Americans. Thus far, I've found announcers Paul Sunderland (a member of the 1984 U.S. men's gold-medal team) and Kevin Barnett (an Olympian of more recent vintage) to be informative on strategy and statistics.

As Game 4 got underway, Barnett (I believe, rather than Sunderland) asserted that U.S. blocking (7 in Game 3 alone) and Venezuela's poor serving (1 ace compared to 12 errors, through three games) were the "dictating statistics of this match." According to the final box score, the Americans indeed outperformed the Venezuelans in blocking, 16 to 6. Jen Joines, who entered the match in a substitute role for the U.S., was singled out for accolades by the announcers.

The announcers also stressed the importance of playing "in system." By this, they mean consistent, quality passing to the setter, so that she can have the maximum possible options in deciding which hitter to set. U.S. passing difficulties, both on serve reception and on free balls, were described as the "single greatest thing that's going to hold this team back in this tournament," again according (as best I could tell) to Barnett.

Saturday, August 9, 2008

Following Statistical Aspects of the 2008 Summer Olympic (Indoor) Volleyball Competitions

With the Summer Olympics underway in Beijing, I would like to welcome everyone to VolleyMetrics, where I'll be presenting statistical analyses of several of the volleyball matches and taking comments from interested viewers. Before starting, I would like to offer condolences to the Bachman and McCutcheon families for the tragic attack that took place earlier.

First, for results and boxscores of (indoor) volleyball, you can go to the following link (I may add in some postings on beach volleyball, I'm not sure):

As some of you may have seen this morning on television in the U.S., the American women opened up with a 3-1 victory over Japan. The boxscore for this match is available here.

Two of the major topics I have emphasized on this blog are hitting percentage and defending against opponents' hitting via blocking and digging.

If you look at the boxscores provided through the NBC Olympics website, you'll see that hitting percentage, as traditionally defined by volleyball statisticians, is not provided. The raw ingredients to calculate it, however, do appear to be present.

In the U.S.-Japan boxscore, just on top of the individual-player statistics, you'll see that for the aggregate team statistics, only the countries' successful spikes ("S") or "kills," and attempts ("A") are shown, and then the ratio of the two. That, of course, leaves out hitting errors, which count negatively against a player or team in the usual hitting percentage statistics.

Once you scroll down to the individual-player statistics, however, you'll see that in addition to successes (SUCC) and attempts (ATT), you get "FLTS" (which I interpret as faults or errors) and "CONT" (which I interpret as playing continuing, without the spike attempt immediately helping or hurting the attacking team). Note that for each player, SUCC + FLTS + CONT always adds up to ATT. From this part of the boxscore, therefore, we can add up all the players' FLTS numbers and get a team error total, which can be used along with the team success and attempt numbers higher up in the boxscore.

The U.S. thus had 64 successful kills, 17 faults (errors), and 145 attempts. Hitting percentage thus equals 47 (from 64 - 17) divided by 145, which yields .324.

Japan had 49 successful kills, 21 faults (errors), and 136 attempts. We then take 49 - 21, which equals 28, then divide by 136, yielding .206.
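For anyone who wants to repeat this calculation on other boxscores, here is a minimal Python sketch of the arithmetic above. The function name and argument names are my own (not anything from the NBC site), and it assumes you have already summed the players' FLTS column into a team error total.

```python
def hitting_percentage(kills, errors, attempts):
    """Traditional hitting percentage: (kills - errors) / attempts."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return (kills - errors) / attempts


# Team totals quoted above: successes (SUCC), faults (FLTS), attempts (ATT).
usa = hitting_percentage(kills=64, errors=17, attempts=145)
japan = hitting_percentage(kills=49, errors=21, attempts=136)

print(f"USA   {usa:.3f}")    # 0.324
print(f"Japan {japan:.3f}")  # 0.206
```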

More later...

Wednesday, May 28, 2008

2008 Olympic Qualifying Tournaments

I recently visited the NBC Olympics website to learn about developments in the volleyball competitions taking place later this summer. While there, I found this article (accompanied by extensive statistics) on the final women's Olympic qualifying tournament, which concluded recently.

Among the statistics listed in the final tournament standings were each team's points scored and allowed, and games (or sets) won and allowed (the first reference to "Points" in the chart appears to refer to points in the standings, 2 for a win and 1 for a loss; points won and lost in the rally scoring are separate).

Volleyball, like tennis and perhaps other sports, uses an aggregative or hierarchical scoring system. First, a team wins points. Upon winning 25 points in a game (15 in a Game 5), with at least a two-point margin, a team would then win a game (also known as a set). Winning three games would then give a team the match.
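The hierarchy just described (points, then games, then the match) can be sketched in a few lines of Python. The function names are mine, and the example scores are made up purely for illustration; they are not from any actual match. The check only asks whether a score satisfies the winning condition, not whether it is a score a real game could have ended on.

```python
def game_won(points_for, points_against, deciding_game=False):
    """True if points_for meets the winning condition described above."""
    target = 15 if deciding_game else 25
    return points_for >= target and points_for - points_against >= 2


def match_winner(game_scores):
    """Given per-game scores [(a, b), ...], return 'A', 'B', or None."""
    wins = {"A": 0, "B": 0}
    for number, (a, b) in enumerate(game_scores, start=1):
        deciding = number == 5  # only Game 5 is played to 15
        if game_won(a, b, deciding):
            wins["A"] += 1
        elif game_won(b, a, deciding):
            wins["B"] += 1
        if wins["A"] == 3:
            return "A"
        if wins["B"] == 3:
            return "B"
    return None  # match not yet decided


# Hypothetical five-game match (invented scores).
example = [(25, 20), (23, 25), (26, 24), (18, 25), (15, 13)]
print(match_winner(example))  # A
```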

Certainly there would be a positive correlation among the winning of points, games, and matches. Teams that win more points should win more games, and those that win more games should win more matches. Focusing on the relationship between points and games (because these would have the largest numbers of observations), I wasn't sure that it would necessarily be perfectly linear, however.

Perhaps the plot of the two variables would be S-shaped. As a theoretical example, Team A might average 8 points per game (roughly a ratio of .33 if the opponent consistently scored 25 points per game), whereas Team B might average 12 points per game (roughly a ratio of .50). Team B would have a higher points ratio relative to the opposition, but both teams would have an equal (equally tiny) ratio of games won to games lost. Higher up the ability scale, where teams had moderately good point ratios, there could be a positive linear relation between point ratio and game ratio. Lastly, at the highest ability level, there could be slight variation among teams in their point ratios, but all could have uniformly high game ratios.
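To make the S-shape idea concrete, here is a rough Python sketch under one big simplifying assumption of my own: that a team wins every rally independently with the same probability p, regardless of who is serving. Real volleyball surely does not work that way, so treat this as a toy model rather than a description of the tournament data. Given p, the function computes the chance of winning a game to 25 (win by two), and the loop translates a few points ratios into the game ratios the model would predict.

```python
from math import comb


def game_win_probability(p, target=25):
    """Chance of winning a game to `target`, win by two, if each rally is
    won independently with probability p (a simplifying assumption)."""
    q = 1.0 - p
    # Win target-to-j with j <= target - 2: the last rally is won, and the
    # (target - 1 + j) rallies before it contain exactly j losses.
    early = sum(comb(target - 1 + j, j) * p**target * q**j
                for j in range(target - 1))
    # Otherwise the score reaches (target - 1)-all; from any tie, the team
    # must get two rallies ahead before the opponent does.
    reach_tie = comb(2 * (target - 1), target - 1) * (p * q) ** (target - 1)
    win_from_tie = p * p / (p * p + q * q)
    return early + reach_tie * win_from_tie


# Points ratio r (points won / points lost) corresponds to p = r / (1 + r).
# The first two ratios match the 8-point and 12-point examples above.
for r in (0.32, 0.48, 0.85, 0.95, 1.00, 1.05, 1.16, 2.00):
    w = game_win_probability(r / (1 + r))
    game_ratio = w / (1 - w)
    print(f"points ratio {r:4.2f} -> P(win game) {w:.4f}, game ratio {game_ratio:.3f}")
```

Under this toy model, the game ratio should stay near zero for the weakest points ratios and climb steeply around 1.00, which is the kind of flattening at the low end that I speculated about above.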

For what it's worth, given the small sample size of only eight teams, the obtained correlation between point ratio and game ratio was a near-perfect r = .989. The relation is shown below (you can click on the image to enlarge it).

Several findings are worth noting:

1. The range of ratios was much smaller for points (between .848 and 1.16) than for games (.25 to 2.857).

2. No teams were located particularly near the 1.00 neutral point for either ratio. To do so, a team would have had to play virtually all close matches, winning and losing equally often and by equal margins.

3. Although the overall distribution exhibited a virtually perfect linear pattern, there appeared to be greater flatness in the plot among the least successful teams, than among the other squads.

Saturday, May 3, 2008

Preview of Penn State-Pepperdine NCAA Men's Final

In anticipation of tonight's NCAA men's volleyball final between No. 1 Penn State and upstart Pepperdine (which only made the field via a late surge through the Mountain Pacific Sports Federation tournament), user "nomas" on the VolleyTalk discussion board offers this analysis of the match-up.

One of the salient issues is that Penn State has not played that tough of a schedule during the season, thus likely inflating the Nittany Lions' statistical prowess. As seen on Penn State's schedule, the Lions have played only a few matches against traditionally strong teams from the Pacific (Hawaii, UCLA, and Long Beach State), and two of those three matches were back in early January.

Monday, April 28, 2008

Preview of Men's NCAA Final Four

This coming Thursday and Saturday, the men's volleyball version of the NCAA Final Four will take place in Irvine, California (tournament website). The semi-finals will match Penn State against Ohio State, and Pepperdine against Long Beach State.

Men's collegiate VB is much more of a low-key affair than women's, with fewer schools fielding teams, less media coverage, etc. Accordingly, I have not posted nearly as much on the men's season as I had done for women's play last fall.

The Final Four is actually all there is of the men's "tournament." Three teams earn automatic berths via conference tournaments of the Mountain Pacific Sports Federation, Midwest Intercollegiate Volleyball Association, and Eastern Intercollegiate Volleyball Association. The field is rounded out by one at-large team, inevitably from the MPSF. The at-large choice is often controversial, and this year's was no exception.

The two contenders were BYU and Long Beach State, the regular-season MPSF co-champions at 18-4. Each lost in the conference tournament to eventual champ Pepperdine (which tied for fourth in the regular-season conference race at 12-10), thus giving the automatic bid to the Waves.

This Salt Lake City Deseret News article attempts to break things down scientifically:

The NCAA manual gives some criteria for selection the fourth at-large team — which didn't help clear up matters.

One of the first matters is won-loss records. Both finished tied atop the MPSF regular-season standings with identical 18-4 league records. Long Beach won the rights to host the MPSF Tournament, to be the top seed and to have the quarterfinal bye because of beating BYU twice in the regular season.

However, BYU had a better overall record, finishing 25-5 to LBSU's 23-6 — that's two more wins and one fewer loss than the 49ers.

Also considered is the strength of schedule, head-to-head competition, results against common opponents and results against Final Four qualifiers and other teams under NCAA tournament consideration.

LBSU had faced the other three Final Four teams in regular-season play, winning twice at Ohio State, losing at home to Penn State and going 1-2 all year against Pepperdine. The 49ers were also 1-1 against Cal State Northridge, who finished a game behind BYU and LBSU in the final league standings and lost to the Cougars in the MPSF semis.

BYU's five '08 losses were to Stanford (at home), twice to Long Beach (on the road), to UCLA (on the road) and to Pepperdine in the MPSF finals.

Long Beach's six losses were twice to Pepperdine (a regular-season defeat in Malibu and the MPSF semifinal loss), at home to Northridge, Penn State and UCLA and on the road at UC Irvine.

To me, the issue comes down to this. With NCAA tournament bids extremely scarce, why does the MPSF choose to have a conference tournament at all? The result this year is a regular-season co-champion staying home, while a team that finished six games behind in the standings will be playing in the Final Four. Perhaps the NCAA requires these conference tournaments to be held. It doesn't in other sports, however, so it probably doesn't in men's volleyball.

Saturday, February 9, 2008

Tradeoff of Serving Aggressively (or Cautiously): Service Errors vs. Aces

The other night, I caught a replay of UCLA hosting USC in men's volleyball, on Fox College Sports. It was the second match of the season between the Bruins and Trojans. In the first match, as the TV announcers pointed out, the teams had combined for 51 service errors, so it was suggested they would be toning down their aggressive service in the rematch. Indeed, UCLA and USC cut their combined service errors in half, to 25, in the second match.

That got me to thinking about coaches' decision-making strategies involved in choosing whether to have their teams serve aggressively or cautiously. The most aggressive type of delivery would seem to be the jump serve, as illustrated in these brief video clips I found on the web (here and here).

Such a serve has the potential to generate an ace or, if not that, a ball that the receiving team struggles to retrieve and thus takes the team out of its offense. Trying to pulverize the ball on the serve also, however, creates the potential for a missed serve, either into the net or out-of-bounds long. Jump-serving truly seems to be a high-risk/high-yield proposition (see here for a general discussion of risk/return tradeoffs, from the perspective of investing).

Additional web-searching turned up a 2000 American Volleyball Coaches Association (AVCA) newsletter article on serving, written by Melissa Stokes, who this past fall completed her 12th year as women's coach at Missouri State (formerly known as Southwest Missouri State).

The article by Stokes covers serving strategies and practice drills, with a healthy supply of statistics thrown in (probably from 1999). Using stats from the Missouri Valley Conference (in which her team plays), Stokes reveals that though her team was not among the league leaders in aces per game, it kept its service error rate very low, resulting in virtually a 1-to-1 ratio of aces and errors. In contrast, there were a few MVC teams whose error-to-ace ratios were roughly 2-to-1. Stokes commented that:

Every team will have a different serving philosophy. This chart shows you that these statistics support our serving philosophy. We may not have posted as many aces, but committing fewer errors allowed our players the opportunity to score points at the net or by playing defense.

My curiosity sufficiently piqued, I then went and found the boxscores for the two UCLA-USC men's matches (both of which, incidentally, were won by the Trojans).

UCLA at USC (January 23, 2008, 5 games)
UCLA: 11 aces (2.2/game), 30 errors (6.0/game)
USC: 3 aces (0.6/game), 21 errors (4.2/game)

USC at UCLA (February 6, 2008, 3 games)
UCLA: 3 aces (1.0/game), 11 errors (3.7/game)
USC: 4 aces (1.3/game), 14 errors (4.7/game)

Speaking in approximate terms, UCLA's rates of aces and service errors were each cut roughly in half from the first to the second match. This kind of proportionality is what I would have expected. USC doubled its rate of aces (albeit from a low base rate), but experienced only a slight proportional rise in errors.
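For readers who want to run the same comparison on other matches, here is a small Python sketch of the per-game rates, plus the error-to-ace ratio that Stokes emphasized. The numbers are the ones from the two boxscores listed above; the dictionary layout and function name are just my own choices.

```python
# (aces, service errors, games played), from the boxscore lines above.
matches = {
    "UCLA at USC (Jan 23, 2008)": {"UCLA": (11, 30, 5), "USC": (3, 21, 5)},
    "USC at UCLA (Feb 6, 2008)": {"UCLA": (3, 11, 3), "USC": (4, 14, 3)},
}


def serving_rates(aces, errors, games):
    """Return (aces per game, errors per game, errors per ace)."""
    # Errors per ace is undefined for a team with no aces.
    errors_per_ace = errors / aces if aces else float("inf")
    return aces / games, errors / games, errors_per_ace


for label, teams in matches.items():
    combined_errors = sum(errors for _, errors, _ in teams.values())
    print(f"{label}: {combined_errors} combined service errors")
    for team, (aces, errors, games) in teams.items():
        ace_rate, error_rate, ratio = serving_rates(aces, errors, games)
        print(f"  {team}: {ace_rate:.1f} aces/game, "
              f"{error_rate:.1f} errors/game, {ratio:.1f} errors per ace")
```

By this measure, both teams in both matches were well above the roughly 1-to-1 ratio that Stokes reported for her women's team, though comparing a men's rivalry match to a women's conference season is obviously apples and oranges.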

These are just some limited examples, but I hope this posting can spur an expanded discussion of serving tradeoffs.

Sunday, January 6, 2008

Ferocity of Men's vs. Women's Spikes at the Elite College Level: Rates of Being Dug

As collegiate volleyball in the U.S. switches from women's play in the fall to men's in the winter/spring, so too does VolleyMetrics shift its focus. Fittingly, given this transition, there was a recent discussion topic on the VolleyTalk boards about the differences -- and relative enjoyability -- between the men's and women's games.

The discussion appeared to focus on the greater power of men's spiking than that of women's. In fact, one discussant characterized the men’s game as “Pass, set, boom.” Whether one finds beauty in these rocketing blasts or prefers the (assumed) longer rallies in the women's game is in the eye of the beholder, but the consensus that this difference exists was wide.

Here at VolleyMetrics, however, we want hard numbers. As an initial step, we can look at men's and women's team hitting percentages. Keep in mind that (as best I can tell), there are far more women’s volleyball programs in Division I alone than there are in men’s Divisions I, II, and III combined.

Having said that, it turns out that only five women’s Division I teams exceeded a hitting percentage of .300 this past season (Penn St., .350; Texas, .343; Nebraska, .327; Stanford, .316; and Florida, .303; a link to NCAA women's statistics is available in the upper-right of this page).

In contrast, at least 11 teams in the combined D I-II configuration surpassed .300 in the most recently completed men's season (the men's top 10 list stops with two teams tied for 10th at .307).

That's pretty good evidence, but my computer and I wanted more. Conveniently, as part of a separate VolleyTalk discussion thread on statistical refinements (which I excerpted a couple of entries down), user "p-dub" had suggested computation of:

"dig %", which is digs/non-error attacks...

I like the idea, which to elaborate a little, takes Team A's dig total and divides it by Team B's (total number of attacks - errors). More colloquially, we might call this measure "dig-ability" -- of the spike attempts coming at a team, what percentage of them do they dig up? (Note that a spike attempt blocked back in the face of a hitter is considered an error, so blocked balls are accounted for.)

Another thing I wanted to do is study the same schools' men's and women's programs, thus holding constant the strength and prestige of athletic programs, quality of facilities, etc. (assuming the men's and women's volleyball teams get equal access to these facilities).

Given the nature of the above formula, dig-ability can only be computed on a match-by-match basis, and not with aggregate seasonal stats. Given my roots as a former UCLA Daily Bruin men's and women's volleyball correspondent (1980-82), I selected the four UCLA-USC matches (two men's, two women's) played during 2007. For any given match, the dig-ability of each team's spikes is calculated separately, so these four matches would generate eight data points. Further, to augment and diversify the sample, I also used the four Penn State-Ohio State matches of 2007 (the Nittany Lions and Buckeyes are in the same conference, the Big 10, in women's play, but in different men's conferences). The eight matches and 16 dig-ability data points are summarized as follows (with box-score links):

Women's: #4 USC vs #5 UCLA (Oct 05, 2007) 4 games
UCLA Dig 66 / (USC Attempts 150 - USC Errors 25) = .528
USC Dig 63 / (UCLA Att 170 - UCLA Err 30) = .450

W: #9 UCLA vs #6 USC (Nov 02, 2007) 4 games
UCLA Dig 89 / (USC Att 207 - USC Err 34) = .514
USC Dig 87 / (UCLA Att 183 - UCLA Err 29) = .565

Men's: USC vs UCLA (Jan 27, 2007) 3 games
UCLA Dig 28 / (USC Att 92 - USC Err 4) = .318
USC Dig 28 / (UCLA Att 112 - UCLA Err 26) = .326

M: #4 UCLA vs #12 USC (Mar 31, 2007) 3 games
UCLA Dig 24 / (USC Att 100 - USC Err 23) = .312
USC Dig 20 / (UCLA Att 94 - UCLA Err 12) = .244

W: Ohio State vs #3 Penn State (Oct 10, 2007) 3 games
OSU Dig 32 / (PSU Att 93 - PSU Err 8) = .376
PSU Dig 39 / (OSU Att 112 - OSU Err 27) = .459

W: #1 Penn State vs Ohio State (Nov 21, 2007) 3 games
OSU Dig 42 / (PSU Att 103 - PSU Err 10) = .452
PSU Dig 45 / (OSU Att 121 - OSU Err 29) = .489

M: #11 Ohio State vs #7 Penn State (Feb 01, 2007) 4 games
OSU Dig 30 / (PSU Att 131 - PSU Err 21) = .273
PSU Dig 45 / (OSU Att 133 - OSU Err 22) = .405

M: #6 Penn State vs #8 Ohio State (Apr 4, 2007) 5 games
OSU Dig 24 / (PSU Att 118 - PSU Err 26) = .261
PSU Dig 29 / (OSU Att 116 - OSU Err 24) = .315
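As a check on my arithmetic, and for anyone who wants to extend the sample, here is a short Python sketch that recomputes all 16 dig-ability values from the raw numbers listed above. The function name and the tuple layout are my own; the last item in each tuple is the value as I reported it.

```python
def dig_ability(digs, opp_attempts, opp_errors):
    """Digs divided by the opponent's non-error attacks."""
    return digs / (opp_attempts - opp_errors)


# (digging team and match, digs, opponent attempts, opponent errors, reported)
women = [
    ("UCLA, Oct 05", 66, 150, 25, .528), ("USC, Oct 05", 63, 170, 30, .450),
    ("UCLA, Nov 02", 89, 207, 34, .514), ("USC, Nov 02", 87, 183, 29, .565),
    ("OSU, Oct 10", 32, 93, 8, .376), ("PSU, Oct 10", 39, 112, 27, .459),
    ("OSU, Nov 21", 42, 103, 10, .452), ("PSU, Nov 21", 45, 121, 29, .489),
]
men = [
    ("UCLA, Jan 27", 28, 92, 4, .318), ("USC, Jan 27", 28, 112, 26, .326),
    ("UCLA, Mar 31", 24, 100, 23, .312), ("USC, Mar 31", 20, 94, 12, .244),
    ("OSU, Feb 01", 30, 131, 21, .273), ("PSU, Feb 01", 45, 133, 22, .405),
    ("OSU, Apr 4", 24, 118, 26, .261), ("PSU, Apr 4", 29, 116, 24, .315),
]

for group_name, rows in (("Women", women), ("Men", men)):
    print(group_name)
    for label, digs, attempts, errors, reported in rows:
        value = dig_ability(digs, attempts, errors)
        print(f"  {label:13s} {value:.3f} (reported {reported:.3f})")
```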

It is plainly evident that spike attacks in women's matches have a higher dig-ability rate than those in men's, which will not surprise many observers. Whether the magnitude of difference is greater or less than expected, or about as expected, may differ among members of the volleyball community.

This analysis seemed to provide a good occasion to introduce the box plot, a graphical statistical tool. At a glance, the box plot displays the following aspects of a statistical distribution: the median (Mdn; the point at which half the scores are greater and half are less), the lower quartile (1Q; the point that cuts off the lower 25% of scores), the upper quartile (3Q; the point that cuts off the upper 25%), and the lowest and highest individual values in the distribution. This online calculator allows one to type in individual data values and automatically generates the necessary statistics for a box plot (and more).

The mean and median, respectively, were virtually identical within the men's distribution (.307, .3135) and within the women's (.479, .474); this is not always the case in analyzing data. Here are the two boxplots (you may click on the image to enlarge it).
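If you would rather not rely on an online calculator, the ingredients of a box plot can be sketched with Python's standard statistics module, using the 16 dig-ability values listed earlier. One caution: there are several conventions for computing quartiles, and I am assuming the "inclusive" method here, so the quartiles may differ a little from those of the calculator I used. The mean and median do not depend on that choice.

```python
from statistics import mean, median, quantiles

# Dig-ability values from the eight men's and eight women's data points.
men = [.318, .326, .312, .244, .273, .405, .261, .315]
women = [.528, .450, .514, .565, .376, .459, .452, .489]


def five_number_summary(values):
    """Return (lowest, 1Q, median, 3Q, highest) for a box plot."""
    # method="inclusive" is an assumption; other quartile conventions exist.
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


for name, values in (("Men", men), ("Women", women)):
    low, q1, mdn, q3, high = five_number_summary(values)
    print(f"{name}: mean {mean(values):.3f}, median {median(values):.4f}")
    print(f"  low {low:.3f}, 1Q {q1:.3f}, Mdn {mdn:.4f}, 3Q {q3:.3f}, high {high:.3f}")

print(f"Difference in means: {mean(women) - mean(men):.3f}")
```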

The usual cautions about small sample size are applicable. Still, across the four men's matches and across four women's matches (each set taken from two regions of the country), the results appear to exhibit considerable consistency. Yes, women's spike attempts were more dig-able than were men's in the matches studied, but the magnitude of difference was less than .20 (roughly .48 - .31).

Whether these statistics indicate (to paraphrase a popular book title from several years ago) that the men's game is from Mars and the women's from Venus, or (to paraphrase a colleague of mine) the men's is from North Dakota and the women's from South Dakota, I don't know, but at least some statistical information can be added to the discussion.