Monday, June 25, 2012

Research study, part III: The future success of first-out winners, broken down by style

Without further ado, here is the graded-stakes-winning success rate of each style.

Table 1. Percentage of graded stakes winners

Behind 27.9%
Pressed 23.6%
Led 23.5%
Middle 21.7%


A chi-square test confirmed that the variation here is not suggestive of an effect. In other words, it is reasonable to think that style of victory has no bearing on future success, or at least that good horses can come from every initial style.
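For readers who want to see the shape of that test, here is a minimal sketch in Python. Only the "behind" counts (24 of 86) are stated outright in this post; the other counts are my back-calculation from the percentages in Table 1, the sample sizes quoted further down, and the 59 "led" graded stakes winners mentioned later, so treat them as assumptions rather than the study's exact figures. The function name is illustrative.

```python
# Graded stakes winners (GSW) and debut winners per style.
# ASSUMPTION: apart from "behind" (24 of 86), these counts are
# back-calculated from published percentages, not taken from the raw data.
counts = {
    "behind": (24, 86),
    "pressed": (53, 225),
    "led": (59, 251),
    "middle": (58, 267),
}


def chi_square_statistic(table):
    """Pearson chi-square for a styles x (GSW, non-GSW) table."""
    total_hits = sum(hits for hits, n in table.values())
    total_n = sum(n for hits, n in table.values())
    overall_rate = total_hits / total_n
    stat = 0.0
    for hits, n in table.values():
        expected_hits = n * overall_rate
        expected_misses = n * (1 - overall_rate)
        stat += (hits - expected_hits) ** 2 / expected_hits
        stat += ((n - hits) - expected_misses) ** 2 / expected_misses
    return stat


# Critical value of chi-square with 3 degrees of freedom at the 5% level.
CRITICAL_5PCT_DF3 = 7.815

chi2 = chi_square_statistic(counts)
print(f"chi-square = {chi2:.2f}, 5% critical value = {CRITICAL_5PCT_DF3}")
print("significant" if chi2 > CRITICAL_5PCT_DF3 else "not significant")
```

With four styles and two outcomes there are three degrees of freedom, and under these assumed counts the statistic falls well short of the 5% critical value.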

The "behind" percentage is most suggestive, standing out above the remaining three styles. However, let's take a step back: in the qualifying races, horses who won their debuts from behind went on to be graded stakes winners 28% of the time; horses who won another way became graded stakes winners 23% of the time. And that difference is suspect, since the sample for the "behind" winners was just 86, far less than half of any other group. If an aberration was going to occur, it was most likely going to be from this group.

The graded stakes rates for my group of debut winners are in the range of batting averages; the overall rate of .236 is close to the current typical batting average of .250. Think of the sample sizes as at-bats, ranging from 225 for "pressed" to 267 for "middle." We have here three players with about 250 at-bats each, well less than a full season's worth, with batting averages between .217 and .236. There is no assurance that the .236 hitter will finish the season with the best average. The players look indistinguishable. The "behind" player has hit better in just 86 at-bats, but at .279, he has hardly torn it up. If, given a full-time job, he were to hit only .200, no one would bat an eye.
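To put a rough number on the batting-average analogy, the sketch below asks how often a true .236 hitter would collect at least 24 hits in 86 at-bats (the "behind" line) by luck alone. It is an illustration using an exact binomial tail, not a calculation from the study itself, and the function name is my own.

```python
from math import comb


def binomial_tail(n, k, p):
    """Probability of k or more successes in n independent trials at rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# 24 graded stakes winners from 86 "behind" debut winners, judged
# against the overall rate of .236.
p_tail = binomial_tail(n=86, k=24, p=0.236)
print(f"P(at least 24 of 86 at a true .236 rate) = {p_tail:.3f}")
```

The probability comes out far above the usual 5% threshold, which is the statistical way of saying the .279 hitter has hardly torn it up.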

A standard error, roughly the average amount by which a percentage is off purely because of the limited sample size, can be calculated for each style. For "behind," it's .048; for "middle," the style with the largest sample size, it's .025.
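The standard errors quoted above follow from the usual formula for a proportion, the square root of p(1 - p)/n. A short sketch, using only the rates and sample sizes given in this post:

```python
from math import sqrt


def standard_error(p, n):
    """Standard error of a sample proportion p observed over n trials."""
    return sqrt(p * (1 - p) / n)


se_behind = standard_error(0.279, 86)   # "behind": 27.9% of 86
se_middle = standard_error(0.217, 267)  # "middle": 21.7% of 267

print(f"behind: {se_behind:.3f}")  # 0.048
print(f"middle: {se_middle:.3f}")  # 0.025
```

Because the sample size sits under the square root, the "behind" rate, with about a third as many horses, is nearly twice as uncertain as the "middle" rate.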

By mean earnings, "behind" remains on top, with almost $30,000 more per debut winner than any of the other styles. But "behind" falls to third by median earnings (although it is only about $7,000 behind the leader, "middle").

Table 2. Mean earnings

Behind $284,931
Led $255,113
Middle $252,092
Pressed $233,606

Table 3. Median earnings

Middle $143,225
Led $137,426
Behind $136,098
Pressed $125,396


That the "behind" group shows better using mean than median suggests some "big horses" in that group, although this characterization must be considered in the light of the highest graded stakes winning percentage, too. In other words, the strong performance of the "behind" group is hardly limited to a handful of horses, even with the sample of just 86 (it contained 24 eventual graded stakes winners).

As an identifier of top, top horses, I found the 75th percentile for earnings among all of the graded stakes winners. It was just over $800,000. I then figured the percentage of $800,000 earners by style.

The "led" and "behind" groups may have an advantage here. The rate of the "led" winners may be particularly noteworthy, given the larger sample size pertaining to it.

Table 4. Percentage of $800,000 earners

Behind 7.0%
Led 6.8%
Middle 5.2%
Pressed 4.9%


To keep you grounded, the percentages represent 17 $800,000 earners in the "led" group and six in the "behind" group. In order, the 17 top earners in the "led" group were Ashado, Surfside, Southern Image, Discreet Cat, Diabolical, The Cliff's Edge, Lady Tak, Madcap Escapade, Grand Hombre, Grand Slam, Cash Run, E Dubai, First Samurai, Harmony Lodge, Pomeroy, Texas Glitter, and Esteemed Friend.

Based on the data I've reviewed, the "behind" group has plenty of graded stakes winners and a reasonable number of big stars, although not a huge advantage over the other styles in the latter. I thought that perhaps the puzzlingly low median was due to more "flame-outs" than in other groups. I am always on the look-out for traits that correspond to higher rates of injury. And if I found a higher rate of horses in the "behind" group who went on to do just about nothing, I thought that might also mean that there were more lucky, low speed-figure, "clunking up" winners in the group. But comparing the 10th percentile in earnings for the styles, or comparing on similarly low percentiles, yielded little.

Table 5. 10th percentile in earnings

Behind $36,677
Pressed $35,988
Led $35,539
Middle $31,951


If I had to venture a guess, I would say "behind" is on top of this chart because the average winning purse in those horses' maiden special weights happened to be the highest, maybe because of a preponderance of California tracks in that sample. Career earnings of $35,000 don't represent the spoils of much more than a maiden victory. The true "flame-out" percentage is probably higher than 10%, which is why I made comparisons as well at some slightly higher percentiles, without seeing anything terribly interesting. The fact that "middle" has the highest median but the lowest 10th percentile probably indicates that the 10th percentile sheds little light on why the median is what it is.

The picture becomes far more interesting when we compare success in terms of short and long graded stakes winners.

Table 6. Rated by percentage of short graded stakes winners

Led 18.3%
Pressed 16.8%
Middle 16.1%
Behind 12.8%


Table 7. Rated by percentage of long graded stakes winners

Behind 18.6%
Pressed 11.5%
Middle 10.9%
Led 10.8%


Looking at the above data in a different form, and using totals instead of percentages to emphasize the sample size element, here are the numbers of short and long graded stakes winners in each group, and the ratio between the two.

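Table 8. Number of short and long graded stakes winners, and the short-to-long ratio

Style Short Long Ratio
Led 46 27 1.70
Pressed 38 26 1.46
Middle 43 29 1.48
Behind 11 16 0.69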


The "behind" group had the highest rate of long graded stakes winners and the lowest rate of short graded stakes winners, with a ratio essentially the opposite of the other three styles. This makes intuitive sense: horses who win from off the pace in sprints are telling you something -- they want to go longer. Lately, however, a belief has taken hold that this assumption is knee-jerk. My data suggest otherwise (unless trainers are forcing their closers into longer distance races, making the assumption that so many sophisticates dislike). In any event, it would seem that the debut come-from-behind sprinters manage acceptably when they are run longer, irrespective of whether they are given ample chance at the stakes level in shorter races as well.

It's an open question whether the pattern of closing debut winners becoming good distance horses has its equivalent in front-running debut winners becoming good sprinters. Significance tests of the ratio of short to long graded stakes winners are complicated by the fact that horses can be in both groups (note there were 59 total graded stakes winners from the "led" group, not the 73 that are the sum of the short and long winners). Still, I will state with some confidence that the distance data for the front-running winners do not indicate special distance proclivities, short or long. Although the short/long ratio is higher for the "led" group than overall, the difference is not striking.
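The overlap can be backed out by simple inclusion-exclusion: horses who won graded stakes both short and long are counted once in each column, so the double-counted horses equal short plus long minus the total. A small sketch using the counts given in this post (the function name is my own):

```python
def split_by_distance(short, long_, total):
    """Split graded stakes winners into short-only, long-only, and both."""
    both = short + long_ - total  # horses counted in both columns
    return {
        "short_only": short - both,
        "long_only": long_ - both,
        "both": both,
    }


# "Led": 46 short, 27 long, 59 graded stakes winners in all.
led = split_by_distance(46, 27, 59)
# "Behind": 11 short, 16 long, 24 graded stakes winners in all.
behind = split_by_distance(11, 16, 24)

print("led:", led)
print("behind:", behind)
```

For the "led" group this gives 14 horses who won at both distance types; for the "behind" group it gives 3, leaving 8 short-only and 13 long-only winners.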

In part I, I showed that long graded stakes winners make more money than short graded stakes winners. I therefore wondered whether the advantage the "behind" group showed in mean earnings was a result of its higher percentage of long graded stakes winners. The overall superiority in mean earnings for "behind" remains when the sample is restricted to graded stakes winners, and in fact extends to median earnings as well for graded stakes winners. So I wondered whether the "behind" graded stakes winners simply outperformed their counterparts, or whether they earned the most money because they did more of their good work in routes.

These data were not what I expected. In stark contrast to the overall numbers, in the "behind" category, the eight "short-only" graded stakes winners earned more than the 13 "long-only" graded stakes winners. The edge was $261,455 in mean and $139,810 in median. And among "short-only" graded stakes winners, the "behind" group was easily tops in mean and median earnings. The "behind" group was last in mean and median earnings among "long-only" graded stakes winners. Graded stakes winners in the "behind" group made the most money on the backs of sprinters and horses who went both ways, short and long, not on the strength of the predominant routers.

I went to the data and looked horse by horse at "graded stakes distance type" and earnings, also considering the ability of the horses in a holistic sense. The pattern I present does not stand out to my naked eye. There were some excellent sprinters in the "behind" group (Midnight Lute, Dream Supreme), but they seemed offset by just as many good distance horses. Perhaps if I examined the graded stakes winners of other groups, the difference would be more phoney graded stakes-winning sprinters in those groups.

The take-away question may be, if the "behind" group can produce some really outstanding sprinters, doesn't it stand to reason that it should be able to produce more of them? Perhaps we have evidence for the rare if veritable "come-from-behind" sprinter. Or perhaps the discrepancy between the frequency of graded stakes types and grade I types suggests that the low number of short-only "behind" graded stakes winners is flukish.

A couple of additional notes from the earnings data, at the risk of going too deeply into it given its likely non-significant nature. The key to the "middle" group's top median earnings was the earnings of the non-graded stakes winners, which were $111,494 on the median, over $16,000 more than any of the other styles. Although the "pressed" category's percentage of graded stakes winners was right on the overall study average, I'm convinced after reviewing the earnings data that the "pressed" winners had the worst subsequent performance. The median earnings of $410,497 for the "pressed" graded stakes winners, compared to $552,784 for the remaining styles, is particularly glaring.
