As a baseball enthusiast and wildly unsuccessful former high school pitcher, I have always been fascinated by the greatness of a dominant pitcher. As a child, I was lucky enough to watch the mastery of Greg Maddux and the dominance of Pedro Martinez. At that time, I wasn’t sure how to calculate ERA, but I knew that Maddux’s seasons in the 90s under 2.00 were special. Later, as I matured and developed a strong liking for numbers and all things mathematical, I found myself poring over tables and tables of statistics, believing that the numbers could reveal true greatness. Every statistic has inherent weaknesses, none of which need to be discussed in this forum. Gone are the days when ERA and Wins dominated the statistical landscape. They’ve been replaced with FIP and SIERA, both highly useful and well-thought-out statistics. In the end, though, I found myself wanting more. To satisfy that want, I did what every stat geek and math nerd would have done. I opened up an Excel spreadsheet and went to work.
The goal of DIPS theory and FIP was to quantify a pitcher’s effectiveness by measuring only the things that he could control. Voros McCracken’s research from the early 2000s told us that pitchers have little to no control over balls put in play. FIP essentially tries to measure the exact opposite of BABIP: the strikeouts, walks, and home runs that never become balls in play. There’s a lot of merit to this idea. Pitchers who do not walk hitters and who avoid giving up home runs are generally more successful than those who fail in these areas, something Greg Maddux taught me all those years ago.
There is still something to be said, though, for a pitcher who simply avoids solid contact, whether the ball leaves the yard or not. Naturally, I’m not the first person to have this theory. Balls in play are included in the calculations for both tERA and SIERA. The problem with these statistics is that they are very complicated to understand. I set out to find a much simpler method of determining a pitcher’s value.
This brings us to the basis of my study, the average hit given up by a pitcher. After suffering through a 3-0 high school playoff loss some years ago in which the pitchers threw dueling three-hitters with very different outcomes, it is safe to say that simply limiting hits does not guarantee success as a pitcher. Using very simple statistics, it is easy to figure out which pitcher “gets hit the hardest.” The formula is Average Hit (AH) = SLG/BAA = TB/H, that is, total bases allowed per hit allowed. If we take all qualified pitchers from the 2012 season, here are the pitchers who induced the weakest contact and those who got hit the hardest.
Pitcher (lowest AH) | AH | Pitcher (highest AH) | AH
Felix Hernandez | 1.38 | Ervin Santana | 1.95
Jake Westbrook | 1.39 | Derek Holland | 1.84
David Price | 1.41 | Phil Hughes | 1.78
Lucas Harrell | 1.43 | Ivan Nova | 1.77
Josh Johnson | 1.44 | Mike Minor | 1.75
Justin Masterson | 1.44 | James McDonald | 1.73
Jarrod Parker | 1.44 | Edwin Jackson | 1.73
Gio Gonzalez | 1.45 | Bruce Chen | 1.73
Johnny Cueto | 1.45 | Jason Vargas | 1.72
Tim Hudson | 1.46 | Tommy Hanson | 1.71
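For readers who would rather script this than use a spreadsheet, the Average Hit formula can be sketched in a few lines of Python. The function names and the sample stat line are my own illustration, not values from the 2012 data. The second function only shows why SLG/BAA and TB/H are the same number: the at-bats cancel.

```python
def average_hit(total_bases: int, hits: int) -> float:
    """Average Hit (AH): total bases allowed per hit allowed (TB / H)."""
    if hits <= 0:
        raise ValueError("hits must be positive")
    return total_bases / hits


def average_hit_from_rates(slg: float, baa: float) -> float:
    """Same quantity from rate stats: SLG / BAA = (TB/AB) / (H/AB) = TB/H."""
    if baa <= 0:
        raise ValueError("batting average against must be positive")
    return slg / baa


# Hypothetical stat line (an assumption for illustration only):
# 800 at-bats against, 200 hits allowed, 300 total bases allowed.
at_bats, hits, total_bases = 800, 200, 300

ah_counting = average_hit(total_bases, hits)
ah_rates = average_hit_from_rates(total_bases / at_bats, hits / at_bats)

print(f"AH from TB/H:    {ah_counting:.2f}")
print(f"AH from SLG/BAA: {ah_rates:.2f}")
```

An AH of 1.00 would mean every hit allowed was a single, and 4.00 would mean every hit was a home run, so the values in the table sit near the low end of that range.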
As you might expect, the pitchers who excel in this category are generally either “dominant” pitchers, such as Felix Hernandez and David Price, or sinkerball pitchers, such as Jake Westbrook and Justin Masterson. Flyball pitchers tend to find themselves in the right column. Many factors that affect the average hit are not accounted for, though, namely park and defense. Not everyone gets to throw 125 innings in Safeco Field or AT&T Park. Others benefit from pitching in front of strong defensive clubs such as the Braves and Angels.
The first adjustment to make is for the parks. Now, it would be foolhardy and shortsighted to simply adjust based on a pitcher’s home park. For example, Matt Cain throws the majority of his innings in AT&T Park, but he also has to throw a handful of innings at Coors Field. Based on innings pitched in each park, I calculated a weighted park factor for each pitcher, signified by PPF. I’ll leave the nitty-gritty details of this calculation out of this explanation. The following shows which pitchers pitched in the most hitter-friendly and most pitcher-friendly environments this season.
Pitcher (highest PPF) | PPF | Pitcher (lowest PPF) | PPF
Clay Buchholz | 1.109 | Felix Hernandez | 0.851
Jon Lester | 1.107 | Madison Bumgarner | 0.913
Jeremy Guthrie | 1.097 | Jason Vargas | 0.914
Josh Beckett | 1.088 | Ryan Vogelsong | 0.922
Gavin Floyd | 1.066 | Tim Lincecum | 0.923
Jake Peavy | 1.058 | Matt Cain | 0.924
Trevor Cahill | 1.057 | Dan Haren | 0.926
Wade Miley | 1.054 | Barry Zito | 0.933
Chris Sale | 1.052 | A.J. Burnett | 0.941
Derek Holland | 1.051 | R.A. Dickey | 0.942
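The exact PPF calculation is not spelled out here, so the following is only one plausible reading of "weighted by innings pitched in each park": an innings-weighted average of per-park factors. The park names, innings totals, and park factors are made up for illustration, and the real spreadsheet may differ in its details.

```python
def weighted_park_factor(innings_by_park: dict[str, float],
                         park_factors: dict[str, float]) -> float:
    """Innings-weighted average of park factors for one pitcher.

    Assumption: PPF = sum(IP_park * PF_park) / sum(IP_park).
    Innings are plain decimals here (1/3 of an inning = 0.333...),
    not the box-score shorthand where ".1" means one out.
    """
    total_innings = sum(innings_by_park.values())
    if total_innings <= 0:
        raise ValueError("pitcher must have innings pitched")
    weighted_sum = sum(ip * park_factors[park]
                       for park, ip in innings_by_park.items())
    return weighted_sum / total_innings


# Hypothetical pitcher and park factors (not real 2012 values).
innings = {"Home Park": 120.0, "Road Park A": 60.0, "Road Park B": 20.0}
factors = {"Home Park": 0.90, "Road Park A": 1.10, "Road Park B": 1.00}

ppf = weighted_park_factor(innings, factors)
print(f"PPF = {ppf:.3f}")
```

The point of weighting this way is the Matt Cain example above: a pitcher-friendly home park pulls the factor below 1.000, but every inning thrown in a hitter-friendly road park pulls it back up in proportion to how many innings were thrown there.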
The adjustment for park is applied directly to the average hit allowed as calculated above. To adjust, I simply divided the average hit by each pitcher’s park factor. For example, the average hit allowed by both Jake Peavy and Madison Bumgarner was 1.65 total bases. After adjustment, Jake Peavy would theoretically have allowed 1.56 total bases per hit on a neutral field, and Madison Bumgarner would have allowed 1.81. The top ten and bottom ten in adjusted average hit (adjAH) are listed below.
Pitcher (lowest adjAH) | adjAH | Pitcher (highest adjAH) | adjAH
Jake Westbrook | 1.35 | Ervin Santana | 2.04
Gio Gonzalez | 1.42 | Jason Vargas | 1.88
Johnny Cueto | 1.42 | James McDonald | 1.83
Rick Porcello | 1.42 | Dan Haren | 1.82
David Price | 1.43 | Ivan Nova | 1.81
Trevor Cahill | 1.44 | Madison Bumgarner | 1.81
Tim Hudson | 1.44 | Phil Hughes | 1.81
Lucas Harrell | 1.44 | Tim Lincecum | 1.80
Justin Masterson | 1.45 | Matt Cain | 1.76
Luis Mendoza | 1.46 | Derek Holland | 1.75
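The park adjustment itself is a single division, and the Peavy and Bumgarner example can be checked with a short sketch. The AH and PPF inputs come from the text and the PPF table above, and the function name is my own.

```python
def adjusted_average_hit(average_hit: float, park_factor: float) -> float:
    """adjAH = AH / PPF, as described in the park adjustment step."""
    if park_factor <= 0:
        raise ValueError("park factor must be positive")
    return average_hit / park_factor


# Both pitchers allowed an average hit of 1.65 total bases in 2012.
peavy_adj = adjusted_average_hit(1.65, 1.058)      # hitter-friendly parks
bumgarner_adj = adjusted_average_hit(1.65, 0.913)  # pitcher-friendly parks

print(f"Jake Peavy adjAH:        {peavy_adj:.2f}")
print(f"Madison Bumgarner adjAH: {bumgarner_adj:.2f}")
```

Dividing by a factor above 1.000 lowers the average hit, giving the pitcher credit for tough parks, while dividing by a factor below 1.000 raises it.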
Pitcher (lowest HERA) | HERA | Pitcher (highest HERA) | HERA
Gio Gonzalez | 2.38 | Ivan Nova | 4.64
David Price | 2.64 | Dan Haren | 4.40
Clayton Kershaw | 2.69 | Ervin Santana | 4.26
Justin Verlander | 2.74 | Bruce Chen | 4.25
Yu Darvish | 2.77 | Mike Leake | 4.21
Chris Sale | 2.93 | Phil Hughes | 4.16
Jered Weaver | 2.93 | Joe Blanton | 4.11
Trevor Cahill | 2.98 | Rick Porcello | 4.10
Johnny Cueto | 3.01 | Henderson Alvarez | 4.09
Tim Hudson | 3.04 | Ubaldo Jimenez | 4.04
Pitcher (lowest WERA) | WERA | Pitcher (highest WERA) | WERA
Cliff Lee | 0.30 | Ricky Romero | 1.31
Bronson Arroyo | 0.39 | Edinson Volquez | 1.29
Joe Blanton | 0.40 | Ubaldo Jimenez | 1.21
Scott Diamond | 0.40 | Tim Lincecum | 1.09
Kyle Lohse | 0.41 | Aaron Harang | 1.06
Tommy Milone | 0.43 | Yu Darvish | 1.05
Wade Miley | 0.43 | Matt Moore | 1.03
Clayton Richard | 0.43 | C.J. Wilson | 1.01
Mark Buehrle | 0.44 | Justin Masterson | 0.96
Dan Haren | 0.48 | Tommy Hanson | 0.91
Pitcher (lowest eERA) | eERA | Pitcher (highest eERA) | eERA
Justin Verlander | 3.10 | Ricky Romero | 5.51
Gio Gonzalez | 3.16 | Tommy Hanson | 5.40
Clayton Kershaw | 3.35 | Ervin Santana | 5.39
R.A. Dickey | 3.39 | Dan Haren | 5.25
David Price | 3.40 | Ivan Nova | 5.24
Lucas Harrell | 3.56 | Henderson Alvarez | 5.11
Kyle Lohse | 3.57 | Tim Lincecum | 5.02
Chris Sale | 3.58 | Ubaldo Jimenez | 4.93
Josh Johnson | 3.60 | Mike Leake | 4.93
Jordan Zimmermann | 3.60 | Bruce Chen | 4.88
A strong correlation seems to exist between eERA and ERA, but how does this compare to other more widely accepted ERA estimators? First, let’s look at how well FIP estimates ERA. It is worth noting that all the following statistics were adjusted so that the average ERA, eERA, FIP, tERA, and SIERA of the 88 pitchers used in this study were equal.
As you can see, a strong relationship exists when using eERA, FIP, or tERA. The linear correlation drops considerably when we use SIERA, which is surprising, as SIERA is widely considered to be a better estimator than tERA. Of all the data presented, though, eERA shows the strongest correlation, although the difference between eERA and tERA is not large. If you remove the high outlier on tERA near 6.00 (Jeremy Guthrie), the tERA correlation increases to 0.6329, which is still weaker than eERA.
Admittedly, this metric is not perfect, but what metric truly is? I welcome feedback on the information I have presented here. With the Cy Young winners yet to be announced, it will be interesting to see if Justin Verlander and Gio Gonzalez actually take home the prizes after leading their respective leagues in eERA. Bill James and Rob Neyer’s Cy Young Predictor currently lists Verlander as the fourth-best candidate in the American League and Gio Gonzalez as second in the National League. The favorites by that metric are David Price and R.A. Dickey, who would be second and third in their leagues, respectively, by eERA.
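For anyone who wants to repeat the estimator comparison outside of Excel, here is a minimal sketch of the two steps described above: shifting an estimator so its average matches the average ERA, then measuring the linear (Pearson) correlation. The numbers are invented placeholders, not the 88 pitchers from this study, and the additive shift is an assumption about how the averages were equalized. Either an additive shift or a positive rescaling leaves the Pearson correlation unchanged, which the sketch checks.

```python
from math import sqrt


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a list has no variance")
    return cov / sqrt(var_x * var_y)


def shift_to_mean(values: list[float], target_mean: float) -> list[float]:
    """Add a constant so the list's average equals target_mean (assumed method)."""
    offset = target_mean - sum(values) / len(values)
    return [v + offset for v in values]


# Placeholder data for illustration only (not from the study).
era = [2.5, 3.0, 3.5, 4.0, 4.5]
estimator = [3.1, 3.2, 3.9, 4.2, 4.6]

estimator_adj = shift_to_mean(estimator, sum(era) / len(era))
r_raw = pearson_r(era, estimator)
r_adj = pearson_r(era, estimator_adj)

print(f"r before shift: {r_raw:.4f}")
print(f"r after shift:  {r_adj:.4f}")
```

Because the shift does not change the correlation, equalizing the averages matters for comparing the estimators against ERA on the same scale, not for deciding which one correlates best.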
--Stats All Folks