As a baseball enthusiast and wildly unsuccessful former high school pitcher, I have always been fascinated by the greatness of a dominant pitcher. As a child, I was lucky enough to watch the mastery of Greg Maddux and the dominance of Pedro Martinez. At that time, I wasn’t sure how to calculate ERA, but I knew that Maddux’s seasons in the 90s under 2.00 were special. Later, as I matured and developed a strong liking for numbers and all things mathematical, I found myself poring over table after table of statistics, believing that the numbers could reveal true greatness. Every statistic has inherent weaknesses, none of which need to be discussed in this forum. Gone are the days when ERA and Wins dominated the statistical landscape. They’ve been replaced by FIP and SIERA, both highly useful and well-thought-out statistics. In the end, though, I found myself wanting more. To satisfy that want, I did what every stat geek and math nerd would have done: I opened up an Excel spreadsheet and went to work.
 
The goal of DIPS theory and FIP was to quantify a pitcher’s effectiveness by measuring only the things that he could control. Voros McCracken’s research from the early 2000s told us that pitchers have little to no control over balls put in play. FIP essentially tries to measure the exact opposite of BABIP: the outcomes, such as strikeouts, walks, and home runs, that do not involve a ball in play. There’s a lot of merit to this idea. Pitchers who do not walk hitters and who avoid giving up home runs are generally more successful than those who fail in these areas, something Greg Maddux taught me all those years ago.
There is still something to be said, though, for a pitcher who simply avoids solid contact, whether the ball leaves the yard or not. Naturally, I’m not the first person to have this theory. Balls in play are included in the calculations for both tERA and SIERA. The problem with these statistics is that they are very complicated to understand. I set out to find a much simpler method of determining a pitcher’s value.

This brings us to the basis of my study: the average hit given up by a pitcher. After suffering through a 3-0 high school playoff loss some years ago, in which the pitchers threw dueling three-hitters with very different outcomes, it is safe to say that simply eliminating hits does not necessarily guarantee success as a pitcher.

Using very simple statistics, it is easy to figure out which pitcher “gets hit the hardest.” The formula is Average Hit (AH) = SLG/BAA = TB/H. SLG is total bases per at-bat and BAA is hits per at-bat, so the at-bats cancel and AH is simply total bases allowed per hit allowed. If we take all qualified pitchers from the 2012 season, here are the pitchers that induced the weakest contact and those that got hit the hardest.
| Pitcher (weakest contact) | AH | Pitcher (hardest hit) | AH |
| --- | --- | --- | --- |
| Felix Hernandez | 1.38 | Ervin Santana | 1.95 |
| Jake Westbrook | 1.39 | Derek Holland | 1.84 |
| David Price | 1.41 | Phil Hughes | 1.78 |
| Lucas Harrell | 1.43 | Ivan Nova | 1.77 |
| Josh Johnson | 1.44 | Mike Minor | 1.75 |
| Justin Masterson | 1.44 | James McDonald | 1.73 |
| Jarrod Parker | 1.44 | Edwin Jackson | 1.73 |
| Gio Gonzalez | 1.45 | Bruce Chen | 1.73 |
| Johnny Cueto | 1.45 | Jason Vargas | 1.72 |
| Tim Hudson | 1.46 | Tommy Hanson | 1.71 |
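
For anyone who would rather script the Average Hit calculation than build it in a spreadsheet, here is a minimal Python sketch of AH = TB/H. The function names and the sample stat line (150 singles, 35 doubles, 3 triples, 12 home runs in 800 at-bats) are invented for illustration and do not belong to any pitcher in the table.

```python
def average_hit(singles, doubles, triples, home_runs):
    """Average Hit (AH): total bases allowed per hit allowed (TB / H)."""
    hits = singles + doubles + triples + home_runs
    if hits == 0:
        raise ValueError("AH is undefined for a pitcher who allowed no hits")
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / hits


def average_hit_from_rates(slg_against, avg_against):
    """Same quantity from rate stats: SLG / BAA (the at-bats cancel)."""
    if avg_against == 0:
        raise ValueError("AH is undefined when the batting average against is zero")
    return slg_against / avg_against


# Hypothetical stat line, not a real pitcher's season.
ah = average_hit(singles=150, doubles=35, triples=3, home_runs=12)

# The same line expressed as rates over 800 at-bats (also hypothetical).
at_bats = 800
ah_from_rates = average_hit_from_rates(277 / at_bats, 200 / at_bats)

print(f"AH from counts: {ah:.3f}")
print(f"AH from rates:  {ah_from_rates:.3f}")
```

Both routes give the same number, which is the point of writing the formula as SLG/BAA: either pair of inputs works, depending on what the stat table in front of you provides.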
 
As you might expect, the pitchers that excel in this category are generally either “dominant” pitchers, such as Felix Hernandez and David Price, or sinkerball pitchers, such as Jake Westbrook and Justin Masterson. Flyball pitchers tend to find themselves in the right column. There are many factors that affect the average hit, though, that are not accounted for, namely park and defense. Not everyone gets to throw 125 innings in Safeco Field or AT&T Park. Others benefit from pitching in front of strong defensive clubs such as the Braves and Angels.

The first adjustment to make is for the parks. Now, it would be foolhardy and shortsighted to simply adjust based on a pitcher’s home park. For example, Matt Cain throws the majority of his innings in AT&T Park, but he also has to throw a handful of innings at Coors Field. Based on innings pitched in each park, I calculated a weighted park factor for each pitcher, signified by PPF. I’ll leave the nitty-gritty details of this calculation out of this explanation. The following shows which pitchers pitched in the most hitter-friendly and most pitcher-friendly environments this season.
| Pitcher (most hitter-friendly) | PPF | Pitcher (most pitcher-friendly) | PPF |
| --- | --- | --- | --- |
| Clay Buchholz | 1.109 | Felix Hernandez | 0.851 |
| Jon Lester | 1.107 | Madison Bumgarner | 0.913 |
| Jeremy Guthrie | 1.097 | Jason Vargas | 0.914 |
| Josh Beckett | 1.088 | Ryan Vogelsong | 0.922 |
| Gavin Floyd | 1.066 | Tim Lincecum | 0.923 |
| Jake Peavy | 1.058 | Matt Cain | 0.924 |
| Trevor Cahill | 1.057 | Dan Haren | 0.926 |
| Wade Miley | 1.054 | Barry Zito | 0.933 |
| Chris Sale | 1.052 | A.J. Burnett | 0.941 |
| Derek Holland | 1.051 | R.A. Dickey | 0.942 |
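
The details of the PPF calculation are not given here, so the following is only a guess at its general shape: an innings-weighted average of the park factors for every park a pitcher worked in. The function name, the park names, the innings, and the park factors below are all hypothetical, and the real calculation may differ.

```python
def weighted_park_factor(innings_by_park, park_factors):
    """Innings-weighted average park factor (an assumed form of PPF).

    innings_by_park: park name -> innings pitched there, as true decimals
                     (120 1/3 innings is 120.333..., not the box-score "120.1").
    park_factors:    park name -> park factor, where 1.0 is neutral.
    """
    total_innings = sum(innings_by_park.values())
    if total_innings <= 0:
        raise ValueError("pitcher has no innings to weight")
    weighted_sum = sum(
        innings * park_factors[park] for park, innings in innings_by_park.items()
    )
    return weighted_sum / total_innings


# Hypothetical pitcher: most innings at a pitcher-friendly home park,
# the rest split between two hitter-friendly road parks.
innings = {"Home Park": 120.0, "Road Park A": 60.0, "Road Park B": 20.0}
factors = {"Home Park": 0.90, "Road Park A": 1.05, "Road Park B": 1.20}

ppf = weighted_park_factor(innings, factors)
print(f"Weighted park factor: {ppf:.3f}")
```

In this made-up case the home park alone would suggest a factor of 0.90, but the road innings pull the weighted figure noticeably closer to neutral, which is the reason for not adjusting on home park alone.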
 
The adjustment for park is applied directly to the average hit allowed as calculated above. To adjust, I simply divided each pitcher’s average hit by his park factor. For example, the average hit allowed by both Jake Peavy and Madison Bumgarner was 1.65 total bases. After adjustment, Jake Peavy would theoretically have allowed 1.56 total bases per hit on a neutral field, and Madison Bumgarner would have allowed 1.81. The top ten and bottom ten in adjusted average hit (adjAH) are listed below.
| Pitcher (lowest adjAH) | adjAH | Pitcher (highest adjAH) | adjAH |
| --- | --- | --- | --- |
| Jake Westbrook | 1.35 | Ervin Santana | 2.04 |
| Gio Gonzalez | 1.42 | Jason Vargas | 1.88 |
| Johnny Cueto | 1.42 | James McDonald | 1.83 |
| Rick Porcello | 1.42 | Dan Haren | 1.82 |
| David Price | 1.43 | Ivan Nova | 1.81 |
| Trevor Cahill | 1.44 | Madison Bumgarner | 1.81 |
| Tim Hudson | 1.44 | Phil Hughes | 1.81 |
| Lucas Harrell | 1.44 | Tim Lincecum | 1.80 |
| Justin Masterson | 1.45 | Matt Cain | 1.76 |
| Luis Mendoza | 1.46 | Derek Holland | 1.75 |
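
The park adjustment is a single division, and the Peavy and Bumgarner example from the text can be reproduced in a few lines of Python. The function name is my own; the inputs are the AH of 1.65 quoted above and the PPF values from the park factor table.

```python
def adjusted_average_hit(average_hit, pitcher_park_factor):
    """adjAH: average hit allowed divided by the pitcher's weighted park factor."""
    if pitcher_park_factor <= 0:
        raise ValueError("park factor must be positive")
    return average_hit / pitcher_park_factor


# Both pitchers allowed an average hit of 1.65 total bases before adjustment.
peavy_adj = adjusted_average_hit(1.65, 1.058)      # hitter-friendly environment
bumgarner_adj = adjusted_average_hit(1.65, 0.913)  # pitcher-friendly environment

print(f"Jake Peavy adjAH:        {peavy_adj:.2f}")
print(f"Madison Bumgarner adjAH: {bumgarner_adj:.2f}")
```

A park factor above 1.0 lowers the adjusted figure and a factor below 1.0 raises it, so two pitchers with identical raw numbers can land a quarter of a base apart once their environments are taken into account.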
 
| Pitcher | HERA | Pitcher | HERA |
| --- | --- | --- | --- |
| Gio Gonzalez | 2.38 | Ivan Nova | 4.64 |
| David Price | 2.64 | Dan Haren | 4.40 |
| Clayton Kershaw | 2.69 | Ervin Santana | 4.26 |
| Justin Verlander | 2.74 | Bruce Chen | 4.25 |
| Yu Darvish | 2.77 | Mike Leake | 4.21 |
| Chris Sale | 2.93 | Phil Hughes | 4.16 |
| Jered Weaver | 2.93 | Joe Blanton | 4.11 |
| Trevor Cahill | 2.98 | Rick Porcello | 4.10 |
| Johnny Cueto | 3.01 | Henderson Alvarez | 4.09 |
| Tim Hudson | 3.04 | Ubaldo Jimenez | 4.04 |
 
| Pitcher | WERA | Pitcher | WERA |
| --- | --- | --- | --- |
| Cliff Lee | 0.30 | Ricky Romero | 1.31 |
| Bronson Arroyo | 0.39 | Edinson Volquez | 1.29 |
| Joe Blanton | 0.40 | Ubaldo Jimenez | 1.21 |
| Scott Diamond | 0.40 | Tim Lincecum | 1.09 |
| Kyle Lohse | 0.41 | Aaron Harang | 1.06 |
| Tommy Milone | 0.43 | Yu Darvish | 1.05 |
| Wade Miley | 0.43 | Matt Moore | 1.03 |
| Clayton Richard | 0.43 | C.J. Wilson | 1.01 |
| Mark Buehrle | 0.44 | Justin Masterson | 0.96 |
| Dan Haren | 0.48 | Tommy Hanson | 0.91 |
 
| Pitcher | eERA | Pitcher | eERA |
| --- | --- | --- | --- |
| Justin Verlander | 3.10 | Ricky Romero | 5.51 |
| Gio Gonzalez | 3.16 | Tommy Hanson | 5.40 |
| Clayton Kershaw | 3.35 | Ervin Santana | 5.39 |
| R.A. Dickey | 3.39 | Dan Haren | 5.25 |
| David Price | 3.40 | Ivan Nova | 5.24 |
| Lucas Harrell | 3.56 | Henderson Alvarez | 5.11 |
| Kyle Lohse | 3.57 | Tim Lincecum | 5.02 |
| Chris Sale | 3.58 | Ubaldo Jimenez | 4.93 |
| Josh Johnson | 3.60 | Mike Leake | 4.93 |
| Jordan Zimmermann | 3.60 | Bruce Chen | 4.88 |
 
A strong correlation seems to exist between eERA and ERA, but how does this compare to other, more widely accepted ERA estimators? First, let’s look at how well FIP estimates ERA. It is worth noting that all of the following statistics were adjusted so that the average ERA, eERA, FIP, tERA, and SIERA of the 88 pitchers used in this study were equal.
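
For readers who want to run this kind of comparison themselves, here is a rough Python sketch of the two steps involved: putting an estimator on the same average as ERA, then measuring how closely the two track each other. It is not stated here whether the estimators were shifted or rescaled to a common average, or exactly which correlation statistic was used, so the additive shift and Pearson's r below are assumptions. The function names and the five-pitcher sample are invented.

```python
from math import sqrt
from statistics import mean


def shift_to_mean(values, target_mean):
    """Add a constant to every value so the list averages target_mean (assumed method)."""
    offset = target_mean - mean(values)
    return [value + offset for value in values]


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least two values")
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return covariance / (spread_x * spread_y)


# Invented numbers for five imaginary pitchers, not taken from the study.
era = [2.50, 3.00, 3.50, 4.00, 4.50]
estimator = [3.00, 3.20, 3.90, 4.10, 4.80]

estimator_adjusted = shift_to_mean(estimator, mean(era))
r_raw = pearson_r(era, estimator)
r_adjusted = pearson_r(era, estimator_adjusted)

print(f"r before adjustment: {r_raw:.4f}")
print(f"r after adjustment:  {r_adjusted:.4f}")
```

One thing the sketch makes visible: adding a constant to an estimator does not change its Pearson correlation with ERA, so an adjustment of this kind affects how the estimators line up on the same scale, not how strongly each one correlates.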
A strong relationship exists when using any of eERA, FIP, or tERA. The linear correlation goes down considerably when we use SIERA, which is surprising, as it is widely considered to be a better estimator than tERA. Of all the data presented, though, eERA shows the strongest correlation. There is not a large difference between eERA and tERA. If you remove the high tERA outlier near 6.00 (Jeremy Guthrie), the tERA correlation increases to 0.6329, which is still weaker than that of eERA.

Admittedly, this metric is not perfect, but what metric truly is? I welcome feedback on the information I have presented here.

With the Cy Young winners yet to be announced, it will be interesting to see if Justin Verlander and Gio Gonzalez actually take home the prizes after leading their respective leagues in eERA. Bill James and Rob Neyer’s Cy Young Predictor currently lists Verlander as the fourth-best candidate in the American League and Gio Gonzalez as second in the National League. The favorites by that metric are David Price and R.A. Dickey, who would be second and third in their leagues, respectively, by eERA.
--Stats All Folks



