What affects pitching effectiveness? The second report from 2045 BBA Season Stats

allenciox
Ex-GM
Posts: 350
Joined: Sun Jun 16, 2019 2:23 pm
Has thanked: 98 times
Been thanked: 144 times

Post by allenciox » Mon Mar 08, 2021 8:25 pm

Ok, just over a week ago I posted Initial analyses from 2045 BBA Season Statistics describing how I saved all the different splits for the season stats in BBA and UMEBA last season, and some initial "statistics" on the results.

So the first thing I did was to do an analysis of how important the various pitching ratings are for predicting pitcher outcomes in 2045 BBA. The results are included below. But first I want to describe how I reached these results.

I took all the pitcher data and combined it with the ratings information. For all the analyses included herein, I used the vL and vR splits so that I had true measures of how each pitcher did against both left-handed and right-handed batters.

Then I ran stepwise regression models for each of the outcome measures I was looking at. For those that don't know, I will describe how these work and then what I did for each one. Suppose I am trying to predict FIP for pitchers using a variety of different ratings for the pitchers. In this case, I might use STU, MOV, CON, gb_pct (a conversion of NEU, FB, GB, EX GB, etc. to a supposed ground ball percentage), etc. The model first asks, "Which variable is MOST predictive of FIP of all those choices?" Then it asks, "Am I confident enough in that prediction that only once in a 'blue moon' (defined as a probability called alpha) could the data have been arranged to look like this if that variable was not really predictive?"

Once it has the "best predictor variable", it says: "Now I am going to subtract out the influence of that variable on the target, and try again with all remaining variables to repeat the process." But before it does that, it asks, "Are there any variables that I added as predictors before that no longer need to be in the model in order to get adequate predictions?" If so, then it removes those variables.

It continues the process until it can no longer find any additional predictor variables that have significant influence at that alpha probability. It then returns an equation, like so: <Prediction> = Intercept + w1*<var1> + w2*<var2> + ... for all significant predictor variables. The intercept corresponds to the value of the prediction if all the variables are 0. For each weight (w1, w2, etc.) there is also a "standard error" computed, which indicates how much the actual weight might vary from the computed weight. Generally, there is about a 68% chance that the actual weight lies within one standard error of the computed weight, and about a 95% chance that it lies within two standard errors.
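To make the equation and the standard-error ranges concrete, here is a minimal Python sketch. The function names are made up for illustration, and the numbers plugged in are the reliever CON weight and standard error from the bb_bf table in this post; this is not the code used for the analysis.

```python
def predict(intercept, weights, ratings):
    """Evaluate Prediction = Intercept + w1*var1 + w2*var2 + ..."""
    return intercept + sum(weights[name] * ratings[name] for name in weights)


def weight_range(weight, std_err, n_se):
    """Plausible range for a weight: weight +/- n_se standard errors."""
    return (weight - n_se * std_err, weight + n_se * std_err)


# Reliever CON weight and its standard error, taken from the bb_bf table.
con_weight, con_se = -0.017912, 0.001000

one_se = weight_range(con_weight, con_se, 1)  # covers the true weight roughly 68% of the time
two_se = weight_range(con_weight, con_se, 2)  # covers the true weight roughly 95% of the time
print(one_se, two_se)
```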

Ok, now that I have explained it, I'll tell you what I did. I first separated all the pitchers into "starters" (those that had started at least 85% of their games) and "relievers" (those that had started fewer than 10% of their games). Any pitchers that had fewer than 50 batters faced (bf) or that did not meet either criterion were excluded from the analysis.
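The grouping rule above can be sketched in a few lines of Python. The function and argument names are hypothetical; only the 85%, 10%, and 50 bf thresholds come from the post.

```python
def classify_role(games, games_started, batters_faced, min_bf=50):
    """Return 'starter', 'reliever', or None if the pitcher is excluded."""
    if games <= 0 or batters_faced < min_bf:
        return None  # too small a sample to include
    start_share = games_started / games
    if start_share >= 0.85:
        return "starter"
    if start_share < 0.10:
        return "reliever"
    return None  # swingmen fall between the two thresholds and are dropped


print(classify_role(games=32, games_started=32, batters_faced=800))
print(classify_role(games=60, games_started=0, batters_faced=250))
print(classify_role(games=40, games_started=15, batters_faced=400))
```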

So then I did analyses for all combinations of pitcher handedness x batter handedness x reliever vs. starter (i.e. 8 analyses), in each case trying to predict walks per batter faced (bb_bf), strikeouts per at-bat against (k_ab), home runs allowed per at-bat (hr_ab), FIP, and ERA. In each case, I did a "weighted" regression where I weighted by bf so that pitchers that pitched more had a stronger effect on the results. The variables that I used to predict each one were the "appropriate" STU, CON, MOV (i.e. if I was looking at vR batters, I used the vR variants; if I was looking at vL batters, I used the vL variants), gb_pct, kb (1 or 0 for whether or not they were a knuckleballer), kc (1 or 0 for whether or not they had a knuckleball curve pitch), PIT (# of pitches in their repertoire), good_pit (# of pitches in their repertoire rated higher than 2), and Stamina. In all my tests, I used an alpha of .01, meaning it would be ok if 1% of the time it thought something was significant that really wasn't.
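For readers unfamiliar with weighting, here is a minimal pure-Python sketch of a regression weighted by batters faced. It is simplified to a single predictor (the real analysis was stepwise over several candidate variables), and the sample data are invented purely for illustration.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least squares for y = intercept + slope * x."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def weighted_r_squared(x, y, w, intercept, slope):
    """Share of the weighted variance in y explained by the fitted line."""
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    ss_res = sum(wi * (yi - (intercept + slope * xi)) ** 2
                 for wi, xi, yi in zip(w, x, y))
    ss_tot = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    return 1 - ss_res / ss_tot


# Invented sample: CON minus 5, walk rate, and batters faced for four pitchers.
con = [0, 1, 2, 4]
bb_bf = [0.125, 0.112, 0.090, 0.060]
bf = [400, 150, 60, 900]

intercept, slope = weighted_linear_fit(con, bb_bf, bf)
print(intercept, slope, weighted_r_squared(con, bb_bf, bf, intercept, slope))
```

Pitchers with more batters faced pull the fitted line toward their own results, which is the point of weighting by bf.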

The result was that the ONLY significant variables in any of those regressions were the relevant STU, MOV, and CON ratings. I was surprised that good_pit did not matter for starters, so I checked and there was only one pitcher with a "starter" role with at least 50 batters faced in all of BBA in 2045 that had fewer than three good pitches. It didn't find an effect because there weren't enough examples to show it. I therefore removed that one player (Miguel Angel Garza) from the rest of my analysis.

The other thing that I found was that, within each role, the effects of STU, MOV, and CON were within roughly one "standard error" of each other across all four matchups (LvL, LvR, RvL, RvR), both for starters and for relievers. This suggests that the engine really was using CONvL, CONvR, etc. as the prediction for what would happen: i.e. left-handed relievers weren't getting any "boost" from the engine compared to right-handed relievers, etc.

Since this was true, for each pitcher I could just take the actual percentage of time that pitcher faced left-handed and right-handed batters, and compute a combined rating based on that actual percentage (e.g. if a certain pitcher faced righties exactly 2/3 of the time, I would compute MOV = 2/3*MOV_vR + 1/3*MOV_vL for that pitcher). I then could do an analysis across only two groups, starters and relievers, again weighted by batters faced (for starters and relievers, there WERE significant differences in the computed weights).
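The combined rating is just an average of the two split ratings, weighted by how often the pitcher actually faced each side. A small Python sketch, with hypothetical names:

```python
def combined_rating(rating_vr, rating_vl, bf_vr, bf_vl):
    """Blend the vR and vL ratings by the share of batters faced from each side."""
    total = bf_vr + bf_vl
    if total == 0:
        raise ValueError("pitcher faced no batters")
    share_vr = bf_vr / total
    return share_vr * rating_vr + (1 - share_vr) * rating_vl


# A pitcher who faced righties exactly 2/3 of the time:
# MOV = 2/3 * MOV_vR + 1/3 * MOV_vL
print(combined_rating(rating_vr=6, rating_vl=9, bf_vr=200, bf_vl=100))
```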

The other thing I did was to verify, by looking at diagnostic plots, whether a "linear" function, as mentioned above, was a good fit for the data, or whether a more complicated function would be needed. What I found was that a linear function worked great except when a pitcher's CON was below 5. As you can verify from the player editor (and I have also found previously looking at perfect team and perfect team tournament data), each point in CON below 100 in the editor (or 5, for us) is about equivalent to two points in CON above 5. What I did was to adjust the scores so that I subtracted five from STU, MOV, and CON for each pitcher. For CON, if that left a negative number, I then multiplied that by two (so a CON of 4 would be converted to -2, a CON of 3 would be converted to -4, etc.).
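The adjustment described above can be written as two small Python helpers. The names are illustrative; the baseline of 5 and the doubling below it come from the post.

```python
def adjust_rating(rating, baseline=5):
    """Center STU or MOV so that a rating of 5 becomes 0."""
    return rating - baseline


def adjust_con(con, baseline=5, low_multiplier=2):
    """Center CON at 5, counting each point below 5 double."""
    centered = con - baseline
    return centered * low_multiplier if centered < 0 else centered


for con in (3, 4, 5, 6, 9):
    print(con, "->", adjust_con(con))
```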

Now that I have explained the procedure, let me go through the results. First, here is a table for the results predicting bb_bf (walks per batter faced):

bb_bf:
role     _TYPE_  _RSQ_    Intercept  STU  MOV  CON
relieve  PARMS   0.60326  0.12829    .    .    -0.017912
relieve  STDERR  .        0.00256    .    .    0.001000
starter  PARMS   0.72557  0.11872    .    .    -0.017348
starter  STDERR  .        0.00239    .    .    0.000871

This shows the results separately for relievers and starters. _RSQ_ is the r-square value for the regression. If you could always predict exactly the percentage of batters that would be walked, this would be 1 (or 100% of the variance). Here we have an r-square of 60.3% for relievers and 72.6% for starters. These are really high numbers, considering the ratings are rounded. The entire r-square is accounted for by CON: knowing a pitcher's STU and MOV tells you absolutely nothing more about the percentage of batters they are going to walk --- which is what we would expect to see (yes, I know we already know this is the way it works, but verifying it gives confidence in our data). Note that the r-square for relievers is lower. We would expect this too, since starters generally face a lot more batters than relievers do, so their results get closer to what they "should" be.

Since we subtracted five from each rating before doing this analysis, the Intercept tells us that a reliever with a CON of 5 vR and vL would expect to walk 12.83% of batters, while a CON=9 reliever would expect to walk about 5.67% (12.83% - 1.79%*4) of batters in this league.
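As a rough calculator for the walk-rate equation, here is a Python sketch that plugs the table's intercepts and CON weights into the adjusted-CON formula. The dictionary and function names are invented for illustration.

```python
# (intercept, CON weight) per role, copied from the bb_bf table.
BB_BF = {
    "reliever": (0.12829, -0.017912),
    "starter": (0.11872, -0.017348),
}


def predict_bb_bf(role, con):
    """Expected walks per batter faced for a given role and CON rating."""
    intercept, con_weight = BB_BF[role]
    centered = con - 5
    if centered < 0:
        centered *= 2  # each point of CON below 5 counts double
    return intercept + con_weight * centered


for con in (4, 5, 7, 9):
    print(con, round(predict_bb_bf("reliever", con), 4),
          round(predict_bb_bf("starter", con), 4))
```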

Starters have a slightly lower baseline: a starter with a CON of 5 would expect to walk 11.87% of batters (footnote: is this statistically different from relievers? The two intercepts are about 0.96 percentage points apart, which is roughly four standard errors, so yes: as a baseline, starting pitchers walk a lower percentage of batters than relievers at the same CON value). However, the change in the percentage of batters that starters walk per point of CON is almost exactly the same as it is for relievers, so while starters have a lower baseline, the effect of a point difference for starters appears to be about the same as for relievers (the two weights are within one standard error of each other).

So, now let's look at percentage of abs that result in strike outs (k_ab):

role     _TYPE_  _RSQ_    Intercept  STU       MOV  CON
relieve  PARMS   0.80793  0.064011   0.038591  .    .
relieve  STDERR  .        0.007822   0.001295  .    .
starter  PARMS   0.71182  0.089721   0.026739  .    .
starter  STDERR  .        0.006446   0.001389  .    .

Surprise! The only rating that affects strikeout rates is STU. I'll bet you never would have guessed that, would you? :cool: Look at the r-squared values here: 80.8% for relievers and 71.2% for starters. These are huge values, once again indicating that our equation is about as good as it could get with the rounded information we have.

Note something else: a baseline 5 STU reliever gets about 2.6 percentage points fewer strikeouts than a baseline 5 STU starter. Well, duh... recall that a starter gains about a point or so in STU when set as an RP, which probably not coincidentally makes them roughly even (i.e. a 5 STU starter would strike out about 9% of batters, and a 6 STU reliever about 10%). However, each point in STU results in MORE strikeouts for relievers than it does for starters. A 10 STU starter would strike out about 8.97% + 5*2.67% = 22.3%, while as an 11 STU reliever that pitcher would strike out about 6.4% + 6*3.86% = 29.6% of the batters he faces.
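The starter-versus-reliever comparison above can be reproduced with a short Python sketch. Names are illustrative, the coefficients are copied from the k_ab table, and the one-point STU bump when moving to the bullpen is the rule of thumb mentioned in the post, not something estimated here.

```python
# (intercept, STU weight) per role, copied from the k_ab table.
K_AB = {
    "reliever": (0.064011, 0.038591),
    "starter": (0.089721, 0.026739),
}


def predict_k_ab(role, stu):
    """Expected strikeouts per at-bat for a given role and STU rating."""
    intercept, stu_weight = K_AB[role]
    return intercept + stu_weight * (stu - 5)


def starter_vs_bullpen(stu_as_starter, relief_bump=1):
    """Compare a pitcher's K rate as a starter and as a reliever with bumped STU."""
    as_starter = predict_k_ab("starter", stu_as_starter)
    as_reliever = predict_k_ab("reliever", stu_as_starter + relief_bump)
    return as_starter, as_reliever


for stu in (5, 7, 10):
    print(stu, starter_vs_bullpen(stu))
```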

Moral of the story: high STU pitchers might work better as relievers.

Let's look at hr_ab:

role     _TYPE_  _RSQ_    Intercept  STU  MOV           CON
relieve  PARMS   0.32600  0.051478   .    -.008992451   .
relieve  STDERR  .        0.001521   .    0.000890140   .
starter  PARMS   0.29843  0.048840   .    -.006994349   .
starter  STDERR  .        0.001561   .    0.000875614   .

If it surprises you that MOV is the only rating that affects home runs allowed, stop reading right now --- you are clearly not ready for this class. Note, though, that the r-squared values are much, much lower than they were for bb and K. Part of the reason is obviously park factors, but that doesn't seem sufficient to me to completely explain the disparity. Whether the baseline value for home run % (i.e. when MOV is 5) differs between starters and relievers is not entirely clear (the intercepts are a little under two standard errors apart). However, as with K %, HR % changes more rapidly per point of MOV for relievers than for starters.
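For anyone who wants to check "how many standard errors apart" two estimates are, here is a small Python sketch. It reports the gap two ways: against the larger of the two standard errors (the informal comparison used in this post) and against the combined standard error, which is a common convention for two independent estimates and is an assumption added here rather than part of the original analysis.

```python
import math


def gap_in_standard_errors(a, se_a, b, se_b):
    """Return (gap / larger SE, gap / combined SE) for two estimates."""
    diff = abs(a - b)
    vs_larger_se = diff / max(se_a, se_b)
    vs_combined_se = diff / math.sqrt(se_a ** 2 + se_b ** 2)
    return vs_larger_se, vs_combined_se


# hr_ab intercepts (reliever vs. starter), copied from the table.
print(gap_in_standard_errors(0.051478, 0.001521, 0.048840, 0.001561))

# hr_ab MOV weights (reliever vs. starter), copied from the table.
print(gap_in_standard_errors(-0.008992451, 0.000890140, -0.006994349, 0.000875614))
```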

Now, let's look at the measures that you have all been waiting for, FIP and ERA. First, FIP, since a pitcher has more control over that:

role     _TYPE_  _RSQ_    Intercept  STU       MOV       CON       STU_pct  MOV_pct  CON_pct
relieve  PARMS   0.49788  7.50995    -0.33721  -0.57310  -0.18424  0.30808  0.52360  0.16832
relieve  STDERR  .        0.22906    0.02609   0.06365   0.04386   .        .        .
starter  PARMS   0.53263  7.23568    -0.26882  -0.36175  -0.31752  0.28354  0.38156  0.33491
starter  STDERR  .        0.19906    0.02924   0.06184   0.04191   .        .        .

For a composite statistic that uses all three of the ratings, having about 50% of the variance accounted for is, I think, quite good. Note that I have added three additional columns here. What I have done is sum all three weights (STU, MOV, CON) and then divide each individual weight by that sum, to give the relative percentage that each rating contributes to the FIP prediction. If they were all equal in importance, each would be 33.33%. Here we see that MOV has the strongest relationship (38.2% for starters, 52.4% for relievers), but there is a huge difference in how important CON is: it appears to be about twice as important for starters as for relievers, and the two weights are about three standard errors apart --- clearly a significant difference.
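The three _pct columns can be recomputed from the weights with a few lines of Python. The function name is made up, and taking absolute values is an assumption that changes nothing here because all three weights are negative.

```python
def relative_importance(weights):
    """Each weight's share of the summed weights (by absolute size)."""
    total = sum(abs(w) for w in weights.values())
    return {name: abs(w) / total for name, w in weights.items()}


# FIP weights copied from the table.
reliever_fip = {"STU": -0.33721, "MOV": -0.57310, "CON": -0.18424}
starter_fip = {"STU": -0.26882, "MOV": -0.36175, "CON": -0.31752}

print(relative_importance(reliever_fip))
print(relative_importance(starter_fip))
```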

Recall that differences in MOV and STU seem to result in bigger performance differences in relievers than starters, while CON differences were about the same. That seems to account for it. To me, this implies that you want to make your high CON pitchers starters, high STU and MOV pitchers relievers.

Something else to note: a 5-5-5 pitcher would not expect a very good FIP in this league: about 7.51 as a reliever or 7.24 as a starter. Each point across the board is worth about 1.09 FIP as a reliever and about 0.95 FIP as a starter, so an 8-8-8 reliever would expect about a 4.2 FIP and an 8-8-8 starter about a 4.4 FIP.
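To try other rating combinations, the FIP equation can be wrapped in a small Python function. This is only a sketch with invented names: the coefficients are copied from the FIP table, ratings are centered at 5, and CON below 5 is counted double as described earlier in the post.

```python
# Coefficients copied from the FIP table.
FIP = {
    "reliever": {"intercept": 7.50995, "STU": -0.33721, "MOV": -0.57310, "CON": -0.18424},
    "starter": {"intercept": 7.23568, "STU": -0.26882, "MOV": -0.36175, "CON": -0.31752},
}


def predict_fip(role, stu, mov, con):
    """Expected FIP for a role given STU, MOV, and CON on the 1-10 style scale."""
    p = FIP[role]
    con_adj = con - 5
    if con_adj < 0:
        con_adj *= 2  # each point of CON below 5 counts double
    return (p["intercept"]
            + p["STU"] * (stu - 5)
            + p["MOV"] * (mov - 5)
            + p["CON"] * con_adj)


for ratings in ((5, 5, 5), (8, 8, 8), (9, 7, 4)):
    print(ratings,
          round(predict_fip("reliever", *ratings), 2),
          round(predict_fip("starter", *ratings), 2))
```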

Here is the table for ERA:

role     _TYPE_  _RSQ_    Intercept  STU       MOV       CON       STU_pct  MOV_pct  CON_pct
relieve  PARMS   0.31513  7.39922    -0.32010  -0.56718  -0.20660  0.29263  0.51850  0.18887
relieve  STDERR  .        0.32625    0.03717   0.09066   0.06246   .        .        .
starter  PARMS   0.28348  7.04395    -0.25327  -0.31463  -0.32401  0.28396  0.35276  0.36328
starter  STDERR  .        0.32236    0.04735   0.10015   0.06787   .        .        .

Note how similar it looks to FIP, only the r-square is lower since fielding is now a big component.

I must admit, though, I was very surprised about the ERA results. Not by what is there, but what isn't. What isn't clear to a lot of people is that MOV is also a composite: of the built-in MOV (basically, the percentage of fly balls that are home runs) and the ground-ball percentage. Pitchers that force more ground balls, everything else being equal, will give up fewer home runs (and thus show a higher MOV). There is generally a trade-off for this, however. Generally, ground ball pitchers give up more hits (since a greater percentage of ground balls than fly balls are hits). In previous analyses I have done for perfect team, ground-ball percentage plays a significant role in predicting ERA, but not necessarily FIP, since ERA includes BABIP as part of it.

So I did a final analysis of BABIP. I don't need to show the table for that. None of the ratings had a significant effect for BABIP --- not even gb_pct. Perhaps it is that one season is just not enough data to show it. But I find it perplexing to some extent.

So I have really geeked out here, and it might be that none of this is of interest to anyone except the geekiest of us all (myself). I currently plan to do two or three more features: batting, defense, and perhaps running. But it is also time-consuming to do all this analysis and to write it up as well.
Is this of interest to people? If it really isn't, I could just keep it all to myself (and use it to secretly beat everybody ;-) )

Thanks for your attention, assuming you read this far.

Oh, one final note: I did also perform the analyses WITHOUT CON below 5 as double importance. Results were clearly better (higher r-squared) for bb_bf, FIP, and ERA when low CON was double-pointed.

Dington
GB: Recruiting & Development Director
Posts: 4816
Joined: Mon Mar 30, 2020 12:06 am
Has thanked: 2074 times
Been thanked: 1147 times

Re: What affects pitching effectiveness? The second report from 2045 BBA Season Stats

Post by Dington » Mon Mar 08, 2021 9:23 pm

😳
Nashville Bluebirds GM
HOW I BUILD A WINNING TEAM <---Click
Kuwait City GM 2042-43
2043 UMEBA United Cup Champion*

shoeless.db
BBA GM
Posts: 2317
Joined: Wed May 29, 2019 10:25 pm
Has thanked: 1823 times
Been thanked: 1090 times

Re: What affects pitching effectiveness? The second report from 2045 BBA Season Stats

Post by shoeless.db » Tue Mar 09, 2021 8:04 am

Good stuff. Any chance the analysis can be provided as poetry next time?
Sacramento Mad Popes (unretired)
-- Vic Caleca Team News Award Winner 2052
-- BBA Champion 2053
-- Pacific Champs 2040, 2042, 2043, 2047, 2048, 2049, 2051, 2053, 2054
Life is a bit more beautiful when time is measured by the half inning rather than the half hour.

aaronweiner
BBA GM
Posts: 12020
Joined: Wed Jun 16, 2010 1:56 pm
Has thanked: 50 times
Been thanked: 761 times

Re: What affects pitching effectiveness? The second report from 2045 BBA Season Stats

Post by aaronweiner » Tue Mar 09, 2021 8:26 am

As with all good science, I think the gallery would like an abstract.

I understood every word, but, then, I am an actual statistics teacher.

Spiccoli
Ex-GM
Posts: 1376
Joined: Fri Jun 23, 2017 4:24 pm
Has thanked: 123 times
Been thanked: 118 times

Re: What affects pitching effectiveness? The second report from 2045 BBA Season Stats

Post by Spiccoli » Tue Mar 09, 2021 9:52 am

Ok... Anyone have a cliff notes TLDR summary?... lol

Shouldn't any good research study have a summary paragraph for Dummys?
Scott Piccoli GM Twin Cities

allenciox
Ex-GM
Posts: 350
Joined: Sun Jun 16, 2019 2:23 pm
Has thanked: 98 times
Been thanked: 144 times

Re: What affects pitching effectiveness? The second report from 2045 BBA Season Stats

Post by allenciox » Tue Mar 09, 2021 10:11 am

How about a list of takeaways? Many of these takeaways are "tentative" and can become more iron-clad once I have more seasons' worth of data to analyze:

1. Most or all of a pitcher's effectiveness can be determined by the three primary ratings: STU, MOV, and CON. Other ratings, including number of pitches (as long as a starter has at least three decent ones), which pitches they are, ground ball percentage, stamina, etc., seem to have little to no effect on how effective the pitcher is.

2. MOV appears to be the most important rating, but for starters not by a huge amount over CON and STU. In fact, evaluating starters by just summing up their STU, MOV, and CON should work fine.

3. High STU and MOV are more important for relievers than for starters. In fact, CON appears to be much less important for relievers than the other ratings.

4. However, beware of pitchers with a CON < 5. They are worse than they appear!

RonCo
GB: JL Frontier Division Director
Posts: 19815
Joined: Sat Nov 14, 2015 10:48 pm
Has thanked: 1982 times
Been thanked: 2902 times

Re: What affects pitching effectiveness? The second report from 2045 BBA Season Stats

Post by RonCo » Tue Mar 09, 2021 2:47 pm

On BB_BF:
However, the difference in the percentage of batters that starters walks is almost exactly the same as it is for relievers, so while starters have a lower baseline, the effect of a point difference for starters appears to be about the same as for relievers (they are within one standard error of each other).
I'm wondering if this result indicates that on the whole relievers tend to sit further down the rating scale than starters. By that I mean that all "5"s are not created equal, nor are all "6"s and so on. If relievers tend to be low on the individual rating scale and starters tend to be higher, which I'd guess is likely, might one see the dynamic you're describing?

On K_AB:
Moral of the story: high STU pitchers might work better as relievers.
Unless I'm missing something, this data set is a little noisy due to the bonus step-up pitchers get when they are used as relievers (or steps down as starters, depending on how you want to think about it). In other words some of the guys in the Starter sample are pitching a number of innings with the reliever bonus, and some of the Reliever sample are pitching a number of innings with the starter penalty. The "5" cross-over is interesting, though, and makes me ask about where the sample might be in the rating scale again.

The Log5 results engine should be consistent for all cases in the end.

On HR_AB:
Note, though, that the r-squared values are much, much lower than they were for bb and K. Part of the reason is obviously park factors, but that doesn't seem sufficient to me to completely explain the disparity.
That makes me wonder about the distribution of offensive ratings and how they influence this. Again, the Log5s are the Log5s. If the league's hitter's POWER ratings trend or skew higher than the league's EYE ratings, then I would guess that would influence the pitcher's ability to control things, too. Thoughts?

I'm also wondering about the impact of sample size on the fidelity of the model. In most games there are many more Ks than BB, and more BB than HR. So with fewer occurrences of each, randomness holds greater sway. Add this to the performance maps between Ks and HR, for example (the gap between low K/AB pitchers and high K/AB pitchers is considerably less than that between high and low HR/AB pitchers), and you've got to add even more sample to get better RSQ values. That's my story, and I'm sticking to it, anyway.
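A quick way to see the sample-size point is the binomial standard error of an observed rate, expressed as a fraction of the rate itself. The Python sketch below uses made-up league-ish rates and a made-up number of opportunities, purely for illustration.

```python
import math


def relative_sampling_noise(rate, opportunities):
    """Binomial standard error of an observed rate, as a fraction of that rate."""
    return math.sqrt(rate * (1 - rate) / opportunities) / rate


# Illustrative rates only: strikeouts are common, walks less so, homers rare.
for label, rate in (("K", 0.20), ("BB", 0.09), ("HR", 0.035)):
    print(label, round(relative_sampling_noise(rate, 600), 3))
```

Over the same number of opportunities, the rarer the event, the larger the noise is relative to the rate being measured.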

Here, for example, is the baseline editor's performance map for Movement:
[Image: Perf-Maps-KvHR-v21.PNG]

On FIP:
there is a huge difference in how effective CON is: It appears to be about twice as important for starters than relievers, and about three standard errors apart --- clearly a significant difference.
I'm leaning toward the idea that this is more about the distribution of ratings among pitchers of different types. From above, relievers tend to give up more walks than starters of the same rating. But if "5" relievers tend to be lower on the rating scale than "5" starters, then you'd probably see this. I've got to think a bit more about it.

On BABIP:
So I did a final analysis of BABIP. I don't need to show the table for that. None of the ratings had a significant effect for BABIP --- not even gb_pct. Perhaps it is that one season is just not enough data to show it. But I find it perplexing to some extent.
On the whole this makes me happy--though I think the lack of influence of GB% is probably a small flaw in the results engine. Bottom line, unless OOTP changed an algorithm, the vast majority of pitchers will not have any real influence on BABIP.
GM: Bikini Krill
Nothing Matters But the Pacific Pennant
Roster

CTBrewCrew
GB: FL Heartland Division Director
Posts: 5111
Joined: Wed Sep 04, 2019 11:20 am
Location: Milford, CT
Has thanked: 923 times
Been thanked: 1317 times

Re: What affects pitching effectiveness? The second report from 2045 BBA Season Stats

Post by CTBrewCrew » Tue Mar 09, 2021 9:54 pm

Image

CTBrewCrew
GB: FL Heartland Division Director
Posts: 5111
Joined: Wed Sep 04, 2019 11:20 am
Location: Milford, CT
Has thanked: 923 times
Been thanked: 1317 times

Re: What affects pitching effectiveness? The second report from 2045 BBA Season Stats

Post by CTBrewCrew » Tue Mar 09, 2021 9:55 pm

Seriously...very interesting @allenciox 👍🏻
