What affects defense? The fourth report from 2045 BBA Season stats

Post by allenciox » Sun Mar 21, 2021 3:33 pm

Last time I looked at the batting results from last season's stats. This time I will focus on defense. Two more reports are currently planned: base running/stealing, and finally WAR to tie them all together, i.e. how can we predict a player's WAR in BBA based on their ratings?

Notice how all the reports tie into WAR. For pitchers, it is easy: WAR is computed directly from FIP. For batters, it is an amalgam of BatR (batting runs), DefR (runs saved, generally represented by ZR), and BsR (base running runs), all divided by the number of runs represented by a single win (generally around 10).

So let's get on with it. First, I computed averages of each of the ratings and defensive stats for every player with at least 50 IP (innings played in the field), grouped by listed position. The averages are weighted by IP, so players who played more have a bigger influence on the results. Note that the first row in the table below is the total for all positions:
POS  FREQ  RNG   ARM   ERR   tdp   DEF   BIZ_R_  BIZ_L_  BIZ_E_  BIZ_U_  BIZ_Z_  zr_yr     EFF    PCT
All  524   6.86  8.16  8.10  4.54  8.14  95.67%  82.95%  59.64%  38.17%  18.46%   0.09683  1.000  0.987
C     81   7.36  6.92  .     1.51  7.63  96.28%  91.49%  76.67%   0.73%   0.00%   0.41174  1.002  0.992
1B    72   4.07  5.67  4.96  3.48  7.18  94.61%  85.36%  72.13%  66.45%  22.65%  -0.32460  0.996  0.992
2B    67   7.50  6.95  8.67  8.50  8.80  93.60%  76.28%  54.82%  41.99%  22.37%  -1.23769  0.995  0.987
3B    49   5.54  9.95  7.61  6.02  7.99  92.73%  83.16%  70.24%  56.64%  28.15%  -0.17189  0.999  0.967
SS    65   8.90  9.09  9.86  9.67  9.38  94.68%  79.05%  57.40%  43.54%  31.51%   2.19433  1.013  0.986
CF    63   9.11  9.13  9.53  3.07  9.02  99.25%  91.94%  65.59%  58.48%  26.64%   1.57756  1.008  0.995
LF    74   5.70  7.92  7.56  1.64  7.12  97.11%  78.27%  42.03%  20.73%   8.85%  -1.34393  0.990  0.986
RF    53   6.06  9.80  7.98  1.52  7.75  97.07%  78.87%  39.19%  18.53%   6.29%  -0.43749  0.998  0.987
The FREQ column is the number of players listed at that position who had at least 50 IP. The RNG, ARM, and ERR columns hold the rating appropriate to the position (i.e. RNG corresponds to IF_RNG for infielders, OF_RNG for outfielders, and C_ABI for catchers). The BIZ columns give the average probability of making routine (BIZ_R_), likely (BIZ_L_), even (BIZ_E_), unlikely (BIZ_U_), and remote (BIZ_Z_) plays, going from left to right. zr_yr is ZR normalized to 1200 IP. EFF is the average defensive efficiency, and PCT is the average fielding percentage.
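For anyone who wants to reproduce this kind of table, here is a minimal sketch of the IP weighting and the 1200-IP normalization. It is not the script used for this report: the player rows, the helper names `zr_per_1200` and `weighted_mean`, and the assumption that zr_yr is simply ZR × 1200 / IP are all mine, for illustration only.

```python
# Hypothetical player rows: (listed position, innings played, range rating, raw ZR).
# The numbers are invented for illustration, not taken from the league file.
players = [
    ("SS", 1200.0, 9.0, 3.0),
    ("SS", 600.0, 8.0, -1.0),
    ("SS", 40.0, 5.0, -0.5),  # under 50 IP, so excluded below
]

MIN_IP = 50.0
FULL_SEASON_IP = 1200.0


def zr_per_1200(zr, ip):
    """Scale raw ZR to a 1200-inning season (assumed definition of zr_yr)."""
    return zr * FULL_SEASON_IP / ip


def weighted_mean(values, weights):
    """Average in which each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Keep only players who meet the innings threshold, then weight by IP.
qualified = [p for p in players if p[1] >= MIN_IP]
ips = [p[1] for p in qualified]

avg_rng = weighted_mean([p[2] for p in qualified], ips)
avg_zr_yr = weighted_mean([zr_per_1200(p[3], p[1]) for p in qualified], ips)

print(f"players used: {len(qualified)}")
print(f"IP-weighted RNG: {avg_rng:.2f}")
print(f"IP-weighted zr_yr: {avg_zr_yr:.2f}")
```

The point of the weighting is that the 1200-IP regular moves the average twice as much as the 600-IP part-timer, and the 40-IP player does not count at all.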

Some notes from the table:

First of all, the BBA features very talented defensive players, especially at the skill positions. The average SS has a range of 8.9, with all other defensive ratings averaging over 9. Centerfielders have far greater range than LF or RF, and even more than SS. 3B and RF both average close to a 10 rating on Arm. The average DEF rating across all positions is a whopping 8.14.

The second thing is to look at the zr_yr and efficiency averages across positions. These are supposedly normalized so that each position averages 0 for ZR and 1.0 for efficiency. This appears to be the case overall: the average efficiency across positions is exactly 1, and zr_yr is close to 0 (excluding catchers it would be almost 0; I am not sure why catchers have such a positive zr_yr). Of course, I have the stats by "listed" position, i.e. the position each player is designated at, because nothing in the data tells me which position they actually played, unfortunately. A team could have a player listed as a SS all year while playing him exclusively at 2B. And indeed, there is evidence that players do, with some frequency, play at a position other than the one they are listed at.

On a position-by-position basis, zr_yr and EFF vary substantially. Essentially, they are highest for SS and CF. This is likely because when a listed SS or CF ends up playing another position, he is usually better there than the average player listed at that position. On the other hand, when second or third basemen end up playing SS, or a LF or RF plays CF, they are worse than the players listed at that position, and so they perform worse.

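Next, I ran a stepwise regression to determine which ratings predicted zr_yr at each position. Please see my previous posts for how the stepwise regression works. Here are the results for each position. The PARMS rows give the fitted coefficients and the STDERR rows their standard errors; a "." means the rating was not selected. _EDF_ is the error degrees of freedom, i.e. the number of qualifying players minus the number of fitted parameters (for example, 72 first basemen minus an intercept and one slope gives 70):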
POS  _TYPE_  _RSQ_    _EDF_  Intercept  IF_RNG   IF_ARM   IF_ERR  OF_RNG   OF_ARM  OF_ERR  C_ARM  C_ABI    tdp  ht_num
1B   PARMS   0.33322  70     -11.0126   2.62842  .        .       .        .       .       .      .        .    .
1B   STDERR  .        .        1.9230   0.44440  .        .       .        .       .       .      .        .    .
2B   PARMS   0.57101  65     -38.9139   5.02178  .        .       .        .       .       .      .        .    .
2B   STDERR  .        .        4.1051   0.53989  .        .       .        .       .       .      .        .    .
3B   PARMS   0.46654  46     -32.6350   2.27430  1.99556  .       .        .       .       .      .        .    .
3B   STDERR  .        .        5.9154   0.41804  0.51812  .       .        .       .       .      .        .    .
C    PARMS   0.19937  79      -7.3413   .        .        .       .        .       .       .      1.05392  .    .
C    STDERR  .        .        1.7874   .        .        .       .        .       .       .      0.23762  .    .
CF   PARMS   0.52897  61     -52.3696   .        .        .       5.92180  .       .       .      .        .    .
CF   STDERR  .        .        6.5494   .        .        .       0.71548  .       .       .      .        .    .
LF   PARMS   0.63970  72     -19.3796   .        .        .       3.16278  .       .       .      .        .    .
LF   STDERR  .        .        1.6400   .        .        .       0.27973  .       .       .      .        .    .
RF   PARMS   0.54258  51     -21.5618   .        .        .       3.48739  .       .       .      .        .    .
RF   STDERR  .        .        2.8136   .        .        .       0.44838  .       .       .      .        .    .
SS   PARMS   0.58063  62     -88.9274   8.25061  1.94552  .       .        .       .       .      .        .    .
SS   STDERR  .        .        9.9530   0.95127  0.51203  .       .        .       .       .      .        .    .
I also tried using DEF instead of the individual ratings, but in nearly every case (the exception being 1B, and that only by a slight amount), I got a better r-squared result using the individual fielding ratings.
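To make the regression table concrete, here is a small sketch that plugs ratings into the fitted equations to get a predicted zr_yr. The intercepts and coefficients are copied from the PARMS rows above; the function name `predict_zr_yr` and the example shortstops are hypothetical, and the predictions are only as good as the r-squared values allow.

```python
# Intercepts and coefficients copied from the PARMS rows of the regression table.
# Ratings missing for a position were not selected by the stepwise procedure.
ZR_MODELS = {
    "1B": {"Intercept": -11.0126, "IF_RNG": 2.62842},
    "2B": {"Intercept": -38.9139, "IF_RNG": 5.02178},
    "3B": {"Intercept": -32.6350, "IF_RNG": 2.27430, "IF_ARM": 1.99556},
    "SS": {"Intercept": -88.9274, "IF_RNG": 8.25061, "IF_ARM": 1.94552},
    "C": {"Intercept": -7.3413, "C_ABI": 1.05392},
    "CF": {"Intercept": -52.3696, "OF_RNG": 5.92180},
    "LF": {"Intercept": -19.3796, "OF_RNG": 3.16278},
    "RF": {"Intercept": -21.5618, "OF_RNG": 3.48739},
}


def predict_zr_yr(pos, ratings):
    """Predicted ZR per 1200 IP for a player listed at `pos` with the given ratings."""
    model = ZR_MODELS[pos]
    total = model["Intercept"]
    for name, coef in model.items():
        if name != "Intercept":
            total += coef * ratings[name]
    return total


# Hypothetical shortstops: same arm, one point apart in range.
good_ss = predict_zr_yr("SS", {"IF_RNG": 9, "IF_ARM": 9})
elite_ss = predict_zr_yr("SS", {"IF_RNG": 10, "IF_ARM": 9})

print(f"SS 9 range / 9 arm:  {good_ss:.2f}")
print(f"SS 10 range / 9 arm: {elite_ss:.2f}")
print(f"value of one range point at SS: {elite_ss - good_ss:.2f}")
```

As a sanity check, feeding in the average listed SS from the first table (range 8.90, arm 9.09) gives roughly 2.2, which is close to the 2.194 average zr_yr reported for SS.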

Unsurprisingly, the appropriate RNG rating showed a very significant relationship to the ZR result for every position, with r-squared values around 50% or higher except for 1B and C. Arm also showed up as significant for 3B and SS (but not RF?). I think there are two reasons why more of the ratings did not show up:

1. One is a phenomenon known in multiple regression as multicollinearity. Good defensive players tend to be rated highly across the board, and poor defensive players tend to be rated low across the board. The model therefore picks one rating (usually the strongest predictor, in this case range) as the thing driving the result, even though they all might be affecting it somewhat.

2. We just don't have enough data yet to make firm determinations. Since we are doing regressions separately for each position, there are only about 50 to 80 players for each one. Once we have more seasons, we should be able to "fine-tune" the analysis better, and I expect more of the ratings to come into play then. However, based on these results, we are still likely to see range "dominating" the other ratings in the predictions. After all, range alone did better than DEF alone in most cases.

One way to look at which positions are most important is to "add together" the coefficients for the individual ratings at each position (for example, for SS this would be 8.251 + 1.946 = 10.196).

When we do this, we get the following ranking of importance, defensively:

1. SS (10.20)
2. CF (5.92)
3. 2B (5.02)
4. 3B (4.27) --- note that at 3B, arm is almost as important as range
5. RF (3.49)
6. LF (3.16)
7. 1B (2.63)
8. C (1.05)
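The ranking above can be reproduced directly from the PARMS rows of the regression table. This is just a sketch of the "add the coefficients" step; the dictionary name `SLOPES` is mine, and summing slopes is a rough heuristic for importance, not a formal measure.

```python
# Slope coefficients per position, copied from the PARMS rows
# (range first, then arm where the stepwise procedure selected it).
SLOPES = {
    "1B": [2.62842],
    "2B": [5.02178],
    "3B": [2.27430, 1.99556],
    "SS": [8.25061, 1.94552],
    "C": [1.05392],
    "CF": [5.92180],
    "LF": [3.16278],
    "RF": [3.48739],
}

# "Importance" here is simply the sum of the selected slopes for each position.
importance = {pos: sum(coefs) for pos, coefs in SLOPES.items()}
ranking = sorted(importance, key=importance.get, reverse=True)

for rank, pos in enumerate(ranking, start=1):
    print(f"{rank}. {pos} ({importance[pos]:.2f})")

# Two ways of comparing SS with CF: all selected ratings, or range alone.
ss_vs_cf_total = importance["SS"] / importance["CF"]
ss_vs_cf_range = SLOPES["SS"][0] / SLOPES["CF"][0]
print(f"SS vs CF, summed slopes: {ss_vs_cf_total:.2f}x")
print(f"SS vs CF, range only:    {ss_vs_cf_range:.2f}x")
```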

The order seems reasonable, but some of the numbers are kind of surprising: for example, that SS shows up with about 72% more importance than CF (about 40% more on range alone, 8.25 versus 5.92), with 3B close to 2B in importance and 1B not far removed from the corner OF positions. Perhaps we have a lot of GB pitchers in the league, and that is inflating the importance of the IF relative to the OF? It will be interesting to see if this holds up across multiple seasons.

It is not surprising that the catcher position comes in last here, by a substantial margin. But the C position might be short-changed here. The catcher's effectiveness at throwing out base stealers is probably not considered in ZR, nor is any effect on Catcher ERA (CERA). So I looked at these separately for catchers, and here are the results (weighted by IP, of course):
_DEPVAR_  _TYPE_  _RSQ_    _EDF_  Intercept  C_ARM         C_ABI
RTO_      PARMS   0.00000  69     30.5692    .             .
RTO_      STDERR  .        .       0.8778    .             .
SBA_ip    PARMS   0.08668  68      0.1359    -0.004418136  .
SBA_ip    STDERR  .        .       0.0123    0.001739179   .
CERA      PARMS   0.12112  68      5.9486    .             -0.17023
CERA      STDERR  .        .       0.4184    .             0.05561
Here, RTO_ is the percentage of steal attempts on which the runner is thrown out. It shows no effect of C_ABI or C_ARM; the intercept gives the average, about 30.57%. SBA_ip is the number of stolen bases allowed per inning played by the catcher. Here we find fewer of them the higher the catcher's arm rating is, which is not surprising: runners are scared off from attempting to steal by the good arm.
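To put the SBA_ip slope in more familiar units, here is a rough sketch that scales the fitted line to a 1200-inning season. The coefficients come from the SBA_ip PARMS row; the function name and the two example arm ratings are hypothetical, and with an r-squared under 0.09 the line explains very little of the catcher-to-catcher variation.

```python
# PARMS row for SBA_ip from the catcher regression table.
SBA_INTERCEPT = 0.1359
SBA_C_ARM_SLOPE = -0.004418136
SEASON_IP = 1200.0  # same full-season innings figure used for zr_yr


def predicted_sba_per_season(c_arm):
    """Stolen bases allowed over 1200 innings, per the fitted SBA_ip line."""
    return (SBA_INTERCEPT + SBA_C_ARM_SLOPE * c_arm) * SEASON_IP


# Hypothetical catchers with a weak arm (3) and a strong arm (9).
weak_arm = predicted_sba_per_season(3)
strong_arm = predicted_sba_per_season(9)
fewer_steals = weak_arm - strong_arm

print(f"arm 3: {weak_arm:.1f} SB allowed per 1200 IP")
print(f"arm 9: {strong_arm:.1f} SB allowed per 1200 IP")
print(f"difference: {fewer_steals:.1f}")
```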

The thing is, there is actually NO evidence here that a higher catcher arm helps in any way. In fact, it might be hurting. If you look at analyses of the run benefits and costs (see RonCo's previous posts on this), a caught stealing is, on average, 2.5 to 3 times as valuable to the defense as a stolen base is to the offense. At 2.5 times as valuable, a steal attempt would have to succeed more than 71.4% of the time (2.5/3.5) to benefit the offense; at 3 times, more than 75% (3/4). So the observed 1 - 0.306 = 69.4% does not cut it: on average, runners are helping the defense when they run, and a strong arm that scares them off takes those attempts away.
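The break-even arithmetic can be written out as a short sketch. If a caught stealing costs the offense k times what a stolen base gains, the attempt breaks even when p × 1 = (1 − p) × k, i.e. p = k / (1 + k). The function names are mine; the 2.5 and 3 ratios and the 30.5692% thrown-out rate are the figures quoted in this post.

```python
def break_even_success_rate(cs_cost_ratio):
    """Steal success rate at which an attempt has zero expected value.

    cs_cost_ratio is how many stolen bases one caught stealing is worth.
    """
    return cs_cost_ratio / (1.0 + cs_cost_ratio)


def value_per_attempt(success_rate, cs_cost_ratio):
    """Expected gain per attempt, in units of one stolen base."""
    return success_rate - (1.0 - success_rate) * cs_cost_ratio


# League-wide success rate implied by the RTO_ intercept (30.5692% thrown out).
observed_success = 1.0 - 0.305692

for ratio in (2.5, 3.0):
    needed = break_even_success_rate(ratio)
    gain = value_per_attempt(observed_success, ratio)
    print(f"ratio {ratio}: need {needed:.1%}, observed {observed_success:.1%}, "
          f"value per attempt {gain:+.3f} SB")
```

At either ratio the value per attempt comes out negative for the offense, which is the sense in which scaring runners off could be costing the defense.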

We do see a significant effect of Catcher Ability on CERA here, but because CERA is also strongly affected by the pitchers on the team, it is hard to know "how effective" it is here. I have recently done some defensive simulations using beta OOTP 22 to see how important catcher ability, by itself, is in affecting runs scored by the opposition, and I found that, in an "MLB"-type environment:

1. below a rating of about 4 for C Ability (in this case, 70 on the 250-point internal scale), Catcher Ability is by far the MOST important defensive rating for any player.

2. At a rating of 5 or higher, it is still only slightly less important than range for SS or CF.

However, I do not know if those analyses would hold in our environment, and there is no data that I have been able to acquire from the season ratings that could indicate walk or strikeout rates for individual catchers. At least for my own future decisions as a GM, I plan to treat catcher ability as perhaps as important as 2B defense, or at least 3B defense, and I will avoid catchers with C_ABI < 5 like the plague.

Post by Dington » Sun Mar 21, 2021 9:38 pm

Good work again.