We are back with another positional writeup of draft prospects, this time focusing on the linebackers. We previously dove into the quarterbacks, running backs, wide receivers, tight ends, offensive linemen, interior defensive linemen and edge defenders.
Thanks to math and feature engineering, we can use natural language processing to compare prospects to their contemporaries and to those from the past, then tie in the advanced descriptive stats we have built previously to gauge how well prospects who fit a certain mold have performed in the NFL.
For this analysis, we took prospect write-ups from The Athletic's Dane Brugler, one of the best football film analysts out there, over the past eight seasons (including 2022) and used latent semantic analysis (LSA) to derive similarity scores between the text in prospects’ scouting reports.
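For readers curious what an LSA similarity score looks like in practice, here is a minimal sketch. It assumes a common recipe (TF-IDF term weights, a truncated SVD, then cosine similarity between the reduced document vectors); the exact preprocessing behind the scores in this article is not specified, and the three short reports, the `lsa_similarity` function and the `n_topics` setting are all hypothetical stand-ins.

```python
import math
import re
from collections import Counter

import numpy as np

# Hypothetical stand-in reports; real inputs would be full scouting write-ups.
reports = {
    "Prospect A": "Instinctive linebacker with sideline range and reliable tackling in space",
    "Prospect B": "Rangy linebacker with instinctive reads and reliable open-field tackling",
    "Prospect C": "Thumping downhill run defender who struggles in coverage",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def lsa_similarity(docs, n_topics=2):
    """Return a document-by-document cosine similarity matrix in LSA space."""
    tokens = [tokenize(doc) for doc in docs]
    vocab = sorted({term for doc in tokens for term in doc})
    column = {term: i for i, term in enumerate(vocab)}
    n_docs = len(docs)

    # Document frequency: in how many reports each term appears.
    doc_freq = Counter(term for doc in tokens for term in set(doc))

    # TF-IDF matrix: one row per report, one column per term.
    tfidf = np.zeros((n_docs, len(vocab)))
    for row, doc in enumerate(tokens):
        for term, count in Counter(doc).items():
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
            tfidf[row, column[term]] = count * idf

    # Truncated SVD: keep only the strongest latent "topics".
    u, s, _ = np.linalg.svd(tfidf, full_matrices=False)
    k = min(n_topics, len(s))
    reduced = u[:, :k] * s[:k]

    # Cosine similarity between the reduced report vectors.
    norms = np.linalg.norm(reduced, axis=1, keepdims=True)
    reduced = reduced / np.where(norms == 0, 1.0, norms)
    return reduced @ reduced.T


names = list(reports)
sim = lsa_similarity(list(reports.values()))
for i, name in enumerate(names):
    print(name, np.round(sim[i], 2))
```

In this toy data, Prospects A and B share most of their descriptive vocabulary, so they score as more similar to each other than either does to Prospect C. Scores can also come out negative, which is why a comparison can subtract from a prospect's overall score.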
After building our dataset to span eight seasons, we can create a prospect's score in a number of ways. We decided to use a weighted average of similar players’ WAR (wins above replacement), using the similarity score derived above as the weights. For example, if a player has a 0.60 similarity score with a player who has earned 7.0 WAR since being drafted and a -0.30 similarity score with someone who has earned 4.0 WAR, his overall score would be (0.60 × 7.0) + (-0.30 × 4.0) = 4.2 - 1.2 = +3.0.
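The arithmetic in that example can be sketched in a few lines. Note that the worked numbers correspond to summing similarity × WAR across comparisons without dividing by the total weight, so that is what this sketch does; the function name and input format are assumptions for illustration, not the model's actual code.

```python
def similarity_weighted_score(comps):
    """Sum similarity * WAR over (similarity, war) pairs.

    Assumption: no normalization by the total weight, matching the
    worked example in the text.
    """
    return sum(similarity * war for similarity, war in comps)


# The two comparisons from the example above: (similarity score, career WAR).
example_comps = [(0.60, 7.0), (-0.30, 4.0)]
score = similarity_weighted_score(example_comps)
print(round(score, 2))  # 3.0
```

A negative similarity score flips the sign of that player's contribution, so resembling the opposite of a productive player pulls a prospect's score down.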
Using the analyses above, we can look at 2022 prospects in a couple of ways. First, we can examine player comparisons for notable prospects. Second, we can rank the players in each position group by the score derived above. These scores have correlated well with draft position and future WAR generated at the NFL level, although a more robust analysis using additional seasons and data sources is beyond the scope of this article.
Let’s start by looking at the text comparisons for the most successful NFL linebackers so that we can then see what they mean for prospects in the 2022 class.