
10 for 10: #9 Gamelogs, Box Scores and Splits Now for 1920 to Yesterday

Posted by Sean Forman on July 13, 2010

This post is the ninth in our series of ten new features for our tenth anniversary.

Thanks to the heroes at Retrosheet, we now have access to box scores and gamelogs for every major league game from 1920-2009, and play-by-play going back to 1950. That is ninety years of major league history and more than 150,000 games. It includes over 80% of the batting seasons in major league history (back to 1871), 74% of the team seasons, and full careers for 72% of all major league players.

This latest update fills the doughnut hole we had for 1940-1951, so we now have Ted Williams' 84-game on-base streak in 1949 and Joe DiMaggio's 56-game hitting streak in 1941.

The Play Index now contains all of these seasons as well. To brag a little about the database we now have on the site: our play-by-play database (1950-2010) has over 9m rows with 200 columns each, or just over 1.8 billion cells of pbp data. The batting gamelog table from 1920-2010 has over 3.7m rows, and you can search them all. The batting splits table has over 6.1m different yearly splits calculated. There is a lot of data on our server.

Thanks again to the people at Retrosheet (led by Dave Smith) for this tremendous body of work. It really takes your breath away to see what these folks have accomplished. When I heard of their project ten or so years ago, it struck me as the Crazy Horse Monument of baseball statistics, but the big difference is that they're now 80% done with it.

For a complete listing of what our data now covers, see our Data Coverage Page. It runs down the years we have pitch data, hit location and type data, and the extent to which we are missing play-by-play from 1950-1973.

The last of our 10 for 10 will launch next week, Murphy willing.

8 Responses to “10 for 10: #9 Gamelogs, Box Scores and Splits Now for 1920 to Yesterday”

  1. JDV Says:

    Just as a fan and hobbyist, I'm amazed at what the folks at baseball-reference.com and Retrosheet (and there are others, including SABR) have enabled me to have ready access to. Thanks to all of you for not only making it easier to satisfy my many curiosities, but also for feeding my appetite for curiosities that I had never previously considered.

  2. nightfly Says:

    Astounding. Great googly moogly, Sean, I hope you saved the receipt! AND made about fourteen backups of your backups.

  3. Ted Says:

    @1 - My words exactly! As a "regular" guy who has been a fan since my boyhood in the late '70s, I give my thanks to Sean for this site, and all the others at SABR, Retrosheet, etc. who have worked so hard to make this happen. I remember reading the original 1969 Encyclopedia at my grandparents' house- they were big baseball fans and were greatly responsible for me becoming one as well. Who ever would have dreamed that someday all this info could possibly be available like this? Again, thank you so much for your efforts, Sean, and to everyone else who has contributed to this fine effort!

  4. JeffW Says:

    Great! Thanks to everyone involved.

  5. mikeyjax Says:

    Thanks for redefining the word WOW for another time - I'm now up to infinity plus one in the number of times I've been blown away!

    Someone had asked about pics of the more modern guys, and copyright issues and the like were given as the reason there aren't any.... the BR bullpen has baseball card shots of a lot of them... couldn't their pictures be brought over from there?

    Thanks again Sean

  6. Chuck Hildebrandt Says:

    Good god, Sean, when will you stop amazing us?

    Crossing my fingers that #10 is either ability to search splits in PI, or a mobile version of the site.

  7. Mike Sandler Says:

    This is fantastic!! I remember in 1963 sending $1.00 plus 5 Philly Cigar wrappers (I have no idea where I got them from) for a copy of the Baseball Encyclopedia (the Turkin-Thompson). I spent many hours using it, even though there were only 4 or 5 statistical categories for each player. If I remember correctly, for batters it was games, runs, hits and batting average. For pitchers, games, wins, losses & winning percentage. Not even RBIs or ERA. It's unbelievable how far we've come. Thanks Sean!!

  8. Matt Vandermast Says:

    This is great! Thanks to Sean and to everyone else involved.