January 11, 2018

Meta-cognition and data analysis

Learning how to perform various parts of a data analysis project - import, cleaning, visualizing, modeling - is hard. Learning how to bring those parts together in your day-to-day work is even more challenging. It’s hard because most analysis work is performed in isolation. We often see the results of great analysis work in blogs, articles, or presentations, sometimes even with source code, but seeing a final analysis product and its source code can obscure a lot of the technique and thinking that went into it. It’s like seeing a beautiful cake with a recipe card next to it - seeing the final product and the instructions to produce it doesn’t mean you’ll be able to replicate it.

Understanding the meta-cognition that occurs during data analysis work might be the most opaque part of an analyst’s job. Allowing more people to see and understand the thinking that goes into crafting an analysis will in turn help more people to execute their own analyses.

Hadley Wickham provided a great example of this by posting a video showing the “whole game” of a data analysis using R, RStudio, and R Markdown. We get to see how Hadley organizes his R Markdown document and the types of comments and questions he types outside of the code chunks. We also get to see him make a few typos and deal with some error messages. Most importantly, he narrates his thought process.

There’s a lot to learn from watching this video, particularly for those new to R and/or data analysis work. It helps clarify the process of executing an analysis project and brings transparency to the thinking that occurs during that work. The more we can highlight the meta-cognition of data science work, the more accessible the field will be to new entrants.

January 9, 2018

The Marble Game in R

The college football playoff selection process is broken.

The current College Football Playoff (CFP) was created to address the shortcomings of the Bowl Championship Series (BCS). Under the BCS system, a combination of polls and computer rankings was used to select two teams to play for the national championship. It seemed like a good idea, but the controversial results of the BCS system are well documented. One of the most obvious flaws: several teams finished with an undefeated season and did not have a chance to play for a championship.

In the hope of mitigating BCS-like controversy, the CFP relies on a selection committee to create a four-team playoff each year to determine the national champion. The committee’s ability to apply more subjective criteria was supposed to help, but the past two seasons (2016-2017 and 2017-2018) still produced a decent amount of controversy over which teams were selected. In a throwback to the BCS days, the 2017-2018 college football season ended with an undefeated team (Central Florida) that was not allowed to compete in the CFP.

Something needs to change

In response to the flaws of the CFP system, David Burge proposed the “Marble Game,” an easy-to-comprehend alternative for determining which NCAA football teams would be invited to the four-team playoff. The motivation behind the Marble Game is clear:

The rules of the Marble Game are simple:

  • Each team starts with a pre-determined number of marbles.
  • If a team wins a home game, they take 20% of the loser’s marbles.
  • If a team wins a road game, they take 25% of the loser’s marbles.
  • Neutral site games are treated as home wins for the victor (20% marble transfer).
  • At the end of the season, the four teams with the most marbles are invited to participate in the CFP.
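
To make the rules concrete, here is a minimal sketch of the marble transfer in Python. It is not the marblr code; the function names, the five-team season, and the uniform 100-marble start are all made up for illustration, and it assumes fractional marbles are kept as-is (the rules above don’t say how to round).

```python
HOME_WIN_SHARE = 0.20  # home wins; neutral-site wins are treated the same way
ROAD_WIN_SHARE = 0.25  # road wins


def play_game(marbles, winner, loser, site):
    """Move marbles from loser to winner.

    `site` is the venue from the winner's point of view:
    "home", "road", or "neutral".
    """
    if site not in ("home", "road", "neutral"):
        raise ValueError(f"unknown site: {site!r}")
    share = ROAD_WIN_SHARE if site == "road" else HOME_WIN_SHARE
    transfer = marbles[loser] * share  # assumption: no rounding
    marbles[loser] -= transfer
    marbles[winner] += transfer


def playoff_field(marbles, size=4):
    """Return the `size` teams holding the most marbles, best first."""
    return sorted(marbles, key=marbles.get, reverse=True)[:size]


# Hypothetical five-team season: (winner, loser, site for the winner).
marbles = {team: 100.0 for team in ["A", "B", "C", "D", "E"]}
games = [
    ("A", "B", "home"),
    ("C", "A", "road"),
    ("D", "E", "neutral"),
    ("B", "E", "home"),
]
for winner, loser, site in games:
    play_game(marbles, winner, loser, site)

print(playoff_field(marbles))  # ['C', 'D', 'B', 'A']
```

Marbles only ever move between teams, so the total in circulation stays constant all season. Beating a team that has already piled up marbles is worth more than beating a team that has been losing.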

Using these rules, I created an R package, marblr, to simulate the results of the Marble Game. The marblr package uses game data scraped from Massey Ratings to simulate the Marble Game using Burge’s rules. Additionally, the marblr package allows for simulations that grant extra weight to teams from the Power Five conferences (ACC, Big 12, Big Ten, Pac-12, SEC, plus Notre Dame). As a default, teams from the Power Five start with 120 marbles, while all other teams start with 100.
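
The weighted starting allocation can be sketched the same way. This is a Python illustration rather than marblr’s actual interface: the function name, its parameters, and the team-to-conference mapping are hypothetical, and only the 120/100 defaults come from the description above.

```python
POWER_FIVE = {"ACC", "Big 12", "Big Ten", "Pac-12", "SEC"}


def starting_marbles(teams, power_five_start=120, other_start=100):
    """Assign each team its opening marble count.

    `teams` maps team name -> conference name. Notre Dame is an
    independent, so it is matched by name rather than by conference.
    """
    return {
        team: power_five_start
        if conference in POWER_FIVE or team == "Notre Dame"
        else other_start
        for team, conference in teams.items()
    }


# Hypothetical input: a handful of teams and their conferences.
teams = {
    "Clemson": "ACC",
    "UCF": "AAC",
    "Notre Dame": "Independent",
    "Army": "Independent",
}
print(starting_marbles(teams))
```

Setting both arguments to the same value gives the unweighted version of the game.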

Simulating the 2017-2018 Marble Game

The plot above illustrates the outcome of the Marble Game for each NCAA Division I FBS football team [1]. The first ten weeks of the season are very noisy - it’s hard to discern a clear top tier of teams - but a distinct cohort of competitive teams begins to separate from the pack between weeks 10 and 15. At the end of week 15, conference championships have been played and final bowl and CFP selections are made.

We know that Alabama, Georgia, Clemson, and Oklahoma were selected to play in the CFP for the 2017-2018 season, but which teams would have been invited to the CFP via the Marble Game?

Table 1: Marble Rankings after Week 15

Rank  Team       Marbles
1     Clemson    425
2     Ohio St    405
3     Oklahoma   405
4     UCF        385
5     Georgia    378
6     USC        353
7     Alabama    327
8     Wisconsin  310
9     Auburn     274
10    Miami FL   256

According to the Marble Game, Ohio State and Central Florida (the lone undefeated team) would have received CFP invites over Georgia and Alabama - teams that both won their first-round CFP games and made the national championship game in 2017-2018. At the same time, both Ohio State and Central Florida won their bowl games. It’s hard to make a definitive case for the groups of four produced by the CFP or the Marble Game.

Ideally, the Marble Game would be used to determine an 8-team playoff. Under this system, each Power Five conference would have a strong chance of sending its champion to the playoff, and undefeated teams from non-Power Five conferences would have a chance too. Conferences with multiple championship-caliber teams could send more than one team. Such a playoff would have created the following first-round games in 2017-2018:

  • Clemson (1 seed, ACC) vs. Wisconsin (8 seed, Big Ten)
  • Ohio State (2 seed, Big Ten) vs. Alabama (7 seed, SEC)
  • Oklahoma (3 seed, Big 12) vs. USC (6 seed, Pac-12)
  • UCF (4 seed, AAC) vs. Georgia (5 seed, SEC)
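
The pairings above follow the usual bracket convention of matching the top seed with the bottom seed, the second seed with the second-to-last, and so on. A small Python sketch using the Table 1 marble counts shows the mechanics. The function name is made up, and because the Marble Game rules don’t specify a tiebreaker, the Ohio State/Oklahoma tie at 405 is resolved by keeping the order in which the teams are listed.

```python
# Top eight rows of Table 1: (team, marbles after week 15).
table_1 = [
    ("Clemson", 425),
    ("Ohio St", 405),
    ("Oklahoma", 405),
    ("UCF", 385),
    ("Georgia", 378),
    ("USC", 353),
    ("Alabama", 327),
    ("Wisconsin", 310),
]


def first_round(rankings, size=8):
    """Pair seed 1 with seed `size`, seed 2 with seed `size - 1`, etc."""
    # sorted() is stable, even with reverse=True, so tied teams keep
    # their listed order (an assumed tiebreaker).
    seeds = sorted(rankings, key=lambda row: row[1], reverse=True)[:size]
    return [(seeds[i][0], seeds[size - 1 - i][0]) for i in range(size // 2)]


for higher, lower in first_round(table_1):
    print(f"{higher} vs. {lower}")
```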

This bracket would resolve the controversy following this year’s selections. Each major conference is represented, stronger conferences (Big Ten and SEC) have multiple contenders, and a non-Power Five team (Central Florida) earns a spot based on the marbles it accumulated through an undefeated regular season.

I’d like to see the first round of the tournament hosted at the higher seed’s home field, with the semifinal and final rounds following the same process the CFP currently uses [2].

Moving forward

College football deserves a better process for determining a national champion. There are simply too many seasons with more than four teams with a solid case to compete in a playoff. While the calendar and physical constraints prevent the development of a March Madness-style bracket in college football, expanding the CFP to an 8-team format could mitigate most of the controversy generated under past systems.

The CFP could keep its selection committee, but the Marble Game is an attractive substitute. It places more weight on winning late-season games - games that are either conference championships or rivalry games. It would also discourage programs from scheduling “cupcake” games; scheduling strong opponents would translate to better opportunities to increase a team’s marble count. Most importantly, the criteria for advancing to the playoff would be transparent, consistent, and fair to all teams.

It’ll never happen, but it’s a fun idea to consider.

  1. The only games that count for the purposes of the Marble Game are games between FBS teams - Alabama’s 56-0 drubbing of Mercer provides no useful information about Alabama’s potential to contend for a national title.

  2. The current semifinal games rotate between the following bowl game pairings each year: Rose Bowl and Sugar Bowl; Orange Bowl and Cotton Bowl; and the Fiesta Bowl and Peach Bowl. The site of the championship game is determined through a bidding process.

January 6, 2018

Moving to Blogdown

A few months ago, RStudio announced the release of blogdown, a package that allows you to generate and maintain a Hugo blog using R and RStudio. I’ve hosted my blog on GitHub Pages using Jekyll for a few years, but the opportunity to make my blog more R-friendly was really appealing. I downloaded the package and started tinkering with the stock template.

Since I wrote all of my blog posts in markdown, it was easy to add my posts to the stock Hugo theme in blogdown and immediately see my content rendered via Hugo. It would have been easy for me to flip the switch and get a new version of my blog up and running on GitHub Pages, but I wasn’t fully satisfied with any of the available themes. I know enough HTML and CSS to be dangerous, but not enough to create a working custom Hugo theme in an afternoon.

Every few weeks since the initial announcement of blogdown, I tinkered around with customizing a Hugo theme for my blog, but didn’t really commit the time to get it done until the week between Christmas and New Year. Starting with blogdown creator Yihui’s version of the Lithium theme and using Jason Becker’s json.blog theme as a guide, I took some of the concepts from my old blog and created my own twist on a Hugo theme. Yihui’s blogdown book helped me to really understand the underpinnings of how blogdown works and how to get my blog hosted on Netlify.

The ability to host my blog on Netlify was a big selling point for me. Not only did it allow my blog to use HTTPS (a benefit unavailable to users of GitHub Pages), it also allowed me to simply push source files to GitHub and have the site rebuild and update automatically. I loved this feature with GitHub Pages and Jekyll, and it was a must-have for me to make the switch to blogdown and Hugo. This feature allows me to create and publish a post from my iPhone or one of my iPads (#multipadlifestyle) - no laptop required. In fact, this post started on my iPhone via Drafts, then moved to Ulysses and was published via Workflow and Working Copy on my iPad Pro. A good deal of my writing will be done in R Markdown moving forward, but I love having the ability to publish with an entirely iOS workflow.

The blogdown version of my blog is up and running, and I’m really happy with the results. If you’re an R user and want to either start a blog or are thinking about a move from Jekyll to Hugo, I’d highly recommend using blogdown.

November 20, 2017

Management Quality in Public Education

What impact does a high-quality superintendent have on student performance? A new NBER paper by Victor Lavy and Adi Boiko investigates this question using data from schools in Israel, finding a significant effect on student achievement (0.04 SD) for top-quality superintendents. From the abstract:

We exploit a quasi-random matching of superintendent and schools, and estimate that superintendent value added has positive and significant effects on primary and middle school students’ test scores in math, Hebrew, and English. One standard deviation improvement in superintendent value added increases test scores by about 0.04 of a standard deviation in the test score distribution. The effect doesn’t vary with students’ socio-economic background, is highly non-linear, increases sharply for superintendents in the highest-quartile of the value added distribution, and is larger for female superintendents.

How did top-quality superintendents achieve these results? By improving the focus and clarity of school priorities and procedures, with an emphasis on improving school climate. The authors note the similarity of this management approach to Fryer’s 2014 study of Houston schools adopting a charter-style “no excuses” approach.

The results of this study, along with Fryer’s 2014 study, should inform the training and practice of superintendents in the United States. While successful superintendents will also likely need to develop and demonstrate competency in instructional leadership, finance, HR, and legal matters, the value-add of strength in these domains isn’t as clear. Superintendents have the most impact on student outcomes when they help bring focus and clarity to schools while also emphasizing strong school culture. The leaders and programs that embrace this approach are likely to deliver better outcomes for their students in the long run.

November 14, 2017

Measuring Success in Education

Do low-stakes international assessments of student performance accurately describe student ability? According to this recent NBER working paper, a country’s results may depend on its students’ intrinsic motivation to do well.

The authors used an experimental approach that offered treatment students a small financial incentive to do well on a test. Students in the treatment group were told just before the test that an envelope with the equivalent of $25 (US) was theirs, but that $1 would be subtracted for each incorrect answer. The researchers conducted the experiment in three high schools in China and two in the US. They found no impact of the financial incentive in China, but the effect size in the US was 0.20-0.23 standard deviations.

The authors conclude that the massive difference in treatment effect between countries indicates that “success on low stakes assessments does not solely reflect differences in ability between students across countries” and that “low-stakes tests do not measure and compare ability in isolation.”

With this in mind, researchers, policymakers, and the public should be a little more cautious when interpreting the results of low-stakes assessments like PISA. Student motivation clearly plays a significant role in a country’s results - a topic that deserves further investigation. For example, NCES will release new NAEP results in early 2018. Does student motivation on NAEP vary significantly across state lines? If so, how much (if anything) does it contribute to differences in state performance?

Low-stakes assessments can be informative, but they may be measuring more than student ability. This study is a reminder to policymakers and advocates to exercise caution when comparing performance on these tests across jurisdictions.