Learning how to perform the various parts of a data analysis project - importing, cleaning, visualizing, modeling - is hard. Learning how to bring those parts together in your day-to-day work is even more challenging. It’s hard because most analysis work is performed in isolation. We often see the results of great analysis work in blogs, articles, or presentations, sometimes even with source code, but seeing a final analysis product and its source code can obscure much of the technique and thinking that went into it. It’s like seeing a beautiful cake with a recipe card next to it: having the final product and the instructions to produce it doesn’t mean you’ll be able to replicate it.
The meta-cognition that occurs during data analysis work might be the most opaque part of an analyst’s job. Allowing more people to see and understand the thinking that goes into crafting an analysis will in turn help more people execute their own analyses.
Hadley Wickham provided a great example of this by posting a video showing the “whole game” of a data analysis using R, RStudio, and R Markdown. We get to see how Hadley organizes his R Markdown document and the types of comments and questions he types outside of the code chunks. We also get to see him make a few typos and deal with some error messages. Most importantly, he narrates his thought process.
There’s a lot to learn from watching this video, particularly for those new to R and/or data analysis work. It helps clarify the process of executing an analysis project and brings transparency to the thinking that occurs during that work. The more we can highlight the meta-cognition of data science work, the more accessible the field will be to new entrants.