Some data exploration

I was curious about whether the relative difficulty of the puzzles was uniform across users. In other words, are difficult puzzles difficult for everyone?

Specifically, I cared about my own experience: puzzle #3, for example, was fun but also maddening! To investigate this question, I used the length of my raw code as a proxy for difficulty:

## Source: local data frame [10 x 3]
## Groups: <by row>
## 
## # A tibble: 10 x 3
##    file_name                puzzle_no n_lines
##    <chr>                        <int>   <int>
##  1 01_long_number_doubles.R         1      26
##  2 02_checksum.R                    2      33
##  3 03_number_swirl.R                3     113
##  4 04_valid_passcode.R              4      54
##  5 05_escape.R                      5      71
##  6 06_memory_banks.R                6      59
##  7 07_discs_towers.R                7     166
##  8 08_instructions.R                8      50
##  9 09_garbage.R                     9      54
## 10 10_codes.R                      10      53
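A table like the one above could be built with a few lines of base R. This is only a minimal sketch, not the code actually used for this post: the helper name `count_puzzle_lines` is made up for illustration, it assumes the scripts follow the two-digit-prefix naming shown above, and it counts every line, including blanks and comments. The demo files are throwaway stand-ins written to a temporary folder.

```r
# Hypothetical helper: count the lines in each puzzle script in a folder.
# Assumes file names start with a two-digit puzzle number, e.g. "03_number_swirl.R".
count_puzzle_lines <- function(dir = ".") {
  files <- list.files(dir, pattern = "^[0-9]{2}_.*\\.R$", full.names = TRUE)
  data.frame(
    file_name = basename(files),
    # The puzzle number is the first two characters of the file name
    puzzle_no = as.integer(substr(basename(files), 1, 2)),
    # Raw line count: blank lines and comments are included
    n_lines = vapply(
      files,
      function(f) length(readLines(f, warn = FALSE)),
      integer(1),
      USE.NAMES = FALSE
    ),
    stringsAsFactors = FALSE
  )
}

# Demo on two throwaway scripts in a temporary folder
demo_dir <- tempfile("puzzles_")
dir.create(demo_dir)
writeLines(c("x <- 1", "y <- 2", "x + y"), file.path(demo_dir, "01_demo.R"))
writeLines("z <- 3", file.path(demo_dir, "02_demo.R"))

line_counts <- count_puzzle_lines(demo_dir)
print(line_counts)
```

Counting raw lines is crude, since a long comment inflates the number just as much as real logic does, but it is cheap and good enough for a rough ranking.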

Here’s a quick look at those line counts:

From this, it appears that the puzzles on days 1 and 2 were super simple, and days 3 and 7 were the toughies. Note: day 10 is only half finished, and it wasn’t because it was too easy…

Now let’s take a look at the Advent of Code leaderboard. This lists the first 100 people to get both stars (solve both halves of the puzzle) and the first 100 people to get the first star. These are the super speedsters, who apparently can whiz through these puzzles in minutes! We’ll use the RCurl package to help us grab the info:

##   day both_stars one_star
## 1   1   00:01:16 00:00:57
## 2   1   00:01:35 00:01:08
## 3   1   00:01:37 00:01:12
## 4   1   00:01:45 00:01:16
## 5   1   00:02:03 00:01:18
## 6   1   00:02:11 00:01:26

We can tidy up our data a bit using tidyr, dplyr, and lubridate, and then plot it using ggplot2 and ggridges.
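As a rough sketch of what that tidying step might look like, the snippet below reshapes the six leaderboard rows printed above into long format and converts the `hh:mm:ss` strings into seconds. The column names come from that output; everything else (the object names, the choice of `pivot_longer()`, and summarising by the median) is an assumption for illustration rather than the exact code behind this post.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# The six rows shown in the leaderboard output above
leaderboard <- data.frame(
  day        = rep(1L, 6),
  both_stars = c("00:01:16", "00:01:35", "00:01:37",
                 "00:01:45", "00:02:03", "00:02:11"),
  one_star   = c("00:00:57", "00:01:08", "00:01:12",
                 "00:01:16", "00:01:18", "00:01:26"),
  stringsAsFactors = FALSE
)

# One row per (day, star type, finisher), with the time as a number of seconds
times_long <- leaderboard %>%
  pivot_longer(c(both_stars, one_star),
               names_to = "stars", values_to = "time") %>%
  mutate(seconds = period_to_seconds(hms(time)))

# Median solve time per day and star type
time_summary <- times_long %>%
  group_by(day, stars) %>%
  summarise(median_seconds = median(seconds), .groups = "drop")

print(time_summary)
```

Long format with a numeric `seconds` column is convenient because both ridge plots and box plots expect one observation per row.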

According to these times, the whiz coders also found puzzles 3, 7, and 10 to be difficult. We can use a more traditional box plot to compare their times with my lines of code more directly.

And there you have it! How quickly the whizzes solved the difficult puzzles was extremely variable, but on average, the time it took them tracked the number of lines of code I wrote.

All puzzles are equal, but some puzzles are more equal than others.