One man’s complex is another man’s simple (AOC 2021 Day 5)

Yesterday I wrote about the virtues of a simple straightforward solution as opposed to a super "clever" one. Today reminded me that what seems simple to one person might be clever to another.

Having successfully survived bingo with a giant squid, Today's challenge had us navigating our sub so as to avoid dangerous parts of the ocean. We were given a bunch of lines represented by endpoints as input. If we graphed all the lines, some of them would overlap. The more the overlap, the more dangerous. Given all the points on the graph that are covered by a line, we had to determine how many points were covered by two or more lines.

For example, if we have the following 10x10 grid with 1,1 being the upper left adn 10,10 the lower right and the following lines:

1,2 --> 8,2
3,1 --> 3,7
2,4 --> 7,4
6,1 --> 6,8

we'd have the following lines:

..X..X....
XXXXXXXX..
..X..X....
.XXXXXX...
..X..X....
..X..X....
.....X....
.....X....
..........
..........

If instead of X we marked each square with the number of times it was drawn on we'd have

..1..1....
11211211..
..1..1....
.121121...
..1..1....
..1..1....
.....1....
.....1....
..........
..........

In the above example, the answer to part 1 would be 4 since 4 squares have more than one line on them

Here our lines are all vertical and horizontal. For part 1 we only had ton consider horizontal and vertical even though the input might specify diagonals.

Part 2 required we also deal with diagonals but only the ones with a 45 degree angle.

For a relatively new programmer the obvious solution would be to create a 2D array. Then you just have to scan through the data and fill the array. Finally, go over the array and count the cells that were hit more than once.

This approach presents three problems. The first, which is minor is that you'd first have to scan the input to determine the required array size. The second problem is that you could have negative values in your input so you'd have to somehow compensate for that possibility. Finally, you could have spares input. What if there were only two lines but one was from -1000000,-1000000 –> -999999,999999 and the other from 1000000,1000000 –> 1000001,1000001. You have two tiny lines but you'd need a HUGE array.

For me, an easier, simpler solution was to use a dictionary (also called hash table, or map depending on language). My grid would start as an empty map and then I'd add points by walking the line segments. For example if I had a line from 0,0 –> 2,0, I'd add three entries to the map. The keys would be (0,0), (1,0), and (2,0), and the values would be all 1 since each location was hit once.

If I then added the line (1,0) –> (1,2), I'd end up with this final map:

  {(0,0) : 1,
   (1,0) : 2,
   (2,0) : 1,
   (1,1) : 1,
   (1,2) : 1}

Note that the (1,0) entry now has a two since it was hit twice.

Once we added all the lines, pull out the values and count how many are greater or equal to two.

Once armed with this approach it was easy enough to solve the problem. You can check out the code here.

I'll maintain that I used a simple, clear, and maintainable approach but it's also worth noting that if I presented this problem to relative beginners, depending on what tools they've used, they might very well lean heavily on an array representation. This would certainly be true when I was a beginner - we never saw maps or dictionaries until we built them in our data structures courses. Now, with students cutting their teeth in languages like Python where dictionaries come up much earlier, perhaps today's beginners would also opt for the map / dictionary solution.

In any event, I always like problems like this where there are multiple ways of representing the data in the solution which leads to some nice food for thought.

2021-12-05

Programming

#Advent of Code