Post

Problem 11

Differently from last posts this problem cannot be solved with an obvious sql expression. To start, we will cover a useful data ingestion technique and then we are going deep into window functions to solve the core problem.

Problem 11

Data Ingestion

The problem involves the following grid of numbers.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
08 02 22 97 38 15 00 40 00 75 04 05 07 78 52 12 50 77 91 08
49 49 99 40 17 81 18 57 60 87 17 40 98 43 69 48 04 56 62 00
81 49 31 73 55 79 14 29 93 71 40 67 53 88 30 03 49 13 36 65
52 70 95 23 04 60 11 42 69 24 68 56 01 32 56 71 37 02 36 91
22 31 16 71 51 67 63 89 41 92 36 54 22 40 40 28 66 33 13 80
24 47 32 60 99 03 45 02 44 75 33 53 78 36 84 20 35 17 12 50
32 98 81 28 64 23 67 10 26 38 40 67 59 54 70 66 18 38 64 70
67 26 20 68 02 62 12 20 95 63 94 39 63 08 40 91 66 49 94 21
24 55 58 05 66 73 99 26 97 17 78 78 96 83 14 88 34 89 63 72
21 36 23 09 75 00 76 44 20 45 35 14 00 61 33 97 34 31 33 95
78 17 53 28 22 75 31 67 15 94 03 80 04 62 16 14 09 53 56 92
16 39 05 42 96 35 31 47 55 58 88 24 00 17 54 24 36 29 85 57
86 56 00 48 35 71 89 07 05 44 44 37 44 60 21 58 51 54 17 58
19 80 81 68 05 94 47 69 28 73 92 13 86 52 17 77 04 89 55 40
04 52 08 83 97 35 99 16 07 97 57 32 16 26 26 79 33 27 98 66
88 36 68 87 57 62 20 72 03 46 33 67 46 55 12 32 63 93 53 69
04 42 16 73 38 25 39 11 24 94 72 18 08 46 29 32 40 62 76 36
20 69 36 41 72 30 23 88 34 62 99 69 82 67 59 85 74 04 36 16
20 73 35 29 78 31 90 01 74 31 49 71 48 86 81 16 23 57 05 54
01 70 54 71 83 51 54 69 16 92 33 48 61 43 52 01 89 19 67 48

There are different approaches to handle it inside postgres. We choose to represent this grid as a 3-column table containing: row number, column number, the respective number in the grid.

1
2
3
4
5
create table problem_11_data(
  row_number int,
  col_number int,
  n int
  )

After copying this grid into a file (grid.txt), we can easily import the rows to a temporaty table.

1
2
3
4
create temp table rows(
  row text
  )
\copy rows from 'grid.txt'

The first problem that we encounter when importing data from files is that, in theory, we loose the rows ordering. Here we are going close our eyes over this issue, hoping that the following works.

1
select row_number() over (), row from rows

row_number is a window function that indicates the order of the row inside the partition. The over clause optionally specifies the partition, ordering and filters. With an empty specification we mean to consider the whole set as a single partition. Since there is no order, we hope that postgres will use order in which the rows were ingested. In my machine, I have the following result

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
 row_number |                            row                             
------------+-------------------------------------------------------------
          1 | 08 02 22 97 38 15 00 40 00 75 04 05 07 78 52 12 50 77 91 08
          2 | 49 49 99 40 17 81 18 57 60 87 17 40 98 43 69 48 04 56 62 00
          3 | 81 49 31 73 55 79 14 29 93 71 40 67 53 88 30 03 49 13 36 65
          4 | 52 70 95 23 04 60 11 42 69 24 68 56 01 32 56 71 37 02 36 91
          5 | 22 31 16 71 51 67 63 89 41 92 36 54 22 40 40 28 66 33 13 80
          6 | 24 47 32 60 99 03 45 02 44 75 33 53 78 36 84 20 35 17 12 50
          7 | 32 98 81 28 64 23 67 10 26 38 40 67 59 54 70 66 18 38 64 70
          8 | 67 26 20 68 02 62 12 20 95 63 94 39 63 08 40 91 66 49 94 21
          9 | 24 55 58 05 66 73 99 26 97 17 78 78 96 83 14 88 34 89 63 72
         10 | 21 36 23 09 75 00 76 44 20 45 35 14 00 61 33 97 34 31 33 95
         11 | 78 17 53 28 22 75 31 67 15 94 03 80 04 62 16 14 09 53 56 92
         12 | 16 39 05 42 96 35 31 47 55 58 88 24 00 17 54 24 36 29 85 57
         13 | 86 56 00 48 35 71 89 07 05 44 44 37 44 60 21 58 51 54 17 58
         14 | 19 80 81 68 05 94 47 69 28 73 92 13 86 52 17 77 04 89 55 40
         15 | 04 52 08 83 97 35 99 16 07 97 57 32 16 26 26 79 33 27 98 66
         16 | 88 36 68 87 57 62 20 72 03 46 33 67 46 55 12 32 63 93 53 69
         17 | 04 42 16 73 38 25 39 11 24 94 72 18 08 46 29 32 40 62 76 36
         18 | 20 69 36 41 72 30 23 88 34 62 99 69 82 67 59 85 74 04 36 16
         19 | 20 73 35 29 78 31 90 01 74 31 49 71 48 86 81 16 23 57 05 54
         20 | 01 70 54 71 83 51 54 69 16 92 33 48 61 43 52 01 89 19 67 48

We can see that in this case, row_number reflects the original ordering. We are now close to our desired data format. We just need to split row string into a table using with ordinality clause to generate col_number.

1
2
3
4
5
with t1 as (
  select row_number() over (), row from rows
),   t2 as (
  select row_number, ordinality, n from t1, string_to_table(t1.row,' ')  with ordinality n
) insert into problem_11_data (select row_number, ordinality as col_number, n::int  from t2);

Core Problem

The problem asks us to traverse the grid in all directions, horizontally, vertically and diagonally to find out the maximum product of four consecutive numbers. Our approach is to use a custom aggregate function to compute the prod of consecutive numbers while using window definitions to define the different ways of traversing the grid. The custom aggregate/window function is called prod and its definition will show up in another post. What I want to emphasize in this post is the use of the window clause. The solution is composed of two parts in a big with statement. In the first part, we compute the product of 4 consecutive numbers using different orders, that are specified in the respective window clauses. Then, we concatenate all these results using many union clauses to finally pick the maximum product of four consecutive numbers in every direction.

1
2
3
4
5
6
7
8
9
10
11
12
13
with
  t             as (select row_number as row, col_number as col, n from problem_11_data),
  horizontal	as (select prod(n) over (partition by row		order by col 	rows 3 preceding) from t),
  vertical		as (select prod(n) over (partition by col		order by row 	rows 3 preceding) from t),
  pos_diag		as (select prod(n) over (partition by col+row	order by col 	rows 3 preceding) from t),
  neg_diag		as (select prod(n) over (partition by col-row	order by col	rows 3 preceding) from t)

  select max(prod) from 
        (select * from horizontal union
		 select * from vertical union
		 select * from pos_diag union
		 select * from neg_diag);

Once the data is ingested, the solution can be naturally expressed in sql. Certainly, the syntax for window definitions is not very concise. Besides, rows 3 preceding still seems very ackward to me.

This post is licensed under CC BY 4.0 by the author.