Range generation in Redshift
05 Jul 2019A little while ago, I’d written an article on Range generation in PostgreSQL, and whilst Redshift and PostgreSQL are similar database beasts, they are quite different in terms of their query capabilities due to their different strengths.
In today’s article, we’ll walk through a couple of different solutions to range generation.
generate_series
The generate_series is a pretty serious tool in terms of achieving exactly what this article is talking about. Unfortunately, there is no analog for this function within the Redshift database management system. Who knows; maybe one day.
Until then though, we need to create our own.
The query
There’s a bit going on in here.
The process starts with the digit
CTE. Is does nothing more than counts from 0, up to 9.
The next CTE seq
uses the digits
CTE, 4 times. Each time its treated as an increasing power of 10. a
handles the 10^0
, b
handles the 10^1
, etc. This catersian effect quickly multiplies our 0-9 range up to 0-9999. If more numbers are required, we simple add another CROSS JOIN
, and treat the next power of 10.
That’s it as far a number generation. It’s pretty easy.
I’ve seen other developers materialise this information into a physical table, and simply join to that table also.
Extending into time
Applying this into some date ranges now, we use the seq
CTE and a little date arithmatic. My example here just uses the numbers within seq
to enumerate dates. Adding LIMIT
to this query also prevents us from utilising all 10000
values produced by seq
.
And that it.