2021-04-22

Row Generation in SQL Server

While writing complex queries for SQL Server, I have often needed to generate a specific number of records in order to solve a problem in a set-based manner.  At times like this I reach for a trusty Row Generator functions (also referred to as a "numbers table").

Here is a simple implementation of a row generator as a table-valued function:

CREATE FUNCTION fRowGenerator(@rowCount int) RETURNS @output TABLE (rownumber int) AS BEGIN DECLARE @outputCount int = 0 WHILE ( @outputCount < @rowCount ) BEGIN INSERT INTO @output SELECT TOP ( @rowCount - @outputCount ) ROW_NUMBER() OVER ( ORDER BY object_id ) + @outputCount rownumber FROM sys.columns SET @outputCount = @outputCount + @@ROWCOUNT END RETURN END GO

[SQL Server 2005 and above compatible]


This uses a system table which is expected to have a reasonable number of records (around a thousand), and copies it into the output table in a loop.  I have used several different versions of this function in the past, including one which exponentially grows records in a table variable before outputting, but for simple cases this is all that is needed.


The output table uses a row "number" as the output, which implies that it is 1-based.  For many set-based situations this is appropriate (as opposed to the more common 0-based used for arrays), but it's easy enough to add arithmetic into the outer query if 0-based is needed.


As a table-valued function, this can be included into queries with a simple syntax, and it can also use CROSS APPLY to multiply a set of records based on a value in a column.  However, table valued functions don't work well with the query optimiser which is unable to assess the potential cost of the function, so should be used where the expected number of records to be generated is small.


I've used this type of query in the past to calculate Service Level Agreement (SLA) expiry for a set of issues.  The SLA is defined as a number of working hours, so the calendar days after the issue date need to be evaluated to determine which are working days.  With a row generator, one row per day can be generated for a period and this can be anti-joined to other data to remove weekends, bank holidays, company holidays, etc. leaving only the working days.


No comments:

Post a Comment