... until the collector arrives ...

This "blog" is really just a scratchpad of mine. There is not much of general interest here. Most of the content is scribbled down "live" as I discover things I want to remember. I rarely go back to correct mistakes in older entries. You have been warned :)

2014-08-14

Passing integer lists to SQL Server, Redux

Newer versions of SQL Server support array-valued query parameters, but not all database drivers support them (e.g. jTDS or, at time of writing, the Microsoft JDBC driver). It is a widely known pattern to pass lists of integers in a CSV string, and then to use a TSQL function to parse that string. This method is also known to be quite slow for large lists. Here is a TSQL function alternative that uses a binary representation of the integers, and runs about an order of magnitude faster than the CSV approach:

CREATE FUNCTION dbo.fnBinaryToIntTable(@idList VARBINARY(MAX))
RETURNS TABLE AS RETURN
WITH indices AS (SELECT 0 AS i UNION ALL SELECT i + 1 FROM indices)
SELECT TOP (LEN(@idList) / 4)
  i + 1 AS idx
, CAST(SUBSTRING(@idList, i*4 + 1, 4) AS int) AS n
FROM indices

GO

SELECT * FROM dbo.fnBinaryToIntTable(CAST(134 AS BINARY(4)) + CAST(463 AS BINARY(4)))
OPTION(MAXRECURSION 0)

Unfortunately, this function requires the calling query to specify OPTION(MAXRECURSION 0) for lists with more than 100 elements. This is a significant source of error as it not an obvious calling requirement. It is possible to bake the option into the query by using an intermediate table variable:

CREATE FUNCTION dbo.fnBinaryToIntTable(@idList VARBINARY(MAX))
RETURNS @result TABLE(idx INT, n INT)
AS BEGIN
  WITH indices AS (SELECT 0 AS i UNION ALL SELECT i + 1 FROM indices)
  INSERT INTO @result
  SELECT TOP (LEN(@idList) / 4)
    i + 1 AS idx
  , CAST(SUBSTRING(@idList, i*4 + 1, 4) AS int) AS n
  FROM indices
  OPTION(MAXRECURSION 0)
  RETURN
END

... but this performs only 2-3x better than the CSV solution. Trade-offs, trade-offs.

Blog Archive