## T-SQL: Sorting Numeric Values That Are Stored As Strings

January 26, 2019 Leave a comment

Following are some tricks that I have found useful for sorting numeric data that is stored as strings. Normally, string values of "2", "1", and "11" will sort as "1", "11", "2". The examples below will sort this sequence as "1", "2", "11" instead.

One caveat; I have not performed extensive performance testing or examined query plans to get an idea of the efficiency of these queries. Be sure to perform your own evaluations before using these in production environments.

For the first example, set up a sample table and some relatively clean test data. The data is "clean" because it is uncomplicated by leading zeros or spaces. For our sample table, there are two columns with numeric data stored as strings: Volume and Series.

CREATE TABLE #Book

(

BookID int identity(1,1) NOT NULL PRIMARY KEY,

Title nvarchar(200) NOT NULL DEFAULT(”),

Volume nvarchar(20) NOT NULL DEFAULT(”),

Series nvarchar(20) NOT NULL DEFAULT(”)

)INSERT #Book (Title, Volume, Series) VALUES (‘Journal of Sloths’, ’13’, ”)

INSERT #Book (Title, Volume, Series) VALUES (‘Journal of Sloths’, ‘1’, ‘3’)

INSERT #Book (Title, Volume, Series) VALUES (‘Journal of Sloths’, ‘1’, ‘new’)

INSERT #Book (Title, Volume, Series) VALUES (‘Journal of Sloths’, ‘5’, ”)

INSERT #Book (Title, Volume, Series) VALUES (‘Journal of Sloths’, ‘1’, ‘1’)

INSERT #Book (Title, Volume, Series) VALUES (‘Journal of Sloths’, ‘4’, ”)

If you know that you have clean data such as this, this simple query will sort the string values in the Volume and Series columns as numerics.

SELECT *

FROM #Book

ORDER BY

Title,

RIGHT(SPACE(20) + Volume, 20),

RIGHT(SPACE(20) + Series, 20)

As you can see, this was achieved by padding each column with its maximum length (20 characters) in spaces, and then sorting by the rightmost 20 characters of each value. Here are the results:

BookID Title Volume Series

5 Journal of Sloths 1 1

2 Journal of Sloths 1 3

3 Journal of Sloths 1 new

6 Journal of Sloths 4

4 Journal of Sloths 5

1 Journal of Sloths 13

For the next example, add some ‘dirty’ data to the Volume column. This new data is numeric values with leading zeros.

INSERT #Book (Title, Volume, Series) VALUES (‘Journal of Sloths’, ‘012’, ”)

INSERT #Book (Title, Volume, Series) VALUES (‘Journal of Sloths’, ’02’, ”)

The following query builds on the first example and handles the dirty data by removing the leading zeros. This is accomplished by converting the values to numeric and then back to nvarchar during the sort operation, like so:

SELECT *

FROM #Book

ORDER BY

Title,

RIGHT(SPACE(20) + CONVERT(nvarchar(20), CONVERT(int, Volume)), 20),

RIGHT(SPACE(20) + Series, 20)

The results:

BookID Title Volume Series

5 Journal of Sloths 1 1

2 Journal of Sloths 1 3

3 Journal of Sloths 1 new

8 Journal of Sloths 02

6 Journal of Sloths 4

4 Journal of Sloths 5

7 Journal of Sloths 012

1 Journal of Sloths 13

Now add some ‘dirty’ data to the Series column.

INSERT #Book (Title, Volume, Series) VALUES (‘Journal of Sloths’, ‘1’, ’02’)

The prevoius method of converting the Series column to a numeric value and then back to nvarchar is shown here.

SELECT *

FROM #Book

ORDER BY

Title,

RIGHT(SPACE(20) + CONVERT(nvarchar(20), CONVERT(int, Volume)), 20),

RIGHT(SPACE(20) + CONVERT(nvarchar(20), CONVERT(int, Series)), 20)

However, in this case, the result is the following error: "Conversion failed when converting the nvarchar value ‘new’ to data type int." This happens because the Series column includes a value of "new" in addition to all of the numeric values. The value "new" cannot be converted to a numeric, so the result is an error.

The solution is to add CASE statements to the ORDER BY clause so that conversions to numeric are only performed if the values are actually numeric. The value is not numeric it is sorted as-is. Here is the final query.

SELECT *

FROM #Book

ORDER BY

Title,

RIGHT(SPACE(20) +

CASE WHEN ISNUMERIC(Volume) = 1 THEN CONVERT(nvarchar(20),

CONVERT(int, Volume)) ELSE Volume END, 20),

RIGHT(SPACE(20) +

CASE WHEN ISNUMERIC(Series) = 1 THEN CONVERT(nvarchar(20),

CONVERT(int, Series)) ELSE Series END, 20)

And the final set of results:

BookID Title Volume Series

5 Journal of Sloths 1 1

9 Journal of Sloths 1 02

2 Journal of Sloths 1 3

3 Journal of Sloths 1 new

8 Journal of Sloths 02

6 Journal of Sloths 4

4 Journal of Sloths 5

7 Journal of Sloths 012

1 Journal of Sloths 13

Obviously, the more logic that is added to the ORDER BY clause, the more performance will be affected. Choose the simplest query that satisfies the requirements imposed by your data, and always remember to evaluate the performance under real-world conditions.