{"id":45346,"date":"2025-09-24T11:16:31","date_gmt":"2025-09-24T01:16:31","guid":{"rendered":"https:\/\/database.guide\/?p=45346"},"modified":"2025-09-24T11:16:33","modified_gmt":"2025-09-24T01:16:33","slug":"cleaning-numeric-strings-before-conversion-in-sql-server","status":"publish","type":"post","link":"https:\/\/database.guide\/cleaning-numeric-strings-before-conversion-in-sql-server\/","title":{"rendered":"Cleaning Numeric Strings Before Conversion in SQL Server"},"content":{"rendered":"\n<p class=\"\">Working with messy numeric data is one of those unavoidable realities in database development. Whether you&#8217;re importing data from CSV files, web APIs, or legacy systems, you&#8217;ll often encounter numeric values stored as strings with all sorts of unwanted characters mixed in. SQL Server&#8217;s conversion functions are pretty strict about what they&#8217;ll accept, so you need to clean up these strings before attempting any conversions.<\/p>\n\n\n\n<!--more-->\n\n\n\n<h2 class=\"wp-block-heading\">The Challenge with Dirty Numeric Data<\/h2>\n\n\n\n<p class=\"\">When SQL Server tries to convert a string to a numeric type using functions like <code><a href=\"https:\/\/database.guide\/how-cast-works-in-sql-server\/\" data-type=\"post\" data-id=\"22481\">CAST()<\/a><\/code>, <code><a href=\"https:\/\/database.guide\/convert-in-sql-server\/\" data-type=\"post\" data-id=\"22517\">CONVERT()<\/a><\/code>, or their <code>TRY_<\/code> equivalents <code><a href=\"https:\/\/database.guide\/how-try_cast-works-in-sql-server\/\" data-type=\"post\" data-id=\"22531\">TRY_CAST()<\/a><\/code> and <code><a href=\"https:\/\/database.guide\/how-try_convert-works-in-sql-server\/\" data-type=\"post\" data-id=\"22541\">TRY_CONVERT()<\/a><\/code>, it expects the string to be in a clean, recognizable format. Throw in some extra spaces, currency symbols, formatting characters, or non-numeric junk, and you&#8217;ll either get conversion errors or NULL values where you expected numbers.<\/p>\n\n\n\n<p class=\"\">Usually the best way around this is to systematically strip out the unwanted characters and normalize the format before attempting the conversion. This approach is typically more reliable than trying to handle every edge case during the conversion itself.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Common Cleaning Options<\/h2>\n\n\n\n<p class=\"\">Most numeric string cleaning involves removing or replacing specific character patterns. Here are some common options:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li class=\"\"><code><a href=\"https:\/\/database.guide\/how-to-replace-all-occurrences-of-a-string-with-another-string-in-sql-server-replace\/\" data-type=\"post\" data-id=\"1986\">REPLACE()<\/a><\/code> for removing specific characters like currency symbols, commas, or parentheses<\/li>\n\n\n\n<li class=\"\"><code><a href=\"https:\/\/database.guide\/how-the-translate-function-works-in-sql-server-t-sql\/\" data-type=\"post\" data-id=\"4079\">TRANSLATE()<\/a><\/code> to do a similar thing to <code>REPLACE()<\/code> (although these functions <a href=\"https:\/\/database.guide\/sql-server-replace-vs-translate-what-are-the-differences\/\" data-type=\"post\" data-id=\"1925\">work differently<\/a>)<\/li>\n\n\n\n<li class=\"\"><code><a href=\"https:\/\/database.guide\/understanding-the-ltrim-function-in-sql-server\/\" data-type=\"post\" data-id=\"33576\">LTRIM()<\/a><\/code> and <code><a href=\"https:\/\/database.guide\/overview-of-the-rtrim-function-in-sql-server\/\" data-type=\"post\" data-id=\"33578\">RTRIM()<\/a><\/code> for stripping whitespace from the beginning and end<\/li>\n\n\n\n<li class=\"\"><code><a href=\"https:\/\/database.guide\/how-to-return-a-substring-from-a-string-in-sql-server-using-substring-function\/\" data-type=\"post\" data-id=\"1809\">SUBSTRING()<\/a><\/code> for extracting portions of strings when the numeric part is embedded in text<\/li>\n\n\n\n<li class=\"\"><code><a href=\"https:\/\/database.guide\/how-the-patindex-function-works-in-sql-server-t-sql\/\" data-type=\"post\" data-id=\"4008\">PATINDEX()<\/a><\/code> for finding positions of patterns, useful for locating where numeric data starts or ends<\/li>\n\n\n\n<li class=\"\">Regular expressions through CLR functions or newer SQL Server features for complex pattern matching<\/li>\n<\/ul>\n\n\n\n<p class=\"\">The trick is to apply these operations in the right order. You typically want to handle the most disruptive characters first, then work your way down to the subtle formatting issues.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Example<\/h2>\n\n\n\n<p class=\"\">In this example we&#8217;ll use <code>REPLACE()<\/code> to clean the numeric strings before conversion. We&#8217;ll also look at combining <code>REPLACE()<\/code> with <code>TRANSLATE()<\/code> to achieve the same result.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Sample Data<\/h3>\n\n\n\n<p class=\"\">Suppose we have the following table and data:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>CREATE TABLE SalesRaw (\n    Id INT IDENTITY(1,1),\n    RawAmount VARCHAR(50)\n);\n\nINSERT INTO SalesRaw (RawAmount)\nVALUES\n    (' 1,200 '),     -- extra spaces and a comma\n    ('$450.75'),     -- currency symbol and decimal\n    ('3000'),        -- clean number\n    ('12abc34'),     -- bad characters inside\n    ('(200)'),       -- negative stored in accounting style\n    (NULL),          -- null value\n    ('7 500');       -- embedded space\n\nSELECT * FROM SalesRaw;<\/code><\/pre>\n\n\n\n<p class=\"\">Output:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">Id  RawAmount<br>--  ---------<br>1    1,200   <br>2   $450.75  <br>3   3000     <br>4   12abc34  <br>5   (200)    <br>6   null     <br>7   7 500    <\/pre>\n\n\n\n<p class=\"\">The <code>RawAmount<\/code> column is <code>VARCHAR<\/code>, which is a string type (not a numeric type). <\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Attempting to Convert the Data Without Cleaning<\/h3>\n\n\n\n<p class=\"\">Now if we try to convert the above column directly to a numeric type, we&#8217;ll get errors on most rows:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT \n    Id,\n    RawAmount,\n    CAST(RawAmount AS DECIMAL(18,2)) AS CleanAmount\nFROM SalesRaw;<\/code><\/pre>\n\n\n\n<p class=\"\">Output:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">Msg 8114, Level 16, State 5, Line 1<br>Error converting data type varchar to numeric.<\/pre>\n\n\n\n<p class=\"\">It threw an error at line 1, due to it having a value that couldn&#8217;t be converted to our numeric type (in this case <code>DECIMAL(18,2)<\/code>).<\/p>\n\n\n\n<p class=\"\">We could avoid the error by replacing <code>CAST()<\/code> with <code>TRY_CAST()<\/code>, but that still won&#8217;t result in a conversion:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT \n    Id,\n    RawAmount,\n    TRY_CAST(RawAmount AS DECIMAL(18,2)) AS CleanAmount\nFROM SalesRaw;<\/code><\/pre>\n\n\n\n<p class=\"\">Output:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">Id  RawAmount  CleanAmount<br>--  ---------  -----------<br>1    1,200     null       <br>2   $450.75    null       <br>3   3000       3000       <br>4   12abc34    null       <br>5   (200)      null       <br>6   null       null       <br>7   7 500      null    <\/pre>\n\n\n\n<p class=\"\">This function returns <code>null<\/code> whenever a conversion fails, and so most of our rows return <code>null<\/code>, which means most of the conversions failed.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Using <code>REPLACE()<\/code> to Clean the Data<\/h3>\n\n\n\n<p class=\"\">In order to get the conversions to succeed, we&#8217;ll need to clean the raw data before attempting to convert it. The idea is simple in that we will strip out everything that isn\u2019t part of a valid number. In particular, we&#8217;ll remove commas, spaces, and dollar signs. We&#8217;ll also handle accounting-style negatives (e.g., <code>(200)<\/code> becomes <code>-200<\/code>).<\/p>\n\n\n\n<p class=\"\">One way to do this in SQL Server is by using a custom <code>REPLACE()<\/code> chain.<\/p>\n\n\n\n<p class=\"\">Here\u2019s an approach that handles most of the cases above:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT \n    Id,\n    RawAmount,\n    TRY_CAST(\n        REPLACE(\n            REPLACE(\n                REPLACE(\n                    REPLACE(\n                        REPLACE(\n                            RawAmount, '(', '-'\n                        ), ')', ''\n                    ), '$', ''\n                ), ',', ''\n            ), ' ', ''\n        ) AS DECIMAL(18,2)\n    ) AS CleanAmount\nFROM SalesRaw;<\/code><\/pre>\n\n\n\n<p class=\"\">Result:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">Id  RawAmount  CleanAmount<br>--  ---------  -----------<br>1    1,200     1200       <br>2   $450.75    450.75     <br>3   3000       3000       <br>4   12abc34    null       <br>5   (200)      -200       <br>6   null       null       <br>7   7 500      7500       <\/pre>\n\n\n\n<p class=\"\">We can see that most of the rows were converted successfully. Two failed, because our data cleansing process didn&#8217;t catch them. This is mainly because one of the raw values was <code>null<\/code>, which is an unknown value, so we wouldn&#8217;t be able to clean it anyway (there&#8217;s nothing to clean). The other value contains a bunch of letters in the middle, which suggests that it&#8217;s either not a number, or it uses a different base (e.g., base-16), which means that it would need more than just a bit of a clean.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Using <code>TRANSLATE()<\/code> to Clean the Data<\/h3>\n\n\n\n<p class=\"\">Here&#8217;s an example that uses <code>TRANSLATE()<\/code> to replace some of the <code>REPLACE()<\/code> functions (although we still use one <code>REPLACE()<\/code> function):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT \n    Id,\n    RawAmount,\n    CleanAmount = TRY_CAST(\n        REPLACE(\n            TRANSLATE(RawAmount, '()$, ', '-' + REPLICATE(' ', 4)),\n            ' ', ''\n        ) AS DECIMAL(18,2)\n    )\nFROM SalesRaw;<\/code><\/pre>\n\n\n\n<p class=\"\">Result:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">Id  RawAmount  CleanAmount<br>--  ---------  -----------<br>1    1,200     1200       <br>2   $450.75    450.75     <br>3   3000       3000       <br>4   12abc34    null       <br>5   (200)      -200       <br>6   null       null       <br>7   7 500      7500       <\/pre>\n\n\n\n<p class=\"\">In this case the code is a lot more concise, however, you&#8217;ll need to be careful that the arguments work for your scenario, otherwise you could end up with a mess.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Takeaways<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li class=\"\">It&#8217;s good practice to cleanse numeric strings before conversion in order to avoid runtime errors.<\/li>\n\n\n\n<li class=\"\">You can use <code>REPLACE()<\/code> (and\/or <code>TRANSLATE()<\/code> if you prefer) to strip out known problem characters.<\/li>\n\n\n\n<li class=\"\">Use <code>TRY_CAST()<\/code> or <code>TRY_CONVERT()<\/code> to avoid failures when bad data slips through (and set the bad values to <code>null<\/code>).<\/li>\n\n\n\n<li class=\"\">Watch for accounting-style negatives and other business-specific quirks.<\/li>\n<\/ul>\n\n\n\n<p class=\"\">With a reliable cleansing pattern in place, you can safely convert messy text-based numbers into proper numeric data types and move on with analysis or reporting.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Working with messy numeric data is one of those unavoidable realities in database development. Whether you&#8217;re importing data from CSV files, web APIs, or legacy systems, you&#8217;ll often encounter numeric values stored as strings with all sorts of unwanted characters mixed in. SQL Server&#8217;s conversion functions are pretty strict about what they&#8217;ll accept, so you &#8230; <a title=\"Cleaning Numeric Strings Before Conversion in SQL Server\" class=\"read-more\" href=\"https:\/\/database.guide\/cleaning-numeric-strings-before-conversion-in-sql-server\/\" aria-label=\"Read more about Cleaning Numeric Strings Before Conversion in SQL Server\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5],"tags":[10,77,90,69,61],"class_list":["post-45346","post","type-post","status-publish","format-standard","hentry","category-sql-server","tag-how-to","tag-mssql","tag-number-format","tag-string-characters","tag-t-sql"],"_links":{"self":[{"href":"https:\/\/database.guide\/wp-json\/wp\/v2\/posts\/45346","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/database.guide\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/database.guide\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/database.guide\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/database.guide\/wp-json\/wp\/v2\/comments?post=45346"}],"version-history":[{"count":5,"href":"https:\/\/database.guide\/wp-json\/wp\/v2\/posts\/45346\/revisions"}],"predecessor-version":[{"id":45364,"href":"https:\/\/database.guide\/wp-json\/wp\/v2\/posts\/45346\/revisions\/45364"}],"wp:attachment":[{"href":"https:\/\/database.guide\/wp-json\/wp\/v2\/media?parent=45346"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/database.guide\/wp-json\/wp\/v2\/categories?post=45346"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/database.guide\/wp-json\/wp\/v2\/tags?post=45346"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}