Start a new topic

RegEx Needed for Inconsistent Zipcodes

I have an import file that has zip codes with ending 0s. 
E.g. 123450000 should be 12345


But also contains zip codes with the correct +4 and the 5 digit zip code. 


Ideally, I want to import zip codes as one of the following:

123450000 = 12345

12345 = 12345

123456789 = 12345-6789


Is there a regular expression that can assist to format all the variations of zip code I have in my import file?


Thanks for your help!




Chloe, I am not sure that regex could do all of that however regex is not my strong suit.


I would personally use a bit of VB code in the Omatic API based on the order I wanted and what was most important to me. There would also be an assumption that the original data source has limited validity checking (i.e. it allows 5 digit zip + 0000 or possibly other non-valid formats) hence there may be a future need for 'data cleansing' if needed.


Assume that column "A" is where your zipcode data is.  Note: none of the below is tested, syntactically checked or 'thoughtful/elegant' in approach.


dim zipval as string = ""

dim ziplen as integer = 0


'truncate to max 9 char's - may want to to check that they are all numeric but......

zipval=left(import.Fields.GetbyExcelName("A").Value,9) 

ziplen=;len(zipval)


'if len is 9 then check if last 4 is 0000, if so we truncate the 0000 if not we put it in #####-#### format

if if ziplen=9 then

      if mid(zipval,6,4)="0000" then 

          zipval=mid(zipval,1,5)

     else

          zipval=mid(zipval,1,5)+"-"+mid(zipval,1,5)

     end if

else

   'as the len is not 9 char's we will simply take the first 5 #'

    zipval=mid(zipval,1,5)

end if

import.Fields.GetbyExcelName("A").Value=zipval


 

I think it's as simple as having a dictionary that removes all instances of '0000' because I don't believe there are any zip codes with four zeros in a row. 

Login to post a comment