Start a new topic

RegEx to Format and Abbreviate Street Names

Here is a large set of regular expressions designed to replace a street name with its standardized USPS abbreviation. (See the "Street Suffixes" and "Secondary Units" tabs). It takes an address such as "15800 Countrydrive Avenue blvd., apt. 1506" and formats it to "15800 Countrydrive Ave Blvd Apt 1506". This isn't perfect as it mistakes an intended part of the street name as an abbreviation. Hence "Avenue" becomes "Ave" and some of the meaning is lost. However if the street name includes a possible abbreviation (such as "Countrydrive") then the abbreviation is appropriately NOT applied.
Regular Expressions - Street Suffixes & Secondary Unit Designators.xlsx (UPDATED BY A LATER POST)

The list includes 202 Street Suffixes and 24 Secondary Unit Designators. The "Use Regular Expressions" check box must be painstakingly ticked for each item. The form needs a "Check All" button or link.

image


There are two regular expressions that should also be included, BLANK and PO Box. They are:

1. BLANK --> , AND \. (two match-on entries)
2. PO Box --> (?i)\b((P\.O\.)|(P\. O\.)|(PO BOX)|(POBOX)|(BOX)|(POB)|(P O BOX))\b

Hopefully someone can modify the regular expression format to restrict alterations to only the last word before the Secondary Unit Designator. That would be helpful.

1 person has this question

Patrick,
I am very new to Import-O-Matic and even newer to regular expressions! I just loaded your revised long version of the address abbreviations and I experienced an issue during the test. When I entered "123 Newton St Southeast", the dictionary converted it to "123 Newton St SoutheaStreet" Is there a fix for that? One more question, in our database, we spell out Road, Drive, Street etc. but we abbreviate directionals such as SE or NW. So, ideally the address in our database would read "123 Newton Street SE". Can that all be done with one dictionary? Any suggestions you have would be appreciated!
Katherine Flowers
Huntsville Hospital Foundation

Hi Katherine, 

I am sorry to hear you are experiencing trouble. Regular expressions are a really cool feature of ImportOmatic, so let's see if we can get this working! I have two suggestions you can try. First ensure you running the latest version of ImportOmatic 2.2.5. Second check to see that you are copying all spreadsheet columns (A, B, and C) into the Address Abbreviation dictionary. You should end up with a dictionary that looks similar to this:

image

I was not able to reproduce the incorrect formatting results you were getting. What might end up fixing the issue is to recreate the dictionary, which leads into your second question. 

It certainly is possible to format the street abbreviations as long form and restrict the directions to short form. To do this we will take the long form spreadsheet and replace the direction rules (first 8 rows) with the direction rules from the short form spreadsheet. You can download the below spreadsheet and create the dictionary, this should achieve the results you are looking for. 

Address Abbreviations (Long with Short Directions).xlsx

Please let us know if this works for you and if it solves the problem. If not we will definitely take another look at what's going on. 

Thanks!

Patrick

Patrick,
Thanks a million for making the new dictionary! I love it and I named it "Patrick's Amazing Address Dictionary" in your honor:) Also, we were using version 2.2.2 of IOM so that probably contributed to my other issue. Thanks again!
Katherine Flowers
Huntsville Hospital Foundation
Our organization is located in St. Charles. We have created a dictionary to convert data coming in to be St. however when we run the import we are getting values containing two periods. (i.e. St..) does anyone know what we are doing wrong? We are matching on the values below:
\bSaint\b
\bsaint\b
\bSt\b
\bst\b

 Omatic Suppot was able to help me. You have to change the format to (\bsaint\.|\bsaint\b)


Hi. I am trying to fake my way through regular expressions by making changes to some of these created by Omatic, but I am in Canada and many of the abbreviations we use are missing. Many of our rural constituents live on a "Range Road" which they sometimes spell out or incorrectly abbreviate to "Rge", but we want to see it as "Rg Rd"
I seem to have attempted to alter a code that will not change it unless it is at the end of the line or something? Obviously if the word "range" is in another word of the address, I don't want it abbreviated, but it could be preceded by a number or word and followed by "Rd." Could someone take a look at my code and provide advice?

(?i)(?
Also, when I tested this PO Box code from this list with "P.o. Box 25", it resulted with "P.o. PO Box 25 instead of replacing the P.o.
I didn't copy the whole dictionary in because they don't all apply to us in Canada and they don't all match the abbreviations we use. Would that have affected the way the dictionary understood it?

(?i)\b((P\.O\.)|(P\. O\.)|(PO BOX)|(POBOX)|(BOX)|(POB)|(P O BOX))\b
Does Blackbaud have an AddressAccelerator product for Canada? If so, I would import the data as is and then run AA against it. It is then matching to exact addresses per your postal system and not using simple find/replace.

Is there a regular expression for changing expiration dates?  In a CSV file the dates are fine, but if I save it and have to reopen the file, EXCEL changes the dates from Jun-16 to 16-Jun and RE doesn't like that.  I've got an Excel workaround but wondered if Omatic had a regular expression for this? 

Thanks.

Tammy

 

 

Tammy,

Take a look at this post: http://www.omaticsoftware.com/Forums/tabid/108/aft/1076/Default.aspx

It may not be exactly what you are looking for but it will get you on the right track.
Hi,

I'm testing out the Address Abbreviations (Long) but am running into a problem with Ave, St., etc when I am entering an apartment abbreviation as well.
I'm entering 234 West Ave Apt. 45 and it's giving me 234 West Ave, Apartment 45. Does anyone know the solution to this? I would like for Ave to be spelled out.

Thanks!




I would guess that first it is replacing " Apt" with ", Apartment" and then it is not recognizing "Ave," as being Avenue.

I'm very new to ImportOmatic and RegEx.  I want to import addresses that have been abbreviated, i.e. I need to set up a RegEx to change rd to Road and st to Street, etc. I was excited to see this topic and copied the Address abbreviations (Long) into my dictionary as instructed - but it does not work for me at all.  As this is entry is over a year old I wonder if there has been an update or change at all?

thanks

Jenny

One thing to check is that the "use regular expressions" is checked off for each entry. They way to do it automatically is to have a third column in the excel spreadsheet set to TRUE. It might already be on there but just make sure.

Thank you for your response Wayne

Yes, the RegExp box is checked for each entry and the TRUE column was copied in.  However, the only one that works is the one for the PO Box, i.e.

Replacement Value:  PO Box

Value to match on:  (?i)\b((P\.O\.)|(P\. O\.)|(PO BOX)|(POBOX)|(BOX)|(POB)|(P O BOX))\b

None of the others work, a couple of examples are:

Replacement Value:  Street

Value to match on:  (?i)(?

Replacement Value: Road

Value to match on:  (?i)(?

I'm stumped.  Any advice you can give would be very helpful.

thanks

Jenny

 

Login or Signup to post a comment