How may rows are we talking about here? I would personally find a way to normalize the data as best you can.
welcome to WebmasterWorld, Runaard!
ideally you would have canonicalized the format of all phone numbers as part of cleansing and validating all user input values before inserting into your database.
you should read about SQL injection and test your forms to see if your application accepts malicious input.
in your situation i would use a regular expression in the where clause of the sql query.
I agree with eelix,
You could either use replace / regexp functions to only search for the numbers in the string, but this would run the replace or regexp function on all rows, not just the ones you want.
This could become pretty slow pretty fast.
If you could get all the phonenumbers as number only, it would be a lot better.
thanks for your replies, I ran the replace() sql function to normalize the database, but now i am struggling to format the user input to standardize it in the form XXXXXXXXXX
to be a valid number it's got to have 10 digits or a 1 followed by 10 digits.
remove all non-numeric characters, remove the leading 1 if any, if it's not 10 digits, reject the input.
you might go further and validate the area code.
|to be a valid number it's got to have 10 digits or a 1 followed by 10 digits. |
Outside of the US and Canada and a small number of other countries that solution fails.
Check out [aa-asterisk.org.uk...] for UK phone number format details for example.
The solution to the original problem can probably be found in libphonenumber at: [code.google.com...]
|(for example XXX-XXX-XXXX and XXX XXX XXXX and XXXXXXXXXX) |
of course you are correct but i was basing my assumption on the example provided in the OP.
I was taking the wider view that this thread will eventually be read by people from all over the world. :)
the libphonenumber option is interesting for international applications if you are using a supported language but i didn't see a port to PHP, for example. =8)
the numbers are in South African format, most people here write the number in the form XXX XXX XXXX, but XXXXXXXXXX would be better for a database, there is some external program that rolls back my 'replace()' changes to the database :( back to square one
Now all you need to do upon output is format it in a way the user seeing it is expecting the phone number to be formatted (might be locale or use specific (local numbers/ international exit code / ...)
The big advantage is that you have a specific and generic format in your database.
input -> stored
1-800-flowers -> 18003569377
+32/4184.108.40.206 -> 32485111111
Now as you learn locations, you learn mistakes people make on input, correct those.
E.g. some people assume because their internationel exit code is 00, that that is the case everywhere -> correct it.Or if you know that there is a discrepancy between (foolish) local traditions and actual numbers one can dial, you can correct it too:
00324220.127.116.11 -> 32471111111
+32(0)418.104.22.168 -> 32491111111
You can also add back missing info by e.g. adding back in the country code from an address:
Belgium 011/11.11.11 -> 3211111111
USA (212)555-5555 -> 12125555555
Be careful with checking lengths: e.g. in Germany (just like in Belgium) the length of a phone number is not fixed. And some countries have such a small population that their phone numbers look suspiciously short.
On output: if you know what/where it is going to be used, you can either output locally written numbers or a full international one as appropriate.
I've run multiple times into an application that is too US centric to be used properly and demanding a phone number in a US format - they'll usually end up with a fake one is they insist on not accepting my real one.
|And some countries have such a small population that their phone numbers look suspiciously short. |
The shortest number on the planet that I'm aware of is something like +290 3555 - and that includes the country code.