
a regex question

     
10:04 am on Dec 18, 2007 (gmt 0)




Greetings,

I have a phpBB2 board with a mod that prevents non-members from posting URLs. This mod has stopped spam bots big time, but it is also hurting some valid users who want to post ordinary text that contains strings like .com or .net.

How can I modify this regular expression so that it checks for a complete, valid URL and not just ".com", ".net", etc.?

Thanks for any suggestions.

//-- mod : Active-Member-URLs-Only -----------------------------------------------------
{
    if (preg_match("/(http|\bwww\.|\.(com|us|net|biz|info|org|ru|su)\b)/i", $HTTP_POST_VARS['message'])) {
        message_die(GENERAL_ERROR, sprintf($lang['url_active_only_message'], intval($board_config['url_post_posts']), intval($board_config['url_post_days'])));
    }
}
//-- fin mod : Active-Member-URLs-Only -------------------------------------------------

1:38 pm on Dec 18, 2007 (gmt 0)

"/(http|\bwww\.|\.(com|us|net|biz|info|org|ru|su)\b)/i"

To check for complete addresses you could use -

<?php
// Sample inputs: full URLs, bare hosts, and hosts with paths or query strings.
$test = array(
    "http://www.example.com/",
    "www.example.com",
    'example.com',
    'example.com?this=that',
    'www.example.com/',
    'example.com/somedirectory/some_page.html?var=var1&var2=4#fragment'
);
foreach ($test as $v) {
    // PATTERN BELOW
    $p = "%(?:http://)?(?:www\.)?[\w\.\?#/&-]+%i";
    //
    if (preg_match($p, $v)) {
        echo "$v - Works<br />\n";
    } else {
        echo "$v - Not Working<br />\n";
    }
}
?>

I haven't put in any capturing groups, as I don't know whether you need them (and it's a little faster without capturing).
I have removed your test for specific domains, as you are missing a lot of .co.COUNTRY domains. If you do want to test for specific domains, the regex will need to be improved.
This should stop people typing in a complete domain, but it will also stop people who type 'end of sentence.start of another', as the 'sentence.start' part will trigger the regex. However, most people are not going to forget to put a space after a full stop. The same applies if they miss a space around & or #.
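
One caveat with the pattern above: both prefixes are optional and the character class includes \w, so it has no required URL-shaped part and will match any ordinary word as well. A tighter check could require either an explicit scheme or www. prefix, or a host name ending in a known TLD. The following is only a sketch under assumptions: the function name contains_url is hypothetical, https is accepted in addition to http, and the TLD list is simply the one from the original mod, so it still misses .co.COUNTRY domains.

```php
<?php
// Hypothetical helper, not part of the original mod: returns true when the
// text contains something shaped like a URL, not just a ".com"-like fragment.
function contains_url($text)
{
    $pattern = '%
        (?:https?://|\bwww\.)[^\s<>"]+             # explicit scheme or www. prefix
        |
        \b(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+   # one or more host labels, each followed by a dot
        (?:com|us|net|biz|info|org|ru|su)\b        # TLD list copied from the original mod
    %xi';

    return preg_match($pattern, $text) === 1;
}

// Small demonstration with inputs that should and should not be flagged.
$samples = array(
    'www.example.com',
    'example.com?this=that',
    'I read about the .com and .net boom',
    'end of sentence.start of another',
);
foreach ($samples as $sample) {
    echo $sample, ' - ', contains_url($sample) ? 'flagged' : 'allowed', "\n";
}
```

Because a host label is required before the dot, a bare ".com" or ".net" in a sentence is left alone, while "example.com" is still caught. The i modifier does matter in this version, since http, www and the TLDs are literal lowercase text.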

[edited by: PHP_Chimp at 1:46 pm (utc) on Dec. 18, 2007]

8:15 pm on Dec 18, 2007 (gmt 0)




Just thought -
Words with a - in them will also match the regex. Although most people don't use them, hyphens are valid in words, so you may want to make it clear to people that they can't use hyphenated words (which may be a big problem for surnames).

You can also get rid of the i modifier, as it isn't needed.
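
Both points can be checked quickly. The sketch below uses the pattern from the earlier reply with the i modifier removed; the function name blocked_by_loose_pattern is made up for the illustration. Upper-case input still matches because \w covers both cases and the lower-case http:// and www. prefixes are optional.

```php
<?php
// Hypothetical helper for illustration only: applies the earlier pattern
// without the i modifier and reports whether the text would be blocked.
function blocked_by_loose_pattern($text)
{
    $pattern = '%(?:http://)?(?:www\.)?[\w\.\?#/&-]+%';

    return preg_match($pattern, $text) === 1;
}

// Hyphenated words and an upper-case host name.
$samples = array('Smith-Jones', 'well-known', 'WWW.EXAMPLE.COM');
foreach ($samples as $sample) {
    echo $sample, ' - ', blocked_by_loose_pattern($sample) ? 'blocked' : 'allowed', "\n";
}
```

If the pattern were changed so that http:// or www. became a required part, dropping the i modifier would no longer be safe, since HTTP:// or WWW. would then slip through.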

 
