a regex question

     
10:04 am on Dec 18, 2007 (gmt 0)


Greetings,

I have a phpBB2 board with a mod that prevents non-members from posting URLs. The mod has stopped spam bots big time, but it is also hurting some valid users who want to post ordinary text containing strings like ".com" or ".net".

How can I modify this regular expression so that it checks for a complete, valid URL and not just ".com", ".net", etc.?

Thanks for any suggestions.

//-- mod : Active-Member-URLs-Only -----------------------------------------------------
{
if( preg_match("/(http|\bwww\.|\.(com|us|net|biz|info|org|ru|su)\b)/i",$HTTP_POST_VARS['message']) ){
message_die(GENERAL_ERROR, sprintf($lang['url_active_only_message'],intval($board_config['url_post_posts']),intval($board_config['url_post_days'])));
}
}
//-- fin mod : Active-Member-URLs-Only -------------------------------------------------
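
To illustrate the over-blocking described above, here is a minimal sketch that runs the mod's pattern against a few messages. The sample messages and the `is_blocked()` helper are hypothetical and are not part of the phpBB2 mod.

```php
<?php
// Pattern from the mod above.
$original = '/(http|\bwww\.|\.(com|us|net|biz|info|org|ru|su)\b)/i';

// Hypothetical sample messages.
$samples = array(
    'Which is better for a business, .com or .net?',
    'The http protocol is stateless.',
    'Hello, nice board!',
    'Visit www.example.com today',
);

// Hypothetical helper: true when the pattern finds anything in the message.
function is_blocked($pattern, $message) {
    return preg_match($pattern, $message) === 1;
}

foreach ($samples as $message) {
    echo (is_blocked($original, $message) ? 'blocked' : 'allowed') . ": $message\n";
}
```

Only the third message gets through. The first two are blocked even though neither contains a usable address, because a bare ".com" or the word "http" is enough to trigger the pattern.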

1:38 pm on Dec 18, 2007 (gmt 0)


"/(http|\bwww\.|\.(com|us|net|biz|info|org|ru|su)\b)/i"

To check for complete addresses you could use -

<?php
$test = array("http://www.example.com/", "www.example.com", 'example.com', 'example.com?this=that', 'www.example.com/', 'example.com/somedirectory/some_page.html?var=var1&var2=4#fragment');
// PATTERN BELOW
$p = "%(?:http://)?(?:www\.)?[\w\.\?#/&-]+%i";
//
foreach ($test as $v) {
    if (preg_match($p, $v)) {
        echo "$v - Works<br />\n";
    } else {
        echo "$v - Not Working<br />\n";
    }
}
?>

I haven't put in any capturing groups as I don't know if you need them or not (and it's a little faster without the capturing).
I have removed your testing for specific domains, as you are missing a lot of .co.COUNTRY domains. If you do want to test for specific domains, the regex will need to be improved.
This should stop people typing in a complete domain, but it will also stop people who type 'end of sentence.start of another', as the 'sentence.start' part will trigger the regex. However, most people are not going to forget to put a space after a full stop. The same applies if they miss a space around & or #.
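
If false positives like 'sentence.start' are a concern, one possible tightening is to require either an explicit scheme, a www. prefix, or a dotted host name that ends in a listed TLD. The sketch below is only one way to do that. The `contains_url()` helper is hypothetical, and the TLD list is simply the one from the original mod, so .co.COUNTRY domains would still need to be added.

```php
<?php
// Hypothetical helper, not part of the phpBB2 mod.
function contains_url($message) {
    // TLD list copied from the original mod; extend as needed.
    $tlds = 'com|us|net|biz|info|org|ru|su';

    $pattern = '~\bhttps?://[^\s<>"]+'                          // explicit scheme
             . '|\bwww\.[^\s<>"]+'                              // www. prefix
             . '|\b(?:[a-z0-9-]+\.)+(?:' . $tlds . ')\b~i';     // host.tld

    return preg_match($pattern, $message) === 1;
}

// Hypothetical sample messages.
$samples = array(
    'http://www.example.com/',
    'example.com?this=that',
    'Which is better, .com or .net?',
    'end of sentence.start of another',
);

foreach ($samples as $message) {
    echo (contains_url($message) ? 'URL' : 'no URL') . ": $message\n";
}
```

The trade-off is that a missing space before a word that happens to be a listed TLD (for example 'sentence.us') would still be flagged.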

[edited by: PHP_Chimp at 1:46 pm (utc) on Dec. 18, 2007]

8:15 pm on Dec 18, 2007 (gmt 0)


Just thought -
Words with a - in them will also match the regex. Although most people don't use them, hyphens are valid in words, so you may want to make it clear to people that they can't use hyphenated words (this may be a big problem for surnames).

You can also get rid of the i modifier, as it isn't needed.
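
A quick way to check both points is to run the pattern with and without the i modifier on a few inputs. This is a minimal sketch: the sample inputs and the `matched_text()` helper are made up for illustration.

```php
<?php
// Pattern from the earlier post, with and without the i modifier.
$withFlag    = '%(?:http://)?(?:www\.)?[\w\.\?#/&-]+%i';
$withoutFlag = '%(?:http://)?(?:www\.)?[\w\.\?#/&-]+%';

// Hypothetical inputs: a hyphenated surname, an upper-case address, a plain word.
$samples = array('Smith-Jones', 'WWW.EXAMPLE.COM', 'hello');

// Hypothetical helper: returns the matched substring, or null when nothing matches.
function matched_text($pattern, $input) {
    return preg_match($pattern, $input, $m) === 1 ? $m[0] : null;
}

foreach ($samples as $input) {
    $a = matched_text($withFlag, $input);
    $b = matched_text($withoutFlag, $input);
    echo "$input: with i = " . var_export($a, true)
       . ", without i = " . var_export($b, true) . "\n";
}
```

Both versions match the same text in every case, since \w already covers upper and lower case. Note that the plain word 'hello' matches as well: both prefixes are optional, so any run of word characters satisfies the pattern, not only hyphenated words.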

 
