Forum Library, Charter, Moderators: coopster & jatar k

PHP Server Side Scripting Forum

    
a regex question
sdani




msg:3530108
 10:04 am on Dec 18, 2007 (gmt 0)

Greetings,

I have a phpBB2 board with a mod that prevents non-members from posting URLs. This mod has stopped spam bots big time, but it is also hurting some valid users who want to post text containing legitimate words like .com or .net.

How can I modify this regular expression to check for a complete, valid URL and not just ".com", ".net", etc.?

Thanks for any suggestions.

//-- mod : Active-Member-URLs-Only -----------------------------------------------------
{
    if( preg_match("/(http|\bwww\.|\.(com|us|net|biz|info|org|ru|su)\b)/i",$HTTP_POST_VARS['message']) ){
        message_die(GENERAL_ERROR, sprintf($lang['url_active_only_message'],intval($board_config['url_post_posts']),intval($board_config['url_post_days'])));
    }
}
//-- fin mod : Active-Member-URLs-Only -------------------------------------------------

 

PHP_Chimp




msg:3530230
 1:38 pm on Dec 18, 2007 (gmt 0)

"/(http|\bwww\.|\.(com|us|net|biz|info|org|ru|su)\b)/i"

To check for complete addresses you could use -

<?php
// Sample inputs that should all be recognised as addresses
$test = array("http://www.example.com/", "www.example.com", 'example.com', 'example.com?this=that', 'www.example.com/', 'example.com/somedirectory/some_page.html?var=var1&var2=4#fragment');
foreach ($test as $k => $v) {
    // PATTERN BELOW
    $p = "%(?:http://)?(?:www\.)?[\w\.\?#/&-]+%i";
    //
    if (preg_match($p, $v)) {
        echo "$v - Works<br />\n";
    }
    else {
        echo "$v - Not Working<br />\n";
    }
}
?>

I haven't put in any capturing groups, as I don't know whether you need them (and it's a little faster without capturing).
I have removed your test for specific domains, as you are missing a lot of .co.COUNTRY domains. If you do want to test for specific domains, the regex will need to be improved.
This should stop people from typing in a complete domain, but it will also block text such as 'end of sentence.start of another', because 'sentence.start' will trigger the regex. However, most people are not going to forget the space after a full stop. The same applies if they leave out the spaces around & or #.
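
One way to narrow the match is to require either an explicit http(s):// or www. prefix, or a dotted host name whose last label is purely alphabetic, instead of listing individual TLDs. The following is only a rough sketch under that assumption: looks_like_url() is a made-up helper name, not part of phpBB2 or the mod, and it will still flag things like 'sentence.start' or 'file.txt'.

```php
<?php
// Hypothetical helper (not part of phpBB2 or the mod): returns true when
// $text appears to contain a URL or a bare host name.
function looks_like_url($text)
{
    // Branch 1: explicit scheme or a www. prefix followed by non-space characters.
    // Branch 2: one or more dot-terminated host labels, then an alphabetic
    //           final label of two or more letters (com, uk, info, ...).
    $pattern = '%(?:https?://|\bwww\.)\S+'
             . '|\b(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}\b%i';

    return preg_match($pattern, $text) === 1;
}

// Made-up sample messages to show what is and is not flagged.
$samples = array(
    'http://www.example.com/',
    'example.co.uk',
    'I work for a .net shop',
    'Version 1.5 is out',
    'end of sentence.start of another',
);
foreach ($samples as $s) {
    echo $s . ' - ' . (looks_like_url($s) ? 'blocked' : 'allowed') . "\n";
}
?>
```

In the mod, the preg_match() call could be swapped for looks_like_url($HTTP_POST_VARS['message']), leaving the message_die() line as it is.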

[edited by: PHP_Chimp at 1:46 pm (utc) on Dec. 18, 2007]

PHP_Chimp




msg:3530584
 8:15 pm on Dec 18, 2007 (gmt 0)

Just a thought -
Words with a - in them will also match the regex. Although most people don't use them, hyphens are valid in words. So you may want to make it clear to people that they can't use hyphenated words (which may be a big problem for surnames).

You can also get rid of the i modifier, as it isn't needed.
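
A quick way to see how broad the pattern is: run it, without the i modifier, against a few ordinary strings. This is just a sketch and the sample strings are made up. Because both prefixes are optional and \w is in the character class, any run of word characters matches, with or without a hyphen.

```php
<?php
// Pattern from the earlier post, with the i modifier dropped.
$p = '%(?:http://)?(?:www\.)?[\w\.\?#/&-]+%';

// Made-up samples: hyphenated words, a plain word, a missing space, punctuation only.
$samples = array('well-known', 'Smith-Jones', 'hello', 'sentence.start', '!!!');

$results = array();
foreach ($samples as $s) {
    $results[$s] = (preg_match($p, $s) === 1);
    echo $s . ' - ' . ($results[$s] ? 'matches' : 'no match') . "\n";
}
?>
```

Only the punctuation-only string gets through, so a pattern like this would need a required part (a scheme, www., or a dot between labels) before it could tell addresses from normal text.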

© Webmaster World 1996-2014 all rights reserved