Forum Library, Charter, Moderators: coopster & jatar k

PHP Server Side Scripting Forum

    
a regex question
sdani

10+ Year Member



 
Msg#: 3530106 posted 10:04 am on Dec 18, 2007 (gmt 0)

Greetings,

I have a phpBB2 board with a mod that prevents non-members from posting URLs. This mod has stopped spam bots big time, but it is also hurting some valid users who want to post ordinary text containing things like .com or .net.

How can I modify this regular expression to check for a complete, valid URL and not just ".com", ".net", etc.?

Thanks for any suggestions.

//-- mod : Active-Member-URLs-Only -----------------------------------------------------
{
if( preg_match("/(http|\bwww\.|\.(com|us|net|biz|info|org|ru|su)\b)/i",$HTTP_POST_VARS['message']) ){
message_die(GENERAL_ERROR, sprintf($lang['url_active_only_message'],intval($board_config['url_post_posts']),intval($board_config['url_post_days'])));
}
}
//-- fin mod : Active-Member-URLs-Only -------------------------------------------------
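To see why valid users get caught, here is a minimal sketch that runs the mod's pattern against a few messages containing no full URL. The sample messages and the helper name is_blocked() are made up for illustration and are not part of the mod.

```php
<?php
// The pattern from the mod above, with "|" as the alternation operator.
$original = '/(http|\bwww\.|\.(com|us|net|biz|info|org|ru|su)\b)/i';

// Hypothetical helper: true when the mod's pattern would reject the message.
function is_blocked($pattern, $message) {
    return preg_match($pattern, $message) === 1;
}

// Hypothetical sample messages; none of them contains a complete URL.
$samples = array(
    'I work for a dot.com startup',        // ".com" followed by a word boundary
    'Is ASP.net any good?',                // ".net" followed by a word boundary
    'What does the http status 404 mean?', // the bare word "http"
    'Just a plain sentence.',              // a full stop with no TLD after it
);

foreach ($samples as $message) {
    echo $message, ' => ', is_blocked($original, $message) ? 'blocked' : 'allowed', "\n";
}
```

The first three messages are rejected because the pattern only looks for fragments ("http", "www.", or a dot followed by one of the listed TLDs), not for a whole address.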

 

PHP_Chimp

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 3530106 posted 1:38 pm on Dec 18, 2007 (gmt 0)

"/(http|\bwww\.|\.(com|us|net|biz|info|org|ru|su)\b)/i"

To check for complete addresses you could use -

<?php
// Sample addresses the pattern is expected to match.
$test = array(
    "http://www.example.com/",
    "www.example.com",
    'example.com',
    'example.com?this=that',
    'www.example.com/',
    'example.com/somedirectory/some_page.html?var=var1&var2=4#fragment'
);

foreach ($test as $v) {
    // PATTERN BELOW: optional "http://", optional "www.", then a run of URL characters
    $p = "%(?:http://)?(?:www\.)?[\w\.\?#/&-]+%i";
    //
    if (preg_match($p, $v)) {
        echo "$v - Works<br />\n";
    } else {
        echo "$v - Not Working<br />\n";
    }
}
?>

I haven't put in any capturing groups, as I don't know whether you need them or not (and it's a little faster without the capturing).
I have removed your test for specific domains, as you were missing a lot of .co.COUNTRY domains. If you do want to test for specific domains, the regex will need to be improved.
This should stop people typing in a complete domain, but it will also stop people who type 'end of sentence.start of another', as the 'sentence.start' part will trigger the regex. However, most people are not going to forget the space after a full stop. The same applies if they leave out the spaces around & or #.
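One caveat about the pattern as posted: both prefixes are optional and the pattern is not anchored, so the character class on its own will match any single word, including 'hello'. The following is only a sketch of a tighter check, under the assumption that "looks like a URL" means either an explicit http:// or https:// scheme, or a dotted host name whose last label is at least two letters. The function name contains_url() and that definition are assumptions made for illustration, not something taken from the mod or from phpBB.

```php
<?php
// Sketch only: contains_url() is a hypothetical helper, not part of phpBB.
function contains_url($message) {
    // One or more dot-terminated labels (letters, digits, inner hyphens),
    // followed by an alphabetic final label of 2+ letters, e.g. example.com
    $host = '(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}';

    // Either an explicit scheme followed by non-space characters,
    // or a bare host name ending on a word boundary.
    $pattern = '%\b(?:https?://\S+|' . $host . '\b)%i';

    return preg_match($pattern, $message) === 1;
}

// The posted pattern matches a plain word because everything before
// the character class is optional.
$posted = '%(?:http://)?(?:www\.)?[\w\.\?#/&-]+%i';
var_dump(preg_match($posted, 'hello') === 1);

// Plain text, a lone ".com" and hyphenated words are not flagged by the sketch.
var_dump(contains_url('hello world'));
var_dump(contains_url('I like .net and .com things'));
var_dump(contains_url('Mrs Smith-Jones said hi'));

// Full addresses are flagged.
var_dump(contains_url('see http://www.example.com/ for details'));
var_dump(contains_url('visit example.com?this=that'));
```

This sketch still flags a missing space after a full stop ('sentence.start'), as well as things like 'dot.com' or a file name such as 'notes.txt', so it trades one kind of false positive for a smaller set rather than removing them.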

[edited by: PHP_Chimp at 1:46 pm (utc) on Dec. 18, 2007]

PHP_Chimp

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 3530106 posted 8:15 pm on Dec 18, 2007 (gmt 0)

Just thought -
Words with a - in them will also match the regex. Although most people don't use them, hyphens are valid in words, so you may want to make it clear to people that they can't use hyphenated words (which may be a big problem for surnames).

You can also get rid of the i modifier, as it isn't needed.
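Both points can be checked with a short sketch. The helper name same_result() and the sample strings are assumptions made for illustration. The i modifier makes no difference because \w already covers upper and lower case, and the literal "http://" and "www." prefixes are optional anyway.

```php
<?php
// The pattern from the earlier post, with and without the i modifier.
$with_i    = '%(?:http://)?(?:www\.)?[\w\.\?#/&-]+%i';
$without_i = '%(?:http://)?(?:www\.)?[\w\.\?#/&-]+%';

// Hypothetical helper: do both patterns agree on whether the subject matches?
function same_result($a, $b, $subject) {
    return preg_match($a, $subject) === preg_match($b, $subject);
}

// Hypothetical samples: an upper-case address, a hyphenated surname, a plain address.
$samples = array('HTTP://WWW.EXAMPLE.COM/', 'Smith-Jones', 'example.com');

foreach ($samples as $s) {
    echo $s, ' => ',
        preg_match($without_i, $s) ? 'matches' : 'no match', ', ',
        same_result($with_i, $without_i, $s) ? 'same with i' : 'differs with i', "\n";
}
```

The hyphenated surname matches in both cases, which is the problem described above.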
