Forum Moderators: coopster & jatar k

PHP regular expression experience

12:11 pm on Aug 30, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:Oct 3, 2004
votes: 0

Hello world, I need to analyze a page with regular expressions to extract the content from it. The problem is the following: the page is built very simply, with all content placed inside <p>{content}</p> tags, and I end up with a lot of junk information when I process it this way!
Has anybody faced this problem of extracting exactly the right content from pages?
Thanks in advance!

1:27 pm on Aug 30, 2005 (gmt 0)

New User

joined:Feb 2, 2005
votes: 0

You can try this:

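$reg_exp = array(
    '@<style\b[\s\S]*?</style>@i',   // style blocks
    '@<script\b[\s\S]*?</script>@i', // script blocks
    '@<!--[\s\S]*?-->@',             // HTML comments
    '@<[\/\!]*?[^<>]*?>@si',         // any remaining tag
    '@<[\/\!]*?[^<>]*?@si',          // stray "<" left behind
    // (the rest of the pattern list was cut off in the original post)
);
$results = preg_replace($reg_exp, '', $your_html_goes_here);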

After that, $results should hold 'content only'. I should warn you that this code still has some problems (mostly with JavaScript detection).
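
If the goal is only the text inside the <p> tags, a DOM parser may be more robust than regular expressions, since it sidesteps most of the script-detection trouble. Below is a minimal sketch of that alternative, assuming PHP's bundled DOM extension is available. The function name extract_paragraph_text and the $sample page are hypothetical, made up for illustration and not taken from this thread.

```php
<?php
// Hypothetical helper (not from the thread): collect the text of every <p>
// element, after discarding <script> and <style> elements.
function extract_paragraph_text(string $html): array
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid; keep libxml's parse warnings quiet.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Copy the node list to an array first, because we modify the tree while looping.
    foreach (iterator_to_array($xpath->query('//script | //style')) as $node) {
        $node->parentNode->removeChild($node);
    }

    $paragraphs = array();
    foreach ($xpath->query('//p') as $p) {
        // textContent flattens nested tags (<b>, <a>, ...) to plain text;
        // runs of whitespace are collapsed to a single space.
        $text = trim(preg_replace('/\s+/', ' ', $p->textContent));
        if ($text !== '') {
            $paragraphs[] = $text;
        }
    }
    return $paragraphs;
}

// Made-up sample page, for illustration only.
$sample = '<html><head><style>p { color: red; }</style></head><body>'
        . '<p>First <b>paragraph</b>.</p><script>var x = 1;</script>'
        . '<p>Second one.</p></body></html>';
print_r(extract_paragraph_text($sample));
```

This returns one array entry per non-empty paragraph, so any remaining junk can be filtered paragraph by paragraph (for example by length) rather than with one large pattern.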

