Msg#: 4314267 posted 6:00 pm on May 18, 2011 (gmt 0)
My goal is to get Google to stop crawling specific URLs and to set up an accurate sitemap. I am running a phpBB3 message board with SEF URLs. The problem I have is that the forum script generates a URL for every reply in a topic, basically anchors.
This creates thousands of URLs that are useless in the eyes of the search engine, even though the users like them for bookmarking.
Direct Link to post = brewerscubs.com/messageboard/milwaukee-brewers/carlos-gomez-16796.html#p412994
I have been researching and trying to find a way to tell robots.txt to disallow any URL containing "#p" but have not had any luck. Also, my host, SiteGround, is busting my marbles about CPU usage from the testing I have been doing with GSiteCrawler, so my days of testing are numbered... I need to get it right this time, so I turn to the experts :)
The topic number in the URL is converted on the fly.
So the redirect would be: redirect 301 /messageboard/(forum number) http://www.example.com/messageboard/(forum NAME)
But what about the change to the second part, the topic? No way could I create a redirect for every topic as there are thousands.
Would it be best to add a disallow for the old URLs, ignore them, or take another route?
The site has been active for years, and I am just now paying attention to SEO. The pages in the board had no meta data at all prior to last week. Now the description is pulled from the text on the page and the title is the topic title.
Msg#: 4314267 posted 10:07 pm on May 18, 2011 (gmt 0)
Use a RewriteRule to match incoming external URL requests and internally rewrite them to a PHP script that can then interpret the old URL from the request, look up the new URL in an array or database, and then send the correct HTTP 301 redirect.
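As a rough illustration of the lookup step only, here is a minimal sketch in Python rather than PHP. Everything specific in it is an assumption: the old URL shape (/messageboard/forum-number/topic-number), the forum number 12, the two lookup tables, and the function name redirect_for are all hypothetical. The new URL shape follows the direct link quoted in the first post. On a real board the tables would more likely be database queries than in-memory arrays.

```python
import re
from typing import Optional, Tuple

# Hypothetical lookup tables; a real script would probably query the forum database.
FORUM_NAMES = {12: "milwaukee-brewers"}   # old forum number -> SEF forum name
TOPIC_SLUGS = {16796: "carlos-gomez"}     # topic number -> SEF topic slug

# Assumed shape of an old URL path: /messageboard/<forum number>/<topic number>
OLD_TOPIC = re.compile(r"^/messageboard/(\d+)/(\d+)/?$")

BASE = "http://www.example.com/messageboard"


def redirect_for(old_path: str) -> Tuple[int, Optional[str]]:
    """Return (status, location) for an old-style path.

    301 plus the new URL when both lookups succeed, otherwise 404 and None.
    """
    match = OLD_TOPIC.match(old_path)
    if match is None:
        return 404, None

    forum_id = int(match.group(1))
    topic_id = int(match.group(2))

    forum_name = FORUM_NAMES.get(forum_id)
    topic_slug = TOPIC_SLUGS.get(topic_id)
    if forum_name is None or topic_slug is None:
        # Unknown forum or topic: do not guess a target.
        return 404, None

    return 301, f"{BASE}/{forum_name}/{topic_slug}-{topic_id}.html"


if __name__ == "__main__":
    print(redirect_for("/messageboard/12/16796"))
```

The point of the design is that one rewrite rule hands every old-style request to one script, so there is no need for a separate redirect line per topic. The script would then send the status and Location header returned by the lookup.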