Somewhere along the development process we had a mix-up between whether the restaurant reviews on our site were to be at, for example, review.php?id=123 or reviews.php?id=123 - note the absence/presence of the 's'.
So we opted for one of them, which was the review.php?id=123 format and stuck with it. We then noticed that Google had indexed some restaurant reviews using the reviews.php?id=123 format, much to our horror, as it was resulting in 404s.
We responded by adding to our robots.txt file the following:
Disallow: /reviews.php
So Google stopped listing reviews.php?id=123 on the search returns, and all was well.
However, in Google Webmaster Tools, some 3 months after adding the above to robots.txt, I'm seeing today "URLs restricted by robots.txt (41)", and thereunder are listed 41 URLs (right up to yesterday's date), with reviews.php?id=123 format addresses.
Am I losing traffic from Google as a result of this?
Have I handled this correctly?
And is there anything I can do?
Appreciate the help of my peers. Thank you kindly.
The best bet is to tell the bots that the pages are either gone or moved, so remove the directive in robots.txt and either serve a 404 or 410 for the reviews.php pages, or 301 redirect those pages to the proper review.php pages. I'd probably choose the 301.
Either way, G will most likely keep those pages hanging around in the supplemental index for six months to a year, but it should be no cause for concern.
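For the "gone" option, a minimal sketch of what the .htaccess entry might look like. It assumes Apache with mod_alias enabled, and that the Disallow line has already been removed from robots.txt so the bots can actually fetch the URL and see the response:

```apache
# Assumption: Apache with mod_alias, placed in the site root .htaccess.
# Answers every request for /reviews.php (whatever the query string) with 410 Gone.
Redirect gone /reviews.php
```

A 410 only says the page is gone; it passes nothing on to the review.php pages, which is one reason to lean towards the 301.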
In all other cases the custom 404 would be the way to go.
I would not use the robots.txt protocol for this, as you DO want Google to access the old URLs and see that the files have moved or gone.
Can I ask one final question, following on from your advice?
If we opt to implement a 301 redirect, as you suggest, what should we use in the .htaccess file?
Note we have, according to Google Webmaster Tools Diagnostic, Web Crawl page, 54 instances of "URLs restricted by robots.txt"
Therefore, do I need to include 54 lines in my .htaccess file, or is there one line that I can use? All 54 instances of "URLs restricted by robots.txt" are of the same type, where thedomaininquestion.com/reviews.php?id=128 SHOULD BE thedomaininquestion.com/review.php?id=128
All 54 instances have this one and the same issue.
Therefore it's probable there's some way to include one line rather than a line like this one (following), 54 times, no?
redirect 301 /reviews.php?id=128 [thedomaininquestion.com...]
Thoughts?
Thanks in advance!
To get good answers to questions about .htaccess, you might want to post your question here: [webmasterworld.com...]
redirect 301 /reviews.php [thedomaininquestion.com...]
That's worked for me in the past - test it to make sure it works for you.
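To spell out why one line can cover all 54 URLs: mod_alias matches on the URL path only and carries the query string over to the target, so ?id=128 should come through untouched. A hedged sketch, assuming Apache with mod_alias enabled and using www.example.com as a stand-in for the real host name (it is not the poster's actual URL):

```apache
# Assumption: this goes in the .htaccess in the site root, mod_alias enabled.
# www.example.com is a placeholder; substitute the real host name.
# /reviews.php?id=128 should be redirected to /review.php?id=128,
# because the query string is passed through unchanged.
Redirect 301 /reviews.php http://www.example.com/review.php

# Alternative if mod_rewrite is preferred (use one approach, not both):
# RewriteEngine On
# RewriteRule ^reviews\.php$ /review.php [R=301,L]
```

The Disallow: /reviews.php line has to come out of robots.txt as well, otherwise the bots never request the old URLs and never see the redirect. Requesting one of the old URLs and checking for a 301 status and the expected Location header is a quick way to confirm it behaves as intended.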