Google indexing weird links
ones with // in them
netchicken1 msg:3118665 6:44 pm on Oct 12, 2006 (gmt 0)
I have been following Googlebot as it indexes the site and checking the paths it requests against the robots.txt file.
Having disallowed every instance where there might be dup content, and feeling smug about it, I got ....
Disallow: /xmb/member.php?
Disallow: /xmb/memcp.php?
Disallow: /xmb/chat/
Disallow: /xmb/cp2.php?
Disallow: /xmb/xmb/chat/
Now, however, I am finding Google getting creative with its bot and indexing links with a double / in them.
These // don't even exist in my board structure at that level, yet they let the bot index the dup pages that the rules above had blocked!
The bot seems to just add an extra / when it feels like it.
goodroi msg:3118938 9:36 pm on Oct 12, 2006 (gmt 0)
They probably found a link that had a typo (extra slash) in it. For those double slash pages that Google has found it would probably be a good idea if you did not return a 200 status code.
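One way to avoid returning a 200 for those pages is to answer any request whose path contains repeated slashes with a 301 redirect to the collapsed path. The following is only a rough Python sketch of that decision, not tied to any particular server; the function name `canonical_redirect` and the choice of 301 are assumptions made for illustration.

```python
import re


def canonical_redirect(request_uri):
    """Decide how to answer a request (hypothetical helper).

    Returns (301, new_uri) when the path part contains repeated slashes,
    otherwise (200, None). The query string is left untouched so that
    slashes inside parameter values are not rewritten.
    """
    path, sep, query = request_uri.partition("?")
    collapsed = re.sub(r"/{2,}", "/", path)  # "//" or "///" becomes "/"
    if collapsed == path:
        return 200, None
    return 301, collapsed + sep + query


if __name__ == "__main__":
    print(canonical_redirect("/xmb//member.php?action=reg"))
    print(canonical_redirect("/xmb/chat/"))
```

A permanent redirect tells the bot which URL is the real one, so the double-slash variant should not end up treated as a separate page.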
Also if your problem is only with Google, then you can use their wildcard option in your robots.txt. That might make your robots.txt simpler.
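To see what a wildcard rule would catch before relying on it, the matching can be approximated in a few lines of Python. This sketch assumes the pattern rules Google documents for robots.txt (`*` matches any run of characters, a trailing `$` anchors the end of the URL, and everything else is a prefix match). The rule `/*//` and the helper names are hypothetical and used only for illustration.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any sequence of characters and a
    trailing '$' anchors the end; otherwise the pattern is a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path, patterns):
    """True if any Disallow pattern matches from the start of the path."""
    return any(robots_pattern_to_regex(p).match(path) for p in patterns)


if __name__ == "__main__":
    rules = ["/xmb/member.php?", "/*//"]  # second rule is hypothetical
    print(is_disallowed("/xmb//member.php?action=reg", rules))
    print(is_disallowed("/xmb/index.php", rules))
```

Under those assumptions a single wildcard line would cover double-slash URLs that the plain prefix rules miss, but it is worth checking the result in Google's own robots.txt testing tool, since other crawlers may not honour wildcards.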
abates msg:3119063 11:33 pm on Oct 12, 2006 (gmt 0)
I've been getting that on my site as well. I've got several redirects using mod_rewrite, so I'm thinking it's possible that the rewrite might be causing it. I haven't seen any other spiders doing it, just Googlebot.
I do note that none of the // URLs which Googlebot has been fetching have turned up in results, and none appear if I do a site search.
jdMorgan msg:3119111 12:54 am on Oct 13, 2006 (gmt 0)
If you're on Apache, and have permission to use mod_rewrite in .htaccess, message #3115787 in this thread [webmasterworld.com] contains some code to cure this and some other common "bad-URL" problems.