Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
robots.txt case sensitive?
the_nerd




msg:1527181
 8:03 am on Aug 18, 2003 (gmt 0)

I recently learnt what to do with a robots.txt file. (Where? Here, of course :) )

Now I'm wondering whether all spiders read it the same way. Before I found out the hard way that Google sees a "minor" difference between index.cfm, InDeX.CfM and INDEX.CFM, I had used different spellings all over the website. So I put every spelling of "Index.cfm" I had used before, except index.cfm itself, into the robots.txt file. Googlebot understands this well and avoids all the other spellings.
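To illustrate the setup described above, here is a minimal Python sketch that builds such a robots.txt from a list of spellings. The function name `build_robots_txt` and the sample spellings are assumptions for illustration only; the actual file from the post is not shown in the thread.

```python
def build_robots_txt(spellings, canonical, user_agent="*"):
    """Return robots.txt text that disallows every spelling except the canonical one.

    Hypothetical helper: the names and the sample list below are illustrative.
    """
    lines = [f"User-agent: {user_agent}"]
    seen = set()
    for name in spellings:
        # Skip the spelling we want crawled, and skip exact repeats.
        if name == canonical or name in seen:
            continue
        seen.add(name)
        lines.append(f"Disallow: /{name}")
    return "\n".join(lines) + "\n"


robots_txt = build_robots_txt(
    ["index.cfm", "InDeX.CfM", "INDEX.CFM", "Index.cfm"],
    canonical="index.cfm",
)
print(robots_txt)
```

The comparison `name == canonical` is deliberately case sensitive, so only the exact lowercase spelling is left out of the Disallow list.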

But other spiders just seem to read robots.txt, find "INDEX.CFM" disallowed, and go away without fetching index.cfm either.

What would you do? Drop the robots.txt, since Google has probably got the point and stopped crawling the multiple spellings of the same file?

 

Ryan8720




msg:1527182
 3:13 pm on Aug 18, 2003 (gmt 0)

What exactly are you trying to do? Do you want the robots to NOT index index.cfm, or do you want it indexed?

Filenames in URLs are case sensitive as far as spiders are concerned. You should go back and use the same spelling "all over the website".
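One way to track down mixed spellings is to group the site's internal links by their lowercase form. This is only a sketch: `find_case_variants` and the sample link list are hypothetical, and collecting the links from the actual pages is left out.

```python
from collections import defaultdict


def find_case_variants(paths):
    """Group paths that differ only by letter case.

    Returns a dict mapping the lowercase path to the sorted list of
    spellings found, for paths that appear in more than one spelling.
    """
    groups = defaultdict(set)
    for path in paths:
        groups[path.lower()].add(path)
    return {
        key: sorted(spellings)
        for key, spellings in groups.items()
        if len(spellings) > 1
    }


# Hypothetical list of internal links collected from the site.
links = ["/index.cfm", "/INDEX.CFM", "/about.cfm", "/InDeX.CfM", "/about.cfm"]
variants = find_case_variants(links)
print(variants)
```

Any path reported here is a candidate for rewriting to a single spelling throughout the site.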

the_nerd




msg:1527183
 7:48 am on Aug 19, 2003 (gmt 0)

Hi Ryan,

Quote: You should go back and use the same spelling "all over the website".

That's exactly what I do now (but I mixed spellings in my pre-WW life). Unfortunately Google keeps old copies of e.g. "index.CFM", "iNdeX.cFm" etc. That means:
1. people get old Google caches
2. Google might treat these old versions as duplicate (multiple) content.

So I want it to drop all the old copies, by putting them into the robots.txt file. As I said, this works just fine with Google.

But I'm not sure how e.g. AltaVista would react if it finds "iNdEX.CFM" disallowed. Would it read "index.cfm" anyway or just go away?
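As a rough check of how a parser that compares paths case sensitively treats this, the sketch below uses Python's standard `urllib.robotparser`. It shows only how that one parser behaves, not how AltaVista or any other spider does; the robots.txt content and the agent name `ExampleBot` are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: only the old spellings are disallowed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /INDEX.CFM
Disallow: /InDeX.CfM
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="ExampleBot"):
    """Return True if this parser would let the agent fetch the path."""
    return parser.can_fetch(agent, path)


results = {p: allowed(p) for p in ["/index.cfm", "/INDEX.CFM", "/InDeX.CfM"]}
for path, ok in results.items():
    print(path, "allowed" if ok else "disallowed")
```

A parser that matches paths this way keeps fetching the lowercase spelling and skips only the listed variants. A spider that ignores case when matching would skip all of them, which is the behaviour the question is worried about.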
