
Sitemaps, Meta Data, and robots.txt Forum

    
Google ignoring robots.txt
No real surprise, but a real pain
voice220

10+ Year Member



 
Msg#: 242 posted 8:53 am on Jan 19, 2004 (gmt 0)

On several pages on one of our sites, there are links with ever-changing URLs. They all go through a specific link directory (e.g. /go/places). Our robots.txt tells robots to ignore /go/ and everything beneath it. Google, on the other hand, seems to ignore that completely and indexes those ephemeral pages. Any ideas?

 

ukgimp

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 242 posted 8:56 am on Jan 19, 2004 (gmt 0)

Is your robots.txt correct in its syntax? You may be at fault, not Google.

voice220

10+ Year Member



 
Msg#: 242 posted 9:31 am on Jan 19, 2004 (gmt 0)

I thought about that and had another look. This is what it looks like, and there's really not that much to do wrong in those two lines :?

User-agent: *
Disallow: /go/
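
For anyone who wants to double-check how a standards-following parser reads those two lines, here is a minimal sketch using Python's standard-library urllib.robotparser. The helper name is_allowed and the test paths are made up for illustration, and it only shows what the exclusion rules say, not what Google's crawler actually does.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt quoted above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /go/
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if `user_agent` may fetch `path` under `robots_txt`.

    Hypothetical helper for illustration; it parses the rules from a
    string instead of fetching them from a live site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Example paths are assumptions, not URLs from the real site.
    for path in ("/go/places", "/go/", "/index.html"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, "Googlebot", path) else "disallowed"
        print(f"{path}: {verdict}")
```

If this reports /go/places as disallowed, the file itself is saying what it was meant to say.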

ukgimp

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 242 posted 9:36 am on Jan 19, 2004 (gmt 0)

[searchengineworld.com...]

voice220

10+ Year Member



 
Msg#: 242 posted 9:56 am on Jan 19, 2004 (gmt 0)

Thanks, but it validates just fine.

spud01

10+ Year Member



 
Msg#: 242 posted 10:12 am on Jan 19, 2004 (gmt 0)

No errors detected! This Robots.txt validates to the robots exclusion standard!

So is there any truth to Google not obeying the robots.txt file?

This could also explain why our site has lost all relevance on Google, with no top-100 position for any of the relevant keywords we optimized for.

The site is optimized for two different resolutions (800x600 and 1024x768), so there is duplicate content, but the duplicated pages/folders are disallowed in the robots.txt file.

voice220

10+ Year Member



 
Msg#: 242 posted 10:31 am on Jan 19, 2004 (gmt 0)

It would appear so, because our disallowed pages are happily indexed.

DaveAtIFG

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 242 posted 3:37 pm on Jan 19, 2004 (gmt 0)

There's a similar discussion at [webmasterworld.com...]

Welcome to WebmasterWorld voice220! :)

voice220

10+ Year Member



 
Msg#: 242 posted 4:12 pm on Jan 19, 2004 (gmt 0)

Thanks DaveAtIFG, the above link explained a lot, and may just have solved my dilemma. :)
