Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
Robots.txt
Does Google honour it....
Rusky




msg:1527155
 2:43 pm on Jul 3, 2001 (gmt 0)

Hi All

I read in the past that Google did not play the game and ignored the robots.txt instructions. Is this still the case, or has it started behaving?

 

starec




msg:1527156
 3:21 pm on Jul 3, 2001 (gmt 0)

Yes, Googlebot does understand and follow the instructions in robots.txt. I don't know anything about its past behavior regarding robots.txt.
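
For anyone who wants to check what a given robots.txt actually tells Googlebot, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content and the paths are made up for illustration, and `is_allowed` is a hypothetical helper name. It only shows how the rules are read by a compliant parser, not how Google's own crawler behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are illustrative only.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /tmp/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, path in [
        ("Googlebot", "/private/page.html"),
        ("Googlebot", "/public/page.html"),
        ("OtherBot", "/tmp/scratch.html"),
    ]:
        print(agent, path, is_allowed(ROBOTS_TXT, agent, path))
```

A well-behaved spider would run a check like this before each request; whether a particular spider really does is a separate question.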

Brett_Tabke




msg:1527157
 3:24 pm on Jul 3, 2001 (gmt 0)

They've gotten better this year. After you put a robots ban on something, expect 90-120 days for it to be removed from the Google system. They don't remove it right away. Nor will it stop them from spidering the pages banned by the robots.

Rusky




msg:1527158
 3:28 pm on Jul 3, 2001 (gmt 0)

Thanks.

What about the first time that you expose a site to its spidering, Brett? I know that you have had some problems in the past with sites that are already indexed.

optimizing123




msg:1527159
 3:02 am on Jul 4, 2001 (gmt 0)

It slipped up on my site about 2 months ago even though the robots.txt was in place and obeyed by other SEs. So I think it is still unreliable.

Brett_Tabke




msg:1527160
 8:35 am on Jul 4, 2001 (gmt 0)

That way doesn't appear too bad, Rusky. It was in the next cycle.

Rusky




msg:1527161
 8:42 am on Jul 4, 2001 (gmt 0)

OK, thanks for the info everybody

Son_House




msg:1527162
 12:57 am on Jul 5, 2001 (gmt 0)

> Nor will it stop them from spidering the pages banned by the robots.

Brett, any idea why they still spider banned pages? I would think it would be a waste of time and bandwidth for them. I recently added a number of pages to the banned list and was hoping they would not spider them anymore. Well, as long as they don't index them, that's what counts.

Brett_Tabke




msg:1527163
 1:32 am on Jul 5, 2001 (gmt 0)

Data mining.

2_much




msg:1527164
 2:25 am on Jul 5, 2001 (gmt 0)

I've never seen this happen. I thought banned sites stop getting spidered. Any site that we have that has been spidered is either in the index or added in the next update.

I had assumed that not getting spidered is the only indication that a site is banned.

Son_House, how do you know that the pages are banned?

The other issue I've seen is that pages that have no inbound links aren't listed in the directory, but are still spidered. As soon as they get inbound links, then they're added to the database.

Brett_Tabke




msg:1527165
 3:15 am on Jul 5, 2001 (gmt 0)

We meant pages excluded by robots.txt, 2_much (not banned really, just blocked from indexing).

On another note, it is interesting to block a nonexistent directory with a robots.txt and watch some spiders try to spider it. You can really spot who isn't playing fair that way.
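
A rough sketch of how one might watch for that in the access logs, assuming a combined-format log and a trap directory that is listed under Disallow in robots.txt but never linked from anywhere. The path `/trap-dir/`, the regex, and the function name `find_trap_hits` are all assumptions for illustration. Any spider that requests the trap path most likely learned of it from robots.txt itself.

```python
import re
from collections import Counter

# Assumed trap path: disallowed in robots.txt and never linked from any page.
TRAP_PREFIX = "/trap-dir/"

# Pulls the request path and user agent out of a combined-log-format line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def find_trap_hits(log_lines, trap_prefix=TRAP_PREFIX):
    """Count requests under trap_prefix, grouped by user agent string."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and match.group("path").startswith(trap_prefix):
            hits[match.group("agent")] += 1
    return hits


if __name__ == "__main__":
    # Made-up log lines; 192.0.2.x is a documentation-only address range.
    sample = [
        '192.0.2.1 - - [05/Jul/2001:03:15:00 +0000] "GET /trap-dir/ HTTP/1.0" 404 - "-" "BadBot/1.0"',
        '192.0.2.2 - - [05/Jul/2001:03:16:00 +0000] "GET /index.html HTTP/1.0" 200 512 "-" "GoodBot/2.0"',
    ]
    for agent, count in find_trap_hits(sample).items():
        print(agent, count)
```

User agent strings can be faked, so grouping by IP address as well would probably give a more reliable picture.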

Son_House




msg:1527166
 5:58 am on Jul 5, 2001 (gmt 0)

Thanks for the info Brett.

2_much, like Brett said, we were talking about pages added to the robots.txt that we don't want added to the index. I'm sorry I was not clearer on that. I need some rest :)


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved