Forum Moderators: Robert Charlton & goodroi


Remove a robots.txt disallow - how long until indexed?

         

wernizh

2:25 pm on Sep 12, 2007 (gmt 0)

10+ Year Member



I have a site: www.example.com
It is listed in the SERPs with about 200 pages.

Then I added a new folder, www.example.com/xyz
During a test period of 4 to 5 days I blocked this folder for robots with a Disallow: /xyz/ line in robots.txt

After the test period I removed this line from robots.txt.

The very next day I saw the Google crawler fetching the various /xyz pages. But they never showed up in the SERPs for site:www.example.com/xyz
Now, after a week, it shows only 1 page in /xyz

I do link from www.example.com to different articles in /xyz.

Do you have experience with similar cases? When will it recover to show more of /xyz?

What is the best way to initially block content from spiders? robots.txt does not seem to be the best one, as the content does not get into the SERPs after the block is removed.
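For anyone wanting to double-check what a given robots.txt actually blocks before and after such a change, here is a minimal sketch using Python's standard urllib.robotparser. The "Googlebot" agent name, the article.html file name and the two robots.txt bodies are assumptions made up for illustration, and Python's parser is not Google's, so treat the result as a sanity check only.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines; no network access.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical robots.txt during the test period: the folder is blocked.
BLOCKED = """User-agent: *
Disallow: /xyz/
"""

# Hypothetical robots.txt after the test period: nothing is blocked.
UNBLOCKED = """User-agent: *
Disallow:
"""

if __name__ == "__main__":
    url = "http://www.example.com/xyz/article.html"  # assumed page name
    print("during test:", is_allowed(BLOCKED, url))
    print("after test: ", is_allowed(UNBLOCKED, url))
```

This only tells you whether the rule matches the URL. It says nothing about how quickly a search engine re-reads robots.txt or when the pages appear in the index.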

liborson

6:06 pm on Sep 12, 2007 (gmt 0)

10+ Year Member



Get average (PR5+) or strong (PR7+) back-links from older sites pointing to the /xyz pages and you will be in the SERPs within a couple of days (depending on the competitiveness of the /xyz pages).

My observation is that for non-competitive, non-commercial phrases (let's say around 1.5 to 2 million total results), a couple of back-links from long-established PR6+ websites will get my targeted pages into the top 5 of the SERPs, usually within 3 days. A PR8+ back-link did it within 10 hours...

g1smd

12:27 am on Sep 13, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I accidentally took off the robots.txt file from an established site and many of the previously blocked URLs appeared in the SERPs within days.

It took a few weeks for them to be deindexed once the file was put back up.

Vimes

11:50 am on Sep 13, 2007 (gmt 0)

10+ Year Member



Similar experience to g1smd: I removed a directory disallow by mistake. It took Google a week to rank 1000+ pages, and much longer to remove them.

Vimes.

WiseWebDude

4:51 pm on Sep 13, 2007 (gmt 0)

10+ Year Member



I did this a few times and actually within a day the URLs were right back in (I guess when Google compares its daily pull of your robots.txt file to your files it will release them back to play). They STILL crawl the pages in the robots.txt file, they just don't let them out to play is all...so they are still there and updated when the webmaster decides he wants them in SERPs.

g1smd

7:21 pm on Sep 13, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>> They STILL crawl the pages in the robots.txt file, they just don't let them out to play is all... <<

Are you sure about that? The robots.txt exclusion is there to say "do not crawl this page".

When you use a robots noindex meta tag, those pages are crawled, and then do not appear in the SERPs.
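To make the noindex alternative concrete, here is a small sketch that checks whether a page carries a robots noindex meta tag, using Python's standard html.parser. The sample page and the helper names (RobotsMetaParser, has_noindex) are hypothetical, and this only inspects the HTML; it does not predict how any particular search engine will act on it.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # The content attribute is a comma-separated list, e.g. "noindex, follow".
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical test page that should be crawlable but kept out of the index.
PAGE = (
    '<html><head><meta name="robots" content="noindex, follow"></head>'
    "<body>draft article</body></html>"
)

if __name__ == "__main__":
    print(has_noindex(PAGE))
```

Note that a crawler can only see this tag if it is allowed to fetch the page, so the page must not also be disallowed in robots.txt.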

WiseWebDude

4:15 pm on Sep 14, 2007 (gmt 0)

10+ Year Member



g1smd, I'm not 100% sure, but we tested this a while ago and watched Googlebot gobbling up pages that had a robots.txt disallow on them. Then we took off the disallow, and the next day they were in the index, pretty as you please. I think they DO crawl those pages; they just don't show them in the results. I was wondering the same thing, so we made some new pages to check, and sure enough we were right.

[edited by: WiseWebDude at 4:16 pm (utc) on Sep. 14, 2007]

g1smd

5:50 pm on Sep 14, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



So you are saying that Google is crawling pages that you have disallowed in robots.txt?

That shouldn't be happening.

WiseWebDude

9:43 pm on Sep 14, 2007 (gmt 0)

10+ Year Member



I am not 100% sure. I thought that was the case, like you said, but I wondered after that test. We checked and Googlebot WAS there on the file that was disallowed. Perhaps it found the file, compared it to the robots.txt file and left, but it was funny how fast the info popped into the index after the disallow was removed. I am not sure exactly how they do it though. Was interesting though.
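A check like this can be repeated from the server logs. Below is a rough sketch that lists requests under a disallowed folder whose user agent claims to be Googlebot. It assumes the common "combined" access log format, and the sample lines (IP addresses, timestamps, sizes) are invented for illustration, not taken from a real log. A user-agent match alone does not prove the request really came from Google.

```python
import re

# Assumed log layout: Apache/Nginx "combined" format.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines, prefix="/xyz/"):
    """Return (time, path, status) for requests under `prefix` by a Googlebot user agent."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        is_googlebot = "googlebot" in match.group("agent").lower()
        if is_googlebot and match.group("path").startswith(prefix):
            hits.append((match.group("time"), match.group("path"), match.group("status")))
    return hits


GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Made-up sample lines, for illustration only.
SAMPLE = [
    f'192.0.2.1 - - [14/Sep/2007:16:15:00 +0000] "GET /xyz/article.html HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'192.0.2.1 - - [14/Sep/2007:16:15:02 +0000] "GET /robots.txt HTTP/1.1" 200 64 "-" "{GOOGLEBOT_UA}"',
    '192.0.2.10 - - [14/Sep/2007:16:16:00 +0000] "GET /xyz/article.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (X11; Linux)"',
]

if __name__ == "__main__":
    for hit in googlebot_hits(SAMPLE):
        print(hit)
```

Comparing the timestamps of such hits with the time the disallow was added or removed would show whether the fetches happened while the block was actually in place.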

g1smd

9:47 pm on Sep 14, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



This thread might help with one or two ideas: [webmasterworld.com...]

Sorry that thread is a bit of a long read.