
Google News Archive Forum

Lag between crawl and index appearance in Google
what's the usual time period nowadays

 1:21 am on May 21, 2004 (gmt 0)


Pls be patient with the newbie :-)

Google has sent both its MediaPartners and the regular Googlebot to a new site of mine, and they have extensively crawled the site. Furthermore, I can see how every day they are coming back to the site, spidering away.

The site went live three weeks ago. The first intense Googlebot session came about 1 day after go-live. But although it is now abundantly clear that Google's bots know about the domain and are regular visitors, today when you do a Google search on 'site: aabbccdd.com', Google says: "Your search - site:aabbccdd.com - did not match any documents".

Furthermore, when you do a Google search for the domain, simply as: "aabbccdd.com" Google says:

" Sorry, no information is available for the URL aabbccdd.com"

And no results appear from the site in any search results when you do even the most detailed keyword search.

Given the impressive speed with which the Googlebots first found the site and came around and began indexing it three weeks ago, should I expect to see the site in Google yet?

Or is it too early, and should I wait a bit longer before assuming something is amiss?

I don't feel as if there is anything on the site that should penalise it or get it blocked. It is a G-rated vanilla standard content rich site with no weird or exotic SEO attempted, and with a valid robots.txt.

The only 'SEO' attempted, aside from good titling, metadata and the rest of it, is just 3 external links to it off 2 other sites I run, from a different host. I have also got all the gTLDs for the same site, and have them resolving to the one domain too, but that is only aabbccdd.com/net/org/info. So nothing extreme here either.

Should I just be a bit more patient? I am aware that in the old days you had to wait forever to get into people like AltaVista's index, but I am assuming nowadays getting into Google's index is a lot quicker.



 1:26 am on Jun 21, 2004 (gmt 0)


Our new site is now listed in relevant SERPs.

Launched June 11, spidered on June 16 and 17, indexed today.

Now I question the "sandbox" reports...


 7:48 am on Jun 21, 2004 (gmt 0)

Fearless, it's important to say what you mean by 'indexed'. My site was crawled in early June, continues to be crawled daily, and is being 'indexed' now. By 'indexed' I mean that a site:www.domain.com search returns a number of pages that increases almost daily. However, I receive no Google referrals (since the pages 301'd to it were dropped appropriately from the G index). When I search for "unique stringz" I will show up behind anyone with unique stringz in the link to me.
I understand that I haven't waited anywhere near long enough for this to be unusual yet. The redirected pages from the previous domain were only dropped Tuesday and the advertising campaign for the site is still in the development stage, so this was expected and is not hurting us. But I am interested in whether I am just waiting for Google to realize that the content has just been moved rather than duplicated, or whether I am in a sandbox that may last much longer. I have not emailed G directly but have sent in the 'dissatisfied with results' form outlining my concerns.


 5:35 am on Jun 22, 2004 (gmt 0)

Since we seem to have conflicting reports about whether a sandbox exists for new sites or pages, perhaps we should consider that it may be conditional. If so, upon what is it dependent?
I would like to toss an idea up on the wall and see if it sticks.


A PR 7 link is likely to bring visitors, as are most organically gained links. Google has enough tracking between the toolbar and AdWords/AdSense that they can determine if a site is getting traffic from other sources, which could be indicative of a quality site. Think about a site that starts out with a link campaign consisting of mostly reciprocated links from buried link/resource pages. The traffic from these pages is going to be minimal, so you'll have many links/little traffic. Or you have g1smd's example with organically obtained PR 6 and PR 7 links. Here you will have a high traffic to inbound link ratio. Google could use this info to determine (sort of) if a site has a certain level of quality and staying power, or if a site is simply following a standard format for success with a mass emailed reciprocated links program. Additionally, a site using AdWords could have the tracking code installed, which could be another determination of a useful resource. Folks come in through AdSense, their IP is attached. They visit pages and end up on the conversion/objective page. Therefore, the site must be useful for the keywords used so it should not be sandboxed.


 6:19 am on Jun 22, 2004 (gmt 0)


I see where you are coming from, but that all seems very speculative. The precise mechanisms of Google are very mysterious, and their effects also seem to vary.

For example, I've had lots of organic links appear into the site, as well as some PR5 and 6 sites link into it from related content categories. It is also a site that is not especially SEO'd, and which Google and the other SEs have extensively spidered. Furthermore, occasionally I will get a referral from Google (from for example one of the overseas Google domains) where it is clear that the results must be coming from an indexed page, but when I search using the exact same key words that brought the referrer in, I get no results from my site.

Whether that means that the Google system is somehow super-sophisticated and intelligent, or is just another IT system that is plain flaky and erratic, I don't know. The effect seems to be the same.

All I know is that I have a good, straight-forward useful content site that my casual visitors and visitors from the industry sector it highlights really like. I've built it with no fancy tricks and no desire to gain huge mountains of traffic, as I recognise it is a niche site.

But so far (7 weeks after it went live) it seems to be in the sandbox, which I presume means a site that may technically be visible in the G index, but from which no results appear in the SERPs.

So I can either

(a) be peeved that Google is deliberately (but mistakenly) delaying the visibility of my site in its index (if the sandbox is a real administrative policy at Google)

or (b) just resign myself to waiting a bit longer for the results to start appearing (which I'd be reasonably happy with if the delay was just a technical thing to do with stuff like file-system propagation delays within Google's massive data center and index, or something).

Whatever is the cause, the main thing is, I guess, that lots of web surfers using Google at the moment are missing out on opportunities to find quality, relevant content from my site.


 7:27 am on Jun 22, 2004 (gmt 0)

I see where you are coming from, but that all seems very speculative.
Absolutely! I am definitely speculating.

but when I search using the exact same key words that brought the referrer in, I get no results from my site.
That sounds encouraging. Have you checked Google's different datacenters by IP? You may be bouncing around in the index on some, but not on others. I don't have a current list of IPs though.

All I know is that I have a good, straight-forward useful content site that my casual visitors and visitors from the industry sector it highlights really like.
Then you should sit back and wait, everything will be ok. Ok, that's bad advice. Never sit back and wait. Promote or Die! But still, you should be ok.

Whatever is the cause, the main thing is, I guess, that lots of web surfers using Google at the moment are missing out on opportunities to find quality, relevant content from my site.
Presumably, when your site is adequately and appropriately indexed, you will be much busier. Take this time to add more quality content to your site so that when it comes time to make your splash, it is a monstrous one that completely blindsides your competitors.


 6:51 pm on Jun 24, 2004 (gmt 0)

Another update.

This whole thread is becoming a bit of a soap opera :-) Sorry folks.

Today I checked for my site's presence in the Google index (site: www.mysite.com), and now it no longer exists there, at all. Zip, zero, nada. Gone. About a week ago I noted here that it was now fully in the index, but now it ain't there at all.

Furthermore, when I check to see what sites are linking to me now (link:http://www.mysite.com/) I now get ZERO links in. Only the other day, there were a pile of sites linking to me.

But yet every day the Googlebot still comes, sucking up site bandwidth but not bloody doing anything for me!

I am now more than a bit confused.

About 8 weeks after going live, I am back exactly where I started from. A site that doesn't exist, according to Google.

I suspect Google must hate my site, or the CMS I am using, or maybe Google is just a complete idiot.

I am inclining now to believe the latter, esp when I see some of the absolute garbage sites that sit on #1 in the SERPs in some of the keyword areas my site relates to.


If this continues, I may as well just put an entry blocking the Googlebot in my robots.txt file, for all the good Google is doing me.
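
For anyone weighing up that last option, here is a rough Python sketch of what such a robots.txt entry would and would not block. The rules and the www.mysite.com URL are made-up placeholders, not the poster's actual file, and the result only shows how Python's standard urllib.robotparser reads the rules; real crawlers may interpret them differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that shuts out Googlebot and leaves everyone else alone.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Placeholder URL; substitute a real page from the site being checked.
    page = "http://www.mysite.com/page.html"
    for agent in ("Googlebot", "Mediapartners-Google", "Slurp"):
        print(agent, allowed(ROBOTS_TXT, agent, page))
```

Note that under this parser a rule naming only Googlebot does not apply to the Mediapartners-Google agent, so the AdSense crawler would need its own entry if the aim were to keep it out as well.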


 8:08 pm on Jun 24, 2004 (gmt 0)

moocow -- misery loves company :)

Some of us have suffered the same fate (yesterday's PR update was the coup-de-grace), which to me seems at best a harsh penalty for a minor crime, and at worst a complete foul-up on Google's part.

But I can't believe that the cost of Google's spiders outweighs the potential benefit, can it? Recall: getting traffic from Google has a price that's terribly difficult to beat: free. Well, ok, so it's not really free because you pay bandwidth, some incremental server load, etc. But trust me: when google works, it works well.


 9:14 pm on Jun 24, 2004 (gmt 0)

Thanks sublime1 for your words of support.

I guess more than anything else I am surprised by what a strange beast Google seems to be. Out of the index, in the index, out of the index, etc, all in the space of a month or two.

In trying to figure out if there was anything I might have done to trigger this latest Google weirdness, a few site-wide tweaks I made recently that affected every page on site are the only thing that I could think of.

They were only actually *removals* of things: eg I removed a common and redundant entry from all the page titles to make them appear more sensible in search results, and removed a common block of useless DC metadata from all pages.

Maybe that caused Google to pull the whole site from its index and now it is attempting to re-digest it.

But that doesn't explain why the display of *back-links* into my site should also now disappear. All the URLs of the site were unchanged.


 9:57 pm on Jun 24, 2004 (gmt 0)

When I do a site command for my web site, I HAVE TO include the www for whatever reason. Omitting the www yields no results. In the initial example you did not include this. Is it possible that you did not try site:www.whatever.com? Just checking...


 10:13 pm on Jun 24, 2004 (gmt 0)

hi jo1ene.

Actually, now I do have some backlinks appearing, whether I do it as site: www.mysite.com, or site: mysite.com.

Results 1 - 6 of about 110 for site: www.mysite.com
Results 1 - 7 of about 214 for site: mysite.com

They weren't there half an hour ago, mind you.


 4:14 pm on Jun 28, 2004 (gmt 0)

Moocow, in my experience you need no less than three months to get consistent and stable results in Google searches. In that period you will fluctuate, and you can also disappear from time to time. I know it is frustrating, but there's nothing to be worried about.

And good content is still the best way to ensure visitors from Google. Take care of that, of the page titles and of the internal links' anchor text; try to show clean URLs (no more than one or two parameters if your site is dynamic, better if none), and organize the site to make it easier for the bot to reach all the pages. If you do that, and you have quality content, you do not need to take extra care over external links; they come on their own.

Be patient. I know it's easy to say and hard to do, but you have no other choice!
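
The "clean URLs" rule of thumb above is easy to check mechanically. A minimal sketch, assuming the limit of two query-string parameters mentioned in the post; the function names and the example URLs are made up for illustration.

```python
from urllib.parse import parse_qsl, urlsplit


def count_params(url: str) -> int:
    """Count the query-string parameters in a URL (blank values included)."""
    return len(parse_qsl(urlsplit(url).query, keep_blank_values=True))


def is_clean(url: str, max_params: int = 2) -> bool:
    """True if the URL carries no more than `max_params` parameters."""
    return count_params(url) <= max_params


if __name__ == "__main__":
    # Hypothetical URLs for a dynamic site.
    for url in (
        "http://www.mysite.com/widgets/blue.html",
        "http://www.mysite.com/article.php?id=7",
        "http://www.mysite.com/index.php?cat=3&id=7&sid=abc&sort=d",
    ):
        print(count_params(url), is_clean(url), url)
```

Running it over a list of the site's internal links would flag the pages whose URLs carry more parameters than the suggested limit.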


 4:44 pm on Jun 28, 2004 (gmt 0)

Thanks Patricio for those comments. I guess I have no option but to sit it out.

The site is still 'dark' as far as Google is concerned, alas. Now it's 8 weeks....

I still add stuff every other day, of course. I built the site for a purpose, not just for Google :-)

What I am avoiding though is large scale changes that will impact every page on site. I've stabilised on a titling style and a core set of metadata etc, so Google should be able to figure out now that existing pages haven't changed and don't need re-spidering.

I suspect some of my site-wide changes earlier may have forced Google to re-spider the whole site again, and thus re-set the clock on its appearance in the index.

So I will be much more cautious and selective in what I update.


 6:33 pm on Jun 28, 2004 (gmt 0)

Sorry I've been away from the thread for awhile.

We were spidered within a few days of being launched with a handful of good solid links, a site map, robots.txt, main URL "submitted" etc, etc. No paid services, no great SEO, no AdWords.

(Side note- just for the good ole days I submitted to ODP (dmoz.org)- don't know if we'll ever get listed or if I care...)

Within a few more days, our site was completely functional within Google: SERPS, referrals, traffic, backlinks, etc. The spiderbots cruise through every other day or so. Toolbar PageRank stable at "4" (I can live with that.)

I was quite pleased.

One weird thing (I guess) - my "signature" on a public discussion forum (let's call it "dailywid(get).com") used to include an HTML link to the site. All of the threads where I posted were being returned in "links" and SERPs.

I'd say for my field (politics and not-for-profits) the more things change.. the more they stay the same.


 7:16 pm on Jun 28, 2004 (gmt 0)

I suspect some of my site-wide changes earlier may have forced Google to re-spider the whole site again, and thus re-set the clock on its appearance in the index.

I don't think any change could have "forced Google to re-spider" your site.

If it was ready to index your site, it would have done so, no matter how deep your changes were. I don't know why, but Google does need some time to add a new site in a stable and consistent way. Once this is done, site changes and additions are reflected in Google very soon (48 hours maybe, sometimes even less). But for the first three months you can expect, most of the time, to experience the process you are describing.


 1:34 am on Jul 5, 2004 (gmt 0)

I started the thread, so I may as well keep updating it until the site is in the G. index....

Week 9: my new site in question is still not appearing in the Google index.

Furthermore link:http://www.mysite.com still shows ZIP, although site: www.mysite.com shows a few site pages out there with my site mentioned.

Interestingly, the preview of the new MSN search has found the site and loves it. The site is #1 in the SERPs for the main keywords I expect people will be looking for its subject with, and in fact dominates the first two pages of the search results on the new MSN search (in part because the MSN search doesn't cluster results from sites yet).

This is great, not so much because of the prominence of my site, but because the results that show up in Google for the same key words really are garbage sites, and have been for a long time, especially the first 5 or 6 sites, including a clumsily SEO'd garbage site that sticks at #1 under Google. Roll on MSN search!


 4:05 pm on Jul 5, 2004 (gmt 0)


If a site is banned by Google then Googlebot will not visit the site again, right?

If that is right, then if Googlebot visits the site, the site may still be reindexed, right?

And if Googlebot does still visit banned sites, then a new site that is banned will never appear in the Google index, even if Googlebot crawled it a long time ago, right?


 8:57 pm on Jul 5, 2004 (gmt 0)

You are probably right.

However in my case I don't think the site is banned, or I hope it isn't.

The Googlebot visits my site every day. Yesterday alone it sucked up about 50 pages. This behaviour would not be sensible if the site was banned.

My guess is that my site is sitting in the so-called Google 'sand-box', for some reason.
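
Daily Googlebot visits like these can be counted straight from the server logs. A rough sketch, assuming the Apache "combined" log format and a simple match on the user-agent string (which anyone can fake, so treat the numbers as approximate); the function name and the access.log path are hypothetical.

```python
import re
from collections import Counter

# Assumes Apache "combined" log format; adjust the pattern for other servers.
LINE_RE = re.compile(
    r'^\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] "[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits_per_day(lines):
    """Count requests per day whose user-agent string mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and "googlebot" in match.group("agent").lower():
            hits[match.group("day")] += 1
    return hits


if __name__ == "__main__":
    # "access.log" is a placeholder path for the site's own log file.
    with open("access.log", encoding="utf-8", errors="replace") as log:
        for day, count in googlebot_hits_per_day(log).items():
            print(day, count)
```

A steady daily count alongside an empty site: result would at least confirm that the crawling and the index appearance are out of step, which is the situation described in this thread.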


 9:25 pm on Jul 5, 2004 (gmt 0)

I just submitted a 1200+ page site back on June 14th. Prior to that, and going back all the way to early February when I began building the site, I had "noindex, nofollow" in the meta tags, and disallow all in the robots.txt file.

Google has been around only once since then, looked at the index page and left. The index page is in Google, but only if I look for it by name.

Prior to this new sandbox effect, I was consistently able to get new sites fully indexed and high up in the serps for my keywords, sometimes in less than 30 days.

If it takes Google another month or so to get me in the results for my keywords, that's fine. If it takes much longer, I'm in deep trouble.


 9:28 pm on Jul 5, 2004 (gmt 0)

Duh! I forgot to add a question. Does anyone think that having the site blocked from robots for four months has anything to do with Google only coming around once?

I have several hundred more pages to add, but I'm thinking that it might be better to wait until the site is indexed, then add the new content over a period of weeks for the benefit of the freshbot. Any thoughts on that one?
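
One thing that could be ruled out in a case like this is a leftover "noindex, nofollow" tag from the build phase on some of the pages. A minimal sketch using Python's standard html.parser; the class and function names are made up for illustration, and it only inspects HTML text handed to it rather than fetching anything.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma separated, e.g. "noindex, nofollow".
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives present in an HTML page."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(robots_directives(page))
```

Any page for which the returned set still contains "noindex" would be asking search engines to stay away, whatever the robots.txt file now says.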


 10:26 pm on Jul 16, 2004 (gmt 0)

Once again, I started the thread, so I may as well keep updating it until the site is in the G. index.

Site went live May 1, now it is July 17, and still no appearance of the site in the Google index, no link:http://www.mysite.com backlinks, etc.

Yet every day the Googlebot comes around, teasing and taunting me, sucking up the new content I've added, spidering my forums, etc.

Yahoo has progressively been adding more and more pages into its index, MSN, Teoma, Alexa, you name it, they've all started to show the site in SERPs.

But not Google.


 8:29 pm on Aug 21, 2004 (gmt 0)

Moocow, any updates?

My site is still in the sandbox after nine weeks. If I do a site:www.mydomain.com search, only two pages--the index page and an interior page--show up, and there are no descriptions.

I've been getting backlinks, and I believe I've done everything correctly: title, anchor text, site map, h1 tags, meta tags (for Yahoo, not G), keyword density, etc.

My index page has a PR of 1, so I assume the site isn't banned.

What in the heck is going on?


 11:33 pm on Aug 21, 2004 (gmt 0)

Google seems to have stagnated somewhat lately, sounds like lots of people are waiting to get into the index. My new site has been in the index for three months now, over 300 pages crawled and indexed so far, lots of backlinks showing, but still last in the SERPs, even for quoted search for the exact title of the homepage, which only returns 4 results, mine dead last. Just waiting....


 12:32 am on Aug 22, 2004 (gmt 0)

Nope, still not there.

From 1 May to today, still no appearance in the Google index for me.

I have emailed Google twice about it, with no result.

Emailing Google actually has made the problem **worse**.

On EXACTLY the day I emailed Google about the site's non-appearance in the Google index (5 August), the Googlebots stopped coming to visit the site, and have basically stayed away ever since.


Before I emailed Google, the Googlebots were visiting multiple times every day and hundreds of times a week, and had been doing so for several months.

Only in the last few days have the Googlebots come again, but only this time I think because I've started manually submitting new pages I've added to the site.

What's annoying is also the fact that now crappy little blog entries only one or two days old, as well as newspaper stories etc that mention the site are showing up in the Google index, but pages off the site itself still remain invisible to Google searchers.

My site was featured in a major daily newspaper in the last fortnight. People of course rushed to the search engines to find it, and explore the theme the site focusses on. Of course, none of them could find the site in Google.

When they did arrive at it, some asked me why the site wasn't in Google, with the tone of 'you idiot, why isn't your site in Google'. I had no answer, other than to say to people that Google itself is the idiot in question here.

I feel quite strongly annoyed by Google on this point. Not just because there has been a delay in the site showing up in the index, which is fair enough, but because it appears that Google is deliberately blocking the site from appearing in its index.

If this keeps up, I think I'll start an anti-Google campaign, get rid of my AdSense code, and block the Googlebot's IP addresses from accessing the site.

After all, what's the point of Google to me?

The site is effectively a non-profit thing for me, so I have nothing to lose financially, and the media attention such a campaign might attract would probably get me more traffic than I'd forgo by blocking Google.

Especially from some of the local media outlets who have relationships with Google competitors :-)

And Google would get a dent in its reputation too, once people realise that Google's 'Searching 4,285,199,774 web pages' thing is a pile of s**t.

Many average web users seem to think Google is omniscient, and is a reliable guide to what's out there on the web. Clearly that ain't so. More people need to know this, especially given the quite strong, near monopoly Google has on search.

An anti-Google campaign is a very tempting proposition indeed...

More power to Yahoo, I say. They are well on the way to fully indexing the site now. As are Jeeves, MSN preview, Alexa, even a few Chinese search engines, etc etc etc.


 1:06 am on Aug 22, 2004 (gmt 0)
Some more info:

My new site went live around the 10th of july. First crawled by G on the 12th.
On or about the 18th of August it came out of the so called "sandbox" and started showing in some of the SERPs.

Not too bad - about 4 weeks.

The site is an ebook, so it was submitted to download dot com, ebookad dot com and a few others. The SERPs show my ebook in those websites' results rather than the actual homepage. I don't know if that's clear. In other words, if you do a search for "book about widgets", the number one results are from download dot com (my book), not my actual website.

Not that that is terrible, but it gives credence to the theory of "authority sites" dominance.


 8:13 am on Aug 22, 2004 (gmt 0)


That's a bit confusing. Do you mean that pages from your actual site are showing up in the SERPs, or just pages that link to your site? Because if it's the latter, I don't think you're out of the sandbox yet.

In my case, several pages which link to me show up in the SERPs for appropriate search terms, and some of them are even fairly well-placed, but my actual homepage places dead last for every term I can think of to search for it with. Some of my subpages show up with reasonable position, but only for highly specific search terms.

Anyway, just trying to get an idea of how long the sandbox lasts, I don't think G has brought sites out of the sandbox for a while now, but I'd like to see a counterexample.


 8:32 am on Aug 22, 2004 (gmt 0)

I have several sites that are #1 for allinurl, allinanchor and allintext yet fail to make the top 10. I also have a couple of sites that are new and #1 for these items and place around #6 or #7. These particular sites are in less competitive areas (ex. 240,000 results) and have lots of PR5 and a few PR 6 incoming links. None of the other sites in the top 10 even compare to the targeting and content of these 2 sites, both of which have been in the Google index since February. This is a sandbox effect, plain and simple.

Can you rise to the top 10 while being in the sandbox?
Absolutely! But you will need a much stronger attack to get there.

On Yahoo and MSN these sites are #1 and have been there for a couple of months, and I believe they deserve to be there, not just a bunch of SEO.

Truly, if Google does not do an update soon, I think they are going to start losing a few percentage points from the search audience. I'm actually getting better results on MSN sometimes! I never thought I would say that, and I hate saying that, so please Google ... UPDATE!


 12:29 pm on Aug 22, 2004 (gmt 0)

<<<<That's a bit confusing. Do you mean that pages from your actual site are showing up in the SERPs, or just pages that link to your site? Because if it's the latter, I don't think you're out of the sandbox yet. >>>>>>>

I thought that was a little confusing.

My site is definitely out of the sandbox. In the SERPs #1 - #4 are sites that link to my site and my site is #5. Not that I'm crying, but you would think it would be the other way around.


 3:11 pm on Aug 22, 2004 (gmt 0)

My site is definitely out of the sandbox. In the SERPs #1 - #4 are sites that link to my site and my site is #5. Not that I'm crying, but you would think it would be the other way around.

Yes, I am experiencing this too!
People's link pages and review pages are ranking above the actual pages. I've seen this for lots of SERPS, not just my sites.


 5:42 pm on Aug 22, 2004 (gmt 0)


Register another domain name, and start over. Get some inbound links, do not request Google to add you to its list, and wait.


 8:56 pm on Aug 22, 2004 (gmt 0)

Register another domain name, and start over. Get some inbound links, do not request Google to add you to its list, and wait.

That's not really an option. The URL is already becoming quite well known (newspaper articles, quite a few external links, footprint in other search engines etc).

Quite frankly I also couldn't be bothered futzing around like that just to please a perversely moronic search engine algorithm. I haven't designed the site as a search engine trap, it is a Quality Content Site.

I'll wait a bit longer, and badger Google a bit more too.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved