Forum Moderators: Robert Charlton & goodroi

Spidering fall off since Sunday Oct 26

         

cj94111

5:18 pm on Oct 28, 2008 (gmt 0)

10+ Year Member



Hi-

We have noticed a very large fall-off in Google's spidering of our sites since Sunday. This is happening across two sites, each running on its own server farm. We are hosted at RackSpace.

We normally get a huge amount of spidering on a daily basis, in the range of 500K to 800K pages spidered daily. It was running ~700K and then dropped down to <20K over the last couple days. Our page rendering times are well within our normal range. We have made no changes to our load balancer.

Is anyone else seeing a similar pattern?

Thanks!

Greg

gouri

8:22 pm on Oct 28, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Have you noticed more time between cache dates?

cj94111

8:34 pm on Oct 28, 2008 (gmt 0)

10+ Year Member



Cache dates of what?

gouri

8:36 pm on Oct 28, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



of the pages of your site that are in the index

gouri

8:38 pm on Oct 28, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Maybe this is related to the end-of-the-month Google index update.

cj94111

8:44 pm on Oct 28, 2008 (gmt 0)

10+ Year Member



I have over 9 million pages in the index so there is no practical way that I know of to compare cache dates. The front door of the site was cached today.

Receptional Andy

8:54 pm on Oct 28, 2008 (gmt 0)



there is no practical way that I know of to compare cache dates

The number of pages is not really significant - you can still get a sample worth monitoring, and even a statistically valid sample, from a relatively small number of cache dates.

You can check a pattern of pages like home >> category >> subcategory >> page and use that as the basis to figure out the caching cycle your site has.

To my mind, there's a process involved in figuring this out:

- Check that the spidering data is valid (how are you measuring - can we rely on this data?)
- If the data is valid, determine the affected pages (should be part of the measuring process, or at least the data should be collected)
- See if there has been any other impact on the affected pages

As you work through such a process, it often becomes clear where (if at all) any issue worth addressing lies.
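As a rough illustration of the sampling idea above, here is a minimal Python sketch that groups URLs by path depth (home, category, subcategory, page) and draws a small random sample from each level to monitor. The function names, the `per_level` parameter and the use of path depth as a stand-in for site hierarchy are all assumptions made for the example, not anything Google or this thread prescribes.

```python
import random
from collections import defaultdict
from urllib.parse import urlparse


def depth(url):
    """Number of non-empty path segments: 0 for the home page, 1 for /category/, etc."""
    path = urlparse(url).path
    return len([segment for segment in path.split("/") if segment])


def sample_by_depth(urls, per_level=50, seed=0):
    """Return {depth: [sampled urls]} with at most per_level URLs per depth."""
    rng = random.Random(seed)  # fixed seed so the same sample is re-checked each time
    by_depth = defaultdict(list)
    for url in urls:
        by_depth[depth(url)].append(url)
    sample = {}
    for level, group in sorted(by_depth.items()):
        sample[level] = rng.sample(group, min(per_level, len(group)))
    return sample
```

Checking the cache date of the same few hundred sampled URLs on a schedule would then give a picture of the caching cycle without touching all the pages in the index.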

eeek

10:59 pm on Oct 28, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



We have noticed a very large fall-off in Google's spidering of our sites since Sunday.

Glad it's not just me.

cj94111

11:24 pm on Oct 28, 2008 (gmt 0)

10+ Year Member



Andy-

1) Our spidering data is captured from our server logs and then post-processed in order to separate out the various bots. We then cross-correlate our figures with what Google reports in the WMT console to ensure that our data is within the same range as what Google reports (it is). We have been doing this now for about 5 years.

2) Since we do not have a process that goes out and looks at a sample set of pages to see their cache timestamp and to measure how often those are updated, I don't know how this would be useful for the current situation. It makes sense to set this up to monitor results going forward.
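For readers who want to reproduce the kind of measurement described in point 1, the following is a minimal sketch only. It assumes Apache-style "combined" access logs in which the user agent is the last quoted field, and it identifies Googlebot purely by user-agent string, which can be spoofed. The regular expression and the function name are illustrative assumptions, not the poster's actual pipeline.

```python
import re
from collections import Counter

# Assumed combined log format: ... [26/Oct/2008:10:00:00 +0000] "GET / HTTP/1.1" 200 123 "referer" "user agent"
LOG_RE = re.compile(r'\[(?P<day>\d{2}/\w{3}/\d{4}):[^\]]*\].*"(?P<agent>[^"]*)"\s*$')


def googlebot_hits_per_day(lines):
    """Count requests per day whose user agent mentions Googlebot."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.search(line)
        if match and "googlebot" in match.group("agent").lower():
            counts[match.group("day")] += 1
    return counts
```

The per-day totals from a function like this are what would be compared against the crawl figures shown in the webmaster console.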

Thanks

Greg

dstiles

11:40 pm on Oct 28, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Across a few dozen small sites on one server (dates inclusive, average per day in parentheses)...

17th-19th 7957 (2652)
20th-22nd 5110 (1703)
23rd-25th 9675 (3225)
26th-28th 5824 (1941)

1st-28th (total to date for month) 45818 (1636)

So I'd say slightly higher than average but could do better?
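The per-day averages above can be rechecked with a few lines of Python. The totals and window lengths are copied from the figures in this post; the variable and function names are just illustrative.

```python
# Figures copied from the post: (label, total fetches, days in the window).
windows = [
    ("17th-19th", 7957, 3),
    ("20th-22nd", 5110, 3),
    ("23rd-25th", 9675, 3),
    ("26th-28th", 5824, 3),
]
month_total, month_days = 45818, 28


def daily_average(total, days):
    """Average fetches per day over a window."""
    return total / days


month_avg = daily_average(month_total, month_days)
ratios = {}
for label, total, days in windows:
    avg = daily_average(total, days)
    ratios[label] = avg / month_avg  # above 1.0 means busier than the month so far
    print(f"{label}: {avg:.0f}/day, {ratios[label]:.2f}x the monthly average")
```

On these numbers the 26th-28th window works out to roughly 1.2 times the month-to-date daily average, which matches the "slightly higher than average" reading.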

g1smd

1:58 am on Oct 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I see spidering slacken off for 2 to 5 days every 4 to 6 weeks, and have done for years.

Receptional Andy

2:16 am on Oct 29, 2008 (gmt 0)



Greg: 5 years of spidering statistics is great, I wish I had 5 hours of statistics for most of the sites I work with :)

The way I interpret your question is that Google is spidering less than in the past few months, and you want to know whether this is something to be concerned about.

With that much data, you should be able to see whether this change falls within your usual standard deviation, and so is not something to be unduly concerned about - the sort of pattern g1smd mentions.

If you can connect it with some kind of quality statistic (ranging from cache date all the way up to conversions) you can judge if it has had any impact worth responding to.

Remember that on a busy forum such as this one, and on an index of billions of pages, lots of people will be experiencing less frequent spidering at the same time as you, but that doesn't necessarily mean that the causes are the same.

eeek

3:58 am on Oct 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Just looked at my logs. Spidering has resumed.

cj94111

3:54 pm on Oct 29, 2008 (gmt 0)

10+ Year Member



eeek-

Our spidering started picking up steam again late yesterday, so it looks like things are on the mend.

Andy-

This was WAY out of the standard deviation. Our normal range is 600K to 900K pages spidered a day. On some very rare occasions (like 2x per year) that might drop to 200K, but what we saw on Sunday and Monday was 20K. So we are talking several standard deviations off.
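To put a number on "several standard deviations off", one simple approach is a z-score against the historical daily counts. The sketch below uses Python's standard `statistics` module. The `history` values are hypothetical placeholders chosen to sit inside the 600K to 900K range mentioned above, not real log data.

```python
import statistics


def z_score(history, today):
    """How many sample standard deviations today's count is from the historical mean."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    return (today - mean) / stdev


# Hypothetical daily page counts for illustration only.
history = [700_000, 650_000, 800_000, 750_000, 600_000, 900_000, 720_000]
z = z_score(history, 20_000)
print(f"z = {z:.1f}")
```

A strongly negative z-score would support the view that this is outside normal variation, though daily crawl counts are unlikely to be normally distributed, so the figure is a rough guide rather than a formal test.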

However, as mentioned above and consistent with eeek's experience, things are turning back up again.

Thanks!

Greg

infnx

4:03 pm on Oct 29, 2008 (gmt 0)

10+ Year Member



Same issue happened to me on one of my sites, where the spidering fell from 30K down to 2K. Today it's clocking at 9K so far. Looks like it's back on again.

kms11

4:32 pm on Oct 29, 2008 (gmt 0)

10+ Year Member



I have seen exactly the same since 10-26 for two sites (from 900k to 200k and from 350k to 40k). I remember that we discussed a similar behavior a few months ago. Verifying with others definitely helps in these scenarios, so in this case it looks like it is current Google behavior and not site behavior, and from my experience it should be back in a few days.

regards,
KMS11

gouri

12:13 pm on Oct 30, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Has anyone not had their homepage cache date updated in a while? In some cases, 10 days.