Googlebot visits, but it takes hours before page is in results

     
2:21 pm on Jun 18, 2008 (gmt 0)

Preferred Member

5+ Year Member Top Contributors Of The Month

joined:Aug 16, 2006
posts: 397
votes: 1


The website in question is a blog. Up until last week, Mr. G indexed a new article within an hour or so of it being posted (and after I pinged Feedburner). This pattern stopped abruptly on Thursday.

What I've noticed now is this:

- let's say a new article was posted 8 hours ago.
- I search for it in all Google datacenters, but it does not appear to be indexed in any of them until one hour ago.
- when it appears in the search results, it says it was indexed 7 hours ago, although for the first six hours it did not appear in any datacenter.

Even when the new article shows in the results, it takes ages for all the datacenters to update.

I checked to see if I am breaking any of G's webmaster guidelines and I DO NOT. The bot comes and downloads the new page but it does not index it. Can anyone explain this situation please? It's driving me insane!

I can't understand why suddenly Mr.G stopped indexing the new articles.

4:49 pm on June 18, 2008 (gmt 0)

Preferred Member

5+ Year Member Top Contributors Of The Month

joined:Aug 16, 2006
posts: 397
votes: 1


It's the third time this year that this has happened and I don't know how it got fixed the first two times.

It's indexing old pages that have a link to the new post, but it does not want to index the new post.

I wish I could post my website URL here to get an opinion on what's wrong.

Please help. Is this some kind of penalty?

I'm beginning to lose my mind over this, literally.

5:06 pm on June 18, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


A penalty that delays indexing for part of day? That's hardly likely, IMO. Penalties drive your urls down the rankings or remove them altogether.

Spidering and indexing are two separate steps - they must be, in a data set as large as Google's. So we just can't think of Google the way we would think of a MySQL, Access or Oracle database, where once a record is added it's immediately findable.

Your situation sounds to me like one of these:

1. An infrastructure change on Google's back end, possibly a temporary re-allocation of resources.

2. A different classification of your blog, so its "freshness" in the search results is now a second tier priority, not top tier.

From your report, your new urls show up in less than a day, even though they may not migrate to all data centers for a while. Many people would envy that situation! And even a close inspection of your website is not likely to add any further insight.

So I'm not sure you've got a problem here. Do your server logs show that googlebot still comes by an hour or so after the Feedburner ping?
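One way to answer that question from the raw logs is a small script like the one below. It is only a sketch: it assumes the server writes the common "combined" access log format, and the path `/new-post/`, the ping time and the sample lines (with documentation-range IP addresses) are all made up for illustration. It also matches on the User-Agent string only, which anyone can fake.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed format: IP - - [time] "METHOD path PROTO" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_hits(lines, path, ping_time, window=timedelta(hours=1)):
    """Return (delay, status) for each Googlebot-labelled request to `path`
    that arrived within `window` after `ping_time`."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or m.group("path") != path:
            continue
        if "Googlebot" not in m.group("agent"):
            continue
        when = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        delay = when - ping_time
        if timedelta(0) <= delay <= window:
            hits.append((delay, int(m.group("status"))))
    return hits

# Hypothetical sample data, not taken from any real log.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE = [
    f'192.0.2.10 - - [18/Jun/2008:13:40:12 +0000] "GET /new-post/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.9 - - [18/Jun/2008:13:41:00 +0000] "GET /new-post/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows; U)"',
    f'192.0.2.10 - - [18/Jun/2008:15:30:00 +0000] "GET /new-post/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
]
PING = datetime(2008, 6, 18, 13, 0, tzinfo=timezone.utc)

if __name__ == "__main__":
    for delay, status in googlebot_hits(SAMPLE, "/new-post/", PING):
        print(f"Googlebot fetched the page {delay} after the ping (HTTP {status})")
```

If the list comes back empty for a fresh post, the crawl itself has slowed down; if it comes back with 200s, the delay is on the indexing side.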

5:27 pm on June 18, 2008 (gmt 0)

Preferred Member

5+ Year Member Top Contributors Of The Month

joined:Aug 16, 2006
posts: 397
votes: 1


Yes it does. Sometimes the bot fetches the new URL 2 or 3 times in the hour following the Feedburner ping. The page just does not want to appear in the results.

I'm not so worried about ranking because I always post original content. I know this because before I post anything I do a search and it returns no results. So, theoretically, my post should be the only result, or at least on the first page.

It is a problem because I post original content. Scraper sites copy everything and they get indexed faster and appear in the search results and get all the traffic. So practically I'm working in vain.

Another thing I don't understand:

When a new URL eventually turns up in the results, it says it was indexed 7 hours ago, although it started appearing in the results just 5 minutes ago. Could this be a geolocation issue? (The new page gets indexed in a distant datacenter and needs more time to get into the main index.)

5:51 pm on June 18, 2008 (gmt 0)

Preferred Member

5+ Year Member Top Contributors Of The Month

joined:Aug 16, 2006
posts: 397
votes: 1


Something I forgot to mention, which started a couple of weeks ago: on any given day it indexed a number of posts as usual, and a few of them did not get indexed until the next day.

Starting from last Thursday none of the new posts get indexed until the next day.

6:44 pm on June 18, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


[quote]it says it was indexed 7 hours ago[/quote]

That's when the spider recorded the page. As I said before, spidering and actually showing up in the index are different stages, but the timestamp is for when googlebot got the source code from your server.
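To make the two stages concrete, here is the arithmetic for the example in the first post, written out as a few lines of Python. The three figures are the ones the original poster gave; the variable names are just illustrative labels, and reading the "7 hours ago" stamp as the crawl time is the interpretation described above, not something Google has confirmed in this thread.

```python
from datetime import timedelta

# Figures from the first post, all measured back from "now".
posted_ago = timedelta(hours=8)       # article published
crawl_stamp_ago = timedelta(hours=7)  # "indexed 7 hours ago" shown in the result
first_seen_ago = timedelta(hours=1)   # first time the page showed up in a search

# Stage 1: publishing -> googlebot fetching the source.
publish_to_crawl = posted_ago - crawl_stamp_ago

# Stage 2: googlebot fetching the source -> page visible in results.
crawl_to_visible = crawl_stamp_ago - first_seen_ago

if __name__ == "__main__":
    print("publish -> crawl:  ", publish_to_crawl)
    print("crawl   -> visible:", crawl_to_visible)
```

On those numbers the crawl still happens about an hour after posting, and the whole six-hour gap sits between the crawl and the page becoming visible.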

Yes, there's a change in your pattern, and I can sympathize with your concern about scraper sites - although I doubt that there's much you can do to change it. Do you have Webmaster Tools set up - and do you watch it for feedback from Google?

9:54 pm on June 18, 2008 (gmt 0)

Preferred Member

5+ Year Member Top Contributors Of The Month

joined:Aug 16, 2006
posts: 397
votes: 1


Yes, I have Webmaster Tools set up. No feedback from Google.

As for the timestamp... Googlebot gets the source code much earlier than the timestamp says, so I have some doubts that the timestamp shows when the bot gets the source code. (It usually downloads a new post around an hour after the Feedburner ping.)

How come I never see a timestamp that's less than 7 hours?

Until this situation I was able to see any timestamp from a few seconds to 22 hours (from what you're saying, a new post was indexed as soon as it was spidered).

This is the third time this situation happened. I'm beginning to believe that there's a time penalty of some sort, but I can't figure out the reason (I can't figure out how it got solved the first two times either) because I'm playing by the rules.

2:12 pm on June 19, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member jimbeetle is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Oct 26, 2002
posts:3292
votes: 6


[quote]time penalty of some sort[/quote]

I don't think so. Google's spidering and indexing behavior is (as far as we know/think) algorithmically driven, based largely on PageRank. The settings tend to slip and slide a notch or two from time to time. When they do, it's logical that some sites will see changes in frequency of spidering and speed of indexing.

I actually don't think you have a "problem" here; this is just a slight modification to G's behavior.

2:50 pm on June 19, 2008 (gmt 0)

Preferred Member

5+ Year Member Top Contributors Of The Month

joined:Aug 16, 2006
posts: 397
votes: 1


I've seen blogs with less PageRank that get indexed with no problems, so I don't think PageRank is a factor in this matter.

If there's no penalty, some settings are certainly changing, and I think those settings govern which datacenter the bot assigns my domain to. I say this because I observed that the datacenters update far slower than usual.

3:10 pm on June 19, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member jimbeetle is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Oct 26, 2002
posts:3292
votes: 6


[quote]I've seen blogs with less PageRank that get indexed with no problems, so I don't think PageRank is a factor in this matter.[/quote]

I didn't say it was *all* PageRank. Obviously there are other factors; it can be as simple or as complicated as the folks at Google would like it to be.

3:29 pm on June 20, 2008 (gmt 0)

New User

5+ Year Member

joined:June 20, 2008
posts:2
votes: 0


[quote]I know this because before I post anything I do a search and it returns no results. So, theoretically, my post should be the only result, or at least on the first page.[/quote]

Just a quick question: are you blogging for Google or for the visitors of your site?

It seems to me you are over-obsessed with your Google Rankings... just do whatever you do - write original content - and Google will follow (eventually).

7:10 pm on June 22, 2008 (gmt 0)

Preferred Member

5+ Year Member Top Contributors Of The Month

joined:Aug 16, 2006
posts: 397
votes: 1


[quote]It seems to me you are over-obsessed with your Google Rankings[/quote]

Just my point. I don't care about Ranking because I usually am the first to post on a particular subject. But if scraper sites get indexed faster than me, there's no point in posting at all.

12:36 pm on June 23, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member wheel is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Feb 11, 2003
posts:5063
votes: 11


Here's just a wild thought, assuming that speed of getting indexed is what actually makes the difference when fighting scraper sites. How about a small script that checks who's asking for a page and gives a 404 or a blank page until it's Googlebot requesting the page - after which the script shuts down and the page is published.
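A rough sketch of what such a script could look like is below. Everything in it is an assumption for illustration: the names `is_verified_googlebot` and `FirstCrawlGate` are made up, and how it would hook into a particular blog platform is left out. Since the User-Agent header is trivial for a scraper to fake, the sketch uses the reverse-then-forward DNS check that Google describes for confirming a request really comes from Googlebot. The lookups are passed in as parameters so the logic can be tried without touching the network.

```python
import socket

# Hostname suffixes Google documents for its crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")

def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-resolve `ip`, check the hostname suffix, then forward-resolve
    the hostname and confirm it maps back to the same address (IPv4 only)."""
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip)
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        return ip in forward_lookup(host)
    except OSError:  # lookup failures count as "not verified"
        return False

class FirstCrawlGate:
    """Hypothetical gate: answer 404 until a verified Googlebot request
    arrives, then answer 200 to everyone from that point on."""

    def __init__(self, verifier=is_verified_googlebot):
        self.released = False
        self.verifier = verifier

    def status_for(self, ip, user_agent):
        if self.released:
            return 200
        if "Googlebot" in user_agent and self.verifier(ip):
            self.released = True
            return 200
        return 404

if __name__ == "__main__":
    # Stand-in verifier so the demo does no DNS lookups.
    gate = FirstCrawlGate(verifier=lambda ip: ip == "192.0.2.10")
    print(gate.status_for("203.0.113.5", "Mozilla/5.0"))    # ordinary visitor before release
    print(gate.status_for("192.0.2.10", "Googlebot/2.1"))   # verified crawler
    print(gate.status_for("203.0.113.5", "Mozilla/5.0"))    # ordinary visitor after release
```

Note that the gate only opens on the first crawl. Going by the timings earlier in the thread, the page could still be hours away from showing up in results at that point, so scrapers would only be held back until the fetch, not until the listing appears.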
 
