
Googlebot never visits twice the same page


JorgeV

8:24 am on Aug 30, 2019 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



Hello-

On one of my sites, I post articles daily. Googlebot visits the new pages almost instantly, and they appear in the SERPs within an hour. This is fine; however, Googlebot never visits these pages again.

I mean, I post new articles, then, once in a while, I update existing articles with additional content, usually several paragraphs of text. When an article is updated, it's listed again on the front page of my site. But Googlebot never visits these pages again, and therefore never indexes the newly added content.

Am I doing something wrong?

Thank you,

tangor

5:22 pm on Aug 30, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



No, g just has strange ways of doing things from time to time. (sigh)

After a while they will get around to crawling the older pages, but there's no hard rule for that.

Oddly, I have a different problem in that g keeps hitting all my evergreen pages relentlessly and often does not see new content until two or three crawls later.

not2easy

6:29 pm on Aug 30, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Googlebot never visits these pages again

Two questions:
1. Are you certain that pages are not visited, I mean do your access logs not show any visits? Over what period of time? I mean is this a total span of two weeks, 6 months, a year?
2. Have you submitted sitemaps and created a GSC account to check for new data?

JorgeV

8:31 pm on Aug 30, 2019 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



1. Are you certain that pages are not visited, I mean do your access logs not show any visits?

Yes. I wrote a script that processes the log files to keep track of all bot activity for each page.

Over what period of time? I mean is this a total span of two weeks, 6 months, a year?

One year, at least. I started looking at it one year ago, so I have "only" 1 year of data.

2. Have you submitted sitemaps and created a GSC account to check for new data?

Yes.

not2easy

3:23 am on Aug 31, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



How certain are you in the reliability of your scripts? How many Googlebot UAs are your scripts tracking and have you updated the UA of your scripts? Are you only tracking certain Googlebot UAs or is it UA and IP? Do any other robots visit your pages?

skaterpunk

11:13 am on Aug 31, 2019 (gmt 0)

5+ Year Member Top Contributors Of The Month



never index the newly added content.

An article is indexed once and that's it. Adding new content, which I assume is meant to update and extend the information, may help with moving up in the SERPs and keeping a good SERP position, but you don't get a whole new index.

JorgeV

11:56 am on Aug 31, 2019 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



How certain are you in the reliability of your scripts?

Honestly, I can't think of any weakness in my script. Some are tracking humans, I am tracking bots :-)

How many Googlebot UAs are your scripts tracking and have you updated the UA of your scripts?

It handles all those listed here : [support.google.com...]

Are you only tracking certain Googlebot UAs or is it UA and IP?

I record a hit as a Googlebot hit only when both the UA and the IP range are correct. Separately, I also record hits that come from an IP range that is supposed to belong to Googlebot but arrive with a different UA.

Do any other robots visit your pages?

Yes: Bing, Baidu, and Yandex. I don't track other bots the way I track those.
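
For readers who want to reproduce the UA + IP check described above, here is a minimal Python sketch. It is not JorgeV's actual script. The function name `classify_hit` and the single network `66.249.64.0/19` are assumptions for illustration; check Google's currently published ranges before relying on that value.

```python
import ipaddress

# Assumption: one example Googlebot range; replace with the ranges
# Google currently publishes.
GOOGLEBOT_NETWORKS = [ipaddress.ip_network("66.249.64.0/19")]


def classify_hit(ip: str, user_agent: str) -> str:
    """Classify one log hit by IP range and user agent (hypothetical helper)."""
    addr = ipaddress.ip_address(ip)
    in_range = any(addr in net for net in GOOGLEBOT_NETWORKS)
    has_ua = "googlebot" in user_agent.lower()
    if in_range and has_ua:
        return "googlebot"           # correct UA and correct IP range
    if in_range:
        return "google-ip-other-ua"  # Google range, but a different UA
    return "other"                   # includes spoofed Googlebot UAs
```

Keeping the "Google IP, other UA" bucket separate matters here: if a script only counts the first bucket, any crawl made with a UA string the script does not recognise would silently disappear from the totals.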

not2easy

12:24 pm on Aug 31, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



If your robots.txt file is not accidentally blocking their bots' access, I don't know of any other way to keep googlebots from visiting a site. If that happened to one of my sites I would be examining my access logs manually to verify my conclusions.
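
One quick way to rule out an accidental robots.txt block is Python's standard `urllib.robotparser`. This is only a sketch: the helper name `googlebot_allowed`, the sample rules, and the example.com URLs are made up for illustration, and the parser's matching is not guaranteed to be identical to Google's own.

```python
from urllib.robotparser import RobotFileParser


def googlebot_allowed(robots_txt: str, url: str) -> bool:
    """Return True if the given robots.txt text lets 'Googlebot' fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


# Hypothetical rules, only to show the call.
sample_rules = "User-agent: *\nDisallow: /private/\n"
print(googlebot_allowed(sample_rules, "https://example.com/articles/foo"))
print(googlebot_allowed(sample_rules, "https://example.com/private/draft"))
```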

lucy24

7:38 pm on Aug 31, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The idea of the Googlebot--or, for that matter, any search-engine spider--visiting a page just once and then never again is so wildly improbable that we really need to find an alternative explanation. Obviously the first candidate is the log-processing function. Pull up a week or so of logs and use a text editor to search for something simple like
^66\.249\.[67]\d.+?GET /pagename
I tried this myself just to confirm that I have pages which get more search-engine visits than humans. (How I wish there were a meta called something like "archived", telling search engines that this content will never change significantly so you don't need to keep crawling it!)
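
If a text editor is not handy, the same search can be run over log lines with a few lines of Python. This is a sketch that reuses the regex above; the function name `googlebot_requests` and the sample log line are assumptions. Note that the pattern matches any path that starts with the given page name.

```python
import re


def googlebot_requests(log_lines, path):
    """Return the lines from 66.249.6x/7x addresses that GET the given path."""
    # Same pattern as above, with the page name escaped and appended.
    pattern = re.compile(r"^66\.249\.[67]\d.+?GET " + re.escape(path))
    return [line for line in log_lines if pattern.search(line)]


# Hypothetical log line in combined log format.
sample = [
    '66.249.66.1 - - [30/Aug/2019:08:25:01 +0000] "GET /pagename HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
]
for hit in googlebot_requests(sample, "/pagename"):
    print(hit)
```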

Can we assume that each of your pages has at least some dynamic content? Doesn't have to be the whole page; a single php include will do. I bring this up because one grasping-at-straws explanation is that 100% hard-coded pages will generally return a 304, and perhaps your log-processing script ignores those.
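
To test the 304 theory, it helps to tally hits per page and per status code rather than counting only 200s. A minimal sketch, assuming common/combined log format and using a simple `66.249.` IP prefix as a stand-in for a real Googlebot check; `status_counts` and `LINE_RE` are hypothetical names.

```python
import re
from collections import Counter

# Assumption: common/combined log format, where the status code follows
# the quoted request line.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) .*?"(?:GET|HEAD) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)


def status_counts(log_lines, ip_prefix="66.249."):
    """Count hits per (path, status) for IPs starting with ip_prefix."""
    counts = Counter()
    for line in log_lines:
        m = LINE_RE.match(line)
        if m and m.group("ip").startswith(ip_prefix):
            counts[(m.group("path"), m.group("status"))] += 1
    return counts
```

If pages show up here with 304 entries but are missing from the existing report, the log-processing script is probably dropping "Not Modified" responses.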