The way (and frequency) googlebot spiders

It comes very rarely, and its behaviour is confusing

         

feelingalive

8:06 am on Aug 5, 2004 (gmt 0)

10+ Year Member



Hi all,

I have a site whose main page is PR3 while all the other pages are PR0, even though the pages are well interlinked.

Anyway, I did a redesign 2 months ago. There are no new pages; the old pages already indexed by Google now 301-redirect to the new ones. The 301s work fine, because clicking a Google result leads you to the new page.

I've got the following visits from googlebot:

/ (main page): 25 visits
/****.php (top level pages): 2 visits each
/A/****.php (one of the subfolders): 1 visit to each of its pages
Other subfolders (/B/ and /C/): only the index was visited, once; the individual pages inside these subfolders were never visited.
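
For anyone wanting to produce a tally like the one above from raw server logs, here is a minimal sketch. It assumes an Apache-style "combined" log format and identifies Googlebot only by the user-agent string (which can be spoofed); the function name `googlebot_hits` is made up for illustration.

```python
import re
from collections import Counter

# Assumes combined log format: "METHOD path proto" status size "referer" "user-agent"
LOG_LINE = re.compile(r'"(?:GET|HEAD) (\S+) [^"]*" \d{3} \S+ "[^"]*" "([^"]*)"')

def googlebot_hits(lines):
    """Count requests per path whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        m = LOG_LINE.search(line)
        # Match on the user-agent text only; this does not verify the IP.
        if m and "googlebot" in m.group(2).lower():
            hits[m.group(1)] += 1
    return hits
```

Calling `googlebot_hits(open("access.log"))` (the file name is an assumption) returns a counter keyed by requested path, which can then be grouped by folder.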

When there is a link to /A/xxx there is also a link to /B/xxx and /C/xxx. So, here are my questions:

- How is this behaviour possible?
- How often can I expect a page with PR0 to be recrawled?
- Provided that there is good interlinking, shouldn't I expect that when googlebot visits the index it revisits the whole site?

Thank you in advance

jtbell

4:35 pm on Aug 5, 2004 (gmt 0)

10+ Year Member



- Provided that there is good interlinking, shouldn't I expect that when googlebot visits the index it revisits the whole site?

The Googlebot doesn't think in terms of sites, it thinks in terms of pages. How often it visits a page basically depends on the PR of that page, as far as I know. The lower the PR, the less often it visits. Inner pages usually have lower PR than the index page.

My index page is PR7 and Googlebot hits it almost every day. A page that is linked directly from the index page gets hit about every two days. Pages that are two clicks away get hit about twice per week. (based on a quick small sample)
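
To make "clicks away" concrete, here is a rough sketch that computes the click depth of each page with a breadth-first search over internal links. The `links` mapping and the page names in the usage note are hypothetical; you would have to build the map from your own site.

```python
from collections import deque

def click_depth(links, start):
    """Return the minimum number of clicks from `start` to each reachable page.

    `links` maps a page to the list of pages it links to (hypothetical input).
    """
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in depth:  # first time reached = shortest path in BFS
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth
```

For example, with `links = {"/": ["/a.php", "/A/"], "/A/": ["/A/x.php"]}`, the page `/A/x.php` comes out at depth 2 because it is only reachable through `/A/`.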

feelingalive

8:26 pm on Aug 5, 2004 (gmt 0)

10+ Year Member



Does it make a difference whether a page is in the main directory or in a subfolder? Or is it just a matter of how many "clicks away" it is?

hutcheson

8:41 pm on Aug 5, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's the number of clicks away, and that alone.

Now, sometimes if a page has NOT been assigned a page rank, the Google toolbar will show an ESTIMATED pagerank, which is based on the page's depth in the directory structure. This has caused some confusion for some people. Ignore it -- if the page is not in the index, it ain't gonna be found! If the page IS in the index, it'll eventually be assigned an actual pagerank, based on the inbound links and nothing else.

Or a page may appear higher (or lower) in the listings for a particular search because of the presence or absence of search terms in the directory names. But that's a different issue.

yowza

8:48 pm on Aug 5, 2004 (gmt 0)

10+ Year Member



It's the number of clicks away, and that alone.

What about the fresh content effect? Don't pages that change often get visited more often? I think that if you had two equal pages (same PR, directory depth, inbound links, outbound links, etc.), one whose content changes every day and one that never changes, the changing page would be visited a lot more often than the unchanging page.

jtbell

11:16 pm on Aug 5, 2004 (gmt 0)

10+ Year Member



Don't pages that change often get visited more often?

That's what I thought, too (now that you remind me of it), but I can't find any evidence of it among my small number of pages. I have some pages that are linked directly from my home page, but don't get any links from anywhere else (as far as I know), and link back only to the home page. I was revising one of them almost every day for a couple of weeks, ending last weekend. The others have remained the same for months. Googlebot has hit each of them twice since Sunday.

That would indicate either that there's no freshbot effect (on frequency of Googlebot visits) or that if there is one, it either evaporates quickly or you have to update regularly for a longer period than a couple of weeks in order for it to take effect.
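
A comparison like this is easier to judge with a number per page. Below is a small sketch that computes the average gap in days between consecutive bot visits to one page; the function name is made up, and the timestamps are assumed to have been pulled from the server log already.

```python
from datetime import datetime

def mean_gap_days(timestamps):
    """Average gap in days between consecutive visits; None if fewer than two."""
    times = sorted(timestamps)
    if len(times) < 2:
        return None
    gaps = [(b - a).total_seconds() / 86400 for a, b in zip(times, times[1:])]
    return sum(gaps) / len(gaps)
```

Running it separately on the frequently revised page and on the unchanged ones would show whether their crawl intervals really differ, though a handful of visits is too small a sample to conclude much.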

BillyS

1:34 am on Aug 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I started a new project in late May of this year. I add one page of new content a day and this appears on the home page - so my index page changes daily.

Google first found me in June. In July my entire website was crawled by Googlebot twice. My index page is crawled daily and shows a "fresh" tag in search. The site is a PR2 showing 4 backlinks (two of them from the site itself). All 140 pages or so show in Google currently.

What is truly impressive is the msnbot which grabs my RSS feed hourly and seems to be constantly spidering my website. It could be that I am confusing this bot with early design tweaks to the site. But this spider too seems to like fresh pages. I would ban it if I had any kind of traffic at all. Funny thing is I show about 30% of my pages in the MSN search preview but none in their production model - well, not so funny.