Google's spidering pattern lately?

     
3:10 am on Aug 15, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 4, 2000
posts:1324
votes: 0


I have noticed that Google will quickly grab the home page of a new site and index it, sometimes in just a few days, and then not visit again for an extended period of time.

Does anyone have a handle on what that extended period of time is?

It is nice to see new home pages going up quickly, but they rarely contain our real meat-and-potatoes content...

-s-

11:31 am on Aug 16, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member ciml is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:June 22, 2001
posts:3805
votes: 2


I think you'll see quick, deep spidering only if you have high PR links to you. How much PR you need, I don't know.
11:39 am on Aug 16, 2003 (gmt 0)

New User

10+ Year Member

joined:Aug 4, 2003
posts:26
votes: 0


Yes. When your site is linked from a high-PR site, Googlebot may find you and do a quick crawl.
2:42 pm on Aug 18, 2003 (gmt 0)

Full Member

10+ Year Member

joined:May 8, 2003
posts:292
votes: 0


Yeah, the same thing happened to me; the main page got indexed pretty quickly, but there was no deep spidering. I wondered if that was because of dynamic pages. But I have found that a higher PR on the main page (once it is indexed) goes a long way toward full indexing of the site.
3:15 pm on Aug 18, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:Aug 11, 2003
posts:495
votes: 0


I have a new PR0 site that I put up last month. I used mod_rewrite to make the URLs look static. Googlebot visited and grabbed the robots.txt. Two days later, he came back with his relatives and deep crawled the whole site. He comes by almost every day and deep crawls about every 7-10 days. On the other days, he grabs any new or changed pages I put up.

If you go out and get some links, I think he'll do the same for you. When I put up new pages, I make the info available to visitors on other sites covering the same topic, which gives me direct links back. Google follows them back and grabs my new stuff within 24 hours. I've pretty much got enough links now that he's looking for new pages on his own.

7:15 pm on Aug 18, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Aug 8, 2003
posts:98
votes: 0


"mod_rewrite to make the urls look static"
What is mod_rewrite, and what do you mean by static (working link)?
Also, what is "robots.txt"? I have seen it in the HTML of other websites but did not know whether I should put it in my site. Should I? Below is an example of what I have seen in a <head> tag:

<META NAME="Robots" CONTENT="index,follow">
<META NAME="GOOGLEBOT" CONTENT="INDEX, FOLLOW">
<META NAME="Robots" CONTENT="index,follow">

If you have any input on whether it's good to use, please let me know.
Thanks

7:25 pm on Aug 18, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 9, 2002
posts:95
votes: 0



"mod_rewrite to make the urls look static"
What is mod_rewrite, and what do you mean by static (working link)?

mod_rewrite is an Apache webserver module that allows a URL to be rewritten to a different form. One use of this module is to convert forms like

[foo.bar...]

to

[foo.bar...]

Googlebot tends to avoid crawling URLs containing CGI-style query strings because of looping issues. However, it will crawl URLs that appear to be subdirectories. It's unfair, it's wrong, but it just is.
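To make the idea concrete, here is a minimal Python sketch of the mapping such a rewrite performs. It is not Apache configuration, and the URL shapes (`/catalog.cgi?category=...&id=...` and `/catalog/<category>/<id>`) are hypothetical, chosen only for illustration; they are not the forms elided above.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL shapes, used only to illustrate the idea of a rewrite.
STATIC_PATTERN = re.compile(r"^/catalog/(?P<category>[^/]+)/(?P<item>\d+)$")


def to_static(dynamic_url: str) -> str:
    """Build the static-looking link you would publish on your pages."""
    query = parse_qs(urlsplit(dynamic_url).query)
    return f"/catalog/{query['category'][0]}/{query['id'][0]}"


def to_dynamic(static_path: str) -> str:
    """Map the static-looking path back to the real script and its
    query string, which is roughly the job a rewrite rule does."""
    match = STATIC_PATTERN.match(static_path)
    if match is None:
        raise ValueError(f"not a rewritable path: {static_path}")
    return f"/catalog.cgi?category={match['category']}&id={match['item']}"
```

The spider only ever sees the path form; the server translates it back to the script behind the scenes.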

As to robots.txt -- it tells Googlebot which parts of the site it should NOT crawl. There's a debate on whether its absence will also prevent Googlebot from crawling your site, but absence is supposed to grant permission to crawl. You can read more about robots.txt at [robotstxt.org....]
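If you want to check how a set of robots.txt rules would be read, Python's standard `urllib.robotparser` module can evaluate them. The sketch below is only an illustration: the example rules and the `www.example.com` URLs are made up, and an empty rule list stands in for an empty robots.txt. How Googlebot itself behaves is up to Google.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_lines, user_agent, url):
    """Return True if the given robots.txt lines permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules from a list of lines, no network access
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: every robot is asked to stay out of /private/.
EXAMPLE_RULES = [
    "User-agent: *",
    "Disallow: /private/",
]

# With no rules at all, nothing is disallowed.
print(allowed([], "Googlebot", "http://www.example.com/page.html"))
# With the example rules, /private/ is off limits and the rest stays open.
print(allowed(EXAMPLE_RULES, "Googlebot", "http://www.example.com/private/page.html"))
print(allowed(EXAMPLE_RULES, "Googlebot", "http://www.example.com/index.html"))
```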

8:09 pm on Aug 18, 2003 (gmt 0)

New User

10+ Year Member

joined:June 15, 2003
posts:22
votes: 0


On a related note, what percent of a website's traffic would you expect to come from Googlebot? I don't recall this being discussed anywhere.
9:24 pm on Aug 18, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 15, 2003
posts:2395
votes: 0


>> what percent of a website's traffic would you expect to come from Googlebot?

Googlebot's traffic pattern of late can have a large impact - especially for smaller sites. It can be double-digit percentages. I'm not speaking about the visitors being sent from the SE here, but about the traffic the bot generates by itself - false pageviews that is.

With regard to visitors (humans), it depends a whole lot on the site. SE traffic in general is most important for new sites where a critical mass of users has not been found yet, imho. As the site grows larger and more established, the percentage of new users (SEs, links) relative to repeat users (bookmarks, address bar) will decline.

/claus

10:14 pm on Aug 18, 2003 (gmt 0)

Senior Member from MT 

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 1, 2003
posts:1843
votes: 0


I have some large sites with medium traffic (i.e. more pages than daily visitors), and general bot traffic can make up around 60% or more of total traffic.

SN

PS: I always exclude all bots when doing my stats.
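For anyone wanting to do the same, here is a rough Python sketch of one way to measure and exclude bot hits. It assumes you have already pulled the user-agent string out of each log line; the `BOT_MARKERS` list and the function names are my own invention and far from complete, so treat it as a starting point rather than how any stats package actually does it.

```python
# Substrings that suggest a robot; an illustrative, incomplete assumption.
BOT_MARKERS = ("googlebot", "slurp", "spider", "crawler", "bot")


def is_bot(user_agent: str) -> bool:
    """Guess whether a user-agent string belongs to a robot."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def bot_share(user_agents) -> float:
    """Fraction of hits (0.0 to 1.0) that came from robots."""
    agents = list(user_agents)
    if not agents:
        return 0.0
    return sum(1 for ua in agents if is_bot(ua)) / len(agents)


def human_hits(user_agents):
    """Hits left over once robots are excluded from the stats."""
    return [ua for ua in user_agents if not is_bot(ua)]
```

Matching on substrings will miss bots that fake a browser user agent, so the real bot share is likely somewhat higher than whatever this reports.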

10:45 pm on Aug 18, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Jan 4, 2001
posts:1076
votes: 3


Regarding msg #1 & 2... the following experience involving 3 sites, all in the same 30 day time frame in May-June, may throw some light on what happens.

One was an existing site with PR4 that had a major update. It was spidered within 2 days and continues to be spidered across all pages.

Second was a new site with a reasonable startup group of PR6/5/4 inbounds. It was spidered within a few days, the index page appeared after about 2 weeks and the other pages after about a month. The site is spidered regularly every few days. The index page has just started showing PR4.

Third was a site with basically the same group of startup links, but for some reason Google isn't seeing all of them. This site shows a PR2 and gets nothing like the same number of spider visits.

It may be that sites at PR4 and above get regular spidering, while those below PR4 get less frequent spidering... but it's too small a sample to be making statements of fact.