Google's spidering pattern lately?

     
3:10 am on Aug 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have noticed that Google will quickly grab the home page of a new site and index it, sometimes in just a few days, and then not visit again for an extended period of time.

Does anyone have a handle on what that extended period of time is?

It is nice to see new home pages going up quickly, but they rarely contain our real meat-and-potatoes content...

-s-

11:31 am on Aug 16, 2003 (gmt 0)

WebmasterWorld Senior Member ciml is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I think you'll see quick, deep spidering only if you have high PR links to you. How much PR you need, I don't know.
11:39 am on Aug 16, 2003 (gmt 0)

10+ Year Member



Yes. When your site is linked by a high PR site, then Googlebot may find you and process a quick crawl.
2:42 pm on Aug 18, 2003 (gmt 0)

10+ Year Member



Yeah, the same thing happened to me: the main page got indexed pretty quickly, but there was no deep spidering. I wondered if that was because of dynamic pages. But I have found that a higher PR on the main page (once it is indexed) goes a long way toward full indexing of the site.
3:15 pm on Aug 18, 2003 (gmt 0)

10+ Year Member



I have a new site with a PR0 put up last month. I used mod_rewrite to make the urls look static. Googlebot visited and grabbed the robots.txt. Two days later, he came back with his relatives and deep crawled the whole site. He comes by almost every day and deep crawls about every 7-10 days. On the other days, he grabs any new or changed pages I put up.

If you go out and get some links, I think he'll do the same for you. When I put up new pages, I make the info available to visitors on some other same-topic sites, which gives me direct links back. Google follows them back and grabs my new stuff within 24 hours. I've pretty much got enough links now that he's looking on his own for new pages.

7:15 pm on Aug 18, 2003 (gmt 0)

10+ Year Member



"mod_rewrite to make the urls look static"
What is mod_rewrite, and what do you mean by static (working link)?
Also, what is "robots.txt"? I have seen it in the HTML of other web sites but did not know if I should put it in my site. Should I? Below is an example of what I have seen in a <head> tag:

<META NAME="Robots" CONTENT="index,follow">
<META NAME="GOOGLEBOT" CONTENT="INDEX, FOLLOW">

If you have any input on whether it's good to use, please let me know.
Thanks

7:25 pm on Aug 18, 2003 (gmt 0)

10+ Year Member




"mod_rewrite to make the urls look static"
What is mod_rewrite, and what do you mean by static (working link)?

mod_rewrite is an Apache webserver module that allows a URL to be rewritten to a different form. One use of this module is to convert forms like

[foo.bar...]

to

[foo.bar...]

Googlebot refuses to crawl URLs containing CGI forms because of looping issues. However, it will crawl URLs that appear to be subdirectories. It's unfair, it's wrong, but it just is.
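To make the idea concrete, here is a rough Python sketch of the kind of mapping such a rewrite rule performs. It is not Apache configuration, and since the example URLs above are truncated, the URL shapes here (`/widgets/42` and `/catalog.cgi?category=widgets&id=42`) are made up purely for illustration, as are the names `STATIC_PATTERN` and `to_dynamic`.

```python
import re
from urllib.parse import urlencode

# Hypothetical URL shapes, not the ones from the post above:
#   static-looking:  /widgets/42
#   real script URL: /catalog.cgi?category=widgets&id=42
STATIC_PATTERN = re.compile(r"^/(?P<category>[a-z]+)/(?P<id>\d+)$")


def to_dynamic(path):
    """Map a static-looking path to the dynamic URL that serves it.

    Paths that do not match the pattern are returned unchanged,
    much as a rewrite rule leaves non-matching requests alone.
    """
    match = STATIC_PATTERN.match(path)
    if match is None:
        return path
    query = urlencode({"category": match.group("category"),
                       "id": match.group("id")})
    return "/catalog.cgi?" + query


if __name__ == "__main__":
    print(to_dynamic("/widgets/42"))
    print(to_dynamic("/about.html"))
```

The visitor (and the spider) only ever sees the first form; the server translates it internally to the second.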

As to robots.txt: it tells Googlebot which parts of the site it should NOT crawl. There's a debate on whether its absence will also prevent Googlebot from crawling your site, but absence is supposed to grant permission to crawl. You can read more about robots.txt at [robotstxt.org....]
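If you want to see how a well-behaved crawler is meant to read such a file, Python's standard `urllib.robotparser` module can evaluate one. The sketch below is only an illustration: the sample rules, the `example.com` URLs, and the helper name `can_crawl` are assumptions, and it says nothing about how Googlebot itself is implemented.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt content: every robot is asked to stay out of /private/.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def can_crawl(robots_txt, agent, url):
    """Return True if robots_txt permits the named agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/index.html",
                "http://example.com/private/page.html"):
        print(url, can_crawl(SAMPLE_ROBOTS_TXT, "Googlebot", url))
```

With this parser, a file that contains no rules at all permits everything, which matches the "absence grants permission" reading.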

8:09 pm on Aug 18, 2003 (gmt 0)

10+ Year Member



On a related note, what percent of a website's traffic would you expect to come from Googlebot? I don't recall this being discussed anywhere.
9:24 pm on Aug 18, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> what percent of a website's traffic would you expect to come from Googlebot?

Googlebot's traffic pattern of late can have a large impact - especially for smaller sites. It can be double-digit percentages. I'm not speaking about the visitors being sent from the SE here, but about the traffic the bot generates by itself - false pageviews that is.

With regard to visitors (humans), it depends a whole lot on the site. SE traffic in general is most important for new sites where a critical mass of new users has not been found yet, imho. As the site grows larger and more established, the percentage of new users (SEs, links) relative to repeat users (bookmarks, address bar) will decline.

/claus

10:14 pm on Aug 18, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have some large sites with medium traffic (i.e. more pages than daily visitors), and general bot traffic can make up around 60% or more of total traffic.

SN

PS: I always exclude all bots when doing my stats.
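For anyone wondering what excluding bots might look like, here is a minimal Python sketch that works from a list of user-agent strings pulled out of a log. The marker list and the function names `is_bot` and `bot_share` are assumptions for illustration only; a real stats package will have its own, much longer, bot list.

```python
# Illustrative substrings only; a real bot list would be far longer.
BOT_MARKERS = ("googlebot", "slurp", "msnbot")


def is_bot(user_agent):
    """Return True if the user-agent string contains a known bot marker."""
    lowered = user_agent.lower()
    return any(marker in lowered for marker in BOT_MARKERS)


def bot_share(user_agents):
    """Percentage of requests in user_agents that came from known bots."""
    total = len(user_agents)
    if total == 0:
        return 0.0
    bots = sum(1 for agent in user_agents if is_bot(agent))
    return 100.0 * bots / total


def human_requests(user_agents):
    """Keep only the requests that do not look like bot traffic."""
    return [agent for agent in user_agents if not is_bot(agent)]


if __name__ == "__main__":
    sample = [
        "Googlebot/2.1 (+http://www.googlebot.com/bot.html)",
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
        "Mozilla/5.0 (compatible; Yahoo! Slurp)",
    ]
    print(bot_share(sample))
    print(human_requests(sample))
```

Matching on user-agent substrings is the simplest approach; it will miss bots that do not identify themselves.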

10:45 pm on Aug 18, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Regarding msg #1 & 2... the following experience involving 3 sites, all in the same 30 day time frame in May-June, may throw some light on what happens.

One was an existing site with PR4 that had a major update. It was spidered within 2 days and continues to be spidered across all pages.

Second was a new site with a reasonable startup group of PR6/5/4 inbounds. It was spidered within a few days, the index page appeared after about 2 weeks and the other pages after about a month. The site is spidered regularly every few days. The index page has just started showing PR4.

Third was a site with basically the same group of startup links, but for some reason Google isn't seeing all of them. This site shows a PR2 and gets nothing like the same number of spider visits.

It may be that PR4 and above get regular spidering, while below PR4 gets less frequent spidering... it's too small a sample to be making statements of fact.

 
