Page rank and googlebot crawl schedule

What's your Page rank and when will googlebot deep crawl your site?

         

latimer

6:58 pm on Jun 5, 2002 (gmt 0)

10+ Year Member



Last month it seemed the theory that less page rank puts you lower on Googlebot's spidering schedule was confirmed (for existing sites; new sites seem to get special attention). Also, that higher page rank will result in more of a large site's pages being crawled. Anyone care to share this month's crawl dates on their sites, number of pages crawled and page rank? We are currently sitting at a lowly page rank of 3 on the index and 0 on other levels and haven't seen Googlebot yet.

Jack_Straw

10:16 pm on Jun 7, 2002 (gmt 0)

10+ Year Member



What are you guys thinking? Suppose you have a store with thousands of products? Are you saying those product pages have no value?

I have recently been shopping for an MP3 player. I would like to see more pages listed (not fewer) when I search for specific items. If I put in a particular brand and model, the search engine should show me information pages about that brand and model and places where I can buy it.

I really don't get how you are thinking. Pages from a site with lots of good product pages with information about each product and a way to buy them are very valid search results.

dcheney

10:45 pm on Jun 7, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Contractor,
I have to disagree. My site (non-commercial/ad free) currently has roughly 10k pages with another 15k coming on line in the next year. I do have "overhead" pages (well under 10%) which help someone navigate to the proper page if they come in the front gate. But otherwise, the vast majority is unique content about a specific person/organization. The vast majority of search engine hits are directly to these inside unique pages based on the proper names of the person/organization.
My site is static, built via a custom database.

The Contractor

12:56 am on Jun 8, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ok guys my thoughts:

There is absolutely no reason for google to cache every page of every site when that could be thousands of pages. You mention you have thousands of products. What if google crawled every page of amazon.com and buy.com - do you really think you would have a chance to compete?

If you cannot get google to know what your site is about in a couple thousand pages - the extra 90K pages are not going to help.

If I had a site of public records, do you think it is Google's duty to spider every record?
Sure it's all unique content but not really ;)

Jack_Straw

1:10 am on Jun 8, 2002 (gmt 0)

10+ Year Member



Duty? I don't know about duty.

But public records are valuable information. If I am researching obscure stuff, I hope to find it there. I believe Google intends to be comprehensive. Isn't that why they call it Google?

You can't really be arguing that no one source can have more than "a couple thousand" pages of useful information.

The Contractor

1:15 am on Jun 8, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Jack,

I am not saying at all that the site does not have useful information. What I am saying is that if Google cached every page of the public records that are on file for local governments, how could a site that has the same records compete? Do a search for "Windows Information" on Google. Now what if Google cached every page of Microsoft? Who do you think would have the first 1000 pages of SERPs?

Would you really like to compete with this on a grand scale? Just because somebody dumps a database into a website does not mean that Google should crawl every page.

Doofus

2:05 am on Jun 8, 2002 (gmt 0)



I'm afraid Google disagrees with you on this one, Contractor. I'm not worried at all that Google will chop me off someday simply because my site is too big. On the contrary, I'm surprised I don't get special bonus points from Google. That's why I implied that PageRank is ultimately cheating Google, in that it does certain things poorly that could be done better.

Each of my records is about a specific person, group, or corporation. Many, many searchers put the name of a specific person, group, or corporation into the Google search box. Google might even spit out a white pages listing when it responds, if it detects that it's a candidate for a white pages scan. If you put in a ten-digit U.S. telephone number, it will do a reverse lookup in the U.S. white pages. They aren't using the best collection of white pages data, but it's still useful.

Have you ever heard the term, "I googled him and found out..."? I think it was in a New York Times article about two people on a first date, and it turned out that each had already "googled" the other. It was a lightweight piece, and Google loved the publicity.

Name searches are powerful on Google if you know how to search, particularly when the name is not so common. I think this is one of the more significant contributions search engines have made to our society. For investigative journalists, Google is the first port-of-call. Over 95 percent of my Google referrals are zeroing in on a specific name, so I try to optimize that page (one page per name) for the name itself.

Google has even bragged about their ability in this respect. I believe it was Sergey who said that the first thing any employer might want to do when looking at a promising resume, is to "google" the person.

Google becomes a verb, as happened many years ago with Xerox, as in "I'll xerox a copy for you." This is like living in heaven for any public relations department in an aggressive company.

fathom

2:13 am on Jun 8, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thanks Doofus

I never put the two references together until now. Google/Xerox!

It is excellent branding, eh!

The Contractor

2:23 am on Jun 8, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Doofus - you are missing the whole point from where this thread started :)

Jack_Straw

6:47 am on Jun 8, 2002 (gmt 0)

10+ Year Member



Contractor,

It seems to me that Doofus has the point correct. His point, if I may re-state it, is that Google's method of determining what pages to crawl and index is counter to their goal of indexing all quality relevant content because it sometimes causes large sites with good content to be under-indexed. To me, this seems to be a good and interesting point.

You countered with assertions that large sites should not be indexed simply because they have lots of pages. And your posts carry a strong suggestion that you think that any site having a large number of pages must be spam. That, it seems to me, misses the point.

I think the suggestion that any site with many pages must be spam is very wrong and ill-conceived.

Google and the search public both benefit from an algorithm that indexes all quality and relevant content, irrespective of whether it comes from a large or small site.

vitaplease

7:51 am on Jun 8, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hopefully this discussion will be irrelevant in a year or two, when computer and memory prices have dropped further and Google will have more than enough capacity to spider and index anything.

For the moment, this does not seem to be the case.

If you were Google, how would you choose what to spider frequently and index deeply and what not? Does prioritizing higher PageRank plus new unindexed sites not make sense?
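
To make the question concrete, here is a rough sketch of the kind of scheduling rule being debated in this thread: sites are visited in order of PageRank, new unindexed sites get a boost, and the number of pages crawled per site grows with PageRank. This is not Google's actual method, which nobody here knows. The names `Site`, `NEW_SITE_BOOST`, `page_budget` and the figure of 1000 pages per PageRank point are all made up for illustration.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    pagerank: int      # toolbar-style score, 0 to 10
    total_pages: int   # pages the site actually has
    indexed: bool      # False for a new, never-crawled site


# Hypothetical boost so that new sites jump the queue (an assumption).
NEW_SITE_BOOST = 10


def priority(site: Site) -> int:
    """Higher value means the site is crawled earlier."""
    score = site.pagerank
    if not site.indexed:
        score += NEW_SITE_BOOST
    return score


def page_budget(site: Site, pages_per_point: int = 1000) -> int:
    """Hypothetical crawl depth: more PageRank, more pages crawled."""
    allowance = max(site.pagerank, 1) * pages_per_point
    return min(site.total_pages, allowance)


def crawl_schedule(sites: list[Site]) -> list[tuple[str, int]]:
    """Return (site name, pages to crawl) in crawl order."""
    # heapq is a min-heap, so negate the priority; the index breaks ties.
    heap = [(-priority(s), i, s) for i, s in enumerate(sites)]
    heapq.heapify(heap)
    schedule = []
    while heap:
        _, _, site = heapq.heappop(heap)
        schedule.append((site.name, page_budget(site)))
    return schedule


sites = [
    Site("small-store", pagerank=3, total_pages=500, indexed=True),
    Site("big-archive", pagerank=6, total_pages=25000, indexed=True),
    Site("brand-new", pagerank=0, total_pages=40, indexed=False),
]
for name, pages in crawl_schedule(sites):
    print(name, pages)
```

Under these made-up numbers the large archive is visited early but only partly crawled, which is the under-indexing of big sites that Doofus and Jack_Straw are complaining about.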
