When did I state any of the above, use the word spam, or say Google should not index sites that are large? Whew!
This thread started on PR vs Crawl.
I "am" saying that higher PR sites should be crawled first/deeper.
I "am" saying that if I dump news, public records, a paragraph of words taken from somewhere, and any other information that is freely available elsewhere into a database and call each entry a page, then "yes" it is useful information - but "no" I should not be given special attention just because I have made 100K pages of information that is available everywhere.
And "no" I do "not" believe Google should crawl my entire site if I do a data dump into my website.
Should Google crawl every page of sites that use the DMOZ data dump?
Wouldn't that be kind of stupid of Google to replicate data over and over again?
I am "not" saying that because a site uses data found elsewhere it is useless information - I am saying Google's algo "should" recognize this for what it is.
One of my own sites is a directory type. Do you realize how easily I could dump a lot of the DMOZ data into it by using my spider to crawl certain areas of DMOZ?
I could have almost guaranteed instant PR/success by doing so - I would rather build it up over the course of a couple of years or more and try to get some content that is available at very few, if any, other sites.
I don't believe in it - that is my OPINION ;)
P.S. I am not saying that anyone's website in this thread is a data dump or that it should not be crawled completely by Google.
Another feature of my site is cemetery transcriptions. I have 50,000 individuals listed, each with a photo of the headstone. I also have 10,000 obituaries, 100,000 census records, 20,000 marriages, 5,000 alumni of colleges and universities, 500,000 military records, etc. etc. I also have links to 100,000 sites.
So when you imply that Google shouldn't deep crawl sites with so many pages because they are somehow junk, I take offense to that. I'm a lousy HTML writer and I have self-taught what little I know but Google deep crawls my site because I do, in fact, have a lot of information that is useful. I'd like to have ten times as much data and will, god willing, in a few years.
I'm not somehow saying that I have a great site but I still think the deep crawl is deserved.
First: Read my message above - no, I mean read it!
Second: Do not sticky me, say what you want to in public!
Third: I have NOT called your site or any other site SPAM, please feel free to quote me when you find those words in a message in this thread.
You said <<So when you imply that Google shouldn't deep crawl sites with so many pages because they are somehow junk, I take offense to that.>>
I never called a site "junk" that has a large # of pages, again - please feel free to quote me when you find those words in a message in this thread.
I think you need to read this entire thread (maybe print it out) before making accusations and putting words in my mouth or posts that simply are not there.
Have a great day ;)
I'm getting a little burned out with watching Miss Prima Donna Googlebot. I seem to please almost every search engine out there but them....
I have a good mind to delete my Google toolbar!
Ann
I agree the system is not perfect (it is, as you point out, biased against new sites), but what are the alternatives? Strong AI would be needed to distinguish "good" from "bad" intrinsically, and what extrinsic measures are there except link pop?
Danny.