Forum Moderators: Robert Charlton & goodroi


How far down a page does Gbot crawl?


tigger

11:43 am on Apr 8, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Is there a limit to page size, either in words or in kilobytes? I presume PR doesn't make any difference other than how often the page gets crawled.

It's just that I'm adding some content onto the bottom of pages and I want to make sure this content is seen.

Cheers

Pjman

12:12 pm on Apr 8, 2011 (gmt 0)

10+ Year Member Top Contributors Of The Month



I have seen no clear explanation of this anywhere.

In the age of spyware and malware, you would have to think that Gbot always crawls the entire page for any page it intends to return as a result.

incrediBILL

12:18 pm on Apr 8, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I don't think Google has much in the way of limits these days - if there are any at all, they're nothing like back in the beginning.

tigger

12:58 pm on Apr 8, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have seen no clear explanation of this anywhere.


Cheers Pjman - I've been hunting also and can't find any guidelines on this

I don't think Google has much in the way of limits these days - if there are any at all, they're nothing like back in the beginning.


Thanks Bill ....looks like we are all in the dark on this, and all we can do is assume Gbot fully crawls a page.

Leosghost

1:49 pm on Apr 8, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It used to be said to be about the first 100 kb, "back in the day". But I've seen pages crawled that were near 10 megs, and results returned from things right at the bottom end of those 10 megs. So nowadays, who knows outside of the 'plex?

Planet13

2:47 pm on Apr 8, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have absolutely no proof whatsoever, but I thought that gbot would crawl all the data of each page it does crawl.

The only caveat is that if your pages had low PR, they wouldn't get crawled that often. Conversely, if a page had higher PR, it would get crawled more frequently.

Anyway, that is what I have always heard.

tigger

4:45 pm on Apr 8, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks for your feedback, guys.

aristotle

6:24 pm on Apr 8, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The very bottom of the page is where black hats sometimes put hidden text and hidden links. So to make sure that a page is clean, Google needs to crawl all of it.

engine

7:33 pm on Apr 8, 2011 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Why don't you find a big (long) page in Google's SERPs and see how much of it you can find by searching for terms from down the page?
Do that with a test sample of, say, ten pages on different sites and it should give you a good idea.
Try to pick pages that appear to be 3-6 months or more old.
Otherwise, create those pages yourself and see what gets indexed.
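One way to run this test at scale is to script the phrase extraction: pull the visible text nearest the bottom of a page and turn it into a quoted search query. A minimal stdlib-only sketch - the word count and the HTML handling are illustrative assumptions, not anything Google documents:

```python
# Extract a distinctive phrase from near the bottom of a page's HTML,
# suitable for a quoted Google search to check how deep indexing goes.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping script/style blocks."""
    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())

def deep_phrase(html, words=8):
    """Return the last `words` words of visible text - a candidate
    quoted-search query for testing whether deep content was indexed."""
    parser = TextExtractor()
    parser.feed(html)
    all_words = " ".join(parser.chunks).split()
    return " ".join(all_words[-words:])
```

Run `deep_phrase()` over your sample pages, then search Google for each result in quotes; if the page ranks for its own bottom-of-page phrase, that content was crawled and indexed.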

HTH

TheMadScientist

7:38 pm on Apr 8, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If your goal is to organize the world's information, why would you NOT grab the whole page? What advantage would you gain? How would you know what information you are missing?

It makes no sense to me for them to stop early...
They can't organize what they do not have.

g1smd

7:43 pm on Apr 8, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Many years ago Google indexed only the first 100 kb from each page and they also showed the filesize in green text in the SERPs.

Just before they stopped showing the filesize in the SERPs, they were reporting file sizes way in excess of 100 kb, almost up to 1 Mb - and indexing the whole page.
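For anyone wanting to check where their own bottom-of-page content sits relative to that old ~100 kb figure, a rough sketch (the threshold is just the historical number mentioned above, not a current limit):

```python
# How many kilobytes of a document precede a given marker string?
# Under the old ~100 kb rule, content past that point risked being
# left out of the index.
def bytes_before(html, marker):
    """Return the number of kilobytes of HTML preceding `marker`."""
    pos = html.encode("utf-8").find(marker.encode("utf-8"))
    if pos < 0:
        raise ValueError("marker not found in document")
    return pos / 1024
```

Feed it your page source and a snippet of the content you appended at the bottom; if the result is well past 100, that content would have been at risk under the old limit.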