
Is it true that googlebot reads only the first 100kb of code?


Jessica

10:32 am on Dec 21, 2006 (gmt 0)

10+ Year Member



My index page is 150kb in size, and I just noticed that Google has never spidered the links at the very bottom of the page. The page is PR5 and those links have been there a long time now.

I have no other explanation why googlebot never spidered/indexed them.
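
For what it's worth, one way to check whether those links actually fall past a suspected byte cutoff is to measure how far into the HTML they appear. A minimal sketch in Python, assuming placeholder values for the page URL and the link markup (substitute your own):

```python
# Sketch: report total page size and the byte offset at which a given
# link first appears, to see whether it sits past a suspected cutoff
# (e.g. ~100kb). PAGE_URL and TARGET_HREF are placeholders.
import urllib.request

PAGE_URL = "http://www.example.com/"        # placeholder: your page
TARGET_HREF = 'href="/deep-page.html"'      # placeholder: the missing link's markup

html = urllib.request.urlopen(PAGE_URL).read()
offset = html.find(TARGET_HREF.encode("utf-8"))

print("Total page size: %.1f kb" % (len(html) / 1024.0))
if offset == -1:
    print("That link markup was not found in the HTML source.")
else:
    print("Link first appears at byte %d (%.1f kb into the page)."
          % (offset, offset / 1024.0))
```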

hercules

11:10 am on Dec 22, 2006 (gmt 0)

10+ Year Member



Good question. All I know is from the old days (two years ago): pages with fewer kilobytes are preferable when it comes to Google. Is your site indexed?
If not, try putting the readable text higher in the code of the page,
and/or reduce the total amount of code.

tedster

11:46 am on Dec 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I am working with some pages that are currently ranking extremely well and they all range from 100 to 160kb. When I craft a specific search for content at the bottom of the html, these urls are returned in the results -- so I don't think it's as simple as a cut-off at some set number of bytes.

That said, I am also working with that client to put their code on a diet and reduce the file sizes by 40% or more.

jetteroheller

2:24 pm on Dec 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



How many links are on this page?

Just another consideration: the PR effect of Wikipedia articles.

The possible problem: one PR7 Wikipedia page has more than 400 links, and the link to my site sits at position 250 or so.

Maybe only the first 100 links count, or maybe each link carries less weight when there are more of them.
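
A quick way to see where a link falls in a page's link order is to parse the page and count anchors. A rough sketch using only the standard library; the URL and target domain below are placeholders:

```python
# Sketch: count the anchor tags on a page and report the position(s)
# of links pointing at a given destination. PAGE_URL and TARGET are
# placeholders -- substitute the page and domain you care about.
from html.parser import HTMLParser
import urllib.request

class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

PAGE_URL = "https://en.wikipedia.org/wiki/Main_Page"   # placeholder page
TARGET = "example.com"                                 # placeholder destination

req = urllib.request.Request(PAGE_URL, headers={"User-Agent": "Mozilla/5.0"})
html = urllib.request.urlopen(req).read().decode("utf-8", "replace")

collector = LinkCollector()
collector.feed(html)

positions = [i + 1 for i, h in enumerate(collector.hrefs) if TARGET in h]
print("Total links on the page:", len(collector.hrefs))
print("Your link's position(s):", positions if positions else "not found")
```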

tedster

2:29 pm on Dec 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Good question about links. While I now have solid evidence that Google will follow more than 100 links on a page, there still appears to be an upper limit somewhere.

< edit reason: I originally typed "more than 10 links", but I meant "more than 100 links." >

[edited by: tedster at 7:29 pm (utc) on Dec. 22, 2006]

BigDave

5:44 pm on Dec 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Several years ago, when the 101k limit on cache size was in effect, and Google first mentioned trying to limit your pages to 100 links, I did a test.

The 160th link, at around 130k, was spidered.

So Google picked up the link even though it was past the point where the cache cut off. They also followed it even though it was well past the 100th link.

Of the two possibilities, I would say that an excessively large number of links is more likely to be a problem than file size, unless Google has drastically redesigned how they feed links to the spider.

Another possibility is that the links are malformed in a way that browsers can figure out correctly but that trips up Googlebot.
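
For anyone who wants to repeat that kind of test, a sketch along these lines would generate a padded page with sequentially numbered links to unique URLs; after publishing it, the server logs show which targets Googlebot actually requested. The filename, link count, and padding are arbitrary placeholders:

```python
# Sketch: build a deliberately large test page with many uniquely
# numbered links, so that crawl logs reveal how far down the page
# (and how far into the link list) the spider actually went.
NUM_LINKS = 200                              # placeholder count
PADDING = "<p>" + "x" * 500 + "</p>\n"       # filler to inflate the file size

parts = ["<html><head><title>Link position test</title></head><body>\n"]
for i in range(1, NUM_LINKS + 1):
    parts.append('<a href="/linktest/target-%03d.html">link %d</a><br>\n' % (i, i))
    parts.append(PADDING)
parts.append("</body></html>\n")

page = "".join(parts)
with open("linktest.html", "w") as f:
    f.write(page)

print("Wrote linktest.html: %.1f kb, %d links" % (len(page) / 1024.0, NUM_LINKS))
```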

arnarn

8:10 pm on Dec 22, 2006 (gmt 0)

10+ Year Member




My guess is that file size is a factor, but G's definition of file size might not be your file size (e.g. if they are using a condensed/stripped version of the original file). In that case, we're probably talking about a file "size" that is variable... JAO
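
If Google's "size" really is closer to a stripped-down version of the page, a rough comparison like the following sketch could be informative. The filename is a placeholder for a saved copy of your page, and the regex-based stripping is only a crude approximation of whatever condensing Google might do:

```python
# Sketch: compare the raw HTML size with the size of a roughly
# stripped-down version (comments and tags removed, whitespace
# collapsed). FILENAME is a placeholder.
import re

FILENAME = "index.html"   # placeholder: a saved copy of the page

with open(FILENAME, "r", encoding="utf-8", errors="replace") as f:
    raw = f.read()

no_comments = re.sub(r"<!--.*?-->", "", raw, flags=re.S)
no_tags = re.sub(r"<[^>]+>", " ", no_comments)
text_only = re.sub(r"\s+", " ", no_tags).strip()

print("Raw HTML:  %.1f kb" % (len(raw) / 1024.0))
print("Text only: %.1f kb" % (len(text_only) / 1024.0))
```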