Forum Moderators: open


Does Google examine whole pages, or just top 120KB?


pawel

9:06 am on Jun 2, 2003 (gmt 0)

10+ Year Member



I recently read on www.searchengineworld.com that Googlebot reads only the first 120KB of each page it visits.
Is this true? If I have, say, some KB-heavy images at the top of my page, will they count?

takagi

10:21 am on Jun 2, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The limit is not 120KB but 101KB (unless they recently changed it). For a PDF the limit could be slightly higher. It is, however, not the total of all the bytes needed to display your page; it's just the HTML file. Usually the images are not embedded, so they are not part of the HTML file.
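As a rough sanity check (a sketch only; the 101KB figure and the binary-KB interpretation are assumptions taken from this thread, not anything Google has published), you can measure how many bytes of a page's raw HTML file would fall past such a cap:

```python
import os

LIMIT = 101 * 1024  # the ~101KB cap discussed above (binary KB assumed)

def bytes_past_cap(path):
    """Return how many bytes of the file exceed the assumed cap (0 if none)."""
    return max(0, os.path.getsize(path) - LIMIT)
```

Anything this returns above zero is HTML that a crawler with such a cap might never see. Note it only measures the HTML file itself, not images or other resources the page references.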

pawel

2:36 pm on Jun 2, 2003 (gmt 0)

10+ Year Member



Thanks.
What about JavaScript? You wrote that only the .html file is examined. Does that mean I should put all scripts in separate .js files?

takagi

2:56 pm on Jun 2, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Google supports not only HTML but also many other file types [google.com] (see also the Does G index .txt files? [webmasterworld.com] thread). However, JavaScript files are not indexed. So if your HTML file is longer than 101KB because of the JavaScript in it, put the JavaScript in a separate '.js' file (or files).

Especially if you have a lot of common functions used on several pages of your site, put those functions in a separate '.js' file. This will decrease download time for the user, save bandwidth for your site, and make the code easier to maintain. In rare cases, search engines have problems finding the beginning and end of your JavaScript and therefore index the code or miss text and links. That won't happen with the JavaScript in an include file.
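Concretely, the move is from an inline script block to an external reference (the filename here is made up):

```html
<!-- Before: inline script, counted in the HTML file's byte size -->
<script type="text/javascript">
function trackClick(id) { /* ... */ }
</script>

<!-- After: one external file, cached by the browser and shared by all pages -->
<script type="text/javascript" src="common.js"></script>
```

Every page that references the same file lets the browser reuse its cached copy, which is where the bandwidth and download-time savings come from.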

lorax

4:00 pm on Jun 2, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



At Pubcon, the Google rep told us that Google is working hard to enable the bots to follow a URI even if it's wrapped in JS.

That suggests to me that the bots will parse through JavaScript. Will it be counted as part of the 101KB? I would think so.

takagi

4:39 pm on Jun 2, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I wasn't there, but I have read in this forum that Google wants to do more with JavaScript. It would surprise me if pages that now include '.js' files and sit just below the 101KB limit were not completely indexed once Google starts processing include files.

For now, the '.js' files are ignored. And even if that changes in the future, you can still use robots.txt to prevent Google from spidering the JavaScript files.
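For example, a robots.txt along these lines (the /js/ path is hypothetical) would ask Googlebot to stay out of a script directory:

```
# Hypothetical robots.txt fragment: keep Googlebot out of the script directory
User-agent: Googlebot
Disallow: /js/
```

This only stops the files from being spidered as separate URLs; it has no effect on script that is embedded directly in the HTML.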

lorax

4:51 pm on Jun 2, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>> For now, the '.js' files are ignored. And even if that changes in the future, you can still use robots.txt to prevent Google from spidering the JavaScript files.

True. But if you use an include, the scripts are part of the HTML file and will be spidered.

I don't know how much of an impact - if any - this will have. Just something to consider for the future. ;)

DaveN

10:23 pm on Jun 3, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I thought we disproved the 101k threshold; I could be wrong.

Dave

Added: found this [webmasterworld.com]

Hardwood Guy

10:32 pm on Jun 3, 2003 (gmt 0)

10+ Year Member



120K? Pretty heavy. If you have some heavyweight images, I would think it best to place them near the bottom of the page if possible, so the page has some content to load first. Otherwise I could see a loss of potential traffic.

futureX

2:48 am on Jun 4, 2003 (gmt 0)

10+ Year Member



Google reads the entire page, but only caches 101KB :)

rfgdxm1

4:16 am on Jun 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>google reads the entire page, but only caches 101kb

Looks to me like no. I have some large text files on a site, and searches for unique text strings beyond 101k don't show up on Google.
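That kind of test can be reproduced by checking the byte offset at which a unique marker string appears in the raw file (a sketch only; the 101KB boundary is this thread's assumption):

```python
LIMIT = 101 * 1024  # assumed indexing cap from this thread

def marker_offset(path, marker):
    """Byte offset of `marker` in the file, or -1 if it is absent."""
    with open(path, "rb") as f:
        data = f.read()
    return data.find(marker.encode("utf-8"))

def past_cap(path, marker):
    """True if the marker starts at or beyond the assumed cap."""
    return marker_offset(path, marker) >= LIMIT
```

Plant unique strings on both sides of the boundary, then search for each: if only the strings with `past_cap(...) == False` are findable, the cap is real.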