
Spidering text below footer?


luckychucky

3:07 pm on Oct 13, 2003 (gmt 0)

10+ Year Member



(Very first post here. I'm not a webmaster, but a site owner who hires 'em.)
We've got a lot of keyword-rich text at the bottom of the page, which Google apparently isn't seeing, because it doesn't turn up in their cached copy of the page. The text displays just fine when you hit the site and in browser searches, but in the cached page everything cuts off neatly at the footer. Our site is still quite new, and maybe G stops at a certain limit on a quick, superficial spider-by. If that's so, hopefully we'll get a full deep spidering soon. But I wonder if it has to do with the fact that all the text sits below the "footer" in the source code. Does G stop at the footer and go no further?

AWildman

4:45 pm on Oct 13, 2003 (gmt 0)

10+ Year Member



Two things come to mind. It could be that you have long pages and that the SE stops reading after some number of characters, so the text below your footer never gets seen. Or you have some bad HTML, such as an unclosed tag in the footer, or the closing </body> tag sitting before the text that follows your footer for some reason. Either could keep that text from being shown.
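
As a rough way to test the second possibility, here is a minimal Python sketch that reports any visible text appearing after the closing </body> tag. The names AfterBodyChecker and text_after_body are made up for illustration, and nothing here says how any search engine actually treats such text; it only shows whether the markup has that shape.

```python
from html.parser import HTMLParser


class AfterBodyChecker(HTMLParser):
    """Collects visible text that appears after the closing </body> tag."""

    def __init__(self):
        super().__init__()
        self.body_closed = False
        self.stray_text = []

    def handle_endtag(self, tag):
        # Remember that the body has been closed.
        if tag == "body":
            self.body_closed = True

    def handle_data(self, data):
        # Any non-blank text seen after </body> is suspicious.
        if self.body_closed and data.strip():
            self.stray_text.append(data.strip())


def text_after_body(html):
    """Return the list of text fragments found after </body>."""
    checker = AfterBodyChecker()
    checker.feed(html)
    checker.close()
    return checker.stray_text
```

An empty list means no text was found after </body>; anything else is a candidate for the content that goes missing.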

martinibuster

5:00 pm on Oct 13, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



The bottom of the page is the least important part of your webpage. Anything you put there is regarded as not important. Slapping a bunch of keywords down at the bottom won't help you.

Worst Case Outlandish Paranoiac Scenario: The SE may think the keywords listed at the bottom are not relevant for your site at all, and hurt your rankings.

Best Case Scenario: For anything outside of geo-related keywords, it won't help you.

Also, there's a 100k limit to how much code a spider will swallow.

whiterabbit

6:07 pm on Oct 13, 2003 (gmt 0)

10+ Year Member



>> Also, there's a 100k limit to how much code a spider will swallow

I have a 496K page that is spidered every two days, with the links at the bottom being followed just as regularly as the top ones, so I'm not sure this 100K limit is relevant anymore (it's not in my case).

luckychucky

6:27 pm on Oct 13, 2003 (gmt 0)

10+ Year Member



A couple of details:
Site size is well under 100K. Page-bottom text has lots of keywords, but carefully balanced, in reasonable ratios. We've put the text at page bottom because we've aimed to have a page with 12 thumbnail-sized .jpg images (my product's very visual). If I understand things correctly, the image files are skipped over, so the page-bottom text should essentially be the main body. Again the issue is that Google's cached page always cuts off neatly right at the footer's official end and goes no further, even after a few spiderings thus far.

hutcheson

6:35 pm on Oct 13, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It also depends on how artificial the keyword stuffing is. There are statistical formulas for distributions of real text. It is easy enough to algorithmically detect that a given sample text diverges widely from the norm, and to selectively penalize it. Even the old second-generation search engines did that. Google probably does it in a more sophisticated way.

If you're thinking "get keywords in here", you will most likely fall into an artificial pattern. If you are thinking "add information about my main topic", you probably won't.
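
To get a feel for what "diverges from the norm" might mean, here is a crude Python sketch that measures what share of a text's longer words is taken up by its most frequent terms. This is only an illustration of the general idea; it is not Google's method or anyone else's, and the function name top_term_shares and the min_len cutoff are assumptions made for the example.

```python
import re
from collections import Counter


def top_term_shares(text, n=3, min_len=4):
    """Return the n most frequent words (of at least min_len letters)
    with the fraction of all such words that each one accounts for."""
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if len(w) >= min_len]
    if not words:
        return []
    counts = Counter(words)
    total = len(words)
    return [(term, count / total) for term, count in counts.most_common(n)]
```

Running it on a page's bottom text and on ordinary prose about the same topic gives two sets of numbers to compare; a single term taking up a very large share is the kind of artificial pattern described above.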

luckychucky

6:49 pm on Oct 13, 2003 (gmt 0)

10+ Year Member



Like I said, it's nicely balanced, not stuffed, it really is IMHO. And again, that idea doesn't really explain for me why all spidering apparently cuts off exactly where "footer" ends in the source code.

Tomas

8:19 pm on Oct 13, 2003 (gmt 0)

10+ Year Member



>> Site size is well under 100K.

We are not talking about web site size; that could be 100MB and it wouldn't matter. We're talking about the size of one page, meaning the page's HTML source including its text. Basically, it is your HTML file size, without images, external style sheets and scripts.
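
A minimal Python sketch of that measurement, assuming you have the page source as a string. The 100 KB threshold is simply the figure quoted earlier in this thread, not a documented limit, and html_size_report is a hypothetical helper name.

```python
# The "100k" figure mentioned in this thread; treat it as an assumption.
LIMIT_BYTES = 100 * 1024


def html_size_report(html, limit=LIMIT_BYTES):
    """Measure the HTML source alone (no images, external CSS or scripts)."""
    size = len(html.encode("utf-8"))
    return {"bytes": size, "over_limit": size > limit}
```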

I suspect a coding error. Check the source of your cached page in Google and you'll see where it ends.
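
If comparing by eye is tedious, a small Python sketch like the following can show the first stretch of live-page text that the cached copy lacks. It assumes you have already copied the plain text of both versions into strings (with any banner the cache adds removed); first_missing_text is a made-up name.

```python
def first_missing_text(live_text, cached_text, context=40):
    """Return the start of the live text that follows the point where
    the cached text stops matching it (empty string if nothing differs)."""
    # Collapse whitespace so line breaks and indentation do not matter.
    live = " ".join(live_text.split())
    cached = " ".join(cached_text.split())
    i = 0
    limit = min(len(live), len(cached))
    while i < limit and live[i] == cached[i]:
        i += 1
    return live[i:i + context].strip()
```

Whatever it returns is the place to start looking in the source for the markup that breaks things.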

DerekH

11:15 pm on Oct 13, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Tomas wrote
>> suspect coding error... Check source of your cached page in Google and you'll see where it ends.

Better still, take your HTML and CSS over to the W3C validation services and check them out. You'll find them in Google (where else!)

And when it suggests things are wrong, it really does pay to fix them - trust me, it does!
DerekH

DerekH

11:20 pm on Oct 13, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



luckychucky keeps mentioning "footer"

Such a concept isn't an HTML concept.

It's easy to confuse page content with page layout.

Google indexes page content. Not page layout.

Page layout - whether via HTML tables or CSS - isn't page content.

I suspect that the "footer" - whatever this means - is somehow malformed HTML, or that somewhere in the file there is HTML that is not valid, and what follows it is lost.
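
As a quick local check before going to a full validator, here is a rough Python sketch that flags tags which are closed out of order or never closed. It is deliberately stricter than HTML itself (it expects an explicit close for every non-void element, even where the standard lets you omit one), so treat its output as hints rather than a verdict. TagBalanceChecker and unbalanced_tags are illustrative names.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "param", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Tracks open tags on a stack and records mismatched closes."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.problems.append(f"unexpected </{tag}>")


def unbalanced_tags(html):
    """Return mismatched closing tags followed by tags left open."""
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    return checker.problems + [f"unclosed <{t}>" for t in checker.open_tags]
```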

DerekH

nativenewyorker

2:37 am on Oct 14, 2003 (gmt 0)

10+ Year Member



luckychucky,

Welcome to WebmasterWorld.

From what you have described, it appears that you may have a footer that is attached to the page as a separate file. If that is the case and your main page references the footer as a script, it will not be indexed. Google will not crawl JavaScript files.
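
One way to check whether that is what's happening: a short Python sketch that lists the external script files a page references and notes whether any inline script calls document.write. It only inspects the markup and says nothing about what a crawler does with it; ScriptFinder and find_scripts are names invented for this example.

```python
from html.parser import HTMLParser


class ScriptFinder(HTMLParser):
    """Records external script URLs and inline document.write usage."""

    def __init__(self):
        super().__init__()
        self.external = []
        self.inline_write = False
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            src = dict(attrs).get("src")
            if src:
                self.external.append(src)

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Inline script bodies arrive here as plain data.
        if self._in_script and "document.write" in data:
            self.inline_write = True


def find_scripts(html):
    """Return (list of external script URLs, True if document.write is used inline)."""
    finder = ScriptFinder()
    finder.feed(html)
    finder.close()
    return finder.external, finder.inline_write
```

If the text below the footer does not appear in the raw HTML source but a script shows up here, the script is the likely place it comes from.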

Ted

Rick_M

2:42 am on Oct 14, 2003 (gmt 0)

10+ Year Member



I am wondering if it is JavaScript code as well.