Home / Forums Index / Code, Content, and Presentation / HTML
Forum Library, Charter, Moderators: incrediBILL

HTML Forum

How Many Characters Count in HTML?
Spiders Count How Many Characters, Words

10+ Year Member

Msg#: 10867 posted 6:08 pm on Aug 22, 2005 (gmt 0)

Hi WebmasterWorld,

How many characters (or words) do spiders look at in HTML code? I have heard that the most significant spots are at the beginning and end of the HTML. I have also heard there should be between 200 and 800 words in total. Can anyone give more specifics on the most important parts? Where does the spider start? For instance, would <HTML> count as 6 characters?




5+ Year Member

Msg#: 10867 posted 1:58 am on Aug 23, 2005 (gmt 0)

Perhaps you're thinking of the keywords <meta> tag, which is limited to 250 words, I believe. That's a max of 250 keywords you can proactively use to tell Google "this is what my site is about". But if you have 3,000 words in your page, and one of them is "doodledorf", and I search for that word, Google will show your site as a result.


WebmasterWorld Senior Member 10+ Year Member

Msg#: 10867 posted 2:15 am on Aug 23, 2005 (gmt 0)

I've heard that the amount of human-readable content on a page should be around the limits you specified. If that's what you're talking about, the words you need to count are the ones that are actually visible on the front end of the page, in the browser. If I were keeping track of such things, I'd exclude navigation from the count. I'd also try to put all navigation and such as far down in the source code as possible, with ALL other page content above it.
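A rough way to get that kind of count is sketched below in Python, using only the standard library's `html.parser`. The names `VisibleTextCounter` and `count_visible_words` are made up for this illustration, and it assumes navigation is wrapped in a `<nav>` element; a page that marks its navigation some other way (a `<div>` with an id, say) would need a different rule.

```python
from html.parser import HTMLParser


class VisibleTextCounter(HTMLParser):
    """Counts words in text that a browser would show in the page body."""

    # Assumption: these elements hold non-content text (code, styles,
    # the document head, and navigation marked up with <nav>).
    SKIP = {"script", "style", "head", "nav"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # >0 while inside a skipped element
        self.words = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <NAV> and <nav> both match.
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only count text that sits outside every skipped element.
        if self._skip_depth == 0:
            self.words += len(data.split())


def count_visible_words(html):
    parser = VisibleTextCounter()
    parser.feed(html)
    parser.close()
    return parser.words
```

Calling `count_visible_words(page_source)` on a page's source gives a number that can be compared against the 200-800 word range mentioned above; markup such as `<HTML>` itself contributes nothing to the count.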

Frankly, though, I don't consider page length that big of an issue. Minimize HTML markup as much as possible by eliminating code bloat, write clean, validating HTML, come up with a good linking structure, use CSS to lay out your page with the "meaty" content coming first in the source, use a <title> that's laser-targeted to the page content, and keep the content itself focused and relevant. Those are the main things.

Purple Martin

WebmasterWorld Senior Member 10+ Year Member

Msg#: 10867 posted 2:23 am on Aug 23, 2005 (gmt 0)

Some of the big search engines used to stop reading a page after 101 KB, but now they read the whole page. Database space is cheap these days.
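For anyone curious how a page measures up against that old cutoff, here is a small Python sketch. It assumes "101kb" means 101 × 1024 bytes of raw source and that the page is UTF-8 encoded; `page_size_bytes` and `exceeds_old_limit` are names invented for this example, not anything a search engine publishes.

```python
# Assumption: the old cutoff is read as 101 KiB of raw page source.
OLD_LIMIT_BYTES = 101 * 1024


def page_size_bytes(html, encoding="utf-8"):
    """Size of the page source in bytes, markup included."""
    return len(html.encode(encoding))


def exceeds_old_limit(html):
    """True if the source is larger than the assumed 101 KB cutoff."""
    return page_size_bytes(html) > OLD_LIMIT_BYTES
```

Measured this way, every character of markup counts toward the total, so `page_size_bytes("<HTML>")` is 6, which answers the original question in the byte-size sense.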


WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

Msg#: 10867 posted 3:26 am on Aug 23, 2005 (gmt 0)

Yes, I agree with PM. What you've heard, Webdude, is a bit out of date - mostly from the days when search engines were focused strongly on matching the on-page text to search queries. Things have shifted a lot today, and the simpler formulas of days gone by are fading from usefulness.

The spider takes in the whole HTML document unless it's really huge. Once the search engine begins processing, it may discount things like a few attributes here and there, but essentially everything gets processed and stored for further number crunching as the algorithms try to determine relevance for different queries.


10+ Year Member

Msg#: 10867 posted 2:30 pm on Aug 23, 2005 (gmt 0)

Thanks everybody for your feedback and bringing me up to speed! ;-)

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved