I need some clarification regarding this rule and the general recommendation for Crawler.html files with more than 100 links.
Thanks so much.
Google will only index the first 100 KB of your page and truncates everything after that point.
You may want to stay safe, and consider splitting up your links page into smaller chunks of information.
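If you do split the links page, here is a rough Python sketch of one way to do it. It assumes you already have your URLs in a list; the `example.com` URLs, the function names, and the 100-link cutoff are placeholders for illustration, not anything the search engines publish.

```python
from html import escape

def chunk_links(links, max_links=100):
    """Split a list of URLs into consecutive groups of at most max_links."""
    return [links[i:i + max_links] for i in range(0, len(links), max_links)]

def render_page(links):
    """Render one flat HTML page that contains only the given links."""
    items = "\n".join(
        f'<a href="{escape(url, quote=True)}">{escape(url)}</a><br>'
        for url in links
    )
    return f"<html><body>\n{items}\n</body></html>\n"

# Hypothetical input: 250 URLs on a made-up site.
links = [f"http://www.example.com/page{n}.html" for n in range(250)]

# One HTML string per chunk; write each to its own file as needed.
pages = [render_page(chunk) for chunk in chunk_links(links)]
```

Each entry in `pages` can then be saved as its own crawler file, with the files linked to one another so a spider can reach all of them.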
I take it crawler.html means a sitemap. I think Google can index up to 500 KB per page, or at least it did a while back. As for having more than 100 links, I doubt it will matter; Google will index them (provided everything else is fine, of course).
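To see where a given page stands against the figures mentioned in this thread, a small Python sketch like the following counts the links and measures the size. The `LinkCounter` class and `page_stats` function are made-up names, and the sample markup is only a stand-in for a real crawler.html.

```python
from html.parser import HTMLParser

class LinkCounter(HTMLParser):
    """Counts <a> tags that carry an href attribute."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.count += 1

def page_stats(html_text):
    """Return (number of links, size in KB when encoded as UTF-8)."""
    parser = LinkCounter()
    parser.feed(html_text)
    size_kb = len(html_text.encode("utf-8")) / 1024
    return parser.count, size_kb

# Stand-in markup; in practice, read the contents of your own file.
sample = '<html><body><a href="/a.html">A</a><a href="/b.html">B</a></body></html>'
link_count, size_kb = page_stats(sample)
```

You can compare `link_count` and `size_kb` against whichever limits you decide to trust; the 100 KB and 500 KB numbers above are posters' recollections, not confirmed values.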
I have a few more questions.
I assume all search engines treat these crawler pages the same way. Am I right in assuming there is no penalty from the big three (Yahoo, MSN, Google)?
Is it recommended to have a link to the crawler.html file (from, say, the home page)? Currently we link to our site map from our homepage, and our site map follows our site standards in terms of layout, color, font, etc. However, the crawler.html is just a flat html file with no formatting, just links. Ideally, we do not want people to visit this file (as we believe its purpose is more to help search engines dive deeper into the site).
If the crawler.html is not linked to by any other page, then am I right in assuming that it must be manually submitted to search engines?
Any help would be greatly appreciated. Thanks in advance.