Forum Moderators: open
So my questions are:
How long before at least those 500 or so pages get into the index?
and
Is there anything else I can do, besides trying to get more links, to entice Google to deep-crawl the rest of the site?
Thanks
One last question... I recently heard something about adding some sort of index, follow code to the robots.txt...
Currently my robots.txt file looks like this:
User-agent: *
Disallow:
Is that robots.txt OK, or is there anything I should add that might help make Google crawl and index the whole site a little more?
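For what it's worth, that robots.txt is as permissive as it gets: an empty Disallow for all user agents blocks nothing. A quick sanity check with Python's standard-library `urllib.robotparser` (the `example.com` URL here is just a placeholder):

```python
from urllib.robotparser import RobotFileParser

rules = RobotFileParser()
# parse() accepts the robots.txt rules as an iterable of lines,
# so we can feed in exactly the two lines quoted above.
rules.parse([
    "User-agent: *",
    "Disallow:",
])

# An empty Disallow means nothing is blocked, Googlebot included.
print(rules.can_fetch("Googlebot", "http://example.com/any/page.html"))  # True
```

So there is nothing to "fix" in that file; there is no "index, follow" directive for robots.txt (that belongs in a meta robots tag, and it is the default behavior anyway).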
Also: Google has been to even more pages since I posted this, and seems to come by about every day, but no files are showing in the index yet. How long before the pages it did crawl will be in the index? Will I have to wait for another update?
Cheers.
1. several nice sitemaps since it's a large site (no larger than 100K since Google stops indexing after that)
Actually, this should be "no more than 100 links per page, since Google may not crawl more than that."
The 100K limitation is also correct, but I don't think that's the operative limitation in this discussion.
Incidentally, I've seen Google crawl more than 100 links, but GoogleGuy has mentioned that 100 is generally the limitation. (Anyone have the thread he mentioned this on?)
The quick answer to your question is inbound links and PageRank.
It's also helpful to prioritize, and to structure your site and your site maps so that the pages that really need the PageRank get it first.
Take a look at:
Search Engine Theme Pyramids and Google
Optimising the Pyramid for PageRank
[webmasterworld.com...]
Also, on a really large site, perhaps not all pages are worth getting crawled, and you might want to think about that as you make use of what PageRank you have.
Also remember that any recommendation Google gives will have a safety margin built in. There has been some discussion about Google diluting PR even further somewhere around 200 links, but it still follows links even out to several hundred.
As for the 100K limit, that is a hard limit on how much of your page Google will cache (if it is HTML; they go larger for other formats).
Google absolutely *does* follow links that are after the 100k point in a file. It follows them, it passes PR and it counts them as backlinks.
Googlebot reads the whole file, not just the first 100K. Links are extracted at this early stage; indexing and caching happen later, and both currently appear to me to stop at 100K.
How strict is the 100 links per page concept?
[webmasterworld.com...]
The limit to be more aware of is the 101K limit on pages. 100 links is just a guideline that helps encourage you to keep pages under 101K. You can put 150 or 200 links on a page with no problem, but keep in mind what an earlier poster mentioned: if you have several hundred links on a page, you might want to take a step back and ask whether that's the best thing for users. There could be a way to rework the links so that they make more sense for both users and search engines (e.g. break links down chronologically, into alphabetical chunks, by topic, etc.).
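Both limits are easy to check for yourself before worrying about them. A rough sketch using only Python's standard-library HTML parser (the sample HTML is fabricated for the demo):

```python
from html.parser import HTMLParser

class LinkCounter(HTMLParser):
    """Count <a href> links so a page can be checked against the
    ~100-link guideline discussed above."""
    def __init__(self):
        super().__init__()
        self.links = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.links += 1

# A fabricated page with 150 links, just to exercise the counter.
html = "<html><body>" + '<a href="/p1">x</a>' * 150 + "</body></html>"

counter = LinkCounter()
counter.feed(html)
print(counter.links)                      # 150 -> over the 100-link guideline
print(len(html.encode()) > 101 * 1024)    # size check against the 101K limit
```

A page can easily blow past 100 links while staying well under 101K, which is why the two limits get conflated in these discussions even though they are separate.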
So now the truth is revealed! It is funny how people reacted to Google's "100 links per page" message when it was first published in the guidelines a year ago: many started dividing their single links page into several pages, or into a directory-style set of many pages, as we still see today. At the beginning I used to argue with many link partners, only to find the wave was too strong, so it was better to get on the same boat.
I recall BigDave insisting many times in other old threads that G does follow several hundred links, and this proves you're right.
I just know for a fact that Google follows at least 1,200 links on a page, and that it follows links located around 140K into a file; a link at 140K will even show up as a backlink.
While they do follow all the links, it is unknown how having all those links on one page might impact your ranking.
Since Googlers seem to think that pages with over 100 links are less appealing to visitors, it is quite possible that the number of links on a page is one of the "more than 100 factors" considered when ranking a page.
My home page has over 160 links, and it does not get that much Google traffic. I have no idea whether the two are related, since the home page is not optimized for anything other than the site name. But you go against Google's good advice at your own risk.