Why is it so slow? It indexes like 100 new URLs every few weeks. It starts feeling pointless to add new content at this point when it is this far behind. And yes, all those pages are actual original content and not "fluff" pages.
What am I supposed to do?
With regard to the discrepancy: if site: shows 2000+ URLs, then most likely that's at least the total Google has indexed, and it may be more. These reporting functions are often flakey.
Yahoo seems to take a little longer to find a site, but once found gets all of it indexed a lot quicker, in just a few steps.
Live takes forever to find a new site, but then grabs all of it within a matter of just a few days.
That's my experience with sites from a few dozen to many hundreds of pages.
I'm not sure your question was answered. Tedster says these reports are often flakey, which I agree is the case. However I am seeing a similar disconnect (actually, a much larger one) between what Webmaster Tools says, and what I know to be the case.
At the moment, I have a customer that is doing a massive URL change on the site. We've managed this very carefully, with all the appropriate 301 redirects.
Google Webmaster Tools is reporting that 140 pages have been indexed. A site: query reports that 20,000+ pages have been indexed. This number includes both "old" and "new" URLs. Google has crawled the site extensively, and is gradually digesting the new URL structure, and we can see that they've clearly indexed over 5,000 of the new URLs.
Since this is such a massive URL change, we're trying to keep an eye on all metrics to make sure it's progressing normally. The 140 number reported in Webmaster Tools, clearly at odds with what we're seeing in the index, is quite disconcerting. While site: queries may not be reliable, one would assume that Google would try to provide reasonably accurate numbers in Webmaster Tools. 140 is not even close to reasonably accurate. And it has not changed in days, so it's not a matter of a small lag.
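For anyone managing a similar URL change, here is a rough sketch of how the 301 redirects could be spot-checked from a list of old-to-new URL pairs. It is only an illustration, not what was actually run on this site: the names `fetch_head` and `check_redirects` and the example.com URLs are made up for the example, and it assumes each old URL should answer with a single 301 whose Location header is the exact new URL.

```python
import http.client
from urllib.parse import urlsplit


def fetch_head(url):
    """Send a HEAD request without following redirects.

    Returns (status_code, location_header_or_None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def check_redirects(url_map, fetch=fetch_head):
    """Return a list of (old_url, problem) for redirects that look wrong.

    url_map maps each old URL to the new URL it should 301 to.
    Assumption: the server sends an absolute Location header; a relative
    one would be reported as a mismatch by this simple comparison.
    """
    problems = []
    for old_url, new_url in url_map.items():
        status, location = fetch(old_url)
        if status != 301:
            problems.append((old_url, f"expected 301, got {status}"))
        elif location != new_url:
            problems.append((old_url, f"redirects to {location}, expected {new_url}"))
    return problems
```

Calling `check_redirects({"http://www.example.com/old-page": "http://www.example.com/new-page"})` would hit the live server; passing a different `fetch` function makes it possible to try the logic against recorded responses first.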
I would love to hear of anyone else that is seeing similar disconnects between what Webmaster Tools is reporting, and what is, in fact, happening in the Google index.
I do not want to hijack ATWeb's thread, though it seems to have been dormant for a while, and I thought my case might shed further light on the question of how reliable the "Total URLS:" vs. "Indexed URLs:" stats reported by Webmaster Tools might be.
Again, I would love to hear other people's experiences with these numbers. Is the 140 number (vs. 5600+ verified by querying the Google index) a signal that something is wrong? Or do people just think it's an unreliable metric?
Note that Webmaster Tools is not reporting any problems with accessing or reading the sitemap.
Timster, to answer your questions:
1. The site is very spider friendly. Plain text URLs, reasonably optimized link hierarchy, average of far less than 100 links per page. Well cross-linked within the site. Good navigation, breadcrumbs, etc. Good googlebot activity, no unusual crawl errors in Webmaster Tools.
2. We have just been engaged by the client, so have done no real link building ourselves (yet). They have some links, but a grossly inadequate number overall. Still, there are certainly enough links, both to the home page and deep pages, to give them some external pagerank and make them visible to crawlers from off-site pages.
Thanks for any other insights into possible explanations for the strange Webmaster Tools numbers.
"The stats on my Sitemap Details page don't look accurate. Why not?
The stats on the Sitemap Details page are a close approximation of the status of your URLs. However, this figure might not be 100% accurate. Our internal systems are always changing, and the web itself is an ever-shifting ecosystem. In addition, there may be a lag between when the numbers are calculated and when they are visible to webmasters.
We don't guarantee that our system will index all the URLs in a Sitemap. In addition, we don't index images directly (instead, we index the page that contains the image). As a result, direct image URLs in your Sitemap won't be indexed."
Q: If it doesn't get me automatically crawled and indexed, what does a Sitemap do?
A: Sitemaps give information to Google to help us better understand your site. This can include making sure we know about all your URLs, how often and when they're updated, and what their relative importance is. Also, if you submit your Sitemap via Webmaster Tools, we'll show you stats such as how many of your Sitemap's URLs are indexed.
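To make the answer above concrete, here is a minimal sketch of building a Sitemap that carries those hints, using the `lastmod`, `changefreq` and `priority` elements from the sitemaps.org protocol. The `build_sitemap` helper, the example.com URLs and the values are assumptions for illustration only, and nothing here guarantees that the listed URLs will be crawled or indexed.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build a sitemap XML string.

    entries: iterable of dicts with a required 'loc' key and optional
    'lastmod', 'changefreq' and 'priority' keys.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Emit child elements in the order the protocol lists them.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages; the values are hints to the crawler, not commands.
sitemap_xml = build_sitemap([
    {"loc": "https://www.example.com/", "lastmod": "2008-01-15",
     "changefreq": "daily", "priority": "1.0"},
    {"loc": "https://www.example.com/articles/widgets",
     "changefreq": "monthly", "priority": "0.5"},
])
```

The resulting string can be saved as sitemap.xml and submitted through Webmaster Tools, which is where the "Total URLs" and "Indexed URLs" figures discussed in this thread come from.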
Well, it's all relative. The bottom line is that while the site is well structured and has good "on site optimization", it's not yet ranking well, and the key to solving that is getting more links. More good quality links, in particular. To be even more precise, links from "trusted", "authoritative" sites within their domain. An ideal link is from a high quality, relevant page on a "trusted"*, relevant site with anchor text that matches your desired keywords. Sometimes you can't get the ideal link, but you should strive for something as close to that as possible.
*A site that has, itself, established credibility with Google by having good content and good links pointing at it.
So in a nutshell, they don't have enough of those to rank well for our desired keywords. Our job is to find a way to help them get those (organically, not purchasing them).
Hope that helps,
I sometimes feel the total number of results shown on page 1 of Google is somewhat like an estimate that gets more accurate as you go down the pages.
So sometimes it appears to be a lot, but when you reach the end there are not as many as you thought.
Aside from that... when you reach the last page, results that are too similar in content are sometimes collapsed there, and if you expand them they show up as indented results.