Forum Moderators: Robert Charlton & goodroi
Google is only indexing a very small portion of our site...
Matt Cutts is claiming that BD is over... The reason websites are having trouble with supplemental results and not being indexed is due to spam...
Reading WebmasterWorld for the past few weeks, everyone has claimed everything is Google's fault: just be patient with your site, Google needs to recover...
What is your opinion now? Does not being fully indexed mean that our site has duplicate content or some other spam issue?
This random up and down, here today and gone tomorrow in Google's new game, is giving me a king-size Google headache!
What important issues are you getting out of Matt Cutts blog?
To be honest, he is confusing me more...
I am trying to see if the pages that G is indexing differ (as far as SEO goes) from the rest of the pages, and I can't figure it out. I have followed all the guidelines on all of my pages.
My question: has anyone recently (after April 26) made a change that increased their pages indexed on Google?
You can have any number of technical issues that don't seem to affect your site for a long time, and then all of a sudden they do. Part of this current moment with Google may be (for some sites) that Big Daddy finally allowed them to handle more data, compare more data on the fly, and so on. This could even mean the site trips some spam filter that it previously avoided.
Sometimes there may be a dirty trick played by a competitor who notices a liability with your urls, and they begin to point troubled links at your site -- eventually they get spidered and your apparently trouble-free site starts to show trouble. Or you make what seems like a small change, but that change opens up a liability to googlebot that it never had access to before.
So it is important for the webmaster to do some due diligence, assuming there "may" be a problem they can address. These threads go into a lot of the detail that is worth checking:
Checklist for Sudden Drops in Rank [webmasterworld.com]
Dropped from Google - a checklist to find out why [webmasterworld.com]
Dropped Site Checklist [webmasterworld.com]
The url-only problem [webmasterworld.com]
All that said, I am seeing sites vanish from the index that make no sense to me -- for example, a PR7 home page, ten years on line, and over 100,000 very natural backlinks, most of them to deep pages on an extremely authoritative information site.
So I do think we have a mix of "fault" going on.
Just now I had an odd experience with G's search page. I was in classic search and I clicked to switch to personalized search. The result was this message: "We're sorry... but your query looks similar to automated requests from a computer virus or spyware application. To protect our users, we can't process your request right now."
That rather astonished me, and I tried again and got the same result. To repeat, I was logged in to Google and all I was doing was switching to my personalized search page.
This reinforces my feeling that their paranoia about spam has caused them to ratchet their filters up way too high.
I do agree with this statement. It is just sad that we don't know which is which at this point. How do you fix what is broken when you don't know what is broken? Why attempt to mess around and fix a site when it may NOT be broken in the first place? That could cause further problems and even destroy rankings in other engines that are having no problems with these same sites.
Most of us are just spinning in circles. We have 0, none, zilch, NADA feedback or even a clue as to what is going on, especially looking at some of the spam crap that shows up in the results. If there is a problem, most webmasters here have yet to pinpoint it. I don't think Google even knows.
Make sure that every page of the site has a unique title and meta description.
Make sure that every page of the site links back to "/" and to the main section indexes.
Make sure that all domain.com accesses are redirected to the same page in the www.domain.com version of the site.
If you have multiple domains, then use a 301 redirect on those such that only one domain is indexed.
If you have pages that say to bots "Error. You Are Not Logged In", for example "newthread", "newreply", "editProfile" and "sendPM" links in a forum, then make sure the link has rel="nofollow" on it, and the target page has <meta name="robots" content="noindex"> on it too.
If you have a CMS, forum, or cart that has pages that could have multiple URLs, then get the script modified to put a <meta name="robots" content="noindex"> tag on all but one "version" of the page.
Use the site: search to see what you have indexed, and work to correct these issues. The presence of Supplemental Results, URL-only entries, or hitting the "repeat this search with omitted results included" message very quickly are all indications that you have stuff that needs fixing.
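The "noindex all but one version of the page" advice above can be sketched as a tiny helper a CMS or forum script could call when rendering the page head. This is only an illustration, not code from any of these systems; the function name and the idea of a precomputed canonical path are my own assumptions.

```python
# Sketch: emit a robots meta tag when a page is served under a
# non-canonical URL, so only one "version" of the page gets indexed.
# requested_path / canonical_path are hypothetical inputs the CMS
# would already know about.

def robots_meta(requested_path, canonical_path):
    """Return a noindex robots meta tag for duplicate URL versions,
    or an empty string for the one canonical URL that should be indexed."""
    if requested_path != canonical_path:
        return '<meta name="robots" content="noindex">'
    return ""
```

The same helper covers the "Error. You Are Not Logged In" pages too: treat those paths as never-canonical and they always get the noindex tag.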
It is a sad fact that systems like vBulletin, PHPbb, osCommerce, and a whole range of popular scripted sites, have a large number of SEO-related design errors built in to them. The designers are clever programmers, but have no clue about SEO or how their site will interact with search engines; and the situation isn't getting any better.
Run Xenu LinkSleuth over your site, and run a few pages through [validator.w3.org] too - just in case.
If you have done all of that, then you'll just have to wait for Google to fix whatever they have broken at their end.
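The domain.com-to-www.domain.com item in the checklist above is usually done in server config, but the decision logic can be sketched in a few lines of Python, for example inside a WSGI app or CGI script. The host name here is a placeholder, and this is a minimal sketch of the idea rather than anyone's actual setup.

```python
# Sketch: answer any request on a non-canonical host with a 301
# (Moved Permanently) to the same path on the canonical host, so
# Google indexes only one version of the site.

CANONICAL_HOST = "www.example.com"  # hypothetical canonical host

def canonical_redirect(host, path):
    """Return (status_line, location) if the request should be
    301-redirected to the canonical host, or None if it is already there."""
    if host.lower() != CANONICAL_HOST:
        return ("301 Moved Permanently",
                "http://%s%s" % (CANONICAL_HOST, path))
    return None
```

The same check handles the multiple-domains case: any extra domain simply fails the host comparison and gets the 301 to the one domain you want indexed.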
The point is that, beyond the basics, Google is creating an uncalled-for workload. I now lecture myself not to tell people that things that take me a couple of minutes are easy; it often took me years to be able to do them. The chief dangers, though, are what Arubicus brings up and what I'm always mindful of.
Google even took a simple program like Adwords and turned it into some exotic “riddle” to pay less for clicks.
Google is now much more fussy about spidering some types of site. Sites might have got away with it before, but I think they stand less chance now.
So, I always advise to do all the basic stuff and then evaluate where you are after that... it often works.
You are right, I now see some examples where I have no idea where the problem lies, and advise waiting until Google fixes the bug at their end.
This is where some of us are stumped.
But yes you are right. Get the basics done first.
I'd say the answer to the title question is that sometimes it's the website and sometimes it's Google. Discovering which is the case is what's driving many webmasters up the wall right now.
Actually, there is a third possibility which is, for most sites, the most accurate. It is the INTERACTION between unknown variables on the site, and unknown variables and algos google uses.
With some sites, it's possible to guess if there is an onsite problem. With most sites, because the results (SERPs) are the end product of hundreds of variables related to the site and hundreds related to the algo, the whole question is rather pointless because, by and large, it's not answerable.
Even with more information, it's not knowable. Most things in complex systems (which is what we are talking about) are multi-caused, and multi-caused in non-linear and interacting ways.
That won't stop people from speculating, or thinking they know the answer. Sit back and enjoy the ride, cause you just ain't going to be able to figure out HOW the ride works. It's like a magic trick that amazes, and no matter how hard you try, you can't figure it out.
In short, if you look for ONE cause, or one simple answer, you will lead yourself astray, and perhaps do the wrong thing. If you happen to do a right thing, it will be by complete accident.
Yes, Google has increased people's workload, but I still see that many sites have not done even the basic stuff.
I agree that one needs to do those basics, which are fairly well known. But I'm afraid that's "old world" thinking. There is absolutely NO stability, or predictability about how search engines, particularly google, index pages.
That's THE obvious lesson that many aren't quite getting. Someone can do all the basics on a new site, or modified site, have the best darned site on its topic, have good inbound links, and get absolutely nowhere. You can fiddle, tweak, listen to "experts" who really don't know what's going on (since nobody does), and end up with a large investment of time, horrible rankings, and no income.
Or, you could hit the jackpot. It's like playing slot machines now, and for people who are into that, that's fine. But trying to run a business that is completely (and I mean completely) unpredictable is not a wise choice for many people.
A stable and somewhat predictable business environment is one of the most important requirements, if not the most important, for running a successful business long term. Right now we don't have it.
Speaking for us, until we have some stability, we're not investing much in web development beyond basic maintenance tasks. We aren't redesigning our sites (that too is a crap shoot - if we redesign do we lose whatever rankings we have left?) We aren't investing ad money, either, into adwords, and we aren't counting on adsense revenues. We'll be ready if things improve, but we're moving our business elsewhere, so to speak.
I'm through playing guessing games with google.
I'm not saying that this is the full explanation for everything we see right now -- far from it. But it is worth looking through Google's webmaster guidelines [books.google.com] one more time -- they may have just become better at enforcing some of them, or turned the dials up higher.
I changed URLs from the format /foo/id/ to /foo/bar.html. The old URLs generate 301s to the new ones. The old URLs were all supplemental in Google; now only a few are left, but the new ones aren't making it in.
So I'm pondering returning 404s for the old URLs to see if this resolves the issue.
It is a sad fact that systems like vBulletin, PHPbb, osCommerce, and a whole range of popular scripted sites, have a large number of SEO-related design errors built in to them. The designers are clever programmers, but have no clue about SEO or how their site will interact with search engines; and the situation isn't getting any better.
I'd have to vote for "Google is at fault/broken" since we have to do all the things you list to "SEO" our sites even though Google says we should develop our sites for visitors, not search engines.
-- Roger
I tried doing that for a couple of days; it made for even less traffic! So I've put all the pages back and replaced the content on each with a memo and a link to the current page. Even though they're no longer connected to the site, the old URLs were still sending a lot of visitors. Whenever Google finally gets the site re-indexed, I'll remove the old pages.
So I'm pondering returning 404s for the old URLs to see if this resolves the issue.
Persistent 404s will hurt you in Google.
Look at the HTTP status codes.
404 Not Found (and we don't know why, or we're just not saying why)
Google interprets this as the site being temporarily broken and will keep requesting the page.
410 Gone (this page no longer exists)
Google has no problem with that and will remove it from the index.
what I do is put a 301 (document moved permanently) redirect on the URL until the new URL is in the index and then (after requests for the old URL peter out due to the 301) change it to a 410 for a while until 410 errors peter out, then remove the old url (410) completely.
Be careful though - during the time you have the 301 status on the old URL make sure you find any links to that URL and get them updated else bots/visitors will keep coming from those links and requesting the old URL.
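The staged retirement described above (301 while the new URL gets indexed, then 410 once requests for the old URL peter out) can be sketched as a small dispatch function. The URL map and the phase flag are hypothetical illustrations of the poster's process, not anyone's actual implementation; in practice this would live in server config or a rewrite handler.

```python
# Sketch: decide how to answer a request for a retired URL.
# OLD_TO_NEW is an example mapping in the /foo/id/ -> /foo/bar.html
# style mentioned earlier in the thread.

OLD_TO_NEW = {"/foo/123/": "/foo/bar.html"}

def old_url_status(path, redirect_phase):
    """Return (http_status, location_or_None) for a retired URL.

    redirect_phase=True  -> still in the 301 (Moved Permanently) stage
    redirect_phase=False -> switched to the 410 (Gone) stage
    """
    new_path = OLD_TO_NEW.get(path)
    if new_path is None:
        return (404, None)        # not one of our retired URLs
    if redirect_phase:
        return (301, new_path)    # pass bots and visitors to the new URL
    return (410, None)            # Gone: tells Google to drop it for good
```

Flipping `redirect_phase` to False is the cutover point; per the advice above, that should wait until requests for the old URLs have petered out and any remaining inbound links have been updated.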
what I do is put a 301 (document moved permanently) redirect on the URL until the new URL is in the index and then (after requests for the old URL peter out due to the 301) change it to a 410 for a while until 410 errors peter out, then remove the old url (410) completely.
I've had this in place since early March, and all old URLs point to the new ones. Google visits the new ones, but they don't end up in the index. All the old ones are listed as supplementals or are gone.
What's interesting is that if you look at the cache for the supplementals, many have the new meta description that I added for the new URLs. So Google is connecting the old URLs to the new ones in some manner.
301'ed two related sites to it: sites [A] and [B].
After 3 weeks, 420 pages indexed.
Removed one of the 301s [B]; two weeks later, fewer than 200 pages indexed.
Put an outbound link on site [B]; two weeks later I had around 300 pages indexed.
Put a few outbounds on sites that had no PR but were getting crawled, and the indexed page count went up around 20 pages per site.
are you sure you are using 301 and not 302?
Reid:
Positive. I did the same thing for another site and it worked like a charm.
I've checked the site multiple times for any URLs in the old format and none exist. The majority of requests go to the new URLs, but I still get a few stray requests to the old ones. Those requests are usually from an SE bot; users for the most part go to the new URLs.
Googlebot hit 251 of these pages in the new format again this morning, but it did the same last weekend and the weekend before that, and they never made it into the index. It didn't hit any of the old pages.