| This 74 message thread spans 3 pages |
|Homepage Dropped From Index|
Is it a bug or the big stick?
PR7 site, competitive industry, site has been up for 5 years. Completely clean - as white hat as can be.
Woke up this morning to find that Google has DROPPED the homepage from the index. I put in the URL, and Google returns a "Sorry, no information is available for the URL" message.
All 150K other pages on the domain continue to remain indexed, however. It is ONLY the homepage that has been dropped.
Anyone else seeing this?
[edited by: bakedjake at 9:36 pm (utc) on Aug. 14, 2004]
I feel your pain. This "bug" has hit me as well.
On one of our main sites we lost only the homepage for about 6 days, then it came back on its own. I was going to give it a week and then start futzing around, but it fixed itself just in time.
very white hat site, up about 3 years. no clue.
There was an epidemic of MIA homepages some months back. What I'm thinking now is that it looks to me like duplicates and/or near-duplicates are getting flushed. No proof, just seems like it.
Two possibles come to mind - could it be that if links are www.domain.com and domain.com and www.domain.com/index.html and just index.html that a dup filter could be triggered (if such exists)?
I have seen a site get hit for use of multiple forms of links to the homepage - even from the same pages on the site. Those folks also used meta refreshes and messed up altogether in a few ways, but that homepage thing was glaring.
Another - sometimes people pull "funny stuff" out there. Could there be anything like that happening, pulling something with redirects? There was some 302 garbage going on a while back. Any very unique phrases that could be looked for?
OK.. maybe a bug.
Interesting thread. This could be consistent with a theory espoused in another thread, [webmasterworld.com...] (sorry, it's buried deep in there), about why certain large sites seem to be losing traffic in the latest SERPs.
The theory is that somehow Google lost its most current index of some large sites and has reverted back to an older version of the index. Therefore, newer pages are not indexed properly. That would then explain why Googlebot has been going nuts in indexing these sites in the past week - Googlebot has hit me more than 150,000 times since Monday. Having realized their error, Google folks are sending Googlebot out to recover what it is missing.
I don't know if I'm seeing confirmation here where none exists but if Marcia is correct that this same problem occurred months ago, then maybe it is further evidence that Google has accidentally reverted back to a previous index. That could be good news because things will get back to normal eventually.
Back in May, 2004 - disappearing homepages
Another disappearing act less than a month ago
|The theory is that somehow Google lost its most current index of some large sites and has reverted back to an older version of the index. |
I can disprove this theory. As I've said, the problem is limited to the homepage, and only the homepage. No other pages are affected.
Marcia, I thought about the removal bug, but common sense kinda points me away from that too. I would expect a potential competitor to remove more than just MY homepage if they were trying to pull something.
I think this is a spider bug. The duplicate links/content point is interesting.
July 29th, even Google experienced this problem:
If this problem didn't hit Google I'd be more worried. Instead, I wonder if they have an algo-out-of-control...
Somehow I don't think you'll get same-day service like Google did in restoring their homepage to the index! ;)
I've seen several of these MIA situations where the linking led to duplicates, like internal links pointing to /index.php and other pointing to /.
That may not be the reason of course, but every example I've seen has had something non-straightforward about its homepage linking.
Steve, I've seen a few problems from the same thing
|...could it be that if links are www.domain.com and domain.com and www.domain.com/index.html and just index.html that a dup filter could be triggered (if such exists)? |
Technically it shouldn't happen, it should be figured out; but that doesn't always seem to be the case. I've seen a few sites run into problems in different ways from inconsistent linking - within the sites themselves.
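One way to check a site for the inconsistent homepage linking described above is to resolve every link on a page and collect the distinct URLs that all mean "the homepage". This is only a minimal sketch: the function names, the `INDEX_NAMES` set and the `domain.com` placeholder are assumptions for illustration, and it says nothing about how Google actually treats these variants.

```python
from urllib.parse import urljoin, urlsplit

# Assumed default-document names; adjust to whatever the server actually serves.
INDEX_NAMES = {"index.html", "index.htm", "index.php", "index.asp"}

def is_homepage_link(href, page_url, bare_domain):
    """True if href, resolved against page_url, points at the site's homepage."""
    parts = urlsplit(urljoin(page_url, href))
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[4:]  # treat www and non-www as the same site
    path = parts.path.lstrip("/").lower()
    return host == bare_domain and (path == "" or path in INDEX_NAMES)

def homepage_link_forms(hrefs, page_url, bare_domain):
    """Return the distinct absolute URLs on a page that all mean 'the homepage'."""
    forms = {urljoin(page_url, h) for h in hrefs
             if is_homepage_link(h, page_url, bare_domain)}
    return sorted(forms)
```

If `homepage_link_forms` returns more than one URL for a site's own pages, the internal linking is inconsistent in the sense discussed here.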
|July 29th, even Google experienced this problem: |
This is a bit different this time as it is just the home page. Google removed *all* of google.com
This is intended by Google.
Google wants to take out sites that they 'think' (by their criteria) are manipulating their search results in hopes of a better ranking.
Your site seems to have been caught by the filter and dumped in the trash can.
I have been observing this since October 2003, and they keep doing the same thing. Sometimes the outcome is good and sometimes it turns sour, and this month it seems to have hit quite a lot of 'established' sites. I have noticed at least 4 sites that used to rank #1 for their keyword for years and are now nowhere to be seen (and in my opinion they deserved the place).
Google's objective is clear; it's just that the current algo doesn't work 100% effectively.
Manipulating it in what way, AthlonInside? If people link with and without the www? There's no real control over how others link.
Manipulating in terms of over-optimizing the site, using methods such as buying links, excessive link exchanges, H1 tags, ALT stuffing, etc.
IMHO, it has nothing to do with a duplicate content filter, or www vs non-www, or the conspiracy theory that Google is trying to make more money from AdWords. What they want to achieve is to control the level of manipulation.
Unluckily, some people who are not manipulating look like they are manipulating! And they become the victims of Google's good intentions.
The curse to the innocents.
Even worse, there is more that we can't control, which I believe has contributed to the increasing number of MIA victims (many of whom are innocent).
With so many automated tools today that add your link automatically and ask you for a link exchange,
and tools that simply crawl search engine SERPs and turn them into a page with AdWords in the hope of making some money,
-> many false links are created.
This seemed to be good YESTERDAY. But today, it can be the poison that sends so many innocent established sites MIA: too many links, with long anchor text, that is EXACTLY the same as your page title! (These tools read your page and use your title as the anchor text. Usually, titles are long.)
You don't have control over this. But sadly, I believe Google's new filter has not been able to take this into consideration. Thus, these false links contribute strongly to triggering the MIA penalty.
* No Control = No Harm?
There is a belief that
Google will not punish you for what other webmasters do because you don't have control over them. For example, links from bad neighbourhoods, etc.
But it seems funny that the same people strongly believe
Google will REWARD you for what other webmasters do
even though you don't have control over them. For example, links from good neighbourhoods.
Knife - Cut vegetables; Weapon
Stocks - Make money; Lose Money
Drugs - Relieve Pain; Abuse
War - Revolution; Rebel
If backlinks can serve your site, they can bite you as well.
Over the past week, I've seen subdomains dropped in "keyword spaces" that I watch. Today the www version is reappearing prominently in the SERPS, i.e. subdomain.example.com was dropped, www.example.com is reappearing. It doesn't seem to be penalized.
The one example that I've studied a little was cross linked but did not have duplicate content (that I saw).
I notice a similar case to yours in one of the sites I monitor.
IMHO, the reason for this is the implementation of Local Rank. With Local Rank, Google doesn't want several sites from the same owner to appear at the same time in the SERPs for a search. That's the possible reason why only the www or the subdomain website appears. This also applies to webmasters that have a few websites (different domains) that rank well for the same keyword; I now notice they are filtered down to one site in the SERPs.
Some people who read this will definitely ask how Google knows which sites are from the same owner. They are not very smart about this, but they are using 2 not-so-good but good-enough methods.
2. C Class IP Address
I'm seeing the same thing on two of my sites, a PR 6 7-year old forum and a PR 7 informational site. Both homepages show the PR has been nullified, however all subindexed pages still show PR.
One really weird thing I am noticing is that when searching for "www.website.com" (my forum site) on Google, I'm seeing what seems to be a nefarious "prepaid legal services" site coming up in place of my forum site as Google's cache of my domain. Should I be worried?
I saw my homepage being PR0 and dropped recently. I found this topic.
Now can you guys tell me DO I HAVE ANY HOPE OF COMING BACK?
I have removed ALT tags and made it a normal looking page... may be google find me innocent :)
We also had the home page dropped for one of our sites. I have no idea why it happened. It seems like other people have posted some ideas, but nothing definite. So I am just going to post some things which are different on this site (compared to other sites we own). It may be one of these factors, although it may have nothing to do with any of them. (I can't justify why any of these reasons would lead to a penalty, but maybe some of you that had an indexing issue/penalty will see something similar.)
- Very generic Title ("Blues Widgets") with no other text.
- Possible duplicate content on some internal pages (but none of these pages have dropped from the index, only the main page and it has 100% unique content).
- Home page is the only page getting linked to (none of the internal pages have links going to them).
- It has received a higher rate of gaining links than was normal previously (the site has been up for 4 years). It would gain about 1 link per month in the past. This past month, it got about 8.
|I have removed ALT tags and made it a normal looking page... may be google find me innocent :) |
Google DOES NOT penalize for using Alt tags. For gosh sakes, they're there to improve usability! Google may not give them any weight, but they certainly don't penalize for having them.
If you're a PR0, you've either been caught doing (or linking to) something bad (i.e. doorway page, spam, invisible text, etc.), or you've been caught by the mysterious "glitch" that snags any number of sites after each algo "tweak".
OK folks, not to discourage you, but let's keep this discussion on-topic.
I'm going under the assumption that this is a bug and not a penalty. I have no proof of this, other than my experience. It doesn't look or feel like a penalty to me.
FYI to others experiencing this problem - I am NOT talking about a PR0 penalty. The homepage is still returning PR via the toolbar, Google directory (!), and other PR checking tools.
So far the best guess I've heard has been some sort of combination duplicate filter/linking issue. After close investigation, there are currently four ways we are linking to the homepage:
All of the internal links are linked to the homepage via the requested form of the domain + index.asp. So, if you access the domain via
domain.com, all of the internal links to the homepage point to
domain.com/index.asp. Consequently, if you access the domain via
www.domain.com, all of the internal links to the homepage point to
The domain (not the homepage) has nearly 30,000 links pointing to it. Breaking the homepage links down a bit more:
- Roughly 1000 point to the www.domain.com form of the homepage
- Roughly 300 point to the domain.com form of the homepage
- Roughly 500 point to the www.domain.com/index.asp form of the homepage
- Roughly 50 point to the domain.com/index.asp form of the homepage
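To put the breakdown above in proportion, the rough counts can be turned into shares of all homepage links. The figures are the approximate ones quoted in the post, so the percentages are only as good as those estimates; the variable names are made up for this sketch.

```python
# Approximate counts as given in the post ("roughly" figures, not exact).
homepage_links = {
    "www.domain.com": 1000,
    "domain.com": 300,
    "www.domain.com/index.asp": 500,
    "domain.com/index.asp": 50,
}

total = sum(homepage_links.values())

# Share of homepage links per URL form, in percent, rounded to one decimal.
shares = {form: round(100 * n / total, 1) for form, n in homepage_links.items()}

for form, pct in shares.items():
    print(f"{form:26s} {homepage_links[form]:5d}  {pct:5.1f}%")
```

On these numbers the www.domain.com form carries a little over half of the homepage links, and the four forms together split roughly 1850 links that would otherwise point at a single URL.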
To reiterate: For money terms, Google used to return the
www.domain.com version of the domain. A URL search for all four types used to result in the same page being returned by Google. Now, the
domain.com return nothing, while the
domain.com/index.asp still return results as expected.
domain.com/index.asp do not rank for any of the previous terms that
www.domain.com used to rank for. No other pages other than the
domain.com forms of the homepage has been dropped
I plan to attack this problem in the following order:
1. My next thought is to change all of the internal links to point to www.domain.com regardless of the requested domain or any other factor.
2. Google is currently maintaining two separate indexes of the site - one for www.domain.com and another for domain.com (a site query on both shows a different amount of pages returned). I'd like to 301 the entire domain.com version of the site to www.domain.com. This is SOP for me on all of the other sites, but tracking reasons have prohibited doing this until this point on this particular site.
3. Email firstname.lastname@example.org and beg for help.
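The 301 in step 2 comes down to a simple mapping: any request for the domain.com host is answered with a permanent redirect to the same path and query on www.domain.com. The sketch below shows only that mapping, not a server configuration; `CANONICAL_HOST`, `ALIAS_HOSTS` and `redirect_target` are hypothetical names, and the real rule would live in the web server (the site in question uses index.asp, so it is presumably not Apache).

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.domain.com"   # placeholder host from the thread
ALIAS_HOSTS = {"domain.com"}        # hosts that should be 301'd to the canonical one

def redirect_target(url):
    """Return the URL to 301 to, or None if the request is already canonical."""
    parts = urlsplit(url)
    if (parts.hostname or "") in ALIAS_HOSTS:
        # Keep scheme, path and query; only the host changes.
        return urlunsplit((parts.scheme, CANONICAL_HOST,
                           parts.path or "/", parts.query, ""))
    return None
```

This deliberately leaves /index.asp alone; collapsing the default document to / is a separate decision that goes with the internal link cleanup in step 1.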
I have a site that is doing exactly this:
home page is showing PR, but site search shows www.example.com as URL only. No cache for the home page.
About 2 months ago, I changed the htaccess file to 301 all http: //example.com requests to http: //www.example.com
I just checked the server headers and they're just fine. Barring a hosting outage that I didn't notice - I have to assume that this is a spidering problem with G.
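For anyone who wants to repeat the header check, here is a rough sketch using Python's standard `http.client`, which does not follow redirects, so the raw status and `Location` header are visible. The helper names are invented for this example and the hosts are the thread's placeholders; it only shows what the server sends to this client, not what Googlebot saw.

```python
import http.client

def fetch_status_and_location(host, path="/"):
    """Send a HEAD request and return (status code, Location header or None)."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()

def is_permanent_redirect_to(status, location, expected_prefix):
    """True only for a 301 whose Location starts with the expected URL prefix."""
    return (status == 301
            and location is not None
            and location.startswith(expected_prefix))
```

A call such as `is_permanent_redirect_to(*fetch_status_and_location("example.com"), "http://www.example.com")` would then confirm that the non-www host answers with a 301 to the www host rather than a 302 or a 200.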
[edited by: PatrickDeese at 6:50 pm (utc) on Aug. 16, 2004]
I'm with Marcia. If it were my site, I'd start by making sure that *all* inbound links pointed to the same exact URL variation.
And I'd search for scraper sites, dup stuff, etc.
Can't tell you how many homepages we've seen vanish for these sorts of reasons. When they were our pages that vanished, every time we found the issue and corrected it, the page came back, though sometimes with waits from 4-12 weeks.
<EDIT>Oops, missed part of the thread...almost certainly Google's maintaining two separate indexes of the site is the problem. It has happened to us. 301's should fix the problem</EDIT>
[edited by: caveman at 6:56 pm (utc) on Aug. 16, 2004]
Patrick, when I type the actual URL into Google, I get a "Sorry, no information was found..." message. What happens when you type the URL into Google?
caveman, as I've said, that's typically SOP for me. You've never heard me rant about internal link architecture before. ;-) But I inherited this site, and unfortunately this happened in the middle of us trying to fix all of this. :)
Yep, and with good reason. This sort of haphazard management of linking caused several sites of ours to plummet prior to, and during, Florida. We learned our lesson the hard way. Now, paying close attention to the set up of any new site, like you say, is SOP. Never had a problem since.
Hmm... this is what I get with a G search:
Google can show you the following information for this URL:
Find web pages that are similar to www.example.com
Find web pages that link to www.example.com
Find web pages that contain the term "www.example.com"
Not a very good sign - either Google couldn't get into the site, or it's in the process of being de-listed. :(
I am hoping it's a spidering problem - because suddenly about 60% of the pages are not showing their Adsense - using the alt advert instead.
|I am hoping it's a spidering problem - because suddenly about 60% of the pages are not showing their Adsense - using the alt advert instead. |
Oh, that's interesting.
PD - Did you throw up the 301s once you discovered the problem, or are you saying that you think the 301s are causing the problem?
|Did you throw up the 301s once you discovered the problem, or are you saying that you think the 301s are causing the problem? |
I have been adding the redirect domain.com to www.domain.com on all my sites for the past couple of months.
Prior to that I've only had a custom 404 page in my htaccess.
All issues aside, I am hoping that my web host had some sort of technical issue, or perhaps a misconfiguration that affected this site and that they resolved without telling me - it could also simply be that something I did screwed things up - I was making site-wide changes a couple of weeks ago and my connection went out for a couple hours - maybe Google was spidering when that happened.
>>>when I type the actual URL into Google, I get a "Sorry, no information was found..." message.
I really think this is a Google bug at the moment. I am seeing sites in Alexa's top 2000 returning this, when the message "Sorry, no information was found..." usually means a site has been penalised.
And as another poster said, the same thing happened to google.com
"there are currently four ways we are linking to the homepage"
I would think it is likely you don't have to look any further.
I believe that Google is (thankfully) going after duplicate content all over the Internet. This leads to some situations where relatively innocent duplicate content gets bungled up. Sloppy webmastering is sometimes to blame, but sometimes it is normal old business of www versus non-www. The proactive thing to do is make it virtually impossible for Google to screw up, meaning make all your links absolute and have them pointing to the same consistent URLs.
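Making every link absolute and consistent can be done mechanically when pages are generated. The following is a minimal sketch under stated assumptions: `CANONICAL_ROOT`, `SITE_HOSTS`, `INDEX_NAMES` and `absolute_canonical` are illustrative names, www.domain.com is taken as the preferred form, and a root-level default document is treated as the same page as /.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit

CANONICAL_ROOT = "http://www.domain.com/"        # assumed preferred form
SITE_HOSTS = {"domain.com", "www.domain.com"}    # hosts that belong to this site
INDEX_NAMES = {"index.html", "index.htm", "index.php", "index.asp"}

def absolute_canonical(href, page_path="/"):
    """Resolve href as seen from page_path and return one consistent absolute URL."""
    resolved = urljoin(urljoin(CANONICAL_ROOT, page_path), href)
    parts = urlsplit(resolved)
    if (parts.hostname or "") not in SITE_HOSTS:
        return resolved  # external link: leave untouched
    path = parts.path or "/"
    # Collapse a root-level default document (e.g. /index.asp) to "/".
    if path.lstrip("/").lower() in INDEX_NAMES:
        path = "/"
    root = urlsplit(CANONICAL_ROOT)
    return urlunsplit((root.scheme, root.netloc, path, parts.query, parts.fragment))
```

With this in place, `index.asp`, `../index.html` and `http://domain.com` all come out as `http://www.domain.com/`, so a page template never emits more than one form of the homepage URL.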
Perhaps this shouldn't be necessary, but as a defense mechanism there is no downside.