Woke up this morning to find that Google has DROPPED the homepage from the index. I put in the URL, and Google returns a "Sorry, no information is available for the URL" message.
All 150K other pages on the domain continue to remain indexed, however. It is ONLY the homepage that has been dropped.
Anyone else seeing this?
[edited by: bakedjake at 9:36 pm (utc) on Aug. 14, 2004]
What I don't understand, however, is what has changed? I'm not the only one reporting this behavior - I've spoken to 15 other people who have had the same thing happen.
Something big changed last week. I think Google's getting ready to make a big announcement when they IPO.
FWIW - Even with the homepage drop, G traffic to this site was up something like 20% last week.
So your hundreds of internal https pages pointing to a different location are clouding the water some. If you fix them to all point to the same location (http://www.yourdomain.com/) then we'll crawl all those pages in a little while, and you should be fine shortly afterwards.
I guess the takehome message for non-bakedjake members is to doublecheck your internal linkage. I'd pick a canonical root page like http://www.yourdomain.com/ and just stick with that by making sure any internal pages point there instead of to other versions of your root page.
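For anyone who wants to double-check their internal linkage, here is a minimal sketch of that check in Python. It assumes you already have a list of link targets pulled from your own pages. The host names, the list of index file names, and the function names are placeholders I made up for illustration, not anything Google has published.

```python
from urllib.parse import urlsplit

# Hypothetical canonical root, matching the placeholder used in the post above.
CANONICAL_ROOT = "http://www.yourdomain.com/"

# Assumed host spellings and default-document names; adjust to your own site.
SITE_HOSTS = {"www.yourdomain.com", "yourdomain.com"}
INDEX_NAMES = {"index.html", "index.htm", "index.asp", "index.php", "default.asp"}


def is_root_variant(url):
    """Return True if url looks like some version of the site's root page."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # hostname is already lowercased, without port
    if host not in SITE_HOSTS:
        return False
    path = parts.path.lstrip("/").lower()
    return path == "" or path in INDEX_NAMES


def noncanonical_root_links(links):
    """List links that reach the root page by anything other than the canonical URL."""
    return [u for u in links if is_root_variant(u) and u != CANONICAL_ROOT]
```

Anything this returns is a link you would rewrite to the canonical root. It only covers links you control, of course, not links from other sites.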
Anticipating the problem was in fact the duplicate linking (thanks Marcia, who originally suggested the idea to me), I've gone ahead and implemented step one (see msg21) of my plan. So it should be corrected soon, and I'll let everyone know.
But GG, what changed last week/over the weekend? :) Whatever it was, you sure hit a lot of people with it.
New duplicate content filter?
or...
By my count, there are more links pointing to www.domain.com than www.domain.com/index.asp, unless you count HTTPS links. Did you change the parsing/weight/something of https pages? ;-)
[edited by: bakedjake at 12:00 am (utc) on Aug. 17, 2004]
[mydomain.com...] or [mydomain.com...]
With the extra / being the only difference there.
No different. But if you notice, whenever you type an address without the trailing /, the server actually redirects it to the one with the trailing /.
A trailing / usually tells the server that the path is a directory and that it should serve the default index file or list the contents of the directory (depending on the server configuration).
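To make the redirect behaviour concrete, here is a toy model in Python of what a typically configured server does with these requests. It is only a sketch: the function name, the directory and file sets, and the default index name are assumptions for illustration, and real servers differ depending on configuration.

```python
def respond(path, directories, files, index_name="index.html"):
    """Toy model: return (status, target) for a requested path.

    directories holds known directory paths (each ending in "/"),
    files holds known file paths. Both are made-up inputs for the example.
    """
    if path in files:
        # An exact file match is served as-is.
        return (200, path)
    if path.endswith("/") and path in directories:
        # A directory request is answered with its default index file.
        return (200, path + index_name)
    if path + "/" in directories:
        # Directory requested without the trailing slash: redirect to add it.
        return (301, path + "/")
    return (404, None)
```

So /directory, /directory/ and /directory/index.html all end up showing the same content, which is exactly how one page gets more than one address.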
Never know a competitor could have used the site removal thing!
That is correct, and indicative of the problem that I had, as well as some others I talked to. But I could envision it happening to any page that could be accessed two different ways and served the same content, such as a directory:
/directory/
/directory/index.html
The advice and simple solution to the problem is to make sure you are consistent with your page references.
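One way to stay consistent is to run every internal link through a small helper before it goes into a page. The sketch below, in Python, always picks the /directory/ form over /directory/index.html. The list of default document names and the function name are my own assumptions, so adjust them to whatever your server actually serves.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed default-document names; not taken from any particular server config.
DEFAULT_DOCS = ("index.html", "index.htm")


def consistent_reference(url):
    """Rewrite a link so a directory is always referenced by its trailing-slash form."""
    parts = urlsplit(url)
    path = parts.path or "/"  # treat a bare host as its root directory
    head, _, last = path.rpartition("/")
    if last.lower() in DEFAULT_DOCS:
        # Drop the default document name and keep the directory itself.
        path = head + "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))
```

It works on relative and absolute links alike, and leaves ordinary pages such as /directory/page.html untouched.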
I believe this is a strategic error and a lack of common sense in their new algorithm.
Because in *any reasonable* algorithm the index (main entry) page should always inherit PR from the page with the *highest* PR in the site. In other words PR of the index page of the site should always be the highest one in the site.
This is very easy to implement, and if they did not do it they lost their common sense and made the biggest error in Google's history. And it seems that they really did make this error.
Common sense tells us that the index page represents the site. If we compare the site to a person or company, and a page in the site to a "product", we may say that people usually measure the authority (PR) of the person by the authority (PR) of their greatest product. They often prefer to search for the company or author (i.e. the index page) rather than for a particular product, because they believe that if a company made one good product, its other products may be good too. And in the general case they are right.
Because this is actually what consumers demand, it is in Google's interest to fix this problem quickly and make the index page as searchable as the highest-PR page in the site.
I would like to stress that the index page *requires special treatment*, because without it the index page *in general* has a much lower rank than other site pages. For example, when I link to other sites I try to link to the page with specific information relevant to the topic of my page. And (surprise!) as a rule it is not the index page. Others seem to do the same. When I now try to find the new part of my site in Google, I first find numerous download sites that point to me, and after them (you guessed right!) my download page, not my index page. It is like suggesting that visitors come in through the window instead of the door.
As for what happened recently, I believe that Google virtually blocked PR from spreading to other pages of the same site. Their error was that they did not seem to understand that the index page should be an exception. You may play with and adjust the spreading coefficients, but not for the index page. It should always inherit the highest PR.
There is no reason for this to be the case, or assumed.
The problem here is mostly due to sloppy webmastering, and of all the things Google can be blamed for, that isn't one. Maybe they should be somewhat better at mindreading the sloppiness of some websites, but they are at best a junior partner in the blame.
I've seen problems with this several times, but the most interesting was with one site a while back where I suggested to the webmaster that she make all her links back to the homepage absolute URLs. Well, she did all over, including the bottom navigation - but the problem has persisted; she did NOT make the change with her side navigation, which happens to be one of those expandable Javascript navigation menus. Things that make you go hmmmm.
"In other words PR of the index page of the site should always be the highest one in the site."
There is no reason for this to be the case, or assumed. The problem here is mostly due to sloppy webmastering...
I meant that the index (main entry) page should always inherit PR from the page with the *highest* PR in the site.
Because common sense tells us that the authority (PR) of the site as a whole (the index page) should be as high as the authority (PR) of its best page.
For example, the authority of Shakespeare should be no less than the authority of his best play, say Hamlet.
It should not depend on the webmaster's skill, because it is a matter of consumer demand and not of the author's or company's self-promotion. People often prefer to remember and search by companies and authors rather than by products.
With Shakespeare everything happens to be OK, but the problem still seems to be here.
For example, when I searched for Adobe I saw the Acrobat page first. Again Google suggests that we go through the window instead of the door.
But they have probably already fixed something. Now when I search for my product, I still see it after all the download pages that point to it.
But now I see my index page above my download page, and previously I saw only my download page. I changed nothing.
Vadim.
> Their error was...
It doesn't look like it was an error. More like an unintended side effect. And I stop here.
Why would Google spend so much time, so much of their money paying expensive PhDs AND risk an entire index or half an index JUST to fix sloppiness of webmasters? Seriously.
A bug is a bug is a bug...
What if somebody links to me as mydomain.com/index.htm, while I intend Google to understand my homepage as "www.mydomain.com/"?
Or say somebody links to me as www.mydomain.com and I internally link to the homepage as www.mydomain.com/index.htm?
In fact, I'm waiting for a site to return from the Google Grave as I type since it went missing last week due to a similar mistake of mine. Oops! (I blame Dreamweaver Guv')
On a broader issue - no other search engine seems to have this problem, so why is Google tripping on this?
It looks like the same 'error' that happened to the other sites mentioned in this thread, so Google thought it was duplicate content and therefore gave one of the domains a low value. The only difference here is that the domain is still in the Google index (cache).
Internal pages still have their usual PR.
Is this the same bug? And, more important for me: I changed the original content back again, so how long would it take to get the PR back? It would really help me if it could get its original PR back by the end of this month. (So if GoogleGuy could help me out here.....)
So while everyone is busy changing or making sure they don't happen to have one stray link, the fact is that unless you have some magic way of controlling every link on the internet, google may completely screw up the way they index pages. I have seen things like this in the past but they always seemed to be fixed a few days or weeks later. It's about three months since I started seeing this and it doesn't look like any progress is being made at all.
I was hoping they would fix it, but I am seriously doubting it now. I just wonder why they would fully index a ppc tracking url and not the actual page?
Hey Google, what's your problem? Nobody can control every link on the internet and why in the world would you index results from other SE's and PPC engines?
The tracking URLs I have indexed have ZERO backlinks; the real URLs have hundreds, including the GOOGLE directory!
You would think that they would at least get URLs in their own directory right...
It doesn't look like it was an error. More like an unintended side effect. And I stop here.
Why would Google spend so much time, so much of their money paying expensive PhDs AND risk an entire index or half an index JUST to fix sloppiness of webmasters? Seriously.
Because otherwise they are at risk of losing customers.
PR0 for the index page (while other site pages have high PR) produces results that are not what customers expect or want. So their customers may begin to use other search engines more.
Vadim.
PR0 for the index page (while other site pages have high PR) produces results that are not what customers expect or want. So their customers may begin to use other search engines more.
Vadim, off-topic, 95% of the customers of Google don't know what PageRank is, let alone actually care about it. Look at how many people use MSN "because it is the default".
Vadim, on-topic, PR0 of the index page is not what was happening. The index page still returned PR.
More like an unintended side effect.
That's exactly what it was. And it looks like G changed something on Thursday. I thought our homepage came back because of the changes we made, but in talking to other webmasters with the problem I'm starting to think something was changed on their end as well.
I've read this thread and others here, but I'm still a novice in this area. Is there any action I can take at this point to get my index page listed again? Should I submit it? Should I email G? Would they even respond?
Thanks in advance.
Vadim, off-topic, 95% of the customers of Google don't know what PageRank is, let alone actually care about it. Look at how many people use MSN "because it is the default".
Vadim, on-topic, PR0 of the index page is not what was happening. The index page still returned PR.
Of course I understand that people do not care about PR. They also do not care about webmaster skill or the other reasons why they cannot find the main entrance to the site (the index page). They blame Google.
I meant that people would simply like to find the index page as easily as the top-positioned or most popular page in the site, because the index page is like the main entrance, like a representative of a company or author.
Since it is a consumer demand, it should be satisfied independently of webmaster skill, because the best content authors are not always the best webmasters.
For example, Internet Explorer is the browser most forgiving of HTML errors, and I believe that is because Microsoft did the research and found out (surprise!) that people do not care about webmasters' errors but prefer the browser that can fix those errors where possible. If the content is good, of course.
Vadim.
Is there any action I can take at this point to get my index page listed again? Should I submit it? Should I email G? Would they even respond?
2. The only reliable advice seems to be to get as many good (high PR), relevant links to the index page as possible. And fill it with good content (though I should admit the index page is often not the most appropriate place for a lot of good content from the visitor's point of view). And make some relevant outbound links (though again, the index page is often not the most convenient place for them).
This is trivial and this is hard, but it seems to be the only reliable way, because Google actually advises the same.
Vadim.