Forum Moderators: Robert Charlton & goodroi

Would accidental .htaccess 404 trigger Google penalty?

index page dropped from Google index after error during redesign

         

shashlik

5:12 am on Dec 23, 2007 (gmt 0)

10+ Year Member



We redesigned our site, and for a number of days an .htaccess file was forwarding 404s to the index page (and Google spidered the site during that time).

Subsequently, the index page dropped out of the index completely and our website went down badly in the listings as a result, as only internal pages are still showing.

Of course, we removed the .htaccess immediately, but on yesterday's spidering Google hit us even worse and we are now showing on pages 3-5.

Has anybody experienced this situation? Are we still on the way down, or will we regain our old listings?

What needs to be done now to improve the situation, and what can we do to find out where the problem really is?

- already done: reinclusion request

tedster

6:20 am on Dec 23, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've seen this happen when the server does not return a 404 status code in the HTTP header, but instead returns a 302 status (temporary redirect).

shashlik

7:29 am on Dec 23, 2007 (gmt 0)

10+ Year Member



Unfortunately that is not the case; the header comes back as HTTP/1.1 200 OK. It is a static website with static pages, no PHP or content management systems in play.

nektotigra

12:13 pm on Dec 23, 2007 (gmt 0)

10+ Year Member



shashlik,
You don't have to use a CMS to change the missing page's header. Just add the line "ErrorDocument 404 /error404.html" to your .htaccess file, create a custom 404 page called "error404.html", and upload it to the root directory of your server. That's all.

Marcia

12:36 pm on Dec 23, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've had a site go down for several days without my knowing about it; it disappeared from Google but then came back the same as ever.

(Not so with MSN: if there's a hosting problem and you slip out and/or return the wrong header code, you're gone forever.)

shashlik

2:11 pm on Dec 23, 2007 (gmt 0)

10+ Year Member



nektotigra, thank you, but this was not really our question.

Maybe I was unclear. The website did not disappear, "only" the index page did (and completely; it is not a -30 or -950 penalty).
As all content flows from the index page, and subsequent pages have much less PageRank, all other pages dropped quite a bit.

We assume it was a duplicate content problem, because with the wrong .htaccess 404 forwarding all 404 pages suddenly "looked" like the index page and were indexed as such by Google (one can see one of them as an identical copy of the index page in Google's cache).

This happened on the second round of Googlebot over those pages, after the .htaccess forwarding was already long fixed.

Therefore our questions:
- Has anybody experienced this before?
- How long does a re-inclusion request normally take?
- Are there any other things we should do, e.g. somehow ask Google to drop the non-existent 404 pages (and how, as they do not really exist)?
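
One way to confirm that the forwarding really is gone is to request a made-up URL and compare what comes back with the index page. The following is only a minimal sketch under stated assumptions: the function names are hypothetical, and "identical" here means the same bytes after collapsing whitespace, which is just one possible test.

```python
import hashlib
import urllib.error
import urllib.request


def fetch(url, timeout=10):
    """Return (status, body) for url. urllib follows redirects, so the
    status is that of the final response, not of any intermediate hop."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; they still carry a body.
        return err.code, err.read()


def fingerprint(body):
    """Hash the body with runs of whitespace collapsed to one space."""
    return hashlib.sha256(b" ".join(body.split())).hexdigest()


def serves_home_for_missing(home, missing):
    """home and missing are (status, body) pairs. True when the missing
    URL answers 200 with a copy of the home page."""
    return missing[0] == 200 and fingerprint(missing[1]) == fingerprint(home[1])


# Example use (hypothetical URLs, replace with your own site):
# home = fetch("http://example.com/")
# missing = fetch("http://example.com/no-such-page-12345.htm")
# print(serves_home_for_missing(home, missing))
```

If this reports True for a URL that should not exist, the server is still handing out copies of the index page.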

Marcia

3:08 pm on Dec 23, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Have you done up a sitemap? If not, this would be a perfect time to do it and keep watch on the site in Webmaster Central.

shashlik

7:46 pm on Dec 23, 2007 (gmt 0)

10+ Year Member



We did a sitemap containing index.htm quite a while ago.

There might be another cause: we also chose to display only links without www. in Google Webmaster Tools, but it seems the page with the highest PageRank in the preceding months was index.htm with www. We expected that either both would stay in the index or Google would substitute one for the other accordingly, given that Google offers this tool and so strongly recommends deciding on one or the other.
What good are those tools if afterwards all rankings are messed up...

If this is part of the problem, can we assume that index.htm without www. will be indexed shortly?
Does anyone have experience with this, or should we revert to allowing both www. and non-www.?

jdMorgan

9:06 pm on Dec 23, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



First, when you find yourself in a hole, stop digging! Unless you find another real error, don't change anything. Google's timeline is weeks and months, not days; it may take some time for your page to recover. But continuing to change things during this recovery process will only make it take longer.

While you wait for things to settle down and for your page to recover, take a look through some of the older threads here on protocol, domain, page, index, and query canonicalization (there are many; try a search [webmasterworld.com]). Also, consider the usefulness of having your home page called "/index.html", rather than the shorter and more-prevalent "/" (Question: Does Google link to "www.google.com/index.html" or just "www.google.com/"?).
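
To make the domain and index canonicalization idea concrete, here is a small sketch of mapping the variants of a URL onto one preferred form. The function name, the list of index file names, and the example.com host are assumptions for illustration only; on a live site the same mapping would normally be enforced with server-side 301 redirects rather than in a script.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed index file names; adjust to whatever the server actually uses.
INDEX_NAMES = ("index.htm", "index.html")


def strip_www(host):
    """Return host without a leading 'www.' label."""
    return host[4:] if host.startswith("www.") else host


def canonical_url(url, preferred_host):
    """Map www/non-www and index-file variants onto one canonical URL.
    The fragment, if any, is dropped."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Only rewrite the host when it is the same site as the preferred one.
    if strip_www(host) == strip_www(preferred_host):
        host = preferred_host
    path = parts.path or "/"
    head, _, last = path.rpartition("/")
    if last in INDEX_NAMES:
        path = head + "/"  # "/index.htm" becomes "/"
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


# All of these collapse to "http://example.com/":
for variant in ("http://www.example.com/index.htm",
                "http://example.com/index.html",
                "http://WWW.example.com"):
    print(canonical_url(variant, "example.com"))
```

Listing the variants that collapse to the same URL shows how many addresses are competing for the same page.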

Also take the time to check normal and error responses from your server by using a server headers checker such as the "Live HTTP Headers" add-on for Firefox and Mozilla browsers, or one of the similar Web-based header checkers (Note that some of these latter do not display full headers, and in some cases omit intermediate redirects, so test carefully before trusting them).
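
For those who prefer a script to a browser add-on, the same check can be sketched with Python's standard http.client module, which does not follow redirects and so shows the first response as the server sent it. This is a rough illustration, not a replacement for a full header checker; the function names and the example host are hypothetical.

```python
import http.client


def fetch_status(host, path="/", use_https=False, timeout=10):
    """Send one HEAD request and return (status, Location header).
    http.client does not follow redirects, so a 301/302 stays visible."""
    conn_cls = http.client.HTTPSConnection if use_https else http.client.HTTPConnection
    conn = conn_cls(host, timeout=timeout)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def diagnose_missing_page(status, location=None):
    """Interpret the response for a URL that should NOT exist."""
    if status in (404, 410):
        return "ok: server reports the page as missing"
    if status in (301, 302, 303, 307, 308):
        return "problem: redirect to %s instead of an error status" % location
    if status == 200:
        return "problem: 200 OK for a missing page (soft 404)"
    return "check manually: unexpected status %d" % status


# Example use (hypothetical host and path, replace with your own):
# status, location = fetch_status("www.example.com", "/no-such-page-12345.htm")
# print(status, diagnose_missing_page(status, location))
```

Run it against a path that is known not to exist; anything other than a 404 or 410 for such a path deserves a closer look.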

A proper solution for 404-Not Found and 410-Gone conditions is to provide error pages which acknowledge the specific error (404=We goofed or you found a bad link on the Web, 410=We removed this page intentionally). On the error page, include text links to your home page, site map, and search facility as applicable. You may also include a meta-refresh in that page's HEAD section to forward the client to your home page, but I'd strongly suggest making the timer value long enough for the visitor to actually read and fully consider the error message; a short meta-refresh is treated as a redirect by some search engines.

Jim

shashlik

12:58 am on Dec 24, 2007 (gmt 0)

10+ Year Member



Thank you Jim; it will be really hard to keep our hands off, as our whole business and employees depend on this listing (and it is Christmas Peak Season here right now). Of course very good advice.

ecmedia

4:29 pm on Dec 24, 2007 (gmt 0)

10+ Year Member



I once changed the robots.txt file and by mistake blocked all search engines including G. Within days the number of pages started to go down and ranking disappeared and then I realized what I did. However, once I corrected the mistake, G returned and reindexed all pages and rankings also returned. So just be patient.

shashlik

1:18 am on Dec 25, 2007 (gmt 0)

10+ Year Member



The main problem is (as always with Google) that we do not really know what triggered this; we are only guessing. Google Webmaster Tools shows our website as totally without problems; there are no warnings or other problems mentioned.

So there is no way of fixing it, nor can we be sure that anything will improve, and as one poster mentioned, Google works in weeks or months.

This feeling of total helplessness is the worst.