When a Site Goes from Fully Indexed to One Page?

How does this happen?

         

jehoshua

5:10 am on Jul 19, 2004 (gmt 0)

10+ Year Member Top Contributors Of The Month



Hi,

I have noticed in the last few days that the following website:

[snip]

which was previously fully indexed (all the links showed up in Google), now has only one entry in a Google search.

Any clues please?

Thanks,

Peter

[edited by: pageoneresults at 1:03 pm (utc) on July 19, 2004]
[edit reason] Removed URI Reference - Please Refer to TOS [/edit]

jehoshua

12:54 am on Jul 20, 2004 (gmt 0)


Hi,

If you wouldn't mind taking a quick look at the site, please PM me here and I will supply the URL, just to make sure there is nothing there that would cause this problem. The website has actually gone from:

fully indexed ====> NONE?

on Google. :(

Peter

jehoshua

4:11 am on Jul 23, 2004 (gmt 0)


Hi,

Well, with the help of "diamondgrl", it turns out that the apparent reason the website was dropped from Google was my own mistake (oops).

As "diamondgrl" pointed out:

One thing I noticed is when doing site:example.com, your /test/index_new.html page came up in addition to the home page. Since this is duplicative of the home page, you probably are incurring the duplicate content penalty.

In summary, one of the search results from Google was being referenced from another website (a forum), and it was showing [example.com...]

I needed help with a new index page, so I placed it in the /test path, but I had no idea that the link in my forum post would show up in Google. Later on I deleted the file /test/index_new.html. At the time I had this in .htaccess:

ErrorDocument 404 [example.com...]

So no doubt the next time Google visited, Googlebot went to look for /test/index_new.html, the file was not there, a '404' resulted, and the bot was redirected to the main page. As "diamondgrl" pointed out, that looks like duplicate content. :(
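A minimal sketch, in Python, of how this could be checked from the outside. Apache's documentation notes that when ErrorDocument points at a full URL, the server sends the client a redirect and the original 404 status is lost; whether that applied here depends on the elided value above. The function names `fetch_status` and `describe_missing_page` are made up for illustration, and the example URL in the note below is an assumption built from the path quoted in the post.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10.0):
    """Return (status code, Location header) for url WITHOUT following redirects."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # HEAD is enough: only the status line and headers matter here.
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def describe_missing_page(status, location=None):
    """Classify the response a server gives for a page known to be deleted."""
    if status in (404, 410):
        return "ok: real not-found status"
    if status in (301, 302, 303, 307, 308):
        return f"problem: redirected to {location}"
    if status == 200:
        return "problem: 200 for a missing page"
    return f"unexpected status {status}"
```

Calling `describe_missing_page(*fetch_status("http://example.com/test/index_new.html"))` against the real site would show whether a deleted page answers with a genuine 404 or with a redirect to the home page.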

Now, the 'washup' has included:

1. Modified robots.txt to add this line:

Disallow: /test/

2. Created a file called 404.shtml with some server-side includes to display the referrer, requested URI, etc.

3. Modified .htaccess as follows:

ErrorDocument 404 [example.com...]
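Step 1 can be sanity-checked offline with Python's standard `urllib.robotparser`. This is only a sketch: it assumes the Disallow line sits under a `User-agent: *` group, which the post does not show (a Disallow line needs a User-agent line above it to take effect), and it uses example.com as a stand-in host.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents: only the Disallow line comes from the post;
# the User-agent line is an assumption added so the rule belongs to a group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /test/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# can_fetch() returns True when the named crawler may request the URL.
test_page_allowed = parser.can_fetch(
    "Googlebot", "http://example.com/test/index_new.html")
home_page_allowed = parser.can_fetch("Googlebot", "http://example.com/")

print("test page crawlable:", test_page_allowed)
print("home page crawlable:", home_page_allowed)
```

If the rule is written correctly, the /test/ page should be reported as not crawlable while the home page stays crawlable.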

Now, I'm not sure whether I need one further change to cover it all 'fully', which may be to add this line to .htaccess:

redirect 301 /test/index_new.html [example.com...]

Hmm, but then I will be 'penalised' won't I, because of duplicate content?

Thanks,

Peter