Forum Moderators: phranque


Redid the structure of the site to make it more useful for my visitors

and am worried this may cause hundreds of duplicate content pages...

         

David Bruning

6:13 pm on May 30, 2005 (gmt 0)

10+ Year Member



On one site, I use a database and mod_rewrite to create everything.

Originally the format was [domain.org...]

After reworking everything, I have dropped the *number*.php part so it's [domain.org...]

My problem is that if Google checks one of the previously indexed pages, it still appears to exist, just with no content. As there are hundreds of pages indexed, I fear this could pose a serious duplicate content issue.

The new work that has been done makes the site MUCH better for human visitors, so this needs to happen.

What would you do?

1) Use the Google removal tool to remove the old pages
2) Use a rewrite rule to redirect to a 404 error <- if so, may I ask for help crafting the rule? I know \b\d+\b matches a number, but so far nothing I have created works.
3) Create a 404 error for each page.

As always, any help is much appreciated :)

Dave

jd01

9:18 pm on May 30, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi Dave,

Do all of your pages now have the same URL, or does the /cat/ change by page?

How are you generating the information for the 'new' URLs?

The reason I ask is that you can actually 301 (permanent redirect) the indexed pages to the new version, to retain any link credit and help with indexing. Unfortunately, if they all have the same name (/cat/), the SEs will see them as one big page with a huge amount of information, so you *might* have to look for a more creative solution...

Please, let us know and I'm sure someone will be able to point you in a direction.

You might also start with the forum charter or this thread to help understand mod_rewrite a little:
[webmasterworld.com...]

Justin

David Bruning

9:27 pm on May 30, 2005 (gmt 0)

10+ Year Member



god no :)

each page has its own url.

previously I had up to 700 pages per category simply labeled 1.php 2.php 3.php all in the directory.

Now there is a much more intuitive navigation setup - with the result that there are no longer any numerical php pages - it's all directory names instead.

But for each page Google indexed, it tries to go to the numerical php page, and it doesn't give a 404 error or anything, just the template with no info.

David Bruning

9:56 pm on May 30, 2005 (gmt 0)

10+ Year Member



Made it so the old URLs return 404s.
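
For anyone wanting to do the same with mod_rewrite, here is a minimal sketch, not necessarily how it was done here. It assumes the rules live in an .htaccess file in the document root and that the old pages all looked like category/number.php, as described above. It answers with 410 Gone (via the G flag) rather than a plain 404, and uses [0-9]+ instead of \d+ so it does not depend on the regex flavour of the Apache version. If the new directory URLs are rewritten internally to a script whose own path also fits this pattern, the rule would need an extra condition.

```apache
RewriteEngine On

# Assumption: old pages were <category>/<number>.php, e.g. somecat/12.php
# "-" means no substitution; G returns 410 Gone and stops rewriting.
RewriteRule ^[^/]+/[0-9]+\.php$ - [G]
```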

It's not a money maker site, so if it takes a few months to lose the old stuff and get the new pages indexed, that is fine :)

jd01

10:51 pm on May 30, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have to be brief, but why not...

RewriteRule ^([^.]+)\.php$ http://www.yoursite.com/ [R=301,L]

This will redirect anything that ends in .php to your index page, so users will still go to your site and not think it's broken.
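
If other .php files still need to work, a narrower variant is possible. This is only a sketch: it assumes an .htaccess file in the document root, and it assumes (hypothetically) that each old category directory still exists under the same name in the new structure, which may not be true for this site. It matches only the numbered pages and sends each one to its category directory instead of the index page.

```apache
RewriteEngine On

# Hypothetical mapping: <category>/<number>.php -> <category>/
# $1 is the category directory captured by the first group.
RewriteRule ^([^/]+)/[0-9]+\.php$ http://www.yoursite.com/$1/ [R=301,L]
```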

Justin