redirecting dynamic pages

         

morags

1:58 pm on Aug 24, 2006 (gmt 0)

10+ Year Member



I want to redirect all pages from www.oldsite.com to the index page of www.newsite.com. To do this, I used:

RedirectMatch 301 (.*) [newsite.com...]

It all works as it should. Well, almost.

All pages are being redirected to the home page, but any URL with a "?" in it keeps its query string. For example:

www.oldsite.com/foo.php?bar=1

is being redirected to

www.newsite.com/?bar=1

The index page does show as expected, but will search engines treat www.newsite.com and www.newsite.com/?bar=1 as duplicates? Both show the same page.

If it does create duplicate pages, how do I redirect an entire old site of dynamic URLs to the home page of the new site?

jdMorgan

12:06 am on Aug 25, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Search engines consider URLs to be duplicates if the page content is the same.

Is it that the exact value 'bar=1' is the problem, or is it that you want to get rid of the query strings completely? The purpose and goal are not clear, so I hesitate to offer 'my best' solution, as it may not be what you want or need.

With a more specific description of your old and new URLs and your purpose, it should be reasonably easy to find an efficient and thorough solution. For example, if you are already rewriting static URLs to dynamic URLs that match or resemble your example, then there's only one easy solution that avoids an infinite rewrite loop, and it's tricky.
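The loop-safe approach alluded to above is not spelled out in this thread. One common pattern, offered here only as an assumption about what is meant, tests %{THE_REQUEST} instead of %{QUERY_STRING}, because THE_REQUEST holds the request line as the client sent it and is not altered by internal rewrites. The sketch assumes mod_rewrite is enabled and the rules live in an .htaccess file in the document root:

```apache
RewriteEngine On
# THE_REQUEST is the original request line, e.g. "GET /foo.php?bar=1 HTTP/1.1".
# It is not changed by internal rewrites, so this condition only matches
# query strings that the client actually sent.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /[^?\ ]*\?
# Redirect to the home page; the trailing "?" drops the query string.
RewriteRule .* http://www.newsite.com/? [R=301,L]
```

A query string added internally by another rewrite rule would not trigger this redirect, which is what keeps it from looping.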

Jim

morags

7:46 am on Aug 25, 2006 (gmt 0)

10+ Year Member



Hi Jim,

It's the complete query string I want to get rid of. Googlebot and Slurp are currently on the site, so I need to act quickly.

What currently happens:

All oldsite.com pages are 301'd to home page of newsite.com

e.g. www.oldsite.com/foo.php?bar=1 is redirected to www.newsite.com/?bar=1 (I don't really want the "?bar=1", but couldn't work out how to 301 directly to the home page alone).

Bots are currently reindexing these old pages as www.newsite.com/?bar=1, www.newsite.com/?bar=2 etc. All are sending 200 OK headers. However, every page is an identical copy of www.newsite.com/index.html - which is what worries me.

I now plan, immediately, to use mod_rewrite on www.newsite.com so that any URL containing "/?" is redirected to "www.newsite.com/", once I work out the directive to do so.

Have I done the right thing - eventually?

Or do SEs ignore anything after "/?" and I am worrying unnecessarily? That's all I really wanted to know.

jdMorgan

2:45 pm on Aug 25, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You may end up with duplicate content problems, because SEs index by URLs, not pages.

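# Only act on requests that carry a non-empty query string
RewriteCond %{QUERY_STRING} .
# Redirect to the same path on the new site; the trailing "?" drops the query string
RewriteRule (.*) http://www.newsite.com/$1? [R=301,L]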

Jim

morags

6:19 pm on Aug 25, 2006 (gmt 0)

10+ Year Member



I managed to get it working successfully, but had to remove the "$1" from the destination URL. With the "$1", the rewritten URL still had the query string appended. After removing it, all works as it should.
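For reference, the variant described here (Jim's rule with the "$1" removed) would look roughly like the following. This is a reconstruction from the description, not the exact file that was deployed, and the RewriteEngine line is an assumption:

```apache
RewriteEngine On
# Act only when the request has a non-empty query string
RewriteCond %{QUERY_STRING} .
# No "$1": every such request goes to the bare home page.
# The trailing "?" drops the query string from the redirect target.
RewriteRule .* http://www.newsite.com/? [R=301,L]
```

On Apache 2.4 and later, the QSD flag can be used instead of the trailing "?" to discard the query string.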

Thanks Jim.

morags

9:41 pm on Aug 25, 2006 (gmt 0)

10+ Year Member



Still have a slight problem. Google has obviously got hold of some of the URLs which were being created incorrectly. When Googlebot comes round it looks for these non-existent URLs and gets a 200 OK.

e.g. incorrect URL was www.newsite.com/?bar=1

When Google comes looking for this URL it gets served the home page, and a 200 OK header. So it looks like I wasn't quick enough to implement the correct rewrite rule before Google got hold of some of the wrongly rewritten URLs.

So what I think I should do now, if possible, is rewrite (on the NEW server) any URLs which contain "/?" to www.newsite.com/. Correct? If so, what rewrite rule should I use?
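A minimal sketch of such a rule for the new server, assuming mod_rewrite is enabled and the rule sits in an .htaccess file in the document root of www.newsite.com. It is limited to the home page so that any other URL on the new site that legitimately uses a query string (a hypothetical case, nothing in this thread says there are any) is left alone:

```apache
RewriteEngine On
# Only requests that arrive with a non-empty query string
RewriteCond %{QUERY_STRING} .
# In a document-root .htaccess the home page "/" is matched as an empty path.
# The trailing "?" drops the query string from the redirect target.
RewriteRule ^$ http://www.newsite.com/? [R=301,L]
```

Because this is an external 301 rather than an internal rewrite, a request for www.newsite.com/?bar=1 would get a redirect to www.newsite.com/ instead of a 200 OK. Once redirected, the request has no query string, so the condition no longer matches and there is no loop.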

And is this really a problem? If it is, then I can think of a very easy way to bring down competitors' sites, and we all know that Google says that this is not possible! (How about I create a 100,000 page spammy site, get it indexed by Google, then 301 it incorrectly to a competitor's site, giving them 100,001 home pages?)