Remove ?example.com_id_1105 from domain in Google

How do you remove query strings from URL to avoid supplemental index?

         

macweb

12:25 am on Oct 13, 2007 (gmt 0)

10+ Year Member



Hello,

I'm a newbie here, so forgive me if this problem has been addressed elsewhere.

Basically my site is listed on Google, with all pages unique apart from one page very similar to the following example:

www.example.com/?some-site.com_id_110501_question

This "page" appears in Google's supplemental index and I'd prefer to get rid of it because it's a duplicate of the main domain:

www.example.com

I can't tell Google Webmaster Tools to delete the duplicate, because it's basically the home page with a hijacked parameter appended. (I'm surprised Google indexes this.)

I'm aware of .htaccess 301 redirects, but only the basics of redirecting an old page to a new page, like this:

redirect 301 /old/old.htm http://www.new.com/new.htm

Is there any way to use htaccess to remove the ?some-site.com_id_110501_question so the page disappears from Google's radar and eventually drops out of the index by itself?

Any help you can offer is appreciated.

Mac

[edited by: jdMorgan at 3:37 am (utc) on Oct. 15, 2007]
[edit reason] examplified [/edit]

g1smd

12:39 am on Oct 13, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I would use a rewrite rule to silently rewrite the URL request to /page.does.not.exist. That's a rewrite, NOT a redirect.

That will ensure that the URL request serves a 404 response. Google will then delist it from the index.

macweb

1:03 pm on Oct 13, 2007 (gmt 0)

10+ Year Member



Many thanks. Does anyone know the exact .htaccess code for this?

g1smd

6:53 pm on Oct 13, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There are several examples of what you need posted in just the last few days.

Start here:


RewriteCond %{QUERY_STRING} some-site\.com_id_110501_question
RewriteRule .* /this.file.does.not.exist [L]

[edited by: jdMorgan at 3:37 am (utc) on Oct. 15, 2007]
[edit reason] examplified [/edit]

jdMorgan

3:46 pm on Oct 14, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Alternatively, you could 301-redirect it to the correct URL:

RewriteCond %{QUERY_STRING} some-site\.com_id_110501_question
RewriteRule ^$ http://www.example.com/? [R=301,L]

Jim

[edited by: jdMorgan at 3:38 am (utc) on Oct. 15, 2007]

macweb

7:28 pm on Oct 14, 2007 (gmt 0)

10+ Year Member



Thanks guys, but neither of these has any effect.

Any other suggestions?

jdMorgan

7:44 pm on Oct 14, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The "%" was missing from the two previous posts. Also, if you have no other working mod_rewrite rules, you will need to preface the code with either both of the following lines, or the second line only, depending on how your server is set up:

Options +FollowSymLinks
RewriteEngine on
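
For reference, here is a minimal sketch of how the pieces discussed so far might fit together in one .htaccess file. It assumes there are no other mod_rewrite rules in the file and that the host permits the Options directive; if it does not, drop that line and keep only RewriteEngine. The "#" comment lines are added here for explanation.

```apache
# Needed on some servers before mod_rewrite will work in .htaccess;
# omit this line if your host does not allow Options overrides.
Options +FollowSymLinks
RewriteEngine on

# If the home page is requested with the unwanted query string,
# 301-redirect to the bare home page. The trailing "?" in the
# target clears the query string.
RewriteCond %{QUERY_STRING} some-site\.com_id_110501_question
RewriteRule ^$ http://www.example.com/? [R=301,L]
```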

Jim

macweb

10:08 pm on Oct 14, 2007 (gmt 0)

10+ Year Member



Jim,

I have code in my .htaccess file that redirects inbound links pointing to domain.com to www.domain.com (to pass on maximum Google PageRank).

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{HTTP_HOST} ^example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
</IfModule>

I tried inserting the code provided in the previous replies, but it has no effect on the URL. Placing it inside or outside the <IfModule> tags makes no difference.

jdMorgan

11:06 pm on Oct 14, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, let's see the whole combined RewriteRule sequence then, and an example log entry from your server access log.

Also, be sure you've completely flushed your browser cache after changing the code on your server. Otherwise, your request will be served from cache, and will not be sent to the server.

Jim

macweb

12:16 am on Oct 15, 2007 (gmt 0)

10+ Year Member



Jim,

Okay, it was a cache problem. I wasn't aware the cache needed to be cleared for .htaccess.

Now I know.

For future reference, I confirm the following lines will remove the specified string from your URL:

RewriteCond %{QUERY_STRING} some-site\.com_id_110501_question
RewriteRule ^$ http://www.example.com/? [R=301,L]

For example, if your page is listed in Google as:
www.example.com/?some-site.com_id_110501_question

then the above code will 301 redirect to:
www.example.com/

Just change the
some-site\.com_id_110501_question
to whatever string you want removed.

Thanks for your help guys. Great job.

Mac

[edited by: jdMorgan at 3:35 am (utc) on Oct. 15, 2007]
[edit reason] examplified [/edit]

jdMorgan

12:47 am on Oct 15, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, if your browser has cached a page, then it won't send a request to your server when that page is re-requested, so no server-side code can have any effect. The second and subsequent requests will be served from the browser cache until the cache entry expires.

Put the more-specific query string rule ahead of your "www" rule, so as to avoid stacked redirects for the query string.
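
As a sketch of that ordering, using the two rule sets already posted in this thread (the comments are added for explanation):

```apache
RewriteEngine on

# 1. More specific: home page with the unwanted query string.
#    The target already uses the www hostname, so a request to
#    either hostname is corrected in a single redirect.
RewriteCond %{QUERY_STRING} some-site\.com_id_110501_question
RewriteRule ^$ http://www.example.com/? [R=301,L]

# 2. Less specific: any remaining request on the non-www hostname.
RewriteCond %{HTTP_HOST} ^example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
```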

Jim

macweb

1:03 am on Oct 15, 2007 (gmt 0)

10+ Year Member



Jim,

>> Put the more-specific query string rule ahead of your "www" rule, so as to avoid stacked redirects for the query string.

Okay I've done that also.

What is a "stacked redirect" and why does placing it before the www redirect solve it?

Mac

jdMorgan

1:23 am on Oct 15, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



With the "www" rule placed first, a request for "example.com/?some-site.com_id_110501_question" would first get redirected to "correct" the non-www domain, and then it would get redirected again to remove the query string. So, you'd have two 'stacked' redirects.

This is inefficient and, according to current thinking, less than optimal for passing PR: the evidence indicates that PR is preserved through one 301 redirect, but not always through more than one.

So a good rule of thumb is: Place external redirects first, ordered from most-specific pattern to least, then place internal rewrites, again ordered from most-specific pattern to least.
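
A rough skeleton of that layout might look like the following. The two internal rewrites and their file names (product.php, index.php) are hypothetical, included only to show where such rules would go; they are not from this site.

```apache
RewriteEngine on

# --- External redirects (R=301), most specific first ---

# Exact home-page URL with one known query string
RewriteCond %{QUERY_STRING} some-site\.com_id_110501_question
RewriteRule ^$ http://www.example.com/? [R=301,L]

# Anything else on the non-www hostname
RewriteCond %{HTTP_HOST} ^example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

# --- Internal rewrites (no R flag), most specific first ---

# Hypothetical: map a friendly URL onto a script
RewriteRule ^products/([0-9]+)$ /product.php?id=$1 [L]

# Hypothetical: send every other request for a non-existent file
# to a front controller
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* /index.php [L]
```

Keeping the redirects first means a visitor or spider is sent to the canonical URL before any internal rewriting takes place, so a rewritten internal file path is never exposed in a redirect.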

Jim

[edited by: jdMorgan at 3:36 am (utc) on Oct. 15, 2007]