
Forum Moderators: Ocean10000 & incrediBILL & phranque


Redirect dynamic url starting with ?

     

IngoZ

4:24 pm on Apr 12, 2012 (gmt 0)



I want to redirect a few URLs to a 404 Not Found page, but I don't know how. These dynamic URLs were indexed by Google. I tried to remove them in Google Webmaster Tools, but it isn't possible: the URLs redirect to the homepage, so when I create a removal request it appears as a "site removal", not a "page removal". All I want is to eliminate these pages.

/?start=50

/?start=100

/?ref=akagunduz.com

/?refsite=www.n1ads.com&ref=alexa-traffic-rank

enigma1

5:23 pm on Apr 12, 2012 (gmt 0)




First, you shouldn't redirect from inside your domain to a 404 page. You can serve a 404 straight away, although in this case you don't even need to do that. Just make sure these incorrect links aren't exposed anywhere in your domain, otherwise the errors you see won't go away.
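If you did want to answer them straight away, a minimal .htaccess sketch (assuming Apache with mod_rewrite; the parameter names are taken from the list in the first post, and [G] sends 410 Gone, which works as well as 404 for removal):

RewriteEngine On
# Match any of the stray parameters anywhere in the query string
RewriteCond %{QUERY_STRING} (^|&)(start|ref|refsite)=
# Answer requests for the site root with 410 Gone instead of redirecting
RewriteRule ^$ - [G]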

lucy24

10:09 pm on Apr 12, 2012 (gmt 0)




You want less code, not more. Take away the line that redirects bogus requests to the home page.

When you say "tried to remove" are you talking about the URL Removal area or the "ignore parameters" area? Here you need the parameters. This function is for parameters you no longer use, and for parameters that don't affect the content of the page.

IngoZ

10:27 pm on Apr 12, 2012 (gmt 0)



I used the URL removal option. It's not really redirecting; those pages display the content of the homepage, and I've lost positions for most keywords. I removed all the duplicated pages from the SERPs except the pages above. I also used rel=canonical.
At your suggestion I added "?" and "=" as parameters.

incrediBILL

12:34 am on Apr 13, 2012 (gmt 0)




Um, not sure how that's going to work, as "?" and "=" aren't parameters; "start", "ref" and "refsite" are the actual parameters.

Another simple method is to put code into your pages so that, when parameters are passed that Googlebot should ignore, you include a meta robots NOINDEX tag in the header of the page.
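A sketch of that idea without editing the page templates, assuming Apache with both mod_rewrite and mod_headers available (search engines treat the X-Robots-Tag response header like the meta robots tag):

RewriteEngine On
# Flag requests that carry any of the unwanted parameters
RewriteCond %{QUERY_STRING} (^|&)(start|ref|refsite)=
RewriteRule ^ - [E=NOINDEX_PAGE:1]
# Send a noindex header for flagged requests; in .htaccess context the
# variable may surface with a REDIRECT_ prefix, so cover both names
Header set X-Robots-Tag "noindex" env=NOINDEX_PAGE
Header set X-Robots-Tag "noindex" env=REDIRECT_NOINDEX_PAGE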

Also, did you add those URLs into robots.txt?

Once you make a crawling mess, getting rid of it can create just as big a mess, if not a bigger one, while you undo the damage.

BTW, work on methods that remove the stuff from ALL search engines, otherwise it'll still show up in Bing, Yahoo, etc., and ultimately end up scraped somewhere and right back in Google all over again.

IngoZ

8:56 am on Apr 13, 2012 (gmt 0)



I have added in my robots.txt

User-agent: *
Disallow: /*=
Disallow: /*?
Disallow: /*&

IngoZ

7:55 pm on Apr 13, 2012 (gmt 0)



I still think I should use something in .htaccess to block these URLs; it would be faster.

For another site I have a URL indexed like this:
website.tld/?refsite=www.n1ads.com
It also displays the content of the main page, and I can't remove it.

g1smd

2:15 am on Apr 14, 2012 (gmt 0)




Send 404 or even 410:

RewriteCond %{QUERY_STRING} (^|&)something=value(&|$)
RewriteRule ^somepath - [G]

IngoZ

12:29 pm on Apr 14, 2012 (gmt 0)



I used something like this and it works:

RewriteEngine On
RewriteCond %{QUERY_STRING} ^ref=(.*)
RewriteRule ^.* /404.php%1? [NE,R=permanent]

g1smd

1:41 pm on Apr 14, 2012 (gmt 0)




Your condition will match only when ref is the first parameter. My example allowed for there to be preceding parameters and still match.

The (.*) capture will capture the value for the ref parameter and the rest of the query string parameter names and values. My code captured only the first value.

The condition will now be checked for all requests: pages, images, stylesheets, js files. You should limit what is checked.

You're now sending status "301 Moved" in response to those requests. That is a problem. You should send 404.

The rule needs the L flag.
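Putting those points together, a hedged sketch (assuming the rule lives in the root .htaccess and only the site root needs protecting; 410 Gone via [G] stands in for 404 here):

RewriteEngine On
# Match ref whether or not other parameters precede it
RewriteCond %{QUERY_STRING} (^|&)ref=
# Only the root URL is checked, so image/CSS/js requests are untouched;
# answer 410 Gone and stop processing further rules
RewriteRule ^$ - [G,L]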

lucy24

5:16 pm on Apr 14, 2012 (gmt 0)




You're now sending status "301 Moved" in response to those requests. That is a problem. You should send 404.

:: detour to mod_rewrite docs, which I really ought to have memorized by now ::

Is there a mod_rewrite flag that says 404? I've only ever found a 410 [G].

I don't think we ever nailed down the original question: Did these queries formerly exist, or are they purely the product of google's fevered imagination? Does the site use query strings at all?


I just checked something I should have checked ages ago on my own (100% static) site. I was distressed to discover that if I make up a completely random query and tack it onto the name of a completely random HTML page, the query is simply ignored. Is this a problem?

g1smd

6:55 pm on Apr 14, 2012 (gmt 0)




Yes, it is.

It's a potential source of infinite duplicate content. However, search engines should be quite good at spotting this problem. With no dynamic content on the page, all URL versions should be byte-for-byte identical.

If you use no query strings at all for anything, then such requests can all be either redirected or blocked.
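On a fully static site, the redirect option can be sketched in .htaccess like this (assuming Apache with mod_rewrite; the trailing ? in the substitution is what strips the query string):

RewriteEngine On
# Any non-empty query string on this static site is bogus
RewriteCond %{QUERY_STRING} .
# Redirect to the same path with the query string stripped (trailing ?)
RewriteRule (.*) /$1? [R=301,L]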
 
