Home / Forums Index / Code, Content, and Presentation / Apache Web Server
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL & phranque

Apache Web Server Forum

Redirect dynamic url starting with ?

 4:24 pm on Apr 12, 2012 (gmt 0)

I want to redirect a few URLs to a 404 Not Found page but don't know how. These dynamic URLs were indexed by Google. I tried to remove them in Google Webmaster Tools, but it's not possible: the URLs redirect to the homepage, and when I create a removal request it appears as a "site removal", not a "page removal". All I want is to eliminate these pages.







 5:23 pm on Apr 12, 2012 (gmt 0)

First, you shouldn't redirect from inside your domain to a 404. You can return a 404 straight away, although in this case you don't even need to do that. Just make sure these incorrect links aren't exposed somewhere on your domain; otherwise the errors you see won't go away.


 10:09 pm on Apr 12, 2012 (gmt 0)

You want less code, not more. Take away the line that redirects bogus requests to the home page.

When you say "tried to remove" are you talking about the URL Removal area or the "ignore parameters" area? Here you need the parameters. This function is for parameters you no longer use, and for parameters that don't affect the content of the page.


 10:27 pm on Apr 12, 2012 (gmt 0)

I used the URL removal option. It's not really redirecting: those pages display the content from the homepage, and I lost positions for most keywords. I removed all duplicated pages from the SERPs except the pages above. I also used rel canonical.
At your suggestion I added as parameters "?" and "=".


 12:34 am on Apr 13, 2012 (gmt 0)

Um, not sure how that's going to work, as "?" and "=" aren't parameters; "start", "ref" and "refsite" are the actual parameters.

Another simple method is to put code into your pages so that, when parameters that Googlebot should ignore are passed, you include the meta robots NOINDEX tag in the head of the page.
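A server-side variant of the same idea, as a rough sketch rather than the method described above: instead of a meta tag in the page, send the equivalent X-Robots-Tag response header whenever one of the parameters named in this thread is present. It assumes Apache 2.4 (for the <If> block) with mod_headers enabled, and that "start", "ref" and "refsite" are the only parameters involved.

```apache
# Assumption: Apache 2.4+ with mod_headers loaded; placed in .htaccess.
# If the query string contains start=, ref= or refsite= as a parameter,
# tell crawlers not to index this response.
<If "%{QUERY_STRING} =~ /(^|&)(start|ref|refsite)=/">
    Header set X-Robots-Tag "noindex"
</If>
```

The header covers any content type, not only HTML, but like the meta tag it only helps if crawlers are allowed to fetch the URL and see it.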

Also, did you add those URLs into robots.txt?

Once you make a crawling mess, getting rid of it can create just as big a mess, if not a bigger one, as you try to undo the damage.

BTW, work on methods that work for removing stuff from ALL search engines otherwise it'll still show up in Bing, Yahoo, etc. and ultimately end up scraped somewhere and right back into Google all over again.


 8:56 am on Apr 13, 2012 (gmt 0)

I have added in my robots.txt

User-agent: *
Disallow: /*=
Disallow: /*?
Disallow: /*&


 7:55 pm on Apr 13, 2012 (gmt 0)

I still think I should use something in .htaccess to block these URLs; it would be faster.

For another site I have a URL indexed like this
, also displaying the content from the main page, I can't remove it.

 2:15 am on Apr 14, 2012 (gmt 0)

Send 404 or even 410:

RewriteCond %{QUERY_STRING} (^|&)something=value(&|$)
RewriteRule ^somepath - [G]


 12:29 pm on Apr 14, 2012 (gmt 0)

I used something like this and it works

RewriteEngine On
RewriteCond %{QUERY_STRING} ^ref=(.*)
RewriteRule ^.* /404.php%1? [NE,R=permanent]


 1:41 pm on Apr 14, 2012 (gmt 0)

Your condition will match only when ref is the first parameter. My example allowed for there to be preceding parameters and still match.

The (.*) capture will capture the value for the ref parameter and the rest of the query string parameter names and values. My code captured only the first value.

The condition will now be checked for all requests: pages, images, stylesheets, js files. You should limit what is checked.

You're now sending status "301 Moved" in response to those requests. That is a problem. You should send 404.

The rule needs the L flag.
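Putting those points together, one possible sketch (not tested against the poster's site) looks like the following. It assumes the .htaccess file sits in the document root and that only the home page, requested as "/" or "/index.php", ever receives the stray "ref" parameter; the index.php name is a guess for illustration. With a "-" substitution, mod_rewrite accepts a non-3xx status in the R flag, so R=404 returns a real 404 rather than a redirect.

```apache
RewriteEngine On

# Match ref= anywhere in the query string, not only as the first
# parameter, and match only that one parameter's value.
RewriteCond %{QUERY_STRING} (^|&)ref=[^&]*(&|$)

# Check only home page requests (assumed: "/" or "/index.php"), so
# images, stylesheets and scripts skip the condition entirely.
# "-" means no substitution; R=404 sends "404 Not Found" and stops
# rewriting, so the L flag here is redundant but harmless.
RewriteRule ^(index\.php)?$ - [R=404,L]
```

Swapping [R=404,L] for [G] would send 410 Gone instead.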


 5:16 pm on Apr 14, 2012 (gmt 0)

You're now sending status "301 Moved" in response to those requests. That is a problem. You should send 404.

:: detour to mod_rewrite docs, which I really ought to have memorized by now ::

Is there a mod_rewrite flag that says 404? I've only ever found a 410 [G].

I don't think we ever nailed down the original question: Did these queries formerly exist, or are they purely the product of Google's fevered imagination? Does the site use query strings at all?

I just checked something I should have checked ages ago on my own (100% static) site. Was distressed to discover that if I make up a completely random query and tack it onto the name of a completely random html page, the query is simply ignored. Is this a problem?


 6:55 pm on Apr 14, 2012 (gmt 0)

Yes, it is.

It's a potential source of infinite duplicate content. However, search engines should be quite good at spotting this problem. With no dynamic content on the page, all URL versions should be byte for byte identical.

If you use no query strings at all for anything then such requests can all be either redirected or blocked.
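A minimal sketch of both options, assuming a site that really uses no query strings anywhere and an .htaccess file in the document root with mod_rewrite enabled. Use one option or the other, not both.

```apache
RewriteEngine On

# Option 1: redirect any request that carries a query string to the
# same path without it. The trailing "?" in the substitution drops
# the original query string.
RewriteCond %{QUERY_STRING} .
RewriteRule ^ %{REQUEST_URI}? [R=301,L]

# Option 2 (instead of option 1): refuse such requests with 410 Gone.
# RewriteCond %{QUERY_STRING} .
# RewriteRule ^ - [G]
```

The redirect keeps visitors who follow a tagged link on the real page, while the block tells crawlers plainly that the query-string URL does not exist.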


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved