


Dynamic URL Rewrite - RewriteCond %{QUERY_STRING}

RewriteCond QUERY_STRING for index.php?page=somepage

     
8:02 am on May 24, 2011 (gmt 0)



Hi all!

Could someone please help me write a RewriteRule for the query string "?page=otherpage"?

I tried this:
RewriteCond %{QUERY_STRING} page=index
RewriteRule ^index.php$ http://www.domain.com/$1? [R=301]


It works for index.php, so the URL shows only http://www.domain.com/ instead of http://www.domain.com/index.php?page=index


But what if the condition is RewriteCond %{QUERY_STRING} page=otherpage?
How do I write the RewriteRule for that RewriteCond?

The current path is http://www.domain.com/index.php?page=otherpage.
I want it to be read as: http://www.domain.com/otherpage


Thanks in advance!

hideaky
1:17 am on May 28, 2011 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member




RewriteCond %{QUERY_STRING} ^([^&]*&)*page=([^&]+)
RewriteRule ^index\.php$ http://www.domain.com/%2? [R=301,L]

This assumes that your pages are no longer dynamically-generated and no longer link to other pages by using the "page=" query string, so you're trying to get rid of the old URLs listed in search results.
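As a minimal sketch of that redirect in a complete .htaccess block, assuming the file sits in the document root and that the old "page=index" URL should land on the site root (as in the original question). The `(^|&)` prefix is a swapped-in alternative to `^([^&]*&)*`; in both forms the page value is the second captured group, hence `%2`.

```apache
RewriteEngine On

# Assumption: index.php?page=index should redirect to the bare site root.
RewriteCond %{QUERY_STRING} (^|&)page=index(&|$)
RewriteRule ^index\.php$ http://www.domain.com/? [R=301,L]

# Any other page value: %2 is the value captured after "page=".
# The trailing "?" on the target drops the old query string.
RewriteCond %{QUERY_STRING} (^|&)page=([^&]+)
RewriteRule ^index\.php$ http://www.domain.com/%2? [R=301,L]
```

With these rules, a request for /index.php?page=otherpage should be redirected to http://www.domain.com/otherpage.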

If this is not the case, see the thread "Changing Dynamic URLs to Static URLs" in our Apache Forum Library. That (very-detailed) thread describes the three-step process needed to use "static-looking" URLs on a dynamically-generated site.
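For a site that is still dynamically generated, a rough sketch of the server-side part might look like the block below. This is not taken from the Library thread; it is a common pattern, and it assumes the pages themselves already link to the static-looking URLs, that page names contain no slashes or dots, and that index.php lives in the document root. Matching on %{THE_REQUEST} (the original request line sent by the client) keeps the redirect from firing on the internal rewrite, which would otherwise loop.

```apache
RewriteEngine On

# Externally redirect only URLs the client actually requested in the old form.
# THE_REQUEST looks like: GET /index.php?page=otherpage HTTP/1.1
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /index\.php\?page=([^&\ ]+)
RewriteRule ^index\.php$ http://www.domain.com/%1? [R=301,L]

# Internally rewrite the static-looking URL back to the script.
# Skip anything that exists on disk as a real file or directory.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/.]+)$ /index.php?page=$1 [L]
```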

Jim
6:48 am on Jul 21, 2011 (gmt 0)

5+ Year Member



I have a similar problem. I am trying to get rid of a specific way of generating pages that is no longer in use but is still indexed in search results.

One of these query strings for example is -

index.php?option=com_dtregister&controller=validate&task=uniqueUser&no_html=1

Another is
index.php?option=com_dtregister&Itemid=3


Following your example I got this to work

RewriteCond %{QUERY_STRING} ^([^&]*&)*option=com_dtregister
RewriteRule ^index.php$ http://www.example.com/%1? [R=301,L]


But what I'd really like to do is apply [G] so it returns a 410. At least I think that's what I want. I want Google to know that it should forget anything with com_dtregister.

Do you think [G] is the right approach, and if so, how would that work in this case?
7:19 am on Jul 21, 2011 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



Is there any possibility of humans following these outdated links, or is this strictly for search engines? For humans, the 410 is definitely a last resort; you'd really like them to land somewhere. If the page still exists in query-free form, just chop off the query string and let people land on the "base" page.

But for search engines, if it's a choice between returning a 410 and redirecting a bunch of different urls to the same place, go with the, er, Gone. In mod_rewrite they are set up exactly like [F]:

RewriteRule {any old stuff here} - [G]


Since you're applying it to everything that has a particular query string, you don't need to have anything in particular in the rule. Maybe \.php$ to be tidy about it.

In the %{QUERY_STRING} part, if you want to exclude everything that contains the com_dtregister element, and you're not capturing the rest, you don't need the
^([^&]*&)*option=

part. (Hm. Seems to be a rash of unwanted smileys lately. One way to keep them out is to wrap everything in "code" tags ;)) Just write out the part of the query string that you do want to look for.
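Put together, a minimal sketch of that advice for the com_dtregister case could look like this. It assumes the rules go in the site's root .htaccess, ahead of any rewrite rules the CMS supplies, so that the 410 is returned before anything else handles the request.

```apache
RewriteEngine On

# Any .php request whose query string contains option=com_dtregister
# gets a 410 Gone. The "-" means "do not change the URL".
RewriteCond %{QUERY_STRING} option=com_dtregister
RewriteRule \.php$ - [G]
```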
9:09 am on Jul 21, 2011 (gmt 0)

5+ Year Member



No one needs these pages. com_dtregister didn't work. We moved on to another program. I want Google to stop carrying around these old links. They're diluting our keyword density, too.

I'm lost on the syntax. Would it be possible to provide a real-world syntax-correct example?

thank you
4:26 pm on Jul 21, 2011 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Is that a Joomla site?

Make sure you have updated to the very latest "htaccess.txt" file from the newest Joomla install set.

The new file is compatible with all older versions of Joomla but contains important code changes.
4:38 pm on Jul 21, 2011 (gmt 0)

5+ Year Member



Joomla 1.5.23.
I just can't get the syntax.
7:02 pm on Jul 21, 2011 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



com_dtregister didn't work

Did you leave off the anchors? The two elements
^(.*)stuffhere(.*)$

and
stuffhere

are functionally the same in most situations, but the short version generally runs faster and cleaner.
7:53 pm on Jul 21, 2011 (gmt 0)

5+ Year Member



I'm sorry, but without the full syntax you'd put in an .htaccess file, I don't know how to follow what you've posted. Syntax is a bear.
10:46 pm on Jul 21, 2011 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



Syntax is a bear.

And there you have htaccess in a nutshell. Full version: You said "com_dtregister didn't work". Did your complete line say

RewriteCond %{QUERY_STRING} ^com_dtregister 


or did it say

RewriteCond %{QUERY_STRING} com_dtregister


The first version would only work for query strings that begin with "com_dtregister"-- in other words, never. The second version will work for anything that contains "com_dtregister".

If you are using [G] you don't rewrite at all, you just put in a - (hyphen) [httpd.apache.org] meaning "don't change the original":

RewriteRule .+ - [G]
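As a complete fragment, one hedged variant of the above is sketched below. The `(^|&)` and `(&|$)` boundaries are an addition, not something required by the advice above; they keep the condition from matching a longer value such as a hypothetical option=com_dtregister_extra. The pattern `^` is used in place of `.+` so the rule also applies when the requested path is empty, as it is for a request to the site root in .htaccess context.

```apache
RewriteEngine On

# Match option=com_dtregister as a whole parameter, anywhere in the query string.
RewriteCond %{QUERY_STRING} (^|&)option=com_dtregister(&|$)
# "^" matches every request path; "-" leaves the URL alone; [G] returns 410 Gone.
RewriteRule ^ - [G]
```

Both of the example URLs from earlier in the thread (the one with controller=validate and the one with Itemid=3) begin with option=com_dtregister followed by "&", so both would satisfy this condition.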


Did someone else just ask this identical question? What's the apache page doing so near the top of my browser history?
11:36 pm on Jul 21, 2011 (gmt 0)

5+ Year Member



I had:
RewriteCond %{QUERY_STRING} ^([^&]*&)*com_dtregister=([^&]+)
RewriteRule ^index.php$ [domain.com...] [R=301,L]

Then to make it a [G] I changed it to:

RewriteCond %{QUERY_STRING} ^([^&]*&)*com_dtregister=([^&]+)
RewriteRule .+ - [G]

So I believe you're saying I could do this:
(and yes anything with com_dtregister is to be wiped out)

RewriteCond %{QUERY_STRING} com_dtregister
RewriteRule .+ - [G]


Which is better to clean cruft out of Google Cache? 301 or 410?

I was worried the 301 was leaving the links in place but creating massive duplicate content at the target (home page). So if I gave Google a 410, eventually it would let go of the idea that the links are ever going to be meaningful.

Does this sound reasonable?
12:20 am on Jul 22, 2011 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Both 301 and 410 remove the URL from the index.

301 seamlessly takes the visitor someplace else so you retain the traffic. 410 shuts the visitor out and encourages a bounce unless you put enticing links on your 410 error page.
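If you go the 410 route, a custom error page is one way to soften the bounce. A minimal sketch, assuming a hypothetical page at /gone.html that you create yourself with links to your current content:

```apache
# /gone.html is a placeholder name; point this at whatever page you build.
ErrorDocument 410 /gone.html
```

The visitor still receives the 410 status, but sees your page instead of the server's default error message.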
12:21 am on Jul 22, 2011 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



Which is better to clean cruft out of Google Cache? 301 or 410?

If I knew the answer to that, I would be rich :)

I was worried the 301 was leaving the links in place but creating massive duplicate content at the target (home page). So if I gave Google a 410, eventually it would let go of the idea that the links are ever going to be meaningful.

Does this sound reasonable?

That's assuming for the sake of discussion that "sounds reasonable" and "what Google would do" are the same thing. A 410 is better than a 404, and a 301 is better than a 302. No matter what you do, Google will periodically visit the old links just to see if maybe you decide in 2023 to reactivate them. But in the long term, 410 is best.

Edit after seeing intervening post: I think you said earlier that this is all about search engines, not human visitors. It's always easier when you only have to think about one or the other :)
7:51 am on Jul 22, 2011 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



Oops, time ran out for editing. Turns out there may be an entirely different alternative solution if your main concern is g###. That was a rhetorical statement. If you can bring yourself to sign up for Google Webmaster Tools, there's a section on "URL parameters" under "site configuration". (Found it while looking for something else, naturally.)
Only use this feature if you feel confident about how parameters work for your site. Telling Googlebot to exclude URLs with certain parameters could result in large numbers of your pages disappearing from our index.

So if you want "large numbers of your pages disappearing from our index" this would seem to be just the ticket :)
8:39 am on Jul 22, 2011 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



While that solution works, it is only good for Google. Configuring the site correctly is better. The fix will then work for all visitors and bots.
11:51 am on Jul 22, 2011 (gmt 0)

5+ Year Member



Great discussion. Thanks again.
 
