


Dynamic URL Rewrite - RewriteCond %{QUERY_STRING}

RewriteCond QUERY_STRING for index.php?page=somepage

     
8:02 am on May 24, 2011 (gmt 0)

New User

joined:May 24, 2011
posts: 1
votes: 0


Hi all!

Could someone please help me write a RewriteRule for the query string "?page=otherpage"?

I tried this:
RewriteCond %{QUERY_STRING} page=index
RewriteRule ^index.php$ http://www.domain.com/$1? [R=301]


It works for index.php, so the URL shows only http://www.domain.com/ instead of http://www.domain.com/index.php?page=index


But what if the condition is RewriteCond %{QUERY_STRING} page=otherpage?
How should I write the RewriteRule for that RewriteCond?

The current URL is http://www.domain.com/index.php?page=otherpage and I want it to show as http://www.domain.com/otherpage


Thanks in advance!

hideaky
1:17 am on May 28, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0



# The value of "page" is captured by the second group, so the redirect uses %2
RewriteCond %{QUERY_STRING} ^([^&]*&)*page=([^&]+)
RewriteRule ^index\.php$ http://www.domain.com/%2? [R=301,L]

This assumes that your pages are no longer dynamically-generated and no longer link to other pages by using the "page=" query string, so you're trying to get rid of the old URLs listed in search results.

If this is not the case, see the thread "Changing Dynamic URLs to Static URLs" in our Apache Forum Library. That (very-detailed) thread describes the three-step process needed to use "static-looking" URLs on a dynamically-generated site.
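For anyone in that second situation, here is a rough sketch of the redirect-plus-rewrite idea. It is not the code from the Library thread. It assumes the rules live in the .htaccess file in the document root, that index.php still builds the pages from "page=", and that page names contain no slashes or dots. The THE_REQUEST test is there so that the internal rewrite cannot feed back into the redirect and loop.

```apache
RewriteEngine On

# Externally redirect only direct client requests for the old dynamic URL.
# THE_REQUEST holds the original request line, e.g.
# "GET /index.php?page=otherpage HTTP/1.1", and is not changed by rewrites.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /index\.php\?page=([^&\ ]+)
RewriteRule ^index\.php$ http://www.domain.com/%1? [R=301,L]

# Internally rewrite the static-looking URL back to the script,
# skipping requests for real files and directories.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/.]+)$ /index.php?page=$1 [L]
```

The links on the pages themselves still have to be changed to the new form; the redirect only catches old URLs that are already out there. As written, page=index would redirect to /index rather than to the bare domain, so that case would need its own rule.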

Jim
6:48 am on July 21, 2011 (gmt 0)

New User

5+ Year Member

joined:July 31, 2008
posts: 14
votes: 0


I have a similar problem. I am trying to get rid of a specific way of generating pages that is no longer in use but is still indexed in search results.

One of these query strings, for example, is:

index.php?option=com_dtregister&controller=validate&task=uniqueUser&no_html=1

Another is
index.php?option=com_dtregister&Itemid=3


Following your example I got this to work

RewriteCond %{QUERY_STRING} ^([^&]*&)*option=com_dtregister
RewriteRule ^index.php$ http://www.example.com/%1? [R=301,L]


But what I'd really like to do is apply [G] so it returns a 410. At least I think that's what I want. I want Google to know that it should forget anything with com_dtregister.

Do you think [G] is the right approach, and if so, how would that work in this case?
7:19 am on July 21, 2011 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:Apr 9, 2011
posts:12717
votes: 244


Is there any possibility of humans following these outdated links, or is this strictly for search engines? For humans, the 410 is definitely a last resort; you'd really like them to land somewhere. If the page still exists in query-free form, just chop off the query string and let people land on the "base" page.

But for search engines, if it's a choice between returning a 410 and redirecting a bunch of different urls to the same place, go with the, er, Gone. In mod_rewrite they are set up exactly like [F]:

RewriteRule {any old stuff here} - [G]


Since you're applying it to everything that has a particular query string, you don't need to have anything in particular in the rule. Maybe \.php$ to be tidy about it.

In the %{QUERY_STRING} part, if you want to exclude everything that contains the com_dtregister element, and you're not capturing the rest, you don't need the
^([^&]*&)*option=

part. (Hm. Seems to be a rash of unwanted smileys lately. One way to keep them out is to wrap everything in "code" tags ;)) Just write out the part of the query string that you do want to look for.
9:09 am on July 21, 2011 (gmt 0)

New User

5+ Year Member

joined:July 31, 2008
posts: 14
votes: 0


No one needs these pages. com_dtregister didn't work. We moved on to another program. I want google to stop carrying around these old links. They're diluting our keyword density, too.

I'm lost on the syntax. Would it be possible to provide a real-world syntax-correct example?

thank you
4:26 pm on July 21, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Is that a Joomla site?

Make sure you have updated to the very latest "htaccess.txt" file from the newest Joomla install set.

The new file is compatible with all older versions of Joomla but contains important code changes.
4:38 pm on July 21, 2011 (gmt 0)

New User

5+ Year Member

joined:July 31, 2008
posts: 14
votes: 0


joomla 1.5.23
Just can't get the syntax.
7:02 pm on July 21, 2011 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:Apr 9, 2011
posts:12717
votes: 244


com_dtregister didn't work

Did you leave off the anchors? The two elements
^(.*)stuffhere(.*)$

and
stuffhere

are functionally the same in most situations, but the short version generally runs faster and cleaner.
7:53 pm on July 21, 2011 (gmt 0)

New User

5+ Year Member

joined:July 31, 2008
posts: 14
votes: 0


I'm sorry, but without the full syntax as you'd put it in an .htaccess file, I don't know how to follow what you've posted. Syntax is a bear.
10:46 pm on July 21, 2011 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:Apr 9, 2011
posts:12717
votes: 244


Syntax is a bear.

And there you have htaccess in a nutshell. Full version: You said "com_dtregister didn't work". Did your complete line say

RewriteCond %{QUERY_STRING} ^com_dtregister 


or did it say

RewriteCond %{QUERY_STRING} com_dtregister


The first version would only work for query strings that begin with "com_dtregister"-- in other words, never. The second version will work for anything that contains "com_dtregister".

If you are using [G] you don't rewrite at all, you just put in a - (hyphen) [httpd.apache.org] meaning "don't change the original":

RewriteRule .+ - [G]
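Put together, a minimal sketch of how the pair might sit in .htaccess. It assumes mod_rewrite is enabled and that the old URLs are all requests for index.php, as in the examples posted earlier; restricting the pattern to index.php is my choice here, and .+ would do as well.

```apache
RewriteEngine On

# Any request for index.php whose query string contains "com_dtregister"
# is answered with 410 Gone. The hyphen means "no substitution";
# [G] implies [L], so no later rules run for these requests.
RewriteCond %{QUERY_STRING} com_dtregister
RewriteRule ^index\.php$ - [G]
```

If matching the bare string anywhere feels too loose, a tighter condition such as (^|&)option=com_dtregister(&|$) would match only that exact parameter. On a site with other rewrite rules, this block probably needs to come before them so that the other rules do not handle the request first.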


Did someone else just ask this identical question? What's the apache page doing so near the top of my browser history?
11:36 pm on July 21, 2011 (gmt 0)

New User

5+ Year Member

joined:July 31, 2008
posts: 14
votes: 0


I had:
RewriteCond %{QUERY_STRING} ^([^&]*&)*com_dtregister=([^&]+)
RewriteRule ^index.php$ [domain.com...] [R=301,L]

Then to make it a [G] I changed it to:

RewriteCond %{QUERY_STRING} ^([^&]*&)*com_dtregister=([^&]+)
RewriteRule - [G]

So I believe you're saying I could do this:
(and yes anything with com_dtregister is to be wiped out)

RewriteCond %{QUERY_STRING} com_dtregister
RewriteRule .+ - [G]


Which is better to clean cruft out of Google Cache? 301 or 410?

I was worried the 301 was leaving the links but creating a massive duplicate content at the target (home page). So if I gave Google a 410 eventually it would let go of the idea that the links are ever going to be meaningful.

Does this sound reasonable?
12:20 am on July 22, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Both 301 and 410 remove the URL from the index.

301 seamlessly takes the visitor someplace else so you retain the traffic. 410 shuts the visitor out and encourages a bounce unless you put enticing links on your 410 error page.
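If you do go with 410, the error page can be pointed at a custom document. A one-line sketch follows; /gone.html is a made-up file name, so substitute whatever page carries your links.

```apache
# Serve a custom page (hypothetical path) while still returning 410 Gone
ErrorDocument 410 /gone.html
```

Keep the target a local path. If a full http:// URL is given instead, Apache sends a redirect to it and the client no longer sees the 410 status.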
12:21 am on July 22, 2011 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:Apr 9, 2011
posts:12717
votes: 244


Which is better to clean cruft out of Google Cache? 301 or 410?

If I knew the answer to that, I would be rich :)

I was worried the 301 was leaving the links but creating a massive duplicate content at the target (home page). So if I gave Google a 410 eventually it would let go of the idea that the links are ever going to be meaningful.

Does this sound reasonable?

That's assuming for the sake of discussion that "sounds reasonable" and "what google would do" are the same thing. A 410 is better than a 404, and a 301 is better than a 302. No matter what you do, google will periodically visit the old links just to see if maybe you decide in 2023 to reactivate them. But in the long term, 410 is best.

Edit after seeing intervening post: I think you said earlier that this is all about search engines, not human visitors. It's always easier when you only have to think about one or the other :)
7:51 am on July 22, 2011 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:Apr 9, 2011
posts:12717
votes: 244


Oops, time ran out for editing. Turns out there may be an entirely different alternative solution if your main concern is g###. That was a rhetorical statement. If you can bring yourself to sign up for Google Webmaster Tools, there's a section on "URL parameters" under "site configuration". (Found it while looking for something else, naturally.)
"Only use this feature if you feel confident about how parameters work for your site. Telling Googlebot to exclude URLs with certain parameters could result in large numbers of your pages disappearing from our index."

So if you want "large numbers of your pages disappearing from our index" this would seem to be just the ticket :)
8:39 am on July 22, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


While that solution works, it is only good for Google. Configuring the site correctly is better. The fix will then work for all visitors and bots.
11:51 am on July 22, 2011 (gmt 0)

New User

5+ Year Member

joined:July 31, 2008
posts: 14
votes: 0


Great discussion. Thanks again.