Thought we had a good htaccess file, but

then I saw this entry show up in the SERPs

     
12:29 pm on Jan 19, 2008 (gmt 0)

Junior Member

10+ Year Member

joined:Sept 24, 2007
posts: 51
votes: 0


I thought our htaccess file was written correctly, but then

[mysite.com...] showed up in the SERPs.

Any help? Thanks

current htaccess:

Options -Multiviews

RewriteEngine on
#
# Externally redirect direct client requests for "/index.htm" to "/" in
# canonical domain (This applies to /index pages in any directory)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.htm\ HTTP/
RewriteRule ^(([^/]+/)*)index\.htm$ [mysite.com...] [R=301,L]
#
# Externally redirect to remove multiple contiguous slashes at beginning or end of URL
RewriteCond %{REQUEST_URI} ^//+(.*)$ [OR]
RewriteCond %{REQUEST_URI} ^(.*/)/+$
RewriteRule / [mysite.com...] [R=301,L]
#
# Externally redirect to remove multiple contiguous slashes embedded in URL
RewriteCond %{REQUEST_URI} ^/([^/]+)//+(.*)$
RewriteRule // [mysite.com...] [R=301,L]
#
# Externally redirect non-canonical domain requests to canonical domain.
RewriteCond %{HTTP_HOST} ^mysite\.com [NC]
RewriteRule (.*) [mysite...] [R=301,L]

AddHandler server-parsed .htm

5:08 pm on Jan 19, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2047
votes: 1


I hope there's a fix for these question-mark oddities, too. Mine are usually hit only by Googlebot and they look like:

http://example.com/dir/filename.html?ref=somesite.com

Googlebot gets a 200 because the file exists and the variation isn't 404'd.

Thing is, the "?ref=" includes sites with objectionable names (read: nasty words) or from countries with reputations for site-whacking (.info, .ru, etc.). I want them 404'd forever.

Alas, this (probably wrong:) code doesn't do the trick:


RewriteCond %{REQUEST_URI} ^(.*)ref=(.*) [NC,OR]
RewriteCond %{REQUEST_URI} ^(.*)badwordhere(.*) [NC,OR]
RewriteCond %{REQUEST_URI} ^(.*)badsitehere(.*) [NC]
RewriteCond %{REQUEST_URI} !^/error\.html$
RewriteRule .* - [F]

6:06 pm on Jan 19, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


Query strings are not part of REQUEST_URI. In mod_rewrite that variable holds only the URL-path; the query string is data to be passed to the resource at that path (normally a script) and is kept separately. To be clear, a script has a "location" on the web, but the data passed to it does not.

As a simple analogy, we put our letter inside an addressed envelope, but we do not consider the contents of our letter to be part of the address that we write on the envelope. In this analogy, the URL is the "address" and the query string is the "message."

Query strings are handled separately in mod_rewrite, requiring the use of a RewriteCond. The RewriteCond can test either QUERY_STRING or THE_REQUEST, although in this case we must use THE_REQUEST to catch the case where the query string is blank, but a "?" is appended to the URL.
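To make the distinction concrete, here is a rough sketch of the blocking rules posted earlier in the thread, changed to test QUERY_STRING instead of REQUEST_URI. The "badwordhere" and "badsitehere" strings and the /error.html exclusion are the placeholders from that earlier post, and the HTTP/1.1 in the comment is only an assumed example of what a client might send. Treat it as an illustration, not a tested drop-in.

```apache
# For a request of http://example.com/dir/filename.html?ref=somesite.com
# mod_rewrite sees roughly:
#   THE_REQUEST  = GET /dir/filename.html?ref=somesite.com HTTP/1.1
#   REQUEST_URI  = /dir/filename.html
#   QUERY_STRING = ref=somesite.com
#
# Sketch: test the query string, not the URL-path.
RewriteCond %{QUERY_STRING} ref= [NC,OR]
RewriteCond %{QUERY_STRING} badwordhere [NC,OR]
RewriteCond %{QUERY_STRING} badsitehere [NC]
# Leave the custom error page itself alone (path taken from the earlier post).
RewriteCond %{REQUEST_URI} !^/error\.html$
RewriteRule .* - [F]
```

Note that [F] sends a 403-Forbidden, not a 404. Also, a request ending in a bare "?" has an empty QUERY_STRING, so these conditions never see it; that case is why THE_REQUEST is needed.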

Here is a general fix, but be warned, it does not include loop prevention, and it redirects to remove query strings from *all* requests. It cannot be used without modification on sites using scripts to generate pages, nor can it be used as-is on sites which use custom error pages. It also implements a 301 redirect and not a 403-Forbidden response, because it is far more important to keep spurious query strings out of search engine indexes than it is to return a 403 to log spammers (they pay absolutely no attention to your server's response, because all they want is to get listed in your log file).


RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /[^?]+\?
RewriteRule (.*) http://www.example.com/$1? [R=301,L]

For sites which mix dynamic and static pages, the following can be used to remove query strings from everything except direct client requests for .php files:

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^?]*)\?
RewriteRule !\.php$ http://www.example.com/%1? [R=301,L]
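
As one possible tweak, and purely as a sketch, the redirect can be narrowed so that it strips only query strings carrying a ref= parameter and leaves all other query strings alone. It assumes the rules live in the .htaccess file at the site root and that www.example.com stands in for the canonical hostname; the (^|&)ref= pattern is an assumption about how the unwanted parameter appears in the request.

```apache
# Sketch: remove the query string only when it contains a "ref=" parameter.
# (^|&) anchors "ref=" to the start of the query string or to a parameter boundary.
RewriteCond %{QUERY_STRING} (^|&)ref= [NC]
# The trailing "?" on the target discards the original query string.
RewriteRule (.*) http://www.example.com/$1? [R=301,L]
```

Because the redirected request no longer has a query string, the condition does not match a second time, so this particular rule should not loop on its own.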

Tweak to suit! :)

Jim

6:35 pm on Jan 19, 2008 (gmt 0)

Junior Member

10+ Year Member

joined:Sept 24, 2007
posts: 51
votes: 0


Oh boy,
Thanks, but I just know that if I start tweaking
what I posted I am going to mess something else up, since
I am not sure what I am doing in the first place. I had
just copied my htaccess file from somewhere else.

7:06 pm on Jan 19, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


Ah, well... This is the place to *learn* how to configure Apache. It's not terribly difficult, but neither is it trivial. I can't recommend cutting and pasting code from *any forum* without understanding exactly what it does to your server configuration -- and more to the point, what using it might do to your search engine rankings, traffic, income, etc.

If these are not important to you, then you can avoid having to research [webmasterworld.com] and modify before using... ;)

Jim

 
