I use phpws on our website with the Superhack to output friendly URLs, and it works great. The trouble is that, as we all know, SE spiders will not accept cookies. So a lot of our pages are indexed with the session ID attached to the URL in a query string, which makes for a lot of duplicate pages.
I tried to fix this with the .htaccess file but kept getting 500 Internal Server Errors. Later I found out that I needed to use a php.ini file instead, so I fixed the problem by creating a php.ini file and inserting
php_flag session.use_trans_sid off; in it. Now that I have that fixed, I can work on removing the duplicate pages indexed by the SEs, and this is where I'm having trouble. The links that I need to fix are all in the form of: /calendar-event34.html?224f02268dbe6c05c35f51cc823cb7fd=b50a2a7865642036b2ed32085988a976
I've been doing a lot of research trying to fix this and found some code that I thought would fix my problem. I modified it to just remove the a-z0-9=a-z0-9 part from the URL and figured this was the answer to my problems. But...
DirectoryIndex index.php
Options +FollowSymLinks
RewriteEngine On
#removing session IDs for spiders
RewriteCond %{HTTP_USER_AGENT} "Google" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Slurp" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "MSNBOT" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "teoma" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "ia_archiver" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Scooter" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Mercator" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "FAST" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "MantraAgent" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Lycos" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "ZyBorg" [NC]
RewriteCond %{QUERY_STRING} ([a-z0-9]+=[a-z0-9]+)
RewriteRule ^(.*)$ $1? [L,R=301]
The problem I'm having is that instead of writing:
[mywebsite.com...]
I'm getting:
[mywebsite.com...]
WOW! Not exactly what I expected. I'm stumped. Why would this code do this?
I'm not a webmaster, and I've been doing a lot of reading, but I'm having a hard time getting my head wrapped around this stuff. I'm not sure if this will help or not, but here is the .htaccess including the Superhack:
DirectoryIndex index.php
Options +FollowSymLinks
RewriteEngine On
#removing session IDs for spiders
RewriteCond %{HTTP_USER_AGENT} "Google" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Slurp" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "MSNBOT" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "teoma" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "ia_archiver" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Scooter" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Mercator" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "FAST" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "MantraAgent" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Lycos" [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "ZyBorg" [NC]
RewriteCond %{QUERY_STRING} ([a-z0-9]+=[a-z0-9]+)
RewriteRule ^(.*)$ $1? [L,R=301]
#rewrite rules for phpws friendly URLs
#Standard URL (Must have a '~' in it)
RewriteRule ^([a-zA-Z0-9]*~.*)$ index.php?mod_rewrite=$1&%{QUERY_STRING} [NE]
#Module-specific URLs
RewriteRule ^article([1-9][0-9]*).html$ index.php?module=article&view=$1&%{QUERY_STRING}
RewriteRule ^news.html$ index.php?module=article&view=news&%{QUERY_STRING}
RewriteRule ^articlemenu.html$ index.php?module=article&disp=menu&%{QUERY_STRING}
RewriteRule ^announcement([1-9][0-9]*).html$ index.php?module=announce&ANN_user_op=view&ANN_id=$1&%{QUERY_STRING}
RewriteRule ^page([1-9][0-9]*).html$ index.php?module=pagemaster&PAGE_user_op=view_page&PAGE_id=$1&%{QUERY_STRING}
RewriteRule ^photoalbum.html$ index.php?module=photoalbum&PHPWS_AlbumManager_op=list&%{QUERY_STRING}
RewriteRule ^photoalbum([1-9][0-9]*).html$ index.php?module=photoalbum&PHPWS_AlbumManager_op=view&PHPWS_MAN_ITEMS[]=$1&%{QUERY_STRING}
RewriteRule ^calendar-event([1-9][0-9]*).html$ index.php?module=calendar&calendar[view]=event&id=$1&%{QUERY_STRING}
RewriteRule ^bbforum([1-9][0-9]*).html$ index.php?module=phpwsbb&PHPWSBB_MAN_OP=viewforum&PHPWS_MAN_ITEMS[]=$1&%{QUERY_STRING}
RewriteRule ^bbthread([1-9][0-9]*).html$ index.php?module=phpwsbb&PHPWSBB_MAN_OP=view&PHPWS_MAN_ITEMS[]=$1&%{QUERY_STRING}
RewriteRule ^(.*)$ http://www.example.com/$1? [R=301,L]
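For context, here is how a rule with a full URL in the substitution might sit in the spider block from the original post. This is only a sketch: it assumes the .htaccess file lives in the document root, www.example.com stands in for the real hostname, and the user-agent list is cut down to two entries. In per-directory (.htaccess) context, mod_rewrite puts the directory prefix back onto a relative substitution, and by default that prefix is the filesystem path, so with [R] and no RewriteBase the redirect can end up carrying a server path. Giving the full URL in the substitution sidesteps that.

```apache
RewriteEngine On

# Abbreviated user-agent list (the original post has eleven entries).
# [OR] joins these two lines; the result is then ANDed with the next condition.
RewriteCond %{HTTP_USER_AGENT} Google [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Slurp [NC]
# Only act when a query string of the form name=value is present
RewriteCond %{QUERY_STRING} [a-z0-9]+=[a-z0-9]+
# Full URL in the substitution; the trailing "?" drops the query string
RewriteRule ^(.*)$ http://www.example.com/$1? [R=301,L]
```

Another approach that may work is to keep the relative substitution and add `RewriteBase /` ahead of the rules, which tells mod_rewrite which URL prefix to put back in place of the filesystem path. Either way, this block should stay above the phpws friendly-URL rules so that the external redirect is issued before any internal rewrite to index.php.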
For more information, see the documents cited in our forum charter [webmasterworld.com] and the tutorials in the Apache forum section of the WebmasterWorld library [webmasterworld.com].
Jim
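One further refinement, offered as a sketch rather than a tested fix: the condition `([a-z0-9]+=[a-z0-9]+)` in the original post matches almost any query string, including legitimate parameters. The example link in that post has a 32-character hexadecimal name and a 32-character hexadecimal value, so, assuming every session ID on the site follows that shape, the condition could be anchored to it. The hostname www.example.com is a placeholder.

```apache
# Assumption: the session query string is always <32 hex chars>=<32 hex chars>
# and nothing else. The user-agent conditions from the original post would
# go directly above this line.
RewriteCond %{QUERY_STRING} ^[0-9a-f]{32}=[0-9a-f]{32}$
RewriteRule ^(.*)$ http://www.example.com/$1? [R=301,L]
```

With the anchors in place, a request such as /calendar-event34.html?224f02268dbe6c05c35f51cc823cb7fd=b50a2a7865642036b2ed32085988a976 would match, while a URL carrying ordinary parameters would be left alone.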