
Apache Web Server Forum

    
301 redirect all pages to homepage
unclej



 
Msg#: 4662951 posted 7:21 pm on Apr 14, 2014 (gmt 0)

Hello
I want to 301 redirect index.html to homepage. Also, I want to 301 redirect all other pages to homepage.

I have the following code in my .htaccess file.

RewriteEngine On
# Redirect non-www to www:
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]

RewriteCond %{THE_REQUEST} ^.*/index\.html
RewriteRule ^(.*)index.html$ /$1 [R=301,L]

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule . / [L,R=301]

SetEnvIfNoCase User-Agent .*rogerbot.* bad_bot
SetEnvIfNoCase User-Agent .*exabot.* bad_bot
SetEnvIfNoCase User-Agent .*mj12bot.* bad_bot
SetEnvIfNoCase User-Agent .*dotbot.* bad_bot
SetEnvIfNoCase User-Agent .*gigabot.* bad_bot
SetEnvIfNoCase User-Agent .*ahrefsbot.* bad_bot
SetEnvIfNoCase User-Agent .*sitebot.* bad_bot
SetEnvIfNoCase User-Agent .*BLEXbot.* bad_bot
SetEnvIfNoCase User-Agent .*Blekkobot.* bad_bot
SetEnvIfNoCase User-Agent .*SEOkicks-Robot.* bad_bot
SetEnvIfNoCase User-Agent .*BotALot.* bad_bot
SetEnvIfNoCase User-Agent .*Alexibot.* bad_bot
SetEnvIfNoCase User-Agent .*BecomeBot.* bad_bot
SetEnvIfNoCase User-Agent .*BunnySlippers.* bad_bot
SetEnvIfNoCase User-Agent .*CheeseBot.* bad_bot
SetEnvIfNoCase User-Agent .*Foobot.* bad_bot
SetEnvIfNoCase User-Agent .*exabot.* bad_bot
SetEnvIfNoCase User-Agent .*grub.* bad_bot
SetEnvIfNoCase User-Agent .*grub-client.* bad_bot
SetEnvIfNoCase User-Agent .*hloader.* bad_bot
SetEnvIfNoCase User-Agent .*httplib.* bad_bot
SetEnvIfNoCase User-Agent .*humanlinks.* bad_bot
SetEnvIfNoCase User-Agent .*InfoNaviRobot.* bad_bot
SetEnvIfNoCase User-Agent .*JennyBot.* bad_bot
SetEnvIfNoCase User-Agent .*Jetbot.* bad_bot
SetEnvIfNoCase User-Agent .*larbin.* bad_bot
SetEnvIfNoCase User-Agent .*LexiBot.* bad_bot
SetEnvIfNoCase User-Agent .*LinkextractorPro.* bad_bot
SetEnvIfNoCase User-Agent .*LinkWalker.* bad_bot
SetEnvIfNoCase User-Agent .*LNSpiderguy.* bad_bot
SetEnvIfNoCase User-Agent .*moget.* bad_bot
SetEnvIfNoCase User-Agent .*MSIECrawler.* bad_bot
SetEnvIfNoCase User-Agent .*naver.* bad_bot
SetEnvIfNoCase User-Agent .*NetAnts.* bad_bot
SetEnvIfNoCase User-Agent .*NetMechanic.* bad_bot
SetEnvIfNoCase User-Agent .*NICErsPRO.* bad_bot
SetEnvIfNoCase User-Agent .*Nutch.* bad_bot
SetEnvIfNoCase User-Agent .*Openbot.* bad_bot
SetEnvIfNoCase User-Agent .*Openfind.* bad_bot
SetEnvIfNoCase User-Agent .*psbot.* bad_bot
SetEnvIfNoCase User-Agent .*ProWebWalker.* bad_bot
SetEnvIfNoCase User-Agent .*RepoMonkey.* bad_bot
SetEnvIfNoCase User-Agent .*scooter.* bad_bot
SetEnvIfNoCase User-Agent .*Stanford.* bad_bot
SetEnvIfNoCase User-Agent .*SpankBot.* bad_bot
SetEnvIfNoCase User-Agent .*SiteSnagger.* bad_bot
SetEnvIfNoCase User-Agent .*suzuran.* bad_bot
SetEnvIfNoCase User-Agent .*Teleport.* bad_bot
SetEnvIfNoCase User-Agent .*WebBandit.* bad_bot
SetEnvIfNoCase User-Agent .*WebCopier.* bad_bot
SetEnvIfNoCase User-Agent .*Xenu.* bad_bot
SetEnvIfNoCase User-Agent .*Zeus.* bad_bot
<Limit GET POST HEAD>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>


I have 2 questions.
1) Is it necessary to redirect index.html, or is that redundant because all pages are being redirected anyway? Should I get rid of the 2 lines regarding index.html?
2) Is the code all good? Do you see anything wrong with it? I found these snippets on different websites and combined them, but I don't really know what I am doing.

 

not2easy




 
Msg#: 4662951 posted 7:41 pm on Apr 14, 2014 (gmt 0)

Copying old solutions is not a good way to deal with issues. Many of the bots you list have not been seen in a very long time, and checking for them wastes your server's time. Look at the User-Agents of the named robots that actually do visit your site and you will have a much smaller list. UAs can also be combined into fewer rules to be more efficient.
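As a rough sketch of what "combined" could look like, several names can share one rule through regex alternation. The four names below are only an assumed shortlist taken from the original post, not a recommendation; the real list should come from your own logs. The access block is the same 2.2-style one from the question.

```apache
# Assumed shortlist: replace with the robots that actually appear in your logs.
# One unanchored, case-insensitive pattern instead of one line per robot.
SetEnvIfNoCase User-Agent (ahrefsbot|blexbot|dotbot|mj12bot) bad_bot

<Limit GET POST HEAD>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>
```

One pattern means one regex test per request instead of several dozen.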

You don't mention why you don't want visitors to access any page of your site except one. Mass redirecting everything to a single destination is usually not beneficial.

There are other issues: the syntax of both the redirects and the bot blocking needs work. Overall you will probably want to start over, one part at a time, so that you understand your rules better and can maintain the file as it evolves.

lucy24




 
Msg#: 4662951 posted 9:06 pm on Apr 14, 2014 (gmt 0)

unclej wrote: "I want to 301 redirect index.html to homepage. Also, I want to 301 redirect all other pages to homepage."

Everyone in unison: Noooooo!

index.html to homepage, yes, definitely. Also any other index.xtn to whatever directory it belongs to. The two basic forms are

RewriteRule ^([^.]+)index\.html http://www.example.com/$1 [R=301,L,NS]

or

RewriteCond %{THE_REQUEST} index\.html
RewriteRule ^([^.]+)index\.html http://www.example.com/$1 [R=301,L]


If the [NS] version works, you don't need the with-condition version. If you have literal periods in your filepaths (like apache dot org with all those /2.2/ and /2.4/ directories) the rule needs to be a little more complicated; I've given the simplest version.
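For illustration, here is one possible shape of the "a little more complicated" version, offered as a sketch rather than the definitive rule. It assumes the rule lives in the .htaccess file at the site root, and www.example.com is the same placeholder host as above. It captures whole directory segments instead of "anything without a period", so it tolerates periods in directory names and also matches a bare /index.html at the root, where the captured path is empty.

```apache
# Sketch only: redirect /index.html and /any/dir/index.html to the directory URL.
# The condition looks at the original request line (e.g. "GET /dir/index.html HTTP/1.1"),
# so internal mod_dir subrequests for index.html do not trigger the redirect.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/([^/?\s]+/)*index\.html[?\s]
# $1 is the directory path with its trailing slash, or empty for the root.
RewriteRule ^(([^/]+/)*)index\.html$ http://www.example.com/$1 [R=301,L]
```

With this sketch, a request for /2.4/docs/index.html should redirect to http://www.example.com/2.4/docs/ and a request for /index.html to http://www.example.com/.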

None of the .* in the UA list are necessary, since there are no anchors.

Any time you've got a list longer than 3 or 4 lines, put them in alphabetical order. Or numerical order or whatever is appropriate.

You can shave a lot of bytes by replacing "SetEnvIfNoCase User-Agent" with "BrowserMatch". Use the NoCase form ("BrowserMatchNoCase") only if it's appropriate for the specific entity you're blocking. Textbook example: "Googlebot" is the real thing. "GoogleBot" is a spoofer. Most robots stick with a particular casing.
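Putting those three points together, a trimmed list might look something like the sketch below. The names come from the original post, but the exact casing shown for each robot is an assumption that should be checked against the User-Agent strings in your own logs before relying on case-sensitive matching.

```apache
# Alphabetical order, no leading or trailing .* (the patterns are unanchored).
# BrowserMatch is case-sensitive; the casings below are assumed, verify them first.
BrowserMatch AhrefsBot bad_bot
BrowserMatch BLEXBot bad_bot
BrowserMatch DotBot bad_bot
BrowserMatch MJ12bot bad_bot
# NoCase form only where the casing is known to vary (hypothetical choice here).
# Being unanchored, "grub" also covers "grub-client", so one line replaces two.
BrowserMatchNoCase grub bad_bot
```

The existing "Deny from env=bad_bot" block works unchanged with these lines, since they set the same bad_bot variable.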
