Htaccess stopping G from indexing

         

Swordfish

5:18 am on Mar 17, 2006 (gmt 0)

10+ Year Member



I have added tons of new content over the past 4 weeks and can't seem to get Google to notice a single page, although my pages were indexed quickly before I added a few more lines to my .htaccess.

It looks like this:

Options +FollowSymlinks -Indexes
RewriteEngine On
RewriteRule ^(conventional-reels|combos|lights|rod-holders|casting-reels|gps|chartplotters|vhf-radio|trolling-motors|hooks|cast-nets|line-and-leader|electric-reels|lures|rods|terminal-tackle|tools|spinning-reels|storage|rods|fishfinders|accessories|fly-reels|sunglasses|watches|binoculars|kites|downriggers|radar|fighting-belts|gaffs|harpoons)/$ /fishing-tackle.php?cid=$1 [nc,L]
RewriteRule ^(conventional-reels|combos|lights|rod-holders|casting-reels|gps|chartplotters|vhf-radio|trolling-motors|hooks|cast-nets|line-and-leader|electric-reels|lures|rods|terminal-tackle|tools|spinning-reels|storage|rods|fishfinders|accessories|fly-reels|sunglasses|watches|binoculars|kites|downriggers|radar|fighting-belts|gaffs|harpoons)/(.*)\.html$ /fishing-tackle.php?cid=$1&pid=$2 [nc,L]
RewriteRule ^(.*)\-boats\/(.*)\.html$ /boats.php?mlink=$1&plink=$2 [nc,L]
RewriteRule ^(.*)\-boats\/$ /boats.php?mlink=$1 [nc,L]
RewriteRule ^(.*)\-NGK\-Spark\-Plugs\-details\-([0-9]+)\.html$ /ngk-spark-plugs.php?pid=$1&plugid=$2 [nc]
RewriteRule ^(.*)\-NGK\-Spark\-Plugs\-list\.html$ /ngk-spark-plugs.php?lid=$1 [nc]
RewriteRule ^(.*)\-NGK\-Spark\-Plugs\.html$ /ngk-spark-plugs.php?cid=$1 [nc]
RewriteRule ^(.*)\.html$ /sitemap.php?aid=$1 [nc]
RewriteRule ^rss\-(.*)\.php$ /rss.php?cat=$1 [nc]

Can you see any issues?

[edited by: jdMorgan at 8:03 pm (utc) on Mar. 17, 2006]
[edit reason] Fixed side-scroll. [/edit]

lammert

3:02 pm on Mar 17, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



In such a case I would look in the Apache access_log file. Is Googlebot visiting, and which return code are you feeding it? Even if your .htaccess is blocking Googlebot, you should still see Googlebot's crawl attempts in the log file.

Swordfish

3:26 pm on Mar 17, 2006 (gmt 0)

10+ Year Member



I'm showing that Googlebot is still coming by every day and indexing my home page and everything published to my root. However, anything in subdirectories is not getting indexed.

I also have an issue where a request for an .html page that doesn't exist is automatically redirected to the home page (index.php) instead of returning a 404 Not Found.

jdMorgan

8:08 pm on Mar 17, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This rule will rewrite *any* .html page request to the sitemap script, whether the page exists or not:

RewriteRule ^(.*)\.html$ /sitemap.php?aid=$1 [nc]

If this is not the problem, then we'll need to know if you have any other directives in your .htaccess file.

Also note that you'll get much better performance out of these rules if you use a forward-looking negative match such as "[^.]+" (one or more characters that are not a period), rather than the ambiguous ".*" pattern. In addition, always use the [L] flag unless you have a known reason not to. For example:


RewriteRule ^([^.]+)\.html$ /sitemap.php?aid=$1 [NC,L]
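
The same idea could be applied to the boats rules from the original post. This is only a sketch, and it assumes that the mlink and plink values never contain a slash or a period, which the thread does not confirm:

```apache
# Sketch only: assumes mlink never contains "/" and plink never contains "/" or "."
# [^/]+ stops at the first directory separator instead of scanning the whole URL like .*
RewriteRule ^([^/]+)-boats/([^/.]+)\.html$ /boats.php?mlink=$1&plink=$2 [NC,L]
RewriteRule ^([^/]+)-boats/$ /boats.php?mlink=$1 [NC,L]
```

The hyphens and slashes are left unescaped here because they have no special meaning outside a character class, so the backslashes in the original patterns are harmless but unnecessary.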

Jim

Swordfish

10:56 pm on Mar 17, 2006 (gmt 0)

10+ Year Member



Right, this is the line that is doing the redirecting:

RewriteRule ^([^.]+)\.html$ /sitemap.php?aid=$1 [NC,L]

What changes would I make to this so that I can publish .html pages manually?

Could this screw with G's indexing?

Can't figure out why G is not indexing the new pages...

My other sites are seeing good indexing...

jdMorgan

1:19 am on Mar 18, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Nothing you do with rewrites is going to stop Google from trying to fetch (index) the files. So that is not likely related to the misbehaving rule problem unless you've had the problem for several months.

You can put a RewriteCond ahead of that rule to exclude files that exist from being rewritten to the script, using something like:


RewriteCond %{REQUEST_FILENAME} !-f
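
Paired with the catch-all rule from earlier in this thread, that might look like the sketch below. A RewriteCond applies only to the single RewriteRule that immediately follows it, so the condition has to sit directly above the rule it is meant to guard:

```apache
# Sketch: leave .html files that actually exist on disk alone,
# and send every other .html request to the sitemap script.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^.]+)\.html$ /sitemap.php?aid=$1 [NC,L]
```

Note that requests for .html pages that do not exist will still reach sitemap.php, so the script itself would need to return a 404 status for aid values it does not recognize.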

Jim