(1)
www.domain.com/456 (number only, no trailing slash)
(2)
www.domain.com/string-string/
www.domain.com/string/string-string/
(In type (2) the string may or may not be hyphenated, but there is always a trailing slash.)
I have noticed that Yahoo Slurp requests every page without the trailing slash (never with it), and Yahoo has indexed all pages, including the string-based URLs, without the trailing slash, e.g. www.domain.com/string
I am wondering if this means that Yahoo does not fully recognise the internal linkages and is indexing what it sees as a botched site. Be that as it may, I am attempting to head this off with a mod_rewrite so that the trailing slash is always added to the string-based URLs. Incidentally the slashless URLs always display correctly in the browser but they have no Google PR. Only the correctly slashed URLs have Google PR.
My current .htaccess file is:
php_flag register_globals off
Options +FollowSymLinks
RewriteEngine On
RewriteCond %{HTTP_HOST} ^domain\.com [NC]
RewriteRule ^(.*)$ [domain.com...] [R=301,L]
RewriteRule ^(contact-domain)/$ /$1.php [L]
ErrorDocument 404 /index.php?error=404
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)$ /index.php/$1
</IfModule>
# END WordPress
I have tried altering the WordPress lines to:
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([0-9])$ /index.php/$1
RewriteRule ^([a-zA-Z-]+)$ /index.php/$1/
</IfModule>
# END WordPress
but this doesn't force the trailing slash onto the end of the string-based URLs as I would expect it to. In fact nothing changes at all. I would appreciate a pointer as to where I'm going wrong.
What are you trying to accomplish here?
1) Cause the incorrect URL to return correct content?
2) Cause Yahoo to correct its incorrect URLs?
The form of your RewriteRule indicates the former, rather than the latter. It is the correct syntax for an *internal rewrite* as opposed to an *external redirect*.
If you want Yahoo to correct the URLs that it has in its database, then you need to specify the redirect syntax:
RewriteRule ^([a-z\-]+)$ http://www.example.com/index.php/$1/ [NC,R=301,L]
I changed your pattern and made the comparison case-insensitive (the [NC] flag) in the interest of efficiency.
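To show how the two kinds of rule might sit together, here is a rough, untested sketch. It assumes the redirect should point at the slashed public URL (/$1/) rather than at the /index.php/ path, and that it is placed ahead of the WordPress rewrite. www.example.com stands in for the real host name. Note that a RewriteCond applies only to the single RewriteRule that follows it, so the conditions are repeated for each rule.

```apache
RewriteEngine On
RewriteBase /

# External redirect: add the missing trailing slash to string-based URLs.
# These two conditions apply only to the RewriteRule directly below them.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-z-]+)$ http://www.example.com/$1/ [NC,R=301,L]

# Internal rewrite: pass anything that is not a real file or directory
# to WordPress. The conditions are repeated because they do not carry
# over from the rule above.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)$ /index.php/$1 [L]
```

The redirect pattern only matches single-segment URLs such as /string. Deeper paths such as /string/string-string would need a broader pattern, and the number-only URLs are deliberately left alone.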
Jim
What are you trying to accomplish here?
Good question.
I did some further digging, and as far as I can make out, Yahoo drops the trailing slash as a matter of policy. Whether this is really the case or not, I don't know, but I have seen it on a number of sites. From what I have seen, I am unlikely to get Yahoo to correct its indexed URLs, because it doesn't index the trailing slash, full stop.
The upshot is that I have changed all the internal links to drop the trailing slash (and notified the few external sites that link in), and set up a 301 redirect from the trailing-slashed URLs to the new ones. I suppose this might disrupt the Google and MSN traffic for a while, but I can live with that because, on reflection, I'm inclined to agree with Yahoo that a slashless URL is not only more elegant but also easier to give to people verbally.
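A redirect of that kind could look something like the sketch below. It is not the exact rule in use on the site, just an untested illustration. It assumes the canonical host is www.domain.com and that real directories should keep their slash, so that Apache does not add it back and cause a redirect loop.

```apache
RewriteEngine On
RewriteBase /

# Redirect /string/ to /string with a 301, but skip real directories.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ http://www.domain.com/$1 [R=301,L]
```

This would need to sit above the WordPress block so the redirect is issued before the internal rewrite to index.php takes place.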
Thanks for the reply, and the tip about RewriteCond. I didn't know that.