Redirect /forum//feed to /forum/feed

     
5:48 pm on Jan 22, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 17, 2002
posts:1189
votes: 6


For some reason Google has started crawling /forum//feed and it is generating hundreds of 404s.

I don't know why this is happening - if it is the forum software, or some other recent change. But until I work it out I need to redirect these to the correct location to stop the 404s filling up in GWT.

I only need this to run on the /forum/ folder and not anywhere else so would this work?


<Directory /forum/>
RewriteCond %{REQUEST_URI} ^(.*)//(.*)$
RewriteRule . %1/%2 [R=301,L]
</Directory>
9:25 pm on Jan 22, 2016 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15934
votes: 889


By the usual yawn-provoking coincidence, only yesterday I discovered that the reason the googlebot periodically asks for /directory//blahblah/ is that-- wait for it-- in one place, on one page, I had a coding error that led to a link to /directory// instead of the intended /directory/subdir/ (that is, "subdir" was undefined). I only discovered it when several consecutive human visitors requested the same page, all from the same rarely-visited referring page. Ahem. Cough-cough. I'd been redirecting for a good year before I figured out why this was going on; in fact it's still a mystery why only Google was doing it. (Could it be that all other search engines are simply more intelligent, and know that multiple slashes have to be a mistake?)

You are right about one thing: The RewriteRule itself will not recognize // in the path. (No idea why. I just know from direct personal experience that it doesn't.) You have to put it in a RewriteCond. But the rule itself isn't optimal.

I would write the rule like this:
RewriteCond %{REQUEST_URI} ^/forum//+(.*)
RewriteRule . http://www.example.com/forum/%1 [R=301,L]
No point in capturing something that will always be the same. And always give the full protocol-plus-domain in the target to avoid the possibility of multiple redirects.
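One caveat for anyone trying the rule above on a current server, offered as a sketch rather than a tested fix: it only works while %{REQUEST_URI} still contains the doubled slash. Apache 2.4.39 and later merge consecutive slashes by default (the MergeSlashes directive), so the condition may never match there. A commonly used alternative is to test the raw request line in %{THE_REQUEST}, which is not decoded or normalized. The hostname is the same placeholder as above.

```apache
# Sketch only: match against the raw request line, for example
#   GET /forum//feed HTTP/1.1
RewriteEngine On

# %1 = whatever follows the run of slashes after /forum/,
# up to but not including any query string.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/forum//+([^?\s]*)

# NE: the captured text is still percent-encoded, so it should not be
# escaped a second time. The original query string is appended
# automatically because the target contains none.
RewriteRule ^ http://www.example.com/forum/%1 [R=301,L,NE]
```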

:: edit, edit, edit ::

I'm puzzled about the <Directory /forum/> part, though. This implies you're working in the config file-- but is the forum by itself in a top-level directory? Is it a subdomain? In most situations, you could achieve the same end more cleanly by expressing the pattern as
^forum/
et cetera, putting the rule in the same place as the RewriteRules for the rest of the site.
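Putting those two suggestions together, a minimal sketch might look like this. It assumes the rules sit directly in the site's <VirtualHost> block (server context), where the RewriteRule pattern sees the URL path with its leading slash, so the pattern becomes ^/forum/. In a .htaccess file at the document root it would be ^forum/ as written above. The port-80 virtual host and the ServerName are assumptions for illustration.

```apache
# Sketch only: assumes mod_rewrite is loaded and that these lines are in
# the site's <VirtualHost>, not in a <Directory> section or .htaccess.
<VirtualHost *:80>
    ServerName www.example.com

    RewriteEngine On

    # The doubled slash is tested in the condition, not in the rule pattern.
    # %1 is whatever follows the run of slashes after /forum/.
    RewriteCond %{REQUEST_URI} ^/forum//+(.*)

    # Server context: the pattern starts with a slash.
    RewriteRule ^/forum/ http://www.example.com/forum/%1 [R=301,L]
</VirtualHost>
```

Because the target has no query string of its own, the original one is passed through, so a request for /forum//viewforum.php?f=17 should be sent to /forum/viewforum.php?f=17.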
10:38 am on Jan 23, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 17, 2002
posts:1189
votes: 6


Yes, your solution looks neater.

The forum is just in a folder off the domain:

example.com/widget-info
example.com/buy-widgets
example.com/forum/
example.com/forum/topic=1
example.com/forum/feed/f=12
example.com/blog/
example.com/goodies/

etc.

Why the <Directory>? I just assumed that would be the most efficient way to run a setup like this. I have one main httpd.conf that has these rules in it, rather than separate .htaccess files in each folder. I have always thought this was better performance-wise because it's faster to skip over a non-matching <Directory> than to look for a .htaccess file in every folder and read it when one is found.

I am still learning at this game though. This must be my 17th / 18th year of webmastering and I have nowhere near mastered it.
11:19 am on Jan 23, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 17, 2002
posts:1189
votes: 6


Hmm. Neither example works.

The URL is not rewritten and the logfile says:

/GET /forum//viewforum.php?f=17 HTTP/1.1 200
/GET /forum//feed HTTP/1.1 404
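A 200 on the double-slash URL suggests the rule is never being evaluated at all, rather than being evaluated and failing to match. One way to check is to turn on mod_rewrite's own trace logging for a short while. This is a sketch that assumes Apache 2.4; the commented lines are the Apache 2.2 equivalent, and the log path in them is hypothetical.

```apache
# Apache 2.4: raise the log level for mod_rewrite only.
# The trace output goes to the normal error log.
# trace3 is an arbitrary middle setting; levels run from trace1 to trace8.
LogLevel alert rewrite:trace3

# Apache 2.2 equivalent (these directives were removed in 2.4):
# RewriteLog "/var/log/apache2/rewrite.log"
# RewriteLogLevel 3
```

If nothing about the /forum//feed request shows up in the trace, the rules are probably sitting in a section that never applies to it. Trace logging slows the server down noticeably, so it is worth switching off again once the problem is found.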
11:59 pm on Jan 23, 2016 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15934
votes: 889


Well, the big problem with mod_rewrite is that it isn't inherited in the normal way, so any time you have RewriteRules in a directory section, the RewriteRules from any directory section further up the line are abandoned unless you explicitly ask for them to be inherited. That includes things like access control and domain-name-canonicalization, if they are done with mod_rewrite, that you'd want to apply across the board. (For most purposes, "directory section" means both <Directory blahblah> and .htaccess.)
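If rules do have to live in a directory section, mod_rewrite has a documented opt-in for inheritance, RewriteOptions Inherit. A minimal sketch, assuming a .htaccess file inside the forum directory and parent rules in the document root's .htaccess (both locations are assumptions for illustration):

```apache
# /forum/.htaccess (hypothetical location)
RewriteEngine On

# Pull in the rewrite rules from the parent directory's configuration.
# Without this line, the parent's per-directory rules are not applied here.
RewriteOptions Inherit

# Rules specific to the forum would go here.
```

With plain Inherit the parent's rules are applied after the local ones, which matters if a local rule ends processing first.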

If you've got access to the config file, then yes, it's more efficient to put all your rules there. Save htaccess for your test directory where you can try out changes without having to restart the server every five minutes.

Is it possible you're saying "Directory" when you mean "Location"?
<Directory /forum>
means the physical directory "forum" located at the top level of your server. And surely that's not where your forums are located? I'd think something more like
<Directory /public/blahblah/example.com/forum>
like that. But, again, the inheritability issue. That's why I recommend putting the rule in the same place as the rules that govern the rest of this site, with a pattern starting in
^forum/

:: detour to double-check something ::

No trailing slash in a Directory statement. Not sure if it will outright break the rule (it won't crash the server, but may be interpreted as something other than what you intended); it just isn't used.
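For comparison, here is a sketch of what a working <Directory> version could look like. The filesystem path is the placeholder from the post above, not a real location, and the rules inside it replace any per-directory rules inherited from parent directories, which is the inheritability issue already mentioned.

```apache
# Sketch only: <Directory> takes a filesystem path, not a URL path,
# and is written here without a trailing slash.
<Directory /public/blahblah/example.com/forum>
    # Per-directory rewriting needs symlink following enabled.
    Options +FollowSymLinks
    RewriteEngine On

    # REQUEST_URI is still the full URL path, so the condition is unchanged.
    RewriteCond %{REQUEST_URI} ^/forum//+(.*)

    # In this context the rule pattern sees the path relative to the
    # directory; ^ matches any path, including an empty one.
    RewriteRule ^ http://www.example.com/forum/%1 [R=301,L]
</Directory>
```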