Home / Forums Index / Code, Content, and Presentation / Apache Web Server
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL & phranque

Apache Web Server Forum

    
Mod rewrite
Which URL is listed on search engines?
jetboy_70




msg:1522056
 1:43 pm on Sep 23, 2002 (gmt 0)

If I have a page on a site with a link to shortpage1.html, and I use Apache's mod_rewrite to redirect requests for it to longpage1.html, which of the two URLs will be listed on search engines when the page is spidered?

If the answer is longpage1.html, is there a method to get shortpage1.html listed?

 

jatar_k




msg:1522057
 4:43 pm on Sep 23, 2002 (gmt 0)

Yes, the method is to not use mod_rewrite. The spider can only list what it sees; if you rewrite the URL, it will list whatever URL it gets.

DaveAtIFG




msg:1522058
 4:55 pm on Sep 23, 2002 (gmt 0)

If my understanding is correct, mod_rewrite allows you to choose whether a visitor sees the "old" or "new" URL. Take a look at the section entitled "Content Handling" in this doc [httpd.apache.org]. I think that an "internal redirect" presents the "old" URL to the visitor, while an "external redirect" presents the "new" URL.

<added>BTW, welcome to WebmasterWorld! :) </added>
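
A minimal sketch of the two cases, assuming a .htaccess file in the document root and the page names from the question. The rules are an illustration, not something quoted from the linked doc:

```apache
RewriteEngine On

# Internal rewrite: the request for shortpage1.html is answered with the
# content of longpage1.html, and the visitor (or spider) keeps seeing
# shortpage1.html as the URL.
RewriteRule ^shortpage1\.html$ /longpage1.html [L]

# External redirect (an alternative to the rule above, not an addition):
# the server tells the client to request longpage1.html instead, so the
# "new" URL is the one the client ends up with.
# RewriteRule ^shortpage1\.html$ /longpage1.html [R,L]
```

With the internal form no redirect status is sent at all; the client simply receives the page under the URL it asked for.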

jetboy_70




msg:1522059
 10:13 am on Sep 24, 2002 (gmt 0)

Hmmm...

Spent a few hours on this last night. Dave's suggestion worked, but with an external redirect (needed in this scenario) the page was returned with a 302 HTTP status code. I couldn't find a way of changing this to the 200 OK code I need; sending a header using PHP didn't seem to work. If anyone knows a way round this, I'd love to know.

I've compromised on a 404 error trap, a technique most often used on database-driven sites to alter dynamic URLs. With it, I seem to be able to change the HTTP status code from 404 to 200 with no problem.

Both shortpage1.html and longpage1.html are sent to the custom 404 page, which serves the same content for both. The only difference is a robots meta tag, added when shortpage1.html is requested, to prevent duplicate pages being spidered if a link to shortpage1.html ever gets out into the wild.

This isn't an answer to my initial question, but it solves a few of the problems that prompted me to ask it in the first place.
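
For anyone following along, the Apache side of such a 404 trap can be sketched roughly as below. The handler name /handler.php is hypothetical (the post does not name the script), and this assumes neither shortpage1.html nor longpage1.html exists as a real file, so both requests fall through to the error handler:

```apache
# Send every "not found" request to a local script instead of the
# default error page. /handler.php is a placeholder name.
ErrorDocument 404 /handler.php

# Because the target is a local path, Apache runs the script internally
# and the URL seen by the client does not change. The script has to work
# out which page was requested (Apache exposes the original path to it
# in the REDIRECT_URL environment variable) and must itself replace the
# 404 status with 200, as described in the post.
```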

DaveAtIFG




msg:1522060
 2:45 pm on Sep 24, 2002 (gmt 0)

According to the Apache mod_rewrite docs, you can select the status code returned when using the external redirection flag, as in "R=301". Check the flags section under the RewriteRule directive in this doc [httpd.apache.org].
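
As a rough example, assuming the same .htaccess setup and page names used earlier in the thread, the flag would be applied like this:

```apache
RewriteEngine On

# External redirect with an explicit status code: 301 (moved permanently)
# instead of the default 302 that a bare [R] produces.
RewriteRule ^shortpage1\.html$ /longpage1.html [R=301,L]
```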

jetboy_70




msg:1522061
 2:57 pm on Sep 24, 2002 (gmt 0)

Yeah, that doc states that you can set the code to anything between 300 and 400. This I can do, and it works as advertised. However, if I stray outside that range to, let's say, 200: no dice. The response code reverts to 302.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved