
Apache Web Server Forum

    
Best way to deal with ?* Canonical
Bilbo




msg:3803038
 1:19 am on Dec 9, 2008 (gmt 0)

Perhaps this is not entirely an Apache question, but what do people think is the best way to deal with the following canonical URL problem:

http://www.example.com/pagerequest.html?some-other-stuff

Traditionally I have used robots.txt:

"Disallow: /*?"

But would it be better to redirect(*) such a request to the true URL? I am thinking this could see an abusive competitor actually optimising the site for you. To stop all this nonsense, I believe you may have to turn their attacks into your defence. By 301 redirecting such requests, a potential canonical attack could actually benefit a site. That, I am sure, would put an end to these attacks.

I must emphasise I am not talking about XSS attacks, just attempts to cause a canonical problem.

Your thoughts are totally appreciated.

(*) "rewrite" changed to "redirect" after g1smd pointed out the error.

[edited by: Bilbo at 2:00 am (utc) on Dec. 9, 2008]

 

g1smd




msg:3803050
 1:30 am on Dec 9, 2008 (gmt 0)

I would never rewrite that so that the URL displays content originally found at some other URL. That rewrite would make the Duplicate Content problem even worse.

I use a 301 redirect to force the user to come back with a new request for the correct URL.
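
As a rough sketch of that kind of redirect (an illustration, not code posted in this thread): assuming mod_rewrite is enabled, the rules live in the .htaccess file in the document root, and www.example.com is the canonical hostname, something like the following should send any .html request that carries a query string back to the bare URL.

```apache
RewriteEngine On
# Only act when the request carries a non-empty query string
RewriteCond %{QUERY_STRING} .
# The trailing "?" on the target discards the original query string
RewriteRule ^(.*\.html)$ http://www.example.com/$1? [R=301,L]
```

On Apache 2.4 and later, the QSD flag can be used instead of the trailing question mark. Note that this only makes sense if no .html page on the site legitimately takes a query string.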

Bilbo




msg:3803056
 1:35 am on Dec 9, 2008 (gmt 0)

Hi g1smd (I have read a lot of your stuff and you're awesome),

My wording was incorrect: it should 301 redirect to the correct URL and show that in the Location header. I have amended the post to show the change. Thanks for highlighting the error.

[edited by: Bilbo at 2:02 am (utc) on Dec. 9, 2008]

jdMorgan




msg:3803516
 4:57 pm on Dec 9, 2008 (gmt 0)

It's not clear what you think is wrong with that URL. If it is the case that you never expect a query string to be appended to a .html URL on your site, then certainly you should redirect to remove spurious queries from those URLs. If the problem you perceive is something else, then please be more specific.

Again, any unique page on your site should be reachable using one and only one URL -- Anything else is a duplicate-content problem if not 301-redirected or rejected with a 404 or 410 error response.

For the purposes of this subject, no variation of any kind should be tolerated -- down to an exact character-by-character URL comparison level. You are looking for an *exact* match of absolutely everything showing in the browser address bar, and anything else needs handling.

Jim

g1smd




msg:3804213
 1:05 pm on Dec 10, 2008 (gmt 0)

I think the problem is that the ?some-other-stuff part is completely redundant information.

In that case, redirect to strip it off. At the same time, within the same redirect, force www on the domain, etc.
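
A minimal sketch of doing both in one hop, under the same assumptions as before (mod_rewrite enabled, rules in the root .htaccess, www.example.com as the canonical host, and every client sending a Host header); treat it as an illustration rather than a drop-in ruleset:

```apache
RewriteEngine On
# Fire if the host is not the canonical one, OR a query string is present
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [OR]
RewriteCond %{QUERY_STRING} .
# One 301 fixes both problems; the trailing "?" drops the query string
RewriteRule ^(.*\.html)$ http://www.example.com/$1? [R=301,L]
```

With this in place, a request for http://example.com/pagerequest.html?some-other-stuff should get a single 301 to http://www.example.com/pagerequest.html, rather than a chain of two redirects. Non-.html URLs are not covered by this rule and would need their own host redirect.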


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved