
Google News Archive Forum

    
/folder/ and /folder = duplicate content?
with and without the trailing slash
sean




msg:176110
 2:24 pm on Nov 25, 2004 (gmt 0)

Any ideas why Google would index folder-root files twice, both with and without the trailing slash? When searching for a specific text string, both of the variations below appear in the same SERPs:

www.example.com/folder/
www.example.com/folder

Now, any ideas on how to get Google to only recognize one of the pages? The pages went from First to Worst around the time this bug surfaced...

 

Brett_Tabke




msg:176111
 11:10 pm on Nov 25, 2004 (gmt 0)

If both pages are not identical.

More than likely, you dropped a slash or an inbound link dropped a slash.

Your server is returning the same valid page for both.

q: Fix?
a: Adjust your server settings so that the URL without the slash generates a 404, as it should.

sean




msg:176112
 12:35 am on Nov 26, 2004 (gmt 0)

You mean a 301?

[searchengineworld.com...] :
301 = [webmasterworld.com...]
200 = [webmasterworld.com...]

I appreciate the response... time for me to begin a crash course in defensive webmastering.
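
To see which case applies to a given site, one option is to request both forms of the URL without following redirects and compare the status codes. The following is only a minimal sketch, assuming Python's standard `http.client` module; `head_status` and `diagnose` are hypothetical helper names and are not part of any tool mentioned in this thread.

```python
import http.client


def head_status(host, path, timeout=10):
    """Send a HEAD request; return (status, Location header).

    http.client does not follow redirects, so a 301 is reported as a 301.
    """
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def diagnose(status_with_slash, status_without_slash):
    """Interpret the pair of status codes returned for /folder/ and /folder."""
    if status_with_slash == 200 and status_without_slash == 200:
        return "both return 200: two URLs serve the same page"
    if status_with_slash == 200 and status_without_slash in (301, 308):
        return "slash-less URL permanently redirects: one canonical URL"
    if status_with_slash == 200 and status_without_slash == 404:
        return "slash-less URL is not found: only one URL exists"
    return "unexpected combination: inspect the server configuration"
```

With the placeholder address from the first post, `head_status("www.example.com", "/folder")` would return the status and any `Location` header for the slash-less form, and the two status codes can then be passed to `diagnose`.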

petehall




msg:176113
 12:41 am on Nov 26, 2004 (gmt 0)

adjust your server settings so that without the slash generates a 404 as it should

I realise this is a Google-related discussion, however Yahoo! trims the last / from any listing in its SERPs.

I found it highly irritating and as such I have to ensure both combinations work or Yahoo! and MSN SERPs suffer...

Unless I am mistaken you will not find a single / as the last character of any link from Yahoo! or MSN SERPs.

[edited by: petehall at 12:49 am (utc) on Nov. 26, 2004]

encyclo




msg:176114
 12:44 am on Nov 26, 2004 (gmt 0)

Also check your site's internal links with something like the W3C link checker [validator.w3.org] or similar tool: more often than not there's an internal link pointing to the directory, but without the trailing slash.
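
For a quick local check of a single page, the same idea can be sketched in Python. This is only an illustration and not how the W3C link checker works; `LinkCollector`, `slashless_directory_links`, and the caller-supplied `directories` set are all hypothetical names introduced here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def slashless_directory_links(html, page_url, directories):
    """Return internal links that point at a known directory without the slash.

    `directories` is an assumed set of directory paths written without
    the trailing slash, for example {"/folder"}.
    """
    collector = LinkCollector()
    collector.feed(html)
    site = urlsplit(page_url).netloc
    found = []
    for href in collector.hrefs:
        # Resolve relative links against the page they appear on.
        target = urlsplit(urljoin(page_url, href))
        if target.netloc == site and target.path in directories:
            found.append(href)
    return found
```

Links to other hosts are ignored, so slash-less links on scraper sites would not be caught by this check; it only covers the site's own pages.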

petehall, I'd half-noticed that on one of my sites, but looking further you're completely right and it seems generalized. Usually it's not too bad when you're talking about "genuine" directories (where the server software generates the 301 automatically), but it could be a problem when the directories are done with mod_rewrite and the appropriate rules are not in place for taking into account this problem. One more reason why I like file extensions on rewritten URLs.
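
For rewritten URLs, the redirect that the server adds automatically for real directories has to be reproduced by hand. The decision itself is small; the sketch below expresses it in Python rather than in mod_rewrite syntax, and `VIRTUAL_DIRECTORIES` and `trailing_slash_redirect` are hypothetical names used only for illustration.

```python
# Hypothetical set of rewritten "directory" URLs, stored without the trailing slash.
VIRTUAL_DIRECTORIES = {"/folder", "/folder/sub"}


def trailing_slash_redirect(path, query=""):
    """Return (301, location) if the request should be redirected, else None."""
    if path.endswith("/"):
        return None  # already in the canonical form
    if path in VIRTUAL_DIRECTORIES:
        location = path + "/"
        if query:
            # Keep the query string so the redirect does not lose parameters.
            location += "?" + query
        return 301, location
    return None  # an ordinary file-style URL; leave it alone
```

A request for `/folder` is answered with a 301 to `/folder/`, while `/folder/` and file-style URLs such as `/page.html` pass through untouched, so only one form of each directory URL ever returns a 200.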

sean




msg:176115
 1:14 am on Nov 26, 2004 (gmt 0)

Turns out the slash-less links are coming from scraper sites.

Hopefully, SERPs will not take too long to return to normal.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved