Google SEO News and Discussion Forum

    
Handling duplicate content in downloadable MS Word docs
Zivush




msg:4573228
 4:44 am on May 12, 2013 (gmt 0)

I have MS Word and PowerPoint documents that are offered as downloadable templates. Readers use these templates to create certain types of documents.
I don't want search engines to crawl and index these documents (about 100 of them) because their content is duplicated: the same sentences are repeated and used as filler.
The only way I see right now to keep these files out of the index is to zip them so that search engines can't read them.
Any other solution?
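For the zipping idea in the question, a minimal sketch of batch-wrapping each template in its own archive might look like the following. The set of extensions and the function name `zip_templates` are assumptions for illustration, not something given in the thread.

```python
import zipfile
from pathlib import Path

# Assumed extensions; adjust to the template types actually offered.
TEMPLATE_SUFFIXES = {".doc", ".docx", ".ppt", ".pptx"}


def zip_templates(folder):
    """Wrap each template in `folder` in its own .zip; return the archives created."""
    created = []
    # sorted() materialises the listing first, so new archives are not revisited.
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in TEMPLATE_SUFFIXES:
            # "letter.doc" -> "letter.doc.zip", so same-named .doc/.ppt files do not collide.
            archive = path.with_name(path.name + ".zip")
            with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
                # arcname stores only the file name, not the directory path.
                zf.write(path, arcname=path.name)
            created.append(archive)
    return created
```

The original files are left in place, so they would still need to be removed or unlinked from the site once the archives are published.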

 

Martin Ice Web




msg:4573243
 8:11 am on May 12, 2013 (gmt 0)

@Zivush,

Disallow the download path in robots.txt.

As far as I know, Google fetches the files anyway, and I have also seen some disallowed files in the SERPs, so I prefer to set a

Deny from IP

in .htaccess for the download folder. That will work.
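As a rough illustration of the robots.txt suggestion, the sketch below parses a sample rule with Python's standard `urllib.robotparser` and checks which URLs a compliant crawler would be allowed to fetch. The `/downloads/` path and the `example.com` URLs are placeholders, not the poster's actual setup. Note that this only models crawling permission; as the post says, a disallowed URL can still turn up in the SERPs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; "/downloads/" stands in for the real download path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /downloads/
"""


def is_crawl_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


blocked = is_crawl_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/downloads/template.doc")
allowed = is_crawl_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/about.html")
```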

lucy24




msg:4573252
 9:22 am on May 12, 2013 (gmt 0)

Any other solution?


# Send a noindex header with every .doc and .zip file (requires mod_headers)
<FilesMatch "\.(doc|zip)$">
Header set X-Robots-Tag "noindex"
</FilesMatch>

But zipping downloadable files is a decent idea anyway, even if things don't get mangled in transit as much as they used to.
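One thing worth checking with the FilesMatch pattern above is which file names it actually covers. The sketch below approximates the match with Python's `re` module (Apache uses PCRE, but the two should agree for a pattern this simple). The file names and the extended pattern are assumptions for illustration; the extended pattern is one possible way to also cover .docx and PowerPoint files, which the original pattern does not match.

```python
import re

# The pattern from the FilesMatch block above.
FILES_MATCH = re.compile(r"\.(doc|zip)$")

# Assumed extension of the pattern to cover .docx, .ppt and .pptx as well.
EXTENDED_MATCH = re.compile(r"\.(docx?|pptx?|zip)$")


def gets_noindex_header(filename, pattern=FILES_MATCH):
    """Return True if `filename` would fall inside the FilesMatch block."""
    return pattern.search(filename) is not None


for name in ["template.doc", "bundle.zip", "template.docx", "slides.ppt"]:
    print(name, gets_noindex_header(name), gets_noindex_header(name, EXTENDED_MATCH))
```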

Zivush




msg:4573303
 4:08 pm on May 12, 2013 (gmt 0)

Thanks, lucy24 and MIW.
You gave me two good ideas.
