
Google SEO News and Discussion Forum

Handling duplicate content in downloadable MS Word docs

Msg#: 4573226 posted 4:44 am on May 12, 2013 (gmt 0)

I have MS Word and PowerPoint documents that are offered as downloadable templates. Readers use these templates to create certain types of documents.
I don't want search engines to crawl and index the documents (about 100 docs) because their content is duplicated: the same sentences are repeated and used as filler.
The only way I see right now to keep these files out of the index is to zip them so that search engines can't read them.
Any other solution?


Martin Ice Web

WebmasterWorld Senior Member 5+ Year Member

Msg#: 4573226 posted 8:11 am on May 12, 2013 (gmt 0)


Disallow the download path in robots.txt.

Since I know that Google fetches the files anyway, and I have also seen some disallowed files in the SERPs, I prefer to set a

Deny from IP

in .htaccess for the download folder. That will work.
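
For anyone trying the robots.txt route, here is a minimal sketch of what the rule might look like and how to sanity-check it with Python's standard urllib.robotparser. The /downloads/ path and the example.com URLs are assumptions for illustration; substitute your own download folder. Keep in mind that Disallow only asks crawlers not to fetch the files. A disallowed URL can still show up in results if other pages link to it, which fits the observation above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; "/downloads/" is an assumed folder name.
ROBOTS_TXT = """\
User-agent: *
Disallow: /downloads/
"""


def is_crawl_allowed(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/downloads/template.doc",
        "https://example.com/index.html",
    ):
        print(url, "allowed" if is_crawl_allowed(ROBOTS_TXT, url) else "blocked")
```

This only tests how a standards-following parser reads the rule; it says nothing about what a given search engine actually does with the URL.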


lucy24

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, Top Contributors Of The Month

Msg#: 4573226 posted 9:22 am on May 12, 2013 (gmt 0)

Any other solution?

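# Requires mod_headers; add further extensions (docx, ppt, pptx) to the pattern as needed.
<FilesMatch "\.(doc|zip)$">
Header set X-Robots-Tag "noindex"
</FilesMatch>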

But zipping downloadable files is a decent idea anyway, even if things don't get mangled in transit as much as they used to.
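
Once a header rule like this is in place, it is worth confirming that the server really sends it for the template files. The following is a rough sketch in Python, not a definitive checker: the function names are made up for this example, and it handles only the plain comma-separated form of the header, not the variant that is prefixed with a specific crawler name.

```python
import urllib.request
from typing import Mapping


def has_noindex(headers: Mapping[str, str]) -> bool:
    """Return True if an X-Robots-Tag header carries 'noindex' (or 'none')."""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lower case.
        if name.lower() == "x-robots-tag":
            directives = [d.strip().lower() for d in value.split(",")]
            # 'none' is shorthand for 'noindex, nofollow'.
            if "noindex" in directives or "none" in directives:
                return True
    return False


def fetch_headers(url: str) -> Mapping[str, str]:
    """Send a HEAD request and return the response headers as a dict."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return dict(response.headers.items())


if __name__ == "__main__":
    # Offline demonstration with hand-written headers.
    print(has_noindex({"X-Robots-Tag": "noindex"}))
    print(has_noindex({"Content-Type": "application/msword"}))
```

To check a live file, pass the result of fetch_headers() for one of your own document URLs to has_noindex(). Note that the crawler must be allowed to request the file in order to see this header, so it does not combine well with a robots.txt Disallow for the same path.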


Msg#: 4573226 posted 4:08 pm on May 12, 2013 (gmt 0)

Thanks, lucy24 and MIW.
You gave me two good ideas.
