Handling duplicate content in downloadable MS Word docs

   
4:44 am on May 12, 2013 (gmt 0)



I have MS Word and PowerPoint documents offered as downloadable templates. Readers use these templates to create certain types of documents.
I don't want search engines to crawl and index the documents (about 100 docs) because their content is duplicated: the same sentences are repeated and used as filler.
The only way I see right now to keep these files out of the index is to zip them so that search engines can't read them.
Is there any other solution?
8:11 am on May 12, 2013 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



@Zivush,

Disallow the download path in robots.txt.


As far as I know, though, Google fetches the files anyway, and I have also seen some disallowed files in the SERPs, so I prefer to set a

Deny from IP

in .htaccess for the download folder. That will work.
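
Spelled out, that could look roughly like the sketch below. It assumes Apache 2.2-style access directives and an .htaccess file placed inside the download folder. The address range is a placeholder from the documentation block 192.0.2.0/24, not a real crawler address; you would substitute the ranges you actually want to block.

```apache
# .htaccess in the download folder (Apache 2.2 syntax, mod_authz_host)
# With "Order Allow,Deny", Allow rules are evaluated first, then Deny rules;
# a request matching both is denied.
Order Allow,Deny
Allow from all

# Placeholder range (assumption): replace with the crawler ranges to block
Deny from 192.0.2.0/24
```

On Apache 2.4 the same idea is expressed with Require directives instead of Order/Allow/Deny. Note that this blocks everything coming from those addresses, so it only helps if the ranges really belong to the crawler.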
9:22 am on May 12, 2013 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



Any other solution?


<FilesMatch "\.(doc|zip)$">
Header set X-Robots-Tag "noindex"
</FilesMatch>

But zipping downloadable files is a decent idea anyway, even if things don't get mangled in transit as much as they used to.
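
Since the question also mentions PowerPoint templates, the same rule can be widened to cover them. This is only a sketch: it assumes mod_headers is enabled and that the files use the usual .doc, .docx, .ppt and .pptx extensions, which the thread does not actually state.

```apache
# Send a noindex header for Word, PowerPoint and zip downloads.
# The IfModule wrapper skips the rule quietly if mod_headers is not loaded.
<IfModule mod_headers.c>
  # docx? matches .doc and .docx; pptx? matches .ppt and .pptx
  <FilesMatch "\.(docx?|pptx?|zip)$">
    Header set X-Robots-Tag "noindex"
  </FilesMatch>
</IfModule>
```

The pattern is case-sensitive as written, so files named with uppercase extensions would need their own alternative in the pattern.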
4:08 pm on May 12, 2013 (gmt 0)



Thanks, lucy24 and MIW.
You gave me two good ideas.
 
