Prevent spider from following external links

to keep Amazon items out of search engine database

         

Reno

12:50 am on Aug 15, 2006 (gmt 0)




At my site I offer some books, DVDs, and videos from Amazon.com as part of their affiliate program. On occasion I've checked with Google to see how many of my pages are in their index (using "site:www.mydomain.com"), and more than once I've been surprised to find that Google has indexed these Amazon products.

The site only has a few dozen pages, but because of all these items, the results show hundreds of pages in their database!

Can I use robots.txt to prevent the spider from following any external links that lead to Amazon?

All the product feed link addresses are very long and they all start out the same:
[rcm.amazon.com...] .......

(But then of course each one ends up being a different address, since each is a different product.)

I'm wondering if something like this might work in my robots.txt file?

User-agent: *
Disallow: [rcm.amazon.com...]

That is probably too simplistic, but I thought it best to ask.

Thanks for any help...
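
One way to sanity-check a rule before relying on it is to run it through Python's standard urllib.robotparser module. The sketch below is only an illustration: the "/private/" path and the test URLs are hypothetical, not taken from the site described above. It shows that a Disallow value is matched as a URL path prefix against pages on the host that serves the robots.txt file. Each host's robots.txt describes only that host, so a rule naming another domain would not be expected to control crawling there.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for www.mydomain.com. The "/private/" path is
# an illustrative assumption, not something from the original post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Only the path part of each URL is compared against the Disallow prefix.
    for url in (
        "http://www.mydomain.com/index.html",
        "http://www.mydomain.com/private/page.html",
    ):
        print(url, "allowed" if is_allowed(ROBOTS_TXT, url) else "blocked")
```

Running the same check with the rule you actually plan to publish is a cheap way to see which of your own pages it would cover.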

............................

Barb

7:11 am on Aug 25, 2006 (gmt 0)




Wow, great question. I have Amazon affiliate links on my site too, and I never thought about this. My Amazon links start differently from yours, though, which is weird.