Stopping IPs from Viewing Google Cached Pages

     
2:57 pm on Jul 29, 2007 (gmt 0)

5+ Year Member



Hello,

I've been having problems with a few people in a foreign country visiting my website. They've been searching for the products of only one of my manufacturers, and then submitting many fake requests for product quotes, which tie up my company's resources.

I've concluded that this person (not a bot or spider) is either stealing my website content or my manufacturer's designs. My logs show that they stay on a page an average of 25 seconds and up to a minute, enough to download the entire page content.

I've denied the infringing block of IP addresses from the country in .htaccess and have had no hits from this person since. We've never had an order from this country, so blocking the IP addresses was no problem.

However, this person has since gone to Google cache pages to get my content. I know this from the logs during the time I was banning the IP addresses.

My pages are created on the fly and I can block the IP addresses at the program level. But when Google caches the pages, they are static pages and the program can't deny the page in the Google cache.
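For illustration, a program-level check like the one described above might look like this minimal Python sketch. The blocked ranges are hypothetical placeholders (reserved documentation networks, not the real offending block), and `is_blocked` is an assumed helper name, not part of any actual site code.

```python
import ipaddress

# Hypothetical blocked ranges: reserved documentation networks used as
# placeholders, not the actual offending block of addresses.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Not a parseable IP address; this sketch simply does not block it.
        return False
    return any(address in network for network in BLOCKED_NETWORKS)
```

The page generator would call `is_blocked()` with the visitor's address before rendering and return a 403 if it is true. As noted, this only protects pages served from the site itself, not copies held in a search engine cache.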

I could write some JavaScript to do the blocking, which should work in Google's cache, but they can just disable JavaScript.

Question: is there something I can put into the HTML page (the one that gets cached by search engines) to deny particular IP addresses access to it, without removing my site from the search engine altogether?

Thanks for your thoughts.

3:00 pm on Jul 29, 2007 (gmt 0)

WebmasterWorld Senior Member encyclo is a WebmasterWorld Top Contributor of All Time 10+ Year Member



There is no real reason to allow Google to display a "cached" version of your site, and disallowing the cache has no adverse effect.

Add this to the <head> section of every page:

<meta name="robots" content="noarchive">

The "Cache" link will disappear gradually as the pages are reindexed. The same meta element will work for Yahoo Search and MSN Live Search too.
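Since the pages in question are generated on the fly, the element could be added in the program rather than by hand. Here is a rough Python sketch under that assumption; `add_noarchive` is a hypothetical helper, and the regex checks are deliberately crude (for example, an existing robots element with `content` written before `name` would not be detected).

```python
import re

NOARCHIVE_META = '<meta name="robots" content="noarchive">'

# Matches the opening <head> tag, with or without attributes (but not <header>).
HEAD_OPEN = re.compile(r"<head(\s[^>]*)?>", re.IGNORECASE)

# Crude check for a robots meta element that already mentions noarchive.
HAS_NOARCHIVE = re.compile(
    r"<meta\s[^>]*name=[\"']robots[\"'][^>]*noarchive", re.IGNORECASE
)


def add_noarchive(html: str) -> str:
    """Insert the noarchive meta element right after the opening <head> tag."""
    if HAS_NOARCHIVE.search(html):
        return html  # already present, nothing to do
    match = HEAD_OPEN.search(html)
    if match is None:
        return html  # no <head> found; leave the page untouched
    insert_at = match.end()
    return html[:insert_at] + "\n" + NOARCHIVE_META + html[insert_at:]
```

Running every generated page through a function like this before sending it keeps the tag consistent across the whole site.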

3:02 pm on Jul 29, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There is no real reason to allow Google to display a "cached" version of your site

Apart from allowing people to see your site even when it's down.

3:11 pm on Jul 29, 2007 (gmt 0)

WebmasterWorld Senior Member encyclo is a WebmasterWorld Top Contributor of All Time 10+ Year Member



allowing people to see your site even when it's down

An edge case, at best - if your site is down often enough to need the Google cache then you need to get better hosting. You can't buy products through the cache. You can't log in to a site through the cache.

You can, however, copy content through a cache, thus bypassing any IP bans on the site server; you can find removed content long after the site owner has changed it; and you are allowing a third party to republish your content with their logo at the top.

3:18 pm on Jul 29, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hmmm, I suppose there are a fair few downsides for a webmaster in having the cache enabled, and only a very small benefit, which goes more to other people than to the webmaster himself.

2:36 pm on Jul 30, 2007 (gmt 0)

5+ Year Member



Thank you. I didn't know about that robots tag. I hope they abide by it and the cached ones go away within days, not months.

I'm not concerned about the site going down - our host is very good.

 
