So far I tried IP blocking, even put in the word "torkaland" and "streamica" in referrer and user-agent block list. None of it works! Please help.
torkaland.blogspot.com and streamica.com
Frank_Rizzo
10:10 pm on Feb 26, 2013 (gmt 0)
Did you restart Apache after each change?
tangster
10:31 pm on Feb 26, 2013 (gmt 0)
The blocks are in the .htaccess. I didn't know you have to restart Apache for it to take effect?
The problem is I can't block the IP of the blogspot site because it's owned by Google. I am afraid they might use the same IP to crawl my site, and Googlebot would get blocked.
Whereas streamica seems to be pulling RSS feeds from a different IP than the one it's hosted on, and I don't know which IP they are using to scrape the site.
lucy24
11:29 pm on Feb 26, 2013 (gmt 0)
Anything in htaccess takes effect immediately. The only exception is that if your browser has already cached the page, it may not know that there have been changes.
It is trivial to make a conditional block to say, for example,
RewriteCond %{REMOTE_ADDR} {give the numerical IP here}
RewriteCond %{HTTP_USER_AGENT} !Googlebot
... and then take it from there. Currently all the googlebot variants such as the imagebot and the three-or-more mobiles contain the element "Googlebot" (capitalized) in their User-Agent string.
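For illustration, here is a minimal sketch of how those conditions could sit in a complete rule. It assumes mod_rewrite is available in your .htaccess, and the address 192.0.2.10 is a placeholder from the documentation range, not the scraper's real IP. Bear in mind that a User-Agent string can be faked, so this only spares requests that call themselves Googlebot.

```apache
# Sketch only: 192.0.2.10 is a placeholder address, replace it with the IP you want to block.
RewriteEngine On

# Condition 1: the request comes from this exact IP (dots escaped, pattern anchored).
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\.10$

# Condition 2: the User-Agent does NOT contain "Googlebot" (match is case-sensitive).
RewriteCond %{HTTP_USER_AGENT} !Googlebot

# Both conditions true: answer every URL with 403 Forbidden.
RewriteRule ^ - [F]
```

Consecutive RewriteCond lines are ANDed by default, so a request from that IP is refused unless it identifies itself as Googlebot.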
even put in the word "torkaland" and "streamica" in referrer and user-agent block list.
What exactly do you mean by this? That is, what did you do physically?
tangster
2:06 am on Feb 27, 2013 (gmt 0)
Would it be safe to block blogspot.com, which is on a Google IP? My concern is that Googlebot may also use the same IP sometimes. If someone can confirm it doesn't, that would be of immense help.