Hi,
I have a new site that is dedicated to the UK & NI market. It doesn’t have much going on yet, so now is the time to decide on my allow/block policy.
I am pretty fed up with the endless list of bots going through my site; when I check them out, their pages say you can BUY the information they have scraped for [insert_purpose_here]. I am also fed up with continuously monitoring large logs to see if anything else is misbehaving, sooo….
I am thinking of allowing a few selected bots (Google, Bing, DuckDuckGo, etc.) verified by reverse lookup, and stopping everything else that does not have a UK/NI IP. I can then refine the policy manually using the logs. I will still have to check by hand, but it should be a less arduous process with a much smaller log file.
Nice and simple.
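For clarity, this is roughly what I mean by the reverse lookup check, as a minimal Python sketch. The hostname suffixes are my best guess and would need checking against each engine's own docs (DuckDuckGo, for instance, publishes a list of crawler IPs rather than relying on rDNS), and the test IP is just an example from Google's published crawler range:

```python
import socket

# Hostname suffixes the major search engines use for their crawlers.
# My assumption -- verify against each engine's own documentation.
ALLOWED_SUFFIXES = (
    ".googlebot.com",
    ".google.com",
    ".search.msn.com",    # Bingbot
    ".duckduckgo.com",    # may not apply; DuckDuckGo publishes IPs instead
)

def is_allowed_bot(ip: str) -> bool:
    """Forward-confirmed reverse DNS: do a PTR lookup on the IP, check
    the hostname suffix, then resolve that hostname forward and confirm
    the original IP is in the results (so the PTR can't be spoofed)."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)             # reverse (PTR) lookup
    except socket.herror:
        return False
    if not hostname.endswith(ALLOWED_SUFFIXES):
        return False
    try:
        _, _, addresses = socket.gethostbyname_ex(hostname)   # forward lookup
    except socket.gaierror:
        return False
    return ip in addresses                                    # confirm the round trip

if __name__ == "__main__":
    # Example address from Google's published crawler range.
    print(is_allowed_bot("66.249.66.1"))
```

In practice I'd cache the result per IP so each crawler only costs the two DNS lookups once, and the UK/NI filter would sit in front of this as a GeoIP lookup.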
Can you think of any major downsides to doing it this way?
Thank you.