Blocking Content Scrapers That Use Cloudflare?

         

jc2021

7:00 am on May 21, 2022 (gmt 0)

5+ Year Member Top Contributors Of The Month



From thread [webmasterworld.com...]

I like RSS as a mechanism to keep current with a specific site. But a few RSS bots scrape all of a site's content, sometimes several times a day. You can see their traffic in your raw access log, then ban them by IP range or by bot user-agent name. I had to ban a lot of them, and you should as well.


Even when we find the IPs in the raw access log, they seem to belong to Cloudflare...
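
For anyone wanting to do the log check described in the quote, here is a minimal sketch, assuming an Apache/nginx "combined" format access log and a feed served at /rss.xml. The feed path, the sample log lines, and the function name count_feed_hits are made up for illustration; adjust them to your own setup. It only counts feed pulls per IP and user agent. It does not tell you who is behind a Cloudflare address.

```python
import re
from collections import Counter

# Combined Log Format:
# ip ident user [time] "METHOD path PROTO" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def count_feed_hits(lines, feed_path="/rss.xml"):
    """Count requests for the feed, grouped by (client IP, user agent).

    feed_path is an assumed location; any query string is ignored when matching.
    """
    hits = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group("path").split("?")[0] == feed_path:
            hits[(m.group("ip"), m.group("agent"))] += 1
    return hits

# Hypothetical sample lines (documentation IP ranges, invented bot name).
sample = [
    '203.0.113.7 - - [21/May/2022:07:00:01 +0000] "GET /rss.xml HTTP/1.1" 200 5120 "-" "ExampleFeedBot/1.0"',
    '203.0.113.7 - - [21/May/2022:09:00:01 +0000] "GET /rss.xml?x=1 HTTP/1.1" 200 5120 "-" "ExampleFeedBot/1.0"',
    '198.51.100.4 - - [21/May/2022:09:05:00 +0000] "GET /index.html HTTP/1.1" 200 900 "-" "Mozilla/5.0"',
]

# Heaviest feed pullers first.
for (ip, agent), n in count_feed_hits(sample).most_common():
    print(n, ip, agent)
```

On a real log you would pass an open file instead of the sample list, e.g. count_feed_hits(open("access.log")). Where many different pullers share the same IP range, the user-agent column is usually the more useful one to look at.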

Brett_Tabke

11:23 am on May 21, 2022 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Use RSS-feed-specific URLs so you can track the clickbacks. I am about to remove the RSS feeds here. They generated very few clicks last month and over 10k bot pulls.

jc2021

6:09 am on Jun 9, 2022 (gmt 0)

5+ Year Member Top Contributors Of The Month



Use RSS-feed-specific URLs so you can track the clickbacks. I am about to remove the RSS feeds here. They generated very few clicks last month and over 10k bot pulls.


Can you elaborate?

jc2021

4:46 am on Jul 27, 2022 (gmt 0)

5+ Year Member Top Contributors Of The Month



Any other input?

Brett_Tabke

5:36 am on Jul 27, 2022 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Add tracking to your RSS URLs, e.g. ?utm=rssfeed&ip=theirIP, where theirIP is the IP that requested the feed.

Then look for clickbacks.
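
A rough sketch of how that tagging could look, as I read the suggestion: when the feed is generated for a request, each item link gets the utm and ip parameters appended, and later any request carrying those parameters points back to the IP that pulled that copy of the feed. The function names add_tracking and clickback_source and the example URLs are hypothetical; only the parameter names come from the post above.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def add_tracking(url, requester_ip):
    """Append utm=rssfeed and ip=<requester_ip> to a feed item URL,
    keeping any query parameters the URL already has."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [("utm", "rssfeed"), ("ip", requester_ip)]
    return urlunsplit(parts._replace(query=urlencode(query)))

def clickback_source(request_path):
    """Return the feed puller's IP recorded in a clicked link,
    or None if the request did not come from a tagged feed URL."""
    params = dict(parse_qsl(urlsplit(request_path).query))
    if params.get("utm") == "rssfeed":
        return params.get("ip")
    return None

# Tag a link while building the feed for the client at 203.0.113.7 (example IP).
tracked = add_tracking("https://example.com/post/42", "203.0.113.7")
print(tracked)

# Later, a request path seen in the access log reveals which feed pull it came from.
print(clickback_source("/post/42?utm=rssfeed&ip=203.0.113.7"))
```

If tagged links start showing up with referrers from a site you never gave permission to, the ip value tells you which feed request to block. Note that this only works if the feed is generated per request; a cached or static feed file would carry the same ip value for everyone.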