Why so many China bots?

         

roshaoar

10:37 am on Oct 10, 2014 (gmt 0)

10+ Year Member



I run a small niche photography site which has posting capability and I look at the logs quite a lot. There are all sorts of ways to stop the post bots like UA, country, IP but I've abandoned them all because the processing overhead outweighs the benefit - they never succeed in actually posting anything.

These bots all behave the same way: they request a page [get] and then try to post to exactly the same page [post]. Most come from China, but Ukraine, Russia, the US, South Africa, France and the Netherlands have a few too.
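For anyone who wants to count these visits in their own logs, here is a minimal sketch. It assumes an Apache/Nginx "combined" style access log; the function name, the regex, and the sample lines (which use documentation IP addresses) are all illustrative and not taken from the poster's site.

```python
import re
from collections import defaultdict

# Matches the start of a "combined" format access log line (an assumed format).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)

def find_get_then_post(lines):
    """Return (ip, path) pairs where a GET is later followed by a POST to the same path."""
    seen_get = defaultdict(set)  # ip -> paths that IP has already fetched with GET
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that do not look like the assumed format
        ip, method, path = m["ip"], m["method"], m["path"]
        if method == "GET":
            seen_get[ip].add(path)
        elif method == "POST" and path in seen_get[ip]:
            hits.append((ip, path))
    return hits

# Made-up sample lines using RFC 5737 documentation addresses.
sample = [
    '203.0.113.7 - - [10/Oct/2014:10:37:00 +0000] "GET /gallery.php HTTP/1.1" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
    '203.0.113.7 - - [10/Oct/2014:10:37:01 +0000] "POST /gallery.php HTTP/1.1" 403 210 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
    '198.51.100.4 - - [10/Oct/2014:10:38:00 +0000] "GET /index.html HTTP/1.1" 200 900 "-" "Mozilla/5.0"',
]
print(find_get_then_post(sample))
```

A real visitor submitting a form also does a GET followed by a POST, so this only surfaces candidates; the user agent and the response status still need a look.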

The other bots try to exploit loopholes, looking for wp-login, admin, etc. Again, they never find anything, and rather than giving them any attention I just let them 404.

I'd love to understand a bit more about them, especially the China get/post ones, because they seem so utterly pointless. They never succeed. How are these things run, and why do they identify themselves so easily (MSIE 6)? What are they: link spam bots, or infected computers? Or just some script working its way down a list of thousands of addresses?

Many thanks

lucy24

5:26 pm on Oct 10, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



They're looking for sites that do let you post successfully. It's just a preliminary test.

Rough analogy: everyone who sends spam email buys a list. But the really expensive lists are the ones showing email addresses that have actually responded to junk mail. Those are the valuable prospects. Your robots are similarly looking for sites that can be put on the short list.

There's currently a small botnet that does just two things: PUT "nyet.gif"* followed immediately by GET "nyet.gif". On any normal site, the two actions would get a 403 followed by a 404 (assuming a previously unknown visitor). They're looking for sites that get them 200 followed by 200.
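The logic of that two-step probe can be sketched as a small classifier over the pair of status codes. This is only an illustration: the function name and the labels are made up here, and treating any 2xx as success (rather than exactly 200) is an assumption.

```python
def classify_probe(put_status: int, get_status: int) -> str:
    """Interpret the status pair from a PUT-then-GET probe of the same file."""
    put_ok = 200 <= put_status < 300  # assumption: any 2xx counts as an accepted upload
    get_ok = 200 <= get_status < 300
    if put_ok and get_ok:
        # The file was written and then served back: the site goes on the short list.
        return "upload accepted"
    if not put_ok and get_status == 404:
        # The normal case described above, e.g. 403 followed by 404.
        return "probe failed"
    return "inconclusive"

print(classify_probe(403, 404))
print(classify_probe(200, 200))
```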

rather than giving them any attention I just let them 404

This may actually be the best thing to do. A 403 is viscerally satisfying, but a 404 tells them that these aren't the droids you're looking for. No lingering questions about what you don't want them to see. Repeat offenders should still get IP-based lockouts simply because they may be followed up by something more insidious than just asking for a nonexistent file and then quietly going away.
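One way to decide who counts as a repeat offender is to count recent 404s per IP. The sketch below is hypothetical: the class name, the threshold of 5 misses and the 10-minute window are arbitrary choices for illustration, and the actual lockout (firewall rule, htaccess entry) is left out.

```python
from collections import defaultdict, deque

class OffenderTracker:
    """Track recent 404s per IP and report when an IP crosses a threshold."""

    def __init__(self, max_misses=5, window_seconds=600):
        self.max_misses = max_misses
        self.window = window_seconds
        self.misses = defaultdict(deque)  # ip -> timestamps (seconds) of recent 404s

    def record_404(self, ip, now):
        """Record a 404 at time `now`; return True if the IP should be locked out."""
        q = self.misses[ip]
        q.append(now)
        # Forget misses that are older than the window.
        while q and now - q[0] > self.window:
            q.popleft()
        return len(q) >= self.max_misses

tracker = OffenderTracker(max_misses=3, window_seconds=60)
results = [tracker.record_404("203.0.113.7", t) for t in (0, 10, 20)]
print(results)
```

Timestamps are passed in rather than read from the clock so the same code can be replayed over old log files.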


* Russian** for "get out of my sight, you horrible robot".
** I looked it up. Ukrainian uses a different word.

roshaoar

9:05 am on Oct 11, 2014 (gmt 0)

10+ Year Member



Thank you Lucy!

Yes, seeing that you can affect them is quite entertaining, but all it really achieved for me was bigger log files and slower processing. The problem was that the list keeps expanding forever as they try different URL 'guesses', so it just creates work for something of debatable benefit anyway :) Weird that there are so many from China, though.

keyplyr

9:58 pm on Oct 11, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If blocking bad agents slows your site, move to a new host!

Angonasec

3:32 am on Oct 12, 2014 (gmt 0)



Q/How are these things run, and why do they identify themselves so easily/Q

You're looking at very basic Sinobots. The more devious Sinobots are hosted outside China, mostly in USA, and do their best to look like normal punters browsing.

Detecting and exposing them here can be regarded as a hobby for some log-heads :)

Hobbs

6:30 pm on Nov 8, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The following is based on lots of gut feeling, few facts, many guesses and some intuition, so don't ask for proof or details.

IMHO it probably is one of 2 possibilities:

A) Domaining
A huge Japanese-owned operation grabbing expiring domains, creating doorway pages and funneling their traffic, then dropping them. There is a South Korean element to the equation. China's population and ISP ranges are used for the crawling, for the obvious reason of sheer size; blocking china ranges is 1000+ lines of rules in a firewall & lots of grief.

B) Less likely but possible: a China-based spamming/content-reaping operation, or a current or future search engine, that is endorsed by or at least beneficial to the Chinese Government. Think of it as an easier-than-censorship way to have content owners block your population. It is less likely because only a few site owners will go to the trouble of blocking 1000 IP ranges, or even know how.

Both could be wrong, but for certain it's a botnet centrally managed by one entity, it's bigger than the biggest out there, and it's up to no good, hence 403.

lucy24

8:52 pm on Nov 8, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



blocking china ranges is 1000+ lines of rules in a firewall & lots of grief.

What on earth do you use to code your firewall? I block China in htaccess and it's currently
:: shuffling papers ::
well under 150 lines. The only grief is that felt by well-intentioned humans who, through no fault of their own, happen to live in China.
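One possible reason two people get such different line counts for the same country is how far adjacent and overlapping ranges have been merged. Here is a sketch using Python's standard `ipaddress` module; the ranges are RFC 5737 documentation blocks standing in for real allocations, and `is_blocked` is a made-up helper name.

```python
import ipaddress

# Placeholder ranges from the RFC 5737 documentation blocks, not real allocations.
raw_ranges = [
    "192.0.2.0/25",
    "192.0.2.128/25",    # adjacent to the line above, so the two merge into one /24
    "198.51.100.0/24",
    "198.51.100.64/26",  # already covered by the /24 above
]

networks = [ipaddress.ip_network(r) for r in raw_ranges]
collapsed = list(ipaddress.collapse_addresses(networks))

def is_blocked(ip):
    """Return True if `ip` falls inside any of the collapsed ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in collapsed)

print([str(n) for n in collapsed])
print(is_blocked("192.0.2.200"), is_blocked("203.0.113.1"))
```

Four input lines become two here. Real lists will not shrink by the same ratio, and blocking on broader prefixes than the allocations strictly require shortens the list further at the cost of precision.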

Hobbs

9:27 pm on Nov 8, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



incredibill says 4,000+
to be precise 4,912 lines/ranges here
[incredibill.me...]

personally I don't block even 1000 ranges

The only grief is that felt by well-intentioned humans


Very true and unfortunate. My site is very popular in business circles down there, with many forum mentions & links, but I'm hoping they'd understand that there's a price to pay when your gov has been turning a blind eye to the same offender for over a decade. I've sent my share of documented complaints already and don't feel guilty anymore. I imagine Nigerian visitors too are used to lots of 403s; unfortunate indeed.

keyplyr

10:15 pm on Nov 8, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I block all known China ranges as well, for several reasons. The Chinese gov't is hand-in-hand with some of the hacking/theft, or blatantly ignores internet copyright agreements. This, in addition to the endless script injection attempts & port scans, just became too much to endure. I also block politically, for human rights violations. Sorry to say, sanctions always have collateral damage.

I tried allowing several large ranges belonging to a few universities in China, but found much the same behavior there. Blocked now as well...
so sorry.

Also, I've always had difficulty determining exactly whom Chinese ranges belong to. In most other parts of the world, with a bit of diligence, I can determine the name & type of enterprise assigned to an IP range; for the most part this can't be done with China. 90% of all ranges seem to identify as either Chinanet or China Unicom.

Enter cloud computing, IP forwarding & colocation: for the last couple of years the problem seems to be that these same bad agents are coming from ranges outside China, so the fight continues.

Angonasec

6:20 am on Nov 10, 2014 (gmt 0)



Q/blocking china ranges is 1000+ lines of rules in a firewall & lots of grief/Q

Misinfo worthy of the CIA :)

In fact the serious Chinese bots don't use Chinese-based CIDRs any more.

Ask the CIA for details :)

wilderness

1:33 pm on Dec 8, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Only in rare instances are 'we' aware of why these rogue bots are harvesting materials.

They don't identify themselves in an accepted way (a valid User Agent), so determining why they are gathering the data is nearly impossible.

In other instances, the files they request identify their intent (e.g., PHP vulnerabilities, WP vulnerabilities or domain registries).

Even with some 3rd party bots that clearly identify themselves, we never know what their intentions are.

keyplyr

10:51 pm on Dec 8, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Q/blocking china ranges is 1000+ lines of rules in a firewall & lots of grief/Q Misinfo worthy of the CIA :) In fact the serious Chinese bots don't use Chinese-based CIDRs any more. Ask the CIA for details :)

As usual, I have no idea what you are talking about.

Hobbs

11:57 pm on Dec 8, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>What do they get in return?

Detailing here how spam works would be counterproductive, but generally it's money, via content theft, spam networks, and exploiting holes Google is apparently too busy - or too big - to plug!

I just wasted another 15 min. reporting, for the third time in 6 months, a certain Chinese "stone crusher mining equipment" network hogging top SERP spots in Totally Unrelated topics with random stolen content from everywhere. AdSense too is littered with their ads, go figure.