Linespider - do I allow, or block?

Who uses it?

         

SumGuy

11:26 pm on Apr 1, 2021 (gmt 0)




Besides a few threads here pointing out a few IPs and user-agents, I'm wondering what humans on planet earth use the data collected / scraped by this bot. I'm thinking if there are any, they are in South Korea and/or Japan.

These questions have led me to Line Corporation's Wikipedia page, which again tells me nothing about what that entity actually does for a living. And what exactly is a "Line-app"?

For what it's worth, my most recent hit comes from IP 147.92.153.9. Very very rare to see these at this point. I expect this will change.

If a lot of people in SK / Japan actually do searches or otherwise interact with "line-apps" then ok, I'm cool with that, and the IPs get a reprieve from occupying space in my router's blocking list. That's essentially what this boils down to for me.

iamlost

12:24 am on Apr 2, 2021 (gmt 0)




Linespider is the crawler of Line, the Japanese subsidiary of the Korean online platform Naver.
Note: currently SoftBank holds a minority position, with Naver owning the majority of Line.
Note: just search for ‘naver line’ or ‘line software’ for details.

As (1) tens of millions of people in east Asia use Line and other Naver services (2) it tends to obey robots.txt and (3) rDNS tends to resolve/identify appropriately:
crawl.147-92-153-9.search.line-apps.com, I allow them.

Note: Naver rDNS eg: crawl.nnn-nnn-nnn-nnn.web.naver.com
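
For anyone who wants to automate the rDNS check in point (3), here is a minimal Python sketch of a forward-confirmed reverse DNS lookup. The two hostname suffixes are taken from the examples in this post; the function names and the injectable `reverse`/`forward` parameters are my own illustration, not anything Line or Naver publish, so treat it as a starting point under those assumptions.

```python
import socket

# Suffixes taken from the rDNS examples in this thread (assumed, not an official list).
ALLOWED_SUFFIXES = (".search.line-apps.com", ".web.naver.com")


def hostname_is_expected(hostname, suffixes=ALLOWED_SUFFIXES):
    """Return True if the hostname ends with one of the expected crawler suffixes."""
    host = hostname.rstrip(".").lower()
    return host.endswith(tuple(suffixes))


def verify_crawler_ip(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex,
                      suffixes=ALLOWED_SUFFIXES):
    """Forward-confirmed reverse DNS: IP -> hostname -> IPs must include the original IP."""
    try:
        hostname, _aliases, _addrs = reverse(ip)
    except OSError:
        return False  # no PTR record or lookup failure
    if not hostname_is_expected(hostname, suffixes):
        return False  # rDNS points somewhere unexpected
    try:
        _name, _aliases, addresses = forward(hostname)
    except OSError:
        return False  # hostname does not resolve forward
    return ip in addresses
```

Calling `verify_crawler_ip("147.92.153.9")` performs live DNS lookups, so the result depends on what DNS returns at the time. The forward step matters because anyone who controls the reverse zone for their own IP range can make the PTR record claim any hostname.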

YMMV.

lucy24

12:35 am on Apr 2, 2021 (gmt 0)




(2) it tends to obey robots.txt
. . . and a good thing, too, since they tend to send humanoid headers and therefore never got blocked by default; they just started showing up.

It’s true that a depressing proportion of any site’s traffic comes from robots. But once you filter out the unequivocal search engines on one side, and the unequivocal malign robots on the other, anyone who is robots.txt compliant can generally be tossed into the “no skin off my nose” category. And, given that they are compliant, there’s no point in blocking them when you can just Disallow them. A request that isn’t made in the first place is less work for your server than a blocked request.
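
To make the Disallow option concrete, here is a small Python sketch that checks a robots.txt rule set with the standard library's `urllib.robotparser`. The `Linespider` user-agent token and the example.com URL are assumptions for illustration; check your own logs for the exact token the bot sends before relying on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one compliant bot, leave everyone else alone.
# "Linespider" as the user-agent token is an assumption based on the bot's name.
ROBOTS_TXT = """\
User-agent: Linespider
Disallow: /

User-agent: *
Disallow:
"""


def can_fetch(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(can_fetch("Linespider", "https://example.com/page"))
print(can_fetch("SomeOtherBot", "https://example.com/page"))
```

This only shows what a compliant crawler would conclude from the file. It does nothing against a bot that ignores robots.txt, which is where a real block still earns its keep.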

wilderness

3:14 am on Apr 2, 2021 (gmt 0)




"Opinions are like . . . "
Distrusting the word "spider" in the UA was one of the earliest lessons learned.
In more than twenty years, I've NEVER seen anything beneficial.