Was looking for info on a certain UA, found this.

"Our API handles rotating proxies and headless browsers for you"

         

SumGuy

10:26 pm on Jun 12, 2023 (gmt 0)

5+ Year Member Top Contributors Of The Month



I stumbled across this while looking for a site that would tell me if a certain UA was legit.

www zenrows com /blog/user-agent-web-scraping

The following might be useful for some here that key in on the UA (and to know that outfits like this exist):

==============
Beware that using a wrongly formed user agent will get your data extraction script blocked.

What Are the Best User Agents for Scraping?

We compiled a list of the best user agents for web scraping for emulating a browser and avoid getting blocked:

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36

Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Safari/605.1.15

Mozilla/5.0 (Macintosh; Intel Mac OS X 13_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.1 Safari/605.1.15
================

This outfit's main function is:

=================
Frustrated that your web scrapers are blocked once and again?

Our API handles rotating proxies and headless browsers for you.
=================

How nice.
That's why at some point everything comes down to IP blocking.
The UA I was looking for more info on is:

Mozilla/5.0 (Android; Mobile; rv:13.0) Gecko/13.0 Firefox/13.0
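A rough way to flag a UA like this one in code: it claims Firefox 13, which dates from 2012, so a simple version floor catches it. This is only a sketch. `MIN_MAJOR` and `stale_browser` are made-up names for illustration, and the floor of 100 is an arbitrary assumption you would tune against your own logs.

```python
import re

# Hypothetical floors: any claimed major version below these is treated as stale.
MIN_MAJOR = {"Firefox": 100, "Chrome": 100}


def stale_browser(ua: str):
    """Return (name, major) if the UA claims a browser older than its floor, else None."""
    for name, minimum in MIN_MAJOR.items():
        match = re.search(rf"{name}/(\d+)", ua)
        if match and int(match.group(1)) < minimum:
            return name, int(match.group(1))
    return None
```

Note that the "best user agents" quoted above (Chrome 108 and 109) pass this check, which is the point of that list: a version floor only catches the lazy ones.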

tangor

11:55 pm on Jun 12, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Pretty sure this kind of stuff is shared around quite frequently. Sigh.

blend27

12:37 am on Jun 13, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Check headers and session cookies. Put a local (same-site) image at the end of your HTML template, invoked by JS: did it load? If not, the next request gets a CAPTCHA.

Same session Diff UA? = BLOCK.
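The "same session, different UA" rule could look something like this. It is a minimal in-memory sketch, assuming you already have a session ID from your cookie; `SessionUATracker` is a hypothetical name, and a real site would keep this in its session store rather than a dict.

```python
class SessionUATracker:
    """Remembers the first User-Agent seen for each session ID."""

    def __init__(self):
        self._first_ua = {}  # session id -> first User-Agent seen

    def allow(self, session_id: str, user_agent: str) -> bool:
        # setdefault stores the UA on first sight and returns the stored one after that.
        first = self._first_ua.setdefault(session_id, user_agent)
        return first == user_agent
```

Call `allow()` on every request and block when it returns False. A rotating-UA scraper that reuses one session cookie trips it on its second request.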

No known browsers utilize other methods.

Track your code in the app you build. Header loads image, JS involved, images served via JS? Go from the top! Load content for real browser visitors only.
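One way to read the image check is as a small gate: mark the session when a page is served, clear the mark when the JS-invoked image is requested, and challenge any session whose mark is still set on its next request. The sketch below assumes that reading; `BeaconGate` and its method names are hypothetical.

```python
class BeaconGate:
    """Tracks sessions that were served a page but never fetched the JS-loaded image."""

    def __init__(self):
        self._pending = set()  # session ids still waiting on the image request

    def page_served(self, session_id: str) -> None:
        self._pending.add(session_id)

    def beacon_loaded(self, session_id: str) -> None:
        self._pending.discard(session_id)

    def needs_captcha(self, session_id: str) -> bool:
        # Check this at the start of the next page request, before page_served().
        return session_id in self._pending
```

A client that never runs the JS never requests the image, so its second page request is the one that gets the CAPTCHA.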

Don't try to please everybody that has a funky VPN that visits your site.

......and then there are IP Ranges and RDNS...........
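For the IP range and RDNS side, a sketch using only the Python standard library. The blocked range here is a documentation range used as a placeholder, and `in_blocked_range` and `rdns_confirms` are hypothetical helper names. The RDNS check is the usual forward-confirmed lookup: reverse-resolve the IP, check the hostname suffix, then resolve the hostname forward and make sure the same IP comes back.

```python
import ipaddress
import socket

# Placeholder blocklist: 203.0.113.0/24 is reserved for documentation.
BLOCKED_RANGES = [ipaddress.ip_network("203.0.113.0/24")]


def in_blocked_range(ip: str) -> bool:
    """True if ip falls inside any network in BLOCKED_RANGES."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED_RANGES)


def rdns_confirms(ip: str, suffix: str,
                  reverse=socket.gethostbyaddr,
                  forward=socket.gethostbyname_ex) -> bool:
    """Forward-confirmed reverse DNS: the PTR name ends in suffix and resolves back to ip."""
    try:
        host = reverse(ip)[0]
        if not host.lower().rstrip(".").endswith(suffix):
            return False
        return ip in forward(host)[2]
    except OSError:
        # Lookup failures (no PTR record, no A record) count as unconfirmed.
        return False
```

The `reverse` and `forward` parameters exist only so the lookups can be swapped out for testing; in normal use, leave them at their defaults.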

PROXY > funky proxy: learn what that proxy is and why it's there... then code your anti-proxy that way...

Your site = Your Rules!

blend27

12:38 am on Jun 13, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



One can say "Oh this user does not allow Cookies in their Browser" = aha!

= next!

lucy24

12:51 am on Jun 13, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The eternal caveat: Are you using an access-control rule (proxies, cookies, scripting) that would get you yourself blocked from your site?