Yesterday I got an email from a business that has "servers across the US" working out what keywords businesses are ranking for.
The other day I was looking at a huge database of AdWords advertisements and keywords used by businesses worldwide.
What is going on?
Obviously (to me) these are the products of automated queries - supposedly something Google will not allow.
How are these "crawling" companies hoping to build a long term business based on activities that are outside the Google TOS?
I'm not understanding this at all.
Whether this can be a viable long-term business model, well, that's anyone's guess. And how do they do it? Well, I'd say the phrase "servers across the US" gives a big clue: many servers on dissimilar IP addresses, throttled way back to avoid detection.
throttled way back...
every day for years
Maybe I am naive about how clever they are?
It would be pretty easy to prove that the data was coming from automated queries, wouldn't it?
I know nothing about how it's done, but I'd imagine that you make sure that you
(a) don't fire off your queries too fast, and
(b) have a huge number of random IPs (either non-existent ones or spoofed genuine ones)
Impossible to spot if you mix it up.
I've been temporarily blocked from Google for skimming through too many SERPs in too short a space of time.
Perhaps they have a threshold for so many queries per IP per 24-hour period too. But anything that is programmed can be measured and counter-programmed.
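To make the "so many queries per IP per 24-hour period" idea concrete, here is a rough sketch of what such a check could look like on the receiving end. I have no idea how Google actually does it; the class name `PerIpQueryLimiter`, the limit of 1000 and the example addresses are all made up for illustration.

```python
from collections import defaultdict, deque


class PerIpQueryLimiter:
    """Hypothetical sliding-window check: at most max_queries per IP
    within any window_seconds span (24 hours by default)."""

    def __init__(self, max_queries=1000, window_seconds=24 * 60 * 60):
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        # ip -> timestamps (in seconds) of that IP's recent allowed queries
        self._hits = defaultdict(deque)

    def allow(self, ip, now):
        """Return True if this query is under the threshold, else False."""
        hits = self._hits[ip]
        # Forget queries that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_queries:
            return False  # over the threshold: block or challenge this IP
        hits.append(now)
        return True


# Tiny demo with a small limit so the behaviour is easy to follow.
limiter = PerIpQueryLimiter(max_queries=3, window_seconds=10)
results = [limiter.allow("203.0.113.5", t) for t in (0, 1, 2, 3)]
other_ip = limiter.allow("198.51.100.7", 3)
after_window = limiter.allow("203.0.113.5", 10)
```

Note that a counter like this only sees one address at a time, so traffic spread thinly over many dissimilar IPs stays under the limit on each of them, which fits the "servers across the US" clue mentioned earlier in the thread.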
I'd imagine these guys are constantly testing the limits with bots that do occasionally get caught and then making sure that their proper bots run below that threshold.