Convincing IP spoofing / user simulation - Crawler, Spider, and User Agent ID forum at WebmasterWorld - WebmasterWorld


Convincing IP spoofing / user simulation

Pointers to detection of IP spoofing


Simon_H

4:20 pm on Jan 9, 2016 (gmt 0)

10+ Year Member

Top Contributors Of The Month

Hi guys. I need some expert advice as this isn't my area...

Let's say a third party wants to do a very convincing job of using bots to make it appear that real users are navigating an ecommerce site. So that would mean spoofing IPs of real users and ensuring navigation patterns resemble that of real users. The obvious problem is the bidirectional issue with IP spoofing, i.e. the spoofer sends requests, but won't get a response. For example, if the spoofer wants to post forms, e.g. add to basket, they're working blind.

So, to deal with this, could the spoofer initially hit the page / add to basket / etc from a non-spoofed IP address and note the response? They then do the same thing from the spoofed IPs and ensure it matches what happened on the non-spoofed one. They could then do this multiple times from multiple spoofed IPs and it would look convincing in the server logs. They would presumably need to re-hit the page from the non-spoofed IP every now and then to ensure nothing has changed.

Is this approach plausible to make the bot more convincing or have I missed something that would preclude this, e.g. protocol handshaking issues?

BTW, I'm asking this because I'm fairly sure based on our stats that we're being intermittently hit by bots that are spoofing IPs and simulating user behaviour.

keyplyr

10:17 pm on Jan 11, 2016 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

Google publishes the IP ranges that they assign to their cloud services and similar services used by paying customers, and those IPs are not in that range. So this really does seem to be Google themselves.

There are also Google proxies. But better yet, why don't you post the IPs.

Simon_H

10:24 pm on Jan 11, 2016 (gmt 0)

10+ Year Member

Top Contributors Of The Month

@keyplyr I already did - see previous comments! Here they are again: The IPs are 74.125.63.33, 104.132.20.64/71/77/78/86/90/93.

keyplyr

11:49 pm on Jan 11, 2016 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

I already did - see previous comments!

Sorry, difficult to navigate back & forth to read thread from a phone. Coming back from CES in Vegas.

Maybe someone has posted similar experience at another forum, including Google forums.

iamlost

1:35 am on Jan 12, 2016 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

All IPs currently resolve to Google in Hyderabad, India.

I saw 104.132.20.64 and 104.132.20.86 New Years Eve MST - running Windows with Safari. Tripped defences and were banned for 24hrs. Haven't seen since.

I've seen 74.125.63.33 regularly for years both as googlebot AND as plain vanilla Win with FF (various versions of both).
Normal rDNS would return something such as


1.66.249.66.in-addr.arpa. PTR 86400 crawl-66-249-66-1.googlebot.com.

however 74.125.63.33 returns


NameServer ns1.google.com. reports: No such host 33.63.125.74.in-addr.arpa.

And has been blocked since 2010.
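
For anyone wanting to automate the rDNS check described above, here is a minimal sketch of a forward-confirmed reverse DNS test in Python. The function name and the injectable resolver parameters are made up for illustration, and the suffix list is an assumption based on Google's published advice for verifying Googlebot; adjust it to whatever Google currently documents.

```python
import socket

# Assumption: hostnames of genuine Google crawlers end in one of these suffixes.
GOOGLE_CRAWLER_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip,
                          reverse_lookup=socket.gethostbyaddr,
                          forward_lookup=socket.gethostbyname_ex):
    """Return True only if the PTR name is a Google crawler host AND
    that name resolves back to the same IP (hypothetical helper)."""
    try:
        hostname = reverse_lookup(ip)[0]           # PTR lookup
    except OSError:
        return False                               # e.g. "No such host"
    hostname = hostname.rstrip(".").lower()
    if not hostname.endswith(GOOGLE_CRAWLER_SUFFIXES):
        return False
    try:
        addresses = forward_lookup(hostname)[2]    # forward lookup on the PTR name
    except OSError:
        return False
    return ip in addresses
```

An IP with no PTR record at all, like the 74.125.63.33 result quoted above, fails at the first step, so it would never be treated as a verified crawler regardless of what its user-agent string claims.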

keyplyr

4:33 am on Jan 12, 2016 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

Depends on the look-up tool used. For 74.125.63.33 I see:

GOOGLE-CORP-74-125-56-0
74.125.56.0 - 74.125.63.255
74.125.56.0/21

No name server needed.
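
The range check above needs no DNS at all and can be reproduced with Python's standard ipaddress module. This is only a sketch: the helper name in_block is made up, and the block and addresses are the ones quoted in this thread.

```python
import ipaddress

# The WHOIS block quoted above for GOOGLE-CORP-74-125-56-0.
corp_block = ipaddress.ip_network("74.125.56.0/21")


def in_block(ip, network=corp_block):
    """True if the address falls inside the given network (hypothetical helper)."""
    return ipaddress.ip_address(ip) in network


print(corp_block[0], "-", corp_block[-1])  # 74.125.56.0 - 74.125.63.255
print(in_block("74.125.63.33"))            # True
print(in_block("104.132.20.64"))           # False
```

Note that this only says who the block is registered to. It says nothing about whether the visitor is a crawler, a proxy or a person.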

iamlost

5:46 am on Jan 12, 2016 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

No name server needed.

True.
My point though is that it didn't/doesn't return as a legitimate Google bot IP.
You may allow Google to run rampant, I don't. If it is Google, as this certainly appears, and not a bot - but the user-agent string claims to be a bot - then Google is playing games and should be called on it aka blocked. And if it is NOT Google then the same.

If memory serves that IP was (and may still be) used to test AdWords landing pages. Regardless, the behaviour I remember and that Simon_H reports is definitely outside crawling for indexing.

What kinds of work does Google Hyderabad do?

Our sales teams support small and medium-sized (SMB) advertisers in English-speaking countries, approve ads to run on our global search and content networks, provide sales support to major advertisers around the world and evangelize Google products to SMBs in India and Australia. Our user operations & policy team makes sure that all our products include only quality content, protecting users from account hijacking, spam and improper search results. And we have engineers who've worked on global products like Gmail, Calendar, Docs and Maps, as well as on engineering productivity tools like building and test infrastructure and tools, and developer tools like our YouTube, Calendar and OpenSocial APIs. We've also had engineers adapting Google products for local markets and managing systems for offices in India and other Asia-Pacific markets.

Lots of possibilities as to who in Google might be doing what that shows up in our logs and shopping carts... whether to block is a business model and personal temperament decision.

blend27

5:58 am on Jan 12, 2016 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

...initially hit the page / add to basket / etc from a non-spoofed IP address and note the response....

I had to re-read this a few times, so I am going to throw some tech stuff at you that we did at the GOV several years back.....psssss....

Here are a few pointers if you have some RAM:

1. Each click on a secure target (basket, account, etc.) from a visitor who already has a session needs a fresh unique UUID for the next possible route (via POST only, via AJAX only, or via a $('selector') that identifies which method to use for the action). You could also point the AJAX POST at a router method that picks up the actual method from a randomly generated string, again taken from the selector. JS code on your site could use sockets to push content back to the user's browser once you authenticate the session.

DO NOT USE FORM members for selectors. Generate YOUR POST via an obscure .JS file that is unique to a session.

2. Get yourself a good PCI scan, even after #1.
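
Point 1 above boils down to single-use tokens tied to a session and an action. Here is a rough server-side sketch in Python. The class and method names are made up for illustration, it keeps everything in memory, and it is not what was actually built in the setup described above.

```python
import secrets


class ActionTokens:
    """One-time tokens per (session, action) pair. Hypothetical sketch only."""

    def __init__(self):
        self._pending = {}  # (session_id, action) -> token issued for the next request

    def issue(self, session_id, action):
        """Create a fresh random token to embed in the page sent to this session."""
        token = secrets.token_urlsafe(32)
        self._pending[(session_id, action)] = token
        return token

    def redeem(self, session_id, action, token):
        """Accept the token once. Any attempt, good or bad, consumes it."""
        expected = self._pending.pop((session_id, action), None)
        if expected is None:
            return False
        # Constant-time comparison on bytes avoids leaking how much matched.
        return secrets.compare_digest(expected.encode(), token.encode())
```

A client that cannot read the server's responses, which is the situation of a sender using a spoofed source IP, never sees the token in the page it was served, so a replayed add-to-basket POST would be rejected.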

keyplyr

8:16 am on Jan 12, 2016 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

@ iamlost

You may allow Google to run rampant, I don't.

Since you quoted me for this response, I assume this is directed at me. I don't know what gives you the idea that I let Google "run rampant" but I do not, far from it.

lucy24

8:29 am on Jan 12, 2016 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

each individual product page is being separately hit from an external inbound link

I don't understand where you're getting this.

Simon_H

10:10 pm on Jan 12, 2016 (gmt 0)

10+ Year Member

Top Contributors Of The Month

Thanks everyone for the help. Based on @iamlost's comments, I found another site reporting the 74.125.63.33 IP being used to add items to basket and even checkout with a fake username. This is indeed believed to be a bot from Google Inc and its purpose is allegedly to monitor compliance with adwords/shopping policy, e.g. to validate that prices on-site match prices in the merchant centre feed. The other site blocked the IP and received an email from Google saying they needed to unblock the IP or their adwords account would be suspended.

There are certainly legitimate reasons why Google Inc would want to check out sites (e.g. to review manual penalties), but I find it very difficult to believe Google is doing this to check compliance with adwords policy. If it is, why not use an IP that identifies itself as googlebot; that's how Google themselves advise you can identify legitimate Google activity? Why check only 30 items out of 10,000? Why add to basket only the items with the highest number of paid clicks?

Let's look at this a different way. My original reason for posting was that our Shopping account is showing obvious unnatural click activity and I wanted to determine if this could be caused by a click fraud bot pretending to be human. Now, let's say I worked at Google Inc and I was asked to write/adapt a bot that behaved like it was human in order to allow Google to generate and charge for fake clicks on a Shopping account. Here's how I'd go about it:

1. I'd ensure any fake clicks were on products that already had a high number of legitimate clicks. Because if items that received only a few clicks per month suddenly received a load more, that would appear suspicious.
2. I'd ensure that the total number of daily paid clicks on 'fraud' days was similar to the click count on non-fraud days. Because if fraudulent clicks were simply added to legitimate clicks, the total daily clicks would suddenly jump and click fraud would be suspected. The issue with this is it requires Google to replace legitimate clicks with fake ones, which means that transactions will drop on a 'fraud' day, which may appear suspicious. To help reduce this effect...
3. I'd ensure that a number of items were also added to basket. Because even though the site owner will see reduced transactions, they'd still see items added to basket which would reduce suspicion of click fraud and make it look like the site itself simply wasn't converting.

I appreciate this sounds like a trolly conspiracy theory, but I'm trying to follow the evidence and this would explain both what we're seeing on our Shopping account and also why Google would have a bot learning how to add most-clicked items to basket. (As it could then reproduce this behaviour on bots with spoofed IPs or on a botnet where clicks would be chargeable).

Is this a ridiculous explanation for why our account activity is so unnatural, or might it have some merit? Feel free to be honest!

@lucy24 I think I'm not explaining myself well. Each item hit is an individual direct page hit. So the bot that is doing this is issuing a direct URL to each item page from a list of links it must have.

lucy24

11:48 pm on Jan 12, 2016 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

So the bot that is doing this is issuing a direct URL to each item page

Yes, OK, that makes much more sense. Otherwise the whole "links" business becomes a red herring: knowing that an URL exists isn't the same thing as placing a link to it.

agent_x

11:07 am on Jan 14, 2016 (gmt 0)

10+ Year Member

I find it very difficult to believe Google is doing this to check compliance with adwords policy. If it is, why not use an IP that identifies itself as googlebot; that's how Google themselves advise you can identify legitimate Google activity? Why check only 30 items out of 10,000? Why add to basket only the items with the highest number of paid clicks?

My guess would be that it doesn't identify itself as a googlebot in order to discover cloaking activity.

Simon_H

11:59 am on Jan 14, 2016 (gmt 0)

10+ Year Member

Top Contributors Of The Month

@agent_x Not sure. You could argue the same thing with the normal search googlebot where cloaking is far more of an issue, but Google keeps it transparent so site owners know what is hitting their site. Also, the only purpose of cloaking here would be to show the bot a different item price than the customer sees, to trick Shopping into showing lower prices than we sell for. But that would be stupid because (1) we'd end up paying for clicks by customers who wouldn't buy once they saw the real price and (2) Google also manually regularly check the site for compliance (we had that yesterday!), they'd spot this immediately and suspend the site for policy violation.

So I'm not sure that explanation works.

Mark558

12:21 pm on Jan 14, 2016 (gmt 0)

How would it work with something like AdSense? Can it backfire? Will it work with a proxy?

Chrispcritters

7:02 pm on Jan 14, 2016 (gmt 0)

10+ Year Member

I run one of the IP lookup sites and while checking our logs I see no recent activity from those IP addresses.

They appear to be corporate Google IP addresses out of India.

Interestingly, there are comments from users as far back as 2014 about 74.125.63.33, describing ecommerce accounts being created, etc.

I wonder if this is some form of research or ecommerce QA/anti-scam behavior.

Have you tried blocking those IPs from accessing your site and seeing if there is any reaction from Google?

ogletree

11:39 pm on Jan 14, 2016 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

It could also just be real people doing weird things. You might try one of those session recorder services. This will show you what is going on. I have had clients tell me they are convinced that enemies are clicking on their ads and doing stuff on their site and from all I could tell it is just people doing odd things. Website visitors are weird.

Simon_H

12:57 am on Jan 15, 2016 (gmt 0)

10+ Year Member

Top Contributors Of The Month

Thanks again.

@ogletree, we already record much of that stuff and it's definitely a bot.

@Chrispcritters Yep, you're right - that's what I've found too and others have said they've seen the same bot. It's allegedly Google checking that the site complies with Google Shopping policies. We're actually seeing both this bot from Google IPs that hits the site approx once per month and adds 30 items to basket in quick succession, and we also see a real person from Google IPs hit the site every now and then, register and go through almost the full checkout process. Presumably the bot can't do the full checkout as it's https. The real person uses the username Mark Mustermann with the UK address Peter House, Oxford Street (repeated 4 times!). This is the same name others report that Google Inc uses.

However, I struggle to believe the alleged reason for the bot visits. As said previously, a legitimate bot visit from Google Shopping would identify itself as googlebot, wouldn't restrict itself to only the most clicked items, wouldn't need to add to basket, etc. Given the unnatural click activity we're seeing, I find the bot situation very suspicious.
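
For anyone wanting to flag this kind of "30 adds in quick succession" visit in their own logs, a sliding-window counter is enough. This is a minimal sketch: the function name, the (ip, timestamp) event format and the 60-second window are all assumptions, and only the threshold of 30 comes from what is described above.

```python
from collections import defaultdict, deque


def find_bursts(events, threshold=30, window_seconds=60):
    """Return the set of IPs with at least `threshold` add-to-basket events
    inside any `window_seconds` span.

    `events` is an iterable of (ip, unix_timestamp) pairs ordered by time.
    """
    recent = defaultdict(deque)  # ip -> timestamps still inside the window
    flagged = set()
    for ip, ts in events:
        window = recent[ip]
        window.append(ts)
        # Drop timestamps that have fallen out of the window.
        while window and ts - window[0] > window_seconds:
            window.popleft()
        if len(window) >= threshold:
            flagged.add(ip)
    return flagged
```

Feeding it the add-to-basket lines from an access log would separate a once-a-month burst like this from ordinary shoppers, who rarely add dozens of items inside a minute.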

blend27

3:32 am on Jan 15, 2016 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

@Simon_H

What you might be experiencing is not IP Spoofing, but a session cookie transfer/sharing, which could also happen when the client is on a Cloud.

Have you given a thought to that? Regardless of whether it is a Googlebot or someone running a clever scraper.

Simon_H

10:19 am on Jan 15, 2016 (gmt 0)

10+ Year Member

Top Contributors Of The Month

Hi @blend27. Thanks. Could you explain more as this isn't my area? Are you saying that the fake paid traffic isn't someone spoofing IPs, but is someone doing a cookie transfer to hide their identity?

Just to be clear, there are two separate things going on here. Firstly, we're seeing very unnatural paid click patterns on our Shopping account, but bot-detectors aren't picking this up, so the assumption is that this is intelligent bots or something/someone else faking real buyer behaviour. Are you saying this could be due to cookie transfer?

The other thing that's happening is a bot from Google Inc is hitting our site once per month, quickly adding 30 items to basket in succession. This is allegedly Google checking for policy compliance with Google Shopping. However, this bot activity is questionable and, given the unnatural paid click patterns mentioned previously, I find it difficult to believe the bot is there purely to check for policy compliance.

keyplyr

10:53 am on Jan 15, 2016 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

Same client, dynamic IPs. I see this occasionally when schools visit my pages from cloud ISPs. The kind of classrooms that give out tablets to all the students.

motorhaven

12:24 am on Jan 17, 2016 (gmt 0)

10+ Year Member

Top Contributors Of The Month

Could it be a MAP pricing bot? They will add items to a cart to see if the selling price changes.

Angonasec

3:31 am on Jan 17, 2016 (gmt 0)

Hint:
You will learn something by blocking the "offending" IP ranges and observing the results.

If you fuss over blocking genuine G, then you're beyond help :)

Simon_H

5:41 pm on Jan 18, 2016 (gmt 0)

10+ Year Member

Top Contributors Of The Month

@Angonasec Would appreciate a less cryptic response. If you're saying that blocking the IPs will result in a stinky email from Google, yes, others have reported that. Or are you saying something else?

@motorhaven Yes, that's what some people believe it's for. But it doesn't seem to make sense. I have no issue with Google checking policy compliance, and we know they legitimately use bots to compare prices on the product pages with prices in the merchant feed. But creating a bot to add items to basket? Why create a bot to do that when sites that do things like add tax or extra delivery do it way down the checkout process, not at the basket? And why would the bot hit only products with the most paid clicks rather than a random selection? We also see a real person from Google (same IP) visiting the site now and then, registering and going through the checkout process, so why would they need a bot to do a fraction of the job a human already does? Again, I wouldn't be as suspicious if we weren't seeing such unnatural click vs conversion patterns on the site.

lucy24

9:38 pm on Jan 18, 2016 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

why would the bot hit only products with the most paid clicks

Because they want to see where their money is going?

ogletree

9:51 pm on Jan 18, 2016 (gmt 0)

WebmasterWorld Senior Member

10+ Year Member

Top Contributors Of The Month

If this is only a concern because it is messing up your analytics and you can detect these users by ip or UA or something like that just exclude them from your analytics cookie.

Simon_H

11:02 pm on Jan 18, 2016 (gmt 0)

10+ Year Member

Top Contributors Of The Month

@lucy24 I don't understand. It's not Google's money. It's our money! Could you explain more?

@ogletree No, that's not the concern. I created this thread because we're seeing very unnatural click patterns on Google Shopping. The original question was how difficult would it be for a third party (i.e. Google) to create fake traffic that appears human in order to charge for clicks. I'm not a conspiracy theorist, but the evidence is compelling, and because there is minimal variation in the total number of daily clicks irrespective of whether they appear natural or unnatural, it implies that Google is involved in the click manipulation. The 'policy compliance' bot from Google is a secondary concern. It's not that this bot is messing up the stats, it's me trying to understand what this bot is really up to and whether or not it relates to the click fraud we're seeing.

thejimster

9:57 pm on Jan 20, 2016 (gmt 0)

10+ Year Member

I would think that they have the bot add the item to cart just to be thorough. If I created a bot to check Google Shopping compliance, I would definitely have it add the item to cart. I would also have the bot add the products with the most paid clicks, as that is where most of the traffic is going. Wouldn't it make the most sense to make sure those products are compliant, rather than random products that receive much less traffic? What's the issue with having a bot check for compliance, as well as a human? As with organic results, there are algorithmic penalties and manual penalties. I don't see why one would assume that it would be 100% bots or 100% human. If a webmaster learned how to trick the bot, they would be golden. Google can't rely on (or chooses not to rely on) humans to manually check every product listing.

I'm skeptical of Google Shopping traffic as well. I'm curious if you opt in to use the Google Shopping Affiliates (eBay Commerce Network, become.com, shopzilla, nextag, pricegrabber). We used these affiliates directly back in the day, and only found rampant fraud coming from them. Simply looking at bounce rate, time on site, etc., we could tell bots were hitting our site. The user metrics were just too good looking on the surface, only to find a massive amount of fraudulent orders, credit cards, etc. when we dug deeper. I have not put much effort into including or excluding these affiliates with Google Shopping, but this may be the best place to look in regards to the metrics that just don't make sense in your case.

I agree with you. The bot should identify itself as Googlebot.

Simon_H

12:36 am on Jan 21, 2016 (gmt 0)

10+ Year Member

Top Contributors Of The Month

@thejimster Thanks. That's a very reasonable argument about the bot, although I do disagree! If I were to write a bot to check Google Shopping compliance (and I have written many complex bots similar to this), it wouldn't work like that. There's no point it simply adding an item to cart without the bot completing the checkout process. There's no point it checking 30 items out of 10,000. And I don't agree that checking the most paid clicked items only is a good idea as, statistically, I'd want to randomise my sample to increase the likelihood of finding something out of place. After all, maybe the most clicked items are the only ones that do comply with policy, which is why they're the ones being clicked! The bot is the equivalent of determining if a vehicle complies with safety standards by checking the left brake light only on every model that comes off the production line. Yes, it may find the odd model with a failed brake light, but it's still an essentially pointless exercise.
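
To put a rough number on the sampling point above, here is a small Python sketch of the chance that a purely random sample of 30 out of 10,000 items contains at least one non-compliant item. The function name is made up, and the figure of 100 non-compliant items is a hypothetical input, not something measured on any site.

```python
from math import comb


def p_detect(total_items, bad_items, sample_size):
    """Probability that a random sample (without replacement) contains
    at least one of the `bad_items` non-compliant items."""
    if not 0 <= bad_items <= total_items:
        raise ValueError("bad_items must be between 0 and total_items")
    if not 0 <= sample_size <= total_items:
        raise ValueError("sample_size must be between 0 and total_items")
    # 1 minus the chance that every sampled item comes from the compliant ones.
    clean_only = comb(total_items - bad_items, sample_size) / comb(total_items, sample_size)
    return 1 - clean_only


# Hypothetical: 100 of 10,000 items out of line, 30 sampled at random.
print(round(p_detect(10_000, 100, 30), 2))
```

Under that assumption a random sample of 30 catches a problem roughly a quarter of the time. A sample fixed on the 30 most clicked items catches one only if a problem happens to sit among those particular items.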

Yes, we do opt in to use Google's shopping affiliates. I've been in two minds about that as many say it's worthwhile with conversion rates higher than native Google. There do appear to be more avenues for and examples of fraud with the affiliates, although a lot of people would argue that it's native Google where they see the fraud! I'll look into it some more, so thanks.

thejimster

1:38 pm on Jan 21, 2016 (gmt 0)

10+ Year Member

@Simon_H maybe you're a better bot writer than Google? :)

Please post here if you do some testing with the affiliates. I should do some testing as well, as I have NEVER found anything online as far as people testing with and without these affiliates.

Our experiences directly with shopzilla, pricegrabber, nextag, etc. literally sent hundreds (or thousands) of bad orders to our sites. We never found a good, processed web order through them. These CSEs would register a conversion with their "ROI Tracker" or "Conversion Optimizer", so it would appear, on the surface, that they were of great value. When we use Google Shopping (with and without affiliates), we do receive good orders. So, we have definitely seen more fraud with those affiliates than with Google. However, I am just one person with this experience across a dozen or so sites.

I highly suspect this is where your (probably ours as well) issue is, but it's just a hunch.

This 62 message thread spans 3 pages.
