
Cloaking Forum

How can I tell if someone is cloaking

 2:17 pm on Sep 12, 2013 (gmt 0)

I run a price comparison site. I noticed recently that for one dealer in particular, if I open his product page in 2 browsers, side-by-side--one normal view, and the other through the eyes of my web-crawling script--the prices my web crawler sees are 3% lower across the board, making this dealer the cheapest in my pricing tables.

It really seems like they're using cloaking to be #1 on my list, but I want to give them the benefit of the doubt. I know I'm not viewing cached pages, because the values on the "fake" pages update constantly along with the real pages. Is there anything else I can check to know for sure that they are or aren't cheating? I'm a n00b at this black hat stuff, so I really don't know what to look for.



 2:53 pm on Sep 12, 2013 (gmt 0)

That sure sounds like cloaking to me.

Not only that, if they're advertising that price, they're required by law in many areas to honor it, so if your regular customers ever catch them using Lynx or something...

At any rate the FTC would fry them IMO.

An easy way to possibly solve the problem would be to make your crawler send a standard browser user agent string assuming they aren't checking for your IP address as well. If they think it's a browser maybe they'll give you the right price.

Try that.
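
A minimal sketch of that suggestion, assuming the crawler is written in Python and uses the standard library's urllib. The UA string and the helper names `build_request` and `fetch` are illustrative assumptions, not anything from the dealer's site or the original crawler.

```python
import urllib.request

# Illustrative browser-style UA string (an assumption; substitute a current one).
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36"
)


def build_request(url, user_agent=BROWSER_UA):
    """Build a GET request that identifies itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=BROWSER_UA, timeout=10):
    """Fetch the page body as text, sending the browser-style User-Agent."""
    with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

If the prices still differ after this change, the User-Agent alone is probably not what the site keys on.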

[edited by: incrediBILL at 2:56 pm (utc) on Sep 12, 2013]


 2:53 pm on Sep 12, 2013 (gmt 0)

if they were cloaking it would most likely be based on IP address or User-Agent, so you should test that first.
when you say "side-by-side" do you mean the dealer's server got the requests from the same IP?
what User-Agent string is used for your web-crawling script?


 3:25 pm on Sep 12, 2013 (gmt 0)

You could try using a search engine translation, without the translation. ;)



 3:27 pm on Sep 12, 2013 (gmt 0)

>>IP address or User-Agent

i'd assume user agent given that you tested your 'bot/script' and a normal browser from the same IP address and they showed different results.


 3:37 pm on Sep 12, 2013 (gmt 0)

If they are honoring the lower price to your users, is it really a problem? There are lots of companies that have different prices for different customers/prospects.


 5:56 pm on Sep 12, 2013 (gmt 0)

Thanks for all the replies, guys. I did do a test with a fake User-Agent, and it still looked like they were cheating. I'll try testing from another server later tonight for further confirmation.

The reason I want to get to the bottom of this is that it makes me look bad when people click their links and the prices don't match. People think either my site is not reliable or maybe I'm getting paid off to send them traffic or whatever. Regardless, I like to do things right :)

Is there anything else I could be overlooking that could turn out to be an honest mistake?


 7:16 pm on Sep 12, 2013 (gmt 0)


I complemented my first test with a test from another server. So I have:
* window A where I view their page with my home IP address
* window B where the page is being fetched from an alternate server
* window C where the page is being fetched from my normal web crawler

The prices in window A and B matched, while C displayed lower prices. In other words, they're cloaking. Busted!
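
A rough way to put numbers on a comparison like this, assuming the prices from each view have already been scraped into dictionaries keyed by product. The function names, the sample keys, and the 0.5% tolerance are all hypothetical choices for illustration.

```python
def relative_diffs(browser_prices, crawler_prices):
    """Per-product relative difference: negative means the crawler saw a lower price."""
    return {
        sku: (crawler_prices[sku] - price) / price
        for sku, price in browser_prices.items()
        if sku in crawler_prices and price
    }


def looks_uniformly_discounted(diffs, tolerance=0.005):
    """True if every product is lower by roughly the same fraction."""
    if not diffs:
        return False
    values = list(diffs.values())
    mean = sum(values) / len(values)
    return mean < 0 and all(abs(v - mean) <= tolerance for v in values)


# Made-up sample data: window A (browser) versus window C (crawler).
window_a = {"widget": 100.0, "gadget": 200.0}
window_c = {"widget": 97.0, "gadget": 194.0}
diffs = relative_diffs(window_a, window_c)
print(diffs, looks_uniformly_discounted(diffs))
```

A consistent percentage across every product is harder to explain as a stale cache or a one-off pricing error than a scattered set of mismatches would be.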


 8:10 am on Sep 13, 2013 (gmt 0)

I plan on exposing the perpetrators, but before I do I need to know if this is 100%. Is there any possible way this can be a mistake or an accident?


 5:05 pm on Sep 13, 2013 (gmt 0)

>Is there any possible way this can be a mistake or an accident?
That's a defence they may put up.

I would suggest you speak to a lawyer.


 7:10 pm on Sep 13, 2013 (gmt 0)

Whether you choose to stomp on them or not, you can still prevent them from continuing to do it. Change your robot's UA string to something humanoid. I suggest a current Chrome, which is extremely generic.


 3:47 am on Sep 14, 2013 (gmt 0)

>I plan on exposing the perpetrators, but before I do I need to know if this is 100%. Is there any possible way this can be a mistake or an accident?

Accidentally serve lower prices to a bot UA string than what's being served to a browser UA string within seconds of access from one or the other? Uh, to me that sounds about as believable as Google "accidentally" storing all those e-mails and other info collected via street view wifi sniffing.


 3:43 pm on Dec 27, 2013 (gmt 0)

I think you should do a GET request for the site with a standard browser user-agent from the same IP address on which your crawler is operating. This will eliminate the (potentially-innocent) possibility that they are geotargeting.
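
One way to organize that test is as a small grid: the same product fetched from two IPs with two User-Agents, then checking which factor changes the price. This is only a sketch; `varying_factors` and the labels "home", "crawler", "browser" and "bot" are hypothetical names, and the prices are made up.

```python
def varying_factors(observations):
    """observations maps (ip_label, ua_label) -> price for all four combinations.

    Returns which factor changes the price while the other is held fixed.
    """
    ips = sorted({ip for ip, _ in observations})
    uas = sorted({ua for _, ua in observations})
    by_ip = any(
        observations[(ips[0], ua)] != observations[(ips[1], ua)] for ua in uas
    )
    by_ua = any(
        observations[(ip, uas[0])] != observations[(ip, uas[1])] for ip in ips
    )
    return {"ip": by_ip, "user_agent": by_ua}


# Made-up example: the price follows the User-Agent, not the IP.
obs = {
    ("home", "browser"): 100.0,
    ("home", "bot"): 97.0,
    ("crawler", "browser"): 100.0,
    ("crawler", "bot"): 97.0,
}
print(varying_factors(obs))
```

If only the IP column changes the price, geotargeting remains a possible explanation; if the User-Agent changes it from the same IP, it does not.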

brotherhood of LAN

 4:37 pm on Dec 27, 2013 (gmt 0)

It's worth mentioning that you can mitigate these issues somewhat by using an IP to spider that is separate from your website IP. This kind of thing was easy to manipulate with particular "directory scripts" where they give you a link in exchange for a reciprocal link. You'd simply do a lookup of the site's IP, store it in a DB, and serve the link only to GET requests coming from that IP.

Using a different IP is less predictable.


 6:32 pm on Dec 29, 2013 (gmt 0)

Many web sites either 403 or serve cloaked responses to the default UAs offered by most spidering engines.

In most of my scripts used to fetch information from the web (using wget, lynx, curl, etc.) I have a list of a dozen or so valid, different UAs, and use logic to randomly select one from the list to make each request.

It's easy to find (huge) lists of UAs on the web via a search.
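
A minimal Python sketch of that rotation idea, using urllib rather than wget, lynx, or curl. The three UA strings are illustrative placeholders and the helper names are assumptions; a real list would be longer and kept current.

```python
import random
import urllib.request

# Illustrative UA strings only (assumed examples, not a vetted list).
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36",
    "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20100101 Firefox/23.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/536.30.1 "
    "(KHTML, like Gecko) Version/6.0.5 Safari/536.30.1",
]


def pick_user_agent(rng=random):
    """Choose one UA at random; pass a seeded random.Random for repeatable runs."""
    return rng.choice(USER_AGENTS)


def build_request(url, rng=random):
    """Build a GET request carrying a randomly selected User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": pick_user_agent(rng)})
```

Calling `build_request` once per fetch gives each request its own randomly chosen UA from the list.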


 11:31 pm on Jan 1, 2014 (gmt 0)

Why not just ask them? If, after asking, you see changes, then they definitely were doing something wrong. If nothing changes, you might get a reply from them.

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved