Combination User Agent / IP

How does it work?

Filipe

7:43 pm on May 1, 2002 (gmt 0)

I've heard of User Agent cloaking and IP cloaking - is there any reason why you wouldn't use them concurrently?

johnhamman

8:35 pm on May 1, 2002 (gmt 0)

It's recommended that you use them together to do a better job of cloaking.
john

sinyala1

8:21 pm on May 3, 2002 (gmt 0)

DO NOT USE AGENT CLOAKING. BAD IDEA!!! Agent cloaking can be caught EASILY by a search engine, and Google and other search engines are beta testing ways of catching agent cloaking scripts, so you will be caught. Use DNS and IPs ONLY. Another reason not to do this: I could fake my agent EASILY and catch your cloaked site giving away your cloaked pages, which also leaves you wide open to being reported by both editors and searchers. If you're using agents to filter, stop. You wouldn't believe the logs I have from spider traps: search engines are changing agents, or not using agent names at all, to check out sites. I programmed my cloaking script in PHP and used only IP segments and DNS. Most search engines own entire IP segments, but the main thing you need is not agents, it's DNS. Almost all search engines keep the same DNS, and in case they don't, keep your spider trap submitted monthly.

sinyala1

8:21 pm on May 3, 2002 (gmt 0)

Oh yeah... use wild cards. I programmed mine to use wild cards. Here are some listings from my spider trap.

scooter2.* 11 Altavista
scooter.* 11 Altavista
scooterr.* 11 Altavista
bigip1-snat.* 11 Altavista
vscooter.* 12 Altavista
*.sv.av.com 11 Altavista

The * is the wild card and can match anything. The 11 and 12 are how many times each one was hit.

63.173.190.16 <<--- that hit my spider trap 27 times in the past 3 months and only search engines know about it.

DNS wild cards are easy. You can catch almost every spider Google has with this DNS pattern:

*.googlebot.com

Agents can be faked, DNS cannot.
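Wildcard matching like the listings above can be sketched in PHP with fnmatch(), which treats * as "any run of characters". The pattern list and the match_spider_host() helper are hypothetical names for illustration, not the script described in this post. One caveat worth keeping in mind: a reverse DNS name is set by whoever controls the IP block, so a trailing wildcard such as scooter.* trusts whatever follows the dot.

```php
<?php
// Hypothetical pattern list in a "pattern => engine" shape, modeled on the listings above.
$patterns = [
    'scooter.*'       => 'Altavista',
    '*.sv.av.com'     => 'Altavista',
    '*.googlebot.com' => 'Google',
];

// Return the engine name for the first pattern that matches, or null if none do.
function match_spider_host(string $host, array $patterns): ?string
{
    // Host names are case-insensitive and may carry a trailing dot.
    $host = strtolower(rtrim($host, '.'));
    foreach ($patterns as $pattern => $engine) {
        // fnmatch() must match the whole string, so "*.googlebot.com"
        // only accepts names that actually end in ".googlebot.com".
        if (fnmatch(strtolower($pattern), $host)) {
            return $engine;
        }
    }
    return null;
}
```

Because the whole name has to match, a host like googlebot.com.example.net does not satisfy *.googlebot.com.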

johnhamman

2:58 am on May 4, 2002 (gmt 0)

I disagree with you on the IP/UA test. For cloaking, DNS is great, but it's slow in some programming languages, and depending on the server you're running the DNS lookup on, it may not return an accurate reading. Yes, agents can be faked, but if you use them together with IPs you can be safe. Here's a link: [webmasterworld.com...] Check out my pseudocode. That code should work extremely well as long as you have an updated IP and user agent list. IP checking runs great and it's fast. What's even more ideal is a script that logs any questionable IPs and user agents so you can DNS check them later. Then, if they turn out to be a spider or whatnot, you add them to your cloak script.

Just a thought.
john
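The combined check described above could look roughly like this. It is only a sketch: the linked pseudocode is not available here, so the two lists, the classify_visitor() name, and the three return labels are all assumptions made for illustration.

```php
<?php
// Hypothetical lists; real ones would come from a maintained spider database.
$known_ips    = ['192.0.2.10' => 'ExampleBot'];
$known_agents = ['examplebot'];

// Classify a request as 'spider', 'review' (questionable, log it), or 'visitor'.
function classify_visitor(string $ip, string $ua, array $known_ips, array $known_agents): string
{
    $ip_known = isset($known_ips[$ip]);

    $ua_known = false;
    foreach ($known_agents as $needle) {
        // Case-insensitive substring test against the User-Agent header.
        if (stripos($ua, $needle) !== false) {
            $ua_known = true;
            break;
        }
    }

    if ($ip_known && $ua_known) {
        return 'spider';   // both signals agree
    }
    if ($ip_known || $ua_known) {
        return 'review';   // signals disagree: record for a later DNS check
    }
    return 'visitor';
}
```

Requests that come back as 'review' are the "questionable" ones: a known agent from an unknown IP, or a known IP with an unfamiliar agent.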

msgraph

3:42 am on May 4, 2002 (gmt 0)

I agree with John. When you have lots of domains running on a server, relying only on DNS/IP matchups can really slow things down.

Plus some of the SEs are adding new IPs every few months. I'd rather have a failsafe and take my chances than have the search engine catch me with my pants down.

One of the best methods I have experienced had both options going. When the browser visits, a UA match is performed. For example, Mozilla to the left and the rest to the right.

If Mozilla has a term that matches something from the SE then it is given the proper page.

The "rest" go off by UA/IP match.

The bots can afford a slight slowdown but we're not going to give that slowdown to every single user that enters the site.

As for getting caught, I'd rather get caught by some user who is just trying to steal a page than by an SE robot. I'm not going to take that risk at all.
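One possible reading of the two-stage approach above, as a sketch: agents that start with "Mozilla" get only a cheap string test, and everything else gets the UA/IP match. The function names, the term list, and the choice to require both a known term and a known IP in the second branch are assumptions, not details given in this post.

```php
<?php
// True if the User-Agent contains any of the given search engine terms.
function ua_has_term(string $ua, array $terms): bool
{
    foreach ($terms as $term) {
        if (stripos($ua, $term) !== false) {
            return true;
        }
    }
    return false;
}

// Two-stage routing: cheap UA test for browsers, UA + IP test for the rest.
function route_request(string $ua, string $ip, array $se_terms, array $spider_ips): string
{
    if (stripos($ua, 'Mozilla') === 0) {
        // Browser-style agents: a string test only, so ordinary visitors
        // never pay for the slower check.
        return ua_has_term($ua, $se_terms) ? 'spider' : 'visitor';
    }
    // Everything else: require both a known agent term and a known IP.
    $match = ua_has_term($ua, $se_terms) && in_array($ip, $spider_ips, true);
    return $match ? 'spider' : 'visitor';
}
```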

sinyala1

12:05 pm on May 6, 2002 (gmt 0)

What are you guys talking about, slowing the server down? Are you kidding?

You have a spider trap, and this spider trap is submitted to search engines. The IP and its DNS are recorded when something visits the page. The database isn't even on the same server as any of my cloaked sites; they just send requests to the spider database. Here's my code, and it loads extremely fast, in well under .01 seconds. Like I said, it's stupid to stick with agents. IPs change, but their DNS names do not, and if they do, the spider trap will pick them up. I seriously doubt AltaVista is going to change its entire network and server names anytime soon, and if it did, I'd easily pick them up on next week's run of the spider trap. It's updated weekly, I've never had a spider miss it, and I've been cloaking for 2 years. Be safe with your agents if you like, but if I check my logs and have never seen a spider slip through to my normal site, I'd say I'm doing pretty good. I couldn't find your code because the link was a 404, but here's my spider trap IP and DNS logging code:
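// Spider trap: log the visitor's IP and reverse DNS, counting repeat hits.
// $db is assumed to be an open mysqli connection.
$new_ip = $_SERVER['REMOTE_ADDR'];
$new_dns = gethostbyaddr($new_ip); // returns the IP unchanged if there is no PTR record

// Look this IP up directly instead of looping over the whole table.
$stmt = $db->prepare("SELECT id FROM spiders WHERE ip = ?");
$stmt->bind_param("s", $new_ip);
$stmt->execute();
$stmt->bind_result($id);
$found = $stmt->fetch();
$stmt->close();

if ($found) {
    // Known IP: bump its hit count.
    $stmt = $db->prepare("UPDATE spiders SET count = count + 1 WHERE id = ?");
    $stmt->bind_param("i", $id);
} else {
    // New IP: record it along with its reverse DNS name.
    $stmt = $db->prepare("INSERT INTO spiders (ip, dns, engine) VALUES (?, ?, 'IP')");
    $stmt->bind_param("ss", $new_ip, $new_dns);
}
$stmt->execute();
$stmt->close();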

sinyala1

12:07 pm on May 6, 2002 (gmt 0)

Maybe you should reprogram your scripts? Mine loads... without slowdown. I ran tests on non-cloaked and cloaked loading times and there's nowhere near a noticeable difference, and I have 1,899 known IPs and DNS names in my database.

Air

12:54 pm on May 6, 2002 (gmt 0)

sinyala1,

So you aren't doing real time DNS resolution then? You do the lookup with the spider trap and save it, then reference it just like the list of IP addresses? (am I getting that right?)

If that is what you are doing, then the concern doesn't apply, the concern was over doing a DNS lookup for each page request in real time ...
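For comparison, a real-time lookup of the kind this concern is about might look like the sketch below: a reverse lookup on the visitor's IP, followed by a forward lookup to confirm that the name points back to the same IP. Both steps are network round trips made while the page is being served. The verified_host() name and the injectable resolver parameters are assumptions added so the logic can be exercised without a network; by default it falls back to PHP's gethostbyaddr() and gethostbynamel(), and the forward step as written handles IPv4 only.

```php
<?php
// Forward-confirmed reverse DNS for one IP.
// Returns the lowercased host name, or null if the name cannot be confirmed.
function verified_host(string $ip, ?callable $reverse = null, ?callable $forward = null): ?string
{
    $reverse = $reverse ?? 'gethostbyaddr';   // IP   -> host name
    $forward = $forward ?? 'gethostbynamel';  // name -> list of IPv4 addresses

    $host = $reverse($ip);
    // gethostbyaddr() returns the IP itself when there is no PTR record,
    // and false on malformed input.
    if ($host === false || $host === $ip) {
        return null;
    }

    // The name only counts if it resolves back to the original IP.
    $ips = $forward($host);
    if (is_array($ips) && in_array($ip, $ips, true)) {
        return strtolower($host);
    }
    return null;
}
```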

sinyala1

12:58 pm on May 6, 2002 (gmt 0)

Yeah, you're getting it right. It also refreshes the database with new DNS. Some search engine IPs stop, go offline, or change, and I'll know this by the DNS no longer being listed, but no, it's not in real time. The refresh of the DNS happens once a day on the server and in no way affects another cloaked site. My script's very short and very clean, so the cloaked sites load very fast.
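The split described here, a stored table consulted at request time plus a separate daily refresh, can be sketched as two small functions. This is an illustration under assumptions: the table is a plain PHP array keyed by IP rather than a MySQL table, and refresh_dns() and is_known_spider() are made-up names. The resolver is injectable so the refresh can be tried without a network; it defaults to gethostbyaddr().

```php
<?php
// Daily job: re-resolve every stored IP and update its DNS name.
function refresh_dns(array $spiders, ?callable $resolver = null): array
{
    $resolver = $resolver ?? 'gethostbyaddr';
    foreach ($spiders as $ip => $row) {
        $name = $resolver($ip);
        // gethostbyaddr() hands back the IP itself when no PTR record exists,
        // which is treated here as "DNS no longer listed".
        $gone = ($name === false || $name === $ip);
        $spiders[$ip]['dns'] = $gone ? null : strtolower($name);
    }
    return $spiders;
}

// Request-time check: an array lookup only, no DNS query.
function is_known_spider(string $ip, array $spiders): bool
{
    // isset() is false both for unknown IPs and for entries whose DNS is null.
    return isset($spiders[$ip]['dns']);
}
```

Only refresh_dns() ever touches DNS, so the cost of resolution stays out of the page-serving path.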

sinyala1

1:04 pm on May 6, 2002 (gmt 0)

My point was that search engines right now are running tests to find cloaked sites, and they're doing this by changing agent names and a few other things. If you're using agent names then you will be exposed to this. If anyone has more information on how Google and other engines are exposing and banning cloaked sites, I would appreciate any info on it.

johnhamman

2:37 pm on May 6, 2002 (gmt 0)

Real-time DNS does slow things down on high-traffic sites. High traffic is the key here. Every bit counts, and the slightest slowdown will lose customers.

sinyala1

2:43 pm on May 6, 2002 (gmt 0)

You're not reading my listings correctly. A database holds the records of dns. The site doesn't check to see if the DNS changes. That's an entirely different thing that I programmed to refresh my database. This has nothing to do with real time cloaked site checking dns. It checks a database, it doesn't check to see if the DNS changed or make changes in real time. It runs off a database, and cannot make changes to it nor does it check to see if the IP's DNS changed. If it finds it in the list, it finds it. It won't do a reverse lookup on it to see what changed.

sinyala1

2:48 pm on May 6, 2002 (gmt 0)

>Real Time DNS does slow down on high traffic sites. High traffic is the key here.

My site gets in the ballpark of a thousand hits a day. The load time is not noticeably different between uncloaked and cloaked. Real-time DNS is recorded by the server itself; does that slow it down too when serving the web page? No. It's sent when the user tries to access the web page. If you're really convinced my DNS database slows it down, I'll throw it onto a site that gets 80,000 hits a day. I do work for a company that is one of the largest food chains in the world, so I'll test it on their site for half an hour and run speed tests on the loading time vs. the old loading time.

johnhamman

3:35 pm on May 6, 2002 (gmt 0)

That would be really interesting to see. What programming language do you write it in, and how would you do the speed test?
john

sinyala1

4:59 pm on May 6, 2002 (gmt 0)

>That would be really interesting to see. What programming language do you write it in, and how would you do the speed test?

PHP, and I would use sites that have HTML analyzers/loading time checkers. I work for an application service provider (ASP), so they have loads of stuff that checks HTML errors, load speed, bandwidth speed, etc. I'll run the test 10 times on the site without it, then 10 times with it. I'll tell you the results after I put it on.
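A "10 runs without, 10 runs with" comparison can also be approximated in PHP itself. The sketch below is not the external analyzer tooling mentioned in this post; it is a hypothetical mean_ms() helper that times any callable with hrtime() (PHP 7.3 or later) and returns the average in milliseconds.

```php
<?php
// Run $fn $runs times and return the mean wall-clock duration in milliseconds.
function mean_ms(callable $fn, int $runs = 10): float
{
    $total = 0;
    for ($i = 0; $i < $runs; $i++) {
        $start = hrtime(true);           // monotonic clock, in nanoseconds
        $fn();
        $total += hrtime(true) - $start;
    }
    return $total / $runs / 1e6;         // nanoseconds -> milliseconds
}
```

To compare page loads, the callable could fetch the page once with the check enabled and once without, for example a closure wrapping file_get_contents() on each URL, and the two means can then be set side by side. Network jitter will dominate small differences, so more than 10 runs may be needed to see anything meaningful.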

Filipe

5:10 pm on May 6, 2002 (gmt 0)

I'm kinda lost on some of the distinctions you're making... what's "real-time" DNS as opposed to what Sinyala is doing... is he just doing an IP check? Maybe I'm reading his code wrong...