cloaking script

hermes

6:53 pm on Mar 23, 2005 (gmt 0)

Hi guys. I know that there are commercial cloaking solutions out there, but I would like to do this for FREE! I would be extremely grateful if anyone could post any PHP scripts that they use in their cloaking endeavors, or perhaps sticky me if they know of any good free cloaking packages.

I am currently thinking of writing my own cloaking script. I have put down some pseudocode below. Am I on the right track?
I only wish to cloak for Google.

if (user agent == googlebot) AND (IP is in our database)
{
    show optimised html
}
else
{
    show viewer html
}

The above code is for cloaking on a single domain, which is risky. It can, of course, be adjusted for cloaking on a redirecting domain:

if (user agent == googlebot) AND (IP is in our database)
{
    show optimised html
}
else
{
    redirect to primary url
}

Now I was thinking of using the Google IPs on
[iplists.com...]
Is this a good choice?

MrSpeed

9:17 pm on Mar 23, 2005 (gmt 0)

The pseudocode looks good.

Can any of the experts comment on the
{redirect to primary url} part?

I kind of thought the right way was to fetch the page server-side via LWP or cURL. This way your cloaking domain stays in the address bar. If you just redirect to the target page, won't you set off a few red flags for your competition snooping around?

Now let's say, for argument's sake, that I was cloaking for widget.com.

If I fetch widget.com server-side, won't most of the image and href links be broken because they're relative links?

I would imagine you would need to change all links to absolute...correct?
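
To make the relative-link problem concrete, here is a minimal sketch in Python (used purely for illustration; the same idea applies in PHP or Perl). widget.com is the example domain from above, while the page path and the helper name to_absolute are made up for the example. It resolves links the way a browser would against the page they came from:

```python
from urllib.parse import urljoin

# Hypothetical page on the target site (the path is an assumption).
SOURCE_PAGE = "http://widget.com/products/index.html"

def to_absolute(link, base=SOURCE_PAGE):
    """Resolve a link against the page it was found on, as a browser would."""
    return urljoin(base, link)

print(to_absolute("images/logo.gif"))            # relative to the page's directory
print(to_absolute("/about.html"))                # relative to the site root
print(to_absolute("http://example.com/x.html"))  # already absolute, left alone
```

Served from a different domain without this kind of rewriting, the first two links would resolve against the wrong host, which is why they break.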

volatilegx

1:34 am on Mar 24, 2005 (gmt 0)

To fix the link/broken image problem, insert a base href tag after the <head> tag.
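
A rough sketch of that base href fix, in Python purely for illustration (the same approach carries over to PHP). The function name insert_base_href and the sample URL are assumptions for the example, not part of any existing package:

```python
import html
import re

def insert_base_href(page, base_url):
    """Insert <base href="..."> right after the opening <head> tag.

    The page is returned unchanged if it has no <head> tag or
    already contains a <base> tag.
    """
    if re.search(r"<base\b", page, flags=re.IGNORECASE):
        return page
    base_tag = '<base href="%s">' % html.escape(base_url, quote=True)
    # Matches <head> or <head ...attributes...>, but not <header>.
    return re.sub(
        r"(<head(?:\s[^>]*)?>)",
        lambda m: m.group(1) + base_tag,
        page,
        count=1,
        flags=re.IGNORECASE,
    )

sample = "<html><head><title>Widgets</title></head><body></body></html>"
print(insert_base_href(sample, "http://widget.com/"))
```

With the tag in place, the browser resolves relative image and href links against the base URL instead of the domain in the address bar.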

> Now I was thinking of using the Google IPs on
> iplists.com
> Is this a good choice?

Excellent choice. :P

tacheman

3:50 pm on Mar 24, 2005 (gmt 0)

Just thought I'd mention that there is another IP list at fantomaster which claims to be updated every 6 hours. It's not free, though. The iplists one says it was last updated around 3 months ago, which makes me slightly wary of using it.

Does anyone know how regularly search engines change their IP addresses? I guess it's a trade-off between using the free list and risking losing some domains, or paying for a bang up-to-date list.

MrSpeed

4:14 pm on Mar 24, 2005 (gmt 0)

> The iplists one says it was last updated around 3 months ago which makes me slightly wary of using it.

You can always build a spider trap that looks for new IPs for a specific user agent. Have the script email you a warning so you can determine whether it is a new IP for a spider or someone spoofing the user agent.
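
A minimal sketch of such a trap, in Python for illustration only. The class name SpiderTrap, the notify callback (standing in for whatever sends the email) and the sample IPs (taken from the reserved documentation range) are all assumptions for the example:

```python
class SpiderTrap:
    """Flags requests whose user agent matches a keyword but whose IP is unknown."""

    def __init__(self, agent_keyword, known_ips, notify):
        self.agent_keyword = agent_keyword.lower()
        self.known_ips = set(known_ips)
        self.reported = set()   # unknown IPs we have already warned about
        self.notify = notify    # callback, e.g. a function that sends an email

    def check(self, user_agent, ip):
        """Return True (and send one warning) the first time an unknown IP appears."""
        if self.agent_keyword not in user_agent.lower():
            return False
        if ip in self.known_ips or ip in self.reported:
            return False
        self.reported.add(ip)
        self.notify("Unknown IP %s used user agent %r" % (ip, user_agent))
        return True

alerts = []
trap = SpiderTrap("googlebot", ["192.0.2.10"], alerts.append)
trap.check("Mozilla/5.0 (compatible; Googlebot/2.1)", "192.0.2.99")
print(alerts)
```

Unknown IPs are only reported, never added to the known list automatically, so a spoofed user agent does not get trusted without a manual look.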

//waiting to hear some more about the freshness of iplists in 3,2,1... :)

MrSpeed

4:25 pm on Mar 24, 2005 (gmt 0)

hermes -
Since you're banging out a script there, I thought of another feature you may want to code.

If you're going to fetch a remote page...

if (cached version exists AND its age is less than x days)
    show cached page
else
    GET fresh version
    cache it
    display it
end if

It may not be desirable to bang on some remote page every time someone calls one of your pages. You can also get a little creative using iframes.
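
The caching pseudocode above could look roughly like this in Python (chosen only for illustration; a PHP version would follow the same steps). The names get_page and fetch_url, the cache file path and the 10-second timeout are assumptions for the sketch:

```python
import os
import time
import urllib.request

def fetch_url(url):
    """Download a page and return its body as bytes."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()

def get_page(url, cache_path, max_age_days, fetch=fetch_url):
    """Return the cached copy if it is fresh enough, else refetch and cache it."""
    max_age_seconds = max_age_days * 24 * 60 * 60
    if os.path.exists(cache_path):
        age = time.time() - os.path.getmtime(cache_path)
        if age < max_age_seconds:
            with open(cache_path, "rb") as f:
                return f.read()
    body = fetch(url)  # cache missing or stale: GET a fresh version
    with open(cache_path, "wb") as f:
        f.write(body)
    return body
```

A call such as get_page("http://widget.com/", "widget_cache.html", 2) would then hit the remote site at most once every two days. The fetch parameter is there only so the download step can be swapped out, for example when testing.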

hermes

8:10 pm on Mar 24, 2005 (gmt 0)

Using LWP or cURL is a good idea. I have never really come across them before. Interesting.

On a more general theme, I wonder about these technologies and copyright. In this instance it is fine, of course, because we are connecting to our own material. But what happens if one connects to someone else's material in this way (for instance, the BBC website)? In a sense, your server is only really acting as a proxy between the viewer and the BBC website. What are the legal issues here? Can anyone post any resources on this? I have come across cases of hotlinking etc. and know that this is legally hot water, but this issue seems more complex to me.

On a second point: RE "This way your cloaking domain stays in the address bar."
When the viewer looks at the cloaking domain source code - can they see the url of the primary domain at all? Is there any way for them to find the url of the primary domain?

volatilegx

11:39 pm on Mar 24, 2005 (gmt 0)

> On a second point: RE "This way your cloaking domain stays in the address bar."
> When the viewer looks at the cloaking domain source code - can they see the url of the primary domain at all? Is there any way for them to find the url of the primary domain?

If there is a base href tag, then it will have the "primary domain" in it.

Regarding the freshness of iplists.com (which is my site), a closer look will reveal that it was last updated yesterday. IP addresses are added (or removed) as they are discovered. Some of the lists are quite old. They haven't had any new IP addresses added because the spiders continue to use old IP addresses. However, if it will make everybody feel better, I can re-upload the files and make them all appear new :)

Just for the record, the fantomaster IP list is also very good.

tacheman

9:53 am on Mar 25, 2005 (gmt 0)

> However, if it will make everybody feel better, I can re-upload the files and make them all appear new :)

I liked that. Offering the IP lists as a free download is highly commendable.

About using cURL/LWP, could you not just use the PHP function file_get_contents or file to read the HTML page and print it to the screen?

MrSpeed

2:03 pm on Mar 25, 2005 (gmt 0)

> About using cURL/LWP, could you not just use php function file_get_contents or file to read the html page and print it to screen?

Yes.

bestphilippinehotels

1:48 pm on Apr 6, 2005 (gmt 0)

Where can I find good cloaking scripts?

MrSpeed

3:22 pm on Apr 6, 2005 (gmt 0)

Search at DMOZ.
