Cloaking Forum

    
Cloaking using LWP
yourPAGES




msg:675492
 4:02 am on Nov 15, 2005 (gmt 0)

We're converting a site from static to dynamic using a proprietary database system (not open source), and we would like to hide the parameter-laden dynamic URLs from the outside world.

Firstly, I'm not sure if this counts as Cloaking as we're not trying to present different results to anyone based on who they are... if not, then our apologies for misusing the term.

Secondly, because we want the static URL to remain visible it seems to us that Mod_Rewrite won't do the job, so we've read up a little about Perl LWP and wonder if this could be the way to go.

Our understanding is that it goes something like this, although the syntax is almost certainly incorrect :-)...

#!/usr/local/bin/perl
use LWP::Simple;
$dynamic_url = "http://whatever";
my $page = get($dynamic_url);
print "$page";

Any thoughts on (1) the technique, and (2) the steps to make it happen?

Thanks,
your.PAGES

 

yourPAGES




msg:675493
 4:05 am on Nov 15, 2005 (gmt 0)

We maybe should have added that both the old Static and new Dynamic URLs will be on the same domain.

your.PAGES

jdMorgan




msg:675494
 4:14 am on Nov 15, 2005 (gmt 0)

If you publish static URLs on your pages that can be used to create the corresponding dynamic links (i.e. they contain all the necessary information) when requested from your site, then mod_rewrite on Apache or ISAPI rewrite on IIS would be a lot more efficient. A search of this site for "rewriterule static URL dynamic" and a look through the WebmasterWorld library will turn up many 'how-to' threads. Site search info is available in the link at the top of every page on WebmasterWorld.

Jim

yourPAGES




msg:675495
 4:28 am on Nov 15, 2005 (gmt 0)

Thanks Jim,

Our concern is that all our searching suggested that Mod_Rewrite changes the URL displayed in the Browser, and that's not what we want.

Although the database package can publish Static URLs, they are not as user-friendly as we would like, and they also impose other restrictions that the Dynamic pages don't suffer from.

We will keep searching, but we'd appreciate it if you or anyone else can suggest any solution (we're not worried if it isn't Perl LWP) that will retain the Static URL upon page display.

Thanks,
your.PAGES

yourPAGES




msg:675496
 4:45 am on Nov 15, 2005 (gmt 0)

We've found the following comment from you in another post...

"Mod_rewrite acts after a request arrives at your server, and before any content is served or any scripts are invoked. Therefore, it can change incoming static urls to the dynamic form needed to call your script, but it cannot change the urls that appear on your site's pages."

...so is there a non-Mod_Rewrite solution?

Thanks,
your.PAGES

jdMorgan




msg:675497
 4:55 am on Nov 15, 2005 (gmt 0)

You're reading things into this that are not there, and the result is heading toward an over-complicated solution.

Here's how this works:

  • Publish the static link "example.com/red/widget" on your page
  • Visitor clicks the example.com/red/widget link
  • Visitor's browser sends request for /red/widget page to your server
  • Request for example.com/red/widget arrives at your server
  • Server internally rewrites /red/widget to product-display-script.php?product=widget&color=red
  • Script product-display-script uses parameters product=widget&color=red to generate the "red widget" product page, and the server sends that to your visitor.
  • This page contains another link. This time, it's example.com/blue/widget
  • User clicks to see the blue widget.
  • Process repeats.
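
As a rough sketch of the internal rewrite step above, an .htaccess file in the site's document root might look like the following. It assumes Apache with mod_rewrite enabled, that product-display-script.php sits in the document root, and that colors and product names contain only lowercase letters; those details are illustrative and not taken from the original post.

```apache
# .htaccess in the document root (assumed location)
RewriteEngine on

# Leave requests for real files and directories alone
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

# /red/widget -> /product-display-script.php?product=widget&color=red
# There is no [R] flag, so this is an internal rewrite and the
# browser keeps showing /red/widget
RewriteRule ^([a-z]+)/([a-z]+)$ /product-display-script.php?product=$2&color=$1 [L]
```

The rewritten path contains no second path segment, so the rule does not match its own output and the rewrite does not loop.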

    Set up a small test and try it.

    Jim

    yourPAGES




    msg:675498
     6:30 am on Nov 15, 2005 (gmt 0)

    Thanks Jim,

    Yes, you're probably right that we're making it more complicated than it should be :-)

    But in your example (which at least tells me that I am following what's going on), what URL will appear in the user's Browser at the following point?...

    # Script product-display-script uses parameters product=widget&color=red to generate the "red widget" product page, and the server sends that to your visitor.
    *** which URL here? ***
    # This page contains another link. This time, it's example.com/blue/widget

    Everything else we can cope with (and have done already using test URLs), but in our tests the URL shown in the Browser is the Dynamic one, and that's not what we want.

    Why?

    (1) It looks messy
    (2) Although it can be bookmarked, it still looks messy.
    (3) It'll encourage someone to experiment (hack) with different parameter combinations, maybe leading to nasty results.

    Can we achieve what we're after?

    (and by the way, is this called "cloaking"?)

    Thanks,
    your.PAGES

    volatilegx




    msg:675499
     4:48 pm on Nov 15, 2005 (gmt 0)

    I believe with Mod_Rewrite, you can set it up so the URL in the browser doesn't change. It only changes if you have Mod_Rewrite code that sends a redirect.

    > (and by the way, is this called "cloaking"?)

    No.

    You could accomplish your goal through the use of LWP, but it would be wasteful of server resources, because every time a page was requested, the Perl interpreter would have to send another request through Apache for the file requested via LWP.

    By the way, your LWP code snippet looks good, but you forgot to send a header before printing the page... use this:

    #!/usr/local/bin/perl
    use strict;
    use warnings;
    use LWP::Simple;

    my $dynamic_url = "http://whatever";
    # get() returns undef if the fetch fails
    my $page = get($dynamic_url);
    # A CGI script must send a header and a blank line before the body
    print "Content-type: text/html\n\n";
    print defined $page ? $page : "Could not retrieve the page.";

    jdMorgan




    msg:675500
     5:15 pm on Nov 15, 2005 (gmt 0)

    Mod_rewrite has many functions, the major ones being:

  • Internal rewrite
  • External redirect
  • Proxy throughput
  • Error response

    Internal rewrites and proxy throughputs *do not* change the browser's address bar, because no 'handshake' with the client takes place. Only in the second case, the external redirect, is the address bar changed, because the server sends a redirect response containing the new URL to the client, and the client must re-request the desired resource from that new URL. It is the client browser that changes its own address bar when it re-requests the resource, not the server.
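
For illustration only, the four functions listed above might be written as follows in an .htaccess file. All paths and host names here (static-*.html, old-page.html, backend.example.com, and so on) are made-up placeholders, and the proxy rule assumes mod_proxy is also loaded.

```apache
RewriteEngine on

# 1. Internal rewrite: no [R] flag, so the address bar is unchanged
RewriteRule ^static-([a-z0-9-]+)\.html$ /cgi-bin/script.cgi?key=$1 [L]

# 2. External redirect: [R] sends a redirect response, and the browser
#    re-requests the new URL and updates its own address bar
RewriteRule ^old-page\.html$ http://www.example.com/new-page.html [R=301,L]

# 3. Proxy throughput: [P] fetches the content from another server
#    (requires mod_proxy); the address bar is unchanged
RewriteRule ^remote/(.*)$ http://backend.example.com/$1 [P]

# 4. Error response: [F] returns 403 Forbidden, [G] returns 410 Gone
RewriteRule ^private/ - [F]
RewriteRule ^retired-page\.html$ - [G]
```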

    Again, set up a small test and see for yourself... For a demonstration, you need look no further than the address bar at the top of this page -- WebmasterWorld is almost entirely dynamic, yet the forum page links are static.

    Jim

    yourPAGES




    msg:675501
     11:44 pm on Nov 15, 2005 (gmt 0)

    Thanks to you both.

    OK, maybe what we need to see is an example of an External Redirect (which is what we think we've coded in the various tests we tried out before asking for help here), and then one for an Internal Rewrite.

    We've seen these mentioned in our searches, and think we understand the difference (see below), but haven't seen an example of how the syntax differs between them.

    External Redirect...
    Browser sends: www.domain.com/static-value.html
    Mod_Rewrite redirects to: www.domain.com/cgi-bin/script.cgi?key=static-value
    Script is invoked and serves results
    Browser shows: www.domain.com/cgi-bin/script.cgi?key=static-value

    Internal Rewrite...
    Browser sends: www.domain.com/static-value.html
    Mod_Rewrite rewrites to: www.domain.com/cgi-bin/script.cgi?key=static-value
    Script is invoked and serves results
    Browser still shows: www.domain.com/static-value.html

    If that's so, then an Internal Rewrite is exactly what we're after.

    Thanks for staying with us while we get our heads around this. We're reasonably familiar with Regular Expressions, so once we've got this bit understood we should be ok.

    Thanks,
    your.PAGES

    volatilegx




    msg:675502
     4:45 pm on Nov 16, 2005 (gmt 0)

    Here's an example of an internal rewrite:

    RewriteEngine on
    RewriteBase /
    RewriteRule ^namingtoken-(.*)\.html$ pagename.php?$1

    What this does is look for requests like "namingtoken-productname.html" and rewrite the URL (internally, without showing the rewritten URL in the browser) to "pagename.php?productname". I used the "namingtoken-" part of the original URL to distinguish this type of call from other non-dynamic pages on your site.

    Of course, the parameter I used is very simple and you could do something more sophisticated. Check the Apache 1.3 URL Rewriting Guide [httpd.apache.org] for examples, etc.
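
As one possible "more sophisticated" variant, the sketch below tightens the pattern and passes the value as a named parameter. The parameter name "product" is hypothetical, and pagename.php would have to read it under that name.

```apache
RewriteEngine on
RewriteBase /

# Accept only letters, digits, and hyphens in the product name;
# any other request does not match and is not passed to the script.
# [QSA] appends any query string from the original request,
# [L] stops processing further rules for this request.
RewriteRule ^namingtoken-([A-Za-z0-9-]+)\.html$ pagename.php?product=$1 [QSA,L]
```

Restricting the character class limits what a visitor can feed into the script by editing the static URL.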

    Philosopher




    msg:675503
     4:56 pm on Nov 16, 2005 (gmt 0)

    Hi yourPAGES

    Yes, the internal rewrite example you displayed above is correct and, from what you've written, precisely what you are looking for.

    As mentioned earlier, virtually all of WW is dynamic but the URLs are all rewritten to static looking URLs.

    All of the work is done at the server level and is NEVER seen by the visitor.

    yourPAGES




    msg:675504
     12:21 am on Nov 17, 2005 (gmt 0)

    Thanks volatilegx & philosopher.

    We've finally realised that what we were missing (or rather adding) was the [R]... we hadn't found anywhere that explicitly stated something like...

    External Redirection: use [R]
    Internal Rewrite: don't use [R]

    We know that sounds obvious, but having gone back over a lot of the references we had looked at, the words "Redirect" & "Rewrite" are intermingled (misused) quite a lot, so we had seen examples using [R] that claimed to be "Rewrites" and vice versa.
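
Side by side, the two cases might look like this, reusing the static-value.html and /cgi-bin/script.cgi names from the earlier post. The character class in the pattern is an assumption about what the key values look like, and only one of the two rules would be active at a time.

```apache
RewriteEngine on

# Internal rewrite (no [R]): the browser keeps showing /static-value.html
RewriteRule ^([a-z0-9-]+)\.html$ /cgi-bin/script.cgi?key=$1 [L]

# External redirect ([R], a 302 by default): the server replies with a
# redirect, and the browser re-requests and displays
# /cgi-bin/script.cgi?key=static-value
# RewriteRule ^([a-z0-9-]+)\.html$ /cgi-bin/script.cgi?key=$1 [R,L]
```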

    Thank you to everyone for your help.

    volatilegx




    msg:675505
     2:34 pm on Nov 17, 2005 (gmt 0)

    Yeah sometimes we get so used to doing things, we forget to mention the obvious (to us anyway). The R flag forces a redirect, as you've realized. There are a bunch of flags you can use with Mod_Rewrite... here's a list (scroll down a bit): Mod_Rewrite: RewriteRule [httpd.apache.org]

    One more thing... if you are using parameters in your rewriting, you might want to consider using the NE flag.

    'noescape|NE' (no URI escaping of output)
    This flag keeps mod_rewrite from applying the usual URI escaping rules to the result of a rewrite. Ordinarily, special characters (such as '%', '$', ';', and so on) will be escaped into their hexcode equivalents ('%25', '%24', and '%3B', respectively); this flag prevents this from being done. This allows percent symbols to appear in the output, as in
    RewriteRule /foo/(.*) /bar?arg=P1\%3d$1 [R,NE]
