How to redirect seamlessly and parse HTML

         

scorpion

4:51 pm on Aug 15, 2002 (gmt 0)

10+ Year Member



I have two questions. In the absence of mod_perl, how can I use PHP or CGI to redirect to an external location and still keep my site in the address bar, without using frames, meta refresh, or any client-side code?

Secondly, how can I insert some HTML into a specific location in an HTML file and send the new file out to the browser (using php or cgi)?

Knowles

4:58 pm on Aug 15, 2002 (gmt 0)

10+ Year Member



Welcome to WebmasterWorld scorpion!

For your first one, mod_rewrite would probably work, if I understood what you were asking correctly.

On the second question, I don't understand it. Are you wanting to, say, have a form and then have it printed out?

Nick_W

4:59 pm on Aug 15, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi, welcome to WebmasterWorld!

1. You can't. You need frames for that.
2. You could put a variable in your PHP page. Then, when the user does whatever triggers the inclusion of this HTML, assign the HTML to your empty variable (you'll have to use sessions or form vars) and redirect to the current page.

Need much more detail on the second question to give a good answer...

Nick

Nick_W

5:01 pm on Aug 15, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi Knowles,

hmmm... don't know mod rewrite stuff but I suspect you're incorrect... You certainly can't do it with PHP

Nick

Knowles

5:03 pm on Aug 15, 2002 (gmt 0)

10+ Year Member



I may be, Nick (you know me, always misunderstanding how mod_rewrite works). And come to think of it, the reason I was thinking you can wasn't to do with mod_rewrite; I read something about it right after reading something about mod_rewrite. I gotta get a new brain.

jatar_k

5:06 pm on Aug 15, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Welcome to WebmasterWorld scorpion,

1. You can but it is very intensive. You essentially have to get your script to go get the page, might have to parse and cut it, and then display it. You would have to rewrite links etc. Really isn't worthwhile, it is most likely to be slow.

The method is, in essence, cloaking. I don't understand why you need to do it this way. I also wonder what you mean by external source. If it is someone else's site then use frames; if it is something you own then find a way to put it on the other site.

2. Nick has the gist of it, and a little more info may help us out.

scorpion

5:11 pm on Aug 15, 2002 (gmt 0)

10+ Year Member



1. Somebody told me you can use the php directive 'include'. Then the page is loaded and the address bar URL stays as the original site (even if the included file comes from another URL off-site)...
I believe I have .htaccess, so I suppose I can make php files look like html files...

2. I mean like this: you have 'http://www.abc.com/apage.html'
Then you call a script on your site: 'http://www.mysite.com/ascript.cgi'

This script ascript.cgi does this:

=> retrieve apage.html
=> find the words "insert here" on the page
=> insert html, say "<a href='abc.com'>hello</a>" after this location
=> output this new file to the browser

Are there some PHP or CGI examples of how to do this? I don't need complicated parsing, just find a tag and insert after it, etc...
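
The four steps above could be sketched in PHP roughly like this. It is only a minimal sketch: the function name insert_after_marker is made up for illustration, and fetching the page over HTTP with file_get_contents() assumes your host has allow_url_fopen enabled.

```php
<?php
// Hypothetical helper: insert $snippet right after the first occurrence of $marker.
function insert_after_marker(string $html, string $marker, string $snippet): string
{
    $pos = strpos($html, $marker);
    if ($pos === false) {
        // Marker not found: return the page unchanged.
        return $html;
    }
    $cut = $pos + strlen($marker);
    return substr($html, 0, $cut) . $snippet . substr($html, $cut);
}

// Usage (URL taken from the example above; needs allow_url_fopen):
// $page = file_get_contents('http://www.abc.com/apage.html');
// echo insert_after_marker($page, 'insert here', "<a href='http://abc.com/'>hello</a>");
```

Plain strpos() is enough here because the marker is a fixed string; a regular expression would only be needed if the insertion point varies.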

scorpion

5:14 pm on Aug 15, 2002 (gmt 0)

10+ Year Member



Ok, basically, I want some scripting to hide my affiliate links. People either click on a link (if it is on the page), or they hit a root file on a top-level domain that redirects to an affiliate link. Either way, I want it so that the user never sees the affiliate link code and always sees my site's top-level domain and/or subdirectories in the address bar...

So that's the problem, if there are good solutions let me know...also, it should look good for search engines...

Nick_W

5:20 pm on Aug 15, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Right. Don't know why I didn't think of this: I'm not certain but it may do what you want...

PHP fopen() function [php.net]

If filename begins with "http://" (not case sensitive), an HTTP 1.0 connection is opened to the specified server, the page is requested using the HTTP GET method, and a file pointer is returned to the beginning of the body of the response. A 'Host:' header is sent with the request in order to handle name-based virtual hosts.

Nick

Nick_W

5:22 pm on Aug 15, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



but I want it so that the user never sees the affiliate link code and in the address bar

Frames!

Nick

jatar_k

5:26 pm on Aug 15, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



1. As far as including goes, yes that can be done. Look at the other ones on the left hand side as well.

include function [php.net]

2. You can do find and replaces like that using regular expressions; take a look at the other functions on php.net. They have a whole list of regex functions. I sometimes split the file and then have a function in the replaceable part that does the work. You include a header file, do your dynamic stuff, then include a footer file.

Nick's right, frames is the way to go for the other.

littleman

5:29 pm on Aug 15, 2002 (gmt 0)



You could grab the remote page in perl via LWP, or IO::Socket, then you are going to have to change all relative links to full URL links. That's not super hard, but could be a headache because a lot of sites use odd relative links -- so you are going to have to do some testing and fine tuning.
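
As a rough illustration of the link rewriting, here is a PHP sketch (PHP rather than perl, since that is what most of the thread uses). The function name absolutize_links is hypothetical, and it makes simplifying assumptions: attribute values are double-quoted, $base is the remote page's directory URL ending in "/", and the port is ignored. Odd markup will still need the testing and tuning mentioned above.

```php
<?php
// Hypothetical helper: turn relative href/src values into full URLs.
// Assumes $base ends with "/" and that attributes are double-quoted.
function absolutize_links(string $html, string $base): string
{
    // Scheme and host only, e.g. "http://example.com".
    $parts = parse_url($base);
    $root = $parts['scheme'] . '://' . $parts['host'];

    return preg_replace_callback(
        '/\b(href|src)="([^"]*)"/i',
        function (array $m) use ($base, $root): string {
            $url = $m[2];
            // Leave full URLs, protocol-relative URLs, anchors and mailto: alone.
            if ($url === '' || preg_match('~^([a-z][a-z0-9+.-]*:|//|#)~i', $url)) {
                return $m[0];
            }
            // Root-relative paths hang off the host; others off the base directory.
            $full = ($url[0] === '/') ? $root . $url : $base . $url;
            return $m[1] . '="' . $full . '"';
        },
        $html
    );
}
```

Paths such as "../img/a.gif" are simply appended to the base; browsers resolve the ".." segment themselves.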

Inserting into the page could be a little tricky too if you are determined to have your added html in a specific location. Basically you are going to have to search for a string in the original HTML and then replace it with the same string + whatever you want to add.

Back in the heavy spam days I used to do a lot of this type of stuff for clients. I'd grab their page, add some SE food and then feed it to the spiders on a "mirror domain". One thing I would do is cache the altered pages every few days to keep the script from bogging down the server.
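
The caching idea might look something like this in PHP. This is a sketch under assumptions, not the script described above: the function name cached_page, the cache file path and the age limit are all made up for illustration.

```php
<?php
// Hypothetical helper: return the cached copy if it is fresh, else rebuild it.
// $build is any callable that fetches and alters the remote page.
function cached_page(string $cacheFile, int $maxAge, callable $build): string
{
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAge) {
        return file_get_contents($cacheFile);
    }
    $html = $build();
    // LOCK_EX keeps two simultaneous requests from writing a half-finished file.
    file_put_contents($cacheFile, $html, LOCK_EX);
    return $html;
}

// Usage: rebuild at most every three days (259200 seconds).
// echo cached_page('/tmp/apage.cache', 259200, function () {
//     return file_get_contents('http://www.abc.com/apage.html');
// });
```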

lorax

5:33 pm on Aug 15, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month


Scorpion,
Have a look at the CURL functions (if supported by your web host). They offer a robust command set for working with off-site resources.

http://curl.haxx.se/

scorpion

12:44 am on Aug 16, 2002 (gmt 0)

10+ Year Member



About this option for fixing links: will this work?

After retrieving the remote file using fopen, include, or just LWP::Simple::get(), can't I just add the tag:

<base href="http://original_site.com/"> in the head, and all links will show up properly, instead of the complex image/link parsing?
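
For what it's worth, adding that tag to the fetched page could be sketched in PHP like this. The function name add_base_tag is made up, and it assumes the remote page actually has a <head> tag; if it doesn't, the page comes back unchanged.

```php
<?php
// Hypothetical helper: add a <base> tag right after the opening <head> tag.
function add_base_tag(string $html, string $baseUrl): string
{
    $tag = '<base href="' . htmlspecialchars($baseUrl, ENT_QUOTES) . '">';
    // Match <head> or <head ...attributes>, case-insensitively, first match only.
    return preg_replace_callback(
        '/<head(\s[^>]*)?>/i',
        function (array $m) use ($tag): string {
            return $m[0] . $tag;
        },
        $html,
        1
    );
}

// Usage:
// echo add_base_tag($page, 'http://original_site.com/');
```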

littleman

1:05 am on Aug 16, 2002 (gmt 0)



:) Yeah, you could, which I should have thought to mention. But there is a good chance that off-site base refs would trigger a flag with the SEs, if that is a concern.

But then again the dupe content may nab you too.

scorpion

1:57 am on Aug 16, 2002 (gmt 0)

10+ Year Member



thanks. Actually, not a concern since the affiliate redirect is deeply linked (that is, it is not on the homepage, but a 3rd page down, I think...).

Do engines like Google do deep spidering? I mean, if your homepage looks ok, they probably won't look too deep... or do they go really deep?

Knowles

1:59 am on Aug 16, 2002 (gmt 0)

10+ Year Member



They can go deep, if they feel the need to. Also, your competitors can be mean and try to turn you in for it.

scorpion

4:10 am on Aug 16, 2002 (gmt 0)

10+ Year Member



Actually, I found a ridiculously simple solution, check this out:

STEP 1: Create file index.php

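<?php
// Needs remote URL support: allow_url_fopen (plus allow_url_include on newer PHP).
// Note that require also executes any PHP code the remote page returns.
require 'http://yahoo.com/';
?>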

If this file is on your server as index.php and you call it, it shows Yahoo in the browser AND the url in the address bar is your site...

STEP 2: Serve index.html as index.php in your .htaccess file

RewriteEngine on
RewriteBase /scripts/
RewriteRule ^index\.html$ index.php [T=application/x-httpd-php]

Voila, this should do exactly what I desire.

Only thing is, I don't know what search engines would do with the above scheme or how to perhaps improve it for search engines...