| 7:11 am on Jan 20, 2006 (gmt 0)|
Not sure I understand the question, but maybe the LWP module (perl) is what you are looking for.
| 1:21 pm on Jan 20, 2006 (gmt 0)|
OK, I'm looking to include the output of the script part, giving me static content on my page from a dynamic script.
Right now if I do a view source of the page, it just shows the <script tag and not the actual HTML that my browser gets and displays.
I have no experience writing CGI scripts, so I will need some "leading" :)
I also tried using cURL last night and didn't get any different results.
| 7:04 pm on Jan 20, 2006 (gmt 0)|
Look into the LWP or LWP::Simple modules, but they are not core modules and some hosts don't have them installed. If it's your own server you can install them.
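Since LWP is not core, it's worth checking first whether your host has it. A minimal sketch of that check (no placeholders needed here, it only tests for the module):

```perl
#!/usr/bin/perl
# Quick check for whether LWP::Simple is installed on this host,
# since it is not a core module.
use strict;
use warnings;

my $have_lwp = eval { require LWP::Simple; 1 };
print $have_lwp
    ? "LWP::Simple is installed\n"
    : "LWP::Simple is NOT installed: $@";
```

From the shell, `perl -MLWP::Simple -e 1` does the same check; if the module is missing you get a "Can't locate LWP/Simple.pm" error.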
| 7:41 am on Jan 21, 2006 (gmt 0)|
I still don't get what you're trying to do. Can you explain what you're trying to accomplish in terms of what you want the web site to do, and what type of content you're dealing with? Maybe then I can understand the problem.
| 1:25 pm on Jan 21, 2006 (gmt 0)|
Thanks for replying, I'm trying to hone the question, but I'm not good at describing it :(
The info provided to me by my merchant is in the form of a <script tag, and this builds the page in the browser on the fly with the latest info from the merchant. All you see when you do a browser view source is the <script src="http://blahblah"></script> part. But the page displays multiple items, text and images, and this is the part I'm trying to retrieve.
I'd like to run a cron to get the HTML data of the contents of the script once a day and then save it to a text file. This way I can use an include to present the results in a static format rather than dynamic. But I have no clue how to do this.
| 4:01 pm on Jan 21, 2006 (gmt 0)|
I don't think fetching the script and saving it to a static file would work. The reason is that the script tag with a SRC attribute in an HTML page specifies an external file containing CLIENT-SIDE script.
What happens then is that when the page is loaded, the script is also fetched and executed IN THE BROWSER - but when you do a view source, you don't see the contents of the script file, just the script tag (as you observed).
The fetch idea you have would only work if the script executed on the web server - which is not the case here.
| 3:13 pm on Jan 22, 2006 (gmt 0)|
Well, I figured that if a browser can pull the data, then a script could do the same.
I guess I'll keep looking :)
| 3:27 pm on Jan 22, 2006 (gmt 0)|
You certainly can do what you have described with perl, PHP or wget, assuming that the merchant returns HTML (and not script within script within script, etc.)
There are a few different ways to do it, but basically :
1. Grab the HTML from the merchant site daily with your cronned perl or PHP script (or wget) and save it to a text file.
2. Use SSI in your HTML file to include that text file.
Or, do 1, but use perl or PHP to save the fetched data together with the remainder of your page as your HTML file. No text file or SSI.
My perl is rusty, so I can't give code examples. sorry.
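For anyone who wants a starting point, steps 1 and 2 above can be sketched roughly like this. The URL, script name, and file paths below are placeholders, and this assumes LWP::Simple is installed and that the merchant really does return plain HTML:

```perl
#!/usr/bin/perl
# Step 1 sketch: fetch the merchant page and save it to a text file
# for SSI to include. URL and output file come from the command line.
use strict;
use warnings;

# Save the fetched HTML to a text file.
sub save_snapshot {
    my ($html, $file) = @_;
    open my $fh, '>', $file or die "Cannot write $file: $!\n";
    print {$fh} $html;
    close $fh or die "Close failed for $file: $!\n";
}

# Fetch only when a URL and output file are supplied,
# e.g. by the daily cron job.
if (@ARGV == 2) {
    my ($url, $file) = @ARGV;
    require LWP::Simple;    # non-core module, as noted above
    my $html = LWP::Simple::get($url);
    die "Fetch of $url failed\n" unless defined $html;
    save_snapshot($html, $file);
}
```

A daily crontab entry might then look like (paths are placeholders):
`0 6 * * * /usr/bin/perl /path/to/grab.pl http://merchant.example.com/feed /www/data/merchant.txt`
and step 2 is a standard SSI directive in the page: `<!--#include file="merchant.txt" -->`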
| 3:46 pm on Jan 22, 2006 (gmt 0)|
I think it IS scripts within scripts... I tried grabbing the URL before, and all I get from the server is
var lsn_hid='***'; var lsn_eid='******'; var lsn_oid='***'; var lsn_u1=''; var lsn_click='http://somewebsite.com/cgi-bin/click?id=******&var=****.'+'***'+'&type=14&catid='+'1'+'&hid='+'***';
| 8:38 am on Jan 25, 2006 (gmt 0)|
Yes ^^ you need something which can run these scripts ^^
1) Command-line switches and Mozilla?
(I've only used them as already installed. They are not perfect, but in combination with LWP and perl they can do what you need... perhaps ^^)
| 5:08 pm on Jan 25, 2006 (gmt 0)|
It appears you have JS code or SSI, (server side includes) in an HTML file.
Let's call the current static page on your site "mypage.htm"
If that page contains JS or SSI that produces content and inserts it into "mypage.htm" on-the-fly, it MAY or MAY NOT be possible to pre-fetch the content and automatically insert it as static content.
If the content can be retrieved (so you can paste it into your page and make the entire page appear as static content originating from your domain), you will most likely be able to do it with LWP (as others have suggested).
The possibility that you MAY NOT be able to do this could arise from:
1. the server where the content originates relies on the URL of the calling page to produce the content (i.e. their server says, "produce HTML for JS or SSI requests originating from http://domain.tld/specific-page.htm"),
2. the content constantly changes and is produced at the specific time/date the request is made (you could end up publishing outdated info even if you ran your scraper script every hour)...
3. the approach you are taking is against the TOS of the data provider you are attempting to scrape from.
There is most likely "some way to do it", but you may not get what you expect from the resulting data.
| 5:26 pm on Jan 25, 2006 (gmt 0)|
var lsn_hid='***'; var lsn_eid='******'; var lsn_oid='***'; var lsn_u1=''; var lsn_click='http://somewebsite.com/cgi-bin/click?id=******&var=****.'+'***'+'&type=14&catid='+'1'+'&hid='+'***'; document.write('');
Chances are you can build the var data as a query string in the form of:
$URL = 'http://somewebsite.com/cgi-bin/click';
$varName_1 = 'lsn_hid';
$varData_1 = 'data';
$varName_2 = 'lsn_eid';
$varData_2 = 'data';
$varName_3 = 'lsn_oid';
$varData_3 = 'data';
Then feed it to LWP as a request to scrape that URL
$scrapeURL = "$URL?$varName_1=$varData_1&$varName_2=$varData_2&$varName_3=$varData_3"; #(etc..)
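Putting those pieces together, a minimal sketch of building that query string (the parameter values are placeholders, matching the obfuscated snippet above):

```perl
#!/usr/bin/perl
# Build the scrape URL as a query string from the JS variable names
# seen in the merchant snippet. The values here are placeholder data.
use strict;
use warnings;

my $url = 'http://somewebsite.com/cgi-bin/click';

# name => value pairs, in the order they appear in the JS
my @params = (
    lsn_hid => 'data',
    lsn_eid => 'data',
    lsn_oid => 'data',
);

# Join each name=value pair, then glue them with '&'.
my @pairs;
while (@params) {
    my ($name, $value) = splice @params, 0, 2;
    push @pairs, "$name=$value";
}

my $scrape_url = "$url?" . join '&', @pairs;
print "$scrape_url\n";
```

This prints http://somewebsite.com/cgi-bin/click?lsn_hid=data&lsn_eid=data&lsn_oid=data, which you can then hand to LWP as described. If the real values contain special characters, they would need escaping first (the URI::Escape module handles that).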