Forum Moderators: coopster

Data extraction with variables

         

turbohost

5:42 pm on Dec 29, 2003 (gmt 0)

10+ Year Member



Hi,

I want to extract data from a website with variable URLs. There is a fixed part in the URL (e.g. [site.com...]) and a variable part, which is a number (e.g. 100). I want to extract all the HTML pages from 100 through 110.

I used the script below to extract the data from a single page:

<?php
$lines = file ('http://www.example.com/');
foreach ($lines as $line_num => $line) {
echo "Line #<b>{$line_num}</b> : " . htmlspecialchars($line) . "<br>\n";
}

I adjusted the script like this:

<?php

$a=100;
$b=110;
while ($a <= $b)
{
$lines = file ('http://www.example.com/show.php?id=$a');
foreach ($lines as $line_num => $line) {
echo "" . htmlspecialchars($line) . "<br>\n";
}
$a++;
}
?>

The script fetches the literal URL 'http://www.example.com/show.php?id=$a' on every pass through the loop instead of 'http://www.example.com/show.php?id=100' through 'http://www.example.com/show.php?id=110'. How can I tell PHP to use the value of $a in the URL?

Thx,
Turbo-host

coopster

6:02 pm on Dec 29, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



You need to change this line to use double quotes so the PHP parser will insert the value of your variable:

$lines = file ("http://www.example.com/show.php?id=$a");
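
To see the difference in isolation, here is a minimal sketch that needs no network access. The names $literal and $parsed are only for illustration and are not part of the original script.

```php
<?php
$a = 100;

// Single quotes: the string is taken literally, so "$a" stays as text.
$literal = 'http://www.example.com/show.php?id=$a';

// Double quotes: PHP substitutes the current value of $a.
$parsed = "http://www.example.com/show.php?id=$a";

echo $literal, "\n"; // http://www.example.com/show.php?id=$a
echo $parsed, "\n";  // http://www.example.com/show.php?id=100
```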

jatar_k

6:04 pm on Dec 29, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



I usually build strings before I pass them to functions. It's just habit, and it helps me debug: I can echo the variable before it is passed to see what is actually in it.

$dest = 'http://www.example.com/show.php?id=' . $a;
$lines = file($dest);

That should work fine, but I think the variable isn't being resolved because the string is in single quotes instead of double quotes. So this should fix it:

$lines = file("http://www.example.com/show.php?id=$a");

<added>I am a little slow this morning it seems ;)
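
A rough sketch of the build-then-echo habit applied to the whole range, so each URL can be checked before anything is fetched. The helper build_url() is hypothetical and only here for illustration.

```php
<?php
// Hypothetical helper (not from the thread): builds the URL for one page id.
function build_url($id)
{
    return 'http://www.example.com/show.php?id=' . $id;
}

$urls = array();
foreach (range(100, 110) as $id) {
    $dest = build_url($id);
    echo $dest, "\n";   // inspect the string before passing it to file()
    $urls[] = $dest;
}

echo count($urls), " URLs built\n"; // 100 through 110 inclusive is 11 pages
```

Once the echoed URLs look right, each $dest can be handed to file() as in the snippet above.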

coopster

6:08 pm on Dec 29, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



I should have added that you may want to read the PHP String pages regarding Variable parsing [php.net].

turbohost

6:22 pm on Dec 29, 2003 (gmt 0)

10+ Year Member



Thx a lot guys, the script is working :->

turbohost

6:26 pm on Dec 29, 2003 (gmt 0)

10+ Year Member



Maybe one more question. How can I adjust this script to write the output to a file?

mogwai

6:43 pm on Dec 29, 2003 (gmt 0)

10+ Year Member



As you loop through, append each new line to a variable:

$file_content .= "Line #<b>{$line_num}</b> : " . htmlspecialchars($line) . "<br>\n";

After the loop use something like:

$file_open = fopen('file.txt', 'w');
fwrite($file_open, $file_content);
fclose($file_open);
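
Put together, the two steps could look roughly like this. It is a sketch under assumptions: the $lines array stands in for what file() would return, and the path in the system temp directory is made up so the example can run anywhere.

```php
<?php
// Stand-in for the lines returned by file(); the real script would fetch them.
$lines = array("<h1>Title</h1>\n", "<p>Body</p>\n");

$file_content = '';   // start empty so .= has something to append to
foreach ($lines as $line_num => $line) {
    $file_content .= "Line #{$line_num} : " . htmlspecialchars($line) . "<br>\n";
}

// Assumed path: a scratch file in the system temp directory.
$path = sys_get_temp_dir() . '/extract_demo.txt';

$file_open = fopen($path, 'w');   // the mode is a string, so it needs quotes
if ($file_open === false) {
    die("Could not open $path for writing\n");
}
fwrite($file_open, $file_content);
fclose($file_open);

// Read the file back to confirm what was written.
echo file_get_contents($path);
```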

turbohost

7:39 pm on Dec 29, 2003 (gmt 0)

10+ Year Member



OK, I think I'm almost there :-> I chmodded the text file 666 but I'm still getting the error 'Parse error: parse error in /home/bla/public_html/get.php on line 9'. I'm using the script below. Do I have to add a ' or a " somewhere?

<?php
$a=100;
$b=102;
while ($a <= $b)
{
$lines = file ("http://www.example.com/show.php?id=$a");
foreach ($lines as $line_num => $line)
{
$file_content = . htmlspecialchars($line) . "<br>\n";
$file_open = fopen('/home/bla/public_html/bla.txt', w);
fwrite($file_open, $file_content);
}
$a++;
}
?>

jatar_k

7:44 pm on Dec 29, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



$file_content = . htmlspecialchars($line) . "<br>\n";

I think you meant

$file_content .= htmlspecialchars($line) . "<br>\n";

Not sure, but either way the problem is the dot after the =.
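
For reference, a small sketch of how the .= operator behaves. The strings are placeholders.

```php
<?php
$file_content = '';              // initialise before appending

// .= appends; it is shorthand for $file_content = $file_content . "...";
$file_content .= "first<br>\n";
$file_content .= "second<br>\n";

echo $file_content;

// Writing "= ." instead leaves the dot with no left-hand operand,
// which is what triggers the parse error.
```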

turbohost

8:06 pm on Dec 29, 2003 (gmt 0)

10+ Year Member



Great, everything is working. Thx guys!

Here is the working script if somebody else wants to use it.

<?php
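$a = 100;
$b = 102;
$file_content = '';   // start with an empty string so .= has something to append to
while ($a <= $b)
{
    // double quotes so $a is replaced by its value
    $lines = file("http://www.example.com/show.php?id=$a");
    foreach ($lines as $line_num => $line)
    {
        $file_content .= htmlspecialchars($line) . "<br>\n";
    }
    $a++;
}
// write once, after all pages are collected; the mode must be the quoted string 'w'
$file_open = fopen('/home/bla/public_html/bla.txt', 'w');
fwrite($file_open, $file_content);
fclose($file_open);
?>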
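
The script above keeps every page in memory and assumes each fetch succeeds. A possible variation, sketched here under assumptions, opens the file once, writes each page as it arrives, and skips pages that fail. save_pages() and the $fetch parameter are hypothetical names, and the fake fetcher only exists so the sketch runs without network access.

```php
<?php
// Hypothetical helper, not from the thread. $fetch is any callable that takes
// a URL and returns an array of lines, or false on failure, like file() does.
function save_pages($first, $last, $path, $fetch)
{
    $out = fopen($path, 'w');          // open once, before the loop
    if ($out === false) {
        return false;
    }
    $saved = 0;
    for ($id = $first; $id <= $last; $id++) {
        $lines = call_user_func($fetch, "http://www.example.com/show.php?id=$id");
        if ($lines === false) {
            continue;                  // skip pages that could not be fetched
        }
        foreach ($lines as $line) {
            fwrite($out, htmlspecialchars($line) . "<br>\n");
        }
        $saved++;
    }
    fclose($out);
    return $saved;                     // number of pages written
}

// Stand-in fetcher: pretends that page 101 cannot be retrieved.
$fake_fetch = function ($url) {
    return substr($url, -3) === '101' ? false : array("page from $url\n");
};

$path = sys_get_temp_dir() . '/pages_demo.txt';
echo save_pages(100, 102, $path, $fake_fetch), " pages saved\n"; // 2 pages saved
```

Against a real site, passing 'file' as $fetch should give the same behaviour as the script above, provided allow_url_fopen is enabled on the server.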