Looting information from outside source?

Trying to take an artist description from an outside source...

         

GamingLoft

5:42 pm on Feb 3, 2008 (gmt 0)

10+ Year Member



I have absolutely no idea what I'm doing, but I figure I'm going to have to do something like this (even though I don't know the proper functions):

1. read the file
2. create a function to find the parts of the file i want
example of this:


GET TEXT after...
"<div class="f"><span class="iesucks">&nbsp;</span></div>
</div></div>"


and stop getting text at "<div style="margin-top:20px; border-top:1px solid #CCC; padding-top:10px;">"

Pretty complicated for me. I have no idea where to start, but here's what I have so far (all it does is read the file...):


<?php
$artist = urlencode($_GET['artist']);
$url = 'http://www.example.com/music/'.$artist.'/+wiki';
readfile($url);
?>

[edited by: eelixduppy at 7:10 pm (utc) on Feb. 3, 2008]
[edit reason] example.com [/edit]

eelixduppy

7:37 pm on Feb 3, 2008 (gmt 0)



How about something like this, to continue the code that you already have there:

$artist = urlencode($_GET['artist']);
$url = 'http://www.example.com/music/'.$artist.'/+wiki';
$html = file_get_contents($url);
$b = preg_quote('<div class="f"><span class="iesucks">&nbsp;</span></div>
</div></div>','/');
$e = preg_quote('<div style="margin-top:20px; border-top:1px solid #CCC; padding-top:10px;">','/');
preg_match("/$b(.+)$e/ix",$html,$desc);
echo "Description: $desc";

This is untested, but it looks like it should work. You could also use strpos to find the location of each of those markers in the string and then take the substring between them; however, that is just as much work. Try the above and see what it gives you.
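
For comparison, here is a minimal sketch of the strpos route mentioned above. The helper name extract_between and the sample markup are made up for illustration; on the real page you would pass in the downloaded HTML and the two div markers from the question.

```php
<?php
// Hypothetical helper (not a built-in): returns the text found between
// $start and $end, or null if either marker is missing.
function extract_between(string $haystack, string $start, string $end): ?string
{
    $from = strpos($haystack, $start);
    if ($from === false) {
        return null;                      // start marker not present
    }
    $from += strlen($start);              // skip past the start marker itself

    $to = strpos($haystack, $end, $from); // look for the end marker after it
    if ($to === false) {
        return null;                      // end marker not present
    }
    return substr($haystack, $from, $to - $from);
}

// Made-up sample standing in for the downloaded page.
$html = '<div id="a">junk</div><p>Artist bio here.</p><div id="b">more junk</div>';
echo extract_between($html, '<div id="a">junk</div>', '<div id="b">');
```

Checking for false with === matters here, because strpos can legitimately return 0 when the marker sits at the very start of the string.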

GamingLoft

10:38 pm on Feb 3, 2008 (gmt 0)




OK, I tried it... but I have a couple of problems.

1) It TAKES FOREVER to load, probably because the servers I'm looting info from are slow. Is there any way to fix this, like using JavaScript or some other code?

2) It takes the "mini icon" thing from the site. I swear I have no idea what it's called right now, but it's that little 16x16 px icon that shows up in the shortcuts or in the toolbar. It's using the OTHER site's icon...

3) BIGGEST PROBLEM: what shows up on my page is not the artist info I'm trying to take. Instead it shows


Description: Array
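
A note on where "Description: Array" likely comes from: preg_match fills its third argument with an array of matches, so echoing that variable directly prints the word "Array". The captured text is in element 1. The sketch below also swaps the x modifier for s, on the assumption that the description spans several lines: x makes the pattern ignore unescaped whitespace (so the spaces inside the markers stop matching), while s lets the dot match newlines. The sample markup is made up for illustration.

```php
<?php
// Made-up sample standing in for the downloaded page.
$html = "<b>start</b>\nSome artist text\nover two lines.<i>end</i>";

// Quote the literal markers so their special characters are matched as-is.
$b = preg_quote('<b>start</b>', '/');
$e = preg_quote('<i>end</i>', '/');

// i = case-insensitive, s = dot also matches newlines, +? = stop at the first end marker.
if (preg_match("/$b(.+?)$e/is", $html, $desc)) {
    // $desc[0] is the whole match, $desc[1] is the first capture group.
    echo 'Description: ' . trim($desc[1]);
} else {
    echo 'Description not found';
}
```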

-------------------
THINK I KNOW THE ANSWER...

well ...


<div class="f"><span class="iesucks">&nbsp;</span></div>
</div></div>

was not found in the source of the page I'm trying to take info from when I did Ctrl+F.

So I started cutting my search term down, and realized that the code I should be finding has whitespace in it (like blank lines between the tags). Would that make a difference? If so, does anyone know a way around it?

Here's the code I need to find:


<div class="f"><span class="iesucks">&nbsp;</span></div>
</div></div>
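
On the whitespace question: yes, an exact string search fails if the page has extra line breaks or spaces between the tags. One possible workaround, sketched below, is to strip the whitespace between tags from both the page and the search term before comparing. The helper name squeeze_tags and the sample page are assumptions for illustration, not the real site's markup.

```php
<?php
// Hypothetical helper: remove any whitespace that sits between two tags.
function squeeze_tags(string $html): string
{
    // preg_replace returns null on error; fall back to the input in that case.
    return preg_replace('/>\s+</', '><', $html) ?? $html;
}

// Made-up page fragment with extra blank lines and indentation between the tags.
$page = "<div class=\"f\"><span class=\"iesucks\">&nbsp;</span></div>\n\n   </div>\n</div>\nBio text";

// The search term as copied into the script.
$marker = "<div class=\"f\"><span class=\"iesucks\">&nbsp;</span></div>\n</div></div>";

var_dump(strpos($page, $marker));                             // bool(false)
var_dump(strpos(squeeze_tags($page), squeeze_tags($marker))); // int(0)
```

The trade-off is that whitespace between tags inside the extracted description is squeezed out as well, which may or may not matter for display.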

----------------------------------------------

OK, I KNOW THE SIZE OF MY POST IS GETTING OUT OF HAND, but I have another update.

I now know EXACTLY what I need to do for this to work.

If anyone can relate to this code, they'll know what I mean:


//IGNORE EVERYTHING BEFORE THIS.
<div class="lastPanel alt">
<div class="h"><h2>Factbox (<a href="/forum/markup#artistwikitags" target="_help" style="color: black;" title="What's This?">?</a>)</h2></div>
<div class="c">
<dl class="sidebarInfoList">
//ANY TEXT/num/characters HERE (WildCard?)
</dl>
</div>
<div class="f"><span class="iesucks">&nbsp;</span></div>
</div></div>
//THE ARTIST DESCRIPTION WILL BE HERE!
<div style="margin-top:20px; border-top:1px solid #CCC; padding-top:10px;">
//IGNORE EVERYTHING BELOW!

[edited by: GamingLoft at 11:25 pm (utc) on Feb. 3, 2008]

GamingLoft

1:16 am on Feb 4, 2008 (gmt 0)




Sorry for the double post, I really don't like to do this, but... :/
"Age: Allowable time to edit post has past. "

Anyway, I came up with this so far, which I still need to work on:


<?php
$artist = urlencode($_GET['artist']);
$url = 'http://www.example.com/music/'.$artist.'/+wiki';
$file = file_get_contents($url);
$factbook = '<div class=\"lastPanel alt\">
<div class=\"h\"><h2>Factbox (<a href=\"/forum/markup#artistwikitags\" target=\"_help\" style=\"color: black;\" title=\"What\'s This?\">?</a>)</h2></div>
<div class=\"c\">
<dl class=\"sidebarInfoList\"> ';
$thatline = ' <div style=\"margin-top:20px; border-top:1px solid #CCC; padding-top:10px;\"> Registered users can edit this page. <br/> <a href=\"/join/\">Sign up</a> now, it\'s free and you will discover so much great music :)
</div>';
$factbook = stripslashes($factbook);
$factpos = strpos($file, $factbook);
$thatline = stripslashes($thatline);
$thatlinepos = strpos($file, $thatline);
echo 'located at...';
echo $factpos;
echo 'line at...';
echo $thatlinepos;
echo '<br><br>';
$desc = substr_replace($file, '', 0, $factpos) . '';
$desc = substr_replace($desc, '', -$factpos, -1) . '';
echo $desc;
?>

If you read the bottom of my post above, you'll see that I'm attempting to do this:


//IGNORE EVERYTHING BEFORE THIS.
<div class="lastPanel alt">
<div class="h"><h2>Factbox (<a href="/forum/markup#artistwikitags" target="_help" style="color: black;" title="What's This?">?</a>)</h2></div>
<div class="c">
<dl class="sidebarInfoList">
//ANY TEXT/num/characters HERE (WildCard?)
</dl>
</div>
<div class="f"><span class="iesucks">&nbsp;</span></div>
</div></div>
//THE ARTIST DESCRIPTION WILL BE HERE!
<div style="margin-top:20px; border-top:1px solid #CCC; padding-top:10px;">
//IGNORE EVERYTHING BELOW!

Anyway, my code does what it does, but I STILL need help with the WILDCARD part. I have ABSOLUTELY no idea how to do this and can't find anything anywhere!

Here's the part I'm stuck on, in my, ugh, simplified terms, I guess you could say:


<div class="lastPanel alt">
<div class="h"><h2>Factbox (<a href="/forum/markup#artistwikitags" target="_help" style="color: black;" title="What's This?">?</a>)</h2></div>
<div class="c">
<dl class="sidebarInfoList">
//ANY TEXT/num/characters HERE (WildCard?) <<<PART I NEED HELP!
</dl>
</div>
<div class="f"><span class="iesucks">&nbsp;</span></div>
</div></div>
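
One way the "wildcard" part could be written is as a regular expression: .*? means "any characters, as few as possible", and the s modifier lets it run across line breaks. This is only a sketch under assumptions: the sample page below is a cut-down, made-up version of the markup shown above, and the pattern uses ~ as the delimiter so the slashes in the closing tags need no escaping.

```php
<?php
// Cut-down, made-up stand-in for the real page (nowdoc, so nothing is interpolated).
$page = <<<'HTML'
<dl class="sidebarInfoList">
<dt>Founded</dt><dd>1999</dd>
</dl>
</div>
<div class="f"><span class="iesucks">&nbsp;</span></div>
</div></div>
The artist description.
<div style="margin-top:20px; border-top:1px solid #CCC; padding-top:10px;">
HTML;

// .*?  = wildcard: any text, as little as possible
// \s*  = optional whitespace or line breaks between tags
// (.*?) = the part we want to keep (capture group 1)
$pattern = '~<dl class="sidebarInfoList">.*?</dl>\s*</div>\s*'
         . '<div class="f">.*?</div>\s*</div>\s*</div>'
         . '(.*?)<div style="margin-top:20px;~s';

if (preg_match($pattern, $page, $m)) {
    echo trim($m[1]);
} else {
    echo 'Description not found';
}
```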

[edited by: eelixduppy at 6:54 am (utc) on Feb. 4, 2008]
[edit reason] example.com [/edit]

GamingLoft

3:00 am on Feb 4, 2008 (gmt 0)




OK, I fixed it and it works great. If anyone wants to see it in action:

<snip>

(Not sure if I'll get in trouble for that.)
(Mod, please delete if it's not allowed.)

Here is my completed code:


<?php
$artist = urlencode($_GET['artist']);
$url = 'http://www.example.com/music/'.$artist.'/+wiki';
$file = file_get_contents($url);
if ($file === false) {
    die('Could not load the page.');
}
// Marker that closes the Factbox panel; the description starts right after it.
$factbook2 = '<div class="f"><span class="iesucks">&nbsp;</span></div>
</div></div>';
// Marker that opens the footer; the description ends right before it.
$thatline = '<div style="margin-top:20px; border-top:1px solid #CCC; padding-top:10px;">';
$factpos2 = strpos($file, $factbook2);
if ($factpos2 === false) {
    die('Start marker not found.');
}
// Drop everything up to and including the start marker.
$desc = substr($file, $factpos2 + strlen($factbook2));
$thatlinepos = strpos($desc, $thatline);
if ($thatlinepos !== false) {
    // Drop the end marker and everything after it.
    $desc = substr($desc, 0, $thatlinepos);
}
echo $desc . "<br />\n";
?>

kthx bye.

[edited by: eelixduppy at 6:53 am (utc) on Feb. 4, 2008]
[edit reason] no URLs, please [/edit]