Now $string holds all the HTML/JavaScript contents.
How do I use PHP to extract the JavaScript content?
For example, the JavaScript contains:
var topic_MGID = new Array("1567564", "1568929", "1568951", "1561037", "1568755", "1534196", "1561747", "1567650", "1568941", "1568950", "1566588", "1549010", "1565487", "1567621", "1568940", "1566540", "1568899", "1568844", "1557462", "1568945", "1568745", "1567921", "1567565", "1568839", "1568057", "1568912", "1568382", "1568375", "1568760", "1568867", "1568626", "1511621", "1567480", "1567207", "1566951", "1568739", "1567096", "1560900", "1567283", "1568833");
How do I use PHP to extract the contents of that JavaScript array?
Thanks
[edited by: jatar_k at 4:22 pm (utc) on April 15, 2005]
[edit reason] fixed sidescroll [/edit]
Well, looking at that string, you could split it on commas and then chop the front off. You could probably use a regex as well.
Get everything between the ( and ), split it on commas, and strip the double quotes afterwards if you like. It really depends on what you want to do with it.
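The approach described above could be sketched roughly as follows. This is only an illustration, assuming PHP 7 or later; the helper name extract_js_array_values() is made up for this example, and it assumes the array values themselves never contain commas or closing parentheses.

```php
<?php
// Hypothetical helper (not from the thread): pull the quoted values out of a
// JavaScript "var NAME = new Array(...)" statement held in a PHP string.
function extract_js_array_values(string $source, string $varName): array
{
    // Capture everything between the ( and ) of the named array.
    $pattern = '/var\s+' . preg_quote($varName, '/')
             . '\s*=\s*new\s+Array\s*\((.*?)\)/s';
    if (!preg_match($pattern, $source, $m) || trim($m[1]) === '') {
        return [];
    }

    // Split on commas, then strip surrounding whitespace and double quotes.
    $values = [];
    foreach (explode(',', $m[1]) as $item) {
        $values[] = trim($item, " \t\n\r\"");
    }
    return $values;
}

// Shortened sample input based on the array in the question.
$string = '<script>var topic_MGID = new Array("1567564", "1568929", "1568951");</script>';
print_r(extract_js_array_values($string, 'topic_MGID'));
```

Trimming each item separately means it does not matter whether the source has a space after each comma.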
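// "s" lets . match across newlines, "i" ignores tag case,
// and [^>]* allows attributes on the opening <script> tag
preg_match('#<script[^>]*>(.*?)</script>#is', $string, $java);
print_r($java);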
That will display the match: $java[0] will be the match with the tags included, and $java[1] will be the match inside the <script> tags.
If you have more than one occurrence of <script> tags, you could use preg_match_all instead of preg_match. It adds another level to the array, so look at the print_r($java) output.
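For the multiple-tag case, something along these lines might work. The sample HTML is invented for illustration, and the pattern is a loosened variant that also accepts attributes and upper-case tags.

```php
<?php
// Sample input (made up for illustration): two script blocks in one page.
$string = '<p>intro</p><script type="text/javascript">var a = 1;</script>'
        . '<p>more</p><SCRIPT>var b = 2;</SCRIPT>';

// "i" ignores tag case, "s" lets "." match newlines,
// and [^>]* allows attributes on the opening tag.
preg_match_all('#<script[^>]*>(.*?)</script>#is', $string, $java);

// $java[0] holds the full matches (tags included),
// $java[1] holds only the contents of each block.
foreach ($java[1] as $i => $code) {
    echo "Block $i: $code\n";
}
```

With preg_match_all, $java[1] is itself an array with one entry per script block, which is the extra level mentioned above.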
And I used the split() function to split them:
$array_MGID1 = split ("\".\"", $topic_MGID1);
I got this. You can see that [0] has "1568032 and [39] has 1566903". How do I take those two quotation marks off as well?
split ("\".\"", $topic_MGID1); <-- that's how I split them
Array
( [0] => "1568032 [1] => 1569051 [2] => 1557462 [3] => 1568479 [4] => 1569061 [5] => 1567564 [6] => 1569064 [7] => 1569062 [8] => 1568617 [9] => 1568929 [10] => 1569057 [11] => 1566951 [12] => 1539172 [13] => 1568836 [14] => 1540625 [15] => 1567650 [16] => 1568987 [17] => 1561747 [18] => 1569058 [19] => 1568970 [20] => 1568988 [21] => 1567565 [22] => 1568755 [23] => 1567480 [24] => 1557378 [25] => 1568239 [26] => 1569054 [27] => 1567163 [28] => 1568976 [29] => 1477551 [30] => 1569025 [31] => 1567262 [32] => 1568968 [33] => 1568057 [34] => 1532183 [35] => 1569011 [36] => 1511621 [37] => 1561037 [38] => 1569041 [39] => 1566903" )
[edited by: jatar_k at 6:09 pm (utc) on April 15, 2005]
[edit reason] fixed sidescroll [/edit]
Replace this:
$array_MGID1 = split ("\".\"", $topic_MGID1);
with this:
$array_MGID1 = split ("\".\"", substr($topic_MGID1,1,-1));
That ignores the first and last characters of the original string, which in this case are the quotes.
You could also use:
$array_MGID1 = explode("\",\"", substr($topic_MGID1,1,-1));
It produces the same result, but it is quicker because explode() splits on a literal string instead of a regular expression.
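As a quick check of the substr() plus explode() suggestion, here is a minimal sketch. The three-value input is a shortened stand-in for the real data, and it assumes there are no spaces after the commas.

```php
<?php
// Assumed input: the text between the parentheses, no spaces after commas.
$topic_MGID1 = '"1568032","1569051","1557462"';

// substr(..., 1, -1) drops the first and last character (the outer quotes),
// leaving 1568032","1569051","1557462, which splits cleanly on ",".
$array_MGID1 = explode('","', substr($topic_MGID1, 1, -1));

print_r($array_MGID1);
```

If the source does have a space after each comma, the delimiter would need to be '", "' instead.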
I'm writing an RSS feed, so the output has to be valid XML.
But I'm parsing a Chinese Big5 web site.
It has all those funny symbols which are not compatible with XML.
Is there any way to take out all the incompatible characters and replace them with something else,
so that it will be 100% valid XML?
Thanks