Forum Moderators: coopster


Pulling data from xml file

         

msthac01

4:14 pm on Feb 13, 2008 (gmt 0)

10+ Year Member



I'm trying to pull and save data from an XML file that I receive after sending a POST request to a remote server. The issue is the layout of the tags, and that I don't have much PHP knowledge. The tags look like the following:

<RESPONDING_PARTY _Name="The Co name" _StreetAddress="The Address" _City="The City" _State="The State" _PostalCode="The ZIP">

How do I go about pulling the data out of a tag laid out like this? Any suggestions?

whoisgregg

6:09 pm on Feb 13, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I recommend taking a look at the PHP XML functions [php.net], particularly the comments on that page.

msthac01

9:26 pm on Feb 13, 2008 (gmt 0)

10+ Year Member



We have another page in our application that takes an XML file received over an HTTP POST, pulls out the data, and inserts it into the db properly. The major difference is the way the XML tags are formed: on the page that works, the tags are very simple, such as <city>data</city><state>data</state>, so it's much easier to pull the data. Would it be possible to modify that code to handle the different tag structure of the XML file I'm dealing with?

Here's the code that pulls the data:

<?php

// Read the whole response file into a string.
$filename = $CreditResponse;
$fp = fopen($filename, 'rb');
$TheSize = filesize($filename);
$xml_dataRead = fread($fp, $TheSize);
fclose($fp);
//echo htmlentities($xml_dataRead);

$usercount=0;      // index of the record currently being read
$userdata=array(); // one entry per LeadInformation record
$state='';         // name of the tag we are currently inside

function startElementHandler ($parser,$name,$attrib)
{
global $state;

// Remember the current tag so characterDataHandler knows where to store the text.
$state=$name;
}

function endElementHandler ($parser,$name)
{
global $usercount;
global $state;

$state='';
// Case folding is on by default, so tag names arrive in upper case.
if(strtoupper($name)=="LEADINFORMATION") {$usercount++;}
}

function characterDataHandler ($parser, $data)
{
global $usercount;
global $userdata;
global $state;

if (!$state) {return;}
if ($state=="CITY") { $userdata[$usercount]["city"] = $data;}
if ($state=="STATE") { $userdata[$usercount]["state"] = $data;}
} // end function

if (!($xml_parser = xml_parser_create())) die("Couldn't create parser.");
xml_set_element_handler($xml_parser, "startElementHandler", "endElementHandler");
xml_set_character_data_handler($xml_parser, "characterDataHandler");

// The file was already read above, so parse that string in one call.
// (A second fread() on the same handle would return nothing.)
if(!xml_parse($xml_parser, $xml_dataRead, true))
{
die("XML error: " . xml_error_string(xml_get_error_code($xml_parser)));
}
xml_parser_free($xml_parser);

// Select the database once, not on every pass through the loop.
mysql_select_db($database_dbConnect, $dbConnect);

for ($i=0;$i<count($userdata); $i++)
{
$city = addslashes(trim($userdata[$i]["city"]));
$state = addslashes(trim($userdata[$i]["state"]));

// (the insert query was not included in the post)
}

?>

So would it be possible to modify this code to handle the fact that there is more data within the tags and to pull that data out?

youfoundjake

2:14 am on Feb 16, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Since I'm in the same boat, I'll mention SimpleXML, which is now supported with PHP5.
[us2.php.net...]
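For anyone who does have PHP5, here is a minimal sketch of reading attributes with SimpleXML. The RESPONSE wrapper tag and the sample values are made up for the example; only the RESPONDING_PARTY tag comes from the first post.

```php
<?php
// Sample input modeled on the tag from the first post.
// The <RESPONSE> wrapper is an assumption, not part of the real feed.
$xml = '<RESPONSE>'
     . '<RESPONDING_PARTY _Name="The Co name" _StreetAddress="The Address"'
     . ' _City="The City" _State="The State" _PostalCode="The ZIP"/>'
     . '</RESPONSE>';

$doc = simplexml_load_string($xml);
if ($doc === false) {
    die('Could not parse the XML');
}

// Attributes are read with array-style access on the element.
$party = $doc->RESPONDING_PARTY;
$name  = (string) $party['_Name'];
$city  = (string) $party['_City'];

// attributes() walks every attribute without knowing the names in advance.
$all = array();
foreach ($party->attributes() as $key => $value) {
    $all[$key] = (string) $value;
}

echo "$name, $city\n";
```

The (string) casts matter because SimpleXML hands back objects, not plain strings.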

msthac01

1:56 pm on Feb 19, 2008 (gmt 0)

10+ Year Member



Yeah, that looks like the best solution. Unfortunately I'm on PHP 4.4.4 :(

dreamcatcher

9:07 pm on Feb 19, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Might be worth asking your host about an upgrade. It's kind of encouraged that everyone move to PHP5 now, as the PHP dev team discontinued PHP4 development at the end of December 2007.

Just a thought. Most hosts should now be going with PHP5.

dc

PHP_Chimp

10:11 pm on Feb 20, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You are trying to read the values from the attributes, so that should not be too much of a problem.

The code below is mainly copied from some code that I use, so there may well be bits that you don't need, but it's late so this is a quick reply ;)


$_parser = xml_parser_create('ISO-8859-1');
// keep tag and attribute names exactly as written in the file
xml_parser_set_option($_parser, XML_OPTION_CASE_FOLDING, 0);
xml_parser_set_option($_parser, XML_OPTION_SKIP_WHITE, 1);
// xml_set_object($_parser, $this); // only needed when the handlers are methods of a class
xml_set_element_handler($_parser, 'tag_open', 'tag_close');
// xml_set_character_data_handler($_parser, 'tag_contents');
function tag_open ($parser, $tag, $attr) {
switch ($tag) {
case 'RESPONDING_PARTY':
// you can access the attributes here; $attr holds them as name => value pairs
foreach ($attr as $key => $value) {
echo "$key : $value<br />\n";
}
break;
default:
// whatever
break;
}
}
// you need to supply a tag_close function

The details are on the xml_set_element_handler [uk2.php.net] page.

PHP4 sucks for XML, but you can still work with everything. It's just a lot more bother than it is with PHP5.
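As a rough end-to-end sketch of the handler approach above, this version feeds a string to the parser and keeps the attributes in an array instead of echoing them. The RESPONSE wrapper, the handler names, and the $party variable are all made up for the example.

```php
<?php
// Sample document; the <RESPONSE> wrapper is an assumption.
$xml = '<RESPONSE>'
     . '<RESPONDING_PARTY _Name="The Co name" _City="The City" _State="The State"/>'
     . '</RESPONSE>';

$party = array(); // filled in by the start-tag handler

function party_tag_open($parser, $tag, $attr)
{
    global $party;
    if ($tag == 'RESPONDING_PARTY') {
        // $attr is an associative array: attribute name => value
        $party = $attr;
    }
}

function party_tag_close($parser, $tag)
{
    // nothing to do, but the parser still needs a close handler
}

$parser = xml_parser_create();
// Without this option the names would arrive upper-cased (_NAME, _CITY, ...).
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler($parser, 'party_tag_open', 'party_tag_close');

// true marks this as the final (here: only) chunk of data.
if (!xml_parse($parser, $xml, true)) {
    die('XML error: ' . xml_error_string(xml_get_error_code($parser)));
}

echo $party['_Name'] . ' / ' . $party['_City'] . "\n";
```

From there $party['_Name'], $party['_City'] and so on can go straight into whatever insert query you already use.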

msthac01

4:25 am on Feb 22, 2008 (gmt 0)

10+ Year Member



I found some code on one of the sites I've been cruising through and managed to get it to read the file, but if there are any similar tags then it overwrites them with the last one it comes across. How can I get it to append each similar tag and not overwrite?

$file = $filename;
$depth = 0;
$tree = array();
$tree['name'] = "root";
$stack[] = &$tree; // the stack starts with the root node

function startElement($parser, $name, $attrs) {
global $depth;
global $stack;
global $tree;

// copy the attributes into a new element, with lower-case keys
$element = array();
foreach ($attrs as $key => $value) {
$element[strtolower($key)]=$value;
}

// attach the element to its parent (the last entry on the stack);
// a second tag with the same name lands on the same key and replaces the first
end($stack);
$stack[key($stack)][strtolower($name)] = &$element;
$stack[strtolower($name)] = &$element;

$depth++;
}

function endElement($parser, $name) {
global $depth;
global $stack;

array_pop($stack);
$depth--;
}

$xml_parser = xml_parser_create();
// xml_set_object ( $xml_parser, $this ); // only valid inside a class
xml_set_element_handler($xml_parser, "startElement", "endElement");
//xml_set_character_data_handler ( $xml_parser, "tagContent" );
if (!($fp = fopen($file, "r"))) {
die("could not open XML input");
}

while ($data = fread($fp, filesize($file))) {
if (!xml_parse($xml_parser, $data, feof($fp))) {
die(sprintf("XML error: %s at line %d",
xml_error_string(xml_get_error_code($xml_parser)),
xml_get_current_line_number($xml_parser)));
}
}
fclose($fp);
xml_parser_free($xml_parser);
// end() wants a variable, so take the two steps separately
$top = end($stack);
$tree = end($top);
echo "<pre>";
print_r($tree);
echo "</pre>";

PHP_Chimp

5:22 pm on Feb 22, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



To append information to a variable you can use .=
So something like -

$test = 'This is ';
$test.= 'a';
$test.= ' test';
echo $test;

Will give you 'This is a test'

So if you change -


foreach ($attrs as $key => $value) {
//$element[strtolower($key)]=$value;
$element[strtolower($key)] .= $value;
}

that should append the attributes together.

<edit>
If there is not going to be any content within the tags that you want, then you can get rid of the xml_set_character_data_handler, as this works with the contents/data within each tag.

[edited by: PHP_Chimp at 5:23 pm (utc) on Feb. 22, 2008]
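An alternative to joining strings with .= is to keep a list per tag name, so each repeated tag gets its own entry. This is only a sketch: the sample XML, the handler names, and the $found variable are made up for the example.

```php
<?php
// Small sample with a repeated tag name.
$xml = '<xml>'
     . '<tag1 attr1="first" />'
     . '<tag2 attr1="other" />'
     . '<tag1 attr1="second" />'
     . '</xml>';

$found = array(); // tag name => list of attribute arrays

function collect_open($parser, $tag, $attrs)
{
    global $found;
    if (!empty($attrs)) {
        // [] appends a new entry, so a repeated tag never replaces an earlier one
        $found[$tag][] = $attrs;
    }
}

function collect_close($parser, $tag)
{
    // nothing to do
}

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler($parser, 'collect_open', 'collect_close');

if (!xml_parse($parser, $xml, true)) {
    die('XML error: ' . xml_error_string(xml_get_error_code($parser)));
}

echo count($found['tag1']) . " tag1 entries\n";
echo $found['tag1'][1]['attr1'] . "\n";
```

Each value stays separate this way, which is usually easier to save to a db than one long joined string.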

msthac01

9:14 pm on Feb 22, 2008 (gmt 0)

10+ Year Member



Alright, I'm at the point where I can read the data into an array, but now I'm stuck on what I thought would be the hardest part of this. When I receive my data it can be of varying length, and tags can be repeated any number of times, which brings up my issue: because of the varying length of the file, I don't know how to account for this when saving the data to the SQL db. I wanted to save a particular set of tags into one table and the rest of the data into another table, but how do I account for multiple tags? Here's an example:

This tag can appear anywhere from 1 to XX times, and I want to save it to a separate table so I can account for multiple instances of the tag. I figure the best option would be to pull these tags into a separate file and then read and save from there, but how can I pull just these tags out?

<CREDIT_LIABILITY CreditLiabilityID="TRD0000" BorrowerID="100252" CreditFileID="B-EFX-01" CreditTradeReferenceID="CTR0000" _AccountIdentifier="N/A" _AccountOpenedDate="1997-07" _AccountOwnershipType="Individual" _AccountReportedDate="2001-11" _AccountStatusDate="2001-11" _AccountStatusType="Open" _AccountType="Revolving" _DerogatoryDataIndicator="N" _HighCreditAmount="6000" _LastActivityDate="2001-10" _MonthlyPaymentAmount="58" _MonthsReviewedCount="47" _TermsDescription="MONTHLY" _TermsSourceType="Provided" _UnpaidBalanceAmount="5157" CreditLoanType="UnknownLoanType">
<_CREDITOR _Name="CITI" _StreetAddress="P.O. BOX 6500" _City="SIOU FALLS" _State="SD" _PostalCode="57117">
</_CREDITOR>
<_CURRENT_RATING _Code="1" _Type="AsAgreed"/>
<_LATE_COUNT _30Days="0" _60Days="0" _90Days="0"/>
<CREDIT_COMMENT>
<_Text>AMT IN HIGH CREDIT IS CREDIT LIMIT</_Text>
</CREDIT_COMMENT>
<CREDIT_REPOSITORY _SourceType="Equifax" _SubscriberCode="906BB00289"/>
</CREDIT_LIABILITY>
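One possible way to handle this without a separate file is to build one array row per CREDIT_LIABILITY while parsing, then loop over the rows when saving. The sketch below uses a shortened version of the block above; the RESPONSE wrapper, the second liability (TRD0001, EXAMPLE BANK), and the handler and variable names are all made up for illustration.

```php
<?php
// Shortened sample: the first liability is cut down from the post,
// the second one and the <RESPONSE> wrapper are invented.
$xml = '<RESPONSE>'
     . '<CREDIT_LIABILITY CreditLiabilityID="TRD0000" _UnpaidBalanceAmount="5157">'
     . '<_CREDITOR _Name="CITI" _State="SD"></_CREDITOR>'
     . '<_CURRENT_RATING _Code="1" _Type="AsAgreed"/>'
     . '</CREDIT_LIABILITY>'
     . '<CREDIT_LIABILITY CreditLiabilityID="TRD0001" _UnpaidBalanceAmount="120">'
     . '<_CREDITOR _Name="EXAMPLE BANK" _State="NY"></_CREDITOR>'
     . '</CREDIT_LIABILITY>'
     . '</RESPONSE>';

$liabilities = array(); // one finished row per CREDIT_LIABILITY
$current     = null;    // the row being built, or null when outside a liability

function liability_open($parser, $tag, $attrs)
{
    global $current;
    if ($tag == 'CREDIT_LIABILITY') {
        $current = $attrs;       // start a new row from the tag's own attributes
    } elseif (is_array($current)) {
        $current[$tag] = $attrs; // child tag: keep its attributes under its name
    }
}

function liability_close($parser, $tag)
{
    global $current, $liabilities;
    if ($tag == 'CREDIT_LIABILITY') {
        $liabilities[] = $current; // row finished, keep it
        $current = null;
    }
}

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler($parser, 'liability_open', 'liability_close');
if (!xml_parse($parser, $xml, true)) {
    die('XML error: ' . xml_error_string(xml_get_error_code($parser)));
}

// Each row would become one insert into the separate table.
foreach ($liabilities as $row) {
    echo $row['CreditLiabilityID'] . ' owes ' . $row['_UnpaidBalanceAmount']
       . ' to ' . $row['_CREDITOR']['_Name'] . "\n";
}
```

The text inside <_Text> is not captured here; that would need a character data handler as well.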

msthac01

9:49 pm on Feb 22, 2008 (gmt 0)

10+ Year Member



I think the preg_match_all function is what I need in this instance. I just can't seem to figure out the right way to code the expression to get the tags I want out, and I'll also have to loop through the results somehow to make sure every occurrence of the tag is saved to the db. Anyone familiar with the preg_match_all function?
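For what it's worth, a rough preg_match_all sketch might look like the following. It assumes attribute values are always in double quotes and never contain a > character, and it does not decode entities such as &amp;amp;, so a real XML parser is the safer route. The sample data and variable names are invented.

```php
<?php
// Two sample liabilities; the second one is made up.
$xml = '<CREDIT_LIABILITY CreditLiabilityID="TRD0000" _AccountType="Revolving">'
     . '<_CREDITOR _Name="CITI" _State="SD"></_CREDITOR></CREDIT_LIABILITY>'
     . '<CREDIT_LIABILITY CreditLiabilityID="TRD0001" _AccountType="Installment">'
     . '</CREDIT_LIABILITY>';

// Step 1: grab every opening <CREDIT_LIABILITY ...> tag.
// Group 1 is the raw attribute text of each tag.
preg_match_all('/<CREDIT_LIABILITY\s+([^>]*)>/', $xml, $tags);

$rows = array();
foreach ($tags[1] as $attrText) {
    // Step 2: split the attribute text into name="value" pairs.
    preg_match_all('/([\w:-]+)="([^"]*)"/', $attrText, $pairs, PREG_SET_ORDER);
    $row = array();
    foreach ($pairs as $pair) {
        $row[$pair[1]] = $pair[2]; // $pair[1] = name, $pair[2] = value
    }
    $rows[] = $row;
}

// Every occurrence ends up as its own row, ready to loop over for the db.
foreach ($rows as $row) {
    echo $row['CreditLiabilityID'] . ': ' . $row['_AccountType'] . "\n";
}
```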

msthac01

7:38 pm on Feb 25, 2008 (gmt 0)

10+ Year Member



Well, after some advice from other people on some forums, I'm looking into using domxml to pull the tags that can appear multiple times so I can save them to a separate db. The issue is I can't seem to get the DOM stuff to work. I uncommented it in my php.ini file and recycled to get it loaded, yet when I try the following code:

<?php

$theFile = file_get_contents($CreditResponse);

$newstring=utf8_encode($theFile);

if (!$dom = domxml_open_file($newstring)) {
echo "Error while parsing the document\n";
echo print_r($error);
exit;
}

$root = $dom->document_element();

?>

My if statement always fails and prints the generic error message. I'm not really sure why it's not working. I echoed out $newstring and got my entire XML file, so it seems to be something with the domxml_open_file command. I just don't know what to check... Any help would be greatly appreciated.
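One thing that may be worth checking in the code above: domxml_open_file() expects a filename, but $newstring holds the file's contents. As far as I know, the domxml extension has domxml_open_mem() for parsing a string. The sketch below shows the same file-versus-string distinction using PHP5's DOMDocument, which is an assumption here since it is not available on PHP 4.4.4; the sample XML and variable names are made up.

```php
<?php
// Sample data; the <RESPONSE> wrapper and the second liability are invented.
$xmlString = '<RESPONSE>'
           . '<CREDIT_LIABILITY CreditLiabilityID="TRD0000" _AccountType="Revolving"/>'
           . '<CREDIT_LIABILITY CreditLiabilityID="TRD0001" _AccountType="Installment"/>'
           . '</RESPONSE>';

$dom = new DOMDocument();

// loadXML() takes the XML text itself; load() would expect a path or URL.
// Mixing the two up is one way to end up with a parse failure.
if (!$dom->loadXML($xmlString)) {
    die("Error while parsing the document\n");
}

// Collect the ID of every CREDIT_LIABILITY, however many there are.
$ids = array();
foreach ($dom->getElementsByTagName('CREDIT_LIABILITY') as $node) {
    $ids[] = $node->getAttribute('CreditLiabilityID');
}

echo implode(', ', $ids) . "\n";
```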

PHP_Chimp

9:11 pm on Feb 27, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member




<?php
// $data = file_get_contents($file); // slurp the file into a string
// to test use this
$data = <<<FILE
<xml>
<tag1 attr1="value 1" attr2="value 2" />
<tag2 attr1="value 1" />
<tag1 attr1="value 1 again" attr2=" value 2 again" />
<tag2 attr1="value 1.2 again" />
<tag3 attr1_3="this is different" />
</xml>
FILE;
$xml_parser = xml_parser_create('ISO-8859-1');
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, 0);
xml_parser_set_option($xml_parser, XML_OPTION_SKIP_WHITE, 1);
xml_set_element_handler($xml_parser, 'startElement', 'endElement');
//xml_set_character_data_handler ( $xml_parser, "tagContent" );
$out = array();// this will store the output
$depth = 0;
function startElement($xml_parser, $tag, $attr) {
global $out, $depth;
if(empty($attr)) {
$out[$depth] = $tag;
$depth++;
}
else { // there are attributes
// non-overwriting
foreach($attr as $key => $value) {
$out[$depth][$tag][strtolower($key)] = $value;
}
$depth++;
}
}
function endElement($xml_parser, $tag) {
global $out;
// do nothing as we are only adding these values to an array.
}
if (!xml_parse($xml_parser, $data, true)) { // true: this is the final (and only) chunk
die(sprintf("XML error: %s at line %d",xml_error_string(xml_get_error_code($xml_parser)),xml_get_current_line_number($xml_parser)));
}
xml_parser_free($xml_parser);
echo '<pre>';
print_r($out);
echo '</pre>';
echo $data;
?>

This will give you all of the tags, including their attributes, listed; none are overwritten.
Finding what you need should then be a matter of looking for the tags that you want.
Hopefully this is what you were after :)
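Looking for the tags you want could be done with a small helper like this sketch. The $out array here is typed in by hand to match the shape the script above prints, and find_tags is a made-up name.

```php
<?php
// Same shape as the $out array from the script above (sample values).
$out = array(
    0 => 'xml',
    1 => array('tag1' => array('attr1' => 'value 1', 'attr2' => 'value 2')),
    2 => array('tag2' => array('attr1' => 'value 1')),
    3 => array('tag1' => array('attr1' => 'value 1 again', 'attr2' => ' value 2 again')),
);

// Return the attribute arrays of every entry whose tag name matches $wanted.
function find_tags($out, $wanted)
{
    $matches = array();
    foreach ($out as $entry) {
        // Tags without attributes are stored as plain strings, so skip those.
        if (is_array($entry) && isset($entry[$wanted])) {
            $matches[] = $entry[$wanted];
        }
    }
    return $matches;
}

$tag1 = find_tags($out, 'tag1');
echo count($tag1) . " tag1 entries\n";
```

Each item in $tag1 is then one set of attributes that can be saved as its own db row.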