Forum Moderators: coopster
The text file is as follows :
DATE//1st line not needed
@!$#@#$# //2nd line not needed
#@@#$@#$ //3rd line not needed
#@$@#$@# //4th line not needed
//line break
Record "n" of "total"-->INTO COLUMN 1
//line break
DN: Database Name
VALUE-->INTO COLUMN 2
TI: Title
VALUE-->INTO COLUMN 3
AU: Author
VALUE(1);VALUE(2);...;VALUE(N)-->INTO COLUMNS 4,5...uptil N
AF: Affiliation
VALUE(1);VALUE(2);...;VALUE(N)-->INTO COLUMNS 5,6...uptil N
SO: Source
VALUE--> INTO COLUMN 6 (if AU & AF are 1 value each)
DE: Descriptors
VALUE-->INTO COLUMN 7 (")
PY: Publication Year
VALUE-->INTO COLUMN 8 (")
//line break
Record "n+1" of "total"
.
.
.
.
.
.
.
.
.
.
.
.
(Note: The above format is EXACTLY how the file exists)
Initially I used editplus to clean the text file using search and replace and successfuly converted it into a csv file (delimited by ~) and imported it onto the mysql db and it looks fine. I need to do the search & replace process in php though and I needed a jumpstart. :)
Thanks
/\\/
I have only one question: what do you mean by columns 4,5...N?
and then for the second time: columns 5,6...N?
How many rows do you want to input to db?
Could you provide me with two exemplary files with outputs, that you would want to have in db?
Reagards
Michal
GIVEN FILE FORMAT:
Thu Feb 22 15:19:55 EST 2007
CSA
Database: PsycINFO
Query: SO=(Journal of Marketing)
Record 1 of 486
DN: Database Name
PsycINFO
TI: Title
The editor's corner
AU: Author
Lincoln, Douglas J (1)
AF: Affiliation
(1)Boise State University, ID, US
SO: Source
Journal of Marketing Education. Vol 27(3), Dec 2005, pp. 211
DE: Descriptors
*Education; *Marketing; *Teaching; Career Development; Critical
Thinking; Debates; Learning; Management; Organizational Structure;
Professional Development; Satisfaction; Technology
PY: Publication Year
2005
Record 2 of 486
DN: Database Name
PsycINFO
TI: Title
Introducing Professional and Career Development Skills in the
Marketing Curriculum
AU: Author
Kelley, Craig A (1); Bridges, Claudia (1)
AF: Affiliation
(1)Department of Management, California State University, Sacramento,
Sacramento, CA, US
SO: Source
Journal of Marketing Education. Vol 27(3), Dec 2005, pp. 212-218
DE: Descriptors
*Career Development; *Curriculum; *Marketing; *Professional
Development; Curriculum Development
PY: Publication Year
2005
(CONTINUES TILL RECORD # 486)
REQUIRED FILE FORMAT :
1~PsycINFO~The editor's corner~Lincoln, Douglas J(1)~(1)Boise State University, ID, US~Journal of Marketing Education. Vol 27(3), Dec 2005, pp. 211~*Education; *Marketing; *Teaching; Career Development; Critical Thinking; Debates; Learning; Management; Organizational Structure; Professional Development; Satisfaction; Technology~2005
2~PsycINFO~Introducing Professional and Career Development Skills in the Marketing Curriculum~Kelley, Craig A(1)~Bridges, Claudia(1)~(1)Department of Management, California State University, Sacramento, Sacramento, CA, US~Journal of Marketing Education. Vol 27(3), Dec 2005, pp. 212-218~*Career Development; *Curriculum; *Marketing; *Professional Development; Curriculum Development~2005
(CONTINUES TILL RECORD# 486)
what i meant by the columns 4,5...N is that if there are more than one author for a paper then I would need to store each author in a separate column.
similar for affiliations...
-/\ \/
<?php
//read file with file [php.net] function$lines = file('file_or_url_to_read.txt');
$processing = 0; // a temp variable to show us what line are you processing (of the record, not file)
$matches = '';//initialization of matches variable
foreach ($lines as $line_num => $line) {
if(!$processing) if(preg_match('@(^Record )([0-9]+)( of [0-9]+$)@'), $line, ) $processing = 1;
if(!$processing) continue;//here comes the check part
//but please let me continue this tomorrow ok?
sorry for inconvenience
Michal
<?php
$lines = file('file_or_url_to_read.txt');//read file
//$lines = explode("\n", $str);//use this if you read file as a string$final_arr = array();
$processing = 0; // a temp variable to show us what line are you processing (of the record, not file)
$matches = '';//initialization of matches variable
$record = '';
$title_end = 0;
$author_end = 0;
$aff_end = 0;
$source_end = 0;
$desc_end = 0;foreach ($lines as $line_num => $line) {
if(!$processing) if(preg_match('@(^Record )([0-9]+)( of [0-9]+)@', $line, $matches)) $processing = 1;
if(!$processing) continue;//here comes the check part
$processing++;
echo $processing;
switch($processing) {
case 2://this is first row
$record = $matches[2];
$final_arr[$record]['record'] = $record;
break;
case 5:
$final_arr[$record]['dn'] = $line;
default:
break;
}
if($processing < 7) continue;if(!$title_end){
//you've got database name now and number of record
//now is title:
if(!preg_match('@^(AU: Author)@', $line)){ $final_arr[$record]['title'] .= $line; continue;}
else {$title_end = 1; continue;}
}
if(!$author_end){
//now is author:
if(!preg_match('@^(AF: Affiliation)@', $line)) { $final_arr[$record]['author'] .= $line; continue;}
else {$author_end = 1; continue;}
}
if(!$aff_end){
//now is author:
if(!preg_match('@^(SO: Source)@', $line)){ $final_arr[$record]['affiliation'] .= $line; continue;}
else {$aff_end = 1; continue;}
}
if(!$source_end){
//now is author:
if(!preg_match('@^(DE: Descriptors)@', $line)){ $final_arr[$record]['source'] .= $line; continue;}
else {$source_end = 1; continue;}
}
if(!$desc_end){
//now is author:
if(!preg_match('@^(PY: Publication Year)@', $line)){ $final_arr[$record]['description'] .= $line; continue;}
else {$desc_end = 1; continue;}
}
$final_arr[$record]['publication'] = $line;
$processing = 0; // reset all temp variables
$matches = '';
$record = '';
$title_end = 0;
$author_end = 0;
$aff_end = 0;
$source_end = 0;
$desc_end = 0;
}print_r($final_arr);
?>
I tested it and I think the result should satisfy you.
Regards
Michal
PS. I'm so good :D
it really gave me some perspective....and oh yeah ure pretty darn good :)
I ran the code and it gives out a very satisfactory array. Now I would like to know whether this array can be imported into a mysql database such that each variable in the array ([record],[title]..so on..) can be put into columns..(would appreciate complete sql code..for this one..major noobing here :)]
Alternatively...i did some looking around and came across some places where the array can be converted directly into xml data (which is the desired result at the end)....
would buy a beer in the event i meet a 'michal' :P...
thanks
To mysql:
foreach($final_arr as $row) {
$secure = array_map('mysql_real_escape_string', $row);
$sql = "INSERT INTO `table` ('dn', 'ti', 'au', 'af', 'so', 'de', 'py')
VALUES('{$secure['dn']}', '{$secure['title']}', '{$secure['author']}',
'{$secure['affiliation']}', '{$secure['source']}', '{$secure['description']}',
'{$secure['publication']}')";
mysql_query($sql) or die("Error: ".mysql_error());
}
This should put the data into db.
Regards
Michal
PS. Regarding the beer I'm in Poland now, but am travelling a lot, so no problem for me ;)
[edited by: coopster at 5:23 pm (utc) on Mar. 26, 2007]
[edit reason] removed link to non-authoritative site [/edit]