Forum Moderators: coopster

Message Too Old, No Replies

Import tab indented flat file as nested categories

How would you go about doing this? Thanks!

         

kenmcd

4:37 pm on Aug 29, 2009 (gmt 0)

10+ Year Member



Just looking for a point in the right direction.
A sort of pseudocode example.
My beginner level PHP skills are not enabling me to envision how to do this.
Thanks.

How would you go about importing a tab indented text file into MySQL as nested categories?

The categories are in a list and indented with tabs.
Like this:

LevelOneCat1
----LevelTwoCat1
----LevelTwoCat2
LevelOneCat2
----LevelTwoCat3
----LevelTwoCat4
LevelOneCat3
----LevelTwoCat1
--------LevelThreeCat1
--------LevelThreeCat2
LevelOneCat4
Etc...

The categories are actually indented with tabs (which I cannot use here).

What method would you use to work through this list assigning the category IDs and keeping track of the relationships?

My documentation software does this from a tab indented text list and it makes it very easy to create an outline structure very quickly.

I am trying to adapt his idea to a PHP online directory application to enable creating the directory category structure very quickly and easily.

Just looking for a point in the right direction.

Thank you!

dreamcatcher

8:57 am on Oct 5, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi kenmcd,

If you don`t have a lot of categories, probably best adding them manually in the db one at a time. Might not be worth writing an import routine.

dc

andrewsmd

10:35 pm on Oct 5, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm in the middle of something that might work. I will post back tomorrow if I have something.

juanyjuans

10:45 pm on Oct 5, 2009 (gmt 0)

10+ Year Member



First what fields are in the DB table? You will need a field that indicates the level of each category e.g.

category ¦ level
1 ¦ 1
1 ¦ 2
1 ¦ 2
2 ¦ 1
...

andrewsmd

2:00 pm on Oct 6, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Try this out. This is exactly what my text file looked like.
Note 4 - is a tab

Parent 1
----child 1-1
----child 1-2
--------child 1-2.1
--------child 1-2.2
----child 1-3
parent 2
----child 2-1
--------child 2-1.1
--------child 2-1.2
----child 2-2

I just printed the data because I didn't know what you wanted to do with it. Post back if you need help outputting this in some kind of html.

//the name of your file
$fileName = "temp.txt";

//all of the lines in the file
$lines = file($fileName);

//an array of positions and values
$returnArr = array();

//now go through wherever we had a tab
//and find what position it will be and
//add them to a new array
foreach($lines as $key => $i){

$position = 0;

//keep the spot in the string
$tempSpot = 0;

while(strpos($i, "\t", $tempSpot) !== false){

$position++;
$tempSpot = strpos($i, "\t", $tempSpot) + 1;

}//while

//i have to give it a unique key
//you could use a 2d array but since php
//uses hash tables for arrays this is just easier
$returnArr[$key.";".$position] = $i;

}//foreach $tabArr

//now go through this array and it will tell you what
//positions they were
foreach($returnArr as $key => $i){

//split the key to get just the child spot
$tempArr = explode(";", $key);

//the position
$spot = $tempArr;

echo("This element's spot is $spot over and it's value is $i<br>");

}//foreach

[1][edited by: andrewsmd at 2:02 pm (utc) on Oct. 6, 2009]

andrewsmd

3:36 pm on Oct 6, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Slow day at work for me. Here is an html version. If you need some other format just let me know.
<?php

//the name of your file
$fileName = "temp.txt";

//a marker to keep our spot in the file
$marker = 0;

//all of the lines in the file
$lines = file($fileName);

//an array of positions and values
$returnArr = array();

//now go through wherever we had a tab
//and find what position it will be and
//add them to a new array
foreach($lines as $key => $i){

$position = 0;

//keep the spot in the string
$tempSpot = 0;

while(strpos($i, "\t", $tempSpot) !== false){

$position++;
$tempSpot = strpos($i, "\t", $tempSpot) + 1;

}//while

//i have to give it a unique key
//you could use a 2d array but since php
//uses hash tables for arrays this is just easier
$returnArr[$key.";".$position] = $i;

}//foreach $tabArr

//now go through this array and it will tell you what
//positions they were
foreach($returnArr as $key => $i){

//split the key to get just the child spot
$tempArr = explode(";", $key);

//the position
$spot = $tempArr[1];

if($spot > 0){

for($j = 0; $j < $spot; $j++){

echo("&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;");

}//for

}//if spot > 0

echo($i."<br>");

}//foreach

?>

kenmcd

11:35 pm on Oct 6, 2009 (gmt 0)

10+ Year Member




Thank you very much.
I will give that a try and play with it.

Thanks!

kenmcd

12:16 am on Oct 7, 2009 (gmt 0)

10+ Year Member




@dreamcatcher
The entire reason for this is to be able to add many categories.
The application already has a semi-manual method to add lists of sub-categories to one parent category.
This takes awhile for large directories.
My documentation application makes it very easy to add a very large outline structure with this tabs method.
So that gave me the idea to transfer this method to the PHP directory application.

@juanyjuans
The database keeps the info in two tables
- categories - which is a list of all the categories
- cats_relations - which has the relations like this

category ¦ parent
02 ¦ 01
03 ¦ 01
04 ¦ 01
06 ¦ 05
07 ¦ 05
etc.

@andrewsmd
I can see how you parsed the tabs and created the array. Very cool.
Now I guess I need to put the MySQL code within your output loop to create my database entries.
Going to take me awhile to figure that one out.
So I may not post again for a day or two.

Thank you very much for the help!

andrewsmd

1:15 am on Oct 7, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think I have a quick idea on how you can have your mysql table. I'm not quite for sure I understand your table relationship though. I have done this before and how I did it was to have a table that was something like
id ¦ parent ¦ category

for anything that is a top level category you just have 0 or null in the parent category. So in this case your data would look something like
0 ¦ 0 ¦ parent1
1 ¦ parent 1 ¦ child 1-1
2 ¦ parent 1 ¦ child 1-2
3 ¦ child 1-2 ¦ child 1-2.2
4 ¦ child 1-2 ¦ child 1-2.2
5 ¦ parent 1 ¦ child 1-3
6 ¦ 0 ¦ parent 2
7 ¦ parent 2 ¦ child 2-1
etc.

If your structure is like that (which is what I would recommend) then I can have something for you tomorrow, my code is at work since I've done this before.

kenmcd

3:38 am on Oct 7, 2009 (gmt 0)

10+ Year Member



Hi,

This is the actual table structure for the two tables:

CREATE TABLE `cats_relations` (
`catid` int(11) NOT NULL DEFAULT '0',
`parentid` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`catid`,`parentid`),
KEY `category_parent_id` (`parentid`),
KEY `category_child_id` (`catid`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

CREATE TABLE `categories` (
`catid` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(100) DEFAULT NULL,
`image` char(100) DEFAULT NULL,
`image_position` varchar(10) DEFAULT NULL,
`description` text,
`introtext` varchar(100) DEFAULT NULL,
`published` tinyint(1) DEFAULT NULL,
`checked_out` int(11) DEFAULT NULL,
`checked_out_time` datetime DEFAULT NULL,
`ordering` int(11) DEFAULT NULL,
`access` tinyint(3) DEFAULT NULL,
`count` int(11) DEFAULT NULL,
`params` text,
`icon` varchar(100) NOT NULL,
PRIMARY KEY (`catid`)
) ENGINE=MyISAM AUTO_INCREMENT=4 DEFAULT CHARSET=utf8;

The PHP directory application does allow "unlimited nesting", with some reasonable practical limits.
I think category 0 is accounted for in the code as I have seen it in places.

But each row in the cats_relations table stores only those two fields - catid and parentid.
And then it figures-out the nesting tree from that.

Just re-read and re-read the code above and cannot get my head around it yet. Maybe tomorrow. Been up since around 3a.m. and am now fading fast. Not the best time to be trying to figure this out.

Thanks for your help.

kenmcd

3:39 am on Oct 7, 2009 (gmt 0)

10+ Year Member



Hi,

This is the actual table structure for the two tables:

CREATE TABLE `cats_relations` (
`catid` int(11) NOT NULL DEFAULT '0',
`parentid` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`catid`,`parentid`),
KEY `category_parent_id` (`parentid`),
KEY `category_child_id` (`catid`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

CREATE TABLE `categories` (
`catid` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(100) DEFAULT NULL,
`image` char(100) DEFAULT NULL,
`image_position` varchar(10) DEFAULT NULL,
`description` text,
`introtext` varchar(100) DEFAULT NULL,
`published` tinyint(1) DEFAULT NULL,
`checked_out` int(11) DEFAULT NULL,
`checked_out_time` datetime DEFAULT NULL,
`ordering` int(11) DEFAULT NULL,
`access` tinyint(3) DEFAULT NULL,
`count` int(11) DEFAULT NULL,
`params` text,
`icon` varchar(100) NOT NULL,
PRIMARY KEY (`catid`)
) ENGINE=MyISAM AUTO_INCREMENT=4 DEFAULT CHARSET=utf8;

The PHP directory application does allow "unlimited nesting", with some reasonable practical limits.
I think category 0 is accounted for in the code as I have seen it in places.

But each row in the cats_relations table stores only those two fields - catid and parentid.
And then it figures-out the nesting tree from that.

Just re-read and re-read the code above and cannot get my head around it yet. Maybe tomorrow. Been up since around 3a.m. and am now fading fast. Not the best time to be trying to figure this out.

Thanks for your help.

andrewsmd

1:47 pm on Oct 7, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Your table structure is basically the same as mine, you just don't have an id field in catids. This is not a good practice. While right now it won't present any problems, it could in the future. Why can't you just add another field to your cats_relations table that is an id and is the primary key?

andrewsmd

7:02 pm on Oct 7, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Try this out
<?php

//the name of your file
$fileName = "temp.txt";

//all of the lines in the file
$lines = file($fileName);

//an array of positions and values
$returnArr = array();

//now go through all of the data
foreach($lines as $key => $i){

//reset the number of current tabs
$currentTabs = 0;

//to keep the number of tabs in a previous element
$previousTabs = 0;

//keep the spot in the string
$tempSpot = 0;

//the previous element
$previousKey = $key - 1;

//if this is not true then we need to go back and find the parent
if(strpos($i, "\t", $tempSpot) !== false){

//reset the array
$tempArray = array();

//we do this to count the number of tabs in
//the current given element
while(strpos($i, "\t", $tempSpot) !== false){

$currentTabs++;
$tempSpot = strpos($i, "\t", $tempSpot) + 1;

}//while (strpos($i, "\t", $tempSpot) !== false)

//reset tempspot
$tempSpot = 0;

//get the number of tabs in the previous element
while(strpos($lines[$previousKey], "\t", $tempSpot) !== false){

$previousTabs++;

//increment tempSpot
$tempSpot++;

}//while (strpos($i, "\t", $tempSpot) !== false)

//we have to do this as long as previousTabs is greater than or eqaul to
//current tabs just trust me
while($previousTabs >= $currentTabs){

//reset previous tabs
$previousTabs = 0;
//reset tempspot
$tempSpot = 0;

//go back one more element
$previousKey--;

//find the tabs of the new previous element
while(strpos($lines[$previousKey], "\t", $tempSpot) !== false){

//increment tabs
$previousTabs++;

//increment tempSpot
$tempSpot++;

}//while (strpos($i, "\t", $tempSpot) !== false)
}//while previoustTabs >=

//add them to the array
array_push($tempArray, $lines[$previousKey]);
array_push($tempArray, trim($i));

}//if strpos($i, "\t", $tempSpot) !== false

//else we have no parent and it is a paretn
else{
//reset the array
$tempArray = array();
array_push($tempArray, "0");
array_push($tempArray, trim($i));
}//else

//add the temp array to the return array
array_push($returnArr, $tempArray);

}//foreach $tabArr

//now go through this array and it will tell you what
//positions they were
foreach($returnArr as $i){

echo("Parent is:{$i[0]} and it's child is {$i[1]}<br>");

}//foreach

?>

kenmcd

5:19 am on Oct 9, 2009 (gmt 0)

10+ Year Member




Hi andrewsmd.

The directory application is not mine.
So I have no control over how the database is structured.
The developer is very stubborn and egotistical, and is more concerned with showing off theoretical database structure than real world practicality.
You ought to see the directory entries tables – the "data fields" are stored in the same field sequentially. Theoretically fancy, but practically a giant PITA as things like a simple entries import are not simple.
So your structure comments are probably correct, but I am stuck with it.

The existing import tool is overly complicated and very hard to use. And expensive.
So I am making a FREE simple easy-to-use import tool.
I have the entries import code working.
Just wanted to add a better categories import tool also.

Thank you VERY much for the code.
I think I can merge it with the proper PHP and MySQL code to accomplish my categories import.

I will post back here with my progress.

Thanks again.

.