Forum Moderators: coopster
I have about 400 items in the tree, and I'm finding that it takes PHP/MySQL about 5.9 seconds to finish digging. I know this because I put a timer on some parts of the code.
Is there a better, faster way to do this kind of thing? I've optimized my code in every way I can think of, and though I'm somewhat pleased with the 6-second performance, I suspect that there may be a faster, more intelligent way to do it.
I'd love to hear from anyone who has innovative ideas or alternate methods of storing and/or retreiving large hierarchical trees.
I've considered storing the tree in XML or just caching the HTML output, but that presents difficulties since the structure of the tree is prone to changing quite often.
Cheers,
httpwebwitch
(A pared-down version of the function is below.)
function scantree($x){
$query = "SELECT ID,parent FROM tree WHERE parent=".$x." ORDER BY ID";
$result = mysql_query($query);
if (mysql_num_rows($result)>0){
while($row = mysql_fetch_array($result)){
print('<BR>'.$row['ID']);
// recurse the function using self as root:
scantree($row['ID']);
}
}
}
it depends on how your tree is ordered (if at all)
when constructing data structures like this you need to consider retrieval when deciding on how to implement addition of nodes. as usual it's all about compromise, if you 'spend' more on inserting data with an amount of order you will 'save' on retrieval.
if your tree is totally unordered you may have to search nearly all of it to find what you want.
In that case, you may want to try this <snip> that lets you store in disk files the generated HTML or even a serialized version of your query result rows.
When your tree changes, than you can call a class function to invalidated the caches that relate to the changed nodes.
[edited by: jatar_k at 5:30 pm (utc) on Dec. 9, 2003]
[edit reason] no self promo thanks [/edit]
Here's a thought:
would it be faster to read the nodes in one at a time (and in any order), and spend my processing time constructing a complex PHP array which mirrors the tree structure? Instead of hitting the database repeatedly with 400 queries, I could grab one big result set, and parse that in memory using some fancy array footwork. I have no idea whether a bunch of little database hits are more time-consuming than working with large PHP arrays.
And another idea:
What if I stored not only the "self" and the "parent", but I also store the entire "ancestry" lineage along with each node? Could there be faster methods available if each node is more aware of its place in the tree, storing a sequence like "self>parent>parent>parent>parent>root"?
Maybe those two ideas could work together somehow.
It would require some upkeep on the backside to prevent orphans and make sure the lineage is accurate when a branch or node is moved. But in this case, speed is more important than convenience.
If your tree does not change all the time...
Regretfully, it does. Constantly. Thanks for the advice on caching though, there is another situation where I want to do that. That's tomorrow's job. ;-)
So, I'm looking for faster methods of storing and/or processing a tree, rather than ways to keep the tree floating in memory or in a file.
I understand this.. but just thought I'd mention this possibility in case it was of interest to you.
As you mention, the tree changes frequently. So instead of sorting through this data when it's requested and make the user wait six seconds, you could instead spawn a background process that sorts the data as the tree is modified, and then cache those results. At that point, it wouldn't really matter how long this took.. six seconds would be fine since the process runs in the background, and is triggered when the data changes, not when the data is requested.
Short of that, I think your idea of reading the data in one full swoop and then using PHP to parse through it is much better than making hundreds of individual calls to the database. MySQL queries, even though they may be to a local machine, communicate over a socket, and this communication involves network overhead. Having this overhead occur once instead of 400 times would be to your advantage.
I'm sure I could store a "mytree.html" file somewhere on the server, and update that file whenever there is a SQL INSERT, DELETE, or UPDATE. I could even encapsulate the process in a function, which would be called after any of the above events. There might be file locking issues to prevent two processes on the file from being executed simultaneously, but I'll cross that bridge when the waters rise.
I'll try something like that on Monday.
Still, I'd like to hear from anyone that has tried alternate methods of parsing/processing a tree of data objects. I like the elegance of the recursive function, but I would enjoy hearing about other methods.
I don't know if it will offer any insight into your current situation (check out chapter 2), but it's a great book to jog the brain - lots of people think it's the greatest computer science book ever written.
Tom
Also - do you need to select "parent" in the query when it's already defined?
I used to work for a company where we constantly had this problem. In the year I worked there, at least 6 different tree-structure implementations were done, each with their particular problems, and none of them really worked 100%- Ack!
There's absolutely no good reason to recurse the database calls- the recursion should only be used to construct the structure. I would also separate construction from display to make it more maintainable.
Tree structures are usually made with a design pattern called a Composite. A quick Google for "composite design pattern" will yield much useful information. Without much strong typing, I don't think you'll need inheritance in PHP like you would in Java, but much of the information does apply, like considerations about whether children nodes should have a reference to their parent, etc...
HTH
It of course depends on the content ofyour strtucture and what sort of information you need from it.
SN
Based on some of the recent posts, I realized I could explain better what the recusion is being used for. I'm just displaying a navigation tree, a hierarchy of pages, in a style reminiscent of a Windows folder diagram. We've all seen them. I don't need to sort out a complex object for any other use, I'm just throwing it on the screen in nested tables with collapsing and explanding DHTML, those + and - icons, etc.
Over the weekend, it occured to me how displaying a tree could be done without recursion, but it does require saving the complete ancestry of a node, as an array or comma-separated string.
WHERE 1 is the root node, nodes may have ancestries that look like this:
1
1,2
1,2,3
1,2,4
1,5
1,5,6
1,5,7
1,5,8
The last item in the list is the node's own ID number. When the nodes are sorted with "ORDER BY ancestry", they'll come out pre-sorted in a tree-like order. To indent child nodes, I just indent according to the count() of their parentage array.
With this method, there may be some clumsiness when a node is repositioned, i.e. if node "5" is made a child of node "2", then all the childred of "5" need their ancestries re-calculated.
I admit it's not as elegant as a recursive function, but I'm certain it would be faster... But I'm not favouring that idea since the recursion method provides me with an end-of-loop when a group of siblings are exhausted, which is very useful for drawing that indented tree structure.
I'm not even 100% sure you can do just that with PHP; perhaps some people that work with it daily (in OO mode) can clarify.
class TreeNode
{
$id;
$parentId;
$name;
TreeNode($id, $parentId, $name)
{
$this->id = $id;
$this->parentId = $parentId;
$this->name = $name;
}
TreeNode[] children;
boolean hasChildren()
{
return $children.size();
}TreeNode getChildById($id)
{
for (int $i=0; $i<children.size(); $i++)
{
TreeNode $child = children[$];
if ($child->id = $id)
{
return $child;
}
else if ($child->hasChildren())
{
TreeNode $grandChild = $child->getChildById($id);
if ($grandChild!= null) return $grandChild;
}
}
return null;
}
void addChild(TreeNode $child)
{
children[children.size{}+1] = $child;
}String echoHierarchy()
{
$echo = "<ul>\n<li>" . $this-name . "</li>\n";
if ($children.size() > 0)
{
for (int $i=0; $i<children.size(); $i++)
{
$echo .= $children[$i]->echoHierarchy();
}
}
$echo .= "</ul>
}}
//assumes children will have id>parentId; if not will require a stored procedure to avoid nullpointers, or more code.
$sql = "Select id, parent_id, name from menu_items order by id";ResultSet $rs = $cnx.execute($sql);
$root;
while ($rs->hasNext())
{
$id = $rs.get("id");
$parentId = $rs.get("parent_id");
$name = $rs.get("name");
if ($id = 0)
{
$root = new TreeNode($id, $parentId, $name);
}
else
{
$parent = $root->getChildById($parentId);
$parent->addChild(new TreeNode($id, $parentId, $name));
}
}