Internet Directory Structure Design

Making it scalable and practical

         

ukgimp

2:11 pm on Sep 5, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I want a decent hierarchy for a web directory, and I am going round in circles trying to find the best way of classifying sites. At present I classify by subject and then have an option to add a country, but that does not allow for any greater depth.

The flip side is country/region/sub-region followed by the subject, but that is crap as well, as there could be a region-specific category that caters for different subjects (Yorkshire ¦ housing) as well as one for a town in Yorkshire that has something about housing.

I find myself going round in circles. How do you cope with these sorts of issues? Be aware that it is not intended to be the ODP or Yahoo, so that level of granularity is not required.

Then, once the type of structure has been defined, there is the matter of sourcing a suitable set of CSVs so you don't have a lifelong job of creating the hierarchy, let alone the content.

Any ideas?

Cheers

PatrickDeese

11:40 pm on Sep 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I believe that is what symbolic links are used for.

i.e. in:

country > region > city > widgeteers > widget polishing

you put a sym. link to

business > widgeteers > widget polishing

etc.
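
A minimal sketch of the symbolic-link idea, assuming categories are stored as path strings. The names `SYMLINKS`, `LISTINGS`, `resolve` and `listings_for`, and the sample listing, are made up for illustration and do not come from any particular directory script.

```python
# Hypothetical data: each symlinked path points at the canonical category path.
SYMLINKS = {
    "country/region/city/widgeteers/widget polishing":
        "business/widgeteers/widget polishing",
}

# Listings are stored once, under the canonical path only.
LISTINGS = {
    "business/widgeteers/widget polishing": ["Acme Polishing Ltd"],
}


def resolve(path: str) -> str:
    """Follow symlinks until a canonical category path is reached."""
    seen = set()
    while path in SYMLINKS:
        if path in seen:  # guard against a link loop
            raise ValueError(f"symlink loop at {path!r}")
        seen.add(path)
        path = SYMLINKS[path]
    return path


def listings_for(path: str) -> list[str]:
    """Return the listings shown when a visitor browses to `path`."""
    return LISTINGS.get(resolve(path), [])
```

Browsing by geography or by business type then ends up at the same stored listings, so nothing has to be entered twice.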

ukgimp

10:13 am on Sep 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Patrick

Even with the breadcrumbs (easy bit :)) you still need to have the hierarchy set out in the db.

That is where my difficulty lies.

Cheers

killroy

11:32 am on Sep 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The breadcrumbs ARE a hierarchy, just a different display.

In my own directory, which I've been running since 1999, I have a many-to-many relationship, which is basically the same as, or at least similar to, symlinks.

Personally I find it nonsensical to try to fit everything into a single category. Is a local Yellow Pages directory something regional, advertising, reference, information, business?

I think it's all of the above.

If your system permits it, I urge you to list things in all matching categories. After all, it's no good having something listed if the visitor is looking in the wrong category.

SN

jmccormac

11:57 am on Sep 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It is a complex task, ukgimp, but there are two basic ways to do it: a numerically based structure or gradual accretion.

The numeric structure uses a set of numbers to represent each topic and a separate table can carry the topic names to number correspondences. A second table can carry aliases that map to these numbers. This is largely how the ODP is constructed but ODP's design breaks down in that categories/topics seem to be added in an erratic manner. It is very important to get the initial design right.

The closest analogy would be a phone number type system, with the main headers being: [aaaa] [bbbb] [cccc] [dddd]

This would be the raw numeric structure. The topic structure would be: Topic(n) = aaaabbbbccccdddd

It would also be possible to create an alias as part of the topic table or as a separate table with the topic as a reference.

Alias(x), Topic(n), aaaabbbbccccdddd
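
The phone-number style code can be sketched roughly as below. This is only an illustration under stated assumptions: four levels of four digits each, level numbers starting at 1, and 0000 meaning "level not used". The helper names (`encode`, `decode`, `parent`) and the sample topics and alias are hypothetical.

```python
WIDTH = 4   # digits per level, matching [aaaa] [bbbb] [cccc] [dddd]
LEVELS = 4  # number of groups in the code


def encode(levels: list[int]) -> str:
    """Pack up to four level numbers into a 16-digit topic code."""
    if len(levels) > LEVELS:
        raise ValueError("too many levels")
    # Unused trailing levels are padded with zeros (assumption: 0 = unused).
    padded = list(levels) + [0] * (LEVELS - len(levels))
    if any(not 0 <= n < 10 ** WIDTH for n in padded):
        raise ValueError("each level must fit in four digits")
    return "".join(f"{n:0{WIDTH}d}" for n in padded)


def decode(code: str) -> list[int]:
    """Split a 16-digit code into its level numbers, dropping trailing zeros."""
    parts = [int(code[i:i + WIDTH]) for i in range(0, WIDTH * LEVELS, WIDTH)]
    while parts and parts[-1] == 0:
        parts.pop()
    return parts


def parent(code: str) -> str:
    """Return the code one level up (the last used group zeroed out)."""
    return encode(decode(code)[:-1])


# Hypothetical topic and alias tables keyed by the numeric code.
TOPICS = {encode([12]): "Business", encode([12, 3]): "Business/Widgeteers"}
ALIASES = {"Commerce/Widget Makers": encode([12, 3])}
```

Because the parent of any topic can be computed from the number alone, the names in the topic and alias tables can change without touching the codes that listings refer to.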

The ODP has the right idea but the accretion of categories is the most dangerous thing for any directory. It is all too easy to add a category to a directory but it is difficult to keep the category up to date.

I did an implementation of the ODP for Irish sites about a year and a half ago and I still have my notes on it around here somewhere. The big problem with ODP, as far as I remember, was that its implementation of a numeric structure was flawed. The most important thing is that the lowest element in all this is the website/review. Everything is built upon that. It will have a unique code (the aaaabbbbccccdddd number) that will allow it to be part of other tables.

It takes some planning to get a good structure though - the ODP structure is not a good one to start from. The best way would be to approach it clearly with perhaps an industry/commercial/domestic categorisation like the Yellow Pages in parallel with an ODP-like structure. Using an alphabetical structure is not a good thing, as the frequency distribution of the English language means that some letters will have few or no entries and others like C, M, S etc. will be overloaded.

Regards...jmcc

killroy

12:29 pm on Sep 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Let's just say it's a vast topic.

During the first three years of my directory we still had hour-long discussions about the categories at least once a week. And it's a relatively small national Yellow Pages directory.

I must have spent days analysing the category structures of various YP/webdir sites and trying to merge them, but they were all sufficiently different to make it clear there was no good single solution.

That's why in the end I opted for a system of m2m:

Item pages: i1,i2,i3,i4,i5,...
categories: c1,c2,c3,c4,c5,...

where each category is in this form:
c1: cat1
c2: cat1/subcat1
c3: cat2
c4: cat1/subcat2
c5: cat1/subcat2/subsubcat1
.
.
.

and then the links: i1-c1,i1-c3,i2-c1,i2-c2,i3-c1,i4-c3,...
You get the idea.

then of course you have the aliases:
Cars/Car Rental == Rental/Cars
Cars/Car Rental == Auto/Auto Hire

and so on...
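
A rough sketch of that many-to-many layout using Python's built-in `sqlite3` module. The table and column names, the `items_in` helper and the "Travel" category are assumptions for illustration; the item-to-category links reuse the i1-c1, i1-c3, i2-c1, i2-c2, i3-c1 pairs from the list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item     (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE category (id INTEGER PRIMARY KEY, path  TEXT NOT NULL UNIQUE);
    -- The link table is the many-to-many part: one row per (item, category) pair.
    CREATE TABLE item_category (
        item_id     INTEGER NOT NULL REFERENCES item(id),
        category_id INTEGER NOT NULL REFERENCES category(id),
        PRIMARY KEY (item_id, category_id)
    );
    -- Aliases map an alternative path onto a real category.
    CREATE TABLE alias (
        path        TEXT PRIMARY KEY,
        category_id INTEGER NOT NULL REFERENCES category(id)
    );
""")

conn.executemany("INSERT INTO item VALUES (?, ?)",
                 [(1, "i1"), (2, "i2"), (3, "i3")])
conn.executemany("INSERT INTO category VALUES (?, ?)",
                 [(1, "Cars"), (2, "Cars/Car Rental"), (3, "Travel")])
conn.executemany("INSERT INTO item_category VALUES (?, ?)",
                 [(1, 1), (1, 3), (2, 1), (2, 2), (3, 1)])
conn.executemany("INSERT INTO alias VALUES (?, ?)",
                 [("Rental/Cars", 2), ("Auto/Auto Hire", 2)])


def items_in(path: str) -> list[str]:
    """List item titles for a category path, accepting an alias as well."""
    row = conn.execute(
        "SELECT id FROM category WHERE path = ? "
        "UNION SELECT category_id FROM alias WHERE path = ?",
        (path, path),
    ).fetchone()
    if row is None:
        return []
    rows = conn.execute(
        "SELECT item.title FROM item "
        "JOIN item_category ON item_category.item_id = item.id "
        "WHERE item_category.category_id = ? ORDER BY item.id",
        (row[0],),
    )
    return [title for (title,) in rows]
```

With this layout, listing an item in one more category is a single extra row in `item_category`, and an alias never duplicates any listings.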

SN

penfold25

1:21 pm on Sep 7, 2003 (gmt 0)

10+ Year Member



hey u can check out my directory structure if u want, it's a niche, but might be helpful.

Is it going to be general or a niche?

ukgimp

9:33 am on Sep 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>Let's just say its a vast topic.

You are not wrong :)

>>right idea but the accretion of categories is the most dangerous

Hence my need to get it right in the first place. If I set down the structure first, there should be no real need to keep adding more and more subcategories.

>>there was no good single solution.

This is the problem I am facing.

>>Is it going to be general or a niche?

The topic is very vertical but the classification could be very, very granular. For example, let's take the ODP and slice off everything that is not related to online shops. Bear with me now :). Each shop can have multiple levels of supplier audience (worldwide, UK, West Midlands, Birmingham). The shop can also sell multiple types of things, though that side is not as complex (children's clothing, adult clothing, etc.). But it is conceivable that a clothing shop might be worldwide, yet someone might like to browse into clothing and narrow down to county from there. I suppose it is like having two browse structures, as a particular listing could have two categories it can be found under.

The more I think of this the more confused I get :).

>>and then the links: i1-c1,i1-c3,i2-c1,i2-c2,i3-c1,i4-c3,...

I have the links set out as unique numbers 1.php, 2.php where each number is the autonumber from the category parent/child table. Then each entry has that unique number associated with it. I suppose there is a possibility that a record could be present in more than one cat but then that gets more confusing still. Lol
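
For what it is worth, a parent/child table like that is enough to generate the breadcrumbs. The sketch below assumes each row carries its own autonumber id, a parent id and a name; the sample rows, the `breadcrumb` helper and the "example-shop" listing are hypothetical.

```python
# Hypothetical rows from a parent/child category table:
# id (the autonumber) -> (parent id, or None for a top-level category, name)
CATEGORIES = {
    1: (None, "Clothing"),
    2: (1, "Children"),
    3: (1, "Adult"),
    4: (None, "UK"),
    5: (4, "West Midlands"),
    6: (5, "Birmingham"),
}


def breadcrumb(cat_id: int) -> list[tuple[str, str]]:
    """Walk up the parent links and return (name, url) pairs from the root down."""
    trail = []
    current = cat_id
    while current is not None:
        parent_id, name = CATEGORIES[current]
        trail.append((name, f"{current}.php"))  # pages are named after the autonumber
        current = parent_id
    trail.reverse()
    return trail


# A listing that sits in more than one category just stores several ids.
LISTING_CATEGORIES = {"example-shop": [2, 6]}
```

Here `breadcrumb(6)` walks Birmingham, West Midlands, UK and returns them root first, and a listing held in two categories simply gets one trail per id.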

I will report back when I have done more.

killroy

11:07 am on Sep 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, if you want to check out my directory you're welcome. It has a multi-level, multi-category, multi-regionality categorisation.

In fact it automatically builds the industries under the regional levels from the opposite end: industries first, with city focus afterwards.

It's been running for four years and serving thousands of businesses and lots more visitors.

SN

claus

11:32 am on Sep 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Nice to see a directory thread for once ;)

>> there is a possibility that a record could be present in more than one cat

This is, I believe, your basic challenge. If you have one basic topical grid and a geographic column as well in your dataset, it is fairly easy to make the regional dimension as well, just like killroy stated, but...

Firms do span several areas. In my dir, I face this with the topics "webdesign", "advertising", "design" and a few other cats. Some firms just need to be in two places unless you want to decide where the firm should go. If the directory is pay-for-inclusion, this is easy, as then you leave it up to the company (and their budget), but otherwise it can be hard to judge.

Instead of the "top-down", you could try a "bottom-up" approach: for each link in your database, design a set of keywords (industry1, industry2, geo1, geo2, keywordX, keywordY) and then build the cats automagically from these.
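
A small sketch of the bottom-up approach, assuming every link record carries a list of industry keywords and a list of geographic keywords. The record layout, the example URLs and the `build_categories` helper are invented for illustration.

```python
from collections import defaultdict

# Hypothetical link records, each tagged with industry and geography keywords.
LINKS = [
    {"url": "a.example", "industries": ["webdesign", "advertising"],
     "geos": ["UK", "Yorkshire"]},
    {"url": "b.example", "industries": ["design"], "geos": ["UK"]},
    {"url": "c.example", "industries": ["webdesign"], "geos": ["Ireland"]},
]


def build_categories(links):
    """Derive categories from the keywords instead of defining them up front."""
    cats = defaultdict(list)
    for link in links:
        for industry in link["industries"]:
            # Topical browse: one category per industry keyword.
            cats[industry].append(link["url"])
            for geo in link["geos"]:
                # Regional browse: the same industry nested under each geo keyword.
                cats[f"{geo}/{industry}"].append(link["url"])
    return dict(cats)
```

Only categories that actually contain a link come into existence, so there are no empty branches to maintain, and a firm that spans two topics appears under both without anyone having to pick one.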

/claus


Added:

You will need to invent new cats and subcats as time goes by. I think it is totally improbable that you can devise a structure from the start that will not need to expand or change at some point. So, it's got to be flexible from the start. Cats merge and split, some become obsolete and new ones enter.

I just have to mention this as well: there are the ISIC codes and similar industry groupings used mainly in statistics, and these are good starting points.

ukgimp

11:45 am on Sep 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Some half-decent country code sites, which can be adapted depending on your needs.

www.abcounties.co.uk/counties/list.htm
www.crwflags.com/fotw/flags/region.html
www.internetworldstats.com/list1.htm
www.geohive.com/global/gen_regions.php
www.iso.org/iso/en/prods-services/iso3166ma/02iso-3166-code-lists/list-en1.html
h*ttp://unstats.un.org/unsd/methods/m49/m49regin.htm
www.un.org/depts/dhl/maplib/worldregions.htm

As for the rest, still ongoing :)

killroy

11:50 am on Sep 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"You will need to invent new cats and subcats as time goes by. I think it is totally improbable that you can devise a structure from the start that will not need to expand or change at some point. So, it's got to be flexible from the start. Cats merge and split, some becomes obsolete and new ones enter."

I concur 1000%.

You will learn, with experience, that there is no such thing as a perfect or even adequate dir structure.

Just do some research on offline yellow pages categorisation, or even compare popular online directory categorisation, and you will find HUGE discrepancies and disagreements.

SN

Brad

9:41 pm on Sep 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>"You will need to invent new cats and subcats as time goes by. I think it is totally improbable that you can devise a structure from the start that will not need to expand or change at some point. So, it's got to be flexible from the start. Cats merge and split, some becomes obsolete and new ones enter."

Yes. There is a lot to be said for getting your top level cats down and a few second level cats and then letting it grow organically.

A directory script that lets you move subcategories and their contents around is a big plus.
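
As an illustration of why moving is cheap when categories are stored as parent/child rows: re-parenting one row carries the whole subtree with it. This is a sketch under that assumption, with a hypothetical `PARENT` table and helper names, not a description of any particular script.

```python
from typing import Optional

# Hypothetical adjacency list: category id -> parent id (None for top level).
PARENT = {1: None, 2: 1, 3: 2, 4: None}


def is_descendant(node: int, ancestor: int) -> bool:
    """True if `node` is `ancestor` itself or sits somewhere below it."""
    current: Optional[int] = node
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT[current]
    return False


def move(cat_id: int, new_parent: Optional[int]) -> None:
    """Re-parent a category; its subcategories follow because they point at it."""
    if new_parent is not None and is_descendant(new_parent, cat_id):
        raise ValueError("cannot move a category underneath itself")
    PARENT[cat_id] = new_parent
```

Listings keyed by category id do not need touching at all, since only the one parent pointer changes.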

rcjordan

9:50 pm on Sep 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>directory script that lets you move subcategories and their contents around is a big plus.

Directory scripts should also have multi-category and related category capability (and sub-cats, too) or you'll hit the wall pretty quickly. However, this 'feature' can also be a script's Achilles Heel --the matrix gets HUGE. 500 categories X 5 subcategories X 3 sub-subcategories and even simple directories require a map in order to manage it.
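
To put rough numbers on that matrix, here is a quick calculation. It assumes the figures mean 5 subcategories under each of the 500 categories and 3 sub-subcategories under each of those; `node_counts` is just an illustrative helper.

```python
def node_counts(branching: list[int]) -> list[int]:
    """Number of categories at each level, given children per node at each level."""
    counts = []
    total = 1
    for b in branching:
        total *= b
        counts.append(total)
    return counts


levels = node_counts([500, 5, 3])
print(levels, sum(levels))  # prints [500, 2500, 7500] 10500
```

That is over ten thousand category pages before a single related-category link has been added.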

ukgimp

9:27 am on Sep 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> related category capability

Behave rcjordan, I have not even got my hierarchy sorted and you have to throw that big fat spanner into the works :)

I know that each dir is different in its requirements, but I have certain requirements of my own. For me there are two main types of classification. The first is location, which could be multi-level:

World ¦ region ¦ county ¦ region ¦ possible town

That is how I would see the main structure. Each listing can then have another classification eg

Clothing ¦ women’s ¦ trollies

So the main browse is used to find a site, and another parent/child table is set up so each record has two classifications and hence two possible ways of navigating down. I see a drop-down list for the second listing, à la

Clothing
--womens
-------trollies
-------shoes
-- mens

etc etc.
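
A drop-down like that can be generated straight from the second parent/child table. The sketch below assumes rows of (id, parent id, name) and flattens them depth-first; the sample rows mirror the list above, and `dropdown_options` is a hypothetical helper.

```python
# Hypothetical second classification as (id, parent id, name) rows.
ROWS = [
    (1, None, "Clothing"),
    (2, 1, "womens"),
    (3, 2, "trollies"),
    (4, 2, "shoes"),
    (5, 1, "mens"),
]


def dropdown_options(rows, indent="--"):
    """Flatten the tree depth-first into (id, label) pairs for a select list."""
    children = {}
    for cat_id, parent_id, name in rows:
        children.setdefault(parent_id, []).append((cat_id, name))

    options = []

    def walk(parent_id, depth):
        # Children are emitted in the order they appear in `rows`.
        for cat_id, name in children.get(parent_id, []):
            options.append((cat_id, indent * depth + name))
            walk(cat_id, depth + 1)

    walk(None, 0)
    return options
```

Each pair gives the value to store against the listing and the indented label to show the submitter, so the form stays a single select box however deep the tree goes.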

That could allow for two different ways of browsing. I want to keep the submissions easy as Joe Public will be filling them out.

Cheers