Forum Moderators: coopster & phranque

Message Too Old, No Replies

Schema for Articles DB

Best way to do it?

         

Nick_W

5:23 pm on Mar 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi all,

I'm building an articles DB that will be able to take multiple pages for each article. I did this once but it was over a year ago, I no longer have the code and certainly not the memory!

So, table would look like this right?

¦pageid¦parent¦title¦desc¦keywds¦text¦

If that's correct, my most immediate question would be: What considerations need to be taken into account with the whole parent/page thing?

Do I give the starting page a parent of NULL or 0?

Any pointers from those that have done this more often than me would be greatly appreciated :)

Cheers

Nick

Allen

12:05 am on Mar 15, 2003 (gmt 0)

10+ Year Member



Another one who hasn't stated the language they'll be using.

I don't think it really matters as long as you keep to one or the other. Purely through my own coding style, I tend to use NULL for values not on the table just in case 0 becomes a valid id for some weird reason. Now it's habit.

BTW. I code in both Perl & PHP.

Allen

jatar_k

12:18 am on Mar 15, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



language they'll be using

since the topic is db schema, one wouldn't think it would matter ;)

NULL or 0?

Probably more of a personal preference thing, I would probably go with 0.

What considerations need to be taken into account with the whole parent/page thing?

sorry Nick, don't know exactly what you mean.
Could be just because it's friday afternoon though

ukgimp

12:22 am on Mar 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Nick

Have a look at this mutha, it might give you an idea or two

ht*tp://www.devarticles.com/art/1/310

andreasfriedrich

12:47 am on Mar 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



What about author infos and a time stamp?

Does each page of an article really has separate title¦desc¦keywds?

If the answers are yes, yes, no think normalization.

Sorry that´s all I´m capable of right now. Could be just because it's friday night though.

>>since the topic is db schema, one wouldn't
>>think it would matter

but one would wonder why this thread is in "Perl [perl.com] and PHP [php.net] CGI Scripting".

Andreas

Nick_W

7:18 am on Mar 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks all,

Yes andreas, timestamp and author-id of course, just didn't want to complicate the issue ;)

>normalization - Yep, good point should have included those details.

I'd have to put the kws in a seperate table right? - What about the timestamp? - dont need one for each page...

Thanks

Nick

Woz

8:08 am on Mar 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>What considerations need to be taken into account with the whole parent/page thing?
It depend on how you are going to present the information.

When you view the opening page of the article do you want all the subpages in the group listed and do you also want them listed as you view all the subpages? If so then a simple parallel parent ID would suffice enabling all pages in the group to display links to all other pages in the group.

However, if you want to force the visitor to read the article sequentially then a serial parent/child structure needs to be applied. This means that each page/subpage would needs to be able to display links to both the previous and next subpage requiring two calls to the database if a simple parent id is used. Or, if you add fields for the T/D/K for both parent and child into each row you could build all the strings with one database call. However, the trade ofs are a slight increase in database size PLUS some rigid coding on the admin side to ensure whenever a T/D/K is edited it is also updated in the relevant parent or child.

>Does each page of an article really has separate title¦desc¦keywds?
Yes!

>If the answers are yes, yes, no think normalization.
No!

>I'd have to put the kws in a seperate table right?
No No No! (Hehe)

Make the Titles/Descriptsion/Keywords specific to each sub-page of the article and target them to the content of the subpage. If you use the same T/D/K for each subpage you are missing out on a lot of keyphrases and therefor traffic.

>What about the timestamp? - dont need one for each page...
I always included DateLastEdited as standard for internal admin but you could also use that to display the date the article or subpage was last edited.

DejaVu, was just thinking about this myself for a personal project. Great Minds and all of that.

Onya
Woz

Nick_W

8:35 am on Mar 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks Woz,

I agree with almost everything there, but I either don't understand you or disagree with the parent/child thingy...

I was thinking of adding an article id also so the schema would be

pageid¦parent¦artid¦title¦desc¦kws¦text

Then when a user hits a subpage (i need to display all pages to article on every page) the query would go:

(assuming I get the artid when the page is called by user)

SELECT pageid FROM tbl WHERE artid = artid

Make sense, or is it too early to be messing with this stuff? ;)

Nick

aspdaddy

11:26 am on Mar 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Theres quite a few ways of doing this,personally I would separate the page and article entities and go for something like this:

article(id,title,desc,content,numPages, date)
page(id,parent_id,title,desc,content,date)

I think the acid test for any design is how easy are the queries, how easy is it to add to/extend the design, and how does it handle ref integrity on deletes/updates etc.

brotherhood of LAN

11:36 am on Mar 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



deja vu with the thread...where's the other one you posted Nick? Wasnt all that diff from this one, less to do with the page sequence thing ;)

how about storing it as bits in the one field. say, 4 bits for the number of pages, and another 4 to indicate which position the page is in on the full article.

the bits dont necessarily have to be numbers....

Woz

11:45 am on Mar 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hmmm, let me preface here by saying I am used to Access/ASP. I have no experience with mySQL or PHP or PERL so I am not sure if what I am suggesting would be entirely applicable across the board but the concepts should be transportable.

As I see it there are two ways of presenting the pages of an article:-

1) List all pages in the article on all the pages so the reader can jump around however they want, as in the MonkeyWeb articles, (Group Navigation) or

2) List only the Previous and Next pages in the series to force the reader along a predetermined path, as in the devArticles article mentioned above (Serial Navigation).

devArticles' approach is to have two tables, one (articles) listing the overall article information, the second (pages) listing the pages within the article linked by a foreign key.

Listing articles is one call to the database cycling through the article table. Listing pages within a particular article requires two calls, the first grabbing the overall information, the second the calling up all pages linked to the article.

Displaying the individual pages may also require two calls if you needed article informaion as well as page information.

I tend to try and put things in one table, or generate a simply query within the database itself, if I can to minimize the number of calls and connections. Thus I would do either of these ways:-

1) Group Navigation

tblSource = entryID¦parentID¦title¦desc¦kws¦text

To start a new article a new entry is made with a blank parentID. This is the header for the article and contains the overall T/D/K along with a synopsis in the text area.

To add a page a new entry is made with the parentID set to the entryID of the article and contains the page specific T/D/K and the page content in the text area.

To list all possible articles

SELECT * from tblSource WHERE parentID is null; ' and order however you want to.

To list pages in the article needs a little more processing and assumes passing two variable in the query string:-

display?ArticleID=x&PageID=Y

NB PageID = the ID of the page (entryID)
-- ArticleID = the ID of the article (parentID)
(With some simple parsing you can actually combine them into one variable and then split them before processing which is less server intensive than extra database calls. You would then have something like display?A14P17 which would equal ArticleID = 14 PageID = 17.)

SELECT * from tblSource WHERE entryID = ArticleID OR parentID = ArticleID order by entryID;

while not rs.EOF

if entryID = PageID then
print text field and set metatags from T/D/K fields
else
print navigation link using the entryID and Title
end if

ors.next

wend

This displays the page text and generates links for all the other pages in the group. It also assume that the pages are entered into the database in the correct order which would not be altered in any way and thus depends on entryID ascending for the order. If the order was to be altered then perhaps an additional field would be needed which could be set as desired. IF so then additional error checking would be needed in the adimn section to assure correct ordering.

So far so good.

2) Serial Navigation

tblSource = entryID¦parentID¦title¦desc¦kws¦text

I would then make a query, stored proceedure in msSQL, linking the page with its parent entry and child entry and grab the title from both resulting in:-

qrySource = entryID¦parentID¦childID¦title¦desc¦kws¦text¦parentTitle¦childTitle

Again, to list all articles

SELECT * from tblSource WHERE entryID is null; ' and order to your choice.

To display a page

display?ID=x

SELECT * from qrySource WHERE entryID = ID

generate metatags from T/D/K

generate content from text field

previous page = <a href 'diaplay?ID=parentID'>parentTitle</a>
next page = <a href 'diaplay?ID=childID'>childTitle</a>

Done!

This way if the order of the pages were to be changed you would simply reallocate parentIDs and they would be picked up in the process of displaying. You would need to build error checking into the admin side to check against two pages having the same parent or orphan pages but it should not be too hard.

Yet another approach if you wanted to number the pages might be to use something like

tblSource = entryID¦parentID¦pagenumber¦title¦desc¦kws¦text

The article header would have pagenumber set to 0 and then subsequent pages numbered accordingly. You could still then use the above techniques and simply look for

ParentID = x and pagenumber = y to get Page y of Article x.

All food for thought. Hope this makes sense Nick, thanks for starting the thread, it made me put the thinking cap on for my own project.

Onya
Woz

Nick_W

12:25 pm on Mar 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks alot everyone, I love these types of discussions, you really learn stuff!

Remember I said I'd done this once before but it was a long time ago? - It all comes flooding back now.

I did indeed have two tables, almost exactly as you've a couple of you have mentioned. One for the article info, one for page info.

That works for me, and seems the most effecient. I love the idea of an article 'synopsis' - would make a good 'nemu' of articles ;)

Thanks everyone!

Nick