Thanks to the discussions here on WW, I now understand that a certain "automated" website need, as in a website that acts like today's blogs and requires minimal webmastering later on, is best met by a CMS written in PHP, along with regular XHTML. However, I want to go a step further and make the items published on this website, including text articles, images, and audio files, highly searchable, sortable, and mineable, and most importantly, future-proof.
Why not make them ready for RSS feeds today, too?
I've heard that XML may be the basis of, or at least part of, the answer to this requirement.
I also believe that the Semantic Web [w3.org] will be very relevant one day, though not so much today, since it is still obscure to developers and not yet practical for publishers. So how can I make all the items on my website (articles, images, etc.) future-proof? I mean, how can I make them ready for a Semantic Web one day? And ready for a possible shift or upgrade from XHTML to an XML web one day, along with the metadata that comes with it? The goal is to minimize administration later on, and to save hundreds of hours of rebuilding all this stuff or adding vital information to it (metadata, for instance) so that it complies with future web standards.
To give an example: how can my articles and other items be made fairly ready for so-called RDF browsers, like Longwell [simile.mit.edu]?
I know there will be work to do in the future anyway with these revolutionary upgrades, but how can I minimize that work 5 years from now, for instance?
Any advice, guidelines, directions, or recommendations of subjects and technologies to study would be much appreciated. Thanks to anyone taking the time to help.
db -> transform -> (x)html/xml/csv/pdf/etc
db -> xml -> transform -> (x)html/xml/csv/pdf/etc
db -> interface -> partners/suppliers/clients/etc
The idea of being "future proof" is a kinda shaky one; perhaps "developability" is a better aim.
Can you elaborate just a little bit on the three lines that you wrote there? I didn't quite understand what you meant by them.
You can pull the data out and use asp/php/jsp/etc + fast native db interface to write out the data into the output format. Or you can pull out data in XML format and use XSLT to transform it into the output format. Or you may want to implement a public XML interface e.g. using SOAP (like the Google API) open to affiliates or resellers. Or you can spit out a csv file of products every 24 hours for Froogle.
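To make the first route (db -> transform -> output) a bit more concrete, here is a rough sketch in Python rather than PHP, purely for illustration. The `articles` table, its columns, and the function names are all made up for the example; a real CMS schema would look different.

```python
import csv
import html
import io
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database with a hypothetical 'articles' table."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # lets us read columns by name
    conn.execute(
        "CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO articles (title, body) VALUES (?, ?)",
        [("Hello & welcome", "First post."), ("Second", "More text.")],
    )
    return conn


def fetch_articles(conn):
    """Pull the raw data out once; every output format starts from these rows."""
    query = "SELECT id, title, body FROM articles ORDER BY id"
    return [dict(row) for row in conn.execute(query)]


def to_html(articles):
    """Transform the rows into an HTML fragment, escaping the text content."""
    items = "\n".join(
        f"<li><h2>{html.escape(a['title'])}</h2><p>{html.escape(a['body'])}</p></li>"
        for a in articles
    )
    return f"<ul>\n{items}\n</ul>"


def to_csv(articles):
    """Transform the same rows into CSV, e.g. for a periodic product/article dump."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["id", "title", "body"])
    writer.writeheader()
    writer.writerows(articles)
    return buf.getvalue()


if __name__ == "__main__":
    rows = fetch_articles(make_demo_db())
    print(to_html(rows))
    print(to_csv(rows))
```

The point is only that the rows are fetched once and each output format is a separate, small transform over them, so adding a new format later does not touch the stored content.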
Using a db also means you don't need to use XHTML, and can use HTML as the public-facing web page format. If you need to deliver an XML format as output, you can just do another transform on the raw data, without having to mess about stripping website bits (e.g. navigation) out of XHTML web pages.
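As an illustration of "just do another transform on the raw data", here is a small sketch that turns article rows into an RSS 2.0 style feed using Python's standard library. The row layout, the `to_rss` name, the example.com address, and the `/articles/<id>` link pattern are assumptions made up for this example, not anything a particular CMS provides.

```python
import xml.etree.ElementTree as ET


def to_rss(articles, site_title, site_link):
    """Serialize raw article rows (dicts with id/title/body) as an RSS 2.0 feed."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = site_title
    ET.SubElement(channel, "link").text = site_link
    ET.SubElement(channel, "description").text = f"Latest items from {site_title}"
    for a in articles:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = a["title"]
        # Hypothetical URL scheme; use whatever the site really serves.
        ET.SubElement(item, "link").text = f"{site_link}/articles/{a['id']}"
        ET.SubElement(item, "description").text = a["body"]
    # ElementTree escapes characters such as '&' in text content for us.
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    rows = [
        {"id": 1, "title": "Hello & welcome", "body": "First post."},
        {"id": 2, "title": "Second", "body": "More text."},
    ]
    print(to_rss(rows, "Demo Site", "https://example.com"))
```

Because the feed is built from the rows and not from rendered pages, there is no navigation or layout markup to strip out first.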
XML is a great data exchange format (transferring data from one closed system to another) but too inefficient as a storage format in a closed system. If you start using lots of XML files there inevitably comes a point where you need to index them (in a database) to get decent performance. And then you start thinking "why are we using XML when we're just importing it into a database anyway...?" :)
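For what that indexing step tends to look like, here is a minimal sketch: a handful of XML documents get parsed and their searchable fields copied into a database table. The `<article>` layout with `<title>` and `<author>` children, and the table and function names, are hypothetical and chosen only for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_index(xml_docs):
    """Index a mapping of {file name: XML text} into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE doc_index (name TEXT PRIMARY KEY, title TEXT, author TEXT)"
    )
    conn.execute("CREATE INDEX idx_author ON doc_index (author)")
    for name, text in xml_docs.items():
        root = ET.fromstring(text)
        conn.execute(
            "INSERT INTO doc_index (name, title, author) VALUES (?, ?, ?)",
            (name, root.findtext("title"), root.findtext("author")),
        )
    return conn


def find_by_author(conn, author):
    """Look up documents via the index without re-parsing any XML."""
    rows = conn.execute(
        "SELECT name FROM doc_index WHERE author = ? ORDER BY name", (author,)
    )
    return [r[0] for r in rows]


if __name__ == "__main__":
    docs = {
        "a.xml": "<article><title>One</title><author>ann</author></article>",
        "b.xml": "<article><title>Two</title><author>bob</author></article>",
        "c.xml": "<article><title>Three</title><author>ann</author></article>",
    }
    print(find_by_author(build_index(docs), "ann"))
```

Once the fields live in the table, the XML files are only needed for export, which is the "why not just use the database" moment described above.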