

How To Convert A Wikipedia Dump File (XML file) to PHP/MySQL Database?

I have a huge XML file and want to make it human readable using PHP/MySQL

         

cosmoyoda

8:12 pm on Nov 12, 2007 (gmt 0)

10+ Year Member



Hi guys...

As some of you know, Wikipedia lets users download its entire database, full of articles etc., from the Wikipedia encyclopedia. The download can be found here: [download.wikipedia.org...]

I downloaded one of these files, but for the Wiktionary database, so I now have a very big XML file (about 60-100 MB) containing a database dump of the Wikimedia servers for the Wiktionary site (en.wiktionary.org). It has all the words in the "Wikipedia Dictionary", but the XML file is too big and not really human readable. I was wondering if there is a way to convert it and load it into my MySQL database, so I can create new PHP files in Dreamweaver and present the data in a new way on my own web site.

NOTE: Wikipedia content is free to use as long as I cite the source, so don't worry about copyright violation here. All I need to know is how to parse or convert this huge XML file and put it in a MySQL database, so I can use PHP to make it human readable on my site.

I downloaded the Wiktionary database first because it's not as big as the Wikipedia article dump, just for simplicity's sake. Once I succeed in getting this file into my database, I might do the same with the Wikipedia data dump.

Thank you for your help! I've tried searching the forums, but it seems no one has asked this question before. Hopefully you guys can help me on this one. Thank you.

PHP_Chimp

8:43 pm on Nov 12, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



PHP has built-in XML parsing functions. Are you using PHP 4 or 5?
XML parsing with PHP 4 is a bit of a pain; with PHP 5 it is a lot nicer.
Check these out:
php4 - [uk2.php.net...]
php5 - [uk2.php.net...]

They will help you work through the content.
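
For a file this size, a streaming parser is probably the safer choice, since it reads one node at a time instead of loading the whole document into memory. Here is a minimal sketch using PHP 5's XMLReader. It assumes the usual MediaWiki export layout (`<page>` elements containing `<title>` and, inside `<revision>`, `<text>`). `iteratePages()` is a made-up helper name, and the generator syntax (`yield`) needs PHP 5.5 or later.

```php
<?php
// Sketch only: iteratePages() is a hypothetical helper, not part of PHP
// or MediaWiki. It assumes each <page> holds a <title> and a <text>.
function iteratePages($path)
{
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new RuntimeException("Cannot open $path");
    }

    $page = null; // the page currently being collected, if any
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT) {
            // localName ignores any namespace prefix on the element.
            if ($reader->localName === 'page') {
                $page = array('title' => '', 'text' => '');
            } elseif ($page !== null && $reader->localName === 'title') {
                $page['title'] = $reader->readString();
            } elseif ($page !== null && $reader->localName === 'text') {
                // With several revisions per page, the last one wins.
                $page['text'] = $reader->readString();
            }
        } elseif ($reader->nodeType === XMLReader::END_ELEMENT
            && $reader->localName === 'page'
            && $page !== null) {
            yield $page; // hand one finished page to the caller
            $page = null;
        }
    }
    $reader->close();
}

// Usage (the file name is a placeholder):
// foreach (iteratePages('wiktionary-dump.xml') as $page) {
//     echo $page['title'], "\n";
// }
```

Only one page is held in memory at a time, so the size of the dump should matter much less than it does when opening the file in an editor.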

cosmoyoda

8:52 pm on Nov 12, 2007 (gmt 0)

10+ Year Member



Yes, I am using the latest PHP 5 version.

I will try these links. Thanks a lot!

However, the XML file is very big. I am using Dreamweaver to create my website, and opening that big file in Dreamweaver never works; it just freezes the whole program. :(

That's why I first need to put the data from this Wikipedia data dump XML file into my database. Is that possible?

PHP_Chimp

11:30 pm on Nov 12, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The XML functions should let you parse the file and pull out what you want.
You could either store the XML file in a database or keep it as a plain file. Either way, you can then read it and work through it with the PHP XML functions to get what you want.
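
Once the parsed data is available as rows, loading it into MySQL could look roughly like the sketch below, which uses PDO with a prepared statement. Everything named here is an assumption for illustration: the `importPages()` helper, the `pages` table and its columns, and the shape of each row (an array with `title` and `text` keys).

```php
<?php
// Sketch only: importPages() and the `pages` table are hypothetical names.
// Assumed schema (MySQL):
//   CREATE TABLE pages (
//       id    INT AUTO_INCREMENT PRIMARY KEY,
//       title VARCHAR(255) NOT NULL,
//       body  MEDIUMTEXT   NOT NULL
//   );
function importPages(PDO $pdo, $pages, $batchSize = 500)
{
    // Prepare once, execute many times; values are bound, never
    // concatenated into the SQL string.
    $stmt = $pdo->prepare(
        'INSERT INTO pages (title, body) VALUES (:title, :body)'
    );

    $count = 0;
    $pdo->beginTransaction();
    foreach ($pages as $page) {
        $stmt->execute(array(
            ':title' => $page['title'],
            ':body'  => $page['text'],
        ));
        $count++;
        // Commit in batches so one transaction does not grow unbounded.
        if ($count % $batchSize === 0) {
            $pdo->commit();
            $pdo->beginTransaction();
        }
    }
    $pdo->commit();

    return $count; // number of rows inserted
}

// Usage (host, database name and credentials are placeholders):
// $pdo = new PDO('mysql:host=localhost;dbname=wiki;charset=utf8mb4',
//                'user', 'password');
// $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// importPages($pdo, $rows);
```

Because `$pages` can be any array or iterator, the rows do not have to be collected in memory before the import starts.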