ASP question XML /Database bot list

         

johnhamman

4:20 pm on Apr 4, 2002 (gmt 0)

10+ Year Member



What do you think would be the better way to store the list of spiders: an XML file or a database, and why?
I'm trying to make a spider utility from scratch.
john

william_dw

7:28 pm on Apr 4, 2002 (gmt 0)

10+ Year Member



Hiya,
When you say a spider utility, I assume you mean a tool to generate robots.txt files?

If so, I would use XML.
The reasons are:
1 Size: You won't need to do any inserts into the database once it's created, so a database would just swell your distribution package. (For example, a blank Access db starts out at about 130 KB, while an XML file starts at 21 bytes: just the XML header, <?xml version="1.0"?>.)
2 Format: XML is easier to work with, as you can edit the file with a text editor if need be.
3 Interoperability: Pretty much any programming language can work with XML.
4 Speed: Although databases are slightly faster once a connection pool has been set up, a tool like this will most likely do just a single read operation to load all the spider details, and the XML file is smaller, which should in theory be quicker.
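To illustrate the "single read operation" point, here is a minimal sketch in Python of loading every spider from one XML file in a single pass. The element names (`spiders`, `spider`, `useragent`, `ip`), the bot names, and the addresses are all made up for the example; nothing in this thread defines a file layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical file layout: one <spider> element per bot.
# Names and IP addresses are placeholders, not real crawler data.
SPIDERS_XML = """<?xml version="1.0"?>
<spiders>
  <spider name="ExampleBot">
    <useragent>ExampleBot/1.0</useragent>
    <ip>192.0.2.10</ip>
  </spider>
  <spider name="SampleCrawler">
    <useragent>SampleCrawler</useragent>
    <ip>198.51.100.7</ip>
    <ip>198.51.100.8</ip>
  </spider>
</spiders>
"""


def load_spiders(xml_text):
    """Parse the whole document once and return a list of plain dicts."""
    root = ET.fromstring(xml_text)
    spiders = []
    for node in root.findall("spider"):
        spiders.append({
            "name": node.get("name"),
            "useragent": node.findtext("useragent", default=""),
            "ips": [ip.text for ip in node.findall("ip")],
        })
    return spiders


spiders = load_spiders(SPIDERS_XML)
for spider in spiders:
    print(spider["name"], spider["useragent"], spider["ips"])
```

For a real file on disk, `ET.parse(path).getroot()` would take the place of `ET.fromstring`; the rest stays the same.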

Those are the reasons I can think of off the top of my head. Plus, if you aim to sell this (for money, to your boss as an idea, etc.), "XML, standards-based" sounds nice and new.

HTH,
Dw

johnhamman

7:31 pm on Apr 4, 2002 (gmt 0)

10+ Year Member



No, actually I mean to store a list of bot IP addresses, user agents, etc. for cloaking and blocking.
But now that you mention creating robots.txt, hmmm, good idea!
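Whichever storage wins, the lookup side of this would look roughly the same. Below is a rough sketch in Python of checking a visitor against a stored list of bot IP addresses and user agents. The `KNOWN_BOTS` data and the `identify_bot` function are hypothetical names invented for the example, and the matching rules (network membership, then a case-insensitive user-agent substring) are just one reasonable assumption.

```python
import ipaddress

# Hypothetical sample data; in practice this would be loaded from the
# XML file or database. Addresses come from documentation-only ranges.
KNOWN_BOTS = [
    {"name": "ExampleBot", "useragent": "examplebot",
     "networks": ["192.0.2.0/24"]},
    {"name": "SampleCrawler", "useragent": "samplecrawler",
     "networks": ["198.51.100.7/32"]},
]


def identify_bot(remote_ip, user_agent, bots=KNOWN_BOTS):
    """Return the matching bot's name, or None for an ordinary visitor."""
    addr = ipaddress.ip_address(remote_ip)
    ua = (user_agent or "").lower()
    for bot in bots:
        # An IP match is checked first because user agents are easy to fake.
        if any(addr in ipaddress.ip_network(net) for net in bot["networks"]):
            return bot["name"]
        # Fall back to a case-insensitive substring match on the user agent.
        if bot["useragent"] in ua:
            return bot["name"]
    return None


print(identify_bot("192.0.2.55", "Mozilla/4.0"))
print(identify_bot("203.0.113.5", "SampleCrawler/2.1"))
print(identify_bot("203.0.113.5", "Mozilla/4.0"))
```

The caller can then decide what a match means: serve a different page, return a 403, or just log the visit.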

Josk

7:53 am on Apr 5, 2002 (gmt 0)

10+ Year Member



I would go for a database. I went down this route some time ago. Making and updating robots.txt is now a matter of running a Perl script every so often, and all the sites are updated to the latest versions.

Hint: use MySQL (or better)
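For anyone wanting to see the shape of the database route, here is a small sketch in Python that builds a robots.txt from a table of rules. It is not the Perl script mentioned above, and it uses SQLite from the Python standard library rather than MySQL so that it runs without a server. The `spider_rules` table, its columns, and the sample rows are assumptions made up for the example.

```python
import sqlite3


def build_robots_txt(conn):
    """Render one robots.txt record per user agent from the rules table."""
    rows = conn.execute(
        "SELECT useragent, disallow FROM spider_rules "
        "ORDER BY useragent, disallow"
    ).fetchall()
    lines = []
    current = None
    for useragent, disallow in rows:
        if useragent != current:
            if current is not None:
                lines.append("")  # a blank line separates records
            lines.append(f"User-agent: {useragent}")
            current = useragent
        lines.append(f"Disallow: {disallow}")
    return "\n".join(lines) + "\n"


# Hypothetical schema and sample rows, kept in memory for the demo.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE spider_rules ("
    "useragent TEXT NOT NULL, disallow TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO spider_rules VALUES (?, ?)",
    [("ExampleBot", "/private/"), ("ExampleBot", "/tmp/"),
     ("*", "/cgi-bin/")],
)

robots_txt = build_robots_txt(conn)
print(robots_txt)
```

Run on a schedule, a script like this could write the result out to each site's document root, which is roughly the workflow described above.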

johnhamman

12:59 pm on Apr 5, 2002 (gmt 0)

10+ Year Member



It won't make robots.txt; I meant it for storing spiders for cloaking!

Josk

5:23 pm on Apr 5, 2002 (gmt 0)

10+ Year Member



Oops... but mine does that too! (Once you have a decent database, these things become quite easy...)