HTML parsing: ASP, ColdFusion, etc

Can I parse out HTML results via the web?

         

spunky

9:43 am on Dec 12, 2001 (gmt 0)



I am new to the database-scripting world and have a question that I hope someone can help me with. I'm trying to find out whether, and how, I can use ColdFusion, ASP, or PHP to parse a chunk of HTML out of one or more HTML files and possibly write it to a file or database (e.g., MySQL or Access). I want to pull out some HTML that contains search results. I would like to know which would be the easiest (quickest) way to do this, if it is possible. I appreciate any help or suggestions.

Randy

Brett_Tabke

10:16 am on Dec 12, 2001 (gmt 0)

WebmasterWorld Administrator



Welcome to the board, Spunky. PHP or Perl would be your quickest bet. I parse 60 to 70 pages five times a day [searchengineworld.com], and the results end up in a database.

I'd think PHP or Perl would be the fastest since, at heart, they are both basically text-processing and pattern-matching engines. Both are well suited to HTML slice-and-dice work.

spunky

4:53 pm on Dec 12, 2001 (gmt 0)



Brett,

I've been looking around and I can't find a basic script to start working with. Is the PHP syntax "self-explanatory" enough that I could start from scratch without a lot of reading? I just downloaded PHP this weekend and haven't really played with it. I did read a tutorial on Webmonkey and I grasped it pretty well. Of course, I'm trying to do this ASAP, which is why I'd rather not start from scratch. (I don't have the time.) I appreciate the advice, and thanks for referring me to searchengineworld. I love this site, and I think I'm going to like that one also.

Randy

esconsult1

4:24 am on Dec 21, 2001 (gmt 0)



If you're gonna use PHP, read up on the following functions:

preg_match(...)
preg_match_all(...)

These are my bread and butter. They're easy to use and they do the job. However, you'll need to read up on basic regular expressions first, whether you're using PHP or Perl.

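Once you've covered regular expressions, the syntax carries over: PHP's preg_ functions use Perl-compatible patterns, so what you learn for one works almost unchanged in the other.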
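
To make that concrete, here is a minimal sketch of how those two functions might be combined to pull search results out of a page. Everything specific in it is an assumption for illustration: the sample HTML, the "results start" / "results end" marker comments, and the variable names are all made up, and a real page will need its own patterns. Regex-based slicing like this also breaks easily when the page markup changes.

```php
<?php
// Hypothetical sample page; in practice this might come from file_get_contents().
$html = <<<HTML
<html><body>
<h1>Search</h1>
<!-- results start -->
<ul>
  <li><a href="http://example.com/a">First result</a></li>
  <li><a href="http://example.com/b">Second result</a></li>
</ul>
<!-- results end -->
<p>Footer</p>
</body></html>
HTML;

// Step 1: preg_match() grabs the single chunk between two assumed marker comments.
// The "s" modifier lets "." match newlines; ".*?" keeps the match as short as possible.
$chunk = '';
if (preg_match('/<!-- results start -->(.*?)<!-- results end -->/s', $html, $m)) {
    $chunk = trim($m[1]);
}

// Step 2: preg_match_all() collects every link inside that chunk.
// PREG_SET_ORDER groups the captures per match: [full match, href, link text].
$results = [];
preg_match_all('/<a href="([^"]*)">(.*?)<\/a>/s', $chunk, $matches, PREG_SET_ORDER);
foreach ($matches as $match) {
    $results[] = ['url' => $match[1], 'title' => $match[2]];
}

// Print one line per extracted result.
foreach ($results as $row) {
    echo $row['title'], ' => ', $row['url'], "\n";
}
```

Each row in $results holds a url and a title, so from here it could be written to a file or inserted into a database such as MySQL.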