Selecting and extracting text between two defined markers

what text processing tool/script?

     

Casethejoint

9:45 am on Sep 1, 2006 (gmt 0)

I have hundreds of full HTML pages but want to extract only the content, which is clearly marked with comments (i.e. <!-- Content begins/ends here --!>). Rather than cutting and pasting into separate files by hand, how would you approach it? Is there some console cleverness that can be used?

coopster

5:29 pm on Sep 1, 2006 (gmt 0)

Shell, Perl, or some other form of server-side scripting would be ideal here. Loop through the files in the directory, locate the text between the comments in each one, and write it out to a file in a new directory.
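
As a rough illustration of that loop, here is a minimal shell sketch built around awk. The directory names (./pages, ./extracted), the helper name extract_content, and the exact marker wording ("Content begins here" / "Content ends here", each on a line of its own) are assumptions for the example, not details from the original pages.

```shell
#!/bin/sh
# Sketch: copy the text between the two marker comments of every .html file
# in SRC into a same-named file in OUT. SRC, OUT and the marker wording are
# assumptions; adjust them to match the real pages.
SRC=${SRC:-./pages}
OUT=${OUT:-./extracted}

# Hypothetical helper: print the lines strictly between the two markers.
extract_content() {
    awk '
        /Content ends here/   { inside = 0 }   # stop before the end marker
        inside                { print }        # emit lines while inside
        /Content begins here/ { inside = 1 }   # start after the begin marker
    ' "$1"
}

mkdir -p "$OUT"
for f in "$SRC"/*.html; do
    [ -e "$f" ] || continue    # skip if the glob matched nothing
    extract_content "$f" > "$OUT/$(basename "$f")"
done
```

The rule order inside awk keeps both marker lines out of the output. The originals are never modified, so the script can be rerun safely while the patterns are being tuned.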

Casethejoint

8:17 pm on Sep 1, 2006 (gmt 0)

Hey Coopster - both sed and awk do the trick. I'm just not sure how to apply the command to a whole directory. Anyhow, trial and error, etc. :)
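
One way the sed version might be applied to a whole directory is to wrap it in a for loop, as in the sketch below. The function name extract_dir, the directory arguments, and the marker wording are assumptions for illustration; the markers are assumed to sit on their own lines.

```shell
#!/bin/sh
# Sketch: run one sed range expression over every .html file in a directory.
# extract_dir is a hypothetical helper; $1 is the source directory and
# $2 the output directory.
extract_dir() {
    src=$1
    out=$2
    mkdir -p "$out"
    for f in "$src"/*.html; do
        [ -e "$f" ] || continue    # skip if the glob matched nothing
        # -n suppresses default printing; the address range selects the
        # block from marker to marker, and the two d commands drop the
        # marker lines themselves before p prints the rest.
        sed -n '/Content begins here/,/Content ends here/{
            /Content begins here/d
            /Content ends here/d
            p
        }' "$f" > "$out/$(basename "$f")"
    done
}
```

It could then be called as `extract_dir ./pages ./extracted`. Writing to a separate output directory rather than editing in place leaves the original pages untouched.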
 
