Selecting and extracting text between two defined markers

What text processing tool/script?



9:45 am on Sep 1, 2006 (gmt 0)

10+ Year Member

I have hundreds of full HTML pages but want to extract only the content, which is clearly marked with comments (i.e. <!-- Content begins here --> and <!-- Content ends here -->). Rather than cutting and pasting into separate files by hand, how would you approach it? Is there some console cleverness that can be used?


5:29 pm on Sep 1, 2006 (gmt 0)

WebmasterWorld Administrator coopster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

Shell, Perl, or some other form of server-side scripting would be ideal here. Loop through the files in the directory, locate the text between the two comments in each file, and write it out to a file in a new directory.
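For what it's worth, here is a minimal sketch of that loop in plain sh with awk. It assumes each marker comment sits on its own line and reads exactly <!-- Content begins here --> and <!-- Content ends here -->. The function name extract_dir and the two directory arguments are made up for illustration, so adjust them to your setup. The marker lines themselves are left out of the output.

```shell
#!/bin/sh
# Sketch only: extract_dir, the directory arguments and the exact marker
# text are assumptions, not something stated in the thread.
extract_dir() {
    src=$1   # directory holding the full HTML pages
    out=$2   # directory that receives the extracted content
    mkdir -p "$out"
    for f in "$src"/*.html; do
        [ -e "$f" ] || continue   # skip if the glob matched nothing
        awk '
            # Rule order matters: clear the flag before printing so the end
            # marker is dropped, and set it after printing so the begin
            # marker is dropped too.
            /<!-- Content ends here -->/   { inside = 0 }
            inside                         { print }
            /<!-- Content begins here -->/ { inside = 1 }
        ' "$f" > "$out/$(basename "$f")"
    done
}

# Example call (hypothetical paths):
# extract_dir ./pages ./content
```

Each output file keeps the name of its source page, so nothing in the original directory is overwritten.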


8:17 pm on Sep 1, 2006 (gmt 0)

10+ Year Member

Hey Coopster - both sed and awk do the trick. I'm just not sure how to apply the command to a whole directory. Anyhow, trial and error, etc. :)
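One possible way to run a sed range over a whole directory, as a rough sketch: wrap the one-liner in a for loop and redirect each result to a file of the same name in a separate directory. The function name extract_with_sed, the directory arguments and the exact marker text are assumptions for illustration. Note that a plain sed range prints the two marker lines as well as the content between them.

```shell
#!/bin/sh
# Sketch only: extract_with_sed, the directory arguments and the exact
# marker text are assumptions, not something stated in the thread.
extract_with_sed() {
    src=$1   # directory holding the full HTML pages
    out=$2   # directory that receives the extracted content
    mkdir -p "$out"
    for f in "$src"/*.html; do
        [ -e "$f" ] || continue   # skip if the glob matched nothing
        # -n suppresses default output; p prints only lines inside the range,
        # including the begin and end marker lines themselves.
        sed -n '/<!-- Content begins here -->/,/<!-- Content ends here -->/p' "$f" \
            > "$out/$(basename "$f")"
    done
}

# Example call (hypothetical paths):
# extract_with_sed ./pages ./content
```

Writing to a separate directory rather than editing in place means a wrong pattern costs nothing, which suits the trial-and-error approach.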
