
WYSIWYG and Text Code Editors Forum

    
Selecting and extracting text between two defined markers
what text processing tool/script?
Casethejoint

5+ Year Member



 
Msg#: 3068586 posted 9:45 am on Sep 1, 2006 (gmt 0)

I have hundreds of full HTML pages but want to extract only the content, which is clearly marked with comments (i.e. <!-- Content begins/ends here -->). Rather than cutting and pasting each one into a separate file, how would you approach it? Is there some console cleverness that can be used?

 

coopster

WebmasterWorld Administrator coopster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 3068586 posted 5:29 pm on Sep 1, 2006 (gmt 0)

Shell, Perl, or some other form of server-side scripting would be ideal here. Loop through the files in the directory, locate the text between the two comments in each file, and write it out to a new file in a separate directory.
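As a rough illustration of that loop, here is a minimal Python sketch. It is not the only way to do it, and several things are assumptions: the exact marker text (`BEGIN` and `END`), the `*.html` file extension, UTF-8 encoding, and the function names `extract_content` and `extract_directory`, none of which come from the original post. It also assumes each page holds a single marked region.

```python
import re
from pathlib import Path

# Assumed marker text; adjust to match the comments actually used in the pages.
BEGIN = "<!-- Content begins here -->"
END = "<!-- Content ends here -->"

# Non-greedy match of everything between the two markers, across newlines.
PATTERN = re.compile(re.escape(BEGIN) + r"(.*?)" + re.escape(END), re.DOTALL)


def extract_content(html):
    """Return the text between the markers, or None if either marker is missing."""
    match = PATTERN.search(html)
    return match.group(1).strip() if match else None


def extract_directory(src_dir, dest_dir):
    """Write the marked content of every *.html file in src_dir into dest_dir."""
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    for page in sorted(src.glob("*.html")):
        text = page.read_text(encoding="utf-8", errors="replace")
        content = extract_content(text)
        if content is None:
            # Report pages without both markers instead of writing empty files.
            print(f"skipped (markers not found): {page.name}")
            continue
        (dest / page.name).write_text(content + "\n", encoding="utf-8")
        written.append(page.name)
    return written
```

A call such as `extract_directory("pages", "extracted")` (both directory names are hypothetical) would leave the originals untouched and put one content-only file per page in the second directory.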

Casethejoint

5+ Year Member



 
Msg#: 3068586 posted 8:17 pm on Sep 1, 2006 (gmt 0)

Hey Coopster - both sed and awk do the trick. I'm just not sure how to apply the command to a whole directory. Anyhow, trial and error, etc. :)
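The exact sed or awk commands are not given in the thread, but the usual idiom there is a line range such as `sed -n '/begin/,/end/p'`, and "applying it to a whole directory" just means running the same filter once per file. The following Python sketch mimics that line-oriented approach under some assumptions: each marker sits on its own line, the marker lines themselves are dropped (the sed range above would print them), and the names `lines_between` and `filter_directory` plus the default marker strings are illustrative only.

```python
from pathlib import Path


def lines_between(lines, begin, end):
    """Yield lines after a line containing `begin`, up to a line containing `end`.

    Assumes each marker is on its own line; the marker lines are not yielded.
    """
    inside = False
    for line in lines:
        if not inside:
            if begin in line:
                inside = True
        elif end in line:
            inside = False
        else:
            yield line


def filter_directory(src_dir, dest_dir,
                     begin="Content begins here", end="Content ends here"):
    """Apply the same line filter to every *.html file, one output file per input."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for page in sorted(Path(src_dir).glob("*.html")):
        with page.open(encoding="utf-8", errors="replace") as infile:
            kept = list(lines_between(infile, begin, end))
        (dest / page.name).write_text("".join(kept), encoding="utf-8")
```

Because the filter works line by line, it never loads more than one line of markup at a time while scanning, which is roughly the property that makes sed and awk convenient for this kind of job.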
