
WYSIWYG and Text Code Editors Forum

    
How to Export Meta Tags from Frontpage to Excel
donna130




msg:4075682
 5:13 am on Feb 7, 2010 (gmt 0)

Can someone please share a free and easy way to spider and/or export the meta tags for all pages from FrontPage (FP) to Excel? Thank you for your time and thoughts.

 

SteveWh




msg:4075711
 7:37 am on Feb 7, 2010 (gmt 0)

If the site is on your local PC, there are two Unix/Linux command-line utilities, grep and sed (or Super Sed, ssed, which is even better), that together should be able to do this extraction for you. There are also versions of grep and ssed for Windows.

What they can do is extract the relevant text from the pages and store it in a separate text file which you can then import into Excel.

The downside is that learning them is not easy. However, if it's a big site, the days you spend figuring out how to use the utilities might still be less time than you'd spend doing the extraction manually, and at the end you'd know how to use two extremely useful text processing tools.

donna130




msg:4076063
 2:43 am on Feb 8, 2010 (gmt 0)

Thanks. Before your post I tried GSiteCrawler, but the meta description gets truncated at 124 characters. I know there are a lot of meta tag crawlers out there, but I can't seem to find one with an export to CSV/Excel feature, at least not without truncation. Thanks in advance for others' experiences on this topic as well. Ideally I'm after a good tool that lets me simply export my FrontPage meta tags to Excel or, failing that, lets me crawl my own site (or a local folder), extract the metas, and then export them to Excel.

donna130




msg:4077818
 6:34 pm on Feb 10, 2010 (gmt 0)

I found the answer to my post, and hope it helps others:

Instead of trying to export from FrontPage, I found a way to spider my own site for meta tags. It's very useful because you can point it at a folder, or even a single text file of your URLs, and it will retrieve the metas and then let you save/export them to CSV/Excel. It took me a long time searching. If you're looking for this type of thing, it's called URL Meta Tag 2.0, aka Online Data Extractor. It's not freeware, but they have a trial, and it's definitely worth it. The best navigation path is NEW > General tab > Choose URLs from File. Select the file (e.g. I had all my URLs pasted into a text file), then set "No of Pages to be processed" to 1. I'm hoping this makes someone's life a lot easier.
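For readers who would rather script the same idea (read a text file of URLs, pull the meta tags from each page, save to CSV for Excel), here is a minimal Python sketch using only the standard library. It is not how URL Meta Tag 2.0 works internally; the function names (extract_metas, export_metas), the file names and the column layout are all assumptions made for illustration.

```python
import csv
from html.parser import HTMLParser
from urllib.request import urlopen


class MetaCollector(HTMLParser):
    """Collects (name, content) pairs from every <meta> tag it sees."""

    def __init__(self):
        super().__init__()
        self.metas = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        key = attr.get("name") or attr.get("http-equiv") or attr.get("property")
        if key:
            self.metas.append((key.lower(), attr.get("content") or ""))


def extract_metas(html):
    """Return a list of (name, content) tuples found in an HTML string."""
    parser = MetaCollector()
    parser.feed(html)
    return parser.metas


def fetch(url):
    """Download one page and decode it (falls back to UTF-8, an assumption)."""
    with urlopen(url, timeout=30) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def export_metas(url_list_path, csv_path, fetch_page=fetch):
    """Read one URL per line and write url, meta name, content rows to CSV."""
    with open(url_list_path, encoding="utf-8") as f:
        urls = [line.strip() for line in f if line.strip()]
    # utf-8-sig adds a BOM so Excel detects the encoding when opening the file.
    with open(csv_path, "w", newline="", encoding="utf-8-sig") as out:
        writer = csv.writer(out)
        writer.writerow(["url", "meta", "content"])
        for url in urls:
            for name, content in extract_metas(fetch_page(url)):
                writer.writerow([url, name, content])
```

Calling export_metas("urls.txt", "metas.csv") would then produce one row per meta tag, with the full content value and no truncation. Both file names are placeholders.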

SteveWh




msg:4079743
 8:39 pm on Feb 13, 2010 (gmt 0)

Using GnuWin32 grep 2.5.4 and Super Sed for Windows 3.62, the following single command line searches all files in the current folder and its subdirectories. For each meta tag found, it outputs the filename in the first column and the meta tag in the second column, with the two columns separated by a tab for easy import into Excel. You can then sort on either column:

grep -RiPo "\<meta.*\>" *.* | ssed -Re "{s/\:/\t/}" > C:\TEMP\MetaList.tsv

C:\TEMP\MetaList.tsv is just a placeholder example file name.
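If grep and ssed are not installed, roughly the same filename/tab/meta-tag listing can be sketched in Python. This is an approximation rather than an exact port of the command above. It assumes the pages end in .htm or .html (the command searches every file), it reads them as UTF-8 with undecodable bytes replaced (older FrontPage pages may need a different encoding), and the names find_meta_tags and list_meta_tags are made up for illustration.

```python
import re
from pathlib import Path

# Matches one whole <meta ...> tag, case-insensitively. [^>]* stops at the
# first ">" so two tags on one line are not merged into a single match.
META_TAG = re.compile(r"<meta\b[^>]*>", re.IGNORECASE)


def find_meta_tags(text):
    """Return every <meta ...> tag found in a string of HTML."""
    return META_TAG.findall(text)


def list_meta_tags(root, out_path, patterns=("*.htm", "*.html")):
    """Write 'relative filename <TAB> meta tag' lines for all pages under root."""
    root = Path(root)
    with open(out_path, "w", encoding="utf-8", newline="\n") as out:
        for pattern in patterns:
            for page in sorted(root.rglob(pattern)):
                text = page.read_text(encoding="utf-8", errors="replace")
                for tag in find_meta_tags(text):
                    # Collapse internal whitespace so a tag that spans
                    # several lines still occupies a single row.
                    one_line = " ".join(tag.split())
                    out.write(f"{page.relative_to(root)}\t{one_line}\n")
```

For example, list_meta_tags(".", "MetaList.tsv") scans the current folder and its subfolders; the output file name is again just a placeholder. Note that a literal ">" inside an attribute value would cut a match short, which is a limitation of this simple pattern.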

donna130




msg:4079840
 2:41 am on Feb 14, 2010 (gmt 0)

I heard that had a lot of bugs. I was able to solve the problem earlier (see my previous post). Yeah, URL Meta Tag 2.0 was very easy, and I'd definitely recommend it to others.

The only issue remaining from the original post is how to BULK IMPORT metas edited in Excel BACK INTO FrontPage, i.e. back into the pages contained within the local folder for publishing. Anyone know how?

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved