I have a nice Google XML sitemap built (thanks to a WordPress plugin).
Basically, I can fetch the links from it with a URL extractor and clean them up in a text editor, but I can't seem to find any definitive info on creating a sitemap for Yahoo and MSN. Could someone point me to a hands-on tutorial (with code examples) for the format?
Do I use bullets? Can I just put links on the page (and how many)? Do I have to put anchor text on those links, make categories, etc.?
I basically want to toss up just the links,
like this:
<a href="http://www.mydomain.com/mypage2.html"></a>
<br>
<a href="http://www.mydomain.com/mypage.html"></a>
Can I do that?
Cheers
It is trivial if you have a spreadsheet.
<url>
<loc>http://www.example.com/</loc>
<lastmod>2006-06-23T17:46:32+00:00</lastmod>
<changefreq>daily</changefreq>
<priority>1</priority>
</url>
Remove all lines which contain <lastmod>
Remove all lines which contain <changefreq>
Remove all lines which contain <priority>
Search for <url> replace with <li>
Search for </url> replace with </li>
Search for <loc> replace with <a href="
Search for </loc> replace with ">sometext</a>
In a spreadsheet, you can write a formula which copies the URL again into the location which says "sometext" and be done. Otherwise, you need to find a way to put a descriptive, useful title into that location.
Once you have done that, add <ul> at the top and </ul> at the bottom of the list of URLs and remove everything else. Format the page the way you want it to look, split it into ten or twelve pages with a navigation system (try to keep it under 100 links per page), and you should be done.
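If you'd rather not split the pages by hand, here is a rough sketch of the "under 100 links per page" step using split. Everything named demo-* is made up for illustration, and the 250 example.com links are just placeholder data; swap in your own list of <li> lines. Navigation between the pages is left out.

```shell
# Clean up leftovers from an earlier run of this demo.
rm -f demo-chunk_* demo-page*.html

# Build a placeholder list of 250 <li> lines (hypothetical URLs).
i=1
while [ "$i" -le 250 ]; do
  printf '<li><a href="http://www.example.com/page%d.html">Page %d</a></li>\n' "$i" "$i"
  i=$((i + 1))
done > demo-links.txt

# Cut the list into pieces of at most 99 lines: demo-chunk_aa, demo-chunk_ab, ...
split -l 99 demo-links.txt demo-chunk_

# Wrap each piece in <ul> ... </ul> and save it as demo-page1.html, demo-page2.html, ...
n=1
for f in demo-chunk_*; do
  { echo '<ul>'; cat "$f"; echo '</ul>'; } > "demo-page$n.html"
  n=$((n + 1))
done
```

Each demo-page file still needs whatever header, footer and page-to-page links your site uses.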
Was just gonna ask if you knew the grep command for this; it's been a while since I've used grep...
<lastmod>2006-06-23T17:46:32+00:00</lastmod>
Obviously the time changes on each one:
<lastmod>grep command here * something... etc.</lastmod>
Thank you for taking the time to care and share!
Be sure to make a backup copy before you do any of this.
...
ummmmmm
....
grep -v <pattern> <filename> >> newfile.txt
?
so
grep -v lastmod originalfile.txt > newfile.txt
Once everything is gone that you don't want, maybe sed would work better for the substitution part?
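For what it's worth, a sed version of the substitution part might look something like the sketch below. It assumes each <url>, <loc> and </url> tag sits on its own line, as in the example earlier in the thread, and it reuses the URL as the link text since there is no title to pull from. The file names demo-filtered.txt and demo-list.html are made up for the demo.

```shell
# Placeholder input: a sitemap with the unwanted lines already removed.
cat > demo-filtered.txt <<'EOF'
<url>
<loc>http://www.example.com/</loc>
</url>
<url>
<loc>http://www.example.com/mypage.html</loc>
</url>
EOF

# | is used as the delimiter so the slashes in the tags need no escaping.
# \(.*\) captures the URL and \1 puts it back twice: once as the href,
# once as the visible link text.
sed -e 's|<url>|<li>|' \
    -e 's|</url>|</li>|' \
    -e 's|<loc>\(.*\)</loc>|<a href="\1">\1</a>|' \
    demo-filtered.txt > demo-list.html

cat demo-list.html
```

If you want real titles instead of bare URLs as link text, you would still have to supply those some other way.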
run this command:
grep -v lastmod sitemap.xml > newfile.txt
run this command:
grep -v changefreq newfile.txt > newfile2.txt
run this command:
grep -v priority newfile2.txt > newfile3.txt
Now you should have a file called newfile3.txt which does not contain the lines which you don't want.
Pull it into a text editor and do search and replace for the other items mentioned above.
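The three grep passes can probably be collapsed into one by giving grep all three patterns at once. A small sketch, run against a throwaway copy so the real sitemap.xml is not touched; the demo-* file names are made up, and the sample entry is the one from earlier in the thread.

```shell
# Throwaway sample sitemap with a single entry.
cat > demo-sitemap.xml <<'EOF'
<url>
<loc>http://www.example.com/</loc>
<lastmod>2006-06-23T17:46:32+00:00</lastmod>
<changefreq>daily</changefreq>
<priority>1</priority>
</url>
EOF

# -E allows the a|b|c alternation; -v keeps only the lines that do NOT match.
grep -Ev '<(lastmod|changefreq|priority)>' demo-sitemap.xml > demo-newfile3.txt

cat demo-newfile3.txt
```

The result should be the same as the newfile3.txt from the three separate commands, without the intermediate files.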