
Forum Moderators: mademetop


SiteMaps Protocol Endorsed by 3 SE's

     
7:08 am on Nov 16, 2006 (gmt 0)

Administrator from US 

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 21, 1999
posts:38063
votes: 13


Released Wednesday at PubCon:

[sitemaps.org...]

Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.

Web crawlers usually discover pages from links within the site and from other sites. Sitemaps supplement this data to allow crawlers that support Sitemaps to pick up all URLs in the Sitemap and learn about those URLs using the associated metadata. Using the Sitemap protocol does not guarantee that web pages are included in search engines, but provides hints for web crawlers to do a better job of crawling your site.

Sitemap 0.90 is offered under the terms of the Attribution-ShareAlike Creative Commons License and has wide adoption, including support from Google, Yahoo!, and Microsoft.

10:34 am on Nov 16, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 2, 2005
posts:112
votes: 0



Yahoo Search blog announced;

'Together we're announcing [sitemaps.org...] which provides details of the current release of the Sitemaps protocol and will include future updates as we continue to collaborate on this common protocol. By offering an open standard for web sites, webmasters can use a single format to create a catalog of their site URLs and to notify the major search engines of changes. This should make it easier for web sites to provide search engines with content and metadata. And in turn, search engines can spend less time crawling unchanged pages and can update indexes faster as new content is discovered. This will help us reflect the changes more quickly, and improve our ability to provide more timely and relevant search results for users. Sitemaps is available to any site owner who wishes to communicate more easily with participating search engines. Simply create and upload an XML Sitemap and submit the URL of the file to search engines.'

A whois on the domain says it was created on 12-Aug-2001 and the owner is Google

[edited by: Brett_Tabke at 2:49 pm (utc) on Nov. 16, 2006]
[edit reason] fixed link [/edit]

2:50 pm on Nov 16, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 2, 2005
posts:112
votes: 0


Hi Brett

I just wonder, if I have a sitemap like this, is it still a bad idea to go more than 2 levels deep? I see pages around 4 levels deep with PR5.

You said at [webmasterworld.com...]

All pages should be linked to more than one other page on your site, and not more than 2 levels deep from root.

Thanks

[edited by: encyclo at 3:06 pm (utc) on Nov. 16, 2006]
[edit reason] fixed link [/edit]

4:49 pm on Nov 16, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:Dec 4, 2003
posts: 122
votes: 0


I submitted a sitemap to Google and Yahoo. Is there a place to submit sitemaps to MSN Search?
5:29 pm on Nov 16, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:Oct 20, 2003
posts:46
votes: 0


How do Google, Yahoo and MSN benefit from sitemaps? Why is there such a big push for webmasters to sign up? There must be more to this than "Help us find all of your pages".
5:33 pm on Nov 16, 2006 (gmt 0)

Full Member

10+ Year Member

joined:Mar 2, 2004
posts:253
votes: 0


trav - good point, the engines have been able to find pages fine before. If your site's pages are all spidered and ranking, why would a webmaster submit this data to the engines? It seems like the engines could be using this data against the webmaster.
5:39 pm on Nov 16, 2006 (gmt 0)

New User

10+ Year Member

joined:Mar 25, 2004
posts:6
votes: 0


A whois on the domain says it was created on 12-Aug-2001 and the owner is Google

It all just looks like a repackaged and debranded version of the Sitemaps documentation on Google's site.

So is the news just that the Sitemap schema should point to the new domain, and that Yahoo (and Microsoft?) will now accept it?

I wonder if Yahoo SiteExplorer will continue to accept other formats (not that I won't be sending them the sitemap-schema'd file now, rather than an Atomized version).

Anyone have something tangible with regard to Microsoft? There isn't an existing sitemap service for MSN, is there?

6:20 pm on Nov 16, 2006 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 21, 2001
posts:1281
votes: 0


The first sample sitemap on www.sitemaps.org/protocol.html does not have a closing </url> tag!
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.example.com/</loc>
<lastmod>2005-01-01</lastmod>
<changefreq>monthly</changefreq>
<priority>0.8</priority>
</urlset>
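For comparison, a well-formed version of that sample (same data, with the missing closing tag restored) would look like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.example.com/</loc>
<lastmod>2005-01-01</lastmod>
<changefreq>monthly</changefreq>
<priority>0.8</priority>
</url>
</urlset>
```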
7:18 pm on Nov 16, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Oct 10, 2003
posts: 439
votes: 0


So, can we submit the exact same sitemap file to Google and Yahoo now?
I already have on Google, but not on Yahoo.

The one problem with the current system is that if you have different subdomains, you need a different sitemap for each subdomain. They don't allow one sitemap to link to pages on another subdomain.

Example:
mysite.com sitemaps can't link to sub.mysite.com; you need to put another sitemap at sub.mysite.com just for pages under that subdomain. Does that make sense, or am I tripping here?

Thanks

7:19 pm on Nov 16, 2006 (gmt 0)

Full Member

10+ Year Member

joined:Mar 2, 2004
posts:253
votes: 0


does not have a closing </url> tag!

I wouldn't expect a 0.9 version to have anything less ;)

7:24 pm on Nov 16, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 30, 2002
posts:741
votes: 0


If your sites pages are all spidered and ranking, why would a webmaster submit this data to the engines?

It won't help ranking, but it will help you find out which pages have problems being spidered.

7:43 pm on Nov 16, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2002
posts:872
votes: 0


> There must be more to this than "Help us find all of your pages".

Someone said this summer that the spiders are constantly fighting with 40% broken links. I assume they are fed up with dedicating their bandwidth to that crap. Search engines will always perform some spidering the way they used to, but I guess sites with correct sitemaps of this kind will be spidered much more often/regularly in the future.

> So, can we submit the exact same sitemap file to google and yahoo now?

If I remember correctly, the only difference from the sitemap the Webmaster Central console requires is the urlset tag. It would be helpful if one of the Google insiders could confirm whether, or when, we might alternatively submit the urlset tag listed in the example kapow gave.

7:49 pm on Nov 16, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:May 4, 2004
posts: 44
votes: 0


Does anyone have the sitemap ping addresses for Google and Yahoo?

It would be great to notify the search engines of a sitemap without having to create a million different Google, Yahoo and Microsoft Live ID logins.

8:06 pm on Nov 16, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:Dec 4, 2003
posts: 122
votes: 0


Insight, you need to submit your sitemaps to the Google and Yahoo search sites. Once your site and maps are validated, they will poll the sitemaps regularly.
8:08 pm on Nov 16, 2006 (gmt 0)

New User

10+ Year Member

joined:June 7, 2005
posts:36
votes: 0


This morning I submitted 5 sitemaps I had in Google to Yahoo, and all worked fine. I didn't have to authenticate the subfolder I use on one of them, as I had to do with Google.
8:21 pm on Nov 16, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:May 4, 2004
posts: 44
votes: 0


Thanks for the reply mlemos.

I understand how to submit (http://www.sitemaps.org/faq.html#faq_after_submission), but I was wondering if anyone knew what the ping URLs were, so I can submit automatically without creating specific Webmaster Central and Site Explorer logins:

You can also submit your Sitemap using an HTTP request (replace <searchengine_URL> with the URL provided by the search engine):
Issue your request to the following URL:

<searchengine_URL>/ping?sitemap=sitemap_url
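A sketch of building that request in Python (an illustration, not from the FAQ: `build_ping_url` is a hypothetical helper, and Google's endpoint shown below is the one mentioned later in this thread). The sitemap URL should be percent-encoded so its `:` and `/` characters survive as a query value:

```python
from urllib.parse import quote

def build_ping_url(searchengine_url, sitemap_url):
    """Fill in the template <searchengine_URL>/ping?sitemap=sitemap_url,
    percent-encoding the sitemap URL so it is safe as a query value."""
    return "%s/ping?sitemap=%s" % (searchengine_url, quote(sitemap_url, safe=""))

# Google's ping endpoint (posted later in this thread); other engines
# are expected to follow the same template with their own base URL.
ping = build_ping_url("http://www.google.com/webmasters/sitemaps",
                      "http://www.example.com/sitemap.xml")
```

An HTTP GET to the resulting URL is then all that's needed; no login required.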

9:26 pm on Nov 16, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:Apr 10, 2006
posts:117
votes: 0


I believe there are 3 reasons why they did this:

1. Duplicate Content

2. Page-not-found errors (as Oliver Henniges mentions above)

3. Faster refreshes (due to less downloading)

There were many, many discussions about webmasters complaining that SEs keep on spidering and listing pages that weren't meant for that - mostly when it comes to dynamic content.

I think all 3 SEs have problems with listing duplicate pages. With a protocol like the one at [sitemaps.org...], a webmaster can specify which of his pages are to be spidered and listed, and only once - and all the SEs will follow...

This will also help to get rid of "not found" pages faster - and will make refreshes faster, since SEs will download less...

In general: thumbs up to Google, MSN and Yahoo!

9:56 pm on Nov 16, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:Apr 10, 2006
posts:117
votes: 0


Quick question...

You can also submit your Sitemap using an HTTP request (replace <searchengine_URL> with the URL provided by the search engine):

Could anybody please tell me what the <searchengine_URL> is for each of Google, Yahoo, and MSN?

For some reason they didn't include the most important info... duh...

10:13 pm on Nov 16, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:May 4, 2004
posts: 44
votes: 0


I'm looking for the same info goobarev.

Here's Google's:
www.google.com/webmasters/sitemaps/ping?sitemap=www.domain.de/sitemap.xml

I'd really like to find one for Yahoo so I don't have to set up a bunch of Yahoo logins.

10:40 pm on Nov 16, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:June 15, 2006
posts:500
votes: 0


Why not create a Yahoo Site Explorer account?

[siteexplorer.search.yahoo.com...]

11:51 pm on Nov 16, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 11, 2002
posts:140
votes: 0


> How do Google, Yahoo and MSN benefit from sitemaps? Why is there such a big push for webmasters to sign up? There must be more to this than "Help us find all of your pages".

I think this is a delicate business. Google most certainly doesn't use XML sitemaps very much, though the ideals are lofty.

a) It's the only way a webmaster has of providing subjective input to the crawler - through the <priority> tag you can tell the crawler which of your pages you prefer over others. If it comes to a choice between two of your equally-ranked pages, the one you give the highest priority to should be presented first in the SERPs.

b) There is potential for bandwidth savings, though I'm not sure this has been thought through properly. First there is <lastmod>. In theory - with a "trusted" site - the crawler can use <lastmod> to decide which pages it should spider. This cuts both ways. If I find a minor and trivial typo, I fix it and upload the page. Today, the spider will spot the change on its next HEAD or conditional GET and download the page for reindexing. Pointless, because anyone visiting the page will see the change anyway. So I could make the change, upload the page, and NOT modify the <lastmod>, and save the crawler bandwidth. Or, if the change is significant, I update <lastmod>, reload the sitemap and resubmit it.
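The crawler-side half of that bargain can be sketched in a few lines (a minimal illustration only; `needs_refetch` is a hypothetical name, and a real crawler would weigh many more signals than one date):

```python
from datetime import date

def needs_refetch(lastmod, last_crawled):
    """Decide whether a trusted <lastmod> warrants re-downloading a page.
    Both arguments are W3C "YYYY-MM-DD" strings, as in the sitemap sample
    earlier in the thread. Returns True only if the page claims to have
    changed since the crawler's last visit."""
    return date.fromisoformat(lastmod) > date.fromisoformat(last_crawled)

needs_refetch("2005-01-01", "2006-11-16")  # unchanged since last crawl
needs_refetch("2006-12-01", "2006-11-16")  # claims a newer change
```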

There are other ramifications, and I'm in the process of writing a web page about them. The concept of a "trusted" sitemap looms, and promises to be a nastier issue than -30 penalties. A lot of the tools generate XML sitemaps with the current date in <lastmod> and a <changefreq> of daily - anyone expecting to fool a search engine that way is as dumb as Shrub.

8:20 am on Nov 17, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:Aug 30, 2004
posts:129
votes: 0


> Pointless, because anyone visiting the page will see the change anyway. So I could make the change, upload the page, and NOT modify the <lastmod>.

This is how I have been using sitemaps. I have a lot of archived news articles on one site that I don't want Google to bother reindexing on a regular basis, but I would like them indexed once, as they are an important reference for some people, so I set the change frequency to 1 month. At the moment it hasn't had much effect on the spider, but we'll see in the future.

5:55 pm on Nov 17, 2006 (gmt 0)

New User

10+ Year Member

joined:Sept 19, 2005
posts:25
votes: 0


It appears MSN still hasn't implemented it yet:

[blogs.msdn.com ]

6:21 pm on Nov 17, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2002
posts:872
votes: 0


> First there is <lastmod>

I have a number of pages indexed where I present a random collection of our products to my customers. The only way to account for this correctly in a sitemap with respect to the lastmod tag would be to run the XML file through the PHP parser (or any other language) and insert today's value. I don't do that yet; I'm happy if the pages are indexed at all, so I leave the lastmod date as the day I changed some of the rest of the pages.

I assume spiders cannot read the correct last-modified date from the filesystem, can they? How would you handle this?

8:46 pm on Nov 17, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Oct 10, 2001
posts:731
votes: 0


I believe this has been implemented in both Google and Yahoo! since at least late '05. I have been using sitemaps since November of 2005.

I think this will not just reduce the spiders' technical overhead, but potentially create a legal protection for the search engines.

If the spiders are directed by the webmasters through sitemaps (versus 'just' scanning through everything), webmasters no longer have a basis for claims of 'illegal copying' or similar.

[edited by: Tapolyai at 9:01 pm (utc) on Nov. 17, 2006]

4:49 am on Nov 18, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:May 12, 2006
posts:112
votes: 0


The search industry is a winner here, as are publishers to a certain degree. It just depends on how the search engines use and interpret the information - some publishers may gain from this while others lose out, depending on the search engine.

1. Spam will have a much reduced effect on the search results, and therefore the quality of results should increase.

2. Quicker-refreshing search indexes with better quality results, as well as reduced bandwidth usage for both publisher and search engine.

3. In the future, with sitemaps becoming standardized and universal, small-business sites will be able to get indexed without having to redesign their sites due to poor crawlability.

4. Content that is never or rarely updated should not be constantly crawled by search spiders, and this can save a lot of bandwidth, especially on heavy-traffic sites.

The very fact that it has taken so long for some form of agreement to be reached shows that the search engines took their time before looking at collaboration, as ultimately all of them gain from better quality information rather than just relying on their own crawlers.

9:30 am on Nov 18, 2006 (gmt 0)

New User

10+ Year Member

joined:Feb 16, 2006
posts:17
votes: 0


They said Yahoo and MSN would pick up the XML file automatically.

So it isn't necessary to submit them through Yahoo Site Explorer like the normal urllist.txt or feed?

Thanks in advance

10:42 pm on Nov 18, 2006 (gmt 0)

New User

10+ Year Member

joined:Feb 16, 2006
posts:17
votes: 0


Submission through Yahoo Site Explorer is necessary, or an HTTP request.
3:00 am on Nov 19, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Sept 21, 2005
posts: 379
votes: 0


insight wrote:
Does anyone have the sitemap ping addresses for Google and Yahoo?

I found these in someone's blog:

[google.com...]
[search.live.com...]
[siteexplorer.search.yahoo.com...]

I haven't tested them though.

9:28 pm on Nov 19, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:Feb 13, 2005
posts:157
votes: 0



[search.live.com...]

This returns "Bad format while processing ping."
