

General Search Engine Marketing Issues Forum

This 35 message thread spans 2 pages: [1] 2
SiteMaps Protocol Endorsed by 3 SE's
Brett_Tabke

WebmasterWorld Administrator, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 3158223 posted 7:08 am on Nov 16, 2006 (gmt 0)

Released Wednesday at PubCon:

[sitemaps.org...]

Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.

Web crawlers usually discover pages from links within the site and from other sites. Sitemaps supplement this data to allow crawlers that support Sitemaps to pick up all URLs in the Sitemap and learn about those URLs using the associated metadata. Using the Sitemap protocol does not guarantee that web pages are included in search engines, but provides hints for web crawlers to do a better job of crawling your site.

Sitemap 0.90 is offered under the terms of the Attribution-ShareAlike Creative Commons License and has wide adoption, including support from Google, Yahoo!, and Microsoft.
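
For readers who have not seen the format, here is a minimal sketch of generating such a file with Python's standard library. The function name `build_sitemap` and the dict-based entry layout are illustrative assumptions, not part of the protocol; only the element names and the namespace come from the protocol itself.

```python
import xml.etree.ElementTree as ET

# Namespace used by the 0.9 sitemap schema.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return a Sitemap XML string for a list of URL entries.

    Each entry is a dict with a required 'loc' key and optional
    'lastmod', 'changefreq', and 'priority' keys (hypothetical layout).
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Emit the child elements in a fixed order, skipping absent ones.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([
        {
            "loc": "http://www.example.com/",
            "lastmod": "2005-01-01",
            "changefreq": "monthly",
            "priority": 0.8,
        },
    ]))
```

Only `loc` is treated as mandatory here; the other three fields are the optional hints described above.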


 

OldWolf

5+ Year Member



 
Msg#: 3158223 posted 10:34 am on Nov 16, 2006 (gmt 0)


The Yahoo Search blog announced:

'Together we're announcing [sitemaps.org...] which provides details of the current release of the Sitemaps protocol and will include future updates as we continue to collaborate on this common protocol. By offering an open standard for web sites, webmasters can use a single format to create a catalog of their site URLs and to notify changes to the major search engines. This should make it easier for web sites to provide search engines with content and metadata. And in turn, search engines can spend less time crawling unchanged pages and can update indexes faster as new content is discovered. This will help us reflect the changes more quickly, and improve our ability to provide more timely and relevant search results for users. Sitemaps is available to any site owner who wishes to communicate more easily with participating search engines. Simply create and upload an XML Sitemap and submit the URL of the file to search engines.'

A whois lookup on the domain says it was created on 12-Aug-2001 and the owner is Google.

[edited by: Brett_Tabke at 2:49 pm (utc) on Nov. 16, 2006]
[edit reason] fixed link [/edit]

OldWolf

5+ Year Member



 
Msg#: 3158223 posted 2:50 pm on Nov 16, 2006 (gmt 0)

Hi Brett

I just wonder: if I have a sitemap like this, is it still a bad idea to go more than 2 levels deep? I see pages around 4 levels deep with PR5.

You said at [webmasterworld.com...]:

All pages should be linked to more than one other page on your site, and not more than 2 levels deep from root.

Thanks

[edited by: encyclo at 3:06 pm (utc) on Nov. 16, 2006]
[edit reason] fixed link [/edit]

mlemos

10+ Year Member



 
Msg#: 3158223 posted 4:49 pm on Nov 16, 2006 (gmt 0)

I submitted a sitemap to Google and Yahoo. Is there a place to submit sitemaps to MSN Search?

travisk

10+ Year Member



 
Msg#: 3158223 posted 5:29 pm on Nov 16, 2006 (gmt 0)

How do Google, Yahoo and MSN benefit from sitemaps? Why is there such a big push for webmasters to sign up? There must be more to this than "Help us find all of your pages".

maherphil

10+ Year Member



 
Msg#: 3158223 posted 5:33 pm on Nov 16, 2006 (gmt 0)

trav - good point, the engines have been able to find pages fine before. If your site's pages are all spidered and ranking, why would a webmaster submit this data to the engines? Seems like the engines could be using this data against the webmaster.

Shoggoth

10+ Year Member



 
Msg#: 3158223 posted 5:39 pm on Nov 16, 2006 (gmt 0)

A whois lookup on the domain says it was created on 12-Aug-2001 and the owner is Google.

It all just looks like a repackaged and debranded version of the Sitemaps documentation on Google's site.

So is the news just that the Sitemap schema should point to the new domain, and that Yahoo (and Microsoft?) will now accept it?

I wonder if Yahoo SiteExplorer will continue to accept other formats (not that I won't be sending them to the sitemap schema'd file now, rather than an Atomized version).

Anyone have something tangible with regard to Microsoft? There isn't an existing sitemap service for MSN, is there?

kapow

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3158223 posted 6:20 pm on Nov 16, 2006 (gmt 0)

The first sample sitemap on www.sitemaps.org/protocol.html does not have a closing </url> tag!
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.example.com/</loc>
<lastmod>2005-01-01</lastmod>
<changefreq>monthly</changefreq>
<priority>0.8</priority>
</urlset>
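
A quick way to catch this kind of slip before submitting a file is to run it through an XML parser. The sketch below uses Python's standard library; the helper name `well_formed_error` is made up for illustration, and it only checks well-formedness, not validity against the sitemap schema.

```python
import xml.etree.ElementTree as ET


def well_formed_error(xml_text):
    """Return None if xml_text parses, otherwise the parser's error message."""
    try:
        ET.fromstring(xml_text.encode("utf-8"))
    except ET.ParseError as exc:
        return str(exc)
    return None


# The sample as quoted above, with the closing </url> tag missing.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.example.com/</loc>
<lastmod>2005-01-01</lastmod>
<changefreq>monthly</changefreq>
<priority>0.8</priority>
</urlset>
"""

# The same sample with the missing closing tag put back.
FIXED = SAMPLE.replace("</priority>\n", "</priority>\n</url>\n")

if __name__ == "__main__":
    print("sample:", well_formed_error(SAMPLE))
    print("fixed: ", well_formed_error(FIXED))
```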

skuba

10+ Year Member



 
Msg#: 3158223 posted 7:18 pm on Nov 16, 2006 (gmt 0)

So, can we submit the exact same sitemap file to google and yahoo now?
I already have on Google, but not on Yahoo.

The one problem with the current system is that if you have different subdomains, you need a different sitemap for each subdomain. They don't allow one sitemap linking pages on another subdomain.

Example:
mysite.com site maps can't link to sub.mysite.com, you need to put another site map at sub.mysite.com just for pages under that domain. Does it make sense or am I tripping here?

Thanks
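
If one sitemap per subdomain is indeed required, a site's URL list can be split by host before the files are written. This is only a sketch under that assumption; `group_by_host` is a hypothetical helper and the hostnames are the ones from the example above.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_host(urls):
    """Group URLs by hostname so each host can get its own sitemap file."""
    groups = defaultdict(list)
    for url in urls:
        # Hostnames are case-insensitive, so normalise before grouping.
        groups[urlsplit(url).netloc.lower()].append(url)
    return dict(groups)


if __name__ == "__main__":
    pages = [
        "http://mysite.com/",
        "http://mysite.com/about.html",
        "http://sub.mysite.com/",
    ]
    for host, host_urls in group_by_host(pages).items():
        print(host, len(host_urls))
```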

maherphil

10+ Year Member



 
Msg#: 3158223 posted 7:19 pm on Nov 16, 2006 (gmt 0)

does not have a closing </url> tag!

I wouldn't expect a 0.9 version to have anything less ;)

WebWalla

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3158223 posted 7:24 pm on Nov 16, 2006 (gmt 0)

If your site's pages are all spidered and ranking, why would a webmaster submit this data to the engines?

It won't help ranking, but it will help you find out which pages have problems being spidered.

Oliver Henniges

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3158223 posted 7:43 pm on Nov 16, 2006 (gmt 0)

> There must be more to this than "Help us find all of your pages".

Someone said in the summer that the spiders are constantly fighting with 40% broken links. I assume they are fed up with dedicating their bandwidth to that crap. Search engines will always perform some spidering the way they used to, but I guess sites with correct sitemaps of this kind will be spidered much more often and more regularly in the future.

> So, can we submit the exact same sitemap file to google and yahoo now?

If I remember correctly the only difference to the sitemap the webmaster-central-console requires is the urlset-tag. It would be helpful if one of the google insiders could confirm if or when we might alternatively submit the urlset-tag listed in the example kapow gave.

insight

10+ Year Member



 
Msg#: 3158223 posted 7:49 pm on Nov 16, 2006 (gmt 0)

Does anyone have the sitemap ping addresses for Google and Yahoo?

It would be great to notify the search engines of a sitemap without having to create a million different Google, Yahoo and Microsoft Live ID logins.

mlemos

10+ Year Member



 
Msg#: 3158223 posted 8:06 pm on Nov 16, 2006 (gmt 0)

Insight, you need to submit your sitemaps to Google and Yahoo search sites. Once your site and maps are validated, they will poll the sitemaps regularly.

Sztraik

5+ Year Member



 
Msg#: 3158223 posted 8:08 pm on Nov 16, 2006 (gmt 0)

This morning I submitted 5 sitemaps I had in Google to Yahoo and all worked fine. I didn't have to authenticate the subfolder I use on one of them as I had to do with Google.

insight

10+ Year Member



 
Msg#: 3158223 posted 8:21 pm on Nov 16, 2006 (gmt 0)

Thanks for the reply mlemos.

I understand how to submit (http://www.sitemaps.org/faq.html#faq_after_submission), but I was wondering if anyone knew what the ping URLs were so I can submit automatically without creating specific Webmaster Central and Site Explorer logins:

You can also submit your Sitemap using an HTTP request (replace <searchengine_URL> with the URL provided by the search engine):
Issue your request to the following URL:

<searchengine_URL>/ping?sitemap=sitemap_url
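
Once the per-engine base URL is known, building the request is straightforward. The sketch below only constructs the ping URL; `build_ping_url` is a made-up helper name, `http://searchengine.example` is a placeholder rather than a real endpoint, and percent-encoding the sitemap URL is the usual practice for query-string values.

```python
from urllib.parse import quote


def build_ping_url(search_engine_url, sitemap_url):
    """Build '<searchengine_URL>/ping?sitemap=<encoded sitemap URL>'."""
    base = search_engine_url.rstrip("/")
    # safe="" makes quote() encode ':' and '/' as well.
    return f"{base}/ping?sitemap={quote(sitemap_url, safe='')}"


if __name__ == "__main__":
    # Placeholder base URL; substitute whatever each engine documents.
    print(build_ping_url("http://searchengine.example",
                         "http://www.example.com/sitemap.xml"))
```

The resulting URL can then be requested with any HTTP client, for example `urllib.request.urlopen`.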

goubarev

5+ Year Member



 
Msg#: 3158223 posted 9:26 pm on Nov 16, 2006 (gmt 0)

I believe there are 3 reasons why they did this

1. Duplicate Content

2. Page not found errors (as Oliver Henniges mentions above)

3. Faster refreshes (due to less downloading)

There were many, many discussions where webmasters complained that SE's keep on spidering and listing pages that weren't meant for that, mostly when it comes to dynamic content.

I think all 3 SE's have problems with listing duplicate pages. If there is a website like [sitemaps.org...], a webmaster can specify which of his pages are to be spidered and listed only once, and all the SE's will follow...

This will also help to get rid of "not found" pages faster, and will make refreshes faster, since SE's will download less...

In general: thumbs up to Google, MSN and Yahoo!

goubarev

5+ Year Member



 
Msg#: 3158223 posted 9:56 pm on Nov 16, 2006 (gmt 0)

Quick question...

You can also submit your Sitemap using an HTTP request (replace <searchengine_URL> with the URL provided by the search engine):

Could anybody please tell me what the <searchengine_URL> is for each of Google, Yahoo, and MSN?

For some reason they didn't include the most important info... duh...

insight

10+ Year Member



 
Msg#: 3158223 posted 10:13 pm on Nov 16, 2006 (gmt 0)

I'm looking for the same info, goubarev.

Here's Google's:
www.google.com/webmasters/sitemaps/ping?sitemap=www.domain.de/sitemap.xml

I'd really like to find one for Yahoo so I don't have to set up a bunch of Yahoo logins.

vite_rts

5+ Year Member



 
Msg#: 3158223 posted 10:40 pm on Nov 16, 2006 (gmt 0)

Why not create a Yahoo Site Explorer account?

[siteexplorer.search.yahoo.com...]

Phil_Payne

10+ Year Member



 
Msg#: 3158223 posted 11:51 pm on Nov 16, 2006 (gmt 0)

> How do Google, Yahoo and MSN benefit from sitemaps? Why is there such a big push for webmasters to sign up? There must be more to this than "Help us find all of your pages".

I think this is a delicate business. Google most certainly doesn't use XML sitemaps very much, though the ideals are lofty.

a) It's the only way a webmaster has of providing subjective input to the crawler - through the <priority> tag you can tell the crawler which of your pages you prefer over others. If it comes to a choice between two of your equally-ranked pages, the one you give the highest priority to should be presented first in the SERPs.

b) There is potential for bandwidth savings, though I'm not sure this has been thought through properly. First there is <lastmod>. In theory - with a "trusted" site - the crawler can use <lastmod> to decide which pages it should spider. This cuts both ways. If I find a minor and trivial typo, I fix it and upload the page. Today, the spider will spot the change on its next HEAD or "Get if modified" and upload the page for reindexing. Pointless, because anyone visiting the page will see the change anyway. So I could make the change, upload the page, and NOT modify the <lastmod>. Save the crawler bandwidth. Or if the change is significant, I update <lastmod>, reload the sitemap and resubmit it.

There are other ramifications, and I'm in the process of writing a web page. The concept of a "trusted" sitemap looms and promises to be a nastier issue than -30 penalties. A lot of the tools generate XML sitemaps with the current date in <lastmod> and a <changefreq> of daily - anyone expecting to fool a search engine thus is as dumb as Shrub.
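
To make the <lastmod> idea concrete, here is a rough sketch of how a crawler that trusts a sitemap could pick the pages worth refetching. This is purely an illustration of the reasoning above, not how any search engine is known to work; the function name is hypothetical and it assumes <lastmod> values start with a YYYY-MM-DD date.

```python
import xml.etree.ElementTree as ET
from datetime import date

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_changed_since(sitemap_xml, last_crawl):
    """Return the <loc> values whose <lastmod> is later than last_crawl."""
    root = ET.fromstring(sitemap_xml.encode("utf-8"))
    changed = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if loc is None:
            continue
        # With no <lastmod> there is nothing to go on, so refetch anyway.
        # Assumption: the value begins with YYYY-MM-DD.
        if lastmod is None or date.fromisoformat(lastmod.strip()[:10]) > last_crawl:
            changed.append(loc.strip())
    return changed


if __name__ == "__main__":
    sitemap = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url><loc>http://www.example.com/old.html</loc><lastmod>2005-01-01</lastmod></url>
<url><loc>http://www.example.com/new.html</loc><lastmod>2006-11-15</lastmod></url>
</urlset>"""
    print(urls_changed_since(sitemap, date(2006, 1, 1)))
```

Under this scheme a trivial typo fix that leaves <lastmod> untouched would simply never show up in the list.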

davidof

10+ Year Member



 
Msg#: 3158223 posted 8:20 am on Nov 17, 2006 (gmt 0)

> Pointless, because anyone visiting the page will see the change anyway. So I could make the change, upload the page, and NOT modify the <lastmod>.

This is how I have been using sitemaps. I have a lot of archive news articles on one site I don't want Google to bother reindexing on a regular basis but I would like it indexed once as it is an important reference for some people, so I set change frequency to 1 month. At the moment it hasn't had much effect on the spider but we'll see in the future.

ChuckyG

5+ Year Member



 
Msg#: 3158223 posted 5:55 pm on Nov 17, 2006 (gmt 0)

It appears MSN hasn't implemented it yet:

[blogs.msdn.com]

Oliver Henniges

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3158223 posted 6:21 pm on Nov 17, 2006 (gmt 0)

> First there is <lastmod>

I have a number of pages indexed where I present a random collection of our products to my customers. The only way to account for this correctly in a sitemap with respect to this lastmod tag would be to let the xml-file run through the php-parser (or any other language) and insert today's value. I don't do that yet, I'm happy if the pages are indexed at all, so I leave the lastmod date on the day I changed some of the rest of the pages.

I assume spiders cannot read the correct lastmod-date from the filesystem, can they? How would you handle this?
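
A spider cannot see the filesystem, but the script that writes the sitemap can. As a sketch, assuming the pages are static files on disk, the generator could take <lastmod> from each file's modification time. `lastmod_from_file` is a hypothetical helper; for pages built from random or dynamic content the file time says little, and the script would have to supply its own date instead.

```python
import os
from datetime import datetime, timezone


def lastmod_from_file(path):
    """Return the file's modification time as a YYYY-MM-DD string (UTC)."""
    mtime = os.path.getmtime(path)
    return datetime.fromtimestamp(mtime, tz=timezone.utc).strftime("%Y-%m-%d")


if __name__ == "__main__":
    # Uses this script's own file as a stand-in for a page on disk.
    print(lastmod_from_file(__file__))
```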

Tapolyai

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3158223 posted 8:46 pm on Nov 17, 2006 (gmt 0)

I believe this has been implemented both in Google and Yahoo! at least since late '05. I have been using the site maps since November of 2005.

I think this will not just reduce the spider technical overhead, but potentially create a legal protection for the search engines.

If the spiders can be targeted by the webmasters' direction through sitemaps (versus 'just' scanning through everything), the webmasters no longer have a case for claims of 'illegal copying' or similar.

[edited by: Tapolyai at 9:01 pm (utc) on Nov. 17, 2006]

Ganceann

5+ Year Member



 
Msg#: 3158223 posted 4:49 am on Nov 18, 2006 (gmt 0)

The search industry is a winner in this as well as publishers to a certain degree. It just depends how all the search engines use the information and interpret it - some publishers may gain from this while others lose out depending on the search engine.

1. Spam will have a much reduced effect on the search results and therefore the quality of results should increase.

2. Quicker refreshing search indexes with better quality results as well as reduced bandwidth usage for both publisher and search engine.

3. In the future, with the sitemaps becoming standardized and universal it will enable small business sites to get indexed without having to redesign their site due to poor crawlability.

4. Content that is never updated or rarely updated should not be constantly crawled by search spiders and this can save a lot of bandwidth on heavy traffic sites especially.

The very fact that it has taken so long for some form of agreement to be made shows that the search engines took their time before looking at collaboration as ultimately all of them gain from better quality information rather than just relying on their own crawlers.

rxbbx

5+ Year Member



 
Msg#: 3158223 posted 9:30 am on Nov 18, 2006 (gmt 0)

They said Yahoo and MSN would pick up the xml file automatically..

So it isn't necessary to submit them through Yahoo Site Explorer like the normal urllist.txt or feed?

Thanks in advance

rxbbx

5+ Year Member



 
Msg#: 3158223 posted 10:42 pm on Nov 18, 2006 (gmt 0)

Submission through Yahoo Site Explorer or an HTTP request is necessary.

Mokita

5+ Year Member



 
Msg#: 3158223 posted 3:00 am on Nov 19, 2006 (gmt 0)

insight wrote:
Does anyone have the sitemap ping addresses for Google and Yahoo?

I found these in someone's blog:

[google.com...]
[search.live.com...]
https://siteexplorer.search.yahoo.com/submit/ping?sitemap=sitemap_url

I haven't tested them though.

corbing

5+ Year Member



 
Msg#: 3158223 posted 9:28 pm on Nov 19, 2006 (gmt 0)


[search.live.com...]

This returns "Bad format while processing ping."
