
Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
Video sitemaps and CDNs
Having index and sitemap files on different domains
Mishaelo




msg:4367786
 2:02 pm on Sep 27, 2011 (gmt 0)

Hi,

Trying to add a video sitemap to our site.

Videos are divided into categories so the plan is to have a sitemap index pointing to the actual sitemaps for each category.

The sitemap index will reside on the root, but our developers want to host the actual sitemap files on the CDN (under media.domain.com).

Is this possible, and what will the robots.txt need to include so that the end result is:
1. The sitemap index is at domain.com/sitemap_index.xml
2. The sitemap index points to the individual category sitemaps at media.domain.com/sitemap_cat1.xml, media.domain.com/sitemap_cat2.xml, etc.
3. media.domain.com/sitemap_cat1.xml, media.domain.com/sitemap_cat2.xml, etc. list videos at domain.com/video1.php, domain.com/video2.php, etc.

hope this makes sense...

thanks
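
For anyone trying to picture the layout in points 1 and 2, here is a minimal Python sketch that builds such a sitemap index. The `http://` scheme and the helper name `build_sitemap_index` are assumptions made for illustration; the post itself gives only bare hostnames and file names.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document (as a string) listing the given sitemap URLs."""
    # Serialize the sitemap namespace as the default one (no prefix on tags).
    ET.register_namespace("", SITEMAP_NS)
    root = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for url in sitemap_urls:
        entry = ET.SubElement(root, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
    return ET.tostring(root, encoding="unicode")


# Category sitemaps hosted on the CDN subdomain, as in point 2 (scheme assumed).
index_xml = build_sitemap_index([
    "http://media.domain.com/sitemap_cat1.xml",
    "http://media.domain.com/sitemap_cat2.xml",
])
print(index_xml)
```

This only shows what the index file would contain. Whether search engines accept an index on one hostname pointing at sitemaps on another is the open question of the thread.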

 

phranque




msg:4368152
 9:19 am on Sep 28, 2011 (gmt 0)

i'm not sure you can do it precisely as described.
i believe you will have to specify the sitemap location for each hostname in that host's robots.txt file.

Sitemap file location:
http://www.sitemaps.org/protocol.php#location
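
As a rough illustration of the location rule on that page: by default a sitemap may only list URLs on its own scheme and host, at or below the directory the sitemap file sits in. The Python sketch below checks just that default rule. The function name `in_default_scope` and the `http://` scheme are assumptions, and the check deliberately ignores the robots.txt cross-submission exception the protocol also describes.

```python
import posixpath
from urllib.parse import urlsplit


def in_default_scope(sitemap_url, page_url):
    """True if page_url may be listed in the sitemap under the default location rule.

    Default rule only: same scheme, same host, and a path at or below the
    sitemap's directory. Cross-submission via robots.txt is not considered.
    """
    s, p = urlsplit(sitemap_url), urlsplit(page_url)
    if (s.scheme.lower(), s.netloc.lower()) != (p.scheme.lower(), p.netloc.lower()):
        return False
    base_dir = posixpath.dirname(s.path)
    if not base_dir.endswith("/"):
        base_dir += "/"
    return p.path.startswith(base_dir)


# Point 3 of the original plan: a CDN-hosted sitemap listing pages on the root host.
print(in_default_scope("http://media.domain.com/sitemap_cat1.xml",
                       "http://domain.com/video1.php"))
# Same sitemap file name served from the root host instead.
print(in_default_scope("http://domain.com/sitemap_cat1.xml",
                       "http://domain.com/video1.php"))
```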

Mishaelo




msg:4368162
 9:51 am on Sep 28, 2011 (gmt 0)

but what if the sitemap index is on the root and the GZIP files are on the CDN subdomain?

I'm not sure what goes in the root robots.txt and what goes in the subdomain robots.txt.

phranque




msg:4368197
 11:16 am on Sep 28, 2011 (gmt 0)

the root robots.txt refers to a sitemap (index?) on the root hostname which only refers to urls on the root hostname.

the subdomain robots.txt refers to its sitemap index on the root hostname.
the CDN subdomain's sitemap index only refers to the individual category sitemaps on the CDN subdomain's hostname.
so far, so good.
the part where i think you are going to have an issue (see #3 above) is that these individual category sitemaps may only refer to urls on the CDN subdomain's hostname.

this is all exactly what is described at the url i provided above.
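
To see which sitemaps each hostname's robots.txt actually declares, a small Python sketch using the standard library parser may help. The robots.txt contents below are hypothetical (in particular the file name `media_sitemap_index.xml` and the `http://` scheme are made up for illustration), and `RobotFileParser.site_maps()` needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_declared(robots_txt):
    """Return the Sitemap URLs declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


# Hypothetical robots.txt served at domain.com/robots.txt
ROOT_ROBOTS = """\
User-agent: *
Disallow:

Sitemap: http://domain.com/sitemap_index.xml
"""

# Hypothetical robots.txt served at media.domain.com/robots.txt,
# pointing at an index kept on the root hostname.
MEDIA_ROBOTS = """\
User-agent: *
Disallow:

Sitemap: http://domain.com/media_sitemap_index.xml
"""

for host, text in (("domain.com", ROOT_ROBOTS), ("media.domain.com", MEDIA_ROBOTS)):
    print(host, sitemaps_declared(text))
```

Note that the Sitemap line takes a full URL, including the scheme.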

Mishaelo




msg:4375379
 8:29 am on Oct 17, 2011 (gmt 0)

How will 301 redirects come into play here?

Meaning everything will look "normal":
example.com/sitemap_index.xml will be an index file pointing to:
example.com/sitemap_cat1.xml, example.com/sitemap_cat2.xml, etc.

But example.com/sitemap_index.xml will have a 301 to media.example.com/folder1/folder2/sitemap_index.xml
and example.com/sitemap_cat1.xml will have a 301 to media.example.com/folder1/folder2/sitemap_cat1.xml, etc.


And example.com/robots.txt will have:
sitemap: example.com/sitemap_index.xml


Does this comply with the protocol?

phranque




msg:4376246
 10:57 pm on Oct 18, 2011 (gmt 0)

why would you redirect requests for your sitemaps?

Mishaelo




msg:4376458
 9:50 am on Oct 19, 2011 (gmt 0)

"why would you redirect requests for your sitemaps?"

It was a request from the developers.

In any case, if anyone is facing a similar problem: the issue was resolved by keeping the index file on the root (example.com/sitemap_index.xml) and having it point to sitemap files that are also on the root but redirect to the CDN.
Webmaster Tools (WMT) seems happy with it.
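
For reference, the redirect mapping can be sketched in Python as below. The CDN host and folder path are taken from the earlier post about 301s; the `http://` scheme and the function name `cdn_redirect_target` are assumptions, and the real redirects would be configured on the web server rather than in a script like this.

```python
from urllib.parse import urlsplit, urlunsplit

# Values taken from the earlier post in this thread.
CDN_HOST = "media.example.com"
CDN_PREFIX = "/folder1/folder2"


def cdn_redirect_target(sitemap_url):
    """Map a root-hosted sitemap URL to the CDN URL it would 301 to."""
    parts = urlsplit(sitemap_url)
    filename = parts.path.rsplit("/", 1)[-1]
    return urlunsplit((parts.scheme, CDN_HOST, f"{CDN_PREFIX}/{filename}", "", ""))


for name in ("sitemap_cat1.xml", "sitemap_cat2.xml"):
    source = f"http://example.com/{name}"
    print(source, "->", cdn_redirect_target(source))
```

The URLs that crawlers see in robots.txt and in the index stay on example.com; only the file delivery moves to the CDN.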

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved