
Google SEO News and Discussion Forum

    
sitemap.xml still has problems
amythepoet




msg:4372511
 6:16 am on Oct 10, 2011 (gmt 0)

Hi,

Now I have fixed all the errors, but in sitemap.xml I see "priority blah blah" all over the place.

I took that out, and I got a message from Google earlier saying to let Google decide when it will crawl. That's fine by me.
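
For anyone who would rather strip the priority entries with a script than by hand, here is a minimal sketch. It assumes the file uses the standard sitemaps.org namespace; the function name `strip_priority` is made up for illustration and is not part of any tool mentioned in this thread.

```python
import xml.etree.ElementTree as ET

# Assumption: the sitemap declares the standard sitemaps.org 0.9 namespace.
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def strip_priority(xml_text):
    """Return the sitemap XML with every <priority> element removed."""
    # Keep the default namespace on output instead of an "ns0:" prefix.
    ET.register_namespace("", NS)
    root = ET.fromstring(xml_text)
    for url in root.findall(f"{{{NS}}}url"):
        for tag in url.findall(f"{{{NS}}}priority"):
            url.remove(tag)
    return ET.tostring(root, encoding="unicode")
```

The `<loc>` entries are left alone; only the `<priority>` children of each `<url>` are dropped.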

I have to do something with the settings and I don't know how to locate it in Google Webmaster Tools. Any help?

 

lucy24




msg:4372525
 7:31 am on Oct 10, 2011 (gmt 0)

Site configuration >> Settings

The crawl rate affects the speed of Googlebot's requests during the crawl process. It has no effect on how often Googlebot crawls your site. Google determines the recommended rate based on the number of pages in your site.

amythepoet




msg:4372690
 12:47 pm on Oct 10, 2011 (gmt 0)

Lucy, I found it, thank you

now on the same page it says:


Learn more
Don't set a preferred domain
Display URLs as www.mysite
Display URLs as my site

which one should I choose?

Many THANKS!

lucy24




msg:4372746
 2:43 pm on Oct 10, 2011 (gmt 0)

Either with or without www according to personal preference. And redirect the same way in your htaccess (or let the host do it for you). Otherwise you get the dreaded Duplicate Content.

tedster




msg:4372772
 3:18 pm on Oct 10, 2011 (gmt 0)

Amy, I'd pick the version of your URLs that already shows up most in your site: operator results. If you're already indexed mostly one way and then ask Google to change to the other, it can cause a disruption in your traffic... at least it used to.

Also as Lucy recommended, make sure your .htaccess redirect goes in the same direction.
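
This is not the .htaccess rule itself, just a rough sketch of the idea behind it: every URL gets mapped to one preferred host form, and anything that differs from its mapped form is what the redirect should catch. The function names and the example.com addresses are placeholders, and the sketch ignores any user:password part of a URL.

```python
from urllib.parse import urlsplit, urlunsplit


def to_preferred_host(url, prefer_www=True):
    """Rewrite url so its host is in the preferred form (www or bare)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    bare = host[4:] if host.startswith("www.") else host
    target = "www." + bare if prefer_www else bare
    # Keep an explicit port if the original URL had one.
    netloc = target if parts.port is None else f"{target}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def needs_redirect(url, prefer_www=True):
    """True if url is not already in the preferred host form."""
    return to_preferred_host(url, prefer_www) != url
```

Running the URLs in your sitemap through `needs_redirect` is one way to check that the sitemap, the preferred-domain setting and the redirect all point the same way.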

Sally Stitts




msg:4373048
 12:40 am on Oct 11, 2011 (gmt 0)

I grew weary messing around with xml sitemaps.

I now just support 2 sitemaps -
One sitemap.html for my visitors, and

one sitemap.txt for Google and Bing - just a list of alphabetized URLs.
Google gobbles it up from Webmaster Tools within minutes of submission, and indexes new pages within a day or two.
On one occasion, Google gobbled it up INSTANTLY!

It takes days and days for Bing just to download the list.
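
A list like that is easy to generate. Here is a minimal sketch, assuming you already have your page URLs in a Python list; the function names and the default file name are only illustrative.

```python
def build_sitemap_txt(urls):
    """Return the text of a plain-text sitemap: one URL per line, sorted."""
    # Drop blanks and duplicates, then alphabetize.
    unique = sorted({u.strip() for u in urls if u.strip()})
    return "\n".join(unique) + "\n"


def write_sitemap_txt(urls, path="sitemap.txt"):
    """Write the sitemap as UTF-8 text with Unix line endings."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write(build_sitemap_txt(urls))
```

Sorting is only for your own convenience when comparing lists later; nothing in this thread suggests the search engines care about the order.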
.

Reno




msg:4373051
 1:09 am on Oct 11, 2011 (gmt 0)

one sitemap.txt for Google and Bing - just a list of alphabetized URLs.

If we want to utilize this same method, would you put a link somewhere on one of the primary pages so Google can find sitemap.txt? Or is it better to put a link to it on sitemap.html? And does it matter that it be named "sitemap.txt"?

.........................

Sally Stitts




msg:4373059
 2:06 am on Oct 11, 2011 (gmt 0)

... would you put a link somewhere on one of the primary pages so Google can find sitemap.txt


I don't think that is even necessary. If you go to Webmaster Tools, and click on "Site Configuration" at the top, and then "Sitemaps", you can input your site map directly.

I don't know how strict they are about a naming convention. I would imagine that /site-map.txt might be OK. But why try to name it something obscure, unless you are trying to hide it? /qwerty1.txt might even be OK, since you are telling them the name, and that it is a site map. But I am not sure about that. If someone really wants my sitemap as text, they can always copy my html site map, and strip it down.

I did remove a link to sitemap.txt from my sitemap.html file, because only Google and Bing have a need to see it. So, I have NO link to sitemap.txt from my site.
.

[edited by: Sally_Stitts at 2:14 am (utc) on Oct 11, 2011]

lucy24




msg:4373060
 2:10 am on Oct 11, 2011 (gmt 0)

If you're submitting it manually to gwt or listing it in your robots.txt you can probably call it just about anything. If you want search engines to find it by blind luck, better stick with sitemap.txt. Or sitemap.xml.
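
Listing it in robots.txt means adding a "Sitemap:" line with the full URL of the file. As a rough sketch, Python's standard library can read those lines back out, which is a quick way to confirm what a crawler would see. The helper name is made up, and `site_maps()` needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_in_robots(robots_text):
    """Return the sitemap URLs declared in the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # site_maps() returns None when no Sitemap: lines are present.
    return parser.site_maps() or []
```

For example, a robots.txt containing the line `Sitemap: http://www.example.com/qwerty1.txt` (a placeholder address) would return that one URL in the list.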

https://www.google.com/support/webmasters/bin/answer.py?answer=183668 * sez

For best results, follow these guidelines:

* You must fully specify URLs, as Google attempts to crawl them exactly as provided.
* The text file must use UTF-8 encoding.
* The text file should contain nothing but the list of URLs.
* You can name the text file anything you wish. Google recommends giving the file a .txt extension (for instance, sitemap.txt).

If you are based in a country that uses Roman script you can ignore the UTF-8 business, because your URLs will naturally not include any non-ASCII characters. (Titles, sure. Filenames, nuh-uh.)
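
If you want to check a text sitemap against those guidelines before submitting it, something along these lines would do. It is only a sketch: the function name is invented, "fully specified" is taken to mean an http or https URL with a host, and blank lines are tolerated.

```python
from urllib.parse import urlsplit


def check_sitemap_bytes(data):
    """Return a list of (line_number, problem) for a sitemap.txt given as bytes."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return [(0, "file is not valid UTF-8")]
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        url = line.strip()
        if not url:
            continue  # tolerate blank lines in this sketch
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append((number, "not a fully specified URL"))
        elif " " in url:
            problems.append((number, "contains something other than a single URL"))
    return problems
```

Read the file in binary mode (`open(path, "rb").read()`) so the UTF-8 check sees the raw bytes. An empty list back means none of these particular problems were found.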


* I checked in a different browser because I got the URL from "inside". You don't have to be signed in to GWT to see the page. And I have no idea why it didn't obfuscate :(

Reno




msg:4373061
 2:15 am on Oct 11, 2011 (gmt 0)

Thanks Sally ~ will look into this deeper. The reason I asked is because of the discrepancy that I sometimes see in GWT between the number of URLs in my (long ago) submitted sitemap.xml, and the number of URLs that they show in the index. For some unexplained reason, the latter is occasionally much lower than the former. I don't care if they don't match exactly, but if there are 50 submitted and only 20 in their index, then something is amiss. And I hasten to add that the missing pages are not dupes and are linked consistently into other primary pages. It's yet one more mystery when dealing with Google...

.......................

Sally Stitts




msg:4373065
 2:17 am on Oct 11, 2011 (gmt 0)

You can name the text file anything you wish.


Ahh, someone with the facts. There you have it.


Last week I broke down and updated my "Master" index. I too became concerned that my file counts were flaky, depending upon where I looked. It was painful, but I found about 5 problem files (out of ~350). Now, everything agrees - masterfile, sitemap.html and sitemap.txt. I feel better, when I THINK I know what I am doing - ha-ha.

Now, if I could only find out which 2 files Google is NOT indexing - I haven't figured out how to do that yet. But it is only 2 files, so the heck with it.
.

Reno




msg:4373076
 3:27 am on Oct 11, 2011 (gmt 0)

if I could only find out which 2 files Google is NOT indexing

What I did was to use site:domainname.com as my Google query, then I manually copied each returned address into a text file, sorted them, and compared to my own list. After doing all that (it was a considerably smaller site than yours!), then I went into GWT and did a Fetch as Googlebot for each of the missing URLs. That process just took place in the past few days and so far, nothing has changed so I cannot say definitely that the problem is solved.
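
The sort-and-compare step can be done with a set difference instead of by eye. A minimal sketch, assuming both lists are already plain lists of URL strings (the copying from the site: results still has to be done by hand); the function name is made up for illustration.

```python
def missing_from_index(submitted, indexed):
    """Return the submitted URLs that do not appear in the indexed list, sorted."""
    indexed_set = {u.strip() for u in indexed if u.strip()}
    submitted_set = {u.strip() for u in submitted if u.strip()}
    return sorted(submitted_set - indexed_set)
```

The comparison is exact, so a www versus non-www mismatch or a trailing slash on one side will show up as "missing" even when the page is indexed.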

........................


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved