Google Sitemaps Problem

   
12:59 am on Feb 26, 2006 (gmt 0)

5+ Year Member



Aloha

and thank you in advance for any help, sitemaps is driving me insane

i downloaded this freeware program that creates a sitemap for you at the click of a button.

it's a pretty popular program from what i can tell.

i will tell you exactly what i did, i know very little about this stuff so i'm going to be painstakingly detailed to make sure anyone who knows what's wrong can answer it (and it's probably very easy).

On the sitemap making program I extracted all the urls from my site (it actually helped me because many urls I found out weren't linked to anything and weren't getting spidered!).

I clicked "generate Google XML"

Clicked "file" - "save map as" - google sitemap xml

then saved the whole thing as "sitemap_index.xml" in my root directory

then i uploaded it via ftp

when i clicked submit sitemap in google it gave me the following error:

Parsing error (Line 2) We were unable to read your Sitemap. It may contain an entry we are unable to recognize. Please validate your Sitemap before resubmitting

The text at the beginning of my sitemap is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<!--Google Site Map File Generated by [example...] of site building site.net Sat, 25 Feb 2006 16:49:04 GMT-->
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
<url>
<loc>http://www.mysite.com/example/outbound/usa/washington-dc.html</loc>
</url>
<url>

and a list of urls in that format

it ends like this:

</url>
</urlset>

thanks for any help anyone can give, these things are giving me a headache, though i do owe a lot to that sitemap generator for catching all those unlinked urls

3:42 am on Feb 26, 2006 (gmt 0)

5+ Year Member



You can try these lines for the urlset element.

<urlset
xmlns="http://www.google.com/schemas/sitemap/0.84"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.google.com/schemas/sitemap/0.84
[google.com...]

4:38 am on Feb 26, 2006 (gmt 0)

10+ Year Member



Hi,
I think you get that error when you have an invalid character in one of your URLs. Try opening the file in IE. If it fails to load, look for the last line shown. Examine the next line and replace any strange characters in the file or directory name with their Unicode equivalents. Also update all references and links.
Check the section titled:
"Entity escaping"
here:
[google.com...]
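
For anyone finding this thread later, here is a minimal Python sketch of what entity escaping means for a URL inside a loc element. The URL and the helper name escape_loc are made up for illustration; escape() comes from the standard library and handles &, < and > by default, and the extra mapping adds the two quote characters.

```python
from xml.sax.saxutils import escape

# Extra entities beyond the defaults (&, <, >) that escape() already handles.
EXTRA_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def escape_loc(url: str) -> str:
    """Entity-escape a URL so it is safe as the text of a <loc> element."""
    return escape(url, EXTRA_ENTITIES)


# Hypothetical URL with a query string; a raw "&" would break the XML parser.
raw_url = "http://www.example.com/view.html?page=1&lang=en"
escaped_url = escape_loc(raw_url)
print(escaped_url)
```

This only covers entity escaping. Non-ASCII characters in a path would still need URL encoding, which is a separate step.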
6:10 am on Feb 26, 2006 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I'm a bit suspicious of the commented line. It is line 2 by my count, and you have edited it so your post can comply with our Terms of Service (thank you!)

Dashes inside an XML comment can confuse an XML parser: the XML specification does not allow a double dash (--) anywhere inside a comment. If the actual domain name, or anything else inside the comment, contains two dashes in a row, that may be the problem. Or there may be some other confusing character in the part that you've edited.

However, it would be quite odd for the program to inject a problematic comment line, but you never know. There's nothing "off" that I can see in the rest of what you've pasted in. Might be good to run the output through your own validator. Google recommends the xml validation tools listed on the W3C site:
[w3.org...]
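
If you would rather check locally first, a rough Python sketch along these lines reports the line of the first parse error. It only tests well-formedness, not the Google schema, and the sample documents and the name check_well_formed are assumptions for illustration, not the original poster's file.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_bytes: bytes):
    """Return None if the document parses, else (line, column, message)."""
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None


# A small sitemap with a harmless comment on line 2.
good = b"""<?xml version="1.0" encoding="UTF-8"?>
<!-- generated by an example tool -->
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
<url><loc>http://www.mysite.com/index.htm</loc></url>
</urlset>"""

# Same file, but the comment now contains a double dash.
bad = good.replace(b"an example tool", b"example--tool")

print(check_well_formed(good))
print(check_well_formed(bad))
```

The second call should point at line 2, which matches the kind of "Parsing error (Line 2)" message reported above.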

11:49 pm on Feb 26, 2006 (gmt 0)

5+ Year Member



I have some blurred memory about cr/lf vs cr problems in one of the files I edited by hand using notepad.

But it is 1 a.m. here and I am getting sleepy, so I can be wrong.

5:08 am on Feb 27, 2006 (gmt 0)

10+ Year Member



I would tend to just take line 2 out altogether, seeing as it's a comment...
6:51 am on Feb 27, 2006 (gmt 0)

5+ Year Member



i deleted line 2 above and now it says "ok" in google sitemaps

thank you!

one more question: do i need to run the site map maker every time i add new pages to my site or will it update itself?

7:22 am on Feb 27, 2006 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Because you needed to save the xml sitemap file manually to the root directory on the server, I'm pretty sure you will need to run it every time you change the site structure and then save that newly created file to the same place, overwriting the old file.

For the program to automatically update the sitemap, you would need to have installed the application on the server and then also automated it somehow -- you described none of those actions.
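
As a rough idea of what such automation could look like, here is a minimal Python sketch that rebuilds the sitemap from a list of URLs. It is not how the program discussed here works; the function names and the hard-coded URL list are assumptions, and a real script would collect the URLs from the site and be run on a schedule.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespace taken from the sitemap posted earlier in this thread.
SITEMAP_NS = "http://www.google.com/schemas/sitemap/0.84"
XML_DECLARATION = b'<?xml version="1.0" encoding="UTF-8"?>\n'


def build_sitemap(urls) -> bytes:
    """Build a urlset document with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for page_url in urls:
        url_el = ET.SubElement(urlset, "url")
        # ElementTree entity-escapes the text when serializing.
        ET.SubElement(url_el, "loc").text = page_url
    return XML_DECLARATION + ET.tostring(urlset, encoding="utf-8")


def write_sitemap(urls, path: str) -> None:
    """Overwrite the sitemap file at `path` with a freshly built one."""
    Path(path).write_bytes(build_sitemap(urls))


# Hypothetical page list; in practice this comes from crawling the site.
page_urls = [
    "http://www.mysite.com/index.htm",
    "http://www.mysite.com/example/outbound/usa/washington-dc.html",
]
sitemap_bytes = build_sitemap(page_urls)
print(sitemap_bytes.decode("utf-8"))
```

Calling write_sitemap(page_urls, "sitemap.xml") after each site change would replace the manual "save map as" step; uploading the file is still a separate job.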

12:58 pm on Feb 27, 2006 (gmt 0)

5+ Year Member



Have you looked at the softplus gsite crawler? It has an automated process for producing the gzip file and can FTP the file if required.

I have found it very useful after spending quite a while trying to put together the sitemap through various other programs.
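
If you are producing the file yourself, the gzip step is simple to do by hand. This is a small Python sketch using only the standard library, not a description of how that crawler does it; gzip_sitemap and the sample document are made up for illustration.

```python
import gzip


def gzip_sitemap(xml_bytes: bytes) -> bytes:
    """Return the gzip-compressed form of a sitemap document."""
    return gzip.compress(xml_bytes)


# Tiny stand-in sitemap; a real one would list every page.
sample = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">'
    b"<url><loc>http://www.mysite.com/index.htm</loc></url>"
    b"</urlset>"
)
compressed = gzip_sitemap(sample)
print(len(sample), len(compressed))
```

Writing the compressed bytes to a file such as sitemap.xml.gz gives you something you can upload in place of the plain XML.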

4:28 pm on Feb 27, 2006 (gmt 0)

10+ Year Member



Jamie-uk, thanks for that tip.
8:38 pm on Feb 28, 2006 (gmt 0)

5+ Year Member



Thanks everyone
3:26 am on Mar 1, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi!

I have a similar problem. When I validate my sitemap, I get this:

Validation Status:

Your XML file at [mysite.com...]
Does Not Validate to the Google Schema Definition because of a parsing error.

--------------------------------------------------------------------------------

Reported Errors:

/sitemap/testbed/3aa8ad1cde2a1786b21e5d132dafed3a.xml:1: parser error : Document is empty

^
/sitemap/testbed/3aa8ad1cde2a1786b21e5d132dafed3a.xml:1: parser error : Start tag expected, '<' not found

^

To find out what's wrong, I made a minimal sitemap03.xml containing just this:


<?xml version="1.0" encoding="UTF-8"?>
<urlset
xmlns="http://www.google.com/schemas/sitemap/0.84"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.google.com/schemas/sitemap/0.84
http://www.google.com/schemas/sitemap/0.84/sitemap.xsd">
<url>
<loc>http://www.mysite.com/index.htm</loc>
<lastmod>2006-1-23</lastmod>
<changefreq>weekly</changefreq>
<priority>0.6</priority>
</url>
</urlset>

I really have no clue what to do, since everything I try gives me the same result...