Google Sitemap Generator Not Working With Python 3

And is a sitemap really needed these days?

         

Frank_Rizzo

10:16 am on Feb 2, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I had used Google's old sitemap generator for years. It worked well, scanning my own sanitised list of URLs and creating 12 xml.gz files of 50,000 lines each. Google read them and displayed the info nicely in WMT.

Last year I upgraded the server to a later OS, and that changed the Python version from 2 to 3. This broke the sitemap_gen.py script.

I tried modifying the code to fix the breaking changes (print is a function, tabs and indents issues, True is a keyword etc) but gave up after an hour or so.

1. Is there a newer equivalent: a simple command line tool that creates the sitemaps from reading a list of urls? I can't find one anywhere, and all references to the source of google's python script give a 404.

2. Is there any need to do this in this day and age? Sitemaps were deemed to be essential for 'new sites' like 20 years ago but has that need now gone because SEs are more advanced in crawling?

NickMNS

1:20 pm on Feb 2, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I tried modifying the code to fix the breaking changes (print is a function, tabs and indents issues, True is a keyword etc) but gave up after an hour or so.

The differences between Python 2 and 3 tend to be minor for relatively simple scripts. "print is a function" is the most common one: in Python 2 the argument to print could be written without parentheses, but in Python 3 the parentheses are required. E.g.:

#Python2
>>> print "Hello World"

#Python3
>>> print("Hello World")


As for tabs and indent issues, the indentation rules themselves haven't changed, but Python 3 rejects code that mixes tabs and spaces inconsistently (it raises a TabError), where Python 2 let it through. So my guess is that the old script, or your text editor, is mixing tabs with spaces. The convention is four spaces per level, and consistent indentation is required for the code to run. For example, the body of an if statement must be indented, and the next line without indentation is treated as outside the statement. Example:

my_var = 0
if my_var > 0:
    my_var += 1
    print("My variable is more than 0:", my_var) #this will run when my var > 0

print("My variable is zero:", my_var) #this will always run regardless of the value of my var
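When porting an old script, it may help to confirm whether indentation is really what Python 3 is objecting to. Below is a minimal sketch using the built-in compile(); the helper name indentation_error and the two sample strings are made up for illustration. The first sample indents one line with a tab and the next with eight spaces, which Python 3 refuses.

```python
def indentation_error(source):
    """Return a short description if `source` has an indentation problem, else None."""
    try:
        compile(source, "<example>", "exec")
    except IndentationError as exc:  # TabError is a subclass of IndentationError
        return f"{type(exc).__name__}: {exc.msg}"
    return None


# Hypothetical samples: one mixes a tab with spaces, the other uses spaces only.
mixed = "if True:\n\tx = 1\n        y = 2\n"
spaces_only = "if True:\n    x = 1\n    y = 2\n"

print(indentation_error(mixed))
print(indentation_error(spaces_only))
```

If the first call reports a TabError, converting the file's tabs to spaces in the editor is usually the quickest fix.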


1. Is there a newer equivalent: a simple command line tool that creates the sitemaps from reading a list of urls? I can't find one anywhere, and all references to the source of google's python script give a 404.

I don't know. I have always written my own as it is pretty straightforward, but I haven't needed to in a while.
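To give an idea of what "writing your own" can look like, here is a minimal Python 3 sketch that reads a list of URLs (one per line) and writes gzipped XML sitemaps of up to 50,000 URLs each, plus a sitemap index. It is not a port of sitemap_gen.py. The function names, the file names (sitemap1.xml.gz, sitemap_index.xml) and the example base URL are assumptions chosen for illustration, and it only emits the loc element, with no lastmod or priority.

```python
import gzip
from pathlib import Path
from xml.sax.saxutils import escape

MAX_URLS_PER_FILE = 50_000  # per-file URL limit in the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_HEADER = '<?xml version="1.0" encoding="UTF-8"?>'
QUOTES = {"'": "&apos;", '"': "&quot;"}  # escape() handles &, < and > itself


def read_urls(list_path):
    """Return the non-empty lines of a UTF-8 text file, one URL per line."""
    with open(list_path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_urlset(urls):
    """Return the XML text of one <urlset> document."""
    lines = [XML_HEADER, f'<urlset xmlns="{SITEMAP_NS}">']
    lines += [f"  <url><loc>{escape(url, QUOTES)}</loc></url>" for url in urls]
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"


def build_index(sitemap_urls):
    """Return the XML text of a <sitemapindex> listing the given sitemap URLs."""
    lines = [XML_HEADER, f'<sitemapindex xmlns="{SITEMAP_NS}">']
    lines += [f"  <sitemap><loc>{escape(url, QUOTES)}</loc></sitemap>"
              for url in sitemap_urls]
    lines.append("</sitemapindex>")
    return "\n".join(lines) + "\n"


def write_sitemaps(urls, out_dir, base_url, max_urls=MAX_URLS_PER_FILE):
    """Write sitemapN.xml.gz files and sitemap_index.xml; return the .gz names."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    names = []
    for number, chunk in enumerate(chunked(urls, max_urls), start=1):
        name = f"sitemap{number}.xml.gz"
        with gzip.open(out_dir / name, "wt", encoding="utf-8") as handle:
            handle.write(build_urlset(chunk))
        names.append(name)
    # The index must point at the public URL of each file, not the local path.
    index = build_index(f"{base_url.rstrip('/')}/{name}" for name in names)
    (out_dir / "sitemap_index.xml").write_text(index, encoding="utf-8")
    return names


# Example call (paths and domain are placeholders):
# write_sitemaps(read_urls("urls.txt"), "sitemaps", "https://www.example.com")
```

With a list of about 600,000 URLs this would produce twelve .xml.gz files and one index file, which is the file you would then submit.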

2. Is there any need to do this in this day and age? Sitemaps were deemed to be essential for 'new sites' like 20 years ago but has that need now gone because SEs are more advanced in crawling?

In my opinion, based on my experience, no. I have several sites, two of which have millions or even tens of millions of pages. In the past I had sitemaps, but they were annoying to maintain and never seemed to make any difference. But those sites have a simple and relatively flat structure where all the pages can easily be found through the links on the parent pages. If you have a site with a more complex structure, even a smaller site, then yes, sitemaps can be a benefit.

not2easy

1:24 pm on Feb 2, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Sitemaps are not required. Google does not require them, but they say a sitemap is more useful for large sites than for small (under 500 pages) sites. Google says the reason to have a sitemap is that
A sitemap tells Google which pages and files you think are important in your site
and helps them find all the pages you would like to have crawled and indexed - but a sitemap does not guarantee they will be crawled or indexed.

It does not need to be in XML format. If you cannot create XML sitemaps, Google is fine with .php sitemaps or even .txt format files. Their most recent information on building sitemaps [developers.google.com] might help with the how-to.
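For the .txt route, the file is just one full URL per line, saved as UTF-8, with the same 50,000 URL ceiling per file. A minimal sketch, assuming you already have a clean list of URLs (the function name write_text_sitemap and the paths are made up for illustration):

```python
def write_text_sitemap(urls, out_path, max_urls=50_000):
    """Write one URL per line as UTF-8 and return how many URLs were written."""
    cleaned = [url.strip() for url in urls if url.strip()]
    if len(cleaned) > max_urls:
        raise ValueError(f"{len(cleaned)} URLs exceed the {max_urls} per-file limit")
    with open(out_path, "w", encoding="utf-8", newline="\n") as handle:
        for url in cleaned:
            handle.write(url + "\n")
    return len(cleaned)


# Example call (placeholder values):
# write_text_sitemap(["https://www.example.com/", "https://www.example.com/about"], "sitemap.txt")
```

A longer list would need to be split across several text files, the same way the XML files are split.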