I read up on this. It sounds very interesting and promising. It is still in beta, and they do not guarantee that providing a file (or files) listing your URLs will get you indexed faster, but I have a feeling this could be big, especially in the future.
One thing I wonder is how documents newly discovered this way will be treated by the part of their indexing algorithm that looks at page count growth and link growth.
|Webmasters attempting to install and execute sitemap_gen.py should have knowledge of uploading files to their webserver, connecting to their webserver, and running scripts. In addition, Python version 2.2 must be installed... |
99% of the people who are involved with publishing info on the net will have no idea how to use these tools. Do you have Python on your server? I don't even know myself.
This means that the same relatively tiny handful of experts will continue to abuse and spam Google at the expense of legitimate contributors who have already found themselves lost in a sea of web sites that have been highly optimized for AdSense or whatever, but otherwise serve no real informational purpose.
The first publishers to sign up for this will be the scraper sites.
|The first publishers to sign up for this will be the scraper sites. |
I certainly don't run a scraper site, and I will be ready in 2 hours by building this into my CMS.
I will not use the tool, because the tool knows less than my CMS.
The tool would set the change date for any change in a page, including changes in the navigation links.
My CMS sets the change date according to changes in title, description or content only.
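The rule described here (bump the change date only when the title, description or content changes, and ignore navigation changes) could be sketched roughly like this in Python. This is only an illustration, not the poster's actual CMS code: the function names, the dict layout and the use of a SHA-256 fingerprint are all assumptions made for the example.

```python
import hashlib
from datetime import date


def content_fingerprint(title, description, content):
    """Hash only the fields that should count as a 'real' change."""
    h = hashlib.sha256()
    for part in (title, description, content):
        h.update(part.encode("utf-8"))
        h.update(b"\x00")  # separator so field boundaries stay unambiguous
    return h.hexdigest()


def updated_lastmod(page, stored, today=None):
    """Return the record to store for this page.

    page   -- hypothetical dict with 'title', 'description', 'content'
              (any other keys, e.g. navigation, are ignored)
    stored -- previous record {'fingerprint': ..., 'lastmod': ...} or None
    today  -- date to stamp on a change; defaults to the current date
    """
    fp = content_fingerprint(page["title"], page["description"], page["content"])
    if stored is not None and stored["fingerprint"] == fp:
        return stored  # nothing relevant changed, keep the old date
    return {"fingerprint": fp, "lastmod": (today or date.today()).isoformat()}
```

With something like this, editing a navigation link leaves the stored date untouched, while editing the body text moves it to the day of the change.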
Well, so far so good. I run a Perl sitemap script and just added it to the beta; Google liked it and says it could take a few hours to update.
Most Linux servers ship with Python, and if they are up to date they will have version 2.2 or later.
The script I use is simple to install. You can set folders and files you don't want indexed, and it uses a template to display the title (hyperlinked), last-updated time/date and file size.
Looks like G likes it; time will tell.
What program/script are you guys using for this?
yeah, very nice - but somewhat complicated.
You need to have Python installed on your webserver in order to generate this sitemap code.
You don't have to use their sitemap generator. If you know a little about XML, you should be able to generate that sitemap file in any programming language you like.
Or you can even use a plain text file: just list your URLs there, one per line.
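To illustrate the point that the generator is optional, here is a minimal sketch in Python that writes both variants from a list of URLs, using only the standard library. Treat the details as assumptions: the function names are made up for the example, and the namespace shown is the one from the present-day sitemaps.org protocol, which may differ from the schema URL the beta expects, so check Google's documentation before submitting.

```python
from xml.etree import ElementTree as ET

# Assumption: namespace of the current sitemaps.org protocol; the beta's
# own schema URL may be different.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_xml(entries):
    """Build an XML sitemap string.

    entries -- iterable of (url, lastmod) pairs, where lastmod is a
               'YYYY-MM-DD' string or None to leave the element out.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc  # '&' etc. are escaped on output
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def build_sitemap_text(urls):
    """Plain text variant: one URL per line."""
    return "\n".join(urls) + "\n"


if __name__ == "__main__":
    pages = [
        ("http://www.example.com/", "2005-06-07"),
        ("http://www.example.com/about.html", None),
    ]
    print(build_sitemap_xml(pages))
    print(build_sitemap_text(url for url, _ in pages))
```

The same few lines translate easily to Perl, PHP or whatever your CMS is written in; the only fiddly part is escaping characters such as `&` in URLs, which the XML library handles here.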
beggers, you are totally wrong. You have missed the whole point behind this move by Google.
|Eventually we hope this will be supported natively in webservers (e.g. Apache, Lotus Notes, IIS). But to get you started, we offer Sitemap Generator, an open source client in Python to compute sitemaps for a few common use cases. |
Google's spiders crawl web pages anyway; this method is just a way to give those spiders hints, helping them do their job faster while using fewer of Google's resources. This will make it easier for G to index pages, which will lead to it indexing more pages in less time. Such methods are only guides for the spider, and G ultimately decides whether or not to take such webmaster hints into consideration for each web site. So it's only a method to speed the spiders up and use fewer of their resources, nothing else. But of course G won't be 'selling' you the idea that way; it has to tell you what's in it for you, so it entices you by telling you that more of your pages will be indexed, and in a more timely manner, and it tells surfers they'll get fresher search results with wider reach.
I've always thought about this problem. Someone publishes a page, then I look for it in a search engine minutes later and of course I don't find it (unless it's on a news site); it takes time until it's indexed and appears in search engine results. I've often mused about the need for search engines to index pages almost instantly. I did not think it was really technically possible yet, but I thought it would be a good idea and a real need to be satisfied one day. And that day seems to be coming before long. (8 months from now, 1.5 years from now ...?)
If you have paid attention to the quote above from Google, you'll understand that this is only the start of an initiative; it is not the 'real' thing yet. Who said this Python Sitemap Generator or any other script is the way this will be done? No, it's only the initial phase for the more savvy webmasters. In the future, as the quote suggests, this is hoped to be a standard part of web servers and how things are done on the web, which means it will be EASY for anyone publishing a web site to benefit from such technology. I'm envisioning the day this will be a standard part of the Apache server (which I think will not take long) and then Microsoft coming running to make it part of its IIS web server :)
I have an interesting tidbit for you all!
A blog I run about how to maximize adsense earnings just came out of the sandbox today.
Coincidence that I added it to Google SiteMaps yesterday? I think not.
I use a certain free site search on my site and whenever I add content and/or new pages, the service generates a "what's new" page which gets indexed faster than the pages themselves and usually high up in the SERPs. Also, this service has always provided a site map which is accessible from any page on the site which provides a link (simple html code).
This service has Google beaten by a country mile and it won't allow spammers as Google will.
Anyone have another XML sitemap creator other than Google's (one that doesn't rely on Python)?
Having a sitemap on every website is a good thing. From an SEO point of view, you can say that at least there is one page on your website with links to all the other pages/sections.
Google's indexing technique is not open and we can only guess about it. In my experience, Googlebot pays a visit to your site within a week of you submitting your URL to them; after that they rank your site at a certain level, and index 10% to 80% more pages on each subsequent visit, and so on.
They never index your site right away and they never update on any regular schedule. If you have your site listed on some high-PR website, then they may index your pages more frequently.
The story becomes more complex when you have a dynamic website.
[edited by: Woz at 10:40 am (utc) on June 7, 2005]
[edit reason] No URLs or Sigs please, see TOS#13 [/edit]
I am not allowed to run the script on my server because of security reasons.
Are there other programs/scripts that Google will like?
|A blog I run about how to maximize adsense earnings just came out of the sandbox today. |
Coincidence that I added it to Google SiteMaps yesterday? I think not.
I have a feeling this must be due to the recent update rather than anything else.
That said, I'd be interested to hear if anyone has a similar case. ;)
> The first publishers to sign up for this will be the scraper sites.
Agreed. Google has launched a scraper and spammer trap. Asking them to submit machine readable URL listings seems to be a pretty smart solution to automatically filter crap out of the index. Seriously, Google engineers are smart enough to develop such a service taking abuse into account.
>Anyone have another XML sitemap creator other than Google's (one that doesn't rely on Python)?
>Are there other programs/scripts that Google will like?
Here you go: [groups-beta.google.com...]
> The story becomes more complex when you have a dynamic website.
Exactly, large dynamic web sites will get the most value out of the Google SiteMap service. Here is a short tutorial explaining how database driven sites can fully automate the sitemap channel:
[edited by: Jenstar at 10:39 am (utc) on June 7, 2005]
[edit reason] No promotional URLs please, as per TOS [/edit]
Did anyone read the rules and regulations for this?
The Google Services are made available for your personal, non-commercial use only. You may not use the Google Services to sell a product or service, or to increase traffic to your Web site for commercial reasons, such as advertising sales.
It looks to me like if you're an AdSense publisher and do this, it could mean a ban!
I think I'll stay away from this for now.
Any thoughts on this yet? I know that it is in fine print but sometimes you gotta read these things.
Just put up a regular site map and Google will eventually spider your pages. They are just implementing this to lighten their system workload.