Site Status

URL collectors:
- Webserver filter: Running (Number of collected URLs: 0)
- Log parser: Disabled (Time of most recent log process: 2011-02-12 15:01:15; Number of collected URLs: 0)
- File scanner: Disabled

Sitemap creators (Status / URLs in Sitemap):
- Web: Running / 65011 (Time of most recent Sitemap file generation: 2011-02-13 00:00:19; Number of URLs in most recent generated Sitemap file: 65011)
- Mobile: Disabled / 0 (Time of most recent Sitemap file generation: N/A; Number of URLs in most recent generated Sitemap file: 0)
- Code Search: Disabled / 0 (Time of most recent Sitemap file generation: N/A; Number of URLs in most recent generated Sitemap file: 0)
- Blog Search: Disabled / --
What's more, I have erased the content of the "Pathname for log file(s)" field.
Some query-string URLs have been added. Although query strings are dynamic, their data depends on admin updates, so the number of pages is fixed. Yet "URLs in Sitemap" grows with each map update: the generator parses the access log and adds complex query strings created by a third-party site linking to ours.
Maybe I am misunderstanding some concepts; I also don't understand
Msg#: 4266301 posted 7:00 pm on Feb 14, 2011 (gmt 0)
Consider whether your webserver may be returning a 200 OK response for invalid requests. Do other sitemappers (e.g. try A1 Sitemap Generator) also include odd-looking URLs? If so, chances are it is your server that is odd (e.g. responding to erroneous requests with a "200 OK" HTTP response).
Msg#: 4266301 posted 9:16 pm on Feb 14, 2011 (gmt 0)
Yes, all requests get 200 OK responses. But that is not the problem. The program parses log data even though I did not select that option and removed the log file location. For now I am generating the sitemap myself with PHP from MySQL data. The only problem I am running into is script execution time, even after splitting the data loaded from MySQL. I need to extend the maximum execution time for a few scripts that I would run only once a day, off-peak, but I can only find global solutions. set_time_limit() is not changing anything. Maybe I should post this in another forum. Thank you.
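For what it's worth, a sketch of a per-script fix (script name is hypothetical; this assumes you control the entry point of your daily job). One known gotcha: set_time_limit() and ini_set('max_execution_time', ...) are silently ignored when PHP runs in safe mode, which could explain why set_time_limit() appears to change nothing.

```php
<?php
// build_sitemap.php (hypothetical name for the daily off-peak job)
// Lift the execution-time limit for this script only, instead of
// raising max_execution_time globally in php.ini.
// NOTE: both calls below are no-ops when PHP safe mode is enabled.

set_time_limit(0);               // 0 = no time limit, this script only
ini_set('memory_limit', '256M'); // long-running jobs often hit memory too

// ... load URL batches from MySQL and write the sitemap file here ...
```

Alternatively, running the job from the command line (e.g. via cron: `php build_sitemap.php`) sidesteps the issue entirely, because the CLI version of PHP defaults max_execution_time to 0. Also note that the web server itself (Apache's Timeout, or a FastCGI timeout) can kill long requests regardless of PHP's settings, which is another reason to move a once-a-day job to cron.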
Msg#: 4266301 posted 12:55 am on Feb 20, 2011 (gmt 0)
The only problem with sitemap generators that work from log files is that the result is typically incomplete. They only include pages that have been visited on your site, and pages that are not visited don't get put in the sitemap. And those unvisited pages are exactly the ones you want in the sitemap!
Msg#: 4266301 posted 2:41 pm on Mar 12, 2011 (gmt 0)
I wouldn't trust sitemap generators that scrape log files either. Nowadays you can find good free online generators that crawl websites and build sitemap files directly from the crawl, free at least for a few thousand pages.
Msg#: 4266301 posted 12:39 pm on Mar 16, 2011 (gmt 0)
By now I've built my own XML sitemap generator with thousands of URLs. Google Webmaster Tools' sitemap behavior is very strange. I cannot get all 37 of my highest-priority URLs added to the web index; only 15 are in. One month and still waiting. The site is ranked 5/10.
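For anyone building their own generator: per the sitemaps.org protocol, each URL entry is a simple XML block, and priority is only a hint between 0.0 and 1.0 (default 0.5) relative to your own pages; search engines make no promise to index every high-priority URL. A minimal valid file (example.com URL and date are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/page.html</loc>
    <lastmod>2011-03-16</lastmod>
    <priority>1.0</priority>
  </url>
</urlset>
```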
Msg#: 4266301 posted 8:34 pm on Mar 17, 2011 (gmt 0)
A1 Sitemap Generator can handle 100k+ URLs (but it's a tool you download).
About indexing, just give it time. The numbers reported in Google Webmaster Tools give you the count of URLs in your sitemap confirmed to be in the search engine index (easy to verify, at least in my cases). Can you confirm you really only have 15 indexed if you search in Google using complete page titles?