
Forum Moderators: goodroi


About Google Sitemap Generator

Don't want to parse log files

     

marciano

2:30 am on Feb 13, 2011 (gmt 0)

10+ Year Member



Hello,

I have installed this app from
[code.google.com...]
on a Linux server.

Reading the messy XML output, I found that the access log has been parsed even though I only checked Webserver filter under

URL COLLECTORS
Webserver filter checked
File scanner unchecked
Log parser unchecked


Site Status

URL collectors      Status
Webserver filter    Running
    Number of collected URLs: 0
Log parser          Disabled
    Time of most recent log process: 2011-02-12 15:01:15
    Number of collected URLs: 0
File scanner        Disabled

Sitemap creators    Status      URLs in Sitemap
Web                 Running     65011
    Time of most recent Sitemap file generation: 2011-02-13 00:00:19
    Number of URLs in most recent generated Sitemap file: 65011
Mobile              Disabled    0
    Time of most recent Sitemap file generation: N/A
    Number of URLs in most recent generated Sitemap file: 0
Code Search         Disabled    0
    Time of most recent Sitemap file generation: N/A
    Number of URLs in most recent generated Sitemap file: 0
Blog Search         Disabled    --


I have even cleared the "Pathname for log file(s)" field.

Some query fields have been added. Although the query strings are dynamic, their data depends on admin updates, so the number of pages is fixed.
But "URLs in Sitemap" grows with each sitemap update: the generator does parse the access log and adds complex query strings created by third-party sites linking to ours.

Maybe I am misunderstanding some concepts. I also don't understand
Number of collected URLs 0

Thanks for any help
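
One way to keep third-party query strings out of a sitemap, whichever collector picked them up, is to filter the URL list against the parameters the site itself generates. The following is only a sketch in Python: the parameter names in ALLOWED_PARAMS and the example.com URLs are assumptions for illustration, not anything taken from the generator or from the site discussed here.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical whitelist: the query parameters the site itself generates.
ALLOWED_PARAMS = {"id", "page"}

def is_own_url(url, allowed_params=ALLOWED_PARAMS):
    """Return True if every query parameter in url is on the whitelist."""
    query = urlsplit(url).query
    params = parse_qsl(query, keep_blank_values=True)
    return all(name in allowed_params for name, _ in params)

def filter_sitemap_urls(urls, allowed_params=ALLOWED_PARAMS):
    """Keep only URLs whose query strings use whitelisted parameters."""
    return [u for u in urls if is_own_url(u, allowed_params)]
```

URLs without any query string pass the filter unchanged, so static pages are never dropped.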

spiritparse

7:00 pm on Feb 14, 2011 (gmt 0)

5+ Year Member



Consider whether your webserver may be returning a 200 OK response for invalid requests. Do other sitemappers (e.g. try A1 Sitemap Generator) also include odd-looking URLs? If so, chances are it's your server that is odd (e.g. responding to erroneous requests with an "All ok - 200" HTTP response).
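
A quick way to test for this is to request a path that cannot exist and look at the status code. A minimal Python sketch follows; the function names and the random "no-such-page-" path are made up for illustration, and since urlopen follows redirects, a server that redirects bad URLs to a normal page will also report 200 here.

```python
import urllib.request
import urllib.error
import uuid

def http_status(url):
    """Return the HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as HTTPError; the code is on the exception.
        return err.code

def answers_200_for_garbage(base_url, fetch_status=http_status):
    """Request a random path that should not exist and report whether the server says 200."""
    bogus = base_url.rstrip("/") + "/no-such-page-" + uuid.uuid4().hex
    return fetch_status(bogus) == 200
```

If answers_200_for_garbage("http://www.example.com") comes back True for your own domain, any crawler or log-based collector will treat junk URLs as valid pages.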

marciano

9:16 pm on Feb 14, 2011 (gmt 0)

10+ Year Member



Yes, all request responses are OK.
But this is not the problem. The program parses log data even though I did not select that option and removed the log file location.
For now I am generating the sitemap myself with PHP from MySQL data.
The only problem I am finding is script execution time, even though I split the data loaded from MySQL.
I need to extend the max execution time for some of these scripts, which I would only run once a day at off-peak times, and I am only seeing global solutions. set_time_limit() is not changing anything.
Maybe I should post this in another forum.
Thank you
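
Whatever the execution limit turns out to be, building the sitemap in batches keeps each run small and also respects the sitemaps.org limit of 50,000 URLs per file. The sketch below is in Python rather than PHP and is not the script described above; it assumes the URLs arrive as any iterable (for example, rows streamed from a database cursor) and the function names are invented for illustration.

```python
from itertools import islice
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50000  # per-file limit set by the sitemaps.org protocol

def batches(iterable, size):
    """Yield lists of at most `size` items without loading everything at once."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch

def sitemap_documents(urls, max_urls=MAX_URLS_PER_FILE):
    """Yield one sitemap XML string per batch of URLs."""
    for batch in batches(urls, max_urls):
        lines = ['<?xml version="1.0" encoding="UTF-8"?>',
                 '<urlset xmlns="%s">' % SITEMAP_NS]
        for url in batch:
            # escape() turns & into &amp;, which query strings need inside XML.
            lines.append("  <url><loc>%s</loc></url>" % escape(url))
        lines.append("</urlset>")
        yield "\n".join(lines)
```

Each yielded string can be written to its own file (sitemap1.xml, sitemap2.xml, and so on), so a list of about 65,000 URLs would end up in two files.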

maximillianos

12:55 am on Feb 20, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The only problem with sitemap generators that work from log files is that their output is typically not complete. They only include pages that are visited on your site, and the pages that are not visited don't get put in the sitemap. And these unvisited pages are the ones you want in the sitemap!
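
To see how large that gap is on a given site, compare the full page list with the paths that appear in the access log. A tiny Python sketch, assuming both lists are already available as plain path strings (the function name is hypothetical):

```python
def unvisited_pages(all_pages, visited_in_log):
    """Return pages the site knows about that never appear in the access log."""
    return sorted(set(all_pages) - set(visited_in_log))
```

Anything this returns is a page a log-only generator would leave out of the sitemap.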

bulker

2:41 pm on Mar 12, 2011 (gmt 0)



I wouldn't trust sitemap generators that scrape log files either. Nowadays you can find good free online generators that crawl a website and build the sitemap files directly from it, free at least for a few thousand pages.

ohno

8:47 am on Mar 16, 2011 (gmt 0)



Can someone point me to a good online generator with a 2K limit? Most seem to be limited to 500 pages. Thanks

bulker

11:44 am on Mar 16, 2011 (gmt 0)




Can someone point me to a good online generator with a 2K limit?


<snip>

[edited by: goodroi at 11:00 am (utc) on Mar 17, 2011]
[edit reason] Please no product mentions [/edit]

marciano

12:39 pm on Mar 16, 2011 (gmt 0)

10+ Year Member



I have now built my own XML sitemap generator, with thousands of URLs.
The sitemaps behavior in Google Webmaster Tools is very strange.
I cannot get all 37 of my highest-priority URLs added to the web index. Only 15 are in. One month and still waiting. Site ranked 5/10.

spiritparse

8:34 pm on Mar 17, 2011 (gmt 0)

5+ Year Member



A1SG can handle 100k+ URLs (but it's a tool you download)

About indexing, just give it time. The numbers reported in Google Webmaster Tools give you the number of URLs in the sitemap confirmed to be in the search engine index. (Easy to prove, at least in my cases.) Can you confirm you really only have 15 indexed if you search Google using complete page titles?

ohno

7:48 am on Mar 18, 2011 (gmt 0)



Is it advisable to upload a sitemap in only one format, i.e. just XML rather than XML and the GZIP version too? We have a site that has both versions, and at times they show different numbers of pages indexed.
 
