
Sitemaps, Meta Data, and robots.txt Forum

    
About Google Sitemap Generator
Don't want to parse log files
marciano

10+ Year Member



 
Msg#: 4266301 posted 2:30 am on Feb 13, 2011 (gmt 0)

Hello,

I have installed this app from
[code.google.com...]
on a Linux server.

Reading the messy XML output, I found that the access log has been parsed even though I only checked "Webserver filter" under

URL COLLECTORS:
  Webserver filter - checked
  File scanner - unchecked
  Log parser - unchecked


Site Status

URL collectors (status):
  Webserver filter - Running
    Number of collected URLs: 0
  Log parser - Disabled
    Time of most recent log process: 2011-02-12 15:01:15
    Number of collected URLs: 0
  File scanner - Disabled

Sitemap creators (status / URLs in Sitemap):
  Web - Running / 65011
    Time of most recent Sitemap file generation: 2011-02-13 00:00:19
    Number of URLs in most recent generated Sitemap file: 65011
  Mobile - Disabled / 0
    Time of most recent Sitemap file generation: N/A
    Number of URLs in most recent generated Sitemap file: 0
  Code Search - Disabled / 0
    Time of most recent Sitemap file generation: N/A
    Number of URLs in most recent generated Sitemap file: 0
  Blog Search - Disabled / --


What's more, I have even erased the content of the "Pathname for log file(s)" field.

Some query-string URLs have been added to the sitemap. Even though the query strings are dynamic, their data only changes with admin updates, so the number of pages should be fixed.
But "URLs in Sitemap" grows on each map update: the generator evidently does parse the access log and adds the complex query strings that a third-party site linking to ours has created.

Maybe I am misunderstanding some concepts; I also don't understand this line:
Number of collected URLs: 0

Thanks for any help

 

spiritparse

5+ Year Member



 
Msg#: 4266301 posted 7:00 pm on Feb 14, 2011 (gmt 0)

Consider whether your webserver may be returning a 200 OK response for invalid requests. Do other sitemappers (e.g. try A1 Sitemap Generator) also include odd-looking URLs? If so, chances are it's your server that is at fault (e.g. answering erroneous requests with an "All OK - 200" HTTP response).
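One quick way to check, if you have PHP handy: request a URL that definitely should not exist and look at the status code it comes back with. A minimal sketch (the domain and path are placeholders; swap in your own site and one of the odd URLs from the sitemap):

<?php
// Request a URL that should not exist and report the HTTP status code.
// example.com and the bogus path are placeholders.
$url = 'http://www.example.com/this-page-should-not-exist-12345';

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD request, no body needed
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // don't echo the response
curl_exec($ch);
$status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

// A healthy server should answer 404 here. A 200 means every request
// "succeeds", so any sitemapper that checks responses will keep the URL.
echo "$url -> HTTP $status\n";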

marciano

10+ Year Member



 
Msg#: 4266301 posted 9:16 pm on Feb 14, 2011 (gmt 0)

Yes, all requests get an OK response.
But that is not the problem. The program does parse the log data even though I did not select that option and even removed the log file location.
For now I am generating the sitemap myself with PHP from MySQL data.
The only problem I am running into is script execution time, even though I split the data loaded from MySQL.
I need to extend the max execution time for a few of these scripts, which would run only once a day at off-peak time, but I can only find global solutions. set_time_limit() is not changing anything.
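For illustration, a minimal sketch of that kind of script, assuming a pages table with id and url columns (placeholder names, not a real schema). Run from the PHP CLI via cron, max_execution_time defaults to 0, so the web server's time limit never applies:

<?php
// Build sitemap.xml from a list of page URLs stored in MySQL.
// Intended to run once a day from cron via the PHP CLI, where
// max_execution_time defaults to 0 (unlimited). set_time_limit(0)
// is only a fallback for web invocation and is ignored in safe mode.
set_time_limit(0);

// Placeholder credentials and schema: a `pages` table with `id` and `url`.
$db = new mysqli('localhost', 'user', 'password', 'mydb');

$out = fopen('/var/www/html/sitemap.xml', 'w');
fwrite($out, '<?xml version="1.0" encoding="UTF-8"?>' . "\n");
fwrite($out, '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n");

// Fetch in chunks so memory stays flat even with tens of thousands of rows.
$offset = 0;
$chunk  = 5000;
do {
    $res  = $db->query("SELECT url FROM pages ORDER BY id LIMIT $offset, $chunk");
    $rows = $res->num_rows;
    while ($row = $res->fetch_assoc()) {
        $loc = htmlspecialchars($row['url'], ENT_QUOTES);
        fwrite($out, "  <url><loc>$loc</loc></url>\n");
    }
    $offset += $chunk;
} while ($rows === $chunk);

fwrite($out, "</urlset>\n");
fclose($out);

Running it as php /path/to/build_sitemap.php from a nightly cron job sidesteps the timeout entirely; note that set_time_limit() itself has no effect when PHP runs in safe mode, which may be why the call changes nothing.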
Maybe I should post this in another forum.
Thank you

maximillianos

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4266301 posted 12:55 am on Feb 20, 2011 (gmt 0)

The only problem with these sitemap generators that work from log files is that they are typically not complete. They only include the pages that get visited on your site, and the pages that are not visited don't get put into the sitemap. And those unvisited pages are exactly the ones you want in the sitemap!
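One way to quantify that gap, assuming you can export a full list of page URLs from your CMS or database (file paths below are placeholders): parse the generated sitemap and diff it against the full list.

<?php
// Compare a generated sitemap against a full list of known page URLs
// to see which pages a log-based generator missed. Paths are placeholders.
$dom = new DOMDocument();
$dom->load('/var/www/html/sitemap.xml');

$inSitemap = array();
foreach ($dom->getElementsByTagName('loc') as $node) {
    $inSitemap[trim($node->textContent)] = true;
}

// all_pages.txt: one canonical URL per line, e.g. exported from the CMS.
$allPages = file('all_pages.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

$missing = array();
foreach ($allPages as $url) {
    if (!isset($inSitemap[$url])) {
        $missing[] = $url;
    }
}

echo count($missing) . " known pages are missing from the sitemap:\n";
echo implode("\n", $missing) . "\n";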

bulker



 
Msg#: 4266301 posted 2:41 pm on Mar 12, 2011 (gmt 0)

I wouldn't trust sitemap generators scraping log files either. Nowadays you can find good free online generators that crawl a website and build the sitemap file directly from the crawl, free at least for a few thousand pages.

ohno



 
Msg#: 4266301 posted 8:47 am on Mar 16, 2011 (gmt 0)

Can someone point me to a good online generator with a 2K limit? Most seem to be limited to 500 pages. Thanks

bulker



 
Msg#: 4266301 posted 11:44 am on Mar 16, 2011 (gmt 0)


Can someone point me to a good online generator with a 2K limit?


<snip>

[edited by: goodroi at 11:00 am (utc) on Mar 17, 2011]
[edit reason] Please no product mentions [/edit]

marciano

10+ Year Member



 
Msg#: 4266301 posted 12:39 pm on Mar 16, 2011 (gmt 0)

By now I've built my own XML sitemap generator handling thousands of URLs.
Google Webmaster Tools' sitemap behavior is very strange.
I cannot get all 37 of my highest-priority URLs added to the web index, only 15 are in. It has been one month and I am still waiting. The site is ranked 5/10.

spiritparse

5+ Year Member



 
Msg#: 4266301 posted 8:34 pm on Mar 17, 2011 (gmt 0)

A1SG can handle 100k+ URLs (but it's a tool you download)

About indexing, just give it time. The numbers reported in Google Webmaster Tools give you the number of URLs in the sitemap confirmed to be in the search engine's index (easy to verify, at least in my cases). Can you confirm you really only have 15 indexed if you search Google using the complete page titles?

ohno



 
Msg#: 4266301 posted 7:48 am on Mar 18, 2011 (gmt 0)

Is it advisable to submit a sitemap in just one format, i.e. only the XML and not the gzipped version as well? We have a site that has both versions submitted, and at times they show different numbers of pages indexed.
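If you do keep both, the safest habit is to generate the .gz directly from the same XML file every time, so the two submissions can never drift apart. A minimal sketch (paths are placeholders):

<?php
// Write sitemap.xml.gz as a gzip copy of the exact same sitemap.xml,
// so both submitted versions always contain identical URLs.
// Paths are placeholders.
$xmlPath = '/var/www/html/sitemap.xml';
$gzPath  = '/var/www/html/sitemap.xml.gz';

$xml = file_get_contents($xmlPath);
// Level 9 = maximum compression; sitemap XML compresses very well.
file_put_contents($gzPath, gzencode($xml, 9));

echo "Wrote " . filesize($gzPath) . " bytes to $gzPath\n";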
