How to tell which pages google spidered and when?

     
2:19 am on Oct 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I use cPanel as my hosting control panel. Is there something in it, or in some other stats program, that can tell me which pages Google's spider has crawled and when? What's its IP?

Is there a way of knowing what referred the bot, i.e. which link it crawled in from?

I had multiple links from a PR7 site that gets updated daily, and my SERPs went up the next day. Two days later the links are still there, but I've disappeared. I'm just wondering what the cause could be, since nothing changed on the PR7 site. Can I tell somehow whether Google crawled that link?

Also, my site is a PR4, but Google indexes the home page and a few others every day. My content changes only slightly, once every 7-10 days. Is that normal?

6:15 am on Oct 23, 2003 (gmt 0)

WebmasterWorld Senior Member marcia is a WebmasterWorld Top Contributor of All Time 10+ Year Member



It's all normal. It doesn't matter whether it's cPanel; what matters is whether your host makes the raw logs available to you, or what kind of stats program you've got. That's where you'd see it.

9:17 pm on Oct 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



nuevojefe, if you can get your raw logs: Google's crawler requests robots.txt regularly (if you don't have one, the request may show up in your error log instead).

It also almost always identifies itself with a user-agent string (not a referrer) like

"Googlebot/2.1 (+http://www.googlebot.com/bot.html)"

or similar. Once you have found a few of these entries, it's easy to isolate what it has requested.
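For anyone who wants to pull those entries out of a raw log automatically, here is a minimal Python sketch. It assumes the host writes Apache's "combined" log format (check yours, it may differ), and the sample lines, IP addresses and the googlebot_hits name are made up for illustration. It also prints the referrer field, though crawlers typically send none, so expect to see "-" there.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


class Hit(NamedTuple):
    time: str       # timestamp exactly as written in the log
    path: str       # requested URL path
    status: int     # HTTP status code returned
    referrer: str   # "-" when the client sent none
    host: str       # IP address the request came from


def googlebot_hits(lines: Iterable[str]) -> Iterator[Hit]:
    """Yield one Hit per log line whose user-agent mentions Googlebot."""
    for line in lines:
        m = LOG_LINE.match(line)
        if m and "googlebot" in m.group("agent").lower():
            yield Hit(m.group("time"), m.group("path"), int(m.group("status")),
                      m.group("referrer"), m.group("host"))


# Made-up sample entries; the IPs are documentation addresses, not Google's.
SAMPLE = [
    '192.0.2.10 - - [23/Oct/2003:02:19:00 +0000] "GET /robots.txt HTTP/1.0" '
    '404 209 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '192.0.2.10 - - [23/Oct/2003:02:19:05 +0000] "GET /index.html HTTP/1.0" '
    '200 5120 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '198.51.100.7 - - [23/Oct/2003:02:20:00 +0000] "GET /index.html HTTP/1.1" '
    '200 5120 "http://example.com/" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

if __name__ == "__main__":
    for hit in googlebot_hits(SAMPLE):
        print(hit.time, hit.host, hit.status, hit.path, hit.referrer)
```

To run it against a real log, pass an open file instead of SAMPLE, e.g. googlebot_hits(open("access_log")), where "access_log" stands for whatever your host names the file.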

3:16 pm on Oct 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks for the pointers. I think my raw logging was turned off; I changed it yesterday to auto-archive, so I'll check in a minute whether the log is still empty.

Thanks!

8:13 pm on Oct 24, 2003 (gmt 0)

10+ Year Member



The most helpful thing I found was a custom 404 page that sends an e-mail.
That way I see any stupid mistakes I make and, as I don't have a robots.txt, all the crawlers that come looking for it.
See if your cPanel lets you define a custom error page; if it does, plug in a nice error page that also sends an e-mail (search Google for custom error pages).
M
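
A rough Python sketch of the e-mail part of that idea, not the poster's actual setup: it only builds the notification message that a custom 404 handler could send. The build_404_alert name and both addresses are placeholders, and how the handler is wired into the server depends on the host.

```python
from email.message import EmailMessage


def build_404_alert(path: str, user_agent: str, referrer: str = "-",
                    to_addr: str = "webmaster@example.com") -> EmailMessage:
    """Build (but do not send) an e-mail describing one 404 request."""
    msg = EmailMessage()
    msg["Subject"] = f"404 on {path}"
    msg["From"] = "noreply@example.com"  # placeholder sender address
    msg["To"] = to_addr
    msg.set_content(
        f"Missing page: {path}\n"
        f"User-agent: {user_agent}\n"
        f"Referrer: {referrer}\n"
    )
    return msg


if __name__ == "__main__":
    alert = build_404_alert("/robots.txt", "Googlebot/2.1")
    print(alert)
    # Sending is left out here; with a local mail server it could be
    # smtplib.SMTP("localhost").send_message(alert)
```

Including the user-agent in the message is what lets you spot crawler requests, such as a missing robots.txt, among ordinary broken links.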
4:23 pm on Oct 27, 2003 (gmt 0)

10+ Year Member



I put <!--#echo var="DATE_GMT" --> in the footer of my pages (it's a server-side include, so the server has to parse the page for SSI). When Google caches the page, the cached copy shows the exact date and time that the page was crawled.

8:03 pm on Oct 27, 2003 (gmt 0)

10+ Year Member



Great idea about the date...

Does anyone know how to do the same thing in ASP?

<!--#echo var="DATE_GMT" -->

8:16 pm on Oct 27, 2003 (gmt 0)

10+ Year Member



Never mind my post above. I used:

<% Response.Write FormatDateTime(Time, vbLongTime) %>
<% Response.Write FormatDateTime(Date, vbLongDate) %>

4:04 pm on Oct 29, 2003 (gmt 0)

10+ Year Member



For those who are interested: in PHP files I use the following in the footer area to capture the spider's date and time in the cached copy of the page:

<?php echo date('l dS \of F Y h:i:s A T'); // \o keeps the "o" in "of" literal ?>

On another topic: using the line below as the second line of a PHP file (right after the opening <?php tag and before any output is sent) sets the Last-Modified header to the current time, so the page shows as "recently modified", which may help with fresh listing:

header('Last-Modified: '.gmdate('D, d M Y H:i:s \G\M\T', time()));

 
