| 8:11 am on Apr 16, 2003 (gmt 0)|
Can your rankings in the SERPs change after a deep crawl? I think not, but there was a shift after last month's deep crawl....
| 8:32 am on Apr 16, 2003 (gmt 0)|
Just woke up to find my site was down between 12.30am UTC and 8am UTC. (Still waiting for the excuse from the ISP)
No googlebot on my log files for the past two days....does this mean that I'm out of google next month?
| 9:02 am on Apr 16, 2003 (gmt 0)|
Deep Crawl starting for me in the UK, Woohoo!
crawl2.googlebot.com - - [16/Apr/2003:00:22:57 +0100] "GET /robots.txt HTTP/1.0" 200 24 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
| 9:36 am on Apr 16, 2003 (gmt 0)|
Glad I got those changes in last night now :)
Visited all of my sites, and took the top level pages this morning, just started to see it come back for more :)
| 9:52 am on Apr 16, 2003 (gmt 0)|
Just wondering what stats programs you guys are all using?
I'm using Webalizer and I think it updates at 12am every day.......
Do your stats have real-time reporting so you can see if Google is at your sites?
What do you reckon is the best stats proggy?
| 10:28 am on Apr 16, 2003 (gmt 0)|
Webalizer typically runs at 2am unless your host has set it differently.
| 10:48 am on Apr 16, 2003 (gmt 0)|
For Webalizer, I usually download my log and process it locally. Good for seeing up to date info and also allows you to customize the results (e.g. see the first 500 referrers instead of 50).
| 11:14 am on Apr 16, 2003 (gmt 0)|
I didn't know you could do this.
How would you do it?
| 11:23 am on Apr 16, 2003 (gmt 0)|
My hosting provider provides a link to the log file from the control panel. The log file is not visible from FTP for me so you may be out of luck if your provider doesn't give you a link.
But if you can get the log file, download Webalizer (or the log file analyzer of your choice) from [mrunix.net...]. It is command-line based, so it helps to use a batch file that you can drag the log file onto. The batch file will contain a command like the following:
"c:\program files\webalizer\webalizer.exe" -o "c:\program files\webalizer\stats" -R 1000 %1
This puts the results into a stats subdirectory and gives you the top 1000 referrers.
These instructions are for Windows of course.
| 12:13 pm on Apr 16, 2003 (gmt 0)|
OK....so you can pick on me for being totally naive...but where in the webalizer reports would I look for googlebot?
Thanks and sorry for such a stupid question
| 12:36 pm on Apr 16, 2003 (gmt 0)|
Webalizer may not show you googlebot. Some domain GUIs (graphical user interfaces) provide a link to something like "Last 20 visitors". This is a nice way to monitor current traffic and to find googlebot. If you use CPanel, I can help you even more...just sticky me.
Even better, though, what you need is access to your Raw Access Logs: does your ISP provide these as a download? If you can get your Raw Access Logs then you can purchase an analysis program which will allow you to, for example, view all trips by Googlebot to your website.
Anyway, hopefully you've got one of two things:
1. Access to a "Last 20 Visitors" feature
2. Access to your Raw Access Logs
| 12:48 pm on Apr 16, 2003 (gmt 0)|
How does one recognise the difference between deep and fresh - is this purely by how extensively it grabs pages, or by the DNS name of the crawler?
| 12:50 pm on Apr 16, 2003 (gmt 0)|
Same here. Just checked logs and he showed up after midnight:
Company or ISP : Google Inc. US,CA
IP Range : 188.8.131.52 - 184.108.40.206
Total Visits : 22
| 1:57 pm on Apr 16, 2003 (gmt 0)|
Glad I stayed up late now, googlebot, Fast and scooter are all dancing around my sites.
| 2:01 pm on Apr 16, 2003 (gmt 0)|
:::How does one recognise the difference between deep and fresh
freshBot: 64.68.82.* bah, listed for only a few days, short dinner date then it dumps you.
deepcrawler: 216.239.46.* Good bot, it likes you, listed until death do you depart. So don't make him mad or it will divorce your site. You can divorce the Googlebot by using your robots.txt file. It's much cheaper and faster than going to the courts.
| 2:53 pm on Apr 16, 2003 (gmt 0)|
I use a program called web log expert. It can fetch logs via FTP or HTTP. It can also send the report the same way, plus it can email it to you. It is pretty cool. You can run it manually on your machine or have it scheduled.
| 3:21 pm on Apr 16, 2003 (gmt 0)|
Slightly off-topic, but for a small (only 25 pages) site I've just developed, I've taken a "roll-your-own" approach to log file analysis - basically, on the site's home page, each visit to that page is written to an xml file. I've then got an admin asp page, which reads the xml file, and does an xsl transform to present it in the browser - two of the values in the xml file are User Agent and IP Address.
So, for Googlebot, I see an entry of
Googlebot/2.1 (+http://www.googlebot.com/bot.html) for User Agent and 220.127.116.11 for IP address. At the moment the xsl transform just shows ALL visits, but I hope to modify the transform to allow me to search for Googlebot/FAST/Jeeves etc ....
Just an alternative to relying on a hosting provider for access to the log files, but might be impractical as the number of visits to the page increases ...
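For anyone who wants to try the same idea outside ASP, here is a rough Python sketch. It is a stand-in, not the poster's code: it keeps visits in an XML file and filters them by user agent in Python rather than with an XSL transform. The element names and function names are all invented for illustration.

```python
import os
import xml.etree.ElementTree as ET


def record_visit(path, user_agent, ip):
    """Append one <visit> element to the XML file, creating the file if needed."""
    if os.path.exists(path):
        tree = ET.parse(path)
        root = tree.getroot()
    else:
        # Hypothetical layout: <visits><visit><userAgent/><ip/></visit></visits>
        root = ET.Element("visits")
        tree = ET.ElementTree(root)
    visit = ET.SubElement(root, "visit")
    ET.SubElement(visit, "userAgent").text = user_agent
    ET.SubElement(visit, "ip").text = ip
    tree.write(path, encoding="utf-8", xml_declaration=True)


def find_visits(path, agent_substring):
    """Return (user_agent, ip) pairs whose user agent contains the substring."""
    root = ET.parse(path).getroot()
    wanted = agent_substring.lower()
    matches = []
    for visit in root.findall("visit"):
        agent = visit.findtext("userAgent", default="")
        if wanted in agent.lower():
            matches.append((agent, visit.findtext("ip", default="")))
    return matches
```

Calling `find_visits("visits.xml", "googlebot")` would list only the Googlebot entries. Note that the whole file is re-read and re-written on every visit, which is one reason this approach gets impractical as traffic grows.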
| 5:21 pm on Apr 16, 2003 (gmt 0)|
Mercenary, If you can download the raw log files, you can open them in Wordpad etc and search for 64.68 (freshie) or 216.239 (deepbot). Analog is a good free analyzer, although you have to edit the config file first.
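If you would rather not search by hand, the same check can be sketched in Python. This assumes each log line starts with the client IP address (i.e. your server is not resolving hostnames) and that the 64.68 and 216.239 prefixes quoted in this thread identify the two crawlers. The function name and the sample lines are made up for illustration.

```python
from collections import Counter

# Prefixes quoted in this thread; treat them as assumptions, not an official list.
BOT_PREFIXES = {
    "64.68.": "freshbot",
    "216.239.": "deepbot",
}


def classify_hits(lines):
    """Count log lines per crawler, keyed on the leading IP address."""
    counts = Counter()
    for line in lines:
        fields = line.split(None, 1)
        if not fields:
            continue  # skip blank lines
        ip = fields[0]
        for prefix, name in BOT_PREFIXES.items():
            if ip.startswith(prefix):
                counts[name] += 1
                break
    return counts


if __name__ == "__main__":
    # Invented sample lines, only to show the expected input shape.
    sample = [
        '64.68.82.10 - - [16/Apr/2003:00:22:57 +0100] "GET / HTTP/1.0" 200 512',
        '216.239.46.20 - - [16/Apr/2003:01:10:03 +0100] "GET /a.html HTTP/1.0" 200 900',
        '10.0.0.5 - - [16/Apr/2003:01:11:00 +0100] "GET / HTTP/1.0" 200 512',
    ]
    print(dict(classify_hits(sample)))
```

To run it on a real log, pass an open file instead of the sample list, e.g. `classify_hits(open("access.log"))`.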
| 7:29 pm on Apr 16, 2003 (gmt 0)|
Googlebot is one bot I don't mind eating up all my pages. So far so good, on one of my forum sites it has taken 250 pages so far, brand new site -- last round freshbot picked up like 50 or so of them. I love this new site, so google should too!
| 8:19 pm on Apr 16, 2003 (gmt 0)|
Does this mean that the deepcrawl didn't happen last month? I noticed a lot of pages I added weren't included in the index. Will the next update be much more thorough? Just curious - would like to know more.
| 12:56 am on Apr 17, 2003 (gmt 0)|
The deepcrawl happened last month during the second week (it crawled me on Mar 11-12). Perhaps those pages you added went up after the deepbot came through.
| 2:29 am on Apr 17, 2003 (gmt 0)|
So what is the E.T.A. on the next update? I know some of you guys have documented these lag times. I have a life so I do not have time for that kind of thing. :)
| 3:14 am on Apr 17, 2003 (gmt 0)|
WTH... I'm still seeing freshbot as soon as thirty minutes ago.
| 4:10 am on Apr 17, 2003 (gmt 0)|
yes! From the hits it looks like G has deepcrawled my main site and is currently hanging around my forums... A few thousand pages there to index... You think he will take it all up in one go? This is the first time G has seen this forum ready for indexing so will it be in this index or will it be flagged for the next one?
| 4:13 am on Apr 17, 2003 (gmt 0)|
I think you'll get in the first round. Deepbot is pretty considerate on my forum archives and hangs around for quite a few days instead of hitting the servers too hard all at once.
| 4:25 am on Apr 17, 2003 (gmt 0)|
Inktomi has been hitting my forums pretty hard every fortnight recently. I know it has been hitting several different pages in the forums, but so far only one page has actually been included in the index. So I hope Google gives me a few more results :)
| 9:53 pm on Apr 17, 2003 (gmt 0)|
I think she's bringing her little brother to the work:
is hitting separate pages starting just a couple of hours ago.
Whois query shows that it's from the same ip-range as the 216.239.46.x crawl bot.
| 10:12 pm on Apr 17, 2003 (gmt 0)|
Deepbot came whilst our site was down for maintenance - will that be it for another month or will it re-appear in the next couple of days?
| 10:32 pm on Apr 17, 2003 (gmt 0)|
I have a PR1 site (quite new site, I think PR will go up to 4 after one or two updates), and deepbot has visited my site also.
On the log subject: I use analog for general-purpose log file analysis. It crunches 100 MB log files within seconds (on only a 1 GHz machine). If you have access to a Unix machine, I would use analog.
For searching for googlebot, I wrote a shell-script that does this:
cat /var/log/apache/access.log | grep googlebot > /home/myaccount/google.log
Grep filtered out every line that contains the term googlebot. I then "analyse" the file with vi. When vi opens the file, I immediately see the number of lines, which is the number of hits. Quite low-tech but amazingly effective and easy to implement. You can grep the result again to filter out freshbot or deepbot as you wish.
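For machines without grep (Windows, for example), the same filter can be sketched in Python. This is only an illustration: the function name is invented, the paths are supplied on the command line, and the match is case-insensitive, which is slightly broader than the grep command above.

```python
import sys


def extract_googlebot(log_path, out_path, needle="googlebot"):
    """Copy every line containing `needle` (case-insensitive) to out_path.

    Returns the number of matching lines, i.e. the number of hits.
    """
    hits = 0
    needle = needle.lower()
    with open(log_path, "r", encoding="utf-8", errors="replace") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            if needle in line.lower():
                dst.write(line)
                hits += 1
    return hits


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (script name is a placeholder): python botfilter.py access.log google.log
    print(extract_googlebot(sys.argv[1], sys.argv[2]), "Googlebot hits")
```

The returned count replaces the "open it in vi and read the line count" step; running the function again on the output with a different `needle` narrows it down further, as with a second grep.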
| 10:59 pm on Apr 17, 2003 (gmt 0)|
|So what is the E.T.A. on the next update? I know some of you guys have documented these lag times. I have a life so I do not have time for that kind of thing. :) |
It gets a bit crazy here around update time, clarksc3. We've just been through it, things seem good, most people are happy...
Let's not even talk about the next update. :-)
|Deepbot came whilst our site was down for maintenance - will that be it for another month or will it re-appear in the next couple of days? |
RankOutsider, if your site was previously in the index then usually the deepbot will look for it again during the crawl. Keep your fingers crossed... it might be back before this session is finished.
| 11:41 pm on Apr 17, 2003 (gmt 0)|
On my second deep crawl for this site and it's at 1321 hits this time. Still trucking. Go deepbot go!
I have been seeing a lot of Inktomi deep crawls too; that is a first for me. About 600 hits so far on this crawl.