fashezee

8:24 pm on Jan 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



How can I tell when a spider has visited my site?
Is it only by my server

pendanticist

9:49 pm on Jan 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



How can I tell when a spider has visited my site?
Is it only by my server

The information you'd need to evaluate is in your access_log files. Do you have access to them? I only ask because I understand individual hosts handle this differently.

Pendanticist.

mayor

10:29 pm on Jan 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Check with your site hosting service.

If it's a freebie host, don't expect to have access to the raw logs, where you can see a record of every visitor that comes to your site, including spiders.

If it's a paid hosting service, you probably have access to the raw logs via an FTP client like WS_FTP, which is available for free on the web.

ExtremeExports

11:01 pm on Jan 11, 2003 (gmt 0)

10+ Year Member



I have raw access logs. I can open them and save them to my computer, but they are saved as .gz files. Which program do I use to open and view my logs? I've been trying to find this out for some time now. Does anyone know how I can do this?

mayor

11:16 pm on Jan 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



To open the .gz file, FTP it to your computer, then open it with Winzip (free trial versions are available on the web). It takes a little hacking to learn how to use Winzip to expand the file, though. I associate the .gz file type with Winzip, then right-click the filename in Windows Explorer, select Winzip and click on 'expand to here'. Not a very good explanation, is it? So get some instructions for Winzip and read them ... that probably beats my insistence on hacking rather than reading by a long shot.

pendanticist

11:37 pm on Jan 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



ExtremeExports,

What follows is only a small part of what mayor is coaching you on.

---------------------------------------------------------------------------------

I don't know if your host server is the same as mine, but when you're looking at those files from within your FTP client - the very first one (at the top of the list) should be viewable as a .txt file even after you download it to your hard drive. Just that first one, mind you.

That file shows the most current activity and is also the data that shows in your daily stats page when you view it.

Once the time period set by your host server (usually a week) runs out, that viewable .txt file gets converted by your host server into one of those .gz files and a new .txt file is started.

As you become better at evaluating access_log files, you might want to keep tabs on that one, as it shows traffic incrementally and not en masse like the .gz files.

Myself, I download that one several times a day.

Just a thought.

Pendanticist.

andreasfriedrich

11:40 pm on Jan 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Powerarchiver, available at powerarchiver.com, is freeware and requires no hacking. However, most log analyzers will read packed log files directly.

Andreas
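For anyone who would rather not unpack the archive at all, here is a minimal Python sketch of reading a .gz log directly, much as a log analyzer would. The function name `read_gz_lines` and the demo file name are made up for illustration, and the UTF-8 decoding is an assumption about the log's encoding.

```python
import gzip
import os
import tempfile


def read_gz_lines(path):
    """Yield lines from a gzip-compressed log without unpacking it to disk."""
    # errors="replace" keeps stray bytes in old logs from raising an exception.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            yield line.rstrip("\r\n")


if __name__ == "__main__":
    # Build a tiny stand-in archive so the sketch runs without a real log.
    demo_path = os.path.join(tempfile.mkdtemp(), "access_log.gz")
    with gzip.open(demo_path, "wt", encoding="utf-8") as out:
        out.write("first request\nsecond request\n")
    for entry in read_gz_lines(demo_path):
        print(entry)
```

Point `read_gz_lines` at a log downloaded from your host instead of the demo file to page through the real thing.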

fashezee

11:58 pm on Jan 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Once I have access to my log files, how do I determine the IP addresses that belong to spiders?

ExtremeExports

12:52 am on Jan 12, 2003 (gmt 0)

10+ Year Member



Thank you everyone for the help. I will try it and see if I understood the instructions on how to open it.
Extreme

engine

12:56 am on Jan 12, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



There is a terrific resource on spiders here: [searchengineworld.com]. Look at the IP addresses on that page and they will indicate what you should look for in your site logs.

You will find further information on spiders here in this forum. Use the WebmasterWorld site search to identify newer or "rogue" spiders.

When you look in your logs, generally, the respectable spiders look for your robots.txt file first. That should help you see the spider visits.
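The robots.txt tip can be sketched in a few lines of Python. This assumes the log uses the usual Apache layout, where the client address is the first field and the request sits inside the first pair of double quotes. The function name and the sample lines (with documentation-range addresses) are invented for illustration. A robots.txt request is only a hint that the visitor is a spider, not proof.

```python
from collections import Counter

# Made-up sample lines in the usual Apache layout (not real traffic).
SAMPLE_LINES = [
    '192.0.2.10 - - [11/Jan/2003:20:24:00 +0000] "GET /robots.txt HTTP/1.0" 200 68',
    '192.0.2.10 - - [11/Jan/2003:20:24:05 +0000] "GET /index.html HTTP/1.0" 200 5120',
    '198.51.100.7 - - [11/Jan/2003:21:49:12 +0000] "GET /index.html HTTP/1.1" 200 5120',
]


def robots_txt_visitors(lines):
    """Count /robots.txt requests per client address."""
    hits = Counter()
    for line in lines:
        parts = line.split('"')
        if len(parts) < 3:
            continue  # no quoted request field, so skip the line
        request = parts[1].split()  # e.g. ['GET', '/robots.txt', 'HTTP/1.0']
        if len(request) >= 2 and request[1].split("?")[0] == "/robots.txt":
            client = line.split(" ", 1)[0]  # first field is the client address
            hits[client] += 1
    return hits


if __name__ == "__main__":
    for client, count in robots_txt_visitors(SAMPLE_LINES).items():
        print(client, count)
```

Any address it reports can then be compared with a published spider list like the one linked above.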

ExtremeExports

1:13 am on Jan 12, 2003 (gmt 0)

10+ Year Member



I've downloaded the raw access log to my computer and extracted it with Winzip, but when I try to open it I get the following error: Program too big to fit in memory.
What do I do now? :-(

fashezee

1:17 am on Jan 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Reboot

andreasfriedrich

1:18 am on Jan 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Use a better editor (have a look at the editors forum for some suggestions) or get more memory.
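If the log really is too large for an editor, another option is to peek at just the start of it. The following is a minimal sketch, not a replacement for a proper editor. The function name `preview` and the demo file are hypothetical, and UTF-8 decoding is an assumption.

```python
import os
import tempfile
from itertools import islice


def preview(path, count=20):
    """Return the first `count` lines of a text file without reading the rest."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # islice stops after `count` lines, so the remainder is never loaded.
        return [line.rstrip("\r\n") for line in islice(handle, count)]


if __name__ == "__main__":
    # Write a stand-in log so the sketch runs without a real file.
    demo_path = os.path.join(tempfile.mkdtemp(), "access_log")
    with open(demo_path, "w", encoding="utf-8") as out:
        for number in range(1000):
            out.write(f"request {number}\n")
    for entry in preview(demo_path, 5):
        print(entry)
```

Because only the requested lines are read, memory use stays small however big the log is.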

ExtremeExports

1:40 am on Jan 12, 2003 (gmt 0)

10+ Year Member



It keeps trying to open in MS-DOS. I am so frustrated right now that I am unable to do something that sounds so simple. :(

ExtremeExports

1:45 am on Jan 12, 2003 (gmt 0)

10+ Year Member



Okay! I am finally able to view it in a text editor, and it looks very confusing. But thank you again, everyone. :)

andreasfriedrich

1:50 am on Jan 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Perhaps this info on Apache's log files [httpd.apache.org] will help.
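To make the format less confusing, here is a rough Python sketch that splits one log line into named fields. It assumes the log follows Apache's Common Log Format (host, identity, user, time, request, status, size), optionally followed by the referer and user-agent fields of the combined format. The pattern, the function name and the sample line (including "ExampleBot") are made up for illustration, and your host's format may differ.

```python
import re

# Common Log Format, with the combined format's two extra fields optional.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Invented example entry, not real traffic.
    sample = ('192.0.2.10 - - [11/Jan/2003:20:24:00 +0000] '
              '"GET /robots.txt HTTP/1.0" 200 68 "-" "ExampleBot/1.0"')
    for name, value in parse_line(sample).items():
        print(f"{name}: {value}")
```

When the log is in the combined format, the `agent` field is often the quickest clue to spider visits, since well-behaved crawlers usually identify themselves there.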

ExtremeExports

1:57 am on Jan 12, 2003 (gmt 0)

10+ Year Member



THANK YOU ANDREASFRIEDRICH!
I figured it out! Now I get it! :o

--Petra

andreasfriedrich

2:05 am on Jan 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You're welcome, Petra.

Happy log checking ;)

Andreas