Spider Simulator

Does anybody know a Windows equivalent of Sim Spider?

         

piskie

1:07 am on Dec 13, 2001 (gmt 0)

10+ Year Member



I would like a utility installed on my Windows PC that would strip the text out of an HTML page and display it the way a spider retrieves it. Does anybody know of a good tool?
Thanks in advance
Piskie

Brett_Tabke

10:26 am on Dec 13, 2001 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Well, there are editors that will do that. What I do a lot of the time (using Opera) is press alt-f3 (view source), which loads the page in my editor (EditPlus), and then it's a simple key command to strip the HTML (which I have redefined as alt-f3). So, alt-f3, alt-f3 and I am looking at the stripped source in about a second.

(btw: there will be a public release of a utility containing Sim Spider early next year - Feb 1, I think).
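
For anyone without an editor that has a strip-HTML command, the same idea can be sketched in a few lines of Python using only the standard library. This is a rough approximation, not how EditPlus or Sim Spider actually work: the names `TextOnly` and `strip_html` are made up for illustration, and skipping `<script>` and `<style>` contents is an assumption about what a text-only view should leave out.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects text in source order, skipping <script> and <style> contents."""

    SKIP = {"script", "style"}  # assumption: these are not useful in a text view

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.parts.append(text)


def strip_html(html):
    """Return the page text, one chunk per line, in the order it appears in the source."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "\n".join(parser.parts)


if __name__ == "__main__":
    page = "<html><head><title>Demo</title></head><body><h1>Hello</h1><p>Some <b>bold</b> text</p></body></html>"
    print(strip_html(page))
```

Paste a saved page's source into `strip_html` and you get the text with the markup removed, which is roughly what the editor shortcut above produces.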

piskie

1:11 pm on Dec 14, 2001 (gmt 0)

10+ Year Member



Done and dusted. A good result, thanks Brett.
I also look forward to having Sim Spider on my desktop when it comes out.

worklive

2:34 pm on Dec 14, 2001 (gmt 0)



Imagination can be a good technology. I know spiders read pages in columns, down one then across to the top, then down etc.

So it is a good idea to load key-terms, spider-food, in columns.

WebGuerrilla

1:07 am on Dec 16, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member




piskie,

Another quick thing you can do that will give you close to the same thing is to simply save the page you're viewing as a text file. In IE, you would just click File > Save As and then select text file.

It doesn't give you the additional hyperlinks like sim spider does, but it will list the text in the same order that a spider would see it.
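
If you want the hyperlinks as well as the text, a small script can collect both in source order. The sketch below is only a guess at that kind of output, not a description of what Sim Spider really does; `TextAndLinks` and `text_and_links` are hypothetical names, and it only looks at the `href` attribute of `<a>` tags.

```python
from html.parser import HTMLParser


class TextAndLinks(HTMLParser):
    """Collects text chunks and <a href> targets in the order they appear in the source."""

    def __init__(self):
        super().__init__()
        self.text = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        chunk = data.strip()
        if chunk:
            self.text.append(chunk)


def text_and_links(html):
    """Return (text_chunks, link_targets) for an HTML string."""
    parser = TextAndLinks()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.text, parser.links


if __name__ == "__main__":
    sample = '<p>Intro <a href="/a.html">First</a> and <a href="/b.html">Second</a></p>'
    text, links = text_and_links(sample)
    print("\n".join(text))
    print("Links:", links)
```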

Robert Charlton

5:27 am on Dec 16, 2001 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



>>I know spiders read pages in columns<<

Not necessarily... depends on how your tables are built. What they do read is page source code from top to bottom.
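
A quick way to see the top-to-bottom point is to list table cells in the order they occur in the source. In the sketch below, the sample table, `CellOrder`, and `cells_in_source_order` are all invented for illustration, and it assumes a simple table with no nesting. On screen the left column reads "nav 1, nav 2", but in the source the cells come row by row.

```python
from html.parser import HTMLParser


class CellOrder(HTMLParser):
    """Records the text of each <td>/<th> in source order (no nested tables assumed)."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag in ("td", "th"):
            self._in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.cells[-1] += data


def cells_in_source_order(html):
    """Return the cell texts in the order a top-to-bottom reader of the source meets them."""
    parser = CellOrder()
    parser.feed(html)
    parser.close()
    return [cell.strip() for cell in parser.cells]


if __name__ == "__main__":
    table = """
    <table>
      <tr><td>nav 1</td><td>headline</td></tr>
      <tr><td>nav 2</td><td>body text</td></tr>
    </table>
    """
    print(cells_in_source_order(table))
```

With this layout the second navigation cell arrives after the headline, so text only reads "down a column" when the whole column sits in a single cell or its own nested table.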

piskie

11:06 pm on Dec 16, 2001 (gmt 0)

10+ Year Member



Thanks, WebGuerrilla. I just tried that and it works fine. It leaves less white space than stripping HTML in an editor, but using an editor is a bit quicker. I will be using both methods depending on target and purpose.