Verity K2 Search Engine

Ignore a directory within an indexed directory

         

justa

12:51 pm on Apr 18, 2002 (gmt 0)

10+ Year Member



I've got a search running for my site using the Verity K2 Search Engine bundled with ColdFusion 5.

I have created a collection and can spider all of the pages in that directory and all of its subdirectories. The spider starts in E:\pawaintranet\ and runs through every subdirectory, including E:\pawaintranet\staging_new\, which holds the staging pages still under development.

I need to find a way to run the spider for E:\pawaintranet but ignore the E:\pawaintranet\staging_new\ directory and all of its subdirectories.

At the moment I'm running the spider through a cfm page, but I know you can also run it through the command prompt. I've tried

vspider -start E:\pawaintranet -nofollow E:\pawaintranet\staging_new -collection D:\verity\collec~1\test_s~1\

and

vspider -start E:\pawaintranet -nofollow E:\pawaintranet\staging_new -collection D:\verity\collec~1\test_s~1\spider\

Does anyone have any suggestions?

william_dw

1:46 am on Apr 19, 2002 (gmt 0)

10+ Year Member



Hiya,
it's been a while,
but I think you may have success on the command line if you use double backslashes in the directory names (i.e. E:\\pawaintranet\\staging_new).

vspider tends to treat command-line arguments as regular expressions, so escaping the backslashes may solve the problem. (You may not have received an error if the path looked like a valid regex to vspider.)
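
To see why an unescaped backslash can fail silently, here is a minimal sketch using Python's re module as a stand-in. It is not vspider's own matcher, whose regex dialect may well differ, and the index.cfm file name is made up for illustration. In most regex dialects "\s" means "any whitespace character", so "\staging_new" quietly looks for something other than the directory name:

```python
import re

# Hypothetical file under the staging directory (file name is an assumption).
path = r"E:\pawaintranet\staging_new\index.cfm"

# Unescaped: the regex engine reads "\s" as "any whitespace character",
# so this pattern looks for whitespace followed by "taging_new".
unescaped = re.search(r"\staging_new", path)

# Escaped: "\\" in the pattern stands for one literal backslash.
escaped = re.search(r"\\staging_new", path)

# re.escape builds the escaped form of the whole literal path.
full_pattern = re.escape("E:\\pawaintranet\\staging_new")
full_match = re.search(full_pattern, path)

print(unescaped is None)       # the unescaped pattern finds nothing, with no error
print(escaped is not None)     # the escaped pattern finds the directory
print(full_match is not None)  # so does the fully escaped path
```

If vspider behaves in a similar way, the unescaped path would be accepted as a valid pattern that simply never matches the staging directory, which would explain the lack of an error message.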

Also, try enclosing the path in double quotes (").

And another thing:
the version of Verity may have moved on, but as of v3.7 the correct -collection syntax was -collection path_and_filename.
Just don't use a .clm file; it doesn't like the extension.

Oh, one more thing:
if the syntax is still valid, why not use -exclude instead? I think it would stop the spider from going into that directory.

Here are some examples, one of which I think should work...

vspider -start E:\pawaintranet -nofollow E:\\pawaintranet\\staging_new -collection D:\verity\collec~1\test_s~1\

vspider -start E:\pawaintranet -nofollow "E:\\pawaintranet\\staging_new" -collection D:\verity\collec~1\test_s~1\

vspider -start E:\pawaintranet -nofollow E:\\pawaintranet\\staging_new -collection D:\verity\collec~1\test_s~1\coll1.col

vspider -start E:\pawaintranet -nofollow "E:\\pawaintranet\\staging_new" -collection D:\verity\collec~1\test_s~1\coll1.col

vspider -start E:\pawaintranet -exclude E:\pawaintranet\staging_new -collection D:\verity\collec~1\test_s~1\

vspider -start E:\pawaintranet -exclude "E:\pawaintranet\staging_new" -collection D:\verity\collec~1\test_s~1\

And if none of those work then perhaps someone else knows the correct syntax.

If it works, do let me know.
HTH,
Dw

PS: as far as I know, you shouldn't escape the collection path.

pwringger

3:53 am on Aug 27, 2002 (gmt 0)

10+ Year Member



Check this page:

[daemon.com.au...]

The Verity style has to be remapped to return CF variables.