HTTrack did NOT request robots.txt. My understanding is that their free version respects robots.txt,
but their paid version can override that.
Questions:
1) What do you suppose the purpose of this is?
2) Might somebody be trying to scrape my whole site?
3) Let's say I disallow HTTrack in robots.txt. Per their website, "disallow /" is too vague!
I have to specify some directory. My whole site is in the root directory.
Can I disallow the root dir, even though that is where robots.txt lives?
Do you see the logical problem with that? I would be disallowing robots.txt too,
thus providing them a legal 'out'.
- Confused in California
Dishonorable bots and/or visitors couldn't care less whether robots.txt exists, much less what it contains.
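For a crawler that does honor robots.txt, here is a rough sketch of how a root-level rule from question 3 gets read. It uses Python's standard urllib.robotparser. It assumes the crawler matches on the user-agent token "HTTrack", and the robots.txt content and paths are made up for illustration. HTTrack's own parser may behave differently, so treat this as how a standard parser sees the rule, not as a promise about that tool. A crawler has to fetch robots.txt before it can read any rule in it, so a root disallow is normally understood as covering the rest of the site, not the robots.txt file itself.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site that lives entirely in the root directory.
# The "HTTrack" user-agent token is an assumption for this example.
ROBOTS_TXT = """\
User-agent: HTTrack
Disallow: /
"""


def allowed(agent: str, path: str) -> bool:
    """Return True if a well-behaved crawler named `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Compare the named crawler against an arbitrary other bot.
    for agent in ("HTTrack", "SomeOtherBot"):
        for path in ("/", "/index.html", "/images/logo.gif"):
            print(agent, path, allowed(agent, path))
```

With this rule every path is refused to the named agent and left open to everyone else. None of it stops a client that never reads the file.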
For answers to your first two questions:
Only the person or machine grabbing the pages has the answer. A webmaster who has learned to recognize traffic patterns in their visitor logs can make an educated guess.
Your best bet is to look into using .htaccess.
Here are some old threads:
A Simple Beginning
[webmasterworld.com...]
Close to Perfect
[webmasterworld.com...]
[httrack.com...]
About as often as visitors read the TOS, UAGs or FAQs on the websites they visit.