People saving your site to harddisk

Forum Moderators: DixonJones

Aaron

8:02 pm on Apr 18, 2001 (gmt 0)

Recently I found that a few people (four so far) were saving my site to their hard disks.

I discovered it because my webtracker logged an address that was something like c:/download/files etc., and it wasn't on my computer. Of course, it also meant the visitor was online at the time. Incidentally, could this be an advantage of having a webtracker instead of analysing logs?
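For anyone who wants to pick these visits out automatically, here is a minimal sketch in Python. It assumes your tracker or log gives you the referring address as a plain string; the function name `is_local_referrer` and the sample addresses are made up for illustration, and a real tracker may record local paths in other forms.

```python
import re
from urllib.parse import urlparse

# Matches a Windows drive-letter path such as "c:/download/files" or "C:\sites".
_DRIVE_PATH = re.compile(r"^[A-Za-z]:[\\/]")


def is_local_referrer(referrer: str) -> bool:
    """Return True if the referring address looks like a page opened from a local disk."""
    ref = referrer.strip()
    if not ref:
        return False
    # Check for a bare drive-letter path first, because urlparse would
    # otherwise read the drive letter as a URL scheme.
    if _DRIVE_PATH.match(ref):
        return True
    # Otherwise treat any file:// URL as a local copy.
    return urlparse(ref).scheme.lower() == "file"


if __name__ == "__main__":
    samples = [
        "c:/download/files/index.htm",      # hypothetical saved copy
        "file:///C:/download/index.htm",    # hypothetical saved copy
        "http://www.example.com/page.htm",  # ordinary web referrer
    ]
    for address in samples:
        print(address, is_local_referrer(address))
```

This only catches a saved copy when the reader is online and the page still requests something from the live server, which matches the situation described above.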

It's cool to see that people find your site worthy of saving, but it robs me of the satisfaction of seeing hits or drawing more traffic, since I update once a week.

It might also indicate a problem: users might find my site too slow to load, and so prefer to read it off their hard disks, even when online!

Perhaps I should do something to prevent this?

Mike_Mackin

8:06 pm on Apr 18, 2001 (gmt 0)

It is common for folks in Eastern Europe, for example, to download sites. They are paying BIG MONEY by the minute to be online.

I know a guy who is interested in SEs and downloaded Danny's complete site.

Xoc

8:08 pm on Apr 18, 2001 (gmt 0)

There is nothing that you can do to prevent it. You can break their links when they come off the disk, but there is nothing you can do to stop them from saving a page.

Aaron

8:14 pm on Apr 18, 2001 (gmt 0)

I saw some JavaScript tricks somewhere to prevent visitors from downloading your site. But I don't think they're foolproof. Hee hee... switch off JavaScript?

Woz

1:00 am on Apr 19, 2001 (gmt 0)

>It is common for folks in Eastern Europe

And Asia. I used to do it all the time when I was in China. Internet by the minute can be expensive.

There is not a great deal you can do to prevent it, but to curtail downloading programs you could include their signatures in your robots.txt file. Even then, though, some of these programs may be set up to ignore the robots.txt file. Have a look at Brett's Robots.txt tutorial here [searchengineworld.com].
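As a rough illustration of the robots.txt approach, the sketch below uses Python's standard `urllib.robotparser` to check what a well-behaved program would conclude from a robots.txt that names it. The agent name `ExampleRipper`, the `/cgi-bin/` rule, and the example.com URLs are placeholders, not real signatures; you would need to take the actual user-agent strings from your own logs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one named downloader, leave the rest
# of the site open to everyone else apart from /cgi-bin/.
ROBOTS_TXT = """\
User-agent: ExampleRipper
Disallow: /

User-agent: *
Disallow: /cgi-bin/
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if a robots.txt-respecting client may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://www.example.com/index.html"
    print("ExampleRipper:", allowed("ExampleRipper", page))
    print("SomeBrowser:", allowed("SomeBrowser", page))
```

This only models a program that chooses to obey the file. A downloader configured to ignore robots.txt never performs this check at all.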

I agree with Xoc, though, that one way to foil "ripperofferers" would be to use hard (absolute) links rather than relative links.
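To see why the link style matters, here is a small Python sketch using the standard `urllib.parse.urljoin`. It assumes the saved copy is opened from a made-up location, `file:///C:/download/site/index.htm`, and that the download tool left the links exactly as they were on the live site; `opens_from_disk` is just an illustrative helper name.

```python
from urllib.parse import urljoin, urlparse

# Hypothetical location of a page saved to someone's hard disk.
SAVED_PAGE = "file:///C:/download/site/index.htm"


def opens_from_disk(saved_page_url: str, href: str) -> bool:
    """Return True if following href from the saved page stays on the local disk."""
    resolved = urljoin(saved_page_url, href)
    return urlparse(resolved).scheme == "file"


if __name__ == "__main__":
    links = [
        "articles/page2.htm",                         # relative link
        "http://www.example.com/articles/page2.htm",  # hard (absolute) link
    ]
    for href in links:
        print(href, "->", urljoin(SAVED_PAGE, href), opens_from_disk(SAVED_PAGE, href))
```

The relative link resolves to another file in the saved folder, while the hard link still points at the live server, so the reader has to go online to follow it. A root-relative link such as /articles/page2.htm would resolve against the disk root rather than the saved folder, which is one way links end up broken when they come off the disk.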

Onya
Woz

JonB

5:23 am on Apr 19, 2001 (gmt 0)

I do this all the time.

Here in Slovenia, internet access costs me $1 per hour plus $30 per month. In Europe, internet access is still paid for by the hour in many if not most countries. And I have a 33.6 kbps modem line.

I use Teleport Pro, which is excellent. I think this program can handle the "hard links", so you can browse offline no matter what type of links are used, as long as the links are on the same domain (otherwise Teleport Pro won't download them, but I think you can configure this too).

The program can save a site or page to the hard drive in several ways: one keeps it as it is on the web (excellent for mirroring), and another prepares it for offline reading.

Jon

chiyo

6:03 am on Apr 19, 2001 (gmt 0)

We are guilty too! We download whole sites for research purposes, and then can browse with the telecom/ISP ticker off. Most good offline downloaders can parse relative or absolute links.

We use WinHTTrack, a simple freeware program that can follow links and download one site, or many sites to any link depth. It also downloads much faster than saving one page at a time, because it runs several threads at once, much as LeechFTP does. It follows robots.txt.

We do not do it to steal code. I would guess that most offline browsing is not for any nefarious or cheating purpose. But if you have mainly a marketing or advertising site, you may have reason to suspect other motives when people download material en masse.

People do it to us regularly, and we are pleased they find our content useful enough to download for later reading, even the whole site. When we do find that people have published our material under their own name, we do pursue it, and we use several methods to find such illegal copying. It's harder to find breaches of copyright when people make multiple copies to distribute offline to others. It is still illegal to do so, but the act of downloading even a whole site for personal use is, I think, not a problem at all.

Your logs still reflect the page views, at least the first time they look at a page. And server-based and browser-based caching already causes random error in your page counts. You may also see the hits (e.g. my documents/yoursite/blah.htm) whenever an absolute location is called while the reader is online and they didn't download all elements.

As publishers ourselves, we have to accept that publishing information on the Web means allowing people to view it, whether online or offline, though not to breach copyright by copying code or reproducing content on other domains without clearance. It's the same deal whenever you publicly publish anything, such as a book.

CyberSpaced

2:26 pm on Apr 21, 2001 (gmt 0)

I noticed the trend in downloading sites as well... But I wonder how many times people download my sites to reproduce them somewhere else....

Has anyone ever run across their site posted somewhere else and not known about it? I'd like to know how you found it if you did...

theperlyking

3:23 pm on Apr 21, 2001 (gmt 0)

Don't forget the browsers' "offline" function. Both IE and Netscape (at least) allow people to view your site when not connected. IE can also pull pages down for you on a schedule.
I don't think it's something you can really stop, as there are so many routes for people to do it.