


Singingfish extractor not obeying robots.txt

sound files hosted at different domain than website



5:42 pm on Jun 9, 2003 (gmt 0)

10+ Year Member

We host a small number of sound files for a friend in a directory excluded by robots.txt. The website that links to them, however, is on a different domain with no such exclusion, e.g.:


Singingfish.com's spider sees the original website and declares it open season on the sound files. The extractor then blindly hits those sound files without checking the robots.txt file of the hosting domain.
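For anyone wondering what the missing check would look like, here is a minimal sketch in Python of a per-host robots.txt test that a fetcher could run before requesting a media file. It is not Singingfish's code; the function names, the `ExampleExtractor` user-agent, and the `media.example.net` host are all hypothetical. The point is only that the robots.txt consulted must be the one on the domain hosting the file, not the one on the page that links to it.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def robots_url_for(media_url):
    """Build the robots.txt URL for the host that actually serves media_url."""
    parts = urlsplit(media_url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


def allowed_by_host_robots(media_url, robots_txt, user_agent="ExampleExtractor"):
    """Return True if the hosting domain's robots.txt body permits media_url.

    robots_txt is the text already downloaded from robots_url_for(media_url);
    user_agent is a hypothetical name used only for this illustration.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, media_url)


if __name__ == "__main__":
    # Hypothetical robots.txt on the hosting domain, excluding one directory.
    hosting_robots = "User-agent: *\nDisallow: /sounds/\n"
    url = "http://media.example.net/sounds/clip.rm"
    print(robots_url_for(url))
    print(allowed_by_host_robots(url, hosting_robots))
```

A fetcher that ran this test against the hosting domain for every linked file would skip anything in the excluded directory, regardless of where the link was found.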

I sent them an email. In the meantime, I am banning their extractor:


ADDED: Got a reply to my email. They do not seem concerned that their extractor ignores robots.txt. Instead, they said they would add my domain to their exclusion list and run a script to remove our files from their db. Not what I was hoping for...


3:28 pm on Jun 11, 2003 (gmt 0)

10+ Year Member

Adding your domain to an excluded list seems a pretty inefficient method of sorting things out - surely getting the spider to read robots.txt is the way to do things but hey, who are we to ask? ;)

As an aside, did you consider blocking the spider via .htaccess?



12:51 am on Jun 14, 2003 (gmt 0)

10+ Year Member

The extractor uses a generic Real Media UA, so I blocked them by IP instead.
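To illustrate the idea of matching on address rather than user-agent, here is a rough Python sketch of an IP deny-list check. It is not the rule I actually used; the `BLOCKED_NETWORKS` list and `is_blocked` function are made up for the example, and the range shown is a reserved documentation range, not the extractor's real address.

```python
import ipaddress

# Hypothetical deny-list. 192.0.2.0/24 is a reserved documentation range,
# used here as a stand-in for whatever addresses you want to refuse.
BLOCKED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_blocked(client_ip, blocked=BLOCKED_NETWORKS):
    """Return True if client_ip falls inside any network on the deny-list."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in blocked)


if __name__ == "__main__":
    print(is_blocked("192.0.2.17"))
    print(is_blocked("198.51.100.5"))
```

Because the decision depends only on the source address, it still works when the client sends a generic user-agent shared with ordinary media players.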

I replied to their email, explaining the situation and even pointing them to this thread. Unfortunately I got no response.

