Singingfish extractor not obeying robots.txt

sound files hosted at different domain than website

5:42 pm on June 9, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Aug 18, 2002
votes: 0

We host a small number of sound files for a friend in a directory that is excluded by robots.txt. The website that links to them, however, is on a different domain with no such exclusion, e.g.:


Singingfish.com's spider crawls the original website and treats the linked sound files as fair game. The extractor then blindly requests those sound files without checking the robots.txt file of the domain that actually hosts them.
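For anyone wondering what "checking the robots.txt of the hosting domain" would look like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not how Singingfish's extractor works; the host names, the `/sounds/` path, the `ExampleExtractor` user agent, and the `ROBOTS_BY_HOST` table are all made up for illustration. The point is only that the rules consulted are those of the host in the file's URL, not those of the page that linked to it.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt bodies keyed by host. A real crawler would fetch
# http://<host>/robots.txt over the network instead of reading this table.
ROBOTS_BY_HOST = {
    "www.linking-site.example": "User-agent: *\nDisallow:\n",
    "media-host.example": "User-agent: *\nDisallow: /sounds/\n",
}

def allowed(url, user_agent="ExampleExtractor"):
    """Return True if the robots.txt of the URL's own host permits the fetch."""
    host = urlsplit(url).netloc
    parser = RobotFileParser()
    # Parse the rules belonging to the host that serves the file itself.
    parser.parse(ROBOTS_BY_HOST.get(host, "").splitlines())
    return parser.can_fetch(user_agent, url)

# The page that links to the files may be crawled...
print(allowed("http://www.linking-site.example/index.html"))
# ...but the linked file lives on another host, whose rules exclude it.
print(allowed("http://media-host.example/sounds/clip.rm"))
```

A crawler that only consulted the rules of the linking site would wrongly conclude that the sound file is open to it, which is the behavior described above.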

I sent them an email. In the meantime, I am banning their extractor:


ADDED: Got a reply to my email. They do not seem concerned that their extractor ignores robots.txt. Instead, they said they would add my domain to their exclusion list and run a script to remove our files from their database. Not what I was hoping for...

3:28 pm on June 11, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:Mar 22, 2002
votes: 0

Adding your domain to an exclusion list seems a pretty inefficient way of sorting things out. Surely getting the spider to read robots.txt is the way to do things, but hey, who are we to ask? ;)

As an aside, did you consider blocking the spider via .htaccess?


12:51 am on June 14, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Aug 18, 2002
votes: 0

The extractor uses a generic Real Media UA, so blocking on the user agent would also catch legitimate players. I blocked them by IP instead.

I replied to their email, explaining the situation and even pointing them to this thread. Unfortunately, I got no response.
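As a rough illustration of blocking by address rather than by user agent, the sketch below uses Python's standard `ipaddress` module. The network shown is a reserved documentation range, not Singingfish's real address space, and `BLOCKED_NETWORKS` and `is_blocked` are hypothetical names. On a real server the same idea would normally be expressed in the server configuration rather than in application code.

```python
import ipaddress

# Hypothetical blocklist. 192.0.2.0/24 is a reserved documentation range,
# used here as a stand-in for the extractor's actual addresses.
BLOCKED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]

def is_blocked(remote_addr):
    """Return True if the client address falls inside any blocked network."""
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in network for network in BLOCKED_NETWORKS)

# The decision depends only on the address, so a generic user agent string
# shared with ordinary media players does not matter.
print(is_blocked("192.0.2.17"))
print(is_blocked("198.51.100.5"))
```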

