incrediBILL - 10:42 pm on Mar 31, 2009 (gmt 0)
My own custom script. I use the same "curl" program in the script above to download each main web page to a file, with verbose headers and all redirections. Then I parse it against many hundreds of "fingerprints" I've collected over time to identify sites with a virus, domain parks, hosting setup pages, soft 404s, adult sites, hacked sites and a whole lot more. Additionally, there are calls made to WHOIS to identify various name parks and, if the site fails to load, to check whether the domain is still registered. Even that wasn't terribly complicated, but collecting the data to make that process work well can take years. We get lots of member reports; reporting doesn't work as well as it should, but it's better than no member reports.
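For anyone wanting to experiment with the same idea, here is a rough sketch in Python. It is not the script described above, just an illustration of the fetch-then-fingerprint approach. The three patterns, their labels, and the function names `fetch` and `classify` are all made up for the example; a real fingerprint list would be far larger and built up from pages actually seen in the wild. The curl flags used (`-s`, `-i`, `-L`, `--max-time`) are one plausible way to get headers plus redirects and are an assumption, not necessarily what the original script passes.

```python
import re
import subprocess

# Hypothetical fingerprints (label -> regex). These are illustrative only;
# a working set would be collected over time from real bad pages.
FINGERPRINTS = {
    "domain_park": re.compile(r"this domain (is|may be) for sale|parked free", re.I),
    # Soft 404: the server answers 200 but the body says the page is gone.
    "soft_404": re.compile(r"HTTP/\S+ 200[\s\S]*?(page not found|no longer exists)", re.I),
    "hosting_setup": re.compile(r"default web ?site page|account (has been )?suspended", re.I),
}


def fetch(url, timeout=20):
    """Download headers and body with curl, following redirects.

    Returns (curl exit code, raw text). A non-zero exit code means the
    page failed to load, which is the cue for a follow-up WHOIS check.
    """
    result = subprocess.run(
        ["curl", "-s", "-i", "-L", "--max-time", str(timeout), url],
        capture_output=True,
        text=True,
        errors="replace",  # tolerate pages with broken encodings
    )
    return result.returncode, result.stdout


def classify(raw):
    """Return the sorted labels of every fingerprint matching the raw response."""
    return sorted(
        label for label, pattern in FINGERPRINTS.items() if pattern.search(raw)
    )
```

Typical use would be `code, raw = fetch(url)` followed by `classify(raw)` for each listed site, flagging anything that returns a non-empty list for manual review. Keeping the patterns in a plain dictionary makes it easy to add a new fingerprint each time a new kind of bad page turns up.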
How do you all expedite the task of keeping out the bad stuff - subsequent transformations of listed websites, websites that have died, sites going from a website to a parked domain page or worse - transformed on expiration to malware download, MFA, pron, etc.? Member "reporting" links? Do they work?