There are statistics out there from other companies, but none -- it seems -- is as much of an authority as Netcraft.
>> how hard it is to replicate what Netcraft does?
I'd say it would be pretty difficult. The hard part isn't just collecting the data; WHERE the data is collected seems to be the larger issue. I know Netcraft has a large client base that includes many large companies, hosting companies among them.
>> How to figure out how many servers are behind a website?
I'm sure there are various methods of doing so -- none that I am familiar with. However, examining hosting companies is a good start at estimating this IMO. I'd guess Netcraft gets stats from ISPs, as well. It's probably just a mass collection of data from various sources compiled together.
>> is their methodology way off?
It depends on how they form their hypotheses and extrapolate from their data. All statistics have a certain percent error, and it can vary depending on many, many factors, including data collection methods, as mentioned by rubing. If the statisticians did their job well, then the stats should be relatively accurate.
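To put a rough number on "percent error", here is a minimal sketch of the textbook margin of error for a share estimated from a random sample. The function name and the figures are made up for illustration; none of this is Netcraft's data or method, and it only covers random sampling error, not bias from where the data is collected.

```python
import math

def margin_of_error(share, sample_size, z=1.96):
    """Approximate margin of error for a sampled proportion.

    Uses the normal approximation; z=1.96 corresponds to ~95% confidence.
    Assumes a simple random sample, which a real survey may not have.
    """
    return z * math.sqrt(share * (1 - share) / sample_size)

# Hypothetical example: 60% of 10,000 sampled sites run some server.
moe = margin_of_error(0.60, 10_000)
print(f"60% +/- {moe * 100:.2f} percentage points")
```

The error shrinks as the sample grows, but no sample size fixes a skewed choice of which sites get counted.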
>> pay to be listed by netcraft
I don't think that's true.