

Google SEO News and Discussion Forum

This 98 message thread spans 4 pages.
25 Signals of Crap
So many threads on Signals of Quality - let's take the reverse approach...
Fribble




msg:3233695
 5:34 am on Jan 27, 2007 (gmt 0)

A recent thread I read in the supporters forum that mentioned the ever-popular phrase "signals of quality" got me thinking. Why don't we try and compile a list of possible "signals of crap"?

I realize that many of the list items would logically just be a perceived "signal of quality" reversed but if we all dig down into our experience then this exercise just might bring up a few unique and useful insights. Here's a start:

25 Signals of Crap

  1. Reciprocal link request pages.
  2. No Privacy policy.
  3. Outdated copyright date or last modified date visible on the pages.
  4. Error pages that don't send 404 headers, or that send content regardless of the page requested or query string entered.
  5. Massive numbers of incoming links from link farms.
  6. Dead/404ing links.
  7. High link churn.
  8. No published contact address, email address or phone number.
  9. A high bounce rate (surfers clicking back on their browser and selecting another search result).
  10. Too much duplicate content.
  11. Whois info for the domain which is the same as other domains previously penalized or banned. (Could also be true of adsense publisher/affiliate ID's and other identifiable footprints)
  12. Use of/links to affiliate programs that are known scams
  13. Domains previously used for spam or that are blacklisted.
  14. Stagnation (Site never changes)
  15. Excessively long URIs/URLs (query strings or folder and file names)
  16. A high percentage of affiliate links vs regular outbound links.
  17. No / very few outbound links.
  18. No / very few inbound links.
  19. All inbound links are to homepage only
  20. Outbound links to questionable/spammy/crap sites.
  21. Profanity or explicitly adult language on a non-adult site.
  22. Too many spelling errors.
  23. Contains unrelated subjects (ex: a site that reviews toys and tries to sell insurance or viagra).
  24. Lack of interest from social bookmarking sites.
  25. MySQL or PHP errors in the pages

I'm not claiming any of these are definitely a signal of crap, but they make my list of possibilities based on my own conversations and observation.

Please add/subtract/modify and let's see if we can find a new perspective and learn something.
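
Item 4 in the list above is one of the few signals that is easy to test mechanically. Here is a minimal sketch, assuming that a made-up URL on a healthy site should come back as 404 (or 410). The names `probe_url`, `looks_like_soft_404` and the `fetch` callable are hypothetical, invented for illustration; nothing here is a claim about how any search engine actually checks this.

```python
import uuid
from typing import Callable, Tuple

# Hypothetical fetcher type: takes a URL, returns (HTTP status code, body).
Fetcher = Callable[[str], Tuple[int, str]]


def probe_url(base_url: str) -> str:
    """Build a URL that almost certainly does not exist on the site."""
    return base_url.rstrip("/") + "/" + uuid.uuid4().hex + ".html"


def looks_like_soft_404(base_url: str, fetch: Fetcher) -> bool:
    """Return True if a made-up URL is answered with anything but 404/410.

    Assumption: a well-behaved site answers unknown URLs with 404 or 410.
    """
    status, _body = fetch(probe_url(base_url))
    return status not in (404, 410)
```

Passing the fetcher in as a parameter keeps the check independent of any particular HTTP library; in practice it could wrap whatever client you already use.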

 

tedster




msg:3233794
 8:38 am on Jan 27, 2007 (gmt 0)

No real menu or information architecture -- just a laundry list of links going on down the page.

appi2




msg:3233811
 9:07 am on Jan 27, 2007 (gmt 0)

Going to get flamed for this...

Adsense that follows G's heat map.

Is this going to be for a MSSA penalty check?

abbeyvet




msg:3233812
 9:07 am on Jan 27, 2007 (gmt 0)

Articles that are bang on-topic but have no value.

They are generally 300-500 words long with multiple repetitions of keywords/phrases but a low unique word count.

eg

APPLES
Apples are a type of fruit. They grow on trees and can be red apples or green apples. Apples can be eaten raw or cooked. etc etc

These articles almost always contain within them 2 or more large adsense or other advertising units.
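
The "low unique word count" and "multiple repetitions of keywords" described above can be put into rough numbers. This is only a sketch under simple assumptions: words are runs of letters and apostrophes, and the function names `unique_word_ratio` and `keyword_share` are made up for illustration. It is not a claim about what any search engine measures.

```python
import re


def _words(text: str) -> list:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def unique_word_ratio(text: str) -> float:
    """Share of distinct words among all words (0.0 for empty text)."""
    words = _words(text)
    if not words:
        return 0.0
    return len(set(words)) / len(words)


def keyword_share(text: str, keyword: str) -> float:
    """Fraction of all words that are exactly the given keyword."""
    words = _words(text)
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


sample = (
    "Apples are a type of fruit. They grow on trees and can be red apples "
    "or green apples. Apples can be eaten raw or cooked."
)
print(unique_word_ratio(sample), keyword_share(sample, "apples"))
```

The unique-word ratio falls naturally as a text gets longer, so it only means something when comparing pages of similar length, such as the 300-500 word articles mentioned above.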

BeeDeeDubbleU




msg:3233831
 9:50 am on Jan 27, 2007 (gmt 0)

I note that Advertising/Adsense is beginning to feature highly even at this early stage in the thread. ;)

steveb




msg:3233833
 9:56 am on Jan 27, 2007 (gmt 0)

Adsense
More than 25% links from blogs
No links to the site from any domain in the top 100 for a query where the page ranks in the top 20 for that query

thecoalman




msg:3233844
 10:13 am on Jan 27, 2007 (gmt 0)

My perspective from a guy with a few smaller sites on a few in your list.

Reciprocal link request pages.

Depends on what you are linking to. If it's just adding any link because the other site has done the same, then I can agree. I use such pages to gather links for my sites but will only link to sites that are of the same nature. My sites are also aimed primarily at locals, so I may include some links that may be of interest to local visitors.

No Privacy policy.

Don't have one... :p Then again, the only e-mail addresses I gather are through my forums, and the user only ever gets email they request through the forum. I haven't ever sent a single e-mail to anyone from any of these e-mail addresses and never will. Just because the site has no privacy policy doesn't necessarily make it crap; in my case your information is more secure than most places because I'm never going to use it, nor will I ever share it. What percent of users do you think ever read that anyway? That's besides the fact that most of the "biggies" will use some big long privacy policy that appears to protect your information right up until the last paragraph or so.... Who would you rather trust your info to, me or the guy with the big long privacy policy you're not going to read anyway?

Stagnation (Site never changes)

I have one site that hasn't changed in 3 or 4 years. It's for a single niche product and has ranked at the top for the last 2 years. There is simply nothing to add that will add any value to the end user. I will be updating it shortly but they will mostly be cosmetic changes.

No / very few inbound links.

The above-mentioned site is so niche that there are very few places that would have any interest in linking to it. That certainly doesn't make it crap, though.

For the most part the rest I can agree with.

abbeyvet




msg:3233848
 10:30 am on Jan 27, 2007 (gmt 0)

Just because the site has no privacy policy doesn't necessarily make it crap

No, but a search bot is not going to sit down and have a nice friendly chat with you about your attitude to privacy, it will just note that it didn't find a policy and may (who knows?) chalk that up against you.

Aside from which, it is a legal obligation in many jurisdictions to publicly disclose on your site how you use personal information provided by the user.

BeeDeeDubbleU




msg:3233860
 10:55 am on Jan 27, 2007 (gmt 0)

I think this is extremely unlikely. How would a search engine determine what is a privacy policy?

On websites that don't collect information privacy policies are irrelevant.

steveb




msg:3233870
 11:23 am on Jan 27, 2007 (gmt 0)

A lack of a privacy policy is complete nonsense. Having a privacy policy obviously is not a signal of quality, and not having one certainly does nothing to imply lack of quality.

Similarly how does "No published contact address, email address or phone number" have the slightest relevance to quality? But more to the point, how is a bot going to read a graphic showing an email address?

Ya gotta think like a search bot looking for quality, not some paranoid widget buyer. The two have little in common.

The Contractor




msg:3233874
 11:28 am on Jan 27, 2007 (gmt 0)

On websites that don't collect information privacy policies are irrelevant.

Very few sites should be without an "honest" privacy policy.

Do you allow the user to contact you by email/form?

Do you set a cookie?

Do you track your sites visitors, what info is collected in your stats program?

Do you require registration for posting comments etc?

Privacy policies are not only for sites which collect personally identifying information. I think they are a good idea for any site.

The Contractor




msg:3233876
 11:31 am on Jan 27, 2007 (gmt 0)

Ya gotta think like a search bot looking for quality, not some paranoid widget buyer.

I didn't realize this thread was only about "bots" and not real visitors?

Similarly how does "No published contact address, email address or phone number" have the slightest relevance to quality?

Depends what type of site you are discussing. If it's a "real" business/organization it better have a complete address.

Also, many "human" edited directories will not list a site in any regional categories unless there is a identifiable address listed on the site - why would they?

[edited by: The_Contractor at 11:35 am (utc) on Jan. 27, 2007]

Matt Probert




msg:3233908
 1:09 pm on Jan 27, 2007 (gmt 0)

Profanity or explicitly adult language on a non-adult site.

May I take exception, please?

I assume you really mean the "inappropriate" use of profanity or explicitly adult language. My own slang web site naturally contains profanity, is not an adult site, but nor is the profanity gratuitous - it is merely defined.

Within literature and art, profanity can add impact. I remember reading a poem when I was young; suddenly, out of nowhere, the narrative referred to his emotions being told to "F**** off!" It added creative and artistic impact.

The inappropriate or gratuitous use of profanity, I agree, is totally unnecessary and lowers the value of a piece of work.

Matt

BeeDeeDubbleU




msg:3233910
 1:12 pm on Jan 27, 2007 (gmt 0)

Privacy policies are not only for sites which collect personally identifying information. I think they are a good idea for any site.

Yes of course they are but as Steveb says they are hardly a quality indicator. While we are on the subject, I agree with some of your assertions but IMHO some of the others are not really quality related. I suppose it depends on whether you mean signals of crap to the casual visitor or to another designer doing an analysis of the site.

Reciprocal link request pages.

Why is this a problem? Many quality sites offer this and done properly it is not a problem.

Outdated copyright date or last modified date visible on the pages.

This could be a minor oversight. It is quite common and it would not necessarily indicate to me that the site is of poor quality.

Massive numbers of incoming links from link farms.

How would a site visitor recognise this?

No / very few outbound links.

No / very few inbound links.

Related to quality? Not in my view.

All inbound links are to homepage only.

Why is this a problem?

europeforvisitors




msg:3234021
 4:07 pm on Jan 27, 2007 (gmt 0)

Adsense

Well, there go THE WASHINGTON POST and THE NEW YORK TIMES. :-)

BeeDeeDubbleU




msg:3234030
 4:15 pm on Jan 27, 2007 (gmt 0)

:)

Well yes, but to be fair, most people already have opinions on the value (or quality) of the NYT and WP. The fact that they carry adverts is probably neither here nor there, particularly where advertising revenue provides the main part of their hard copy income.

jdMorgan




msg:3234047
 4:28 pm on Jan 27, 2007 (gmt 0)

> I think this is extremely unlikely. How would a search engine determine what is a privacy policy?

Quite simply: The standard URL for your compact privacy policy is:

/w3c/p3p.xml

and a link to your human-readable privacy policy is included in that file.

Jim
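
For anyone who wants to see what a bot would have to do with that file, here is a rough sketch. The `/w3c/p3p.xml` location comes from the post above. The assumption that the human-readable link is carried in a `discuri` attribute on a `POLICY` element is my reading of the P3P 1.0 format, and in some setups that element lives in a separate policy file that `p3p.xml` merely points to. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional
from urllib.parse import urljoin


def p3p_reference_url(site_url: str) -> str:
    """Build the well-known P3P location for whatever site the URL is on."""
    return urljoin(site_url, "/w3c/p3p.xml")


def human_readable_policy_url(xml_text: str) -> Optional[str]:
    """Return the discuri of the first POLICY element, or None.

    Assumption: the human-readable policy link is stored in a
    'discuri' attribute on a POLICY element.
    """
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # Drop the "{namespace}" prefix ElementTree puts on tag names.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "POLICY" and "discuri" in element.attrib:
            return element.attrib["discuri"]
    return None
```

Whether any search engine actually requests this file is a separate question; the sketch only shows that doing so would be cheap.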

europeforvisitors




msg:3234076
 5:19 pm on Jan 27, 2007 (gmt 0)

One obvious "signal of crap" that I see quite a bit (especially in the travel and technology sectors) is:

- A large number of template-based, keyword-driven, computer-generated pages that contain little or no content, and which are waiting to be filled with user content. (Such pages may not be populated with content, but you can be sure that they're filled with AdSense ads, affiliate links, and/or price-comparison logos and links.)

Google could go a long way toward cleaning up its index by penalizing sites that have a significant percentage of "placeholder pages."

mattg3




msg:3234095
 5:30 pm on Jan 27, 2007 (gmt 0)

Small font size text framed by ads. Or big adsense blocks. I avoid them cause I am sure my users think they are cr@p too. Just disregard the heatmap.

buckworks




msg:3234105
 5:42 pm on Jan 27, 2007 (gmt 0)

Here's a small signal of carelessness that I've come across a couple of times while shopping this week:

-- The drop list for the user to select their credit card expiry date still includes last year.
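
This kind of slip usually comes from a hard-coded list of years. A minimal sketch of the obvious fix, assuming the drop list should start at the current year and run a fixed number of years ahead (the function name and the ten-year default are made up for illustration):

```python
from datetime import date
from typing import List, Optional


def expiry_year_options(today: Optional[date] = None,
                        years_ahead: int = 10) -> List[int]:
    """Years to offer in a card-expiry drop list, starting at the current year."""
    current_year = (today or date.today()).year
    return list(range(current_year, current_year + years_ahead + 1))
```

Because the list is derived from the clock rather than typed in, it never needs editing in January.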

arnarn




msg:3234107
 5:45 pm on Jan 27, 2007 (gmt 0)


Fribble...
Stagnation (Site never changes)

I think this one needs a bit more qualification. Making these sites disappear would remove a LOT of reference sites and LOTS of valuable information.

How does one define stagnation anyway? (sounds like something putrid or decaying).

Now, get rid of those pond scum sites and I'd be happy

Fribble




msg:3234124
 5:55 pm on Jan 27, 2007 (gmt 0)

Wow...

I think there's a major flaw in threads like this. Vague questions that attract vague answers.. (whoops!) Though unspecified I was referring to things that could potentially be determined during a search engine bot visit and the subsequent processing and use of that information (Hence my posting in this forum).

Remember that I'm not claiming to 'know' anything here - I certainly know nothing about what goes on in the search engines - though I do have my opinions..


A lack of a privacy policy is complete nonsense. Having a privacy policy obviously is not a signal of quality, and not having one certainly does nothing to imply lack of quality.

Similarly how does "No published contact address, email address or phone number" have the slightest relevance to quality? But more to the point, how is a bot going to read a graphic showing an email address?

You may be right, but isn't it possible that a bot can determine whether your site is collecting information through cookies and forms and such? How many privacy policies have you ever seen without the phrase "Privacy Policy" in them? Aside from that, how hard would it be for an algorithm to identify a TOS or PP simply based on the content of the page?

As for my guess that a lack of either an address, email, or phone number could possibly hurt... I agree that's on the edge of reasonability, but maybe it would depend on the site in question, and what 'kind' of site the engine determines it to be?

A good question to ask at this point may be just how complex of a system are search engines using to rank sites? Do they know if you're a store? Do they link sites together by webmaster (where they're able?) Do they look for 'sets' of things? (What if they find a cart but no privacy policy or SSL?) Certainly some of these signs of crap may look like crap when taken by themselves - but what about when they're combined with other 'signs' that the bot can recognize and catalog?

Maybe every site is not measured by the same things.


Ya gotta think like a search bot looking for quality, not some paranoid widget buyer. The two have little in common.

But isn't this the point? Are search engines not trying to increase their ability to analyze pages and sites to the point where their bots and algo's can see them more or less as users do? What other direction could you choose when tweaking a SE algo that exists largely to deliver the best site to every "paranoid widget buyer"?


Reciprocal link request pages.

Why is this a problem? Many quality sites offer this and done properly it is not a problem.

Apologies again, I wasn't specific enough. I should have coupled that with existing recip links to non-topical or irrelevant sites. If I were the big G I would take that as a sign of trying to game me and may or may not decide to throw a dunce cap on the site.

Fribble




msg:3234126
 5:56 pm on Jan 27, 2007 (gmt 0)

How bout this:

  • Reciprocal link request pages coupled with existing reciprocal links to non-relevant sites. (Thanks thecoalman)
  • No Privacy policy. (Debatable) (site-dependent?) (jdMorgan: The standard URL for your compact privacy policy is: /w3c/p3p.xml )
  • Outdated copyright date or last modified date visible on the pages. (Debatable)
  • Error pages that don't send 404 headers, or that send content regardless of the page requested or query string entered.
  • Massive numbers of incoming links from link farms.
  • Dead/404ing links.
  • High link churn.
  • No published contact address, email address or phone number. (Debatable) (site-dependent?)
  • A high bounce rate (surfers clicking back on their browser and selecting another search result).
  • Too much duplicate content.
  • Whois info for the domain which is the same as other domains previously penalized or banned. (Could also be true of adsense publisher/affiliate ID's and other identifiable footprints)
  • Use of/links to affiliate programs that are known scams
  • Domains previously used for spam or that are blacklisted.
  • Excessively long URIs/URLs (query strings or folder and file names)
  • A high percentage of affiliate links vs regular outbound links.
  • No / very few outbound links (depending on the site's type/niche).
  • No / very few inbound links (depending on the site's type/niche).
  • All inbound links are to homepage only (Debatable) (site-dependent?)
  • Outbound links to questionable/spammy/crap sites.
  • [Matt Probert]The inappropriate or gratuitous use of profanity
  • Too many spelling errors.
  • Contains unrelated subjects (ex: a site that reviews toys and tries to sell insurance or viagra).
  • Lack of interest from social bookmarking sites.
  • MySQL or PHP errors in the pages
  • [tedster]No real menu or information architecture -- just a laundry list of links going on down the page.
  • [abbeyvet]relatively short pages/articles containing unnaturally high keyword density for their topic - almost always contain within them 2 or more large adsense or other advertising units.
  • [steveb]Adsense
  • [steveb]More than 25% links from blogs
  • [steveb]No links to the site from any domain in the top 100 for a query where the page ranks in the top 20 for that query
  • [mattg3]Small font size text framed by ads. Or big adsense blocks. I avoid them cause I am sure my users think they are cr@p too. Just disregard the heatmap.
  • [buckworks]The drop list for the user to select their credit card expiry date still includes last year.

What else? C'mon is this all you got?
:)

steveb




msg:3234271
 8:47 pm on Jan 27, 2007 (gmt 0)

"I didn't realize this thread was only about "bots" and not real visitors?"

Check out the forum, dude. This is the Google forum.

We aren't talking about real visitors here. Real visitors are completely irrelevant to the concept of "signals of quality" and of crap too. Real visitors for example can read "George Washington was the 37th President of the US" and know this is a signal of crap. A search engine bot on the other hand can't judge a fact like that. Bots look for signals of quality (like links from respected sites) or signals of spam (like having only links from banned sites).

Addresses or privacy policies are useless to bots as signals of quality. Likewise, any content that has to be coherent and accurate is no signal of quality or crap to a bot. Bots can only look for signals that make it likely something is good quality or not.

steveb




msg:3234275
 8:51 pm on Jan 27, 2007 (gmt 0)

"Well, there go THE WASHINGTON POST and THE NEW YORK TIMES."

And that is why they are called "signals", not "answers".

Adsense on a website rather than a site's own advertising is a signal of lower quality in general, but that doesn't mean a high quality site can't use adsense. An element that is low quality is just a signal, not a definitive answer. In this case, those two sites have a near infinite number of signals of quality that overwhelm the few signals of crap.

appi2




msg:3234286
 9:04 pm on Jan 27, 2007 (gmt 0)

Not to fire up the argument again but

grep "w3c" years.logs
1 single spammy? bot.

The w3c policy isn't linked from the site's pages,
but the w3c policy page does link to the main/user privacy page, which is obviously seen by the bot as it follows the on-page links.

Does any SE actually even check for a "w3c" folder privacy policy?
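
The grep above gives a raw match; to see which clients actually ask for anything under /w3c/, the same check can be grouped by user agent. A rough sketch, assuming the logs are in the common "combined" access log format (request line and user agent both in double quotes). The function name is made up for illustration.

```python
from collections import Counter
from typing import Iterable


def w3c_requests_by_agent(log_lines: Iterable[str]) -> Counter:
    """Count requests whose path starts with /w3c/, grouped by user agent."""
    hits = Counter()
    for line in log_lines:
        parts = line.split('"')
        # Combined format: parts[1] is the request line, parts[5] the user agent.
        if len(parts) < 6:
            continue
        request = parts[1].split()
        if len(request) >= 2 and request[1].startswith("/w3c/"):
            hits[parts[5]] += 1
    return hits
```

Feeding it an open file, for example `w3c_requests_by_agent(open("years.logs"))`, would show at a glance whether any major crawler appears in the result.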

trinorthlighting




msg:3234300
 9:12 pm on Jan 27, 2007 (gmt 0)

How about crappy html coding!

PCInk




msg:3234310
 9:33 pm on Jan 27, 2007 (gmt 0)

For every rule, there will almost always be a reason to break the rule.

Example: "Many people get the i and e the wrong way around and write 'freind' when the rule 'i before e except after c' would indicate that it would more likely be written as 'friend'"

Breaking one rule possibly doesn't indicate crap quality, but breaking a number possibly does!

glengara




msg:3234314
 9:37 pm on Jan 27, 2007 (gmt 0)

Seems to me we're judged by our linkage, so...

Links unrelated to page topic.

Oliver Henniges




msg:3234324
 9:49 pm on Jan 27, 2007 (gmt 0)

> excessively long URI's/URL's (query strings or folder and file names)

veto.
cf. Dmoz.

> How about crappy html coding!

<fnord>
Yes, trinorthlighting, we both are the last dinosaurs checking w3c-conformity. As paranoids by passion, we know that google's own policy in this respect is just a test;)
</fnord>

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved