
New Kind of Penalty?

         

SlyOldDog

1:20 pm on Feb 23, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



We've been hit by some weird new penalty. Or to put it more accurately - a leveling of the playing field.

We always maintained at least 2 domains for each subject matter we were interested in to guard against hard times when Google might remove a site from good SERPs, accidentally or on purpose. We never considered this spamming - just insurance.

Last night I noticed that Google seems to have identified our whole network of sites. We don't have a ban, but on most searches now only one site will show up in the top 50.

The sites aren't cross linked in any identifiable way. So I think they have some way of aggregating all links between the sites and determining which sites are most likely to be connected.

Anyone else seen this?

nuevojefe

8:45 am on Feb 25, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Medcenter:
how to get around this?

1. different IP address
2. different LSI terms / keywords [throw in a good amount of different unique words on each site/page, in Title, Meta, and Page Content]
3. differ your site structure, especially directories and filenames. maybe even alter the sitemap.
4. have at least a 40% difference in your outbound links for each site.

do the above, and your site will be back. [provided all other things remain equal]
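Point 2 in that list is the hardest one to eyeball. Here is a minimal sketch of one way to put a rough number on it, assuming you save the visible text of one page from each site to a file (the file names are placeholders, and cosine similarity of word counts is only a crude proxy for whatever term analysis Google actually does):

# rough sketch: cosine similarity of the word-frequency vectors of two pages;
# 1.0 = identical wording, near 0 = very different vocabulary
import math, re
from collections import Counter

def term_vector(text):
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

page_a = open("site1_widget_page.txt").read()   # placeholder file names
page_b = open("site2_widget_page.txt").read()
print("term similarity: %.2f" % cosine(term_vector(page_a), term_vector(page_b)))

A score near 1.0 means the two pages use essentially the same vocabulary; the advice above amounts to pushing that number down.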

1. Do you think changing class IPs for sites that have already been indexed will have any effect, or is it too late?

3. What if you have deep links coming into your site? If you redirect them to the newly structured pages, is that going to nullify the positive effect of the redesigned hierarchy and structure?

On another note, what are everyone's opinions on crosslinking as far as avoiding penalization? Is it better to link to a few on-topic internal pages from each site, or have hub sites all link only back to the authority site and have the authority site link out to all the hubs?

caveman

1:16 pm on Feb 25, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



SlyOldDog, I wouldn't rule anything out without knowing more, but I doubt the LSI explanation, if your assessment of the site characteristics is accurate. Just having common terms, even unique ones, won't get anyone killed by LSI. And it sounds like the sites are different enough to avoid any issues (true?).

If the site structures are much closer than I assumed from your descriptions, then maybe LSI...or if the links out are nearly identical. But even the links out, if MedCenter's comments applied, would mean that I'd be seeing several of my competitors' sites blown away, and such is not the case...sort of in line with kaled's comments. Understanding LSI in concept, after all, has little to do with understanding whether, or how much of, it is being applied right now. (I'm of the belief that they pulled way back on it with Brandy, having determined that it wasn't quite ready for prime time yet.)

It's interesting that your sites just got hit, at a time when lots of sites have come back. Have you figured it out yet? More questions...

Have you made changes recently on the site that got hit? How old are the sites? How large are the sites? Was it the smaller one that got hit? Percentage of backlinks from common sites, and similarities in backlink text?

mil2k

4:51 pm on Feb 25, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



A good thread which talks about search engines detecting duplicates:

Duplicates and the challenges search engines face [webmasterworld.com]

SlyOldDog

11:00 pm on Feb 25, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hey Caveman

I am not sure if you are 100% right about LSI not explaining our departure. LSI only classifies the results. It's up to Google what they do with them. If Google wants to display as varied a set of results as possible for a search, they would tend to ignore 2 sites on the same sub topic (with similar vectors). My product names could easily cause LSI to classify my 2 sites as being on the same sub topic. In Google's eyes our 2 sites would be out there all on their own specialising in our focused products.

To answer your other questions - it was the bigger site that was hit (both are a few hundred pages). Bigger site was static. Both sites about 9 months old. Back links - no idea really, but since our link development guys work with all our sites I'd say at least 20% of backlinks would be in common between the 2 sites. No recent changes made to either site.

This big site also got hit in pairs with some of our other sites on different topics (since the big one covers other topics which we also maintain backup sites for).

Bobby

11:43 pm on Feb 25, 2004 (gmt 0)

10+ Year Member



I'm of the belief that they pulled way back on it with Brandy, having determined that it wasn't quite ready for prime time yet

I agree with you, Caveman. If LSI explains the Florida update, I think Brandy is a return to more standard analysis of the theme of a page or site.

SlyOldDog,

Having shared links among many sites may be partially responsible for them getting the axe in spite of the different servers and IPs.

Call it a filter, call it an OOP or whatever you like, but it appears to me that the value previously gained from external links has been re-evaluated and may be the cause of a drop in SERPs.

sit2510

6:19 am on Feb 26, 2004 (gmt 0)

10+ Year Member



An interesting thread! At first I thought I was alone.

SlyOldDog & a-chameleon, I'm experiencing the same problem as yours. I have a big site with thousands of pages which acts like a buffer, or insurance as you call it, and several smaller niche sites selling the same products. The two never show up together for the same keywords, not only the primary and competitive ones but also the secondary, less competitive ones.

From what I see, we may rule out the following possibilities:

1) WHOIS

The big and smaller sites have no connection in terms of contact name, address or even e-mail, so I don't think this is an issue.

2) IP ADDRESS

Out of the question. They are hosted on different IPs, and even at different hosting companies... Also no connection.

3) DENOMINATOR and file name

No, I don't use any denominator. In many cases, the folder and even file names are different.

4) SIMILAR PAGES OR NEARLY DUPLICATE CONTENT

Also no. The small sites have one page for each product whereas the big site has at least five pages for that product. In other words, we break the info about the product into different pages. What can be similar is at the paragraph level, not at the page level.

5) SIMILAR LINKING STRUCTURES

Also no. Totally different styles and structure. Less than 10% of link partners are the same, because I often refuse to link out from the big site while I am more lenient with the smaller ones.

6) SITES ARE INTERCONNECTED

* I THINK THIS MIGHT BE THE CASE * Although I try not to mix the big and small sites together, I was tempted by human nature to set up a link exchange between the two. It is only one link in and out in the link directory, exactly like any other link partner.

If G is able to identify this, then it is "REAL MAGIC" in its new algo.
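For anyone wanting to put a number on point 4 (paragraph-level similarity), here's a minimal sketch using 8-word shingles; the file names are placeholders, and this is only a rough stand-in for whatever duplicate detection Google actually runs:

# rough sketch: what fraction of 8-word shingles two texts share;
# even one lifted paragraph contributes dozens of identical shingles
import re

def shingles(text, k=8):
    words = re.findall(r"[a-z0-9']+", text.lower())
    return set(" ".join(words[i:i + k]) for i in range(len(words) - k + 1))

def shared_fraction(text_a, text_b, k=8):
    a, b = shingles(text_a, k), shingles(text_b, k)
    if not a or not b:
        return 0.0
    return len(a & b) / float(len(a | b))

big = open("big_site_product.txt").read()       # placeholder files: product
small = open("small_site_product.txt").read()   # text from each site
print("shared 8-word shingles: %.0f%%" % (100 * shared_fraction(big, small)))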

webnewton

8:15 am on Feb 26, 2004 (gmt 0)

10+ Year Member



Hello Sly,

I have a network of 100+ sites, with as many as 5-6 sites on the same topic optimized for different keywords. All different content. The same thing happened to me too. However, I made sure to follow these rules:

1) All sites on similar topics/products are on different IPs. (Does Google have a way to find out that sites are on the same server, and the ability to penalise them on that basis?)

2) I made sure that no site has duplicate content. (Though I did provide links from one site to the other in accordance with the Hilltop algo. And yes, 50-60% of the sites shown by Google in response to a "similar pages" query are my other sites.)

3) Google has become really smart since the Florida update. Nobody has a sure answer to the new algo changes as of yet. But of course, in your case, and as happened in my case, Google has surely identified the network.
(There are a few keywords for which more than one of my sites used to appear. Now it's only one; which one is up to Google.)

What I feel could be the reasons:

1) The same pattern of link exchange pages across the whole group of sites.
2) The Google toolbar. The best spy when the PageRank feature is on.

You guys have something to add?

Pricey

1:27 pm on Feb 26, 2004 (gmt 0)

10+ Year Member



maybe someone noticed your 2 domains and gave you a couple of sad face clicks ;)

petertdavis

1:43 pm on Feb 26, 2004 (gmt 0)

10+ Year Member



Why would Google care who owns the site? If the site has quality content, why would they give it a penalty based on who owns the site? They could get themselves into some serious trouble with this, as I can imagine some 'interesting' ways to "knock off" competing websites if I understood how this penalty works.

caveman

2:27 pm on Feb 26, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Overnight this became personal. We operate nearly 200 sites so it can take a while to catch up to issues. Yesterday afternoon, we discovered that two of our related sites are suffering from a similar problem, though I'm not sure it's the same thing. I'll break my comments into two posts...

SlyOldDog:

I am not sure if you are 100% right about LSI not explaining our departure. LSI only classifies the results. It's up to Google what they do with them. If Google wants to display as varied a set of results as possible for a search, they would tend to ignore 2 sites on the same sub topic (with similar vectors). My product names could easily cause LSI to classify my 2 sites as being on the same sub topic. In Google's eyes our 2 sites would be out there all on their own specialising in our focused products.

The only time I was ever 100% right was when my wife and I decided to have a child...and I had help with that. :)

Tech is definitely not my main area of expertise, but here was my thinking: Regarding LSI, yes, having similar unique kw's common to both sites could cause them to be identified as affiliated. But that must be, and is, true of millions of sites that are unrelated. In one of our categories, we have numerous competitors who are more or less selling the same thing (it's the lone affiliate category that we operate in). Many sites share similar link structures, similar content, and nearly identical sets of affiliate links. Without doubt, G sees the common elements. But many still survive post-Brandy. And notably, we have one competitor with three sites on exactly the same topic, just different kw sets (um, not even *that* different). Your point on G doing what it wants with the info is precisely in line with my own conclusions. If my competitors aren't being nailed for some of their tactics, I can't see how you would be, based solely on shared kw's, even unique ones.

But, there is another factor. You noted that there is some cross linking. OK, so now, there are affiliated sites that are also cross linking. Could that be it? Maybe.

To answer your other questions - it was the bigger site that was hit (both are a few hundred pages). Bigger site was static. Both sites about 9 months old. Back links - no idea really, but since our link development guys work with all our sites I'd say at least 20% of backlinks would be in common between the 2 sites. No recent changes made to either site.

This big site also got hit in pairs with some of our other sites on different topics (since the big one covers other topics which we also maintain backup sites for).


Well, there goes my theory about big versus small sites. I think it might have been steveb who noted elsewhere that smaller sites are back again more with Brandy, and your situation seems consistent with that.

The more advanced method is defining a host by the uniqueness of its outbound links, and its hierarchical directory structure.

basically, if two hosts on the web have the exact same outbound links, or at least 80% in common, and their on-site directory structure is 80% similar, then it is most probably a mirrored set.


SlyOldDog, just for clarity, do your sites share either of these two traits noted by MedCenter? I got the impression that they do not. I'm asking again in particular because of the site structure issue, which I think may be at play with my own situation...
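For anyone who wants to test a pair of their own sites against those two traits, here is a minimal sketch; the URL lists are made up, and the 80% cutoffs are just the numbers MedCenter quoted, not anything Google has confirmed:

# rough sketch of the two-trait test: outbound-link overlap and
# directory-structure overlap between two sites (Jaccard similarity)
from urllib.parse import urlparse

def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / float(len(a | b)) if (a or b) else 0.0

def directories(urls):
    # keep only the directory part of each internal URL, e.g. /widgets/blue/
    return set(urlparse(u).path.rsplit("/", 1)[0] + "/" for u in urls)

def looks_mirrored(outbound_a, outbound_b, pages_a, pages_b, threshold=0.8):
    link_sim = jaccard(outbound_a, outbound_b)
    structure_sim = jaccard(directories(pages_a), directories(pages_b))
    return link_sim, structure_sim, (link_sim >= threshold and
                                     structure_sim >= threshold)

# made-up example data
out_a = ["http://partner1.example/", "http://partner2.example/"]
out_b = ["http://partner1.example/", "http://partner3.example/"]
pages_a = ["http://site-a.example/widgets/blue/page1.html"]
pages_b = ["http://site-b.example/widgets/blue/page1.html"]
print(looks_mirrored(out_a, out_b, pages_a, pages_b))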

caveman

2:28 pm on Feb 26, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



OK, now, here's my situation:

Two sites related by category. I'm not in real estate but if I were, these might be structured like this: One site about homes in a State, and one about homes in a large City in that state, with much more detail and information provided in the City site, but similar structures for both. They (intentionally) look alike, and link to each other, but only two pages from each site link back to the other site.

Both sites did well pre Florida. Both did well during Florida and Austin. But the State site got slammed by Brandy. The thing that is odd is that almost no pages now come up for the State site. It's as if G just decided that the entire site didn't pass muster. It's not a page by page thing.

They link to each other. Similar templates. Similar site structure and common path/filenames. Again, we were not trying to hide anything. The City site was always intended as a subset of the state site for those wanting more information. Sort of like a legal site about state's rights, with links out to individual sites offering more details about issues in a given state. Both make sense and have a place, depending upon what you're looking for.

But apparently Brandy doesn't agree, the little vixen.

I could just add the details of the city site into the state site, but we only provide the much higher level of detail on this one city, so that seems a bit odd, and for those only wanting information about the one city, they would have no interest in the state site. Most irritating.

Chicken Juggler

2:38 pm on Feb 26, 2004 (gmt 0)



I've got all kinds of sites and they are all linked together. I have 2 sites that are 4 and 10 right now for my most important term. Been doing it for a while. The 2 pages returned are very different. There is some other reason; keep looking. If you have enough backlinks that are very different you can do just about anything. Both of those sites have lots of natural links. In Google, lots of natural-looking links really make a difference.

caveman

2:52 pm on Feb 26, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Chicken Juggler you make a good point. I believe that the strong presence of some positive factors (tons of backlinks for example) can overcome other problems.

So the long term answer is, get more links.

However, if you don't have enough of those positives for a given site, it's still possible that any of the theories above might be in play...or so I believe.

It's good to know where the traps are so that you can go around them, rather than needing a tank to go over them. ;-)

Chicken Juggler

8:37 pm on Feb 26, 2004 (gmt 0)



Dune - The best way to avoid a trap is knowing of its existence.

SlyOldDog

8:42 pm on Feb 26, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi Caveman

Could be something to do with our links directory. The top level looks the same on all sites, although the links inside are specific to each site.

I have a gut feeling it's not though. We are renaming our products on one site now to see if LSI might be the culprit.

Other than that I can only blame the incoming links. Our sites often get listed alongside each other in pages linking to us. We don't have much control over that.

I'll get a programmer to produce some stats on the % similarity of our outgoing links pages to investigate that avenue.
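Something along these lines would do it; a minimal sketch, assuming two publicly fetchable links pages (the URLs below are placeholders):

# rough sketch: fetch two links pages and report what share of
# outbound hrefs they have in common
from html.parser import HTMLParser
from urllib.request import urlopen

class HrefCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.hrefs = set()
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value and value.startswith("http"):
                    self.hrefs.add(value.rstrip("/"))

def outbound_links(url):
    collector = HrefCollector()
    collector.feed(urlopen(url).read().decode("utf-8", "ignore"))
    return collector.hrefs

a = outbound_links("http://site-one.example/links.html")   # placeholder URL
b = outbound_links("http://site-two.example/links.html")   # placeholder URL
shared = a & b
print("%d and %d outbound links, %d shared (%.0f%% of the smaller list)"
      % (len(a), len(b), len(shared),
         100.0 * len(shared) / max(1, min(len(a), len(b)))))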

Chicken Juggler

8:51 pm on Feb 26, 2004 (gmt 0)



GG has said over and over again that off site things can not hurt you. Worst case scenario they do nothing.

cabbie

9:05 pm on Feb 26, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>>>GG has said over and over again that off site things can not hurt you. Worst case scenario they do nothing.

If so he is wrong.

Chicken Juggler

9:07 pm on Feb 26, 2004 (gmt 0)



If that were true then I could go around killing sites. I could start a business doing that. Pay me $1000 and I will boot your competitor out of Google.

caveman

11:12 pm on Feb 26, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



cabbie is quite right.

Indeed, they even changed the Webmaster section of their site (as was widely reported in here) to include a caveat about this. It now reads:

"There is almost nothing a competitor can do to harm your ranking or have your site removed from our index."

And of course, what they admit is frequently only the tip of the iceberg. For example, I'm not sure they ever admitted that there was a problem with Florida/Austin. I still chuckle at that one.
;-)

hutcheson

11:26 pm on Feb 26, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>Why would Google care who owns the site?
I thought we'd pretty well established that Google doesn't care. In fact, they're detecting bad neighborhoods that (according to WHOIS) have many different slum landlords -- even though all of them share the same toothbrush.

>If the site has quality content, why would they give it a penalty based on who owns the site?
There's no penalty based on who owns the site. The issue is twofold: (1) possible decreased value for links within ghettos (a ghetto is a slum with a wall around it, of course); (2) allowing each ghetto only one main listing in a given set of search results.

>They could get themselves into some serious trouble with this, as I can imagine some 'interesting' ways to "knock off" competing websites if I understood how this penalty works.
You don't, and you can't.

It is, as googleguy suggested, not links FROM the slums but linking TO the slums that gets your site blacklisted from the tourist guide lists.

chubba

12:07 am on Feb 27, 2004 (gmt 0)

10+ Year Member



Hey caveman and all.

I think this discussion is going two ways... The geographic results taking a tumble (state as opposed to city) has something to do with the new algo.

I recently left a company where, like others in this thread, we produced sites for 'widget makers' - over 600 pages of identical content in each site and maybe five individual pages unique to each firm. In other words, a hell of a lot of not just similar, but identical, content.

For two years some of our clients held front-page results for 'widgets state' (or, for the UK, county) and 'widgets city/town'. All was good until Austin, when the companies in London all took a dive and cannot now be found for those searches. They are still in the index, and searching for the company name still returns them.

What is strange is that if I search for 'widgets essex' the clients in Essex are still there. If I do a search for 'widgets billericay' (a town in Essex), the sites in Billericay are there.

All the sites are virtually hosted and have the same IP address, showing that this is not a content or IP exclusion but a difference in the way G handles geographic searches. Maybe they only flicked the switch for the more popular searches, which would explain why London sites took a beating where others survived unharmed.

I can verify again that there is no IP or content ban, as one of my old company's sites that was not in the top 100 for 'widgets london' is now front-paging!

Odder still is that this site has exactly the same optimisation as the sites that took a dive: title tag 'company name : widgets london', meta keywords 'widgets london, company name, a few other relevant terms'. Both sets of sites (those that were front-paging and the one that is now) have intro pages of flat graphics, no alt tags, no hidden text and no script tags.

I am at a total loss with this one.

I agree a lot with the LSI thoughts and have had a few experimental successes with my new clients' sites, but I cannot understand why some geographic searches have been affected and others not.

I think paranoia is a trait of SEOers, and that you need to look a little more closely at re-optimising title tags, link structures and the LSI-related terms before suggesting Google is as intelligent as you make out. With the number of pages G now indexes, do you really think they are going to be looking at IP addresses, WHOIS and content, and comparing sites against each other? Get real.

Chubba

Net_Wizard

1:00 am on Feb 27, 2004 (gmt 0)



This might be of interest...

Looking at a site cache through the Google Toolbar sends googlebot to the site.

I'm not sure if this is good or bad, but every time I look at the cache through the toolbar, I notice later on that Googlebot is on the site.

However, I didn't really take note of details such as:
1. whether it was immediate access
2. whether it hit the cached page or just the home page

But on several occasions, for every cache query there was a corresponding single Googlebot access.

Now I'm a little hesitant to do that after reading this thread, and I've just noticed that my traffic is down, not badly but down. I just logged in here and noticed this new penalty thread :(
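A rough way to check this against the logs, rather than from memory, would be something like the sketch below; the log path, log format (Apache combined) and lookup times are all assumptions to fill in:

# rough sketch: count Googlebot hits in the access log within 30 minutes
# of each toolbar cache lookup you recorded
import re
from datetime import datetime, timedelta

LOG = "/var/log/apache/access.log"          # assumed path, adjust to your server
APACHE_TIME = "%d/%b/%Y:%H:%M:%S"

# times at which you looked at the cache through the toolbar (fill in your own)
lookups = [datetime(2004, 2, 26, 14, 5), datetime(2004, 2, 26, 21, 40)]

bot_hits = []
for line in open(LOG):
    if "Googlebot" not in line:
        continue
    m = re.search(r"\[(\d+/\w+/\d+:\d+:\d+:\d+)", line)
    if m:
        bot_hits.append(datetime.strptime(m.group(1), APACHE_TIME))

window = timedelta(minutes=30)
for t in lookups:
    hits = [h for h in bot_hits if t <= h <= t + window]
    print("%s -> %d Googlebot hit(s) within 30 minutes" % (t, len(hits)))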

GoogleGuy

1:42 am on Feb 27, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"Looking at a site cache through the Google Toolbar sends googlebot to the site."

Arouw? Thought I'd debunked that as a myth..

cabbie

1:49 am on Feb 27, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>>>Pay me $1000 and I will boot your competitor out of Google
:) You don't know how. :)
I am not interested in doing it at all, but I do know how, as I have had sites banned because of off-site activities.
Sabotage is not my style, nor do I wish to complain. I pick up my content and move on.

Chicken Juggler

3:56 am on Feb 27, 2004 (gmt 0)



It was a joke.

rfgdxm1

4:07 am on Feb 27, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>Arouw? Thought I'd debunked that as a myth..

Hmm...why on Earth not consider spidering a URL that Google knows about only from the toolbar? I'd think that Google should NOT give any preference to spidering URLs that it knows about from only the toolbar. However, why totally discard useful data?

Net_Wizard

4:55 am on Feb 27, 2004 (gmt 0)



Arouw? Thought I'd debunked that as a myth

I don't know about myth, but it's just too much of a coincidence that every time I do a cache lookup with the toolbar, one Googlebot access appears in my log.

Just pure luck? What are the chances of that happening on multiple tries?

1x = pure luck or coincidence
2x = happenstance
Nx = ?

However, why totally discard useful data?

Exactly my sentiment. Anything that passes through the toolbar is data 'that can be categorized and analyzed'.

Come to think of it...

Who uses the toolbar? I bet most are website owners and very few are regular users.

If most toolbar users are website owners, how many use its cache feature?

Data sent to Googleplex would be formatted like...
Client agent:toolbar Request:Cache Site:example.com

This kind of request alone is unique in a way, because it narrows things down to 'prolific users' of the cache feature. Knowing who the prolific users are, and their patterns, could tell a lot about those users.

As to Googlebot, who knows, it might just be a friendly page-update check or it could be something else... as I have said, it's a single access only.

arun_g

11:01 am on Feb 27, 2004 (gmt 0)

10+ Year Member



"Of course a human check would confirm it."

SlyOldDog,
You might like analyzing that statement in terms of metrics like "patterns" or "logic". You just might get clues into Google's AI-driven mind.

Crush

11:23 am on Feb 27, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



What I am now seeing is sites that were there yesterday are replaced by the second sites today. It is like Google is alternating them to see if one is more relevant than the other. This is happening in more than one location.

It is like widgets red = site 1 showing
red widgets = site 2 showing

Interesting to see what is going on. Google knows these sites are related but does not want to ban one of them. To be on the safe side, and not make too many webmasters unnecessarily angry, they are alternating them, but they never appear on the same page.

It would be nice if GG could give us a clue, but I think this is top-secret stuff and they would not like to let it out.

george123

12:02 pm on Feb 27, 2004 (gmt 0)

10+ Year Member



Are they completely mad? Who was the big brain, who was the new Einstein at Google that introduced the great new algorithm? They no longer give the results that people want; their algo looks like a mole. A search engine is supposed to give results for a phrase (keyword) that make the user happy. What's happening, big brains? Last month, due to snow storms, Widget airport was closed. My page gives exact information about Widget airport, and supplies the telephone numbers and every kind of service that deals with Widget airport. Last month, on those particular days when Widget airport was closed due to snowstorms, my site had thousands of clicks because it was #4 for "widget airport". After that storm my page went down into the chaos of the Google SERPs, and what you find now under "widget airport" are the one million pages that advertise the only hotel near the airport. My page is not commercial; it's an informative page. I received hundreds of thank-you emails from people I helped with my information during those days of snow storms. Now the page is down the drain of Google's SERPs. For what reason, Mister Big Brain? I know your reason, mister: you want to make the internet not a place of accurate information, you want to make internet dollars in your pocket. Shame on you, Mister Big Brain. Society pays back one day, don't forget that.