Forum Moderators: Robert Charlton & goodroi
This is my first thread here... I must say I have had to spend months reading to catch up :)
Although the issue is about my site, I think it applies in general to all news/media websites.
We started a small blog about a year ago which is about, say, politics... After getting several posts out, we realised that in some cases our news was out before anyone else's on Google News and other major news aggregators... So we transformed the site with a nice template, many news sections and a passion for writing articles...
In the first three months, we were doing great on Google and all the other major SEs. But after our RSS feed started getting picked up by major news aggregators (not Google News) like Topix, our rankings in the SERPs plummeted.
It took me a while to realise what was happening. Most of these aggregators are linking to the articles with 302 redirects. Also I found that many sites had a page with our rss feed, which is nearly identical to the frontpage of the site.
After getting several hundred new links to my site, I still was not getting ANY Google search traffic... Meanwhile, our news headlines started becoming very competitive with the big players (BBC, Reuters, etc.) and we started to make an impact on the web...
Published stories would get linked from hundreds of sites related to, say, the political parties involved.. Imagine this happening daily and continuously since last December. Sometimes we get over 500 new links to the article pages in a single day.
Even people from the NYT and the BBC linked to us, while on a BBC radio page our headlines are regularly featured..
Then back in March I submitted the site for inclusion in Google News and we got accepted, after our request was turned down last December.
Now it is August and still no Google search traffic. Only hits from "referral spam" websites. Almost daily we'll have a story which for a few minutes is featured on the Google News homepage. Other times several of our stories get featured on the Google News homepage (for all English Google News sites). At some point in June we got to be the top story for quite a while...
Despite the successes and despite all our efforts, nothing is coming from Google... Nor did anything change in the last updates, apart from the fact that we now do not rank even in the top 50 for "companyname" and "company name"...
site: operator does not get the homepage first and generally google seem to have it all wrong. After a very significant investment, we're still trying to survive, competing with monsters compared to our infrastructure, despite all the success stories.
Any ideas on how to overcome this issue?
Also another puzzling fact is that homepage has pr3 and all major sections pr4...
Thanks in advance to those who put in the effort to read this :)
Ranking a site like this is no different to ranking any other kind of website.
Let's forget about Google News (although great that you're in it) - it has no bearing on the search SERPS.
Let's also forget about what the toolbar is telling you (other than the fact that it might indicate a poor internal link structure, the toolbar data is essentially useless).
Let's rewind and take a look at some of the basics:-
1. Are your pages spiderable? Ever run a simulator over them?
2. Internal navigation structure - how easy is it for a bot to crawl every page of solid content that you have on your site?
3. Titles - how well crafted are they?
4. Webserver configuration - check your header data is correct.
5. URLs - are they full of a mess of variables or nice and clean (it makes less difference in G now than it did historically, but you still want to make it as easy for them as possible).
6. Internal Links - relative or absolute?
7. Is there any "history" on this domain? It could be suffering from a previous penalty - [archive.org...] can be useful to check with.
site: operator does not get the homepage first and generally google seem to have it all wrong.
How many pages are returned, and how many are in your site?
not ranking for companyname or company name
If that is the same as your URL or a fairly "unique" term, then you have a serious problem which goes beyond making basic improvements to ranking.
The good news is that'll make it easier for us to work out what it is.
8. Are you getting traffic from other search engines?
TJ
Sometimes we get over 500 new links to the article pages in a single day.
Wondering if this could trigger some sort of "penalty" with Google.?..as your site/sector probably has a very specific history for link acquisition rates...and suddenly...you have some serious daily spikes....
AND...will these links stick around for the long haul...or simply vanish after weeks or months..? (if the long haul..then eventually you will see some benefits in the SERPs...)
BUT... Google News and the SERPs are two entirely different entities....(and congrats on what must be some new traffic from these links)...DEFINITELY...start with the traffic from the new link relationships...for however you are currently monetizing...or working towards creating revenue...(PERHAPS eventually you will see some new traffic from the SERPs through all of this...)
1. Are your pages spiderable? Ever run a simulator over them?
They are easily spiderable. Proof of this is that in November we had high rankings in Google. Yep, I am visiting my site with the first ever browser I used, which is Lynx.
>>2. Internal navigation structure - how easy is it for a bot to crawl every page of solid content that you have on your site?
It is very easy to crawl the main sections, with article titles and descriptions, etc... I pay attention to these issues, I only use text links.
>>3. Titles - how well crafted are they?
This depends on who is writing the article.. Some of them can not help it and use very long titles (i.e: over 120 characters long). They have the right keywords in them and stuff, but there is an issue about the titles and the stories not being exactly for the same topic as the main site or the sections.
E.g.: Although the site is called politics, the articles might be about local community issues related to a specific MP of a specific political party and might not be exactly politics.
4. Webserver configuration - check your header data is correct.
Checked that. I generally use the Drupal system, which is very search engine friendly (as my other sites have proved). The only thing was that the error page was returning a 200 OK status code instead of a 404, but this was rectified 2 months ago.
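For anyone wanting to repeat that error-page check, here is a minimal sketch of the idea: request a URL that cannot exist and see what status comes back. The helper names (`status_for`, `random_missing_url`, `is_soft_404`) and the example.com address are my own placeholders, not anything from the site in question.

```python
import urllib.error
import urllib.request
import uuid


def status_for(url):
    """Return the HTTP status code for url without raising on 4xx/5xx."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def random_missing_url(base):
    """Build a URL under base that almost certainly does not exist."""
    return base.rstrip("/") + "/" + uuid.uuid4().hex


def is_soft_404(status):
    """A page that should not exist but answers 200 is a 'soft 404'."""
    return status == 200


# Example usage (needs network access, so it is left commented out):
# url = random_missing_url("http://www.example.com")
# print(url, status_for(url), is_soft_404(status_for(url)))
```

Note that `urlopen` follows redirects, so a missing page that redirects to the homepage would also show up as 200 here.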
5. URLs - are they full of a mess of variables or nice and clean (it makes less difference in G now than it did historically, but you still want to make it as easy for them as possible).
They are all nicely done. Only the news sections urls get a single variable appended on the previous day's news page.
6. Internal Links - relative or absolute?
Relative
7. Is there any "history" on this domain? It could be suffering from a previous penalty - [archive.org...] can be useful to check with.
Well, this is a project started a little over a year ago, and I have owned the domain for years (with no hosting), so nothing significant has happened. The only thing is that I was getting referral spam and did not realise it might be hurting.
How many pages are returned, and how many are in your site?
site:domain.com returns over 29,000 results, many of which are story comments/forum comments/etc. which Google should not be indexing but has. I do not think the content on the site is over 3,000 stories and about 20 major sections.
It just came into my mind that some articles get posted in more than one section. Not sure if this causes some sort of flag.
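A rough way to see how many of those URLs are really the same story is to hash the page bodies and group the URLs that collide. This is only a sketch under my own assumptions: the `pages` mapping and its paths are made up, and exact-match hashing will miss copies that differ even slightly.

```python
import hashlib
from collections import defaultdict


def normalise(text):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def group_duplicates(pages):
    """Map of url -> body text in, list of URL groups sharing identical text out."""
    by_hash = defaultdict(list)
    for url, body in pages.items():
        digest = hashlib.sha1(normalise(body).encode("utf-8")).hexdigest()
        by_hash[digest].append(url)
    return [sorted(urls) for urls in by_hash.values() if len(urls) > 1]


# Hypothetical example: one story filed under two sections.
pages = {
    "/politics/story-1": "MP opens new  community centre",
    "/local/story-1": "MP opens new community centre",
    "/politics/story-2": "Party conference starts today",
}
duplicates = group_duplicates(pages)
```

Each group in `duplicates` is a set of URLs serving the same text, which is where a single canonical URL per story would help.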
If that is the same as your URL or a fairly "unique" term, then you have a serious problem which goes beyond making basic improvements to ranking. The good news is that'll make it easier for us to work out what it is.
The url is not a unique term. It is a phrase which is used widely in the world of "politics", but not that competitive. We were no1 for it, before being replaced by a bunch of others. We were above sites like bbc and reuters till last february or something (the google rankings drop was last november).
8. Are you getting traffic from other search engines?
Yes we do. MSN, Yahoo and ASK have our site in the top 5 for our very competitive keywords and within the top 10 for less competitive ones. The problem is that Google sends a hit or two per week for some very weird keywords. Mostly if instead of "politics show" you search for "politicsshow", which sounds like the sandbox effect to me.
Generally the problem is that we can rank for very weird keywords related to the "MPs" or their "community" and not for their political party or for "politics" (the politics is an example which came into my mind).
decaff wrote:
Wondering if this could trigger some sort of "penalty" with Google.?..as your site/sector probably has a very specific history for link acquisition rates...and suddenly...you have some serious daily spikes....
Hello Decaff... Well this is what worries me as well. The problem is that on some days this might be a couple of hundred only (on forums like this one or on sites dedicated to the "political parties" we write about).
Another worrying thing is that the news aggregators link to the articles (that they get from my XML feed) using 302 redirects (I verified this with a server header checking tool).
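For reference, this is roughly what a header checker is reading. The sketch below parses a raw response header block and labels the redirect type; the sample response and the function names are invented for illustration and are not taken from any real aggregator.

```python
def parse_status_and_location(raw_headers):
    """Return (status_code, location) from a raw HTTP response header block."""
    lines = raw_headers.strip().splitlines()
    # The status line looks like: HTTP/1.1 302 Found
    status_code = int(lines[0].split()[1])
    location = None
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "location":
            location = value.strip()
    return status_code, location


def describe_redirect(status_code):
    """Label the status code the way a header checker typically would."""
    if status_code == 301:
        return "permanent redirect"
    if status_code in (302, 307):
        return "temporary redirect"
    if 300 <= status_code < 400:
        return "other redirect"
    return "not a redirect"


# Hypothetical response from an aggregator's click-tracking script.
sample = """HTTP/1.1 302 Found
Location: http://www.example.com/story/123
Content-Type: text/html
"""
status, target = parse_status_and_location(sample)
label = describe_redirect(status)
```

A 301 tells the requester the target is the permanent address, while a 302 says the move is only temporary, which is why the 302s from aggregators are the worrying ones.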
Thanks for looking into this guys...
are story comments/forum comments
Do you allow URL linking inside story comments? If so, who are you linking to?
TJ
It definitely is a penalty, but every time I contact google I get the same old answer... "there is ALMOST nothing a competitor can do..."
Is it gonna be a good idea to start reducing the links I have control on (about 200)?
[webmasterworld.com...]
Quote:
"Also I found that many sites had a page with our rss feed, which is nearly identical to the frontpage of the site." This I believe is a significant factor.
Here's the theory: There are thousands of spam AdSense web pages out there, that parse the Google News RSS feed for a particular word or phrase. You can always spot them because they carry the name of the website and the location of the publisher under the title of each entry, e.g.
Ice Continues to Melt
Penguin News, Antarctica.
Ice continues to melt at an alarming rate in the Southern Ice Cap. Penguin leaders met in emergency session....
You get the idea. The spam pages have constantly updating content due to the new articles appearing in Google News that contain the phrase in question. Now, it is known that Google has no effective method of distinguishing between the original publisher of a particular item on one hand, and the duplicates on the other. So, I would suggest that the many spam pages carrying Google News RSS feeds have the effect of making your articles look like duplicates, and your SERPs are affected accordingly.
So in the (theoretical!) example above, the SERP for the article 'Ice Continues to Melt' in 'Penguin News' will be worsened because the same headline appears in many other AdSense/Google News spam pages.
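To put a number on "looks like a duplicate", one textbook approach is to compare word shingles with Jaccard similarity. To be clear, this is my own illustration of near-duplicate detection in general and not a claim about how Google actually does it; the sample strings reuse the Penguin News example.

```python
def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in text."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b, size=3):
    """Jaccard similarity of the shingle sets of two texts, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 0.0  # both texts too short to compare
    return len(sa & sb) / len(sa | sb)


original = "Ice continues to melt at an alarming rate"
scraped_copy = "Ice continues to melt at an alarming rate"
unrelated = "Penguin leaders met in emergency session today"

same_score = jaccard(original, scraped_copy)
other_score = jaccard(original, unrelated)
```

Running this over a page's text against the pages that republish the feed would show which ones are close enough to be confused with the original.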
mcskoufis, getting back to the original post, you say that "Sometimes we get over 500 new links to the article pages in a single day." I wonder if many of these were actually the spam pages I speak about?
I must emphasise that this theory is personal opinion on my part. However, it should be remembered that the Google News algo usually picks up new content almost instantly. The main Googlebot may spider the article a few days later, and it may have visited the spam pages already by the time it gets to the original article. If there is no crossover between the News bot and the regular bot, then the regular Googlebot may have no way of knowing that the original article is the original.
As they say on exam papers, discuss....
I have read possibly every thread with over 10 comments on this site since 2003. The reason I started this thread is because I can't figure it out at all. Some of my sites are ranking in Google for extremely generic keywords.
It is just my "politics" site which is severely damaged. So severely that I am thinking of moving it to a new domain (even if it is a year or so old). Thanks for your comments though.. They have brought up ideas which I hadn't thought about.
steve40 now the comment pages have both the tag and a definition in robots.txt. I noticed all the comment pages don't have any title or description in Google's SERP, so I suspect they will be removed soon. The site does not have over 3000 pages in total.
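For the robots.txt side of this, a quick way to confirm a rule really covers the comment pages is Python's standard `urllib.robotparser`. The `/comment/` path and the example.com host below are assumptions for illustration; the real paths on the site may differ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt blocking comment pages (the path is an assumption).
robots_txt = """\
User-agent: *
Disallow: /comment/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def is_blocked(path, agent="Googlebot"):
    """True if the given path is disallowed for the given user agent."""
    return not parser.can_fetch(agent, "http://www.example.com" + path)


comment_blocked = is_blocked("/comment/reply/12")
story_blocked = is_blocked("/node/12")
```

Keep in mind that robots.txt only stops crawling. Pages that are already indexed can linger as URL-only listings for a while, which would fit the entries with no title or description.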
TearingHairOut, I think that this is causing a duplicate flag... Some of the pages which have the headlines have far more pagerank than my site (on the firefox extension ok...)
Regarding your theory... You have a point here and this is what has been worrying me. There are some news aggregators which grab the feed and then link to my articles with a 302 redirected script. Now some of the sites feature this link, others feature the clean link.
The traffic from those aggregators is as significant as being on the google news homepage, even better when there is a story about a "political party" whose followers have strong online presence.
I now have only 3 items in my RSS feed, with no descriptions. I got really paranoid about this. Topix does it as well.
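In case it helps anyone doing the same, trimming a feed like that can be sketched with the standard library. This assumes a plain RSS 2.0 layout (`rss/channel/item`); the `trim_feed` name and the sample feed are made up, and a CMS would normally have its own setting for this.

```python
import xml.etree.ElementTree as ET


def trim_feed(rss_xml, max_items=3):
    """Keep only the first max_items items and strip their descriptions."""
    root = ET.fromstring(rss_xml)
    channel = root.find("channel")
    # Drop every item beyond the first max_items.
    for item in channel.findall("item")[max_items:]:
        channel.remove(item)
    # Remove the per-item description so only the headline and link remain.
    for item in channel.findall("item"):
        desc = item.find("description")
        if desc is not None:
            item.remove(desc)
    return ET.tostring(root, encoding="unicode")


# Hypothetical feed with five stories.
sample_feed = (
    "<rss version='2.0'><channel><title>Example News</title>"
    + "".join(
        "<item><title>Story %d</title><description>Text %d</description></item>" % (i, i)
        for i in range(5)
    )
    + "</channel></rss>"
)
trimmed = trim_feed(sample_feed)
```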
Google is saying that there is no googlenewsbot... From my experience (trying to beat the big boys and being the top story for that topic on the gnews homepage) I can tell you that they do index the entire article instantly.
Ah yes.. and they never show the description text from the atom feed. They pull a description out of the article according to what story it is grouped under (I try to be as plain as poss) and which keyword is searched. A different one for different keywords.
Google is saying that there is no googlenewsbot... From my experience (trying to beat the big boys and being the top story for that topic on the gnews homepage) I can tell you that they do index the entire article instantly.
There is a Google News bot. In my stats program, it showed up in the list of spiders visiting my site.
Ah yes.. and they never show the description text from the atom feed. They pull a description out of the article according to what story it is grouped under
You are correct here also. I was formerly a contributor to Google News and they always got the description directly from the web page rather than the feed.
(I was dropped from G News a few months ago. I switched to publishing on blog software and this seems to have been interpreted as being a move from news to blogging. But that's another story.)