|Google WMT Link Counts: Total vs Incoming vs Discovered Date|
Looking into my links as per Google WMT.
This post is just about the possible effects on my site of incoming links.
There may be 5 or 6 links I have "arranged to get" in 13 years, all back in 2003/4.
I just took a few minutes to download the Incoming Links and Links Discovered by Date tables for my site from Google WMT.
The "More Sample Links" I didn't download (probably should have looked at those too.)
I haven't sorted out yet how many of these links/domains are actually relevant to my site and how many are "spam" so to speak. But it's pretty obvious a good % are spammy.
The total link count was something like 53,000.
The Incoming Links table showed 1,000 domains and how many links come from each (I guess that's the limit for that table). I guess this is by the domain name where the link(s) reside.
The big dog in this group sends 35,901 links. It is a relevant site, and I guess most of those links are from a sitewide links list on their pages. A few links from that site appear to be from some of their members' comments in their forum.
A couple of other sites send 5,000 to 6,000 links each; <removed example> is one of those. One sends all its links to my home page, the other sends links to 644 of my pages.
The "Discovered by Date" table listed something like 13,000 links going all the way back to 2004. Lots of domains are listed multiple times here. Some domains seem to have posted a ton of links all at once, or at least they were all discovered in one shot.
Other domains are listed for several different dates, some covering long spans of time.
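For anyone wanting to sort a table like this rather than eyeball it, here is a rough Python sketch. It assumes the export is a CSV with one row per link; the column names `link` and `first_discovered` and the sample rows are made up for illustration, so adjust them to whatever the real download contains. It counts links per domain, records the first and last discovery dates, and flags domains where a batch of links turned up on a single day.

```python
import csv
import io
from collections import Counter, defaultdict
from datetime import date
from urllib.parse import urlsplit

# Hypothetical sample rows; a real export may use different column names.
SAMPLE = """link,first_discovered
http://forum.example.org/thread/1,2004-06-01
http://forum.example.org/thread/2,2009-03-15
http://directory.example.net/a,2012-11-02
http://directory.example.net/b,2012-11-02
http://directory.example.net/c,2012-11-02
"""

def summarize(csv_text, burst_threshold=3):
    """Per domain: link count, discovery date range, and a one-day 'burst' flag."""
    per_domain = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        domain = urlsplit(row["link"]).hostname
        per_domain[domain].append(date.fromisoformat(row["first_discovered"]))

    summary = {}
    for domain, dates in per_domain.items():
        # Largest number of links discovered on any single day for this domain.
        biggest_day = max(Counter(dates).values())
        summary[domain] = {
            "links": len(dates),
            "first": min(dates),
            "last": max(dates),
            "burst": biggest_day >= burst_threshold,
        }
    return summary

summary = summarize(SAMPLE)
for domain, info in sorted(summary.items(), key=lambda kv: -kv[1]["links"]):
    print(domain, info)
```

The `burst_threshold` of 3 is arbitrary; on a real list of 13,000 links a much higher number would make more sense.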
So, what to do, if anything?
Traffic wise the site is hurting big time across all search engines.
But this post is just about the possible effects of the incoming links.
Any comments would be appreciated.
[edited by: Robert_Charlton at 2:56 am (utc) on Aug 16, 2013]
[edit reason] removed exemplified sitename [/edit]
"Traffic wise the site is hurting big time across all search engines."
Firstly, I would check and find out what the traffic drop patterns look like. Do they coincide with Panda? With Penguin? Both? Some other algo change?
Secondly, you are probably going to have to do a better analysis of the links.
- nofollow links can be ignored.
- crappy scraper sites (like the kind that scrape google search results or wikipedia pages) can probably be ignored (or make low priority).
- Figure out which links, if they were pointing to your competitors instead of you, would strike you as "unfair" and make you think your competitors were gaming the system.
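The three rules above could be applied to a spreadsheet of links with something like the following Python sketch. The `Link` fields and the bucket names are my own invention, not anything WMT provides; the `scraper` and `looks_unfair` flags in particular are judgment calls you would have to fill in by hand.

```python
from dataclasses import dataclass

@dataclass
class Link:
    source_domain: str
    nofollow: bool = False       # link carries rel="nofollow"
    scraper: bool = False        # e.g. a site republishing search results
    looks_unfair: bool = False   # fails the "what if a competitor had it" test

def triage(link):
    """Sort a link into a bucket, applying the three rules in order."""
    if link.nofollow:
        return "ignore"
    if link.scraper:
        return "low priority"
    if link.looks_unfair:
        return "review first"
    return "keep"

# Hypothetical examples.
links = [
    Link("blog.example.com", nofollow=True),
    Link("scraped-results.example.net", scraper=True),
    Link("paid-directory.example.org", looks_unfair=True),
    Link("relevant-club.example.org"),
]
buckets = {link.source_domain: triage(link) for link in links}
print(buckets)
```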
Looking back, except for Panda 1 it's been a long slow decline since early 2010. Panda 1 essentially instantly cut my traffic by 35% which turned into 50% by the end of 2011.
I'll have to look for a list of Penguin dates to correlate that, but I don't recall any immediately noticeable impacts from it.
And part of what confuses me is the slide in traffic from Bing and Yahoo that more or less has happened at the same time as G, although at a somewhat slower rate of decline.
Are you sure your site:
- Doesn't have spam on it?
- Doesn't have canonical errors?
- Hasn't been hacked / has malware on it?
Also, were you able to rectify whatever it was that Penguin had punished you for?
It seems unusual that your Bing and Yahoo traffic would decline as well, unless there is something that is being overlooked...
|- Doesn't have spam on it? |
The closest I think I've come to "spam" is UGC event listings that were probably intentionally submitted/posted word for word by the original authors on 3 or 4 sites that offered event listing sections. ALL of these were carefully read/edited-as-needed by me before going live on my site.
I discontinued this part of the site late last year because it was just too hard to keep up with.
I also had 29 - 30 links pages that had a combined total of maybe 1,400 outbound links, NONE of which were arranged reciprocal links. Some of those sites probably linked to my site, but NONE of those inbound links were pre-arranged/asked-for/etc.
ALL of my outbound links were the result of me personally reviewing the target sites before I posted the link and I reviewed every outbound link 3 - 4 times a year at a minimum.
All but 2 of those pages were deleted earlier this year. The remaining 2 pages have about 150 links each. That's 150 links arranged alphabetically on one page and by geo-location on the other. These two pages have been at the top of my most popular pages for years, so I kept them.
NONE of my outbound links on these pages used KW anchor text. They ALL were anchored by business/org name.
|- Doesn't have canonical errors? |
Not sure about this, not even sure how to check. There is a working 301 from non-www to www, though.
|- Hasn't been hacked / has malware on it? |
Google WMT says none detected. I did do a fetch-as-google on the homepage the other day. Saw nothing out of place.
|Also, were you able to rectify whatever it was that Penguin had punished you for? |
If Penguin is about links, I dunno. That's partly why the links pages went away; maintaining them was the other part, and Penguin was just the last straw.
Panda was more understandable, from a "thin-content" point of view. The site is largely a bunch (1,800 +/-) of hand-built static pages that I've pretty consistently described as a photo gallery with expanded (fact based) captions. That meant that other than the photos there wasn't a lot of unique content on each photo page.
|It seems unusual that your bing and yahoo traffic would decline as well, unless there is something that is being overlooked... |
Yeah, that's the part that really puzzles me.
As for spam, I guess I was thinking more along the lines of making sure that no one had snuck content / pages on to your site that had "pharmacological" material on it.
As for canonical, what I meant was whether your pages can only be indexed one way by Google. I know some sites that use tags for navigation or have pages that show up in multiple categories, so a single page could show up as:
so I would do a site:mydomain.com in Google and see whether the individual pages are indexed multiple times with different URLs.
(I know, this is kind of basic, but maybe it will help?)
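If the list of indexed URLs is long, a small Python sketch like this could do the first pass. It only catches the simplest duplicates (www vs non-www, trailing slash, `index.html`, query strings); those normalization rules are my assumptions about what counts as "the same page", and it will not catch a page that lives under two different tag or category paths.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def normalize(url):
    """Reduce a URL to host + path, ignoring common cosmetic differences."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]
    path = path.rstrip("/") or "/"
    return host + path

def find_duplicates(urls):
    """Return {normalized key: [variants]} for keys reached by more than one URL."""
    groups = defaultdict(set)
    for url in urls:
        groups[normalize(url)].add(url)
    return {key: sorted(v) for key, v in groups.items() if len(v) > 1}

# Hypothetical URLs, as they might be copied from a site: search.
indexed = [
    "http://www.example.com/widgets/",
    "http://example.com/widgets",
    "http://www.example.com/widgets/index.html?ref=nav",
    "http://www.example.com/about",
]
duplicates = find_duplicates(indexed)
print(duplicates)
```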
Were there specific dates when traffic declined sharply? Or was it just a gradual decline over time?
If you saw sharp declines in traffic or in your number of impressions, then note down the days of the sharp declines.
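One way to pull those days out of an analytics export is sketched below in Python. The weekly figures and the update date are invented placeholders, not real data; you would substitute your own visit counts and a list of update dates taken from a published source. The 25% threshold and 7-day window are also just guesses.

```python
from datetime import date

def sharp_drops(series, threshold=0.25):
    """Return (day, fractional drop) wherever visits fell by more than
    `threshold` compared with the previous entry in the series."""
    drops = []
    for (_, prev), (day, cur) in zip(series, series[1:]):
        if prev > 0 and (prev - cur) / prev > threshold:
            drops.append((day, (prev - cur) / prev))
    return drops

def nearby_updates(day, updates, window_days=7):
    """Names of updates dated within `window_days` of `day`."""
    return [name for name, when in updates.items()
            if abs((day - when).days) <= window_days]

# Made-up weekly visit totals (week starting date, visits).
weekly = [
    (date(2011, 2, 14), 1000),
    (date(2011, 2, 21), 980),
    (date(2011, 2, 28), 640),
    (date(2011, 3, 7), 630),
]
# Placeholder entry; fill in real names and dates yourself.
updates = {"update A (placeholder)": date(2011, 2, 24)}

drops = sharp_drops(weekly)
for day, fraction in drops:
    print(day, f"{fraction:.0%}", nearby_updates(day, updates))
```

A long slow slide will not show up here at all, which is itself useful to know: it suggests the cause is not a single dated update.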
I am sorry to hear that the UGC was too hard to maintain. But you can probably guess by the number of moderators that WebmasterWorld uses here that it takes a lot of time to really keep a UGC site up to snuff.
Each of my pages can only be accessed at a single static URL. On the category pages the anchor text for each individual page was the "topic of that page". Each of those pages was also linked from 2 other pages in that category; on those pages the anchor text was either "next" or "prev".
|Were there specific dates when traffic declined sharply? Or was it just a gradual decline over time? |
Panda 1 was a biggie, but there were a few other noticeable drops for which I'd have to dig up more accurate dates.
Some of the decline was related to other G updates, some of which I was more or less OK with.
For example: for years I got a lot of page views from folks looking to make big ticket widget purchases. That led to a LOT of widget sellers asking for a link. If they and their sites looked legit, I posted a link on one of those now-deleted links pages, NEVER asking for a link back. But since mine was an info site where I sell nothing, it seems reasonable to me that G should rank widget sales sites above mine for those queries. I lost a lot of that traffic, but it was probably a better user experience for the visitor.
Well, after downloading a list of links from yet another source, I see that no two sources list all the same links. I guess that's to be expected.
Merging the lists is interesting.
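For what it's worth, a merge like that can be sketched in a few lines of Python. The source names and URLs below are hypothetical, and the only cleanup assumed is trimming whitespace and a trailing slash; the point is just to keep track of which source reported each link so you can see the overlap.

```python
from collections import defaultdict

def merge_link_lists(sources):
    """Map each link to the set of source names that reported it."""
    seen = defaultdict(set)
    for name, links in sources.items():
        for link in links:
            # Light cleanup only: trim whitespace and a trailing slash.
            seen[link.strip().rstrip("/")].add(name)
    return dict(seen)

# Hypothetical lists from two link sources.
sources = {
    "wmt": ["http://a.example/page", "http://b.example/"],
    "other_tool": ["http://b.example", "http://c.example/x"],
}
merged = merge_link_lists(sources)
in_all = sorted(l for l, s in merged.items() if len(s) == len(sources))
in_one = sorted(l for l, s in merged.items() if len(s) == 1)
print("reported by every source:", in_all)
print("reported by only one source:", in_one)
```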