
Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 66 message thread spans 3 pages.
Bugs in WMT.... Ooops
g1smd




msg:4428303
 10:05 pm on Mar 12, 2012 (gmt 0)

Suddenly I see:

Crawl errors
Ni bilo mogoče najti 23
Ni uspelo 4
Dostop zavrnjen 3
Napaka v strežniku 22
Programska napaka 404 1
Drugo 0


Eh?

 

tedster




msg:4428408
 3:29 am on Mar 13, 2012 (gmt 0)

That's some serious data corruption!

mslina2002




msg:4428409
 3:32 am on Mar 13, 2012 (gmt 0)

I see it too.

Mine is in Spanish though.

When I click the actual links, it reverts back to English.

lucy24




msg:4428439
 5:06 am on Mar 13, 2012 (gmt 0)

Your system language isn't set to Slovenian is it? Wouldn't work for me, because g### doesn't speak my system's first-choice language. Went over and checked in Safari before I remembered that they haven't even filled in Search yet.

g1smd




msg:4428458
 7:46 am on Mar 13, 2012 (gmt 0)

Seconds after posting the original message (at 2205 UTC yesterday) that started this thread I went to the Crawl Errors section of WMT and it has been completely redesigned from what I had been looking at only a few minutes before! One minute I was on the old system, the next minute the new.

The summary screen rotates the language with each refresh. This is either a silly bug or a <cynic> deliberate error to get people's attention and get people talking about the new features. </cynic>

Crawl errors
Introuvable 23
Non suivies 4
Accès refusé 3
Erreur du serveur 22
Soft 404 1
Autre 0


Crawl errors
Nicht gefunden 23
Nicht aufgerufen 4
Zugriff verweigert. 3
Serverfehler 22
Soft 404 1
Sonstiges 0


and finally to English, which is the system setting for WMT.


I was unable to add to this thread until now, as it was locked.

The new design looks great, and features a new button where you can "clear" errors from the list.

The main feature is that it now shows the number of errors graphed over time. However the data doesn't seem to be correct, especially for the "URL Errors > Web > Server Error" graph.

On one site there were a large number of "Error 500" errors for the last year or more. After fixing those issues in January I have watched the numbers in the old WMT Crawl Errors report slowly decline to 4 as Google has recrawled the URLs. It seemed to me that the error would be removed from the report 6 weeks after the error was last found on the site.

The issue causing that problem was fixed in January and the site hasn't served a single 500 error since then. Just yesterday, WMT listed the final 4 URLs that it had last seen with errors back in January. However, today there are now a large number of those errors relisted, the error count is back up to 45. Yesterday Google were happy the errors had long gone. Today, they are relisted. This is garbage.

The graph is especially misleading. For this one site, it shows 45 errors for today (and for each day going back in time, and a larger number at the beginning of the graph). I would take the data point for today to mean they actually FOUND 45 such errors on the site TODAY. It doesn't mean that at all. It means that as of today they have 45 URLs in their database that when LAST CRAWLED at some point in the past, days or weeks ago, returned that error at that time.
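To make the distinction concrete, here is a minimal sketch of the two ways a "45 errors today" data point could be counted. The URLs, dates and statuses are made up for illustration, and nothing here reflects how WMT actually stores or queries its crawl data.

```python
from datetime import date

# Hypothetical records: (url, date of the most recent crawl, status returned then).
last_crawl = [
    ("/a", date(2012, 1, 10), 500),
    ("/b", date(2012, 1, 25), 500),
    ("/c", date(2012, 3, 13), 200),
    ("/d", date(2012, 3, 13), 200),
]

def errors_on_record(records, status=500):
    """URLs whose most recent crawl, however long ago, returned `status`."""
    return [url for url, _, code in records if code == status]

def errors_found_on(records, day, status=500):
    """URLs that were actually crawled on `day` and returned `status`."""
    return [url for url, crawled, code in records
            if crawled == day and code == status]

today = date(2012, 3, 13)
print(len(errors_on_record(last_crawl)))        # 2: what the graph appears to plot
print(len(errors_found_on(last_crawl, today)))  # 0: what a reader would assume it plots
```

With this data the site served no errors today, yet the "on record" count is still 2 because two URLs have not been recrawled since January.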

Do I need to go through and "clear" each error, or will Google do that as they recrawl each one? It appears to me that the WMT data being used is at least several weeks old.

The "Not found" error report is correct for the couple of sites I have checked, showing the same data today as it did yesterday.

Make sure you click both the "Server Error" and "Not Found" boxes as there are separate graphs for each. Likewise for the three entries at the top of the page, as each of those leads to a separate graph.

Google still don't report 410 responses as 410. Everything is listed as 404.

< moderator note: see g1smd's post below - this original report
was incorrect and 410 statuses are now reported separately >


The other issue I raised almost three years ago is still there. When you save a report, the filename format varies depending on the report. There's a mix of sitename-datetime-reporttype.csv, sitename-reporttype-datetime.csv, reporttype-sitename-datetime.csv and reporttype-datetime-sitename.csv, which doesn't allow for an easy-to-understand sort order when files are listed. Can we just have sitename-datetime-reporttype.csv for all of the reports?
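Until that happens, the files can be renamed locally. This is only a rough sketch: the report type names and the timestamp pattern below are assumptions (the actual export filenames are not shown in this thread), and it will reject site names that themselves contain a hyphen.

```python
import re

# Hypothetical values; adjust to whatever the real exports use.
KNOWN_REPORT_TYPES = {"crawlerrors", "searchqueries"}
STAMP_RE = re.compile(r"\d{8}T\d{6}Z")  # assumed stamp, e.g. 20120313T074600Z

def normalise(filename):
    """Reorder a three-part export name to sitename-datetime-reporttype.ext."""
    stem, ext = filename.rsplit(".", 1)
    parts = stem.split("-")
    stamps = [p for p in parts if STAMP_RE.fullmatch(p)]
    rtypes = [p for p in parts if p in KNOWN_REPORT_TYPES]
    sites = [p for p in parts if p not in stamps and p not in rtypes]
    # Refuse to guess if the name does not split into exactly one of each part.
    if not (len(parts) == 3 and len(stamps) == len(rtypes) == len(sites) == 1):
        raise ValueError(f"cannot parse {filename!r}")
    return f"{sites[0]}-{stamps[0]}-{rtypes[0]}.{ext}"

print(normalise("crawlerrors-20120313T074600Z-example.com.csv"))
```

Once every file follows the same pattern, a plain alphabetical listing groups the reports by site and then by date.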

[edited by: tedster at 12:47 am (utc) on Mar 16, 2012]
[edit reason] insert correction notice [/edit]

rowtc2




msg:4428462
 8:05 am on Mar 13, 2012 (gmt 0)

I see strange languages too. They have changed something recently: [googlewebmastercentral.blogspot.com]

lucy24




msg:4428464
 8:20 am on Mar 13, 2012 (gmt 0)

Ahem.

[webmasterworld.com...]

They don't like me, though. I just get English, no matter what I do or where I go.

g1smd




msg:4428465
 8:32 am on Mar 13, 2012 (gmt 0)

Yesh, as I originally posted this thread the design updated before my very eyes... but this thread was locked for the next 5 hours.

Sgt_Kickaxe




msg:4428466
 8:32 am on Mar 13, 2012 (gmt 0)

I didn't see this particular corruption, but I did find something else around the same time you reported it: my site in a lot of different languages on Google.

It seems Google is running all websites through their translator and archiving them, perhaps in an effort to spot the "grab foreign content and translate it dirty" websites?

I don't know, but the timing suggests they may be related?

ohno




msg:4428467
 8:37 am on Mar 13, 2012 (gmt 0)

Same here, Search queries is in English yet crawl errors is foreign! I think this sums up Google at the moment, Product Search is also full of bugs! Funnily enough we had our first Google Checkout review in over a YEAR this week despite having many sales via GC.

Also, different sites have different foreign language! One is deffo German.

realmaverick




msg:4428487
 9:51 am on Mar 13, 2012 (gmt 0)

Haha thought it was just me, because I was using my iPhone to tether. Loving the new WMT though :)

zeus




msg:4428523
 11:18 am on Mar 13, 2012 (gmt 0)

Sometimes I see other domains, with weird domain names, in the pull-down at top right, but when I click I don't get to their reports.

robzilla




msg:4428545
 12:44 pm on Mar 13, 2012 (gmt 0)

Mon GWT est en français, que je ne comprends pas. Ce n'est pas très utile.

robzilla




msg:4428546
 12:45 pm on Mar 13, 2012 (gmt 0)

I swear I wrote that in English. What's going on here?

realmaverick




msg:4428554
 12:57 pm on Mar 13, 2012 (gmt 0)

Haha

aakk9999




msg:4428564
 1:03 pm on Mar 13, 2012 (gmt 0)

Google still don't report 410 responses as 410. Everything is listed as 404.

I am seeing 404 and 410 responses.
The errors seem to be cumulative, which is completely misleading. E.g. if 20 errors are found today, and tomorrow 15 of yesterday's are recrawled plus 5 more are found, I would expect to see 25, but what we seem to have is 40.
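The arithmetic can be checked with a few lines. This is just a worked version of the example numbers, with placeholder URL names; it is a guess at where a figure like 40 could come from, not a description of what WMT actually does.

```python
# Day 1: 20 URLs return errors.
day1 = {f"/page{i}" for i in range(20)}

# Day 2: 15 of those are recrawled and still fail, plus 5 new failing URLs.
recrawled = {f"/page{i}" for i in range(15)}
new = {f"/new{i}" for i in range(5)}
day2 = recrawled | new

distinct = len(day1 | day2)       # each failing URL counted once
summed = len(day1) + len(day2)    # daily totals simply added together

print(distinct)  # 25
print(summed)    # 40
```

Counting distinct URLs gives the expected 25; adding the daily totals counts the 15 recrawled URLs twice and gives 40.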

But I cannot find "Blocked by Robots" any more. Does anybody know where it has gone?

lucy24




msg:4428710
 6:11 pm on Mar 13, 2012 (gmt 0)

Over in the other thread I was wondering if "access denied" is their new name for "blocked by robots.txt". Option B is that "access denied" means 403 and-- another option that just occurred to me-- "blocked by robots.txt" is the new "not followed".

Except that, wait, I have tons of roboted-out pages and they're simply not listed anywhere, although the "can't find" group rolled over from old format to new.

g1smd




msg:4428786
 9:23 pm on Mar 13, 2012 (gmt 0)

In Google's webmaster blog they mention that the "blocked by robots" list has been removed from the "crawl errors" section and will shortly re-appear in the "site configuration" section - because many of the URLs in the "blocked by robots" list aren't actually errors, the webmaster purposely blocked that access.

On one site I am now seeing 410 responses actually reported as 410 in WMT reports. Good stuff. That's been a long time coming.

g1smd




msg:4428871
 1:13 am on Mar 14, 2012 (gmt 0)

Holy krap!

Crawl errors
&#1604;&#1605; &#1610;&#1578;&#1605; &#1575;&#1604;&#1593;&#1579;&#1608;&#1585; &#1593;&#1604;&#1610;&#1607; 23
&#1578;&#1593;&#1584;&#1585; &#1578;&#1578;&#1576;&#1593;&#1607; 4
&#1578;&#1605; &#1585;&#1601;&#1590; &#1575;&#1604;&#1608;&#1589;&#1608;&#1604; 3
&#1582;&#1591;&#1571; &#1601;&#1610; &#1575;&#1604;&#1582;&#1575;&#1583;&#1605; 2218
Soft 404 1
&#1571;&#1582;&#1585;&#1609; 0


Ahh, WebmasterWorld doesn't do UTF-8.

Suffice to say the list is now in Arabic.

lucy24




msg:4428907
 5:18 am on Mar 14, 2012 (gmt 0)

... and Google Translate was stumped on "Soft 404" ?

I feel so totally cheated :sob: Mine's resolutely in English. Maybe it's because my system language is set to not-English?

r4bet




msg:4428960
 9:32 am on Mar 14, 2012 (gmt 0)

mine is latin !

not2easy




msg:4429024
 1:32 pm on Mar 14, 2012 (gmt 0)

Mine was in Spanish, but only parts of the page. It is getting to be more useless. Pages indexed for years are showing now as "Pages Indexed: 0" yet they are indexed and traffic continues, useless non-information.

n00b1




msg:4429025
 1:38 pm on Mar 14, 2012 (gmt 0)

I have had spanish, German French and some Asian language.

n00b1




msg:4429026
 1:39 pm on Mar 14, 2012 (gmt 0)

Sorry for the poor typing. Ipad problems.

g1smd




msg:4429027
 1:39 pm on Mar 14, 2012 (gmt 0)

The old style reports visible until a few days ago showed several errors for a site; errors that had been fixed many weeks ago but which apparently had not been recrawled to see the new status.

The new style reports show zero errors for this site (great!) but the historical graph drops to zero on Feb 22nd, several weeks earlier.

Why the discrepancy?

not2easy




msg:4429178
 7:26 pm on Mar 14, 2012 (gmt 0)

I have noticed for quite a while that they are showing me errors for pages that have not existed for well over a year (and had been properly removed); the label on a sitemap may read "Downloaded June 10th, 2010". About two weeks ago it had big red warnings that an important page was blocked by robots.txt, and when I clicked to look, the "page" that was blocked was "someimage.gif". Quite often recently I leave there just shaking my head and wondering why I even bother to look, but it helps to know how far off base they have been moving.

netmeg




msg:4429248
 10:48 pm on Mar 14, 2012 (gmt 0)

heh, just noticed part of mine is in Spanish.

tedster




msg:4429361
 5:02 am on Mar 15, 2012 (gmt 0)

<hare-brained idea>
Maybe this language bug is really showing us some crossed wires in Google's infrastructure - crossed wires that have something to do with that mysterious "zombie traffic" phenomenon.

Bill Slawski (the patent guy) has been covering a series of Google patents over the past year that have to do with which regional data center Google might route any given query to. reference [seobythesea.com]

Each regional data center would have certain standard records, but coupled with other records that emphasized regional/local interests.

...this system will attempt to predict how likely it is that relevant information may be found at a particular producer node, and may or may not take into account particular topics or subject matter that may be relevant to the query. Remember, this is a "prediction" before the query is processed, so as much that can be done without actually finding all results to predict whether or not the query may have to be sent to more than one producer node, the better.

But what happens if this prediction goes haywire? I'm not saying I've got it nailed down, here - but I am catching a whiff of something or other.

</hare-brained idea>

[edited by: tedster at 1:15 am (utc) on Mar 16, 2012]

Andem




msg:4429367
 5:21 am on Mar 15, 2012 (gmt 0)

I'm getting Turkish today.. Yesterday was German and the day before was French.

g1smd




msg:4429394
 7:52 am on Mar 15, 2012 (gmt 0)

Keep hitting Reload or Refresh. It changes every time.

@Tedster I have no idea, but whatever it is I'd guess that the problem is deep in the infrastructure with multiple causes, otherwise it would be fixed already.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
Terms of Service ¦ Privacy Policy ¦ Report Problem ¦ About
© Webmaster World 1996-2014 all rights reserved