google.co.il is totally broken

Data on .il sites is missing

     

hanan_cohen

1:18 pm on May 27, 2003 (gmt 0)

10+ Year Member



The last "dance" has left the Israeli/Hebrew sites totally broken in Google SERPs.

The search is for the name of a major news site in Israel. Five out of the nine datacenters return the URL of the site without the title, and this is just an example of searching for a Latin string.

Searching for a Hebrew string in google.co.il returns a list of URLs without any other information, such as the title of the pages or (God forbid) the search word in the context of the page.

This isn't the first time this has happened.

It also happened back in November.

If I remember correctly, the last dance began on May 5th. I think I read somewhere that it ended on the 20th. Today is the 27th.

If Google is not going to fix the problem sometime soon, Israeli users are going to lose confidence in Google.

Tomorrow I am going to teach a search techniques course and I really don't know what to tell my students.

People have claimed that Google is the "Knowledge Operating System" of the Internet.

I don't know about that...

[edited by: heini at 1:45 pm (utc) on May 27, 2003]
[edit reason] removed urls per TOS / thanks! [/edit]

GoogleGuy

5:42 am on Jun 2, 2003 (gmt 0)

WebmasterWorld Senior Member googleguy is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Hi hanan_cohen! I found your blog posting via scripting.com, so I poked around google.co.il and couldn't find anything wrong--the searches that I tried worked. It could be that the site you are talking about was down when we crawled; then we could only show a link and not a snippet. Could you fill out a spam report (just because that's easy for me to access) and mention your name/nickname? I'm curious to see what search isn't working for you.

thanks!
GoogleGuy

[google.com...]
is the right url to use.. thanks again.

GoogleGuy

5:12 pm on Jun 2, 2003 (gmt 0)

WebmasterWorld Senior Member googleguy is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Aha. I see what you're talking about. I tried a query or two like "test" and saw results with descriptions, but I think you're talking about missing snippets?

I'll check on what we're doing to resolve this. Thanks for mentioning it!

hanan_cohen

5:34 pm on Jun 2, 2003 (gmt 0)

10+ Year Member



Thanks for looking into it.

Try the queries here:
http://info.org.il/english/google-co-il-is-broken.html

Click the links near Search text =

(Dear moderators, please don't edit out this URL. I have no other ways of communicating with GoogleGuy).

[edited by: Marcia at 10:27 pm (utc) on June 2, 2003]

hanan_cohen

6:21 pm on Jun 2, 2003 (gmt 0)

10+ Year Member



And another strange thing. The problem only occurs for web pages! Searching for a Hebrew and an Arabic word on all file types got me correct search results.

It seems that some index algorithms choke on Hebrew/Arabic web pages.

There might be an explanation for this. Both Arabic and Hebrew are written and displayed from right-to-left and have two standards for publishing on the web.

One is the "Visual" standard where the text is stored from left-to-right and displayed, using a special font, from right-to-left.

The other is the "Logical" standard where the text is stored from right-to-left and displays from right-to-left.

You will see that when you search for a Hebrew or an Arabic word from the main search page, and then click the advanced search link, the string will appear TWICE in the field "with at least one of the words": once in the order you typed it and once reversed. Google does that in order to find pages in both standards, Visual and Logical.

Text in files other than web pages is stored only from right-to-left (Logical) and this is the reason why I think the problem is with the reversing mechanism.
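
For anyone who wants to see the two orders side by side, here is a minimal Python sketch of the double query described above. It assumes a visual-order page can be approximated by a plain character reversal of the logical string. The function name `query_variants` is made up for illustration, and this is not a claim about how Google actually builds the query.

```python
def query_variants(query: str) -> list:
    """Return the query as typed plus a naive reversed (visual-order) form."""
    # Assumption: reversing the characters approximates how the word is
    # stored in a visual-order page. Real text with vowel points, digits,
    # or mixed Latin words would need a proper bidi algorithm instead.
    reversed_query = query[::-1]
    # A palindrome reverses to itself, so return it only once.
    if reversed_query == query:
        return [query]
    return [query, reversed_query]


if __name__ == "__main__":
    # "\u05e9\u05dc\u05d5\u05dd" is the Hebrew word "shalom" in logical order.
    for variant in query_variants("\u05e9\u05dc\u05d5\u05dd"):
        # ascii() keeps the output readable on consoles without Hebrew support.
        print(ascii(variant))
```

Searching for either variant is what would let one query match pages published in both standards.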

Hope that helps.

hanan_cohen

6:44 pm on Jun 2, 2003 (gmt 0)

10+ Year Member



I GOT IT. I know exactly where the problem is. The problem is not with the index, it's with the display mechanism. (Maybe it has to do with the index too.)

The display mechanism of web pages in Hebrew or Arabic is not sure what standard the page is in so it doesn't display any information on the page. No page title, no snippet, no cache link.

If the display mechanism is sure what standard the page is in, as with non-web-page files (DOC, XLS, PDF etc.), it knows how to display the information for the page. If it doesn't know, it looks to see if the site is listed in DMOZ, and if it is, it displays the information from DMOZ.

The problem can be in two places.

The first place is the index itself, where the display standard of the page is stored. If the information on the display standard is missing, the display mechanism cannot know how to display the information about a page and only the URL of the page is displayed.

The second place for the problem can be somewhere in the display mechanism itself, which doesn't know what to do with the display standard it gets from the index.

If the problem is with the index, the problem will be hard to solve, because the display standard information will have to be collected again for all the pages in the index.

If it's with the display mechanism, it's "only" a software bug that can be easily solved, since it already worked.
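
The guessed fallback order above can be sketched in a few lines of Python. This is only a model of the behavior as seen from the outside, not Google's actual code. Every name here (`IndexedPage`, `text_order`, `dmoz_description`, `render_result`) is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IndexedPage:
    """Hypothetical record of what the index holds for one URL."""
    url: str
    title: str = ""
    snippet: str = ""
    text_order: Optional[str] = None        # "visual", "logical", or None if unknown
    dmoz_description: Optional[str] = None  # set only if the site is listed in DMOZ


def render_result(page: IndexedPage) -> List[str]:
    """Return the lines shown for one search result under the guessed rules."""
    # Case 1: the display standard is known, so title and snippet can be shown.
    if page.text_order in ("visual", "logical"):
        return [page.title, page.snippet, page.url]
    # Case 2: standard unknown, but a DMOZ listing supplies a description.
    if page.dmoz_description is not None:
        return [page.dmoz_description, page.url]
    # Case 3: nothing usable, so only the bare URL appears.
    return [page.url]


if __name__ == "__main__":
    print(render_result(IndexedPage(url="http://example.org/")))
```

In this model, losing `text_order` in the index and mishandling it in `render_result` both end in the same bare-URL result, which is why the two causes look alike from the search page.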

And maybe I don't know what I am talking about...

hanan_cohen

7:43 pm on Jun 2, 2003 (gmt 0)

10+ Year Member



The same problem also occurs with the other RTL languages, Persian and Urdu.

Jacqwo

8:28 pm on Jun 2, 2003 (gmt 0)

10+ Year Member



Me think Hanan_Cohen should work for Google!

What say you GoogleGuy?

GoogleGuy

11:12 pm on Jun 2, 2003 (gmt 0)

WebmasterWorld Senior Member googleguy is a WebmasterWorld Top Contributor of All Time 10+ Year Member



If only all our reports were so thorough! :) I agree with what hanan_cohen was saying--it's not a problem in the index, but rather with the generation of snippets (the display mechanism). That's a good sign; it might take a few days to nail it down and get a fix pushed out, but it shouldn't require any crawling/indexing changes, because we have the correct documents. Thanks again, hanan_cohen. I'll drop by again if I find out more info from talking to people around here.

g1smd

12:08 am on Jun 3, 2003 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Heh, if only you could get your PR notched up by one or two points, or moved a page or two up the SERPs, for sending in a useful report...

hanan_cohen

8:30 am on Jun 4, 2003 (gmt 0)

10+ Year Member



Hebrew SERPs are getting better but strangely inconsistent.

I search for the name of the organization I work for, שתיל, and get nicely formatted search results. Since there are many pages, I click the "more pages from this site" link and get a SERP with some good results and some links with only the URL.

(This posting is also a test of the multilingual support of this forum.)

NeverHome

9:31 am on Jun 4, 2003 (gmt 0)

10+ Year Member



I don't know about anybody else, but I get complete vertigo when I see anything on a page going from right to left. Half the time I think the page is back-to-front or upside-down. :) Interesting to note that Google also suffers from language disorientation.

paladin

12:40 pm on Jun 4, 2003 (gmt 0)

10+ Year Member



I thought that you might want to know that the descriptions started to reappear today for one of our Hebrew sites in google.co.il.

GoogleGuy

2:48 pm on Jun 4, 2003 (gmt 0)

WebmasterWorld Senior Member googleguy is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I'm glad that things are coming back. I heard that snippets on .co.il would come back in the time frame of a few days. Hopefully snippets there should be completely back to normal soon. hanan_cohen, sometimes just urls show up if we saw that url during a crawl but didn't have the time or resources to fetch that url. So sometimes it can be normal behavior to see only the url. Keep us posted of how things look, and I'll let you know if I hear any more news around here.

Best,
GoogleGuy

rbester

3:14 pm on Jun 4, 2003 (gmt 0)

10+ Year Member



Hi hanan_cohen and thanks for bringing this up. This issue has been going on for a while. I brought it up in this thread:
[webmasterworld.com...]
but that was more than a week after this started to happen. I thought it had something to do with the latest update, because everything else seems to be a little crazy with the Google results these days, so I decided to just wait and see.

... one thing I agree with, though: the longer Google takes, the more Hebrew surfers will lose their confidence in it.

rbester

3:25 pm on Jun 4, 2003 (gmt 0)

10+ Year Member



Hi GoogleGuy,

Actually the results have just significantly improved right after my previous post (probably while I was still writing it.)

I am glad to see that you care about your Israeli audience :)

GoogleGuy

6:52 pm on Jun 4, 2003 (gmt 0)

WebmasterWorld Senior Member googleguy is a WebmasterWorld Top Contributor of All Time 10+ Year Member



We do. :) Keep me posted if you see problems in the future. :)

hanan_cohen

10:47 pm on Jun 4, 2003 (gmt 0)

10+ Year Member



I am glad to see that you care about your Israeli audience :)


Wrong. The problem was/is with all Right-to-Left languages, Arabic included. According to the Cambridge Encyclopedia of Language, 181,000,000 people speak Arabic. I think that not more than 6,000,000 people speak Hebrew.

Do the math.

Hanan Cohen
***Love and Peace***

rbester

10:04 am on Jun 5, 2003 (gmt 0)

10+ Year Member



I disagree.

Which audience is more important - a million Web surfers who use the Web 3-4 hours a day, or one hundred million who use the Web for 15 minutes a week?

Google is the only major search engine to provide a Hebrew interface and a co.il domain. AV and ATW do provide Hebrew results but they haven't bothered to translate their sites into Hebrew. Other majors don't have Hebrew support whatsoever. That only shows the difference between GG and the rest. The GG folks are much better at understanding the meaning of a globalized Web and they are far ahead of everyone else.

I believe GG is the number 2 search site in Israel right now (after Walla!) and it is on its way to becoming number 1. Way to go GG!

GoogleGuy

12:54 am on Jun 6, 2003 (gmt 0)

WebmasterWorld Senior Member googleguy is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Hey, let me know if snippets still aren't shown after this, but things should have been completely back to normal earlier today. There are some good reasons why it was a little hard to push this fix, but nonetheless almost all servers should have a new binary that handles this correctly now.

If anyone does still see any problems, please let me know.

hanan_cohen

7:06 am on Jun 6, 2003 (gmt 0)

10+ Year Member



Thanks, and sorry, but it's not working yet. Here is a screenshot.
The file name contains the date and time for California.

Search string : שתיל
[info.org.il...]

The search string is the name of a website I run and I know the URLs intimately. Sometimes I get the title and snippet and sometimes I don't.

GoogleGuy

9:15 am on Jun 6, 2003 (gmt 0)

WebmasterWorld Senior Member googleguy is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Very strange. I see snippets for some results on this search, so it may be more than just the display mechanism. Let me look into this search with some engineers here.

amazed

11:14 am on Jun 6, 2003 (gmt 0)

10+ Year Member



This might not be just a Hebrew/Arabic/Urdu problem.
I have had it with German pages too, since March 20. (As of March 20, only the index page is indexed and has snippets; all 30 subpages have no snippets and are not indexed.)

Hope it helps.

rbester

1:07 pm on Jun 8, 2003 (gmt 0)

10+ Year Member



While we're at it... perhaps this is not the best place to ask, so GG guy, if you're still here, please direct me as to where I can send the following problem with Google AdWords in Hebrew:

When writing ads in Hebrew, where it says "Headline (maximum 25 characters)" it really only allows 13 Hebrew characters. The description lines only allow 18 Hebrew characters (instead of 35 characters in English).

The rates for ads in Hebrew are the same rates as the English rates. Why do we get only half the text space?
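
One possible explanation, which nothing in this thread confirms, is that the limit is counted in bytes rather than characters. The Python sketch below assumes UTF-8, where a Hebrew letter takes two bytes and a Latin letter takes one. The helper names `utf8_size` and `chars_that_fit` are made up for illustration and do not describe how AdWords really counts.

```python
def utf8_size(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def chars_that_fit(sample_char: str, byte_limit: int) -> int:
    """How many copies of sample_char fit in a (hypothetical) byte budget."""
    return byte_limit // utf8_size(sample_char)


if __name__ == "__main__":
    alef = "\u05d0"  # Hebrew letter alef
    print(utf8_size("a"), utf8_size(alef))
    print(chars_that_fit("a", 25), chars_that_fit(alef, 25))
    print(chars_that_fit("a", 35), chars_that_fit(alef, 35))
```

Under this assumption a 25-byte headline holds 12 Hebrew letters and a 35-byte line holds 17. That is close to the 13 and 18 reported above, though not identical, so the real limit may be counted somewhat differently.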

rbester

3:40 pm on Jun 8, 2003 (gmt 0)

10+ Year Member



... and another thing, check out the following page:

[google.co.il...]

it starts nicely with Dr. Eric E. Schmidt, then Sergey Brin, Larry Page... then the bio of Omid Kordestani reads something like:

"Omid Kordestani has more than 12 years of experience in...
fgdfgfd
Kordestani got an MBA from Stanford"

Then comes Wayne Rosing which reads something like:

"Wayne's got
fdgd
gdfgg
dgd
h"

The translator (who I bet was drunk at the time) has left when he got to McCaffrey's bio, so it remains in English, and afterwards the translator returns to unfinish the job.

GG - please add to your "to do" list - "translate management.html into Hebrew"

GoogleGuy

7:15 pm on Jun 8, 2003 (gmt 0)

WebmasterWorld Senior Member googleguy is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I'll pass that on, rbester. :)

amazed, you can also see a url with no snippets if we saw the url but didn't get a chance to crawl it.

rbester

10:11 pm on Jun 8, 2003 (gmt 0)

10+ Year Member



:)
 
