The search is for the name of a major news site in Israel. Five of the nine datacenters return the site's URL without the title, and this is just an example of searching for a Latin string.
Searching for a Hebrew string on google.co.il returns a list of URLs without any other information, such as the page titles or (God forbid) the search word in the context of the page.
This isn't the first time this has happened.
Back in November it also happened.
If I remember correctly, the last dance began on May 5th. I think I read somewhere that it ended on the 20th. Today is the 27th.
If Google is not going to fix the problem sometime soon, Israeli users are going to lose confidence in Google.
Tomorrow I am going to teach a search techniques course and I really don't know what to tell my students.
People have claimed that Google is the "Knowledge Operating System" of the Internet.
I don't know about that...
[edited by: heini at 1:45 pm (utc) on May 27, 2003]
[edit reason] removed urls per TOS / thanks! [/edit]
thanks!
GoogleGuy
[google.com...]
is the right url to use.. thanks again.
It seems that some index algorithms choke on Hebrew/Arabic web pages.
There might be an explanation for this. Both Arabic and Hebrew are written and displayed from right-to-left and have two standards for publishing on the web.
One is the "Visual" standard where the text is stored from left-to-right and displayed, using a special font, from right-to-left.
The other is the "Logical" standard where the text is stored from right-to-left and displays from right-to-left.
You will see that when you search for a Hebrew or an Arabic word from the main search page, and then click the advanced search link, the string will appear TWICE in the field "with at least one of the words": once in the order you typed it and once reversed. Google does that in order to find pages in both standards, Visual and Logical.
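To make the "typed once, reversed once" idea concrete, here is a minimal sketch of how such a pair of query strings could be built. The function name `query_variants` is made up for illustration, and this is only a guess at the general idea, not Google's actual code. It also assumes the search term is a single pure-Hebrew or pure-Arabic word: plain reversal would scramble vowel marks and any digits or Latin letters mixed into the term.

```python
def query_variants(term: str) -> list[str]:
    """Return the term as typed plus its character-reversed form.

    Hypothetical helper: one variant is meant to match pages stored in
    the order the user types, the other pages stored in reversed order.
    """
    reversed_term = term[::-1]
    # A term that reads the same in both directions needs only one variant.
    if reversed_term == term:
        return [term]
    return [term, reversed_term]
```

For example, `query_variants("abc")` returns `["abc", "cba"]`, and the two strings would then be combined with OR in the "with at least one of the words" field.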
Text in files other than web pages is stored only from right-to-left (Logical) and this is the reason why I think the problem is with the reversing mechanism.
Hope that helps.
The display mechanism of web pages in Hebrew or Arabic is not sure what standard the page is in so it doesn't display any information on the page. No page title, no snippet, no cache link.
If the display mechanism is sure which standard the page is in, as with non-web-page files (DOC, XLS, PDF, etc.), it knows how to display the information for the page. If it doesn't know, it checks whether the site is listed in DMOZ and, if it is, displays the information from DMOZ.
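The fallback order I am guessing at can be sketched roughly like this. Everything here is hypothetical: `IndexedPage`, `text_standard`, `render_result` and the `directory` lookup are names invented for the illustration, and the logic is only my theory of what the results page seems to do, not anything Google has confirmed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IndexedPage:
    url: str
    title: Optional[str] = None
    snippet: Optional[str] = None
    # "visual", "logical", or None when the index has no record of it
    text_standard: Optional[str] = None


def render_result(page: IndexedPage, directory: dict[str, str]) -> str:
    """Pick what to show for one search result, per the guessed fallback."""
    # 1. Standard known: the title and snippet can be displayed safely.
    if page.text_standard in ("visual", "logical") and page.title:
        return f"{page.title} | {page.snippet or ''} | {page.url}"
    # 2. Standard unknown: fall back to a directory description (e.g. DMOZ).
    if page.url in directory:
        return f"{directory[page.url]} | {page.url}"
    # 3. Nothing else available: show the bare URL only.
    return page.url
```

A page with no stored standard and no directory entry falls through to the last branch, which matches the URL-only results described above.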
The problem can be in two places.
The first place is the index itself, where the display standard of the page is stored. If the information on the display standard is missing, the display mechanism cannot know how to display the information about a page, and only the URL of the page is displayed.
The second possible place is somewhere in the display mechanism itself, which doesn't know what to do with the display standard it gets from the index.
If the problem is with the index, the problem will be hard to solve, because the display standard information will have to be collected again for all the pages in the index.
If it's with the display mechanism, it's "only" a software bug that can be easily solved, since it already worked.
And maybe I don't know what I am talking about...
I search for the name of the organization I work for, שתיל, and get nicely formatted search results. Since there are many pages, I click "more pages from this site" and get a SERP with some good results and some links showing only the URL.
(This posting is also a test of this forum's multilingual support.)
Best,
GoogleGuy
... one thing I agree with, though: the longer Google takes, the more Hebrew surfers will lose their confidence in it.
I am glad to see that you care about your Israeli audience :)
Do the math.
Hanan Cohen
***Love and Peace***
Which audience is more important - a million Web surfers who use the Web 3-4 hours a day, or one hundred million who use the Web for 15 minutes a week?
Google is the only major search engine to provide a Hebrew interface and a co.il domain. AV and ATW do provide Hebrew results but they haven't bothered to translate their sites into Hebrew. Other majors don't have Hebrew support whatsoever. That only shows the difference between GG and the rest. The GG folks are much better at understanding the meaning of a globalized Web and they are far ahead of everyone else.
I believe GG is the number 2 search site in Israel right now (after Walla!) and it is on its way to becoming number 1. Way to go GG!
If anyone does still see any problems, please let me know.
Search string : שתיל
[info.org.il...]
The search string is the name of a website I run and I know the URLs intimately. Sometimes I get the title and snippet and sometimes I don't.
When writing ads in Hebrew, where it says "Headline (maximum 25 characters)" it really only allows 13 Hebrew characters. The description lines allow only 18 Hebrew characters each (instead of 35 characters in English).
The rates for ads in Hebrew are the same rates as the English rates. Why do we get only half the text space?
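One possible explanation, which nothing in this thread confirms, is that the limit is counted in bytes rather than in characters. In UTF-8 every Hebrew letter takes two bytes while a space takes one, so a 25-byte budget holds 12 letters plus one space, i.e. 13 characters. The sketch below only illustrates that arithmetic; `fits` and `count_bytes` are hypothetical names and the byte-counting behaviour is an assumption, not a documented fact about the ad form.

```python
def utf8_length(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits(text: str, limit: int, count_bytes: bool) -> bool:
    """Check a text against a limit, counted in bytes or in characters."""
    size = utf8_length(text) if count_bytes else len(text)
    return size <= limit


ALEF = "\u05d0"  # the Hebrew letter alef, two bytes in UTF-8

# 12 letters + 1 space: 13 characters, 25 bytes
headline = ALEF * 6 + " " + ALEF * 6
# 17 letters + 1 space: 18 characters, 35 bytes
description = ALEF * 9 + " " + ALEF * 8
```

Under the byte-counting assumption, `fits(headline, 25, count_bytes=True)` holds exactly at the limit, and adding one more letter fails even though the text is still only 14 characters long.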
[google.co.il...]
it starts nicely with Dr. Eric E. Schmidt, then Sergey Brin, Larry Page... then the bio of Omid Kordestani reads something like:
"Omid Kordestani has more than 12 years of experience in...
fgdfgfd
Kordestani got an MBA from Stanford"
Then comes Wayne Rosing which reads something like:
"Wayne's got
fdgd
gdfgg
dgd
h"
The translator (who I bet was drunk at the time) apparently left when he got to McCaffrey's bio, so it remains in English, and afterwards the translator returns to unfinish the job.
GG - please add to your "to do" list - "translate management.html into Hebrew"