|I can read most of those screen shots just fine on a 22" monitor at 1600x1200. |
Yah, Bill, but it's not the whole page. What you leave unmentioned is they only show five paragraphs out of 23 total paragraphs in the article about Obama/McCain. It's not the whole page, the user has to click through to the site to enjoy the entire article.
Is it possible to submit sitemaps to Searchme?
|What you leave unmentioned is they only show five paragraphs out of 23 total paragraphs in the article about Obama/McCain. |
You're sidestepping the fact that some pages are smaller than others, so it could easily show the entire content or a significant portion of it. Stop focusing on one particular page layout.
The point is that the snippet served up is now significantly larger than on any other SE, which is what I find unacceptable.
Wasn't long ago re cuil, with big boasts, and much that was wrong from the name onwards.
Big thread about it here; but I don't recall anyone from cuil posting - and certainly no ceo asking for ideas, saying they'd be discussed, and giving his email address.
"Search Me" - good name; so easy for URL.
Way cool results presentation - like new Mac OS, or using cooliris.
So if can iron out inevitable wrinkles, avoid lawsuits from cranky webmasters, and enhance results - looks to me that should deserve a good share of the market.
[and google likely looking hard at this; not laughing their socks off as likely did w cuil]
If not smaller "snippets", some blurring of the images a good idea; maybe could use something like graded filter - so sharper at the top (header readily read, along with title; albeit not all sites share such design), and then quickly becoming fuzzier.
So, can get a fair idea of what a page is like, but to read text/appreciate photos, have to visit the page.
I've a 1600px wide monitor; there are snippets where the main text can be read with ease.
I have to ask how much of a copyright "infringement" is possible on a web page with one paragraph? Where copyright falls through the cracks is FAIR USE where it is impossible to quote a small article without quoting the entire article.
Write longer articles if you want to file a DMCA.
OK, forget text, think images - if the entire image is in the screen shot then what?
How's it different? I am addicted to Google and perform almost 99% of my searches there. Can't say how this will help.
Here's a hypothetical situation:
What if an established SE rolled out a search interface like searchme?
What if website owners could 'opt in' to receive full indexing & display as we see now at searchme?
Anyone concerned with copyright or other issues could stay opted out of their text & pix being scraped.
So now the 'opt in' sites would be nicely displayed (and likely clicked on more) & the other sites would be a blurry mess of text & x's where the images should be....
Would the owners of those sites have any legal recourse for being displayed in an unfriendly light?
|Would the owners of those sites have any legal recourse for being displayed in an unfriendly light? |
At some point it becomes copyright blackmail to be strong-armed to play how you're being dictated to play, or don't play at all.
OK, so the page completely doesn't display at all for non-Adobe'd devices. Anyone posted screenshots and a review for the rest of us yet? When's the "lite" or "mobile" or "accessible" version coming?
I think that previewing sites in this way could be an excellent way to help well designed sites rise up the rankings. Optimising sites for this SE would seem to depend chiefly on creating a dramatic look for the site. Could be a boon for graphic designers.....
I imagine that most people would go through to pages that appeal. The images on their image search do look rather too big but perhaps an automated watermark across the images might persuade users to visit the site. Definitely very powerful and intuitive. Excellent topic for WW.
The screenshot size is unacceptable for me. 560 by 570 pixels is way too much, anything over 300x300 I'm going to ban. Like Leosghost I've had my images blocked using hotlink protection since forever, and I also have noarchive on some websites in this index. At this size it goes way beyond fair use.
So where is the robots.txt to remove my sites? If we all have to contact them it's going to get out of hand very quickly.
slef: mobile version is [m.searchme.com...] lite version is still in alpha
Pretty cool. I like how it behaves like my iPhone. As a searcher I like the interface. I do have my concerns as a webmaster about how my content is displayed without any kind of cache date associated with it. You should at least let your searchers know how recently you updated a page.
i think there must be valid reasons why msn, yahoo and google use a simple text based interface, and devote their massive resources to the actual search output and monetisation techniques
Then again, perhaps searchme will show them a thing or two. However, this fascination with a high production SERPs interface really reminds me of the apparent operational path ask.com followed
|i think there must be valid reasons why msn, yahoo and google use a simple text based interface |
Bandwidth? Storage requirements? Browser ubiquity?
|I have to ask how much of a copyright "infringement" is possible on a web page with one paragraph? Where copyright falls through the cracks is FAIR USE where it is impossible to quote a small article without quoting the entire article. |
And I have to answer ..that if a substantial amount of the paragraph is used then it's copyright infringement ..no matter how long the original paragraph may or may not be ..the law doesn't say your paragraphs must be longer than X to be your copyright ..
Think for example about a 5 line poem with less than 200 chars ( many haiku would be covered by this example ) ..One line if reproduced by someone who did not create it ..can be considered to be a quote ..that would be about 20% ..obviously if you wrote 500 pages then 20% ( or 100 pages ) is no longer OK if used as a "quote" ..but then "fair use" isn't what search engines do ..what they do is aggregate lots of other people's original material ..brand it with their logo ..on their pages ..and call it their search ..and then pass it off as their product
"fair use" is for reviewers ( find me the review on any page of google's serps or MS or searchme etc etc ) ..or educational establishments ..Not a single current existing search engine is an educational establishment ..
Practically search engines are useful ..but they are not obligatory ( Brett took this entire forum out of search for a while ..not just "noarchive" ..but not spiderable "no bots" ..new people still found it :)..the old way via links from other sites and by word of mouth )..no google yahoo ms or anyone else needed ..just other webmasters saying "hey ..look there is this place I know" ..and making a link on their site ..
Search engines must obey the law ..and the protocols ..even google's "noarchive" tag is an abomination which we should not have to use to prevent them scraping our entire pages ..into their publicly accessible cache ..
Since they had some very bad press over the level of branding that they placed on their "cache" of entire pages ..even though they have had judgements given ( for now in their favour ) by judges who don't know a pixel from a png ..and so shouldn't even be adjudicating on matters in which they are ignorant ..
Nevertheless google has reduced the "branding" on the "cache" pages ..to quieten down some webmasters as their actions definitely were evil.
When G began they were almost universally loved by webmasters ..they were new ..radically different ..and we all sang their praises to each other and to average joe and jane surfer ( most of whom were using alltheweb or altavista or Yahoo etc and who didn't know how easy it was for webmasters to manipulate what they the average surfer saw on page 1, 2, or 3 for almost any search term ) ..But G were not grateful ..and they treat webmasters and the law ( copyright or federal or countries' laws ) with ever greater disdain ..because they now have the money and the eyeballs to stifle access to dissent ..
So now they are talked of in terms of hate , fear , and dislike almost in the same way that one once heard used exclusively for microsoft ..who also don't really listen ..even those who make good money with G would like an alternative so all their eggs are not forced into the same basket as G has a de facto monopoly on search in the west ..and is in bed with repression in the east ..one tweak of the algo can ruin you ..
Randy and his people's new search engine could be the beginnings of the re-balancing of search ..the breaking of G's monopoly ..and the use of the flash interface and the concept is as radical as google's "clean" page was then ..looks like Linux and 3D desktops and Spielberg all rolled into one ..and i repeat I love the look ..And Randy is also here ( for now ) and discussing ..and apparently listening ..( which is more than G's, Y's and MS's PR reps have done in a long while ..and yes I include Mr Cutts in the ranks of SE PR spinners ) ..So rather than fight amongst ourselves ( your sites may not be affected by screenshots and entire 1st and only paragraphs on page being shown ..but some of my sites are ..and one in particular and it's not even a site that I bother with since years ..but it's the principle of it ..but then it might have been your site that was affected by some other aspect of the searchme model ..and I'd fight for you not against you if that were the case ) ..some sites are just 5 pages ..images and a little text ..I'm thinking of a site that sells flash nav etc that belongs to a member here ..Others may be selling ebooks on stopping smoking or gardening or whatever ..usually one or two para pages in that model ..and most adsense sites I see are sparse on text ..so should all the sites with less than 2000 words per page just go to the wall to let Randy and his friends make money ..and so that some here ..might ..yeah might ..get a little more traffic ..?
While Randy is here and listening we may be able to work something out with him and help us all along with him ..
Saying what amounts to "screw you jack my site has big pages so I'm ok !" isn't what WebmasterWorld has ever been about ..
I think Incredibill ( may have been someone else tho ) posted a piece recently quoting the lines that end with
"and then they came for me and there was no one to speak up for me " ..or some such ..
BTW ..DMCA doesn't require one to have large amounts of text ..nor do any of the international conventions ..
Ps ..Randy ..still think you have more in common with a directory than a search engine ..even though one does not have to postulate for inclusion ..
[edited by: Leosghost at 3:24 pm (utc) on Sep. 29, 2008]
Further to the Mac/Safari issue, just wanted to let you know that I've checked, and my version of Flash player is MAC 9,0,124,0.
SearchMe.com doesn't work for me. (Mac OSX 10.4.11 and Safari 3.1.2 - Mac is a PowerPC rather than Intel.) All other Flash sites work for me: but I definitely can't see the SearchMe site unless I change browsers.
Appreciate that this thread is mainly concerned with other matters - but I'd guess it's important to know that some people may not be able to use your search engine.
Interesting that Google suddenly has an option to show long descriptions (~650 characters) in their SERPs.
Discussion of this test is at: [webmasterworld.com...]
Search engine? All it did was display screen shots which didn't link to anything although Firefox kept flashing up warning messages about pop-ups.
Firefox comes "out of the box" with the pop-up blocker switched on. If it doesn't work with the vanilla settings of the world's second most popular browser then it is seriously broken.
|All it did was display screen shots which didn't link to anything although Firefox kept flashing up warning messages about pop-ups. |
they link ..but the link opens straight into a new window ..not a tab ..I think that firefox views that behavior as a pop up ..
least mine does ..firefox 18.104.22.168 ..
I tried Firefox 3 ..it ran like molasses and took over the CPU and showed over 200,000 processes ..so I went back to an older one ..I'll wait to upgrade again to series 3 until someone convinces me it runs light and fast ..xp pro.
Links for me in Firefox 3, w new tab opening. (Haven't noticed difference from FF2 to 3)
Yes, URLs could be prominent (not even noticing them just now)
Results presentation useful if you'll check thro only a few results (as typical)
Less so if want to skim down list of lots of results; or use browser search to check thro search engine result. Both readily achieved w google. Option to remove images pane (as well as option to make it smaller) might help here?
Seems even when have list of results in lower window, Firefox search only checks the window w images. Only images, so it can't find words.
Interesting google n these long results (as yet, no legal action notices in that thread!). I'd suggested google would be watching but not laughing at Search Me.
OK, we spent a long time talking about this thread in exec staff today. One of the first decisions was to modify our bot to honor the NOARCHIVE directive in robots.txt. When we have made that change if you have NOARCHIVE in your robots.txt file we will still include you in our index but the image that shows up for your pages will indicate that the website owner has requested that we do not show an image of the page.
Next we discussed snippet length and the other requests, but since there does not seem to be consensus on this among webmasters, we want to open this up to the webmaster community to recommend a series of new robots.txt directives that will cause our bot to do things like limit or eliminate snippets, blur images, etc. We want to give the webmaster control of it so we don't end up dictating it for everyone based on the demands of a few. So I want to throw it back to you all: what directives would you like to see added to robots.txt for visual search engines like searchme?
Thanks for getting back to us Randy ..where I am it's 03.47 am ..so will consider your points after some sleep ..
Guys (and gals) let's not try to have our cake and eat it, too. Text is the USUAL basis of search engines. Images displayed in context with that text are usually a plus. If worried about images then watermark or digitize them.
Want traffic? Take it from where it comes. Want traffic and somebody is willing to work with how that is developed, TALK TO THEM.
Me? Above the fold...EVERYTHING I GOT. Okay. More than that...have to think about it. What an incoming visitor to my site sees on first click is what searchme seems to be showing. Okay by me as far as the interesting searchme display goes.
Dang it, if anyone doesn't want to share/display this stuff then you're in the wrong business.
I took a look at searchme. Interesting. Didn't know "Charlotte" was searchme...blocked it some weeks back. Still listed...but not as strong as I might have been because of that (Randy... I'll let your bot back in...didn't honor robots.txt the first two weeks, hope that has changed).
Copyright infringement is one thing. Search engine is another...if the search engine does not serve the website.
You can't get listed if you don't let 'em...and you can't complain about the listing if it is done responsibly.
Not a rant, just a reminder we don't bite the hand that feeds us (too often).
|One of the first decisions was to modify our bot to honor the NOARCHIVE directive in robots.txt . |
emphasis by me
I am not trying to be presumptuous and teach a search engine CEO what robots.txt is and what META data is, but just in case..., and it might be a good reference point for new webmasters
uhh,ohhh NOARCHIVE is not part of robots.txt, but rather is a meta data element (tag) that is part of the page meta data, while robots.txt is site wide
robots.txt protocol has established 'directives' although some search engines expanded it a bit - such as use of pattern matching, etc.
META data tags also have a standardized set, however there is tag data that is supported by some and not the other SEs (their initiatives)
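To make the page-level side concrete, here is a minimal Python sketch of how a crawler might pull the robots directives out of a page's meta tags. It uses only the standard library; the class name `RobotsMetaParser`, the helper `page_directives` and the sample HTML are made up for illustration and are not taken from any search engine's actual code.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated, case-insensitive list.
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def page_directives(html):
    """Return the set of robots meta directives found in one page (hypothetical helper)."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Illustrative page: the directive applies to this page only, not the whole site.
sample = '<html><head><meta name="robots" content="NOARCHIVE, nofollow"></head></html>'
print(sorted(page_directives(sample)))  # ['noarchive', 'nofollow']
```

A bot that honors NOARCHIVE would check for "noarchive" in that set before storing or displaying a copy of the page.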
Good primer on which "robot controls" (robots.txt and META) are supported by the (current :) ) major search engines
Managing Robot's Access To Your Website [janeandrobot.com]
all about robots.txt - [robotstxt.org...]
W3C notes on robots.txt - [w3.org...]
(scroll down and there are notes on META data as well)
IETF rfc - [robotstxt.org...]
Google bot and NOARCHIVE
Tastatura: I consider myself now "schooled" on meta vs. robots.txt! I told my VP Ops about my mistake and he laughed for about 15 minutes then said he assumed I knew that NOARCHIVE was a meta-tag. Thinking about it I would have realized that NOARCHIVE would probably be on a page by page basis, therefore meta. You are not being presumptuous it was my mistake, thanks for correcting it.
The major difference between robots.txt and meta robots data, is that the disallow in robots.txt says to not even access the folder/page/pages listed there (so you will have no record of what those pages contain), whilst you will still be able to access all other non-listed (non-listed in robots.txt that is) pages and see what they contain.
However, those individual pages may then have a directive to not index it, not archive it, etc, buried in the meta data. With nofollow and others there are quite a few possible combinations.
For Google, they show URLs blocked by robots.txt as URL-only entries in their SERP, which is often less than useful - as they are pages that the webmaster often/usually didn't want showing up at all. Google shows nothing at all for the URL when a page is blocked by robots meta noindex data.
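The two levels described above can be sketched as a single decision function. This is only an illustration in Python: `urllib.robotparser` is the standard library's robots.txt parser, while the function `decide`, the bot name "ExampleBot", the sample robots.txt and the returned strings are all assumptions made up for the example. The `meta_directives` argument stands for whatever was found in the page's meta robots tag, which a bot can only know after fetching the page.

```python
from urllib.robotparser import RobotFileParser

# Illustrative site-wide file: one folder is off limits to every bot.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def decide(robots_txt, user_agent, url, meta_directives):
    """Hypothetical two-stage check: robots.txt first, then page meta data."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    # Stage 1: robots.txt says whether the URL may be requested at all.
    if not parser.can_fetch(user_agent, url):
        return "do not fetch"

    # Stage 2: only a fetched page can tell the bot about its meta directives.
    if "noindex" in meta_directives:
        return "fetch, but do not index"
    if "noarchive" in meta_directives:
        return "index, but do not show a cached copy"
    return "index normally"


print(decide(ROBOTS_TXT, "ExampleBot", "http://example.com/private/a.html", set()))
print(decide(ROBOTS_TXT, "ExampleBot", "http://example.com/page.html", {"noarchive"}))
```

The order matters: a page blocked in robots.txt is never read, so any noindex or noarchive tag on it is never seen.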
Many sites have a bot-blocker that activates for anything that accesses stuff explicitly listed as "keep out" in the robots.txt file so beware of that.
Does your bot check robots.txt every time it goes to spider a site, just before starting, or does it cache the robots data for a while like some other bots - who then stumble into blocked areas, blocked only recently but after the last visit the bot made to check the robots file?
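For what the caching question means in practice, here is a rough Python sketch of a robots.txt cache with an expiry time. Everything in it is hypothetical (the class name `RobotsCache`, the `fetch` callable, the one-hour default); it only shows why a bot with a long cache lifetime can wander into a folder that was blocked after its last check.

```python
import time


class RobotsCache:
    """Hypothetical per-host cache of robots.txt bodies with a time-to-live."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch        # callable: host -> robots.txt text
        self._ttl = ttl_seconds    # how long a cached copy is trusted
        self._clock = clock        # injectable so the sketch can be tested
        self._entries = {}         # host -> (fetched_at, robots_txt_text)

    def get(self, host):
        now = self._clock()
        entry = self._entries.get(host)
        # Re-fetch when nothing is cached or the cached copy has expired.
        if entry is None or now - entry[0] >= self._ttl:
            entry = (now, self._fetch(host))
            self._entries[host] = entry
        return entry[1]
```

With `ttl_seconds=0` the file is re-read before every crawl; with a large value, rules added in the meantime are missed until the cached copy expires.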
For robots.txt, what does your bot do when there is a section for "all" (User-agent: *) and a section especially for your bot? In that particular case, Google reads ONLY the section for their bot, so you have to list everything for Google in that one section, necessarily repeating everything that is already listed in the "all" section again. If your bot reads both sections, how does it prioritise conflicting rules?
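The "reads ONLY its own section" behaviour can be sketched like this in Python. It is a simplified illustration, not any engine's real parser: the function names `parse_groups` and `rules_for`, the bot name "ExampleBot" and the sample file are made up, and bot names are matched exactly rather than by the looser matching real crawlers may use.

```python
def parse_groups(robots_txt):
    """Split robots.txt into (agent_names, rules) groups."""
    groups = []
    agents, rules = [], []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if rules:                         # a rule line closed the previous group
                groups.append((agents, rules))
                agents, rules = [], []
            agents.append(value.lower())
        elif field in ("allow", "disallow") and agents:
            rules.append((field, value))
    if agents:
        groups.append((agents, rules))
    return groups


def rules_for(robots_txt, bot_name):
    """Use the bot's own section if one exists, otherwise fall back to '*'."""
    groups = parse_groups(robots_txt)
    bot = bot_name.lower()
    if any(bot in names for names, _ in groups):
        # A specific section exists: the '*' section is ignored entirely.
        return [rule for names, rules in groups if bot in names for rule in rules]
    return [rule for names, rules in groups if "*" in names for rule in rules]


ROBOTS = """
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/

User-agent: ExampleBot
Disallow: /drafts/
"""

print(rules_for(ROBOTS, "ExampleBot"))  # [('disallow', '/drafts/')]
```

Under this model /cgi-bin/ and /private/ are open to ExampleBot unless they are repeated in its own section, which is exactly the repetition described above.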
Does your bot understand wildcard notation, like Disallow: /*/thatfolder for example?
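As a rough idea of what wildcard support involves, here is a small Python sketch that turns a Disallow pattern containing `*` into a regular expression. The function names are invented for the example, and it handles only `*` (any run of characters), matched from the start of the URL path; it is an assumption about how such matching could work, not a description of any particular bot.

```python
import re


def pattern_to_regex(pattern):
    """Translate a Disallow pattern with '*' wildcards into a compiled regex."""
    # Escape the literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body)


def is_blocked(path, disallow_pattern):
    """Return True if the path starts with something the pattern matches."""
    if not disallow_pattern:
        return False          # an empty Disallow blocks nothing
    return pattern_to_regex(disallow_pattern).match(path) is not None


print(is_blocked("/a/thatfolder/page.html", "/*/thatfolder"))  # True
print(is_blocked("/thatfolder/page.html", "/*/thatfolder"))    # False
```

A bot without wildcard support would treat `/*/thatfolder` as a literal path prefix and block nothing useful, which is why the question is worth asking.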
Rhetorical questions, in the main, I'm sure your guys have thought carefully about these things. However, as suggested above, the place for those answers, in detail, is on a "for webmasters" section of your site.
As someone already wrote, they seem to be incapable of dealing with anything beyond the English alphabet. If they have international ambitions, they will need to learn.
|If they have international ambitions, they will need to learn. |
Perhaps they don't have such ambitions. I'm all for multiple language support, but it is very very hard (therefore expensive) - you wouldn't launch a service that attempted to have that, you'd wait until you've grown and proven your business case in one language, then expand.
There are 1.5 billion English language speakers in the World according to Wikipedia. Seems to me like a decent starting place.
I support the idea, but I've never had any of my sites translated into other languages. They are all in English, and I get decent traffic despite that.
|When I got to one particular result, and as I clicked the next result to show it, the previous result asserted itself to scroll back into view and be the main displayed result. |
I've found this too (Chrome): the view bar / arrows respond to clicks, but also to mouseover - so if you want a stable view, you have to click then move the cursor well away from the navigation.
More annoying is that when I open the page, I need to click on the search box to start typing, even though it indicates that I should be able to write straight away.
Plus with some searches (?all) there's a transient 'pop-up box' - but it disappears before I can get to pull it into plain view - Chrome shows pop-ups at the foot of the screen, all that shows is a bar saying 'blocked popup', which you can then drag up if want to see it.
I see nothing like that on FF, however.