
Forum Library, Charter, Moderators: bakedjake

Alternative Search Engines Forum

NEEDED: Search Engine that DOES NOT ignore SCRIPT TAGS
Very useful thing, I wonder why I can't find it?

5+ Year Member

Msg#: 3274052 posted 3:57 pm on Mar 7, 2007 (gmt 0)

The need, of course, is to see which websites use a specific web service (e.g. a commercial JavaScript/Flash component).

Specifically, I want the search engine to tell me which websites have a reference like this:

<script language="javascript" src="http://www.mycompetitor.com/go.js">

It's a way to know how many websites are using your COMPETITOR'S web service :-)



WebmasterWorld Senior Member 10+ Year Member

Msg#: 3274052 posted 4:20 pm on Mar 7, 2007 (gmt 0)

I agree. This has been on my wish list for ages: source code search across common web pages.

Anyway, search Google for [code search] and you'll find a few code search engines. However, these usually don't search the source code of common web pages; Google's own "codesearch" in particular does not.


5+ Year Member

Msg#: 3274052 posted 3:07 pm on Mar 8, 2007 (gmt 0)

Yeah, indeed code search is irrelevant, as it searches only open source repositories etc.

What about these options:

1. link searchers. maybe there are search engines that do "page-link searches" and these may process <script src=>?

2. companies who do custom crawling of the internet?


5+ Year Member

Msg#: 3274052 posted 2:58 pm on Mar 11, 2007 (gmt 0)

Any ideas? I am willing to pay for a working solution.


10+ Year Member

Msg#: 3274052 posted 12:48 pm on Mar 12, 2007 (gmt 0)

Have you looked at Amazon Web Services ( [amazon.com...] )?

It gives you programmatic access to lots of the data Alexa has crawled.

You can use that and write a program that goes through each page and looks for a pattern.

Don't know if they give you access to the raw HTML, which you will need, or only allow you to access pre-parsed text.


5+ Year Member

Msg#: 3274052 posted 11:24 pm on Mar 13, 2007 (gmt 0)

Thanks runarb,
Amazon has these searchable fields:


They have a "links" field, but from the example they gave it "smells" like the normal anchor link, so there is no concrete reason to put effort into testing it. (Also, in Alexa, link search did not give me what I wanted.)

Thanks. Any more ideas, anyone?


10+ Year Member

Msg#: 3274052 posted 11:00 am on Mar 16, 2007 (gmt 0)

With Amazon Web Services you can write a program that directly accesses all the data Amazon has crawled. One writes a program that will run on Amazon's servers and go through every page they have.

Your program could then search each page for your pattern, and save the url for each page that has it.

For example, this sample
[alexa.com...] uses AWS and Ruby to access image headers to make them searchable.

More samples here: [alexa.com...]

What you can do is write your own program along those lines to create such a service.
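
To give an idea of how small the "search each page for your pattern" part can be, here is a rough Python sketch. It is not the Alexa/AWS sample linked above and does not use any Amazon API: it assumes you have already obtained the raw HTML somehow and simply takes a dict mapping page URL to source. The names ScriptSrcCollector, pages_using_script and sample_pages are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ScriptSrcCollector(HTMLParser):
    """Collects the src attribute of every <script> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def pages_using_script(pages, host):
    """Return the URLs of pages that load a script from `host`.

    `pages` is a hypothetical input: a dict of page URL -> raw HTML.
    """
    hits = []
    for url, html in pages.items():
        collector = ScriptSrcCollector()
        collector.feed(html)
        collector.close()
        # Sloppy markup sometimes uses backslashes in URLs; normalise first.
        hosts = (urlparse(s.replace("\\", "/")).hostname for s in collector.sources)
        if host in hosts:
            hits.append(url)
    return hits


# Made-up sample data standing in for crawled pages.
sample_pages = {
    "http://example.com/a": '<html><script language="javascript" '
                            'src="http://www.mycompetitor.com/go.js"></script></html>',
    "http://example.com/b": "<html><p>see www.mycompetitor.com/go.js</p></html>",
    "http://example.com/c": '<script src="/local.js"></script>',
}

print(pages_using_script(sample_pages, "www.mycompetitor.com"))
```

Only page "a" is reported, because page "b" merely mentions the URL in visible text and page "c" loads a local script. The hard part remains getting at the raw HTML of a large crawl, not the matching itself.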


5+ Year Member

Msg#: 3274052 posted 10:25 pm on Mar 18, 2007 (gmt 0)

I wasn't able to go through all the code. I trust you that it can work.
But it's too much heavy programming for me, and it's a bit like reinventing the wheel. I am sure somewhere on the net there is a ready-made solution :-) I just can't find it :-(


Msg#: 3274052 posted 6:17 pm on Mar 31, 2007 (gmt 0)

By now there ought to be a search engine that understands

The basic approach would be to run the page in a dummy browser, up to the point that OnLoad has been executed, then parse the document object model (DOM). This would index the page as it displays in a Javascript-enabled browser. Useful for finding Javascript ad links, even if obfuscated.

This has been suggested by others (see "http://tadhg.com/wp/2007/01/30/walking-the-html-dom-without-a-browser/")
but nobody seems to have done this yet?


WebmasterWorld Senior Member 10+ Year Member

Msg#: 3274052 posted 10:25 am on Apr 20, 2007 (gmt 0)

For me the purpose isn't really to "view the page as in a browser" - it is to search through the document source text including tags, errors, comments, javascript; the lot!

Say you are looking for a page that uses a certain free JavaScript script, in order to get a clue about how to implement it. With such a search engine you could input some part of the free script and immediately find a page where that bit was in the source.

Other examples: get an estimate of the number of pages that run a specific script or a specific ad programme, or use "nofollow" on links, or have the word "sex" in meta descriptions, or have an inline style class called "joe".

Really, there is an enormous amount of useful stuff you could do with a source search engine.
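
To make those examples concrete, here is a minimal Python sketch of that kind of query, run over a handful of page sources held in memory. It is only an illustration of the idea, not an existing search engine: SourceFeatures, count_matching and sample_sources are invented names, and a real service would need an index over a full crawl rather than a loop.

```python
from html.parser import HTMLParser


class SourceFeatures(HTMLParser):
    """Pulls a few source-level facts out of one page."""

    def __init__(self):
        super().__init__()
        self.nofollow_links = 0
        self.meta_description = ""
        self.comments = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "nofollow" in (attrs.get("rel") or "").lower().split():
            self.nofollow_links += 1
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.meta_description = attrs.get("content") or ""

    def handle_comment(self, data):
        # Comments are part of the source too, so keep them searchable.
        self.comments.append(data.strip())


def count_matching(sources, predicate):
    """Count pages for which predicate(features, raw_html) is true."""
    count = 0
    for html in sources:
        features = SourceFeatures()
        features.feed(html)
        features.close()
        if predicate(features, html):
            count += 1
    return count


# Made-up page sources for illustration.
sample_sources = [
    '<html><head><meta name="description" content="Cheap widgets"></head>'
    '<body><a rel="nofollow" href="/x">x</a><!-- built by joe --></body></html>',
    '<html><body><a href="/y">y</a><style>.joe { color: red }</style></body></html>',
]

print(count_matching(sample_sources, lambda f, html: f.nofollow_links > 0))
print(count_matching(sample_sources, lambda f, html: "widgets" in f.meta_description.lower()))
print(count_matching(sample_sources, lambda f, html: ".joe" in html))
```

The predicate gets both the parsed features and the raw text, so tag-aware questions ("nofollow" on links, words in meta descriptions) and plain substring questions (a style class, a snippet of a script) can be asked the same way.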

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved