I want to transfer country information from the CIA Factbook to my site.
I have seen that some other sites make use of the subdivisions of the CIA Factbook HTML pages; look below.
<!-- FileName="Connection_cf_dsn.htm" "" -->
<!-- Type="CFDSN" -->
<!-- Catalog="" -->
<!-- Schema="" -->
<!-- HTTP="true" -->
Does anyone have experience with how to extract information from another page?
Thanks
FDR
You should use a server-side language. Doing this with JavaScript is well-nigh impossible because of the cross-domain security safeguards built into browsers. Any language that can make HTTP requests and do string manipulation will do. Take your pick: ASP, PHP, Perl, Python, and about a dozen others.
1) create the HTTP request (usually a GET request)
2) send the request, get the results back as a string
3) parse the string and grab the parts you want out of it (I recommend using regular expressions)
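For what it's worth, here is a rough sketch of those three steps in Python, using only the standard library. The function names, the User-Agent string, and the sample markup are all made up for illustration; the real Factbook pages will be laid out differently, so the regular expression would need adjusting to whatever HTML you actually get back.

```python
import re
import urllib.request


def fetch_page(url, timeout=10):
    """Steps 1 and 2: send a GET request and return the response body as a string."""
    # The User-Agent value is an arbitrary placeholder, not something the site requires.
    request = urllib.request.Request(url, headers={"User-Agent": "factbook-example/0.1"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_field(html, label):
    """Step 3: return the text that follows a bold label, or None if it is absent.

    Assumes markup shaped like <b>Capital:</b> Paris<br> (a hypothetical layout).
    """
    pattern = r"<b>\s*" + re.escape(label) + r"\s*:?\s*</b>\s*([^<]+)"
    match = re.search(pattern, html, flags=re.IGNORECASE)
    return match.group(1).strip() if match else None


# Invented sample markup so the parsing step can be tried without a network call.
SAMPLE_HTML = "<p><b>Capital:</b> Paris<br><b>Population:</b> 68,000,000</p>"
```

Calling extract_field(SAMPLE_HTML, "Capital") on the sample string pulls out the text after the label; on a live page you would pass in the string returned by fetch_page instead. Regular expressions are fine for simple, regular markup like this, but they get brittle fast if the page layout changes.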
The technique itself is not unethical; it's done all the time when building applications that use public-facing APIs, RSS feeds, and the like. Then it's not "scraping", it's... consuming a web service.
Of course I would never personally scrape content from another site. That's just wrong.
Can I use some or all of The World Factbook for my Web site (book, research project, homework, etc.)?
The World Factbook is in the public domain and may be used freely by anyone at anytime without seeking permission. However, US Code prohibits use of the CIA seal in a manner which implies that the CIA approved, endorsed, or authorized such use. If you have any questions about your intended use, you should consult with legal counsel. Further information on The World Factbook's use is described on the Contributors and Copyright Information page. As a courtesy, please cite The World Factbook when used.
[cia.gov...]
Much of the data provided on US government websites is in the public domain, although you should always check to make sure, since the last entity you ever want legal trouble with is the U.S. government! ;)