When you need data that is published on a web page, you do not have to retype it: if you know the page's URL, you can extract (scrape) the data directly.
The XML package for R makes this easy with its readHTMLTable() function.
As a case study, I want to scrape two tables:
US Airline Customer Score.
World Top Chess Players (Men).
A. Scraping US Airline Customer Score table from
http://www.theacsi.org/index.php?option=com_content&view=article&id=147&catid=&Itemid=212&i=Airlines
Code:
library(XML)
# which = 1 selects the first table on the page
airline = 'http://www.theacsi.org/index.php?option=com_content&view=article&id=147&catid=&Itemid=212&i=Airlines'
airline.table = readHTMLTable(airline, header=TRUE, which=1, stringsAsFactors=FALSE)
Result:
B. Scraping World Top Chess Players (Men) table from http://ratings.fide.com/top.phtml?list=men
Code:
# which = 5 selects the fifth table on the page
chess = 'http://ratings.fide.com/top.phtml?list=men'
chess.table = readHTMLTable(chess, header=TRUE, which=5, stringsAsFactors=FALSE)
Result:
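The which values used above (1 and 5) are the positions of the wanted tables among all the table elements on each page. One way to find such a position is to read every table first and inspect what comes back. The sketch below does this on a small made-up page held in a string, so it runs without a network connection; the page contents and the variable names are assumptions for illustration only, not taken from either site.

```r
library(XML)

# Hypothetical page: a small layout table followed by the data table we want.
page <- '<html><body>
<table>
<tr><td>Home</td><td>About</td></tr>
<tr><td>News</td><td>Contact</td></tr>
</table>
<table>
<tr><th>Rank</th><th>Name</th><th>Rating</th></tr>
<tr><td>1</td><td>Player A</td><td>2800</td></tr>
<tr><td>2</td><td>Player B</td><td>2790</td></tr>
</table>
</body></html>'

# Parse the string as HTML instead of fetching a URL.
doc <- htmlParse(page, asText = TRUE)

# Without `which`, readHTMLTable() returns a list with one entry per table.
all.tables <- readHTMLTable(doc, stringsAsFactors = FALSE)
table.count <- length(all.tables)

# Look at the dimensions of each entry to spot the table holding the data.
table.dims <- lapply(all.tables, dim)

# Here the data table is the second one, so request it by position.
ratings <- readHTMLTable(doc, header = TRUE, which = 2, stringsAsFactors = FALSE)
```

On a real page the same approach applies with the URL in place of doc; pages that mix layout tables with data tables are the usual reason the index is not 1.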
Done. You have successfully scraped data from a web page with CloudStat.
You can get the full version of this case study (code and results) at Scraping table from html web.
You can then analyze the data as usual, with no retyping. Enjoy!
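Before analyzing, note that with stringsAsFactors=FALSE the scraped columns typically arrive as character strings, so numeric columns usually need converting first. The following is a minimal sketch using a hand-built data frame as a stand-in for a scraped table; the column names and values are made up for illustration.

```r
# Stand-in for a scraped table: every column is character, as after scraping.
scraped <- data.frame(Name = c("Player A", "Player B", "Player C"),
                      Rating = c("2800", "2790", "2785"),
                      stringsAsFactors = FALSE)

# Convert the rating column from character to numeric.
scraped$Rating <- as.numeric(scraped$Rating)

# Numeric summaries now work on the converted column.
mean.rating <- mean(scraped$Rating)
```

If a real column contains thousands separators or footnote marks, as.numeric() returns NA for those entries, so the text may need cleaning (for example with gsub()) before conversion.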
Source: http://www.r-bloggers.com/scraping-table-from-html-web-with-cloudstat/