A growing amount of data is available on the web. However, this data is usually presented as unstructured HTML, which poses a challenge to researchers who want to capture it automatically and convert it into a form suitable for analysis. Web scraping is a computational method for meeting this challenge. In this workshop you will learn how to scrape unstructured web pages using the rvest R package and how to prepare the captured data for analysis. You will gain hands-on experience by working on a few small projects that illustrate common scraping strategies and issues. The last project involves scraping multiple web pages.

Requirements 

  • Functional knowledge of commonly used base R commands (for an overview see https://www.rstudio.com/wp-content/uploads/2016/05/base-r.pdf)
  • R and RStudio installed on your device before attending the workshop
