Data collection

The code and data for various projects I have worked on can be found through this GitHub link.

Geodata from digital sources can be accessed through Application Programming Interfaces (APIs) or by scraping web pages. The code for obtaining data through APIs or scraping can be found in this GitHub repository: Python code.

APIs

APIs are the best avenue for accessing data, as they are the official channel through which websites share their data in a manner that ensures they are not overrun with requests. I have gathered data using the Twitter, Wikipedia and Flickr APIs, and the results were downloaded to a PostgreSQL database.
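As a minimal sketch of what an API request looks like, the snippet below queries the public MediaWiki API for a page's plain-text extract using only the Python standard library. The page title "Nairobi" is an illustrative assumption, and the PostgreSQL step is only noted in a comment (it would typically use a driver such as psycopg2); this is not the exact pipeline used for the projects above.

```python
import json
import urllib.parse
import urllib.request

WIKI_API = "https://en.wikipedia.org/w/api.php"


def build_query_url(title):
    """Build a MediaWiki API URL asking for a page's plain-text extract."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "extracts",
        "explaintext": "1",
        "titles": title,
    }
    return WIKI_API + "?" + urllib.parse.urlencode(params)


def parse_extract(payload):
    """Pull the extract text out of the API's JSON response."""
    pages = payload["query"]["pages"]  # keyed by internal page id
    page = next(iter(pages.values()))  # one title requested -> one page
    return page.get("extract", "")


if __name__ == "__main__":
    # Live request; in a full pipeline the parsed rows would then be
    # inserted into a PostgreSQL table, e.g. via psycopg2 (details omitted).
    with urllib.request.urlopen(build_query_url("Nairobi")) as resp:
        text = parse_extract(json.load(resp))
    print(text[:200])
```

Most APIs follow this same shape: build a parameterised request, receive JSON, and extract the fields you need before storing them.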

Webpage scraping

Scraping might result in your IP address being blocked, so use it with caution. Check the site's robots.txt file to learn which URLs may be accessed. A Python web scraper can provide free, comprehensive access to a website's data using libraries such as Selenium WebDriver, which automates a web browser to mimic user browsing interactions, and BeautifulSoup, which parses the HTML document to extract content such as review text. Yelp, Google Maps and TripAdvisor pages were scraped and the data downloaded as CSV files.
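The two steps above — honouring robots.txt and parsing review text out of HTML — can be sketched with the standard library alone. This is a stand-in for the Selenium/BeautifulSoup workflow: the robots.txt ruleset, the CSS class name "review-text", and the sample HTML are all made-up examples, not any real site's markup.

```python
import csv
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt ruleset; a real scraper would fetch the
# site's own /robots.txt before requesting any page.
ROBOTS_TXT = ["User-agent: *", "Disallow: /private/"]


def allowed(path):
    """Return True if the (hypothetical) robots.txt permits fetching path."""
    rp = RobotFileParser()
    rp.parse(ROBOTS_TXT)
    return rp.can_fetch("*", path)


class ReviewExtractor(HTMLParser):
    """Collect the text of elements carrying a given class attribute.

    A standard-library stand-in for BeautifulSoup's find_all(class_=...).
    Void tags such as <br> are not handled; fine for this sketch.
    """

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self._depth = 0  # > 0 while inside a matching element
        self.reviews = []

    def handle_starttag(self, tag, attrs):
        if self._depth:
            self._depth += 1  # nested tag inside a matching element
        elif self.target_class in (dict(attrs).get("class") or "").split():
            self._depth = 1
            self.reviews.append("")

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.reviews[-1] += data


def reviews_to_csv(reviews, path):
    """Write one review per row, matching the CSV export described above."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["review"])
        writer.writerows([r.strip()] for r in reviews)


if __name__ == "__main__":
    sample = (
        '<div class="review-text">Great coffee, friendly staff.</div>'
        '<div class="review-text">Too crowded on <b>weekends</b>.</div>'
    )
    extractor = ReviewExtractor("review-text")
    extractor.feed(sample)
    print(allowed("/reviews"), extractor.reviews)
```

For dynamic pages that render content with JavaScript, the HTML would first be obtained through Selenium's driver (e.g. `driver.page_source`) before being parsed in the same way.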