R Markdown

I love this intro photo from the tidyTuesday page. This week’s tidyTuesday data cover violations of the GDPR (General Data Protection Regulation), the regulatory regime for data privacy in the European Union. The Wikipedia entry on GDPR is fairly extensive. The dataset is large and suggests some interesting regulatory arbitrage. Let’s have a look at the data. First, let’s load them.

gdpr_violations <- readr::read_tsv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-04-21/gdpr_violations.tsv')

##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   id = col_double(),
##   picture = col_character(),
##   name = col_character(),
##   price = col_double(),
##   authority = col_character(),
##   date = col_character(),
##   controller = col_character(),
##   article_violated = col_character(),
##   type = col_character(),
##   source = col_character(),
##   summary = col_character()
## )

gdpr_text <- readr::read_tsv('https://raw.
In previous work with Skip Krueger, we conceptualized bond ratings as a multiple-rater problem and extracted a measure of state-level creditworthiness. I had always had it on my list to do something like this and recently ran across a package called geofacet that makes it simply too easy to do. The end result should parse out state-level credit risk and showcase the time series of credit risk for each of the states.
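As a rough illustration of the geofacet idea, the sketch below lays out one small time-series panel per state in an approximate map arrangement. The `ratings` data frame and its `score` column are made-up placeholders, not the creditworthiness measure described above, and the choice of the `us_state_grid2` layout is an assumption.

```r
# Made-up placeholder scores: 50 states by 5 years, NOT real credit ratings.
set.seed(42)
ratings <- expand.grid(state = state.abb, year = 2015:2019,
                       stringsAsFactors = FALSE)
ratings$score <- round(runif(nrow(ratings), min = 0, max = 1), 2)

# Draw the plot only when both packages are installed.
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("geofacet", quietly = TRUE)) {
  p <- ggplot2::ggplot(ratings, ggplot2::aes(x = year, y = score)) +
    ggplot2::geom_line() +
    # One panel per state, positioned roughly where the state sits on a map.
    geofacet::facet_geo(~ state, grid = "us_state_grid2") +
    ggplot2::labs(x = "Year", y = "Placeholder score")
  print(p)
}
```

Swapping in a real state-by-year measure should only require that the state column hold the two-letter abbreviations or the state names used by the grid.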
Beer Distribution

The #tidyTuesday for March 31, 2020 is on beer. The essential elements and a method for pulling the data are shown:

[Image: Imgur]

A Comment on Scraping .pdf

The Tweet

The details on how the data were obtained are a nice overview of scraping .pdf files. The code for doing it is at the bottom of the page. @thomasmock has done a great job commenting his way through it.
The New York Times has a wonderful compilation of United States data on the novel coronavirus. The data are organized as a panel of US counties and have been continuously collected and updated since March of 2020. For US data, it is as authoritative a source as I am aware of, and it provides a nice basis for visualizing various aspects of the pandemic. This commentary was originally provided in late March of 2020.
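Because the panel holds one row per county per date, a common first step is turning running totals into daily changes. The sketch below assumes the NYT county file's layout of `date`, `county`, `state`, `fips`, `cases` and `deaths`, with `cases` cumulative. The counts are invented toy numbers, and the file location in the comment is an assumption.

```r
# Toy panel with invented counts, in the layout the NYT county file is
# assumed to use. The real file is assumed to live at
# https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv
covid <- data.frame(
  date   = as.Date(c("2020-03-22", "2020-03-23", "2020-03-24",
                     "2020-03-22", "2020-03-23", "2020-03-24")),
  county = rep(c("Marion", "Multnomah"), each = 3),
  state  = "Oregon",
  fips   = rep(c("41047", "41051"), each = 3),
  cases  = c(30, 32, 40, 18, 21, 26),
  deaths = c(1, 1, 1, 0, 1, 1)
)

# Sort within county so that differences run forward in time.
covid <- covid[order(covid$fips, covid$date), ]

# Daily new cases: first difference of the cumulative count within each county.
# The first observed day has no previous value, so it is left as NA.
covid$new_cases <- ave(covid$cases, covid$fips,
                       FUN = function(x) c(NA, diff(x)))

print(covid[, c("date", "county", "cases", "new_cases")])
```

Grouping on `fips` rather than `county` avoids mixing counties that share a name across states.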
The Johns Hopkins dashboard

This is what Johns Hopkins has provided as a dashboard using ArcGIS. They have essentially layered the data into national and subnational levels and then used the ArcGIS dashboard to cycle through them.

The data

There are a few different types of data available. I am relying on the same sources that Johns Hopkins is using for the county-level incident data.
Oregon COVID data

I now have a few days of data. These data are current as of March 24, 2020. I will present the first version of these visualizations here and then move the auto-update to a different location. A messy first version of the scraping exercise is at the bottom of this post.

paste0("https://github.com/robertwwalker/rww-science/raw/master/content/R/COVID/data/OregonCOVID",Sys.Date(),".RData")

## [1] "https://github.com/robertwwalker/rww-science/raw/master/content/R/COVID/data/OregonCOVID2020-03-24.RData"

load(url(paste0("https://github.com/robertwwalker/rww-science/raw/master/content/R/COVID/data/OregonCOVID",Sys.Date(),".RData")))

A base map

Load the tigris library, then grab the map as an sf object; there is a geom_sf that makes them easy to work with.
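The `paste0()` call with `Sys.Date()` shown earlier builds a different URL every day, so it presumably resolves only for dates on which a snapshot was saved. A minimal sketch of one way to pin the date follows; the helper name `oregon_covid_url` is hypothetical and not part of the original post.

```r
# Hypothetical helper: build the date-stamped file URL for a given day.
oregon_covid_url <- function(date = Sys.Date()) {
  base <- "https://github.com/robertwwalker/rww-science/raw/master/content/R/COVID/data/OregonCOVID"
  paste0(base, format(as.Date(date), "%Y-%m-%d"), ".RData")
}

# Pinning the date gives the same URL no matter when the code is run.
snapshot_url <- oregon_covid_url("2020-03-24")
print(snapshot_url)

# Loading is left commented out because it needs network access and a file
# must exist for the chosen date:
# load(url(snapshot_url))
```

Calling `oregon_covid_url()` with no argument falls back to today's date, which matches the behavior of the original one-liner.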