November 9, 2022
Last update: November 15, 2022
In preparation for the dumpster fire that is Oregon election reporting, I previously posted on importing a directory of .csv files. At present, those files are all I can find to build this. What does the interface look like?
library(magick)
Img <- image_read("./img/SShot.png")
image_ggplot(Img)
This is terrible: there is a JavaScript button to download each file separately. Nevertheless, here we go.
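A minimal sketch of the directory import, assuming the downloaded files share a layout and land in a ./data/ folder (the path and the map_dfr pattern are my assumptions, not fixed by the page):
library(tidyverse)
library(fs)
# Read every .csv in the (assumed) ./data directory and stack the results,
# keeping each source file name as an id column. This presumes the files
# share a common column layout.
dir_ls("./data", glob = "*.csv") %>%
  set_names() %>%
  map_dfr(read_csv, .id = "source_file")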
Archigos
Archigos is an amazing collaboration that produced a comprehensive dataset of world leaders going pretty far back; see Archigos on the web. It is quite natural for thinking about leadership. In this post, I want to reshape the data into country-year and leader-year datasets and explore the basic confines of Archigos. I also want to use gganimate for a few things. So what do we know?
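A sketch of the leader-year reshape, assuming each leader spell carries start and end dates in columns named startdate and enddate (the variable names and date class are my assumptions):
library(tidyverse)
library(lubridate)
# Expand each leader spell into one row per calendar year in office.
Leader.Year <- Archigos %>%
  mutate(year = map2(year(startdate), year(enddate), seq)) %>%
  unnest(year)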
tidyquant
tidyquant automates a lot of equity research and calculation using tidy concepts. Here, I first use it to get the components of the S&P 500 and pick out those with index weights over 1.25 percent. In the next step, I download the price data and finally calculate daily returns and a cumulative wealth index.
library(tidyquant)
library(tidyverse)
tq_index("SP500") %>%
  filter(weight > 0.0125) %>%
  select(symbol, company) -> Tickers
# The source excerpt cuts off here; the completed filter below is an assumption.
# It appears to drop a non-equity line (such as the cash component) from the holdings.
Tickers <- Tickers %>% filter(symbol != "-")
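The download and return steps the paragraph describes might look like this (the start date is an arbitrary assumption):
# Download daily prices for the retained tickers.
Prices <- tq_get(Tickers$symbol, get = "stock.prices", from = "2015-01-01")
# Daily returns on the adjusted close, by ticker.
Returns <- Prices %>%
  group_by(symbol) %>%
  tq_transmute(select = adjusted, mutate_fun = periodReturn,
               period = "daily", col_rename = "daily.return")
# Cumulative wealth index: the growth of one dollar invested in each ticker.
Wealth <- Returns %>%
  group_by(symbol) %>%
  mutate(wealth.index = cumprod(1 + daily.return))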
Trump’s Tone
A cool post on sentiment analysis can be found here. I will now get at the time series characteristics of his tweets and the sentiment stuff. This plays off of a previous post.
I start by loading the tmls object that I created in the previous post; there I describe how to download and break up the timeline so that content is captured consistently over time.
Trump’s Overall Tweeting
What does it look like?
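A minimal sketch, assuming tmls is the saved rtweet timeline from the earlier post (the file path is hypothetical):
library(rtweet)
library(tidyverse)
load("./data/tmls.RData")  # hypothetical location of the saved object
# rtweet's ts_plot aggregates tweet counts over time and returns a ggplot.
ts_plot(tmls, by = "weeks") +
  labs(x = NULL, y = "Tweets per week")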
Mining Twitter Data
Mining Twitter data is rather easy. You have to arrange a developer account with Twitter and set up an app. After that, Twitter gives you a consumer key and secret and an access token and access secret. My tool of choice for this is rtweet because it automagically processes tweet elements and makes them easy to slice and dice. I also played with twitteR, but it was harder to work with for what I wanted.
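The one-time setup looks roughly like this in rtweet (pre-1.0 API; the app name and the credential strings are placeholders for your own values):
library(rtweet)
# Authenticate once; rtweet caches the token for future sessions.
token <- create_token(
  app             = "my_app_name",        # placeholder
  consumer_key    = "YOUR_CONSUMER_KEY",
  consumer_secret = "YOUR_CONSUMER_SECRET",
  access_token    = "YOUR_ACCESS_TOKEN",
  access_secret   = "YOUR_ACCESS_SECRET"
)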
This week’s tidyTuesday focuses on degrees and majors and their deployment in the labor market. The original data came from 538, along with a description of sources and measures. The tidyTuesday writeup is here.
library(tidyverse)
options(scipen=6)
library(extrafont)
font_import()
## Importing fonts may take a few minutes, depending on the number of fonts and the speed of the system.
## Continue? [y/n]
Major.Employment <- read.csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2018/2018-10-16/recent-grads.csv")
library(skimr)
skim(Major.Employment)
Table 1: Data summary (skim output for the Major.Employment data; the full summary table is omitted here).
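As one illustrative cut (my own example, not from the original writeup), median earnings by major category; Major_category and Median are columns in the 538 data:
# Median of the per-major median earnings, by category, as a sorted bar chart.
Major.Employment %>%
  group_by(Major_category) %>%
  summarise(Median.Earnings = median(Median, na.rm = TRUE)) %>%
  ggplot(aes(x = reorder(Major_category, Median.Earnings), y = Median.Earnings)) +
  geom_col() +
  coord_flip() +
  labs(x = "Major category", y = "Median earnings (USD)")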
The tidyTuesday for this week is coffee chain locations.
For this week:
1. The basic link to the #tidyTuesday repository shows an original article for Week 6.
First, let’s import the data; it is a single Excel spreadsheet. The page notes that Starbucks, Tim Hortons, and Dunkin' Donuts have raw data available.
library(readxl)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
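The import itself is one read_excel call; the workbook name below is my guess at the downloaded file, so adjust the path to match:
# Read the first sheet of the (assumed) downloaded workbook.
Coffee <- read_excel("./data/week6_coffee_chains.xlsx", sheet = 1)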