# Charts and Graphs

### Vision Zero

 NYC Traffic Casualties year casualties fatalities injuries 2009 47,438 225 47,213 2010 49,167 222 48,945 2011 46,237 211 46,026 2012 45,802 237 45,565 2013 46,585 260 46,325 2014 42,695 213 42,482 2015 43,766 207 43,559 2016 41,652 186 41,466 2017 37,007 170 36,837 source:   NYC Dept. of Transportation

In 2014, New York City set a goal of eliminating traffic fatalities and injuries. The Vision Zero initiative reduced the speed limit throughout the city from 35 to 25 miles per hour and changed traffic rules on many roads and many intersections.

We want to test the null hypothesis that Vision Zero had no effect on traffic fatalities and injuries. Since the initiative began at a fixed point in time, one way to explore the question is to compare the two time periods.

As shown in the barplot and the table, the number of "fatalities" and "injuries" has dropped dramatically since the Vision Zero initiative began in 2014.

Between 2013 and 2017, the number of injuries decreased 20 percent and the number of fatalities decreased 35 percent. (I have defined "casualties" as the sum of "fatalities" and "injuries," so it follows the downward trend in "injuries").

To create the table that allows us to make such comparisons, we can use the tapply and cbind functions.

Noting that we want to sum across years, we set the FUN argument to sum, we set the INDEX argument to nycdot\$year and we sum over nycdot\$Casualties, nycdot\$Fatalities and nycdot\$Injuries:

## make Casualties, Fatalities and Injuries by Year
cas_table <- tapply( X = nycdot\$Casualties , INDEX = nycdot\$year , FUN = sum )
fat_table <- tapply( X = nycdot\$Fatalities , INDEX = nycdot\$year , FUN = sum )
inj_table <- tapply( X = nycdot\$Injuries  , INDEX = nycdot\$year , FUN = sum )

Then we cbind the individual tables together, fix the column names and print the result:

## bind them into one table
cfi_table <- cbind( cas_table , fat_table , inj_table )
colnames( cfi_table ) <- c("Casualties","Fatalities", "Injuries")
print( cfi_table )

And we can use the barplot function to plot the trend over the years:

## set colors and text for barplot
colorseq <- c( rep("blue",5) , rep("orange",4) )
legendVZ <- c("before Vision Zero","after Vision Zero")

## create the barplot
barplot( cas_table , main = "NYC Traffic Casualities", col = colorseq )
legend( "topright" , legend = legendVZ , fill = c("blue","orange") )

More detailed usage of these functions can be found in the R script and R library that I wrote for this analysis.

Another way of exploring the question is to compare distributions. In our discussion of cross-tabulations, we will show that the distribution of average monthly casualties has shifted towards zero. In other words, the percentage of intersections with fewer than 0.05 average monthly casualties has grown since the Vision Zero initiative began.

### Minimum Wage

In recent years, some US states and cities have moved toward raising their minimum wage to \$15 per hour. Others have kept their minimum wage at the federal level of \$7.25 per hour.

We want to know how the minimum wage affects the opportunities available to workers. In our discussion of regression, we will formally test the null hypothesis of zero correlation between state minimum wage rates and employment rates after controlling other factors.

The time-series graphs below suggest that there is a small, positive correlation. Employment rates in states with "high" minimum wages are about the same as those in "medium" minimum wage states, but employment rates are much lower in "low" minimum wage states.

To create time-series comparisons like these, we first use the weighted.mean function to create a series of tables for each group of states. But to prevent the largest four states -- California, Texas, Florida and New York -- from dominating the weighted average, I have analyzed them separately.

## create table of employment rates over time in "high" minimum wage states
HiMinWageTbl <- data.frame( year = c(2001:2016) , emp_rate = rep(NA,16) )
row.names( HiMinWageTbl ) <- paste(c(2001:2016))

for (yr in c(2001:2016) ) {
GrpAndYear <- which( dta\$state %in% ListHiMinWage & dta\$year == yr )
HiMinWageTbl[paste(yr), "emp_rate"] <- weighted.mean(
x = dta[GrpAndYear,"emp_rate"] , w = dta[GrpAndYear,"civpopMil"])
}

To make it easy to create a large number of time-series graphs, it's helpful to write a few of functions:

## functions to make plotting easy
easyplot <- function( xx , yy , pch , main , ylim , ylab , yseq ) {
plot( xx , yy , type = "b" , axes = FALSE , pch = pch , main = main ,
xlim = c(2000,2016) , xlab = "" , ylim = ylim , ylab = ylab )
axis( 1 , at = seq(2000,2016,2) )
axis( 2 , at = yseq , pos = 2000 )
}
easypoints <- function( xx , yy , pch ) {
points( xx , yy , type = "b" , pch = pch )
}

Then we plot each group of states over time:

## plot the "high" minimum wage states
easyplot( xx = HiMinWageTbl\$year , yy = HiMinWageTbl\$emp_rate,
pch="H" , main = "Employment Rate by Minimum Wage",
ylab = "Employment Rate" , ylim = c(56,66) , yseq = seq(56,66,2) )

## add the "medium" and "low" minimum wage states
easypoints( xx = MdMinWageTbl\$year , yy = MdMinWageTbl\$emp_rate , pch="M")
easypoints( xx = LoMinWageTbl\$year , yy = LoMinWageTbl\$emp_rate , pch="L")

## add legend to the plot
legend("bottomleft", lwd=c(1,1,1), inset=0.075, pch=c("H","M","L"),
legend = c("\"high\" min. wage states", "\"medium\" min. wage states",
"\"low\" min. wage states"))

More detailed examples can be found in the R script that I wrote for this analysis.

In our discussion of regression, we will show that the positive correlation between state minimum wage rates and employment rates remains even after controlling for the effect of other variables, such as inflation, average annual pay and state and year fixed effects.