“In February 2021, the state of Texas suffered a major power crisis, which came about as a result of three severe winter storms sweeping across the United States on February 10–11, 13–17, and 15–20.”1 For more background, check out these engineering and political perspectives.
For this assignment, you are tasked with:
- estimating the number of homes in Houston that lost power as a result
of the first two storms
- investigating if socioeconomic factors are predictors of communities
recovery from a power outage
Your analysis will be based on remotely-sensed night lights data, acquired from the Visible Infrared Imaging Radiometer Suite (VIIRS) onboard the Suomi satellite. In particular, you will use the VNP46A1 to detect differences in night lights before and after the storm to identify areas that lost electric power.
To determine the number of homes that lost power, you link (spatially join) these areas with OpenStreetMap data on buildings and roads.
To investigate potential socioeconomic factors that influenced recovery, you will link your analysis with data from the US Census Bureau.
Use NASA’s Worldview to explore the data around the day of the storm. There are several days with too much cloud cover to be useful, but 2021-02-07 and 2021-02-16 provide two clear, contrasting images to visualize the extent of the power outage in Texas.
VIIRS data is distributed through NASA’s Level-1 and Atmospheric Archive & Distribution System Distributed Active Archive Center (LAADS DAAC). Many NASA Earth data products are distributed in 10x10 degree tiles in sinusoidal equal-area projection. Tiles are identified by their horizontal and vertical position in the grid. Houston lies on the border of tiles h08v05 and h08v06. We therefore need to download two tiles per date.
As you’re learning in EDS 220, accessing, downloading, and preparing
remote sensing data is a skill in it’s own right! To prevent this
assignment from being a large data wrangling challenge, we have
downloaded and prepped the following files for you to work with, stored
in the VNP46A1
folder.
VNP46A1.A2021038.h08v05.001.2021039064328.h5.tif
: tile
h08v05, collected on 2021-02-07VNP46A1.A2021038.h08v06.001.2021039064329.h5.tif
: tile
h08v06, collected on 2021-02-07VNP46A1.A2021047.h08v05.001.2021048091106.h5.tif
: tile
h08v05, collected on 2021-02-16VNP46A1.A2021047.h08v06.001.2021048091105.h5.tif
: tile
h08v06, collected on 2021-02-16Typically highways account for a large portion of the night lights observable from space (see Google’s Earth at Night). To minimize falsely identifying areas with reduced traffic as areas without power, we will ignore areas near highways.
OpenStreetMap (OSM)
is a collaborative project which creates publicly available geographic
data of the world. Ingesting this data into a database where it can be
subsetted and processed is a large undertaking. Fortunately, third party
companies redistribute OSM data. We used Geofabrik’s download sites to
retrieve a shapefile of all highways in Texas and prepared a Geopackage
(.gpkg
file) containing just the subset of roads that
intersect the Houston metropolitan area.
gis_osm_roads_free_1.gpkg
We can also obtain building data from OpenStreetMap. We again
downloaded from Geofabrick and prepared a GeoPackage containing only
houses in the Houston metropolitan area.
gis_osm_buildings_a_free_1.gpkg
We cannot readily get socioeconomic information for every home, so
instead we obtained data from the U.S. Census Bureau’s
American Community Survey for census tracts in 2019. The
folder ACS_2019_5YR_TRACT_48.gdb
is an ArcGIS “file
geodatabase”, a multi-file proprietary format that’s roughly
analogous to a GeoPackage file.
You can use st_layers()
to explore the contents of the
geodatabase. Each layer contains a subset of the fields documents in the
ACS
metadata.
The geodatabase contains a layer holding the geometry information,
separate from the layers holding the ACS attributes. You have to combine
the geometry with the attributes to get a feature layer that
sf
can use.
Below is an outline of the steps you should consider taking to achieve the assignment tasks.
For improved computational efficiency and easier interoperability
with sf
, I recommend using the stars
package
for raster handling.
stars
object for each date
(2021-02-07 and 2021-02-16)st_mosaic
library(sf)
library(tidyverse)
library(terra)
library(here)
library(tmap)
library(stars)
rm(list = ls())
setwd(here())
day_night_band_day38_tile5 <- read_stars("data/VNP46A1/VNP46A1.A2021038.h08v05.001.2021039064328.tif")
day_night_band_day38_tile6 <- read_stars("data/VNP46A1/VNP46A1.A2021038.h08v06.001.2021039064329.tif")
day_night_band_day47_tile5 <- read_stars("data/VNP46A1/VNP46A1.A2021047.h08v05.001.2021048091106.tif")
day_night_band_day47_tile6 <- read_stars("data/VNP46A1/VNP46A1.A2021047.h08v06.001.2021048091105.tif")
feb_7_comb <- st_mosaic(day_night_band_day38_tile5, day_night_band_day38_tile6)
feb_16_comb <- st_mosaic(day_night_band_day47_tile5, day_night_band_day47_tile6)
NA
to all locations that experienced a drop of
less than 200 nW cm-2sr-1difference <- (feb_7_comb - feb_16_comb) > 200
difference[difference == FALSE] <- NA
st_as_sf()
to vectorize the blackout maskst_make_valid
vec_diff <- st_as_sf(difference)
vec_diff <- st_make_valid(vec_diff)
st_polygon
st_sfc()
and assign a CRSpt1 <- st_point(c(-96.5, 29))
pt2 <- st_point(c(-96.5, 30.5))
pt3 <- st_point(c(-94.5, 30.5))
pt4 <- st_point(c(-94.5, 29))
coords <- list(rbind(pt1, pt2, pt3, pt4, pt1))
polygon <- st_polygon(x = coords)
houston <- st_sfc(polygon, crs = "EPSG:4326")
cropped_diff <- vec_diff[houston, op = st_intersects]
cropped_diff <- st_transform(cropped_diff, crs = "EPSG:3083")
The roads geopackage includes data on roads other than highways.
However, we can avoid reading in data we don’t need by taking advantage
of st_read
’s ability to subset using a SQL query.
st_read
st_buffer
st_buffer
produces undissolved buffers, use
st_union
to dissolve themquery <- "SELECT * FROM gis_osm_roads_free_1 WHERE fclass='motorway'"
highways <- st_read("data/gis_osm_roads_free_1.gpkg", query = query)
query <- "SELECT * FROM gis_osm_roads_free_1 WHERE fclass='motorway'"
highways <- st_read("data/gis_osm_roads_free_1.gpkg",
query = query,
quiet = TRUE) %>%
st_transform(crs = "EPSG:3083")
highway_buffer <- st_buffer(highways, dist = 200)
highway_union <- st_union(highway_buffer)
blackout_nohighway <- st_difference(cropped_diff, highway_union)
st_read
and the following
SQL query to select only residential buildingsSELECT *
FROM gis_osm_buildings_a_free_1
WHERE (type IS NULL AND name IS NULL)
OR type in ('residential', 'apartments', 'house', 'static_caravan', 'detached')
query <- "SELECT * FROM gis_osm_buildings_a_free_1 WHERE (type IS NULL AND name IS NULL) OR type in ('residential', 'apartments', 'house', 'static_caravan', 'detached')"
buildings <- st_read("data/gis_osm_buildings_a_free_1.gpkg",
query = query,
quiet = TRUE) %>%
st_transform(crs = "EPSG:3083")
blackout_nohighway <- st_difference(cropped_diff, highway_union)
affected_buildings <- buildings[blackout_nohighway, op = st_intersects]
nrow(affected_buildings)
st_read()
to load the geodatabase layersACS_2019_5YR_TRACT_48_TEXAS
layerX19_INCOME
layerB19013e1
acs_geoms <- st_read("data/ACS_2019_5YR_TRACT_48_TEXAS.gdb",
layer = "ACS_2019_5YR_TRACT_48_TEXAS",
quiet = TRUE)
acs_income <- st_read("data/ACS_2019_5YR_TRACT_48_TEXAS.gdb",
layer = "X19_INCOME",
quiet = TRUE)
median_income <- acs_income[c("GEOID", "B19013e1")]
acs_data <- dplyr::left_join(acs_geoms,
median_income,
by = c("GEOID_Data" = "GEOID")) %>%
st_transform(crs = "EPSG:3083")
tracts_with_blackout_info <- st_join(acs_data,
affected_buildings,
left = TRUE) %>%
group_by(NAME) %>%
summarise(blackout = any(!is.na(osm_id)), median_income = first(B19013e1))
Wikipedia. 2021. “2021 Texas power crisis.” Last modified October 2, 2021. https://en.wikipedia.org/wiki/2021_Texas_power_crisis.↩︎