Welcome to the Introduction to GIS Workshop 2016 Edition!
First, please navigate to this page using the URL below:
http://sandbox.idre.ucla.edu/sandbox/introduction-to-gis-workshop-for-2016
And then download the Workshop tutorial files:
http://sandbox.idre.ucla.edu/Workshops/Workshop2016.zip
Supplemental Links
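ESRI’s Self-learning Tutorials: http://www.esri.com/training/main/training-catalog/course-recommendations
Social Explorer: http://www.socialexplorer.com/
Outline:
Acquiring data
*Optional: Getting data from Social Explorer
Editing data
Joining data
Geocoding
Exporting maps
Geoprocessing: Conducting spatial analysis*
Buffers
Clipping
Spatial Joins
*Time permitting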
Part I: Introduction to GIS and ESRI
The ESRI way of GIS
The first step in this tutorial is to understand that we are covering the basics of desktop GIS analysis using ESRI’s ArcGIS software suite. This is by no means an all-encompassing “entirety of GIS” tutorial, but rather a view of how GIS is used to build maps from ESRI’s perspective, limited by the functionality of the software covered.
The core function of the ESRI ArcGIS suite lies within two programs:
ArcCatalog – for managing GIS datasets
ArcMap – for mapping GIS datasets
[TBS_ALERT color=”info” heading=”What about OpenSource Alternatives?”]
QGIS is an alternative to ArcGIS that is free and openly available to the public on all major computing platforms. Despite its accessibility, QGIS has a steeper learning curve for those learning GIS for the first time. However, those seeking a free alternative to ArcGIS can apply the concepts learned in this workshop in that program.
For a comparison between QGIS and ArcGIS, see this external article: http://www.xyht.com/spatial-itgis/qgis-v-arcgis/
[/TBS_ALERT]
A little background: Geographical information in the U.S.A.
Demographic information in the USA is typically arranged in a hierarchical geography, from large to small. Starting from States, information is broken down into Counties or Metropolitan Statistical Areas (MSAs). Each of those comprises Census Places, which are similar to cities in their size and composition. The neighborhoods of each city are broken down into Census Tracts. Census Tracts are then subdivided further into Census Block Groups. Finally, Census Block Groups are composed of Census Blocks, but data is not usually published at this level because of privacy concerns.
In short, US geography is organized like this:
States → Counties / Metropolitan Statistical Areas → Census Places → Census Tract → Census Block Group → Census Block
Basics of Thematic Mapping
With geographical ideas in mind, it is finally time to map something! For this exercise, you are provided with a Workshop [simple_tooltip content=’A geodatabase is shown as a folder. ‘]geodatabase[/simple_tooltip], which is a collection of GIS datasets. A GIS dataset can be any of the following:
a vector layer – points, lines or polygons
a raster layer – an image, satellite imagery, elevation data
tabular data – Excel spreadsheet, CSV, etc.
[TBS_ALERT color=”info” heading=”Vector vs. Rasters”]Geographic data is stored either as vector data (as points, lines, or polygons) or raster data (as pixel grids).
Because of these differences in data storage, vector data is best suited for a human geography context (ex. urban planning, transportation forecasting, asset mapping), while raster data is best used for storing data on physical geography (ex. satellite imagery, elevation, watersheds, vegetation).
In ArcGIS, vector data is stored as individual .shp files (or feature classes within a geodatabase), while raster data is stored as .tiffs, .jpgs, or other image formats. [/TBS_ALERT]
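To make the vector/raster distinction concrete, here is a minimal sketch in plain Python (not ArcGIS code). All coordinates and cell values are made up for illustration, and the helper names are hypothetical.

```python
# Vector data: features described by coordinates (hypothetical values).
point = (-118.445, 34.069)                          # one (longitude, latitude) pair
line = [(-118.45, 34.06), (-118.44, 34.07)]         # an ordered list of vertices
polygon = [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]  # a closed ring of vertices

# Raster data: a grid of cell values, for example elevation in meters.
raster = [
    [120, 125, 130],
    [118, 122, 128],
]


def raster_value(grid, row, col):
    """Return the cell value stored at a row/column position."""
    return grid[row][col]


def ring_is_closed(ring):
    """A polygon ring is closed when its first and last vertices match."""
    return ring[0] == ring[-1]
```

The vector features carry their own coordinates, while a raster cell is located only by its row and column in the grid.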
In other words, our geodatabase contains one or multiple GIS datasets.
Download and [simple_tooltip content=’Extracting means using a program, such as 7zip to unzip files from a single file.’]extract[/simple_tooltip] Workshop.zip.
There are other files in the zip folder, such as an [simple_tooltip content=’MXD files are files that contain links to the map data and the overall map’]mxd[/simple_tooltip] file and some csv files.
Then locate Workshop.gdb, and put it in a project folder for this workshop. For this workshop, you will learn how to inspect the geodatabase data in ArcCatalog, then use ArcMap to create some maps.
Here is a look at our Workshop 2016 geodatabase:
Connecting a folder in ArcCatalog
Open up ArcCatalog and click the second button from the left, which is the “Connect Folder” button.
Navigate to the Folder where you extracted the “Workshop.zip” file and then select “OK”.
[TBS_ALERT color=”danger” heading=”Do not try to connect a file!”]If you try to connect a file, you will notice that the “OK” button is grayed out. Connecting folders only allows you to choose folders.[/TBS_ALERT]
View and Preview the data
After you’ve connected the folder, you can check Folder Connections and open the folder you connected.
Locate “Workshop.gdb” and double click it to view its contents.
Browse for CA_Boundary and click the “Preview” tab to view the shape of California.
Adding Layers
Now the time has come to fire up ArcMap and get to map making!
The first step for any GIS project is to have data (more on this later!). In order to add data to your project click on the “Add data” button:
Notice that the connected folder can now be selected and its datasets added. Also, if your map is feeling a bit empty, you can add basemaps by clicking the upside-down triangle next to the Add Data button. A basemap only provides reference information and nothing else.
[TBS_ALERT color=”info” heading=”ArcCatalog in ArcMap?”] You can also connect folders in ArcMap by clicking a button, but we didn’t do so because we wanted to demo ArcCatalog. You can even access ArcCatalog in ArcMap, but the view is rather constrained, so we opted to demo the standalone program.[/TBS_ALERT]
Vector layers are also referred to as “feature classes” in ESRILand. All GIS datasets can be added in this same way. Now drag each layer and re-order them. If you are familiar with Adobe Photoshop or Illustrator, you will recognize conceptual similarities with layering. What happens when layers are re-ordered? How does this dictate your strategy on building a single flattened map with multiple layers?
Attributes
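Every layer (feature class) comes with attributes. This is the all-important “information” part of geographic “information” systems mapping. Data in the attribute tables dictates what can get mapped. Open the attribute table of each layer, and study how each row and column is tied to the mapped element. Questions we will answer include:
What is the unique identifier for each row?
What other attributes exist?
What happens when you select a row on the attribute table?
How do you sort elements?
Can you build custom queries?
Can you build graphs?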
Symbolization
Outlines, fills, colors, weight, action! Here is the design phase of creating a map. Consider color choices: grayscale? color schemes? color hierarchy? Inevitably, you will find yourselves in the throes of ESRI’s symbolization quagmire… That said, experiment with two types of symbolization with the workshop data:
Categories -> Unique values
Quantities -> Graduated colors
Labeling
Map elements need labels at times. Consider what needs to be labeled, and what does not. Label sizes, fonts, weights, placement, colors are all things to consider for your map. Understand the relationship between labels, attributes, and layers.
Choropleth Maps
For this section, we will focus on creating a choropleth (which just means a colored map based on numerical data)!
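As a rough illustration of what “colored based on numerical data” means, the sketch below sorts values into equal-interval classes, which is one common way to group numbers before assigning a color to each class. It is plain Python rather than ArcMap code, the percentages are hypothetical, and ArcMap offers several other classification methods as well.

```python
def equal_interval_breaks(values, n_classes):
    """Return the upper bound of each class for an equal-interval scheme."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes + 1)]


def classify(value, breaks):
    """Return the index of the first class whose upper bound covers value."""
    for index, upper in enumerate(breaks):
        if value <= upper:
            return index
    return len(breaks) - 1  # guard against floating-point rounding at the top


# Hypothetical attribute values, e.g. a percentage per census tract.
percent_values = [0.0, 12.5, 40.0, 55.0, 100.0]
breaks = equal_interval_breaks(percent_values, 4)
classes = [classify(v, breaks) for v in percent_values]
```

Each class index would then be mapped to one shade of the color ramp, with higher classes drawn darker.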
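When creating a choropleth, the following needs to be considered:
Is the data choropleth-able?
Choropleths work best when representing data where boundaries are important
Conversely, choropleths do not work well when attempting to show data where boundaries are NOT important/irrelevant
Do you have the data at the geographic scale you wish to map it at?
Can you connect the data to an existing layer?
Which coloring style best represents your data?
If your information is continuous, then use a single color gradient
If your information has a positive and negative range, use a scheme with two opposing colors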
Data, data, data!
Let’s talk about data manipulation in ArcMap, which is one of the core functions of any GIS program. Within ArcMap, “joining” or “connecting” data is a fundamental task for working with data from different sources. There are two basic “joining” methods available:
Joining – connecting an external data source to a GIS dataset
Spatial Join – connecting data based on geography
This workshop will focus on the first “joining” method, which is more applicable to non-geographic datasets, such as Excel spreadsheets, CSVs, and other data tables.
Acquiring data
A) Non-geographic spatial data for GIS analysis can be sourced from different formats, such as:
B) Social Explorer (http://www.socialexplorer.com/) is a website that enables access to Census Data.
Optional: Social Explorer Tutorial for getting data
Regardless of where the data comes from, the key is that it must have a column that can link the non-geographic data to a spatial dataset, such as States, Countries, FIPS Codes, or Zipcodes.
Editing data
A table will only join if its “key” field has exactly the same formatting and values as the key field in the table it connects to. If the key field is formatted differently, the records will not match.
To ensure this formatting, and to introduce a new concept, you can “Edit” data in ArcMap.
If you already have data loaded into ArcMap, you can edit it using either the “Editor” or the “Field Calculator.” Whenever you decide to edit data, you typically want to add a new field so that you do not accidentally modify existing ones. To add a new field, open up a table and then click on “Add Field…”
[TBS_ALERT color=”danger” heading=”Can I edit my excel tables?”]No, unfortunately, you cannot edit Excel spreadsheets, CSVs, and other data tables imported into ArcMap, only GIS datasets! Make edits to your external data outside of ArcMap beforehand![/TBS_ALERT]
Afterwards you can specify the type of field, some of which are defined in the info box below:
[TBS_ALERT color=”info” heading=”Data Types”]
Short or Long Integers – Numbers with no decimals [ex. 12]
Float or Double – Numbers with decimals [ex. 12.01]
String – Text (any combination of letters and numbers) [ex. Twelve and one hundredth]
[/TBS_ALERT]
A) The Editor allows you to type directly onto the fields to change any values, and is useful when you are creating your data from scratch.
For example: If you have data based on Zipcodes, you add a new field for number of enrolled students, and simply type the number in the field when you select the Zipcode.
B) The “Field Calculator” is used for running calculations and/or operations on the current data.
Right click on a field name in order to access the Field Calculator
Use a formula in order to calculate what the field should be
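A typical formula is one that rebuilds a key as text so that it matches the other table. The sketch below is plain Python and only illustrates the logic. The function name is hypothetical, and it assumes a numeric county code that lost its leading zero when it was stored as a number.

```python
def pad_fips(value, width=5):
    """Convert a numeric code to zero-padded text, e.g. 6037 becomes '06037'."""
    return str(int(value)).zfill(width)


# Hypothetical values as they might appear in a numeric column.
county_codes = [6037, 6059.0, "6111"]
padded = [pad_fips(code) for code in county_codes]
```

If the Python parser is selected in the Field Calculator, a similar function could probably be pasted into the code block and called on a field, but treat that as an assumption to check against your ArcMap version.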
Joining Data
When you have data with geographic IDs, such as a Zipcode or a FIPS code, you are able to add the table to ArcGIS and then join that to the corresponding geography/GIS file.
[TBS_ALERT color=”info” heading=”What the FIPS?”]A Federal Information Processing Standards (FIPS) code is what you will encounter when working with data from the US Census. A census tract identifier has the following format:
[STATE] + [COUNTY] + [CENSUS TRACT]
For example:
06 + 037 + 265301 or 06037265301, which is UCLA’s census tract (tract 2653.01). A block group identifier appends one more digit for the [CENSUS BLOCK GROUP]. [/TBS_ALERT]
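A census tract identifier is 11 digits long: 2 for the state, 3 for the county, and 6 for the tract (so tract 2653.01 is written 265301). As a small sketch in plain Python, with a hypothetical function name, the pieces can be pulled apart by position:

```python
def split_tract_geoid(geoid):
    """Split an 11-character census tract identifier into its parts."""
    if len(geoid) != 11 or not geoid.isdigit():
        raise ValueError("expected 11 digits, got %r" % (geoid,))
    return {
        "state": geoid[:2],    # 2-digit state code
        "county": geoid[2:5],  # 3-digit county code
        "tract": geoid[5:],    # 6-digit tract code
    }


parts = split_tract_geoid("06037265301")
```

Note that the identifier is handled as text throughout. Stored as a number, the leading zero of the state code would be lost.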
There are three steps to joining data, followed by a check of the result:
1. Clean up the data in the spreadsheet and make sure that the key fields are the same type in both the origin table and the destination GIS file. For example, an Integer field will not join to a String field!
2. Right click on the layer that you wish to join the data to, and then click on “Joins and Relates”, then “Join…”
3. Select the field that you are joining to in the destination GIS file, then locate the spreadsheet that you have prepared for the join, and choose the matching field in it. You can then click “OK” to complete the join!
Congratulations! You have completed your first join!
4. Now when you navigate to the layer table, you will see that the spreadsheet data was appended to the corresponding layer!
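Conceptually, the join looks up each row of the layer in the spreadsheet by its key and appends the matching columns. Here is a minimal sketch of that idea in plain Python. The field names, identifiers, and income figure are hypothetical, and this is not how ArcMap implements joins internally.

```python
tracts = [  # rows of the layer's attribute table (hypothetical values)
    {"GEOID": "06037265301", "NAME": "Tract A"},
    {"GEOID": "06037265100", "NAME": "Tract B"},
]
spreadsheet = [  # rows of the external table (hypothetical values)
    {"GEOID": "06037265301", "MED_INCOME": 52000},
]


def attribute_join(layer_rows, table_rows, key):
    """Append the table's columns to each layer row that shares the same key."""
    lookup = {row[key]: row for row in table_rows}
    extra_fields = [field for field in table_rows[0] if field != key]
    joined = []
    for row in layer_rows:
        match = lookup.get(row[key])
        merged = dict(row)
        for field in extra_fields:
            # Rows without a match get an empty value for the joined fields.
            merged[field] = match[field] if match else None
        joined.append(merged)
    return joined


joined = attribute_join(tracts, spreadsheet, "GEOID")

# The same data with an integer key: nothing matches, because 6037265301
# (a number) is not equal to "06037265301" (text).
mismatched = attribute_join(
    tracts, [{"GEOID": 6037265301, "MED_INCOME": 52000}], "GEOID"
)
```

The second call shows why step 1 matters: the values look alike to a person, but a number and a text string never compare as equal.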
Saving the Join
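In order to make the “join” permanent, you can save a new dataset by exporting the data.
Right click on the geographic dataset that the non-geographic data was joined to, in this example LA_CensusTracts
Go to “Data” then “Export Data…”
Select a location and name for the new shapefile (or Feature Class if you are saving the file back into the Geodatabase)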
Utilizing the Join
Now that we have the join up and running, we will utilize what we previously learned to attempt the following challenges:
Challenge 1: Create a map that highlights percent minority populations
Challenge 2: Which census tract has the lowest percent of high school graduates?
Challenge 3: Can you join the store and obesity tables to census tracts, and find out which store resides in the census tract with the lowest median income (out of the 7 stores)?
Geocoding
Geocoding is the process of assigning a latitude and longitude to addresses so that they can be used in a spatial context. Unfortunately, ESRI now charges for geocoding, which makes it quite costly to access this service.
There are less accurate but free alternatives online, including one which the Sandbox has developed itself:
http://sandbox.idre.ucla.edu/tools/geocoder/
For our (inaccurate) geocoder, all you have to do is enter a few addresses, then copy and paste the results into an Excel file and save it.
Another useful geocoder is this:
http://www.findlatitudeandlongitude.com/batch-geocode/
The results from an external geocoder, such as the one above, need to be pasted into an Excel spreadsheet before adding them to ArcGIS.
Once a geocoding result is saved as a CSV or Excel file, it can be displayed in ArcGIS by going to File -> Add Data -> Add XY Data.
Finally, your data points will then load on to your map!
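Before adding a geocoded file as XY data, it can help to check that every row has usable coordinates. The sketch below does this in plain Python. The column names and sample rows are hypothetical, so adjust them to whatever your geocoder actually outputs. Remember that X is the longitude and Y is the latitude.

```python
import csv
import io

# Hypothetical geocoder output; the second row has an impossible latitude.
SAMPLE = """address,latitude,longitude
Example Address A,34.0689,-118.4452
Example Address B,95.0,-118.0
"""


def load_points(csv_text):
    """Return (valid points as (address, x, y), addresses that were rejected)."""
    valid, rejected = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lat, lon = float(row["latitude"]), float(row["longitude"])
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            valid.append((row["address"], lon, lat))  # X = longitude, Y = latitude
        else:
            rejected.append(row["address"])
    return valid, rejected


points, bad = load_points(SAMPLE)
```

Rows that end up in the rejected list are worth fixing in the spreadsheet first, since out-of-range coordinates will not land in a sensible place on the map.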
Exporting maps
To export a map, you go to File -> Export.
Extra Material – Geoprocessing: Conducting spatial analysis
In addition to editing and visualizing data, GIS can be used to create new data as well. Three geoprocessing functions will be covered, which is only the tip of the iceberg when it comes to the tools that ArcMap provides. Most geoprocessing tools can be found under “Geoprocessing,” aside from geocoding.
Geoprocessing Menu
The drop down for geoprocessing houses all the tools for accessing spatial analysis.
Buffers
A buffer is a zone drawn at a set distance around a specific point, line, or polygon (around a point, it is just a circle), which is helpful for seeing what phenomena are near which areas. Typically, buffer distances are given in linear units (kilometers, miles, etc.).
Select the buffer tool from the geoprocessing drop down, and then select the input as the layer which you want to draw the buffer around. Then specify an output directory/name and the linear distance (kilometers, miles, etc.).
An example of how the buffer options should be when filled out.
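The idea behind a point buffer can be sketched without any GIS software: a location falls inside the buffer when its distance from the center is no more than the buffer distance. The plain Python below uses the haversine formula for that distance. The coordinates are hypothetical, and unlike the Buffer tool it does not create a new polygon. It only tests membership.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


def haversine_miles(lon1, lat1, lon2, lat2):
    """Great-circle distance in miles between two longitude/latitude points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_buffer(center, candidate, radius_miles):
    """True when candidate lies inside the circular buffer around center."""
    return haversine_miles(*center, *candidate) <= radius_miles


center = (-118.4452, 34.0689)  # hypothetical (longitude, latitude)
near = (-118.4352, 34.0689)    # about half a mile to the east
far = (-118.3452, 34.0689)     # several miles to the east
```

Here `within_buffer(center, near, 1.0)` is true while `within_buffer(center, far, 1.0)` is false, which is the same question a 1 mile buffer answers visually on the map.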
Clip
A clip cuts the data in one layer to the shape of another, which is useful when you only want to know which features are located within a certain area. Combining buffers and the clip results in the map below, which shows the census tracts within 1 mile of the geocoded addresses!
For the clip options, the input features are the layer whose features remain (the cookie dough), while the clip features are the layer that defines the shape of the cut (the cookie cutter). Finally, specify an output feature class for your new file, and execute the clip.
An example of the clip options being filled out
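To illustrate the cookie cutter idea, the sketch below keeps only the points that fall inside a polygon, using the standard ray-casting test. It is plain Python with made-up coordinates. The real Clip tool also trims lines and polygons along the cutter's edge, which this simplified version does not attempt.

```python
def point_in_polygon(point, ring):
    """Ray-casting test: True when point lies inside the closed ring."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def clip_points(points, ring):
    """Keep only the points (the dough) that fall inside the ring (the cutter)."""
    return [p for p in points if point_in_polygon(p, ring)]


cutter = [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]  # hypothetical clip polygon
dough = [(1, 1), (5, 1), (2, 2.5), (-1, 2)]        # hypothetical input points
clipped = clip_points(dough, cutter)
```

Only the two points inside the rectangle survive, just as only the features inside the clip layer survive in ArcMap.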
Congratulations! You have completed this introductory GIS workshop. If you would like to check out other self-learning materials, please feel free to look at ESRI’s tutorials:
http://www.esri.com/training/main/training-catalog/course-recommendations
E-mail for questions: albertk[at]gmx.com