As introduced in the previous unit on Thematic Maps, the three ordinal-level thematic maps explained below are shown in the organizational table in that unit with the word "Classified" as part of their names. The classified graduated symbols map, classified flow map and classified choropleth map are so named because they require that, if the map theme is based on quantitative data, then the classification procedure (also previously demonstrated) must be considered part of the map's construction. There are data situations where that is not the case, so we will start with those.
The most common scenario in which data come to you already in ordinal form is one in which the source of the data, or a previous holder of the data, performed the classification procedure. In that scenario, you do not see the original data values; all you see are the class names. Even if the class names or some other documentation for the data shows you the upper and lower limits of each class, you do not know each observation's original quantitative data value. You are stuck with those classification results, and you do not even know whether they performed the procedure well. The best you can do is to make the clearest map possible using the data you have.
The second scenario is that the data came to you in ordinal categories because that was how they were collected in the first place. Consider a survey in which you are not asked to enter a number in response to a question, but to choose the ordinal category that applies to you. For a non-spatial example, “Do you make a) $0 to $19,999 per year, b) $20,000 to $49,999 per year, c) $50,000 to $149,999 per year, or d) $150,000 or more per year?” In that scenario no one ever had quantitative data. Another example, one that might be part of a spatial data set of the storm tracks of hurricanes (or equivalent cyclones or typhoons), would be an attribute table field recording each storm's maximum category. The possible values are "Tropical Depression," "Tropical Storm," "1," "2," "3," "4" and "5." The values are primarily based on wind speed and duration.
Most classified thematic maps are maps based on a quantitative data variable that has been run through the classification procedure. Classification, in this context, means that the original quantitative level of measurement variable has been replaced by a new ordinal level variable for mapping purposes.
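As a minimal sketch of what that replacement looks like, the Python below turns each quantitative value into an ordinal class number. The function name `classify`, the sample values and the class limits are all hypothetical and chosen only for illustration; the limits are treated as the inclusive upper limit of every class except the last.

```python
from bisect import bisect_left

def classify(value, upper_limits):
    """Return the 1-based ordinal class for a quantitative value.

    upper_limits lists, in ascending order, the inclusive upper limit of
    every class except the highest one.
    """
    return bisect_left(upper_limits, value) + 1

# Hypothetical data values and class limits, for illustration only.
upper_limits = [500, 2000, 10000]
values = [120, 500, 501, 9800, 25000]
classes = [classify(v, upper_limits) for v in values]
print(classes)
```

After this step only the class numbers are used for mapping; the original values are needed again only to label the legend.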
I should note that ArcGIS does have a built-in ability to graph the data values and does give you a default "natural breaks" classification, but neither meets the guidelines I have set. The Jenks natural breaks classification algorithm used in ArcGIS struggles with irregularly spaced outliers in the data variable, frequently creating classes toward the middle of the distribution that have too many observations. This would be easier to see and work around if ArcGIS displayed the observations as I have required, with a number line graph where the data values are plotted as dots. Theirs is sort of a histogram with very small, evenly spaced classes. The narrow vertical bars of varying heights are a poor way to represent the data values compared to Excel's XY scatter graph.
In ArcGIS we have two ways to create classified thematic maps. The first is to interact with the data variable outside of ArcGIS, say in Excel, where we run it through the graphing and natural breaks classification process. The final step within Excel would be to add a new data variable (column) that contains just the class values such as "Class 1," "Class 2," etc. You would have to keep a record of your classification decisions so that you can customize the legend with data value ranges instead of the class names.
The second, easier procedure is to work out the classification in Excel, as previously taught, to decide on the class limits. Then, use ArcGIS's built-in abilities to take the quantitative variable and create the appropriate thematic map: choropleth, which ArcGIS calls "graduated colors," or graduated symbols. In both cases ArcMap's dialog includes a listing of the default number of classes and the class limits of each class. Those parameters are not likely to match your own research, so be sure to replace the class limits determined by ArcGIS with yours. It may be tempting to trust ArcGIS to get it right, but we have to keep in mind that ArcGIS's algorithms are based on very general assumptions covering a wide variety of scenarios, while yours are based on close familiarity with your dataset.
A future session will take you through the process of taking your ArcMap "Data View" map into the "Layout View" dialog and adding the map legend and other elements to the map.
It is worth noting here, however, that classified maps have a unique distinguishing feature. The legends of classified thematic maps will show all the different symbols used on each map, whether they are colors, shapes or lines. Next to each symbol in the legend for such a map will be the data range represented by that symbol color or size. The key thing to notice is that they are indeed ranges of number values, such as "1 - 500" and not single values such as "500."
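The range labels can be generated from the record of class limits you kept. This is a sketch only, assuming whole-number data and inclusive upper limits; the function name `legend_labels` and the numbers are hypothetical.

```python
def legend_labels(minimum, upper_limits, maximum):
    """Build one 'low - high' legend label per class.

    Assumes whole-number data, so each class starts one above the
    previous class's inclusive upper limit.
    """
    labels = []
    low = minimum
    for high in list(upper_limits) + [maximum]:
        labels.append(f"{low:,} - {high:,}")
        low = high + 1
    return labels

# Hypothetical class limits, for illustration only.
for label in legend_labels(1, [500, 2000], 7500):
    print(label)
```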
In a choropleth map, areas are colored according to their data values. Usually the data are divided into classes, with one color representing all the geographic features in each data class. The color scheme is selected so that the lightest color (or shade of gray, or pattern of lines or dots) represents the lowest data values and the darkest or brightest color represents the highest data values. Between those extremes the cartographer makes the colors, shades or patterns increase in intensity gradually. The trickiest requirement of a choropleth map, one that is often violated in published maps, is that the data should represent a ratio or density, not a simple count. For example, to show US population distribution, do not map the variable that contains each state's population, but map the variable in which you have calculated each state's population density.
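The difference between a count and a density is easy to see with a small calculation. The two areas and their figures below are made up for illustration and are not real state data.

```python
# Made-up areas and figures, for illustration only (not real state data).
areas = {
    "Area A": {"population": 1_000_000, "sq_km": 50_000},
    "Area B": {"population": 800_000, "sq_km": 4_000},
}

# Density = count divided by land area; this is the variable to map.
density = {
    name: record["population"] / record["sq_km"]
    for name, record in areas.items()
}

for name, people_per_sq_km in density.items():
    print(name, people_per_sq_km)
```

Area A has the larger population but the lower density, so a choropleth map of raw counts would shade the two areas in the opposite order from a map of densities.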
The two maps below are, respectively, bad and good examples of choropleth maps of the same data. The first map is bad because of a poor choice in its color scheme, and also because of strange alignments of Alaska and Hawaii (US Dept. of Transportation). I strongly suspect that its cartographer has chosen its data ranges for each color poorly, as well. When data classes on a map have strictly equal data ranges, I am immediately suspicious that the cartographer who created the map did not start by graphing the data and looking for natural breaks in the data to serve as color change values. At least this one does not make that error.
My re-creation of the map corrects the color flaws and the depiction of Alaska and Hawaii. However, I did not have their original data in order to look for the best data ranges. This is an example of the scenario laid out above of receiving ordinal data already classified.
The first step in creating any choropleth map is to examine and analyze the data variable to be mapped. The purpose is to decide on the best number of classes and the best class breaks. As explained previously, the best graphical way to do this is to graph the data using a number line graph, either manually (e.g., on graph paper) or in Excel.
Graphing in Excel is practical because one of the shapefile's component files can be opened in Excel. However, in order to maintain the larger shapefile's integrity, this is only practical if you work with a copy of the file in question. In your file browser program, locate the group of files that make up the shapefile you want to work with. Make a copy of the file whose filename extension is DBF (that is, *.dbf, where "*" is the name of the larger shapefile collection). Save it with a new name but do not change the ".dbf" filename extension.
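If you would rather script the copy than use the file browser, a sketch along these lines would do it. The function name `copy_dbf` and the example file names are hypothetical; the original DBF file is left untouched.

```python
import shutil
from pathlib import Path

def copy_dbf(dbf_path, new_name):
    """Copy a shapefile's .dbf table to a new name in the same folder.

    new_name is given without an extension; the copy keeps ".dbf".
    """
    source = Path(dbf_path)
    if source.suffix.lower() != ".dbf":
        raise ValueError("expected a file with the .dbf extension")
    target = source.with_name(new_name + source.suffix)
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    shutil.copy2(source, target)  # the original file is not modified
    return target

# Hypothetical usage:
# copy_dbf("data/states.dbf", "states_for_excel")
```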
The only way now to open the DBF file in Excel is to open Excel first, and then drag the new DBF file into Excel. Next, use Excel's Save As command to make a new XLSX copy of the file.
Finally, once you have chosen a variable, follow the instructions in the previous topic to create the number line graph. Then, use Excel's tools for drawing lines and other shapes to add vertical lines to the graph to show the positions of the class breaks. Write down those class break positions by finding a rounded number that falls within each gap where you placed a break line.
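The visual search for gaps can be cross-checked with a few lines of Python. This is a simple largest-gap heuristic, not the Jenks algorithm and not a substitute for looking at the graph; the function name `largest_gap_breaks` and the sample values are hypothetical.

```python
def largest_gap_breaks(values, n_classes):
    """Suggest n_classes - 1 break positions at the midpoints of the
    widest gaps between neighboring sorted values."""
    ordered = sorted(values)
    gaps = [(ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)]
    widest = sorted(gaps, reverse=True)[: n_classes - 1]
    return sorted((ordered[i] + ordered[i + 1]) / 2 for _, i in widest)

# Hypothetical data values, for illustration only.
values = [3, 5, 6, 20, 22, 25, 60, 64]
print(largest_gap_breaks(values, 3))
```

The midpoints it returns are only candidates. You would still round each one to a convenient number that stays inside its gap, and reject any break that leaves a class with too few or too many observations.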
The choropleth map is the thematic map type that is most commonly encountered, and with good reason. Color is a very effective way to display differences, as we also saw for topographic symbolization. Where appropriate, as also shown in the topographic symbology discussion, it is effective to use color schemes that incorporate color associations. For example, a choropleth map displaying money-related data can be shown in shades of green because that is a standard color in American currency. Green also works for agricultural productivity maps for crops, while shades of red might be more appropriate for agricultural meat production data.
Choropleth map color schemes are based on the concept of the "color ramp." The color ramp takes a range of colors from a light shade at one end to a dark or bright shade at the other end. The lightest shade should be associated with the lowest data values while the darkest or brightest shade should represent the highest data values. The specific colors chosen will depend on the number of classes in your classification. This is where the rule of 4-6 classes for choropleth maps becomes important: with too many classes, the colors in adjacent classes become too difficult to distinguish. At the point in the map design process in ArcMap where you choose your color ramp (step 7 below), you will have many options to choose from.
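To illustrate the idea of a single-color ramp, the sketch below steps evenly from a light shade to a dark shade of the same color. It is not how ArcMap builds its ramps; the function name `color_ramp` and the two green end colors are assumptions chosen for the example, and straight-line blending of RGB values is only a rough approximation of even visual steps.

```python
def color_ramp(light, dark, n_classes):
    """Return n_classes hex colors stepping evenly from light to dark.

    light and dark are (red, green, blue) tuples with values 0-255.
    """
    if n_classes < 2:
        raise ValueError("a ramp needs at least two classes")
    ramp = []
    for step in range(n_classes):
        t = step / (n_classes - 1)  # 0.0 for lowest class, 1.0 for highest
        rgb = [round(lo + (hi - lo) * t) for lo, hi in zip(light, dark)]
        ramp.append("#{:02x}{:02x}{:02x}".format(*rgb))
    return ramp

# Assumed light and dark greens for a five-class map.
for color in color_ramp((237, 248, 233), (0, 109, 44), 5):
    print(color)
```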
There are exceptions and special considerations for these "color rules." You will notice, when you activate the Color Ramp drop-down list in ArcMap, that many of the choices are not 'light to dark for a single color.' Keep in mind that ArcMap uses the same dialog/list to provide color schemes for topographic mapping, so many of the options are not color "ramps" at all, but randomly sequenced colors; avoid those if you are constructing a choropleth map. Other patterns frequently found on the Color Ramp display are ramps that blend from one color into another, such as shades of yellow to shades of red. These really are inappropriate for basic choropleth maps because they imply that the data are not only changing value but they are also changing in meaning. A simple example would be a color ramp that blends from reds for lower values to greens for higher values; the context for this might be financial data that ranges from negative values representing money owed to positive values representing money earned or profits. Again, be very selective and cautious when using such color ramps.
Use those class breaks again, as follows, to create the choropleth map. Note again that ArcGIS calls this a Graduated Colors map.
Graduated symbols are usually circles, but can be many other shapes. The basic concept is that the symbols increase in size as the data increase in value. Ideally the symbols are representative of the features or the data variable being represented on the map, but that is not always easy to achieve. The circle is considered the most generic symbol and has the added benefit of being the most compact symbol. If two circles are located close to each other or overlap each other on the map, it is easier to tell them apart, based on their shapes, than if they were squares or other shapes.
Graduated circles are unique in that they can represent either point features such as cities or area features such as the states. ArcMap will make the graduated symbols dialog available for either of those types of layer. The advantage of circles for point locations is that they imitate the smaller dots that would be a common topographic symbol for them, with the center of the circles representing the actual place location. With area features it is more desirable that each circle be clearly identifiable with its area feature, even staying inside the area it represents. However, the latter is often not possible. The way the ArcMap software positions the graduated symbol for area features is to calculate the point, known as the centroid, at the center of the shape of the area.
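For readers curious about what a centroid calculation involves, here is a sketch of the standard area-weighted centroid formula for a simple polygon. Whether ArcMap uses exactly this formula is an assumption on my part; the function name `polygon_centroid` and the two test shapes are hypothetical.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon.

    vertices is a list of (x, y) pairs in order around the outline.
    """
    area2 = 0.0  # twice the signed area of the polygon
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the outline
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("polygon has zero area")
    return cx / (3 * area2), cy / (3 * area2)

rectangle = [(0, 0), (4, 0), (4, 2), (0, 2)]
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]
print(polygon_centroid(rectangle))
print(polygon_centroid(l_shape))
```

The rectangle's centroid sits at its middle, but the L-shaped area's centroid falls in the empty corner outside the shape, which is one reason a symbol cannot always stay inside the area it represents.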
One challenge of graduated symbols is that differences in size are more difficult to perceive than differences in color. Because we are asking map viewers to notice differences in size and compare the size differences between two symbols near or far from each other on the map (including the legend), we should not create as many different classes as we would for a choropleth map (see below).
Other shapes are possible and easily found in the Symbology dialog in ArcMap; hexagons, squares and triangles are other easily resizable symbols. ArcMap has also been increasing the availability of pictorial symbols, all of which are also able to be resized. The challenge with these other shapes is that size differences in them can be more difficult to distinguish, so they have to be exaggerated a bit more than simpler symbols.
Probably the most subtle, but important, aspect of deciding when to use the graduated symbols representation is making sure you have the correct data, especially when you are working with the quantitative data field in ArcMap and not with previously classified data. The data for determining symbol sizes on graduated symbols maps should be a count data variable, and not a density or ratio or similar variable.
Occasionally you will find a graduated symbols map in which the symbol sizes are directly proportional to the data values: for a map of the 50 states there can be up to 50 minutely different circle sizes, each one proportional to the number it represents. These maps are called Proportional Symbols maps and will be treated in a coming session where we deal with thematic maps for quantitative-level data.
The discussion about classification in this context is largely the same as it was for choropleth maps, and even more similar to the discussion below about flow maps. As with the choropleth map, the data are most often divided into classes, with one symbol size representing all of the features in that class. The concept of creating classes that reflect natural clusters and gaps in the data field is just as relevant in the classification of symbol sizes.
In fact, deciding on the classes for a graduated symbols map can be more challenging, given that map readers have a harder time distinguishing size differences than they do distinguishing color differences. That is the reason why fewer classes, three to five, are recommended for classified graduated symbols maps, where four to six classes were considered optimal for a choropleth map. With fewer classes, the question of how to treat outliers becomes more critical. The objective should still be to not let the number of features in one class dominate the other classes.
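One quick way to check whether a class dominates is to count the features that fall into each class before committing to the breaks. The sketch below assumes inclusive upper limits; the function name `class_counts` and the sample values are hypothetical.

```python
from bisect import bisect_left
from collections import Counter

def class_counts(values, upper_limits):
    """Count how many values fall in each class, lowest class first.

    upper_limits lists the inclusive upper limit of every class except
    the highest one, in ascending order.
    """
    counts = Counter(bisect_left(upper_limits, v) + 1 for v in values)
    return [counts.get(c, 0) for c in range(1, len(upper_limits) + 2)]

# Hypothetical data values and breaks, for illustration only.
values = [3, 5, 6, 20, 22, 25, 60, 64]
print(class_counts(values, [13, 42]))
```

If one count is far larger than the rest, that is a signal to revisit the breaks or to reconsider how the outliers are being treated.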
The theme for this example will be airports. These are point locations at the scale of a US map. The plan is to represent them as graduated circles with the circle sizes representing the number of passengers. The official term in the airline industry for passenger boardings is "enplanements."
Flow maps represent quantities of things that flow, or travel, either from an origin to a destination or along particular paths. An example of the former is a map of petroleum trade between countries, with each pair of trading partners represented by a line whose thickness corresponds to the amount of petroleum; the line just links the two trading countries, and does not follow the actual shipping route. An example of the latter type of flow map is a traffic flow map, with each stretch of road varying in thickness according to the average number of vehicles per day that travel on it. On this type of flow map the road is in its proper place on the map.
Strictly speaking, then, the data variable that a flow map is based on, like that of the graduated symbols map above, must be a "count" type of variable. Furthermore, because of the nature of the representation, the data variable should represent something that can be counted (not measured or calculated), at least conceptually. So, we can map the number of trucks per day along each stretch of highway in Pennsylvania but not the percentage of traffic that are trucks. It is a subtle distinction, but important for proper use of flow maps.
As with its cousin, the graduated symbols map, we are focused here on maps displaying ordinal-level data, meaning the data must be classified before mapping, with the map showing only a few different line thicknesses. This is in contrast to the proportional version of the flow map, in which each line's thickness is based directly on its data value.
The comments above about classification for classified graduated symbols maps apply almost identically to classified flow maps. The ideal range to aim for is three to five classes. More than that and line thicknesses are just too subtle to differentiate easily, especially if there are other lines present in the map's background layers, such as boundaries, transportation routes or even the graticule.
Flow map symbology choices are much more limited than for other thematic maps. They are basically just lines of different thicknesses. There are some style variations that can be added, such as edge lines and colors. If you choose to enhance the line thicknesses by adding colors, then make sure they are in an appropriate corresponding “graduated colors” sequence.
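As a sketch of how a small set of line thicknesses might be assigned, the code below spaces the widths evenly between a thinnest and a thickest line, one width per class. The function name `class_line_widths` and the point sizes are assumptions for illustration; in practice you may need to widen the steps until the classes are easy to tell apart on the map.

```python
def class_line_widths(n_classes, thinnest=0.5, thickest=6.0):
    """Return one line width (in points) per class, lowest class first,
    evenly spaced from thinnest to thickest."""
    if n_classes < 2:
        raise ValueError("need at least two classes")
    step = (thickest - thinnest) / (n_classes - 1)
    return [round(thinnest + step * i, 2) for i in range(n_classes)]

# Assumed widths for a four-class flow map.
print(class_line_widths(4, thinnest=1.0, thickest=7.0))
```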
Here are the steps to create an ordinal-level flow map in ArcMap: