Topic: Transportation in Pittsburgh.

Carnegie Mellon University
Fall 2019
Communication Design Studio 1, Stacie Rohrbach & Brett Yasko

🔶 First Exploration

Image 1: in-class exercise about possible data types to look for

The image above shows the first exploration we did in class when we looked into different areas of interest regarding our chosen topic and possible data types to start looking for. The topics I want to explore further from the list on the whiteboard are:

  • Different modes of transportation (e.g. car, bike, Uber, public transit)
  • Infrastructure (e.g. road network, bike rack locations) → this could say something about the accessibility of an area
  • Commute time / distance
  • Reasons for commute & intensity of use

In class, we also talked about making comparisons with different topics. Possible comparisons that interest me are:

  • Transportation — pollution (air quality)
  • Public transportation — accessibility — gentrification

🔶 Why I chose transportation

Image 2: an image found online about the importance of public transportation.

The picture above is evidence that growing cities need transit, and they need it to work well. I am interested in figuring out where Pittsburgh stands in this discussion: what is its ratio of car, bus, and bicycle users, and does the infrastructure influence the choices people make about their transportation?

🔶 Examples

🔶 Data sources

We need to get the data to a point where we are comparing apples to apples: we can’t compare things with different metrics, so the data needs to be cleaned until it matches.

https://developers.google.com/transit/gtfs

The link to Pittsburgh's Port Authority is broken, so I sent them an email. Hopefully they will reply.

🔶 First attempt at possible research questions

  • Neighborhood accessibility and income. Variables to look at could include: neighborhoods, commute time, # of bridges, transit routes, major roads, noise pollution of that area, rent/housing prices

What is the best place to live in Pittsburgh?

  • Evolution of transport over time. Variables to look at could include: Census data 1990 - 2010, modes of transport.

🔶 Examples in class:

  • Big names → William Playfair (the first visualizer of data) and Charles Minard (the march on Moscow).
  • Layering of information — can you combine everything in one visualization?
  • Baby-names example - user has to do something to get the data
  • Changes should be clear to see (a bad example is the imports and exports of weapons)
  • How much can you communicate without text?
  • Combining real images with data — cool way to make data less clinical
  • Shipman — using video to guide through data is really effective!
  • So Fake — using the ‘depth’ of the screen.

🔶 Research Question

  • Neighborhood accessibility and income. Variables to look at could include: neighborhoods, commute time, # of bridges, transit routes, major roads, noise pollution of that area, rent/housing prices

🔶 Data variables and where I found them

Rent (2019)

https://www.zillow.com/wind-gap-pittsburgh-pa/home-values/

# of Bridges (2017)

Unfortunately, this data was not listed by neighborhood but by latitude and longitude. This makes the data unusable as is, because I would not be comparing apples to apples. Therefore I converted the latitude and longitude of 144 bridges manually!

Image 3: Converting Longitude and Latitude to Address

I used the website latlong.net to make the conversion.

# of Bus stops (2019)

The bus stop data was also listed by latitude and longitude, and converting it by hand was not realistic, so I needed to find a way to automate this. A quick Google search led me to R.

Image 4: Google search explaining how to convert latitude and longitude using R

Writing the script — Disclaimer: I had help doing this.

Image 5: the script I used to convert latitude & longitude!

And it worked!! It took me a while to get this script right, and it even took the computer 30 minutes to think this one through. But it was definitely faster than converting everything by hand! So happy now :) I also made Isha very happy by sharing this dataset with her.
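For anyone curious what such a conversion boils down to, here is a minimal Python sketch. It is not the R script from Image 5. It assumes that each neighborhood is available as a polygon of (longitude, latitude) points and uses a ray-casting test to check which polygon contains a coordinate. The two rectangular "neighborhoods", their names, and the function names are made up for illustration.

```python
from typing import Optional

Point = tuple[float, float]  # (longitude, latitude)

# Hypothetical, simplified outlines. These are NOT real neighborhood boundaries.
NEIGHBORHOODS: dict[str, list[Point]] = {
    "Neighborhood A": [(-80.00, 40.40), (-79.95, 40.40), (-79.95, 40.45), (-80.00, 40.45)],
    "Neighborhood B": [(-79.95, 40.40), (-79.90, 40.40), (-79.90, 40.45), (-79.95, 40.45)],
}


def point_in_polygon(point: Point, polygon: list[Point]) -> bool:
    """Ray casting: count how often a ray going right from the point crosses an edge."""
    x, y = point
    inside = False
    for i in range(len(polygon)):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % len(polygon)]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # an odd number of crossings means inside
    return inside


def find_neighborhood(point: Point) -> Optional[str]:
    """Return the first neighborhood whose outline contains the point, else None."""
    for name, outline in NEIGHBORHOODS.items():
        if point_in_polygon(point, outline):
            return name
    return None
```

Calling find_neighborhood((-79.97, 40.42)) on these made-up outlines returns "Neighborhood A", and a coordinate outside both rectangles returns None. With real data the outlines would come from a neighborhood boundary file, and points that sit exactly on a border would need extra care.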

# Population, land area, street density

Image 6: the first draft of my data. The black outlines around certain rent prices indicate that they were supplemented using Zillow in addition to RentCafe.

🔶 Yau Reading

  • Logarithmic (1,10,100)
  • Linear (1,2,3,4)
  • Categorical (cloudy, sunny)
  • Time (day/month/year; linear/cyclical; seasonal)
  • Percentage (parts of a whole)
  • Ordinal (bad, neutral, good) — Hierarchy
Image 7: In-class exercise of categorizing buckets into scales

🔶 The different coordinate systems

🔶 Examples in class 2.0

  • I like the geographical representation of the coin example — high amounts in the center — lower amounts to the outside
  • Could I work with a physical object? Bridges, maybe?
  • Maybe use hand-drawn images as an in-between step to work on visualizations.
  • Interesting: working spatially — the closer you get the more is revealed.
  • Popup books!!
  • Prototyping tools: laser cut, 3D printing

🔶 Applying scales to my data types

Image 8: the scales used; street density & rent price are linear, the rest are ordinal (hierarchy)

All of my data is numerical except the neighborhoods (according to Yau this would be the categorical scale). I approached making different buckets of data by first finding the maximum and minimum entry and then by calculating the average of each column.

  • For rent price I used a linear scale to make logical steps (add 250 every time) of the data: 600–850; 850–1100; 1100–1350; 1350–1600; 1600–1850
  • For street density I also used a linear scale
  • For population, # of bridges, # of bus stops and land area I tried to use an ordinal scale. I calculated the average and took a range that I considered average and then defined what would be above average and what would be below.
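The two kinds of buckets described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are made up, a rent that falls exactly on a shared boundary (such as 850) is placed in the higher bucket, and the 20% band around the mean that counts as "average" is a placeholder for the range I picked by eye.

```python
def rent_bucket(rent, start=600, step=250, count=5):
    """Linear scale: return the (low, high) bucket for a rent, or None if out of range."""
    if rent < start or rent > start + step * count:
        return None
    # A value on a shared boundary goes into the higher bucket (assumption),
    # except the very top value, which stays in the last bucket.
    index = min(int((rent - start) // step), count - 1)
    low = start + index * step
    return (low, low + step)


def ordinal_bucket(value, values, tolerance=0.2):
    """Ordinal scale: compare a value with the column mean.

    tolerance is the assumed share of the mean that still counts as "average".
    """
    mean = sum(values) / len(values)
    if value < mean * (1 - tolerance):
        return "below average"
    if value > mean * (1 + tolerance):
        return "above average"
    return "average"
```

For example, rent_bucket(1100) gives (1100, 1350), and ordinal_bucket(10, [10, 20, 30, 40, 50]) gives "below average" because the mean of that made-up column is 30.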

I color-coded everything according to the ordinal scale:

dark green = average

light green = below average

blue = above average

Image 9: My scaled data set color-coded

Confused: Not sure if I can find a correlation in my data by doing this? How do I find an actual correlation?

To do:

  • Land area is currently in acres; convert this to square miles so it is an apples-to-apples comparison with street density.
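The conversion itself is a single division, since one square mile is exactly 640 acres. A minimal Python sketch (the function name is made up for illustration):

```python
ACRES_PER_SQUARE_MILE = 640  # exact: 1 square mile = 640 acres


def acres_to_square_miles(acres: float) -> float:
    """Convert a land area from acres to square miles."""
    return acres / ACRES_PER_SQUARE_MILE
```

So a neighborhood of 320 acres covers half a square mile.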

🔶 In class progress

Image 10: thinking through my project in class

I also talked it through with Hannah and she gave me an interesting lead. She told me that in DC certain wealthy neighborhoods try to keep out public transit as long as possible because they fear that accessibility to the neighborhood through public transportation will ‘degrade’ the status of the neighborhood and its inhabitants. So she suggested that instead of looking at the rent I should look at income.

🔶 New Research question

Is there a relationship between the accessibility of a neighborhood in Pittsburgh and average income?

🔶 Coordinate systems

🔶 Exploring visual cues

Image 11: Different types of visual cues discussed in class
Image 12: applying different visual cues to my defined buckets
Image 13: an exploration of polar coordinate system as a possible use for data visualization

Disadvantage of the polar coordinate system for my data: I have to visualize 90 neighborhoods, and the polar coordinate system gets very crowded. In the drawing above I am only visualizing the number of bus stops for 45 neighborhoods and I am already running out of space.

🔶 Data variables and where I found them (part 2)

🔶 Redefining buckets & ranges

Image 14: Data & Scales
Image 15: defining ranges

🔶 Finding Correlations

Image 16 & 17: On the left, you can see that if the number of bus stops increases, the average income goes down. In the image on the right, you see that if the number of bridges goes up so does the average income.
Image 18 & 19: In the image on the right, you can see that if a neighborhood has a high number of streets the average income is higher. The same goes for bike racks — the higher the number the higher the income.
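The direction of a relationship like the ones in the images above can also be expressed as a single number, the Pearson correlation coefficient, which ranges from -1 (one variable goes down as the other goes up) to +1 (both go up together). Below is a minimal Python sketch of that calculation. The function name and the sample values are made up for illustration and are not my actual data set.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a column never changes")
    return covariance / sqrt(spread_x * spread_y)


# Made-up example values, NOT the project's data.
bus_stops = [5, 10, 15, 20, 25]
avg_income = [70000, 62000, 55000, 50000, 41000]
r = pearson_r(bus_stops, avg_income)  # negative: more bus stops, lower income
```

A value close to 0 would mean there is no clear linear relationship. A strong correlation still says nothing about which variable causes the other.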

🔶 First attempt at visualization

Image 20: first attempt at layering transportation data of different neighborhoods

Feedback on this visual was that it works well on its own; however, plotting this in a Cartesian coordinate system would be too much: bar charts on top of graphs.

🔶 Proposed user flow

Image 21: proposed user flow to navigate the information

🔶 Feedback & takeaways from Interim review

  • Maybe tell a story about outliers?
  • Overlay/layer feedback from users’ input onto census data
  • Maybe create an overall accessibility score (combine bridges, streets, buses, etc.)
  • Make visuals more interesting. Could I possibly use a metaphor?
  • Be careful of shifts between Cartesian and geographical views. How can the transition be made fluid, and is it necessary? Don’t do it just for visual interest.
  • Make buckets and scales visual (Anna did this really well). Where are the options for the user to deviate from the narrative?
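One possible way to build the overall accessibility score suggested above is to rescale every variable to a 0 to 1 range and then average them per neighborhood. The Python sketch below is only an assumption about how that could work: the function names, the equal weighting of all variables, and the example numbers are made up.

```python
def min_max(values):
    """Rescale a column so its smallest value becomes 0 and its largest becomes 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        return [0.0 for _ in values]  # a constant column carries no information
    return [(v - lowest) / (highest - lowest) for v in values]


def accessibility_scores(columns):
    """Average the rescaled columns per neighborhood.

    columns maps a variable name to a list of values, one per neighborhood,
    with every list in the same neighborhood order.
    """
    rescaled = [min_max(values) for values in columns.values()]
    neighborhoods = len(rescaled[0])
    return [
        sum(column[i] for column in rescaled) / len(rescaled)
        for i in range(neighborhoods)
    ]


# Made-up values for three hypothetical neighborhoods.
scores = accessibility_scores({"bridges": [0, 5, 10], "bus_stops": [10, 20, 30]})
```

With these made-up numbers the scores come out as 0.0, 0.5 and 1.0. Weighting all variables equally is a design choice: a bridge and a bus stop probably do not contribute equally to accessibility, so different weights might tell a fairer story.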

Challenges:

My challenge is to overlay the different information about the neighborhoods so that the information presented is rich without causing information overload.

🔶 Second attempt at visualization

Image 22: different layers, from top to bottom: bridges; bridges + streets; bridges + streets + bus stops; bridges + streets + bus stops + Healthy Ride stations

Feedback from Stacie: almost there! The difference in color value for income does not have to be so drastic — the background value should be no lower than 50%. For the bus stops and bike racks try to explore bigger circles.

🔶 Interactive Data Visualizations

The form I decided on affords stamping extremely well, because that way I could layer the information and the residents would also be able to create their own neighborhood by stamping it.

🔶 Production 1.0

Image 23–25: Stamp result when laser cut into wood

Based on Hannah’s recommendation, I also decided to use the kind of ink that is traditionally used for block printing. Initially, I bought normal ink; however, that is quite runny and did not give me results that were as crisp.

🔶 Production 2.0

Image 25: stamp result when laser cut into linoleum

🔶 Let’s go!

Image 26: file for the laser cutter

I wasn’t sure which pattern would work best for the bus stops and bike racks, so I decided to produce 2 versions to see what they would look like when stamped on top of each other.

Time to laser cut!

Image 27: laser cutting process, the dark parts are burnt residue that needs to be cleaned

Mixing greens for the optimal color value to represent income.

Image 28–29: mixing green and white to create different color values

🔶 The different layers

Image 30: representing income
Image 31: representing land area
Image 32: representing the number of bridges 0–9
Image 33: representing miles of street
Image 34: representing bus stop density
Image 35: representing Healthy Ride station density

🔶 Layering layers

Image 35: trying out different patterns

🔶 The finished stamps

Image 36–37: finished stamp collection

🔶 Final result

Image 38: prototype with income, bridges, and miles of streets layers turned on
Image 39: prototype with all layers turned on

To check out my presentation and cleaned data set, download them here:

https://drive.google.com/drive/u/1/folders/1NEGAKKXMG_f5A4xLtig4ZBHcPVBnIHGY

🔶 Next steps

My Medium posts are part of my graduate study at Carnegie Mellon, School of Design.