Nicely researched piece from Brent Rose. I’ve never really bought the hype around bulletproof coffee. Then again, if butter in your coffee is your thing then don’t let this stop you.
In March 2013, Cameron Blevins came to me with a question about how to visualize his research into the U.S. post office. This presented a fantastic opportunity for both of us: as a research collaboration and a chance to learn d3.js.
DH likes to ask what to do with a million books. I wanted to ask what to do with 14,000 post offices. We wanted to know – what insight could we gain from visualizing the Post?
Visualization, as historian Richard White argues, is a means of doing research: of posing questions we otherwise could not ask without the aid of computers; of identifying patterns that might otherwise go undiscovered.
The design of the project went through several iterations as we tried to solve a key question: how to present the data in a meaningful way. Plotting points on a map is a simple enough exercise, and I find even that process can be arresting—to see the massive network of the Post and where communities tended to cluster in the West. But we wanted more than just the presentation of points on a map—we wanted to represent knowledge.
We had a lot of data – complex, messy, incomplete data. The entire dataset is 160,000 post offices for two centuries for the entire US – we focused only on the western US in the latter half of the nineteenth century.
In some ways, the data we worked with didn’t present a lot of challenges: it’s complex, yes, but the data is all the same. We were not confronting, say, a digital archive of texts, all of which can be radically dissimilar in their form and content: case files, advertisements, newspapers, diaries. We didn’t have to confront what William G. Thomas has called the document-type problem. But there were other design challenges: for example, meshing together datasets. Cameron began the project with an already-massive dataset of post offices, but roughly eight months ago he purchased another dataset from a stamp collector that massively expanded the amount of evidence we worked with. He now has data for the entire United States. Thus, we are confronted with the architecture of our data. Then we had to ask the really dangerous question: Where does the West begin?
Part of our argument is that we can use the post office as a proxy for understanding settlement patterns in the West. Many of these nineteenth century towns died years ago; they don’t exist on present-day maps. The West is known for its ghost towns – we can see a lot of them here.
I want to make a case to you today about why we should think about the Post as a proxy for communities, and the significance of visualizing that process.
There are, of course, many ways we could represent population figures visually. One of the more popular techniques is the choropleth. But there’s a problem with the choropleth when it comes to the West: we have what Cameron has called the West’s “county problem.”
The problem with western counties is that they’re huge. Look at San Bernardino in Southern California, which is part of the greater Los Angeles metropolitan area. Lots of people, right? But the actual population is huddled against the western edge of the county rather than evenly distributed through the county. You get a visual representation, then, that can be misleading. What you want is more granularity.
Let’s focus on Colorado and New Mexico in 1870 – keep these places in mind, we’ll return to them a few more times. We get a sense that there are lots of people, a rough idea of where they’re at in the states with the choropleth. But it’s hard to really know unless you compare this with our map.
Two things stand out to me here. One, these two things map onto each other well. If we overlaid the population data with the post data, I think we’d see that the shading of the counties would fit well with the location of post offices. But, we also get a better sense of where people are at. They’re not distributed throughout these huge counties in New Mexico; they follow a corridor north to south – probably a railroad line in New Mexico, and nestled against the Front Range in Colorado.
To me, this is significant. If my historical question is about the settlement patterns of the western US, it matters a great deal to me to know where exactly those communities are at. The Post gives me a window into that process.
So, the Post becomes a proxy for a town. You wouldn’t have a post office where there’s no town. This is the way people communicated in the nineteenth century; oftentimes, these towns were not located next to railroads. They needed the Post and its network of postal roads, stagecoaches, and rail lines to distribute news and information. This was your connection to the broader world. It was also a key part of the national government’s process of folding the West into the nation. Rather than an isolated region, it became integrated into a national system of information. And this network connected the West to larger social, economic, and political networks.
But maybe you don’t believe me quite yet. Maybe you look at these offices and say: this doesn’t work. There’s no story here. Let me give you one more example; I have to give a shoutout to Ben Schmidt, a historian at Northeastern, for alerting me to these maps.
In 1915, the U.S. Census Bureau published the Statistical Atlas of the United States. It’s a beautiful book filled with some stellar visualizations. One set of visualizations sought to illustrate the population density of the United States for 1870, 1880, and 1890.
I looked at these maps and thought: does the postal data map onto these? In other words, can I really treat the Post as a proxy for settlement? If post offices happen to line up well with the Census Bureau’s own statistics, to me that’s further evidence that I can treat these as indications of settlement. So, let’s look.
Here’s our Colorado – New Mexico corridor in 1870. I apologize that my projection is different from that used by the Census Bureau, so you may have to squint a bit to help. Looking at these side-by-side, I think they pretty accurately map onto each other. Notice the small pocket of post offices to the west of Denver; that same blob appears on the Census map to the west of Denver. Notice the collection of offices in northern New Mexico; the Census map distribution shows that same presence.
So, we have the West of 1890 – heavy populations in California and the Pacific Northwest; lots of people along the Rocky Mountains; empty areas in Nevada and Utah and Montana. My map seems to map onto this pretty well. Pay particular attention to WA, OR, CA – my projection isn’t the same as the Census, so they don’t map quite right. But they’re pretty close. And if I fixed the projection, I think they’d map onto each other very closely.
So, comparing post offices to other maps and visualizations – I’d say that we can safely use the Post as a proxy for understanding communities. But the question is, why is that important? Why did I spend all this time trying to convince you that I can safely treat the Post as a proxy for communities?
Because the story isn’t just about the rise of communities; it’s also about their decline. You don’t see this in the Census maps.
If we look at the progression of the maps from 1870 to 1890, the maps tell a particular story: one of growth, one of progress. Let’s return to our Colorado - New Mexico corridor. What you don’t see in these is the communities that don’t thrive.
Here’s our corridor in the Southwest again; these are post offices that close between 1870 and 1890.
If the story I’m interested in is the process by which communities grow and decline in the West, it’s these communities I want to examine. Again, you don’t see these places in the Census Maps. I would bet you also wouldn’t see these changes using population data for counties. But the postal data we have can give us that.
And therein lies one of the great benefits that I think visualization lends to the humanities. The Census maps are static, giving me snapshots of particular moments in time. Our map lets you examine any moment in time between 1848 and 1900. Maybe you’re curious about what’s happening in western settlement during the American Civil War? You can select those years. Maybe you have a particular interest in western settlement as conflicts with Native Americans are happening throughout the West in the 1870s and 1880s. You can select those years. You can look at places where communities go away and where they spring up – and that’s key.
What the map does is lead me to questions. It leads me to places on a map that may be overlooked by historians. Some of these places no longer exist; they, quite literally, are removed from the historical record. Some of these places are mining camps – they exist only for a year until the mines run dry, then they’re off the map. What’s happening is a distant and close reading of a spatial experience. By stepping back and looking at overall patterns, we can then zoom in closer – track down more sources, track down more information, give richness to the fabric of historical experience.
When did you start your blog (career wise: as a grad student, undergrad, etc)? Why did you decide to start blogging?
I started blogging early in graduate school. The blog started as a co-authored blog between my friend Brent Rogers and myself as a way for us to share our thoughts and ideas about digital history. We started the blog in the context of a course we were in together, believing that it made more sense to write for both a public audience and our professors. I don’t want to just write for other academics – I want what I write to be accessible and available to whoever is interested in what I have to say.
How do you host your blog? How did you learn to set it up? Can you expand upon the point you made about “owning your domain” in the Podcast?
The blog has gone through a couple of platform iterations. When Brent and I started we were hosting on WordPress’s free platform before I moved over to a self-hosted WordPress installation. A few years ago I got caught up in the wave of sites moving over to static blogging and, since I was also a Ruby coder, I switched over to Jekyll (which the site continues to run on today).
Many of the things I’ve learned along the way have been self-taught. I have no formal experience in computer programming or design, but I did spend time in high school and college as a freelance web developer/designer, which set me up with learning the language of the web. I took a deeper dive into computers early in graduate school when I started using Ubuntu Linux as my main operating system, which introduced me to Unix, setting up my own LAMP stack, and so on. So, installing and running WordPress (at the time, hosted with Bluehost) had become familiar to me. Since I had become so comfortable with the command line, Jekyll was a natural fit for me. Plus, for my needs I didn’t want the overhead of WordPress–maintaining the database, the constant security updates, controlling spam. Running a static blog simplified the entire process for me, and let me focus on my content more than the vagaries of maintaining a website.
For owning your domain: I believe you shouldn’t let third parties be your online identity. Facebook, Twitter, department websites, Academia.edu, and LinkedIn should be treated as gateways to your own domain. You don’t have as much control over how you are presented on the web through these services, and they can disappear or change drastically overnight. What I mean by owning a domain is you should have a URL of your own (your name if possible, and a .com if possible) and a corner of the web that you call your own. That corner could be a full-fledged blog, a “brochure” site that describes who you are and what you do, or a combination of the two. People will go looking for you online; give them a place to find you.
What were the challenges associated with the blog (i.e. spam, finding topics, finding time, getting it counted as “work”, etc)?
One of the challenges is topics, and related to that is time. One way to get around the topic hold-up is to not pigeonhole your writing into one topic or theme, which I think I’ve managed to do. I write a lot about DH and about history, but I also veer into coffee, podcasts, music, and so on. Whenever the writing muse strikes on whatever topic, I want to be ready to write no matter the theme.
Having time available to write has gotten a little trickier. When I was still taking classes and writing for those venues, I often made the things I wrote for class available on my blog – things I wrote for class would be adapted to blog posts. Nowadays it’s a little tricky to find time to write for the blog between juggling a full-time job, finishing a dissertation, and the demands beyond professional life.
What topics did you normally write about? Did you try and keep it strictly about your work, or do you ever mix in other topics?
I’ve mentioned elsewhere that I believe you should write about whatever motivates you to write. That’s what matters. Most of my writing revolves around digital history and my research, which is a broad enough organizing theme that it creates enough topics. Although I’ve tended to write about my work, I do sometimes venture into other topics and hobbies. I also started doing a John Gruber-style linkblog, which is still built into my blog but I don’t use as often as I did a few years ago.
What kinds of interactions (scholarly or otherwise) emerged out of your blogging practice?
I’d amend this to not just blogging, but Twitter also. I think the combination of the two is important: long form content appeared on the blog, but advertising the blog post and the discussions about the post took place on Twitter. Or sharing thoughts that don’t quite make the cut for long-form but work well in the short space Twitter gives.
I don’t know that I have any specific collaborations that emerged from the blog, but having an online presence that included Twitter and blogging has led to many different interactions. One has simply been networking, a function that conferences still fulfill, but I also feel Twitter has played a huge role in introducing me to a lot of people. I suppose that digital presence has helped me connect to a few projects that I’m affiliated with (The American Yawp, The Middle West Review) as well as a few forthcoming collaborations that I can’t share yet (watch for these soon!).
There are other things, too. I’ve been interviewed by a few different venues about digital humanities. I’ve been a writer for ProfHacker and GradHacker. I’ve had lots of conversations with people about scholarly Markdown. As I’ve shared my scholarly work online, I’ve connected with others doing similar things or reached a public audience interested in the work I’m doing. Some of the things I’ve written turn into conference presentations. All of these things are professionally and personally rewarding.
Do you find these interactions informative, useful, enlightening, tedious, frustrating, obligatory, etc?
By and large I find it useful and informative. The interactions have led me to new ideas, introduced me to new people, and built up a professional reputation that I think has been important not only for carving out my niche in the field of history but also allowed me to, in a way, advertise myself to potential employers. Networking is an important skill in the academy, and I think having an online presence can go a long way in helping you cultivate a network of people across institutions.
Such interactions are also exciting because they’re giving me a chance to engage in new collaborations (like the ones I can’t mention yet…) that I likely wouldn’t have a chance to engage in otherwise. Blogging gave me an avenue not only to work through ideas, but also for others to discover the sort of things I do and the interests that I have.
How do you think digital humanities blogging is different from more traditional forms of academic writing and reading?
In my own experience I feel that much of digital humanities blogging tends to focus more heavily on methodology than on the narrative and analytical pieces you’d find in most academic journals and books. For example, Cameron Blevins’ post on topic modeling Martha Ballard is quite popular because of its methodological underpinnings. The entire post is mostly a methodological piece about topic modeling, and as Cameron noted at his AHA conference presentation in January 2015, the piece didn’t uncover anything necessarily new or surprising: it confirmed conclusions already made by Laurel Thatcher Ulrich. But we have a lot of methodological pieces – Blevins, Underwood, Schmidt, Mullen, McDaniel, myself – that probably wouldn’t find a home in most of our traditional writing venues.
I find that form of writing incredibly useful. We are, by and large, a pretty open community willing to share methods and ideas that we’ve experimented with. Given the rapid pace of change and new methodological approaches, such writing also helps me keep up with what’s going on generally in digital humanities.
How would you characterize the relationship between blogging and the digital humanities (however broadly conceived)?
Similar to the above, I feel that much of the writing in digital humanities blogging tends to focus on methodology. Such writing tends to prompt me to think about methodologies I can start to apply to my own work, or methods that I could share with others in regard to their work. DH blogging also tends to riff off one another more than traditional writing, which shows the different ways that similar methodologies are applied to different research questions (see, for example, Ben Schmidt’s latest posts on story arcs and their similar application by David Mimno, Matt Jockers, and Ted Underwood).
What DH blogs/bloggers do you read and why do you read them? What do you like about them?
There are many! That’s the thing with this community–many maintain a blog. I’d have to say the bloggers I’ve been reading the longest are Dan Cohen and Caleb McDaniel. I learned about Dan years ago when I started graduate school as I was introduced to the great work going on at RRCHNM. Reading Dan’s posts gave me a great window into what DH could be. I started reading Caleb’s blog for a different reason. He was still writing on his previous blog Mode for Caleb and, if I recall correctly, wasn’t doing much with DH at the time. The blog was his space away from his dissertation to write about whatever struck him. The pieces I recall the most are those about jazz music, an interest he and I share. In addition to Dan and Caleb, Lincoln Mullen, Chuck Rybak, Bethany Nowviskie, several of my grad school friends (Robert Jordan, Andy Wilson, Brian Sarnacki, Michelle Tiedje), Ben Schmidt, Ted Underwood, and Matt Jockers are all regulars in my RSS reader. The list goes on and on. I value their ideas and the energy they bring to their work. I learn something new every time one of them posts something.
And thanks to Twitter I’m exposed to many, many more posts written by people doing DH.
I also enjoy the posts at BlogWest, where a group of my friends and colleagues write about my field of western American history.
What was your most popular blog post? Why do you think it was so popular?
I’d say the most popular had to be the Rubyist Historian series that I wrote for introducing humanities scholars to the Ruby programming language. It’s the only post, for example, that found its way onto DHNow.
I haven’t thought much about what made the series so popular. I was inspired by both the class it was based on (Prof. Steve Ramsay’s Electronic Texts at UNL) and the Programming Historian. There’s an interest among some humanities scholars to learn how to program, and any time a new language is taught and examples are provided for how that language can be used for humanities research I think there’s a hunger for that information. The Programming Historian for Python, the Rubyist Historian for Ruby, Lincoln Mullen’s in-progress book on R methods in digital history, Matt Jockers’ work on R, and Elijah Meeks’ book on D3.js all speak, I think, to the desire for people to make programming part of their normal work.
Anything else you think is important you’d want to mention about your blog?
Is this where I apologize for not writing so much recently?
I am excited to finally release the digital component of my dissertation, Machines in the Valley.
My dissertation, Machines in the Valley, examines the environmental, economic, and cultural conflicts over suburbanization and industrialization in California’s Santa Clara Valley–today known as Silicon Valley–between 1945 and 1990. The high technology sector emerged as a key component of economic and urban development in the postwar era, particularly in western states seeking to diversify their economic activities. Industrialization produced thousands of new jobs, but development proved problematic when faced with competing views about land use. The natural allure that accompanied the thousands coming West gave rise to a modern environmental movement calling for strict limitations on urban growth, the preservation of open spaces, and the reduction of pollution. Silicon Valley stood at the center of these conflicts as residents and activists criticized the environmental impact of suburbs and industry in the valley. Debates over the Santa Clara Valley’s landscape tell the story not only of Silicon Valley’s development, but also of Americans’ changing understanding of nature and the environmental costs of urban and industrial development.
The digital edition of my dissertation is still a work in progress–there are probably things that don’t quite work right, and there is plenty more exposition and narrative I’ll be adding over the next few months. The project will go through iterations as I finish my written dissertation. The project will house several features, including interactive visualizations, dynamic narratives and analysis that extend themes covered in my chapters, and access to certain primary sources. I do this in the spirit of making my research open and extending themes in my research. Not every piece of digital scholarship can make the transition to print form–a dynamic visualization can become lost in the attempt to describe it fully. Better that readers have a chance to interact directly with the same tools, views, and material that I used to draw my conclusions.
I also aver that putting your work online gives you access to your publics–researchers, educators, interested readers, students, and so on. To me, such access has been invaluable. I correspond on a monthly basis with people who have discovered some facet of my digital scholarship and are interested in my work, have questions they want to ask, ideas they want to challenge, and collaborations they want to engage in. That has become one of the most valuable contributions digital history has made to my professional life. The writing of history for other academics serves an important function, but that cannot be our only function. Our engagement with new narrative ideas in electronic form gives us a chance to reach audiences that can be difficult to find with books and articles. Digital scholarship not only makes my work better, but hopefully also makes knowledge more accessible.
The digital dissertation joins other digital scholarship that I’ve made available over the last few years, including Framing Red Power and “Self-sustaining and a good citizen”: William F. Cody and the Progressive Wild West. If anyone is interested in the code used for the site, you can find the details on GitHub.
So, check out the project and let me know what you think!
This weekend I’ll be in New York for the American Historical Association’s annual meeting, where I’ll be on two panels regarding digital history. The first on January 3 is an experimental panel with several scholars on using digital history in teaching and learning:
Digital Pedagogy for History: Lightning Round
Using the “lightning round” method of spreading ideas in the digital humanities, this experimental panel features one-minute expositions on innovative projects and cool ideas in digital history for teaching and learning. Five or more panelists will be invited to register via Twitter at the meeting. Audience members will also be invited to join the lightning round.
The other on January 4 is a career-oriented roundtable with Jana Remy, Mills Kelly, Andrew Torget, and Katina Rogers about tenure, alternative academic careers, and graduate training:
Digital Scholarship, Academic Careers, and Tenure
The digital revolution is disrupting long-established systems within the academy for tenure, promotion, and careers, offering both new opportunities and remarkable challenges for the next generation of historians. The AHA, in response, recently charged a committee to draft guidelines for evaluating digital scholarship in T&P. This roundtable will provide a ground-level discussion of the role of digital scholarship in early-career scholars, as session panelists share how digital scholarship fit into their work on the tenure clock, offered them alternative academic careers based on their digital projects, and the nature of peer review after the digital turn. They will also discuss how the MLA’s publication of guidelines for evaluating digital scholarship could be applied to historians.
There are several digital panels at the AHA this year, many of which I’ll try to get to, as well as a THATCamp being held on January 6th. Should be a great weekend!
In the fall quarter I taught, for the first time, a digital history course, a colloquium with undergraduate and graduate students from history, computer science, journalism, and earth sciences. The course was a blast to teach and I am extremely pleased with how the final projects ended up. I had an amazing group of students who not only seized upon the methods introduced in the course but also helped me to clarify some of my own thinking and challenged some of my ideas.
The end of the course culminated in a beta version of a digital history project that the students were free to choose on their own, so long as the topic revolved around the history of Silicon Valley.1 The projects:
They compressed a lot of work into a short amount of time (quarters are so short), and I’m quite happy with the analysis the students were able to complete in our short amount of time together. Much of their work became public-facing, whether presented on the web (above), during our public electronic poster session at the end of the quarter, or through our course blog. We had a few fits and starts, mainly with getting up to speed on GitHub early on in the quarter, but by and large things felt like they went pretty smoothly.
I’m already spending time thinking about things I would change about the course, and hope that I’ll get a chance to teach the class again next fall.
The exception to this was the graduate students, who pursued research agendas in line with their own scholarly interests. ↩
[Read this along with Cameron Blevins’s companion post.]
After more than a year of work, Geography of the Post is live. I wanted to take a moment at the project’s launch to reflect back on the design decisions we made with the project and to document these changes.1
The design of the project went through several iterations as we sought to solve two problems. The first was finding the most efficient way of presenting the material. Since we are dealing with such a large amount of information (our total dataset approaches 100,000 post offices), we ran into problems very early on with the performance of the map. Dragging, panning, and zooming the map became frustratingly slow – a user experience you always hope to avoid. We built in manual zooming features to work around that problem.
Second, our bigger question revolved around how to present the information. We wanted to determine what sort of views we could present to users in order to ask interesting research questions. Our early design iterations focused on Oregon. We started by loading our data onto a Google map:
We experimented with alternative views, such as hex binning, to visually understand geographic concentrations of post offices through histograms:
These were useful views, but we had considerations that we wanted to take into account with the offices that simply plotting points doesn’t let us get at. It’s interesting, in one sense, to see the concentrations of post offices. But these points don’t represent much else. If we are using the post to understand something about the movement of people into the American West, we needed more interaction with the points in order for us to query the information with more granularity.
With the assistance of some amazing undergraduate research assistants – Jocelyn Hickock and Tara Balakrishnan – we created methods for determining the status of a post office at any point in time. Users are presented with two views. The first is what we called “Duration View,” which uses transparency of the points in order to convey the “age” of a post office. These “ages” update according to the span of time that you draw on the timeline, or you can view the map as a whole and see areas of the West that have had the oldest (or youngest) post offices.
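One way to picture the idea behind Duration View is as a function that turns "how long was this office open within the selected span" into a point's opacity. What follows is a minimal sketch in Python, not the project's own code (the project itself was built with d3.js). The `PostOffice` class, the `duration_opacity` function, the linear scaling, and the `min_opacity` floor are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PostOffice:
    """Hypothetical record: one office with the years it opened and closed."""
    name: str
    established: int
    discontinued: Optional[int] = None  # None means it never closed in the data


def duration_opacity(office: PostOffice, start: int, end: int,
                     min_opacity: float = 0.15) -> float:
    """Map an office's time open within [start, end] to an opacity in [0, 1].

    Offices that were never open during the span get 0.0 (hidden). Otherwise
    the opacity grows linearly with the share of the span the office was open,
    starting from min_opacity so short-lived offices stay faintly visible.
    """
    closed = office.discontinued
    if office.established > end or (closed is not None and closed < start):
        return 0.0
    first = max(office.established, start)
    last = end if closed is None else min(closed, end)
    span = end - start
    # A single-year selection has no length, so any open office is fully opaque.
    fraction = 1.0 if span == 0 else (last - first) / span
    return min_opacity + (1.0 - min_opacity) * fraction


if __name__ == "__main__":
    offices = [
        PostOffice("Long-lived", 1850),
        PostOffice("Short-lived", 1865, 1867),
        PostOffice("Not yet open", 1880),
    ]
    for office in offices:
        print(office.name, duration_opacity(office, 1860, 1870))
```

Redrawing the timeline only changes `start` and `end`, so the same function recomputes every point's "age" for the new span.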
A second view of the post offices we built into the project is what we’ve called “Status View.” This view shows us one of four statuses that a post office can be in during a given span of time: closed, opened, open throughout, or open and closed. The view gives us a chance to look for large areas of closings or openings in the context of surrounding post offices and raise questions about why those changes are occurring.
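The four statuses can be read as a small classification rule over an office's opening and closing years. The sketch below is in Python and is not the project's own code; the function name `status_in_span`, the inclusive treatment of the span's boundary years, and the `None` result for offices that fall entirely outside the span are assumptions for illustration.

```python
from typing import Optional


def status_in_span(established: int, discontinued: Optional[int],
                   start: int, end: int) -> Optional[str]:
    """Classify one office for the selected span [start, end].

    Returns one of the four statuses named in the text, or None when the
    office was never open during the span (an assumption of this sketch).
    """
    opened_in_span = start <= established <= end
    closed_in_span = discontinued is not None and start <= discontinued <= end

    if opened_in_span and closed_in_span:
        return "open and closed"
    if opened_in_span:
        return "opened"
    if closed_in_span:
        return "closed"
    if established < start and (discontinued is None or discontinued > end):
        return "open throughout"
    return None


if __name__ == "__main__":
    # Hypothetical offices, classified for the span 1870 to 1890.
    examples = {
        "open before and after": (1850, None),
        "opened during": (1875, None),
        "closed during": (1860, 1880),
        "came and went": (1872, 1885),
        "opened later": (1895, None),
    }
    for label, (est, disc) in examples.items():
        print(label, "->", status_in_span(est, disc, 1870, 1890))
```

Filtering for `"closed"` over 1870 to 1890 is the kind of query behind a map of offices that close in that period.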
Why document our design decisions? Part of my own goal in digital humanities generally is the reusability of approaches, methods, tools, code, and design in projects that may be far afield from my own work. But I also believe that we can make our work more methodologically transparent by presenting the artifacts and iterations of our design process. Not only because designs have implied and explicit arguments, but because sharing the process helps others in their design process. Furthermore, exposing our design and thought process has helped us to think more deeply about our own design decisions.2
In other words, I am trying to answer Trevor Owens’ call that we take “a few moments at the end of a project to reflect on what you wanted to accomplish, what actually happened, and what you learned from the process.”3 Our goal at the outset was to determine what we could about the relationship between the U.S. Post and population growth in the American West. By and large, I think the project goes a long way in giving us an overall picture of population growth at specific areas in the West, a more granular view of populations than we can see in choropleth maps because of the West’s county problem. Since counties are so large in the West, a choropleth fails to really give us a sense of where people are at in space.
Population in the West, 1870. Map by Cameron Blevins.
But the choice of using post office points to surmise about the growth of population centers gives us a greater sense of where people are going in the West. To make that process more clear than a static map could convey, we designed a timeline feature that allows users to drag a span of time – from a single year to the entire span of time contained in the dataset – and visualize how these changes occur over the course of the century. You have a specific interest in the West during the Civil War? You can draw the timespan and see those offices between 1860 and 1865. More interested in the late nineteenth century? Select those years. Want to watch year-by-year how post offices grow in the West? Select a year, and drag across the timeline to watch places in the West expand.
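The year-by-year view described here amounts to counting, for each year in the selected span, how many offices were open. Here is a rough sketch in Python, not the project's own code. The `(established, discontinued)` tuples, the function name `open_offices_by_year`, and the choice to count an office as open in the year it closes are all assumptions.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical record shape: (year established, year discontinued or None).
Office = Tuple[int, Optional[int]]


def open_offices_by_year(offices: List[Office], start: int, end: int) -> Dict[int, int]:
    """Count how many offices were open in each year of [start, end]."""
    counts: Dict[int, int] = {}
    for year in range(start, end + 1):
        counts[year] = sum(
            1
            for established, discontinued in offices
            if established <= year and (discontinued is None or discontinued >= year)
        )
    return counts


if __name__ == "__main__":
    # Three made-up offices, viewed across the Civil War years.
    sample = [(1850, None), (1862, 1864), (1866, None)]
    for year, count in open_offices_by_year(sample, 1860, 1865).items():
        print(year, count)
```

Dragging a single year across the timeline corresponds to reading this table one row at a time.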
There are elements of the map that I wish we had designed in from the beginning. I’d like to see this same information on a terrain map rather than a flat map – to see how the landscape might have determined where post offices located. I would love to add layers to the map – railroads, major roads, postal routes. Other quantifiable information might also be overlaid on the map – population figures, salaries of postmasters, perhaps even voting patterns. We might even have built in more conceptual and experimental visualizations that could have allowed us to distort time and space (think cartograms) to speculate on the ways that the post shaped how people thought about space. In these ways, we could add more layers of information that may cause us to ask new kinds of questions.
Visualizations are provocations for interpretation. For those who approach the project – researchers, teachers, students, the public – my hope is that the visualization provokes questions and ideas. The sheer scale of the office network is arresting, but interacting with that network provides a chance to view it from different perspectives. The interactions with the research, I hope, give users a chance to ask different kinds of questions that a static map simply couldn’t prompt because it lacks the ability to reshape the information easily. As an interactive scholarly work, Geography of the Post lets users explore the space of the Post and the growth of the American West.
The code for this project is available on GitHub. I also have a desire to make this code cleaner and open to wider use by others. Parts of the code are fairly specific to Cameron’s dataset, but I’d like the map to be able to handle any data dropped into it. ↩
Trevor Owens, “Please Write it Down: Design and Research in Digital Humanities,” Journal of Digital Humanities 1 (Winter 2011). ↩