
A Data Minin’ Man

Fernand Braudel

One of the first people who got me really excited about the prospect of being a professional historian was Fernand Braudel. This is a bit of a stretch. Like trying karaoke because you heard Aretha Franklin sing.

But inspiration is inspiration, right?

Besides composing his first book from a Nazi POW camp (!), Braudel leaned heavily on statistical analysis in his histories of late medieval and early modern Europe. He looked past individual persons and events, instead helping to introduce the quantitative practices of social science into the realm of history. (So long, Great Men Theory!) This fusion helped launch the mid-20th-century appetite for Social, Environmental, and Economic histories.

Braudel’s appetite for sources was astonishing. From his sprawling series, Civilization and Capitalism, 15th–18th Century, here’s a brief selection of some charts:

“Budget of a mason’s family in Berlin about 1800”

“Bread weights and grain prices in Venice at the end of the sixteenth century”

“French Merchants registered as living in Antwerp, 1450-1585”

When you make it through all the other levels of history, this is the final boss.

It’s worth noting that Braudel was working without computer power, drawing instead on an immense bibliography of individual studies. His contribution was to take all these individual analyses of bank ledgers, customs books, or shipping insurance, and synthesize them into something meaningful. Braudel took a staggering assemblage of information and tried to make sense of it.

What’s incredible is that the herculean efforts of Braudel are more feasible than ever before. As Google delights in reminding us, they’ve scanned 5 million books! Perhaps more useful are targeted projects like Zoe Alker’s to digitize records of British convict tattoos, 1788-1925.

This might be my favorite digital humanities project, and not just for the generous and excellent framing section, “Historical Background.”

As Alker capably explains, the class and position of the studied population (convicted men and women) means that the personal, individual intention behind their body art is lost to history. But analysis of this community’s whole tattoo corpus teases out values, relationships, and even popular symbols like Buffalo Bill Cody – the memento of a fleeting pop culture sensation in England.

Pop Icon of the Late 19th Century: Buffalo Bill Cody.

This seems to be the crux of the issue when raw information meets historical narrative: the data does not, ipso facto, explain itself. Someone has to interpret it.

Though they both lack this crucial step, projects like Black Women’s Experience or Six Degrees of Francis Bacon have excellent potential for subsequent work. The Francis Bacon Project in particular is notable in that it offers that rarest sight in the digital humanities: an indicator of the unproven, labeled “Statistical inference.”

Each black line indicates a demonstrated connection. Think of the gray lines as the white spaces in old maps. Something’s likely there, but the author can’t prove it.

This subtle gray line is another indicator of the cutting edge. As one commenter on Alker’s lecture put it:

“‘No data is clean’ + ‘Know Your Data’ are mottos that get many of us a long way with our historical data analysis 🙂”

James Baker, Digital Humanist @ University of Sussex

After all, the tremendous, lurking danger of big data – and all subsequently derived reasoning – is overstating its value, understating its omissions, and generally leaving data unframed by historical scholarship. As previously described, a graph won’t tell you what data it omits.

For example, the convict tattoo database doesn’t include people not subjected to the British justice system. (Like a King.) Black Women’s Experience only surveys texts in JSTOR and HathiTrust for its word maps, because there’s a tremendous dearth of primary documents relating to this population’s experience.

These foundational details define the data collected, which means they have a tremendous effect on any conclusions derived from it.

Treemap of word frequency from Black Women’s Experiences, indicating an economic bent among papers that included Black Women in their purview.
How different would this look if we could mine primary texts only?

This is the flip side of the data coin. Nicole Brown, a scholar using the Black Women’s Experiences Database, was quoted as saying:

“The beauty of computation and Big Data lies in how it complements the traditional close reading […] the two methods complement each other to give you a full picture of what’s going on.”

Nicole Brown, postdoctoral fellow at University of Illinois at Urbana-Champaign

Forging that complementary relationship is incredibly hard! Braudel is one of the smartest writers I’ve ever read, but it’s well worth noting that he was wrong on many major conclusions: he pushed too hard for a cyclical view of history, and wasted a lot of time trying to define the difference between civilization and culture. A contemporary critique of Braudel’s work could be justly applied to an outfit like Google or Microsoft today:

“it contains a wealth of factual information, mostly correct, but the brilliance of its author’s rather idiosyncratic interpretation has been exaggerated…”

Rondo Cameron, Economic Historian @ Emory University

Because at the end of the day, what is an n-gram really worth? Data is not the end; it is only a means, one of many necessary tools.
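To make the question concrete, here is a minimal sketch in Python of what an n-gram count amounts to. The function name `ngram_counts` and the sample sentence are invented for illustration; this is not how Google or any of the projects above actually build their tools, only the bare idea of counting runs of adjacent words.

```python
from collections import Counter
import re


def ngram_counts(text, n=2):
    """Count every run of n adjacent words in text (hypothetical helper)."""
    # Lowercase the text and keep only runs of letters and apostrophes,
    # so punctuation and digits are dropped.
    words = re.findall(r"[a-z']+", text.lower())
    # Slide a window of n words across the text, one word at a time.
    windows = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(window) for window in windows)


# A made-up sample sentence, not drawn from any real archive.
sample = "the price of bread rose and the price of grain rose"
counts = ngram_counts(sample, n=2)

# Print each two-word sequence with how often it occurs, most frequent first.
for phrase, total in counts.most_common():
    print(phrase, total)
```

The result is a tally and nothing more. It shows which word pairs recur, but not why they recur, who wrote them, or what the source collection leaves out.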

Carly Minsky’s article on AI and Historians, like some of Braudel’s flawed projects, includes similarly ill-conceived attempts to replace historical scholarship with historical algorithms. The US Government’s rush to destroy records is self-evidently sinister, but the State Department’s utilization of an algorithm to direct which documents to destroy is an Orwellian nightmare of pre-destination.

The scholarly paper “A Bird’s-Eye View of the Past: Digital History, Distant Reading and Sport History” does a fine job of explaining the proper use of big data tools, here identified as distant reading:

“…distant reading helps shape the historical task by enabling big picture analysis, encouraging different questions, and forming new hypotheses, it does not complete the totality of the historical process.”

Murray G. Phillips, Gary Osmond, and Stephen Townsend [italics mine]

Braudel’s work has been surpassed. Not all of it, but you can read his work in the present day and spot flaws: some obvious overreaches, some subtle misinterpretations. The power to be permanently and forever correct is not afforded to historians. Hopefully the undoing happens after you’re dead, but sooner or later the flaws in your work will be found, or countered, or deemed moot.

That is, if you’re lucky enough to have people reading your work at all.

This is the fate of historians, and it should never impinge on one’s ambitions or inspirations. You can be certain that whoever surpasses or disassembles your work will do so using the same methods you used when first assembling it. It’s this combination of hard evidence, closely read primary sources, and razor-sharp logic that yields a persuasive historical argument. Digitized data mining makes this process easier than ever, but does not replace the fundamental methodology of history.

If you can’t tell, I think everyone should read these books.
