Session 1: Future of Crowdsourcing (10:30-11:20)

Future of Crowdsourcing

Room 108

10:30-11:20 AM April 18, 2015


Lots of crowdsourcing projects, but what’s next?


How do you define success? Provide an outcome that outweighs the cost and is sustainable. How do you measure the value of crowdsourcing, and where does it work and where doesn’t it?


In order to get funded, of course, you have to define what the outcome will be.

There’s value in getting work done – what are some other values?

-crowdsourcing of metadata, items for a collection? For metadata, if there is a goal of public engagement, that works, but if you are “labor-saving” for the institution, you question whether you are spending too much time editing outside information

-BUT of course crowdsourcing can bring in information you don’t know you don’t have


Crowdsourcing has tangible values of labor saving and more detailed records; but there are intangible values of ownership, public scholarship, and involving people traditionally outside the domain, which changes what scholarship is and means


dealing with the question of why we fund the humanities in general


Might be good to track who participates, have those statistics

-there is a spectrum of participation/donation, from volunteering to financial donation; both can be measures of success


Citizen science: there was a project (name forgotten) in which the public identified celestial bodies, and the same platform went on to be used to transcribe Coptic documents. These platforms provide opportunities for cross-disciplinary work, bringing science and humanities together. Tools are NOT necessarily single-domain

Also, Zooniverse started off as a citizen science project and went on to host a war diaries project


Tools aren’t really creating epiphanies of interdisciplinary study, but creating new ways to think of the multiple uses of platforms.


Instead of calling contributors volunteers, it has now been proposed to call them “volunpeers”

-Language fosters community (even if the community is pretty self-selecting)

-Smithsonian does this, creates competition, fun and games – giving people an incentive to participate in transcription

-How to get more people involved who don’t commonly self-select? reCAPTCHA-style tasks (used for security on websites and to improve text and image recognition) could in a way be part of a project, with a link saying “if you want to do more transcription, go to this site.” People don’t go around saying “hey, I want to help Google,” but they could say “hey, I want to play this game.”

-projects directed toward K-12 education (and not just editing Wikipedia for a class day). Transcribing would be great for teaching handwriting and looking at a primary source (at least documents of the “Founding Fathers” might be popular)

-not all students (K-12 or undergraduate) will be immediately engaged, but some might build lasting interests


Have to tap into networking – hope that your project spreads or “goes viral.” You can target groups and hope that they “infect” others with interest for the project. In a way you can only control the first steps – the “going viral” has to come later

-Outreach and marketing: if you build it, they’ll only come if you market it

-Also need evaluation to find what worked and could be used again, or what needs to be changed

We might not be too good at outreach and evaluation yet – some projects have the mentality that any result is a good result, and prematurely congratulate themselves.

-Part of the problem in some cases is of course that libraries are underfunded – they need a different yardstick of success than a large museum would need.

-Humanities don’t traditionally use numbers, so metrics might not need to be numerical. You could use word clouds or other representations. The story isn’t always in the data. Maybe bring people from related fields in to see how we can visualize data

-sentiment analysis, words used in conjunction with others?

-meritology, narrative analysis

-Seems like there is more collaboration between departments than there is between qualitative and quantitative analysis

-Some attempts at programming, like Qualrus? But it’s not quite digital yet, still requires a lot of human input, since people can notice patterns computers can’t.


Have to check, as the Smithsonian does, how much people contribute when they do visit the site. Do they just visit for a few seconds or minutes, or do they interact? Need to build the metrics into the site from moment one.

What are some projects that could be done, or partnerships that could be made, in the foreseeable future?

“Remembering Lincoln” at Ford’s Theatre: trying to find where people have their own collections. Still has issues with outreach; the project is realizing that it takes more effort than expected. Could use K-12 outreach


Session 1: Institutional MediaWiki (10:30-11:20)

Session 1: 10:30-11:20

Institutional MediaWiki

Meg Brown, Folger Shakespeare Library


  • Institutional wiki: Folgerpedia
    • 99% of contributors are staff members
    • Scholar pages, playbill collections, seminar information
    • Not a repository per se—Folger also has an image repository called Luna
      • Wiki doesn’t allow for certain file type hosting; future plans to cross-link to other resources
      • Does host MP3 files so podcasts of Shakespeare’s birthday lectures are accessible online
  • Issues of audience
    • Folgerpedia is a place where other departments of the Folger come together (exhibitions, public outreach, etc)
  • Institutional history
    • Outsiders: keeping track of seminars, fellows, etc.
    • Inside history: collections, possibly tracking all the times the Folger has appeared in film, etc.
  • Other wiki spaces
    • Link to time management & scheduling systems
    • Wiki development space—Insights—publish pages in private before putting on Folgerpedia
      • Used for longform pieces
  • Other uses for wikis (non-Folger folks)
    • Central location for best practices documents—internal wiki
    • Not always a culture of collective editing/updating
      • ways to foster collaboration?
    • Silos can prevent communication & collaboration
  • Many DH groups default to GoogleDocs instead of wiki
    • Is the wiki concept “too much?”
    • Having to use markup language deters people from adding and participating
  • Measure of success
    • Reader views, not patron contributions: if people are accessing it, it is useful
      • Readers=people who have applied for reading privileges at the Folger
    • Main page of Folgerpedia has been accessed over 36,000 times.
  • Other possibilities of access and formatting?
    • Wiki is great for text and articles
      • Allows for transcriptions
    • Omeka would be better for objects or photos, but with a text-based collection, wiki works
    • Can be an intermediary for learning and creating with regard to public resources
    • Wikipedia doesn’t like institutions posting, so this is a way to take ownership of material and research


  • Does the wiki format add to or get in the way of information sharing? (Think linked open data, etc.)
    • “friendlier way to get at linked data”
    • connecting ideas and resources in a non-hierarchical way
      • sometimes tagging can be hierarchical
      • interdepartmental issues can prevent people from feeling comfortable making changes
      • some departments are more proprietary than others
    • institutional wikis require large amounts of staff time
      • can lead to outsourcing
      • not always group consensus and staff input when outsourcing
  • Wiki models
    • Business school wiki at UM does both outreach and internal service but most wikis are either/or
      • Intranet
    • Folger is still looking for other models
    • UMD: wiki resources are underused
    • Harry Potter wikis—best models may not be in scholarly communities
    • Monticello—community can comment but not contribute but you still have to apply to be in the community
  • Cataloging tag on Folgerpedia
    • Folger’s earliest pages were cataloging pages in efforts of trimming website
      • Getting rid of the extra “u” in cataloging was a problem
    • Seeing what made it from the card catalog to the online and such has helped researchers
    • Scholar pages (920 of them!)
      • Patrons are engaging with the resource and suggesting edits and corrections
    • Potential for growth in the area of provenance and former ownership
      • People get excited about book plates and signatures and identification
      • Random fact discoveries
  • Rossetti Archive
    • Catherine: “it’s too much”
    • PDFs of every page of every Dante Gabriel Rossetti poem in literary journals
    • Look at this as an example of a linked data extreme—learn from this for wiki
  • Deciding how to do Folgerpedia pages
    • Hamlet the play vs Hamlet the character
    • Who are the focused users? Librarians, catalogers, high-level scholars who may be interested in other things
      • These people fill in for users who don’t have contribution privileges
  • Development and crowdsourcing
    • Folgerpedia as a pedagogical tool?
      • Have college students create pages?
  • Information architecture
    • What sections do you funnel people into? What are the goals and who ends up keeping track of how the wiki grows?
    • Need for targeted expansion
    • Who moderates? How can institutions prevent vandalism?
      • Folgerpedia is gated because there are academic perspectives the institution does not support (i.e. Oxfordians)
  • Folgerpedia is different from Wikipedia because Folger accepts and encourages original research
    • Use of primary sources encourages people to go to the Folger library. Wikipedia only wants citation of secondary sources.
    • Best practices and contributor guidelines
      • Do not have to have a higher degree to contribute to Folgerpedia—all ranges of expertise welcome
      • Not wholly prescriptive
      • Templates created, but require some knowledge of MediaWiki to use
  • Stylistic issues
    • Disambiguation issues with scholar pages
      • LOC plugin would’ve broken MediaWiki
      • Middle initials are problematic
    • Everyone should have ORCID records!
      • Unique researcher identifier that can be attached to article systems
      • WorldCat supported!
      • AND, facilitates attribution for people who might publish under multiple names
      • Not common in the humanities yet
    • It is possible to make a redirect or disambiguation page if a scholar asks for it, but Folger uses the “most authoritative name” they have on file (i.e. fellowship name)
      • “what do they call you at tea?” (Folger has tea every day at 3)
    • Redirects for people who change the gender they identify with
    • Academics move too much to have biographies
      • When scholars become participants they have the option to add biographies


Session 1: Social Network Analysis with Dan (10:30-11:20)

  • Introductions
  • Dan’s Salem witch trial project
    • Briefing on his project
      • Biggest problem: How are people connected?
        • Over 900 documents to review
          • SNA is a useful way to keep track of the individuals and track their interactions
        • So far he has over 2000 pairs
      • Question: how is this being created?
        • NodeXL demonstration
          • System of line and node connections
          • Ranks the most central figures
          • First two months of documents have been documented
            • March documents
              • Information recorded in an excel spreadsheet
                • 50 people
                • Over 1000 relationships
              • Input data into NodeXL where the data is prepared through the program
            • Program can track the different clusters
            • Data can be grouped into boxes
              • Tracks the various trials
            • For each individual, the program calculates their position within the web
              • Degree centrality: how many people do they know?
              • Betweenness centrality: how often does a person sit on the shortest path between other individuals?
              • Closeness centrality
              • Eigenvector centrality
                • If it is a high number-they can be identified as the most powerful/important people
              • Question: why did they (accusers) accuse so many people?
                • Unknown
                  • Various theories
                  • Ann Putnam is a 12 year old girl in a puritan society
                    • Believes that her father (Thomas Putnam) is pushing her to do this
                  • Question: Have you found anything that you didn’t expect to find?
                    • Past scholarship did not connect different groups, but Dan has all of the people connected (everything is examined through a larger lens)
                    • He has found connections that weren’t previously identified
                  • There are many ways that data can be displayed
                    • Harel-Koren
                      • You can see the individuals and their relations “blossom”
                    • Circle
                      • You can see the density of your network
                        • Helps you see who you should focus on (fewer connections vs. many connections)
                      • Spiral (not the best for this project)
                      • Sine Wave
                        • Demonstrates the denseness as well
                      • Question: are you keeping an archive of the documents you record?
                        • So far the information exists in a book
                        • Future plans to scan documents in Salem, MA
                      • Positive feedback by senior Salem scholars
                    • Question: how is the network constructed?
                      • It is important to look at the network as a whole before there is an investigation on a deeper level
                        • These connections would not have been found without a quantitative approach to this history
                      • Question: is there a way to identify this textually? What happens when people are color blind?
                        • Extremely difficult
                      • Question: How do you emphasize that people are in contact every day? Can you weight individual relations in the visualization?
                        • You would make an edge weight (make line between the nodes thicker)
                        • Directional vs. non-directional (identified by arrow)
                          • Demonstrates who is reaching out to whom
                        • Question: Is there a good tutorial out there?
                          • Book by NodeXL
                          • Diane Cline’s upcoming book Digital Humanities and NodeXL (coming in 3 months)
                            • Example in the book: Plutarch’s Life of Pericles
                              • The example comes with step-by-step instructions so that the data can be visualized
                                • Extremely detailed instructions
                              • Gender column to track the interactions between males and females
                            • Other SNA projects
                              • Diane Cline’s Socrates project
                                • Students, philosophers, intellectuals, Sophists, etc
                                  • Look at how the different clusters interact
                                  • Connections within Shakespeare’s plays
                                  • Stanford Lit Lab
                                    • Tracks Hamlet and the relationship between the characters
                                    • Has not been generated in an SNA program
                                  • Tina’s oral history project
                                    • Group house of an artistic community in Alexandria (no longer in existence)
                                      • Looks at how artistic communities are anchored by different people
                                        • Mapping of relations between the “creative class”
                                          • What can you see about the creative class of Arlington
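
The centrality measures and edge weights from the NodeXL demonstration above can be sketched in plain Python. This is only an illustration, not NodeXL itself and not Dan's workflow: the names A through E and the list of pairs are hypothetical placeholders rather than records from the Salem documents, and the formulas are the usual textbook definitions, assuming an undirected and connected network.

```python
from collections import Counter, deque

# Hypothetical toy data: each pair means two people appear together in a document.
# A and B appear together twice, which becomes an edge weight of 2.
pairs = [
    ("A", "B"),
    ("B", "A"),
    ("A", "C"),
    ("A", "D"),
    ("B", "C"),
    ("D", "E"),
]


def build_graph(pairs):
    """Turn a list of pairs into an undirected adjacency mapping."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def edge_weights(pairs):
    """Count how often each undirected pair occurs (a thicker line in a drawing)."""
    return Counter(tuple(sorted(pair)) for pair in pairs)


def degree_centrality(graph):
    """Share of the other people that each person is directly connected to."""
    n = len(graph)
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


def shortest_path_lengths(graph, source):
    """Breadth-first search: number of steps from source to every reachable person."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closeness_centrality(graph):
    """Number of reachable people divided by the total distance to them."""
    result = {}
    for node in graph:
        dist = shortest_path_lengths(graph, node)
        total = sum(dist.values())
        result[node] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


graph = build_graph(pairs)
weights = edge_weights(pairs)
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
most_connected = max(degree, key=degree.get)

for person in sorted(graph):
    print(person, round(degree[person], 2), round(closeness[person], 2))
```

In this toy network A has the highest degree and closeness, which is the kind of ranking of central figures the demonstration described. Betweenness and eigenvector centrality need more involved algorithms and are left out of the sketch.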

THATCamp DC2015 is almost here

… and it is time to start proposing your discussion topics, demos, hacks, and mini-workshops. You may also suggest them in the morning session at 9:30. Doors open at 9:00 with breakfast, registration and conversation. Please see the website tabs for details on transportation. There is a festival on the capitol mall so add a little extra time to the commute in.

For those on the fence, this is an opportunity to get and share ideas about the Digital Humanities with diverse colleagues from regional institutions and universities; learn a bit about DH tools, issues, best practices, struggles, and culture; and get an inside look at where we are today.  We welcome walk-ins but prefer pre-registration to help us plan ahead. This is for students, faculty, practitioners, librarians, archivists, project managers, and the interested public.

Register in B156 Phillips Hall, one flight down from the entrance, which is located on the south side of I Street between 22nd and 21st NW. See you there!




THATCamp DC2015 is coming April 18, 2015 to Foggy Bottom. Really.

Please join us for DC2015 THATCamp. The un-conference will be held from 9:30 to 3:45 with breakfast and lunch included, and is organized by the GWU students in HIST3001 Digital Humanities and the Historian. We anticipate a structure somewhat like this:

9:30-10:00 Registration and breakfast in B156 Phillips Hall (two short blocks from Foggy Bottom Metro, public parking across the street).

Breakfast (bagels, cream cheese, coffee)


Registration at DC2014

10:00-10:50 Introductions and setting the agenda (maximum 16 sessions, 4 rooms and 4 timeslots.) Come with an idea for a conversation you would be willing to host or post in advance as a comment at the bottom of the page “Proposals” on this website.


Introductions around the room, session proposals

Schedule Produced in the first session from participant proposals

11:00-11:50 Session 1 x 4 rooms


Checking the sessions and rooms

12:00-12:50 Session 2 x 4 rooms

Self-organized sessions in GWU classrooms


Breaks between sessions for networking; continuing the conversation

12:50-1:40 Lunch

1:45-2:30 Session 3 x 4 rooms


Checking the agenda online

2:45-3:30 Session 4 x 4 rooms


The DC2014 THATCamp organizers

3:30-3:45 Gathering in original large room Phillips B156 for closure and to say farewell

End of THATCamp2014


There’s a new THATCamp being planned! Set aside April 18th on your calendar, register now, and we will give you more info as we know it. Currently we expect to hold it in B156 and 109, 110, 111 Phillips Hall on the GWU campus, at 22nd between H and I streets NW, one block from Foggy Bottom metro station and across H Street from Gelman Library. Meanwhile, read more about the THATCamp movement, browse other THATCamps, and see what we did last year.
