CCP Corpus
What is the CCP Corpus?
The CCP Corpus provides easy access to our growing collection of transcribed minutes and proceedings of the Colored Conventions.
The Colored Conventions Project seeks to bring the buried history of nineteenth-century Black organizing to digital life. Part of this work is enabling researchers to look for patterns across this broad, rich history in the language of the minutes themselves. These transcriptions are made possible by the dedicated volunteers of the CCP Transcribe Minutes initiative.
The CCP Corpus is designed to work with most large-scale text-analysis tools and methods, from Voyant Tools and the Natural Language Toolkit to topic modeling and other natural language processing techniques.
By clicking the button below, I commit to the following principles:
- I honor CCP’s commitment to a use of data that humanizes and acknowledges the Black people whose collective organizational histories are assembled here. Although the subjects of datasets are often reduced to abstract data points, I will contextualize and narrate the conditions of the people who appear as “data” and name them when possible.
- I will include the above language in my first citation of any data I pull/use from the CCP Corpus.
- I will be sensitive to a standard use of language that again reduces 19th-century Black people to being objects. Words like “item” and “object,” standard in digital humanities and data collection, fall into this category.
- I will acknowledge that Colored Conventions were produced through collectives rather than by the work of singular figures or events.
- I will fully attribute the Colored Conventions Project for corpora content.
How do I use it?
As the CCP remains committed to the large-scale recovery of the convention minutes, please be aware that this collection is as yet incomplete. All uses of these materials published online or in print should indicate the provisional nature of the CCP Corpus. The CCP Corpus will grow significantly with the progress of Transcribe Minutes. Each updated version will be titled with the year and month last updated (e.g. 2015-11-CCP-Corpus.zip).
The downloaded zip file includes:
- A folder with all of the minutes in plain-text format.
- A table of contents in CSV form with relevant event and bibliographic data for each of the minutes.
- A text file named "Read Me" with notes about updates, permissions and citation guidelines.
The table of contents describes each of the texts in the collection, with a unique file ID, public URL, and event start date. Additional metadata is available upon request.
The Read Me file provides details about the collection's updates, reproduction permissions and guidelines on how to cite the CCP Corpus.
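For researchers working in Python, the sketch below shows one possible way to read the downloaded zip file into memory: the table of contents as a list of rows and the minutes as a dictionary of texts. It is only an illustration under several assumptions that are not confirmed by the CCP: that the table of contents is the only .csv file in the archive, that the minutes are stored as .txt files encoded in UTF-8, and that the hypothetical function name load_corpus suits your project. Check the Read Me file and the actual CSV column headers before relying on it.

```python
import csv
import io
import zipfile
from pathlib import PurePosixPath


def load_corpus(zip_path):
    """Return (toc_rows, texts) read from a CCP Corpus zip archive.

    Hypothetical helper. Assumes the table of contents is the only .csv
    file in the archive and that the minutes are UTF-8 .txt files.
    """
    toc_rows = []  # one dict per row of the table of contents
    texts = {}     # file name without extension -> full text of the minutes
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Skip resource entries that some macOS zip tools add.
            if name.startswith("__MACOSX/"):
                continue
            path = PurePosixPath(name)
            suffix = path.suffix.lower()
            if suffix == ".csv":
                with archive.open(name) as raw:
                    # utf-8-sig drops a byte-order mark if one is present.
                    wrapper = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                    toc_rows.extend(csv.DictReader(wrapper))
            elif suffix == ".txt":
                # A Read Me saved as .txt would be collected here as well;
                # filter by folder name if that matters for your analysis.
                texts[path.stem] = archive.read(name).decode("utf-8", errors="replace")
    return toc_rows, texts
```

Calling load_corpus("2015-11-CCP-Corpus.zip") would then return the table-of-contents rows and the texts, which can be matched to one another if the file ID column corresponds to the text file names (also an assumption). Keeping each text linked to its event and bibliographic data makes it easier to name and contextualize the people and conventions behind the words, in keeping with the principles above.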
Feedback
We are eager to promote innovative uses of the convention minutes. If you are using these materials, or have any feedback to improve the CCP Corpus, please contact us at info {at} coloredconventions.org.
Copyright Statement
The Colored Conventions Project Corpus is being released under a Creative Commons Attribution-NonCommercial 4.0 International License. Please credit the Colored Conventions Project for providing access to these materials.