Click here to download a “clean” .txt document of Christopher Marlowe’s, Thomas Nashe’s, and William Shakespeare’s I Henry VI (Regular Spelling).

The file has undergone the following data-cleaning steps to make it suitable for text analysis:

  • Line numbers, IMG, and SIG information removed using RegEx: [a-z]+\s\d\d\d\d
  • Page breaks and indents removed manually
  • Speaker tags removed manually
  • Spaces entered between speakers
  • Beginning publishing information and ending footnotes removed
  • Spaces added between words as needed
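
For readers who want to reproduce the RegEx step on another transcription, the following is a minimal sketch in Python, not the script actually used for this file. The pattern is the one quoted in the list above; the function name strip_markers and the "img 0012" / "sig 0003" markers in the sample are hypothetical, since the exact form of the markers in the source transcription is not shown here.

```python
import re

# Pattern quoted from the cleaning notes: one or more lowercase letters,
# one whitespace character, then exactly four digits.
MARKER = re.compile(r"[a-z]+\s\d\d\d\d")


def strip_markers(text: str) -> str:
    """Delete every match of MARKER, then tidy the whitespace left behind."""
    without_markers = MARKER.sub("", text)
    # Collapse runs of spaces or tabs within each line and trim line ends;
    # line breaks themselves are preserved.
    lines = [
        re.sub(r"[ \t]{2,}", " ", line).strip()
        for line in without_markers.split("\n")
    ]
    return "\n".join(lines)


# Hypothetical sample: the marker strings are assumed, not taken from the source file.
sample = "img 0012 Hung be the heavens with black,\nyield day to night! sig 0003"
print(strip_markers(sample))
```

Note that the pattern also matches any ordinary lowercase word followed by a four-digit number (a year, for example), so it is worth reviewing the matches before deleting them.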

Data Cleaning Credit: Meggan Law (Framingham State University ’24)

Henry VI, Part I.txt