Getting Started: 2/1-2/8
The first week of the internship was mostly spent setting up and getting settled. The first step was downloading the PDFs of the plays and converting them to Word and .txt files, along with installing Notepad++ so I could start data cleaning with RegEx (regular expressions). Once the files were ready, I started going through Tamburlaine Part 1 and learning RegEx to make the data cleaning process more efficient. To get up to speed, I went through the notes of previous students who worked on this project and used the RegEx 'how-to' guide. I recorded the expressions I wrote so I can reuse them on future documents. I also made a list of things to look for and fix, and from there, got to work getting Tamburlaine ready to go!
Total hours: 6
Plugging Away: 2/9-2/20
I have been working toward finishing the data cleaning for all the plays by March 1st. After a bit of practice, one play takes 2-3 hours to complete. As I work, I am discovering new ways to make the process easier and more efficient. For example, in some of the files there are large spaces before every line that need to be taken out. I discovered that holding the Shift and Alt keys in Notepad++ lets me highlight multiple sections at a time, so I can delete the extra space across many lines at once. This may seem like a minor discovery, but it saves quite a bit of time. As of today, 2/20, I have completed seven of the thirteen plays. My goal is to get the last six done by Monday, 2/27, and to have all the .txt files of the plays published by 3/8!
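The leading-space cleanup described above can also be done with a regular expression instead of a manual selection. The following is a minimal sketch in Python, not the process actually used on the plays; the function name and the sample lines are placeholders made up for illustration. The same pattern, `^[ \t]+`, should also work in the Notepad++ Replace dialog with the search mode set to "Regular expression" and the replacement left empty.

```python
import re

# Matches a run of spaces or tabs at the start of every line.
# re.MULTILINE makes ^ match after each newline, not only at the start of the text.
LEADING_WHITESPACE = re.compile(r"^[ \t]+", flags=re.MULTILINE)


def strip_leading_whitespace(text: str) -> str:
    """Remove indentation from every line while leaving line breaks untouched."""
    return LEADING_WHITESPACE.sub("", text)


# Placeholder text standing in for a play file with indented lines.
sample = "      SPEAKER NAME.\n      First line of the speech,\n\tsecond line of the speech."
cleaned = strip_leading_whitespace(sample)
print(cleaned)
```

Because the pattern only matches spaces and tabs, blank lines between speeches are preserved.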
Total hours: 14
Finishing Up: 2/21-2/28
This week, I finished the rest of the plays that needed to be edited. After working with the texts for a while and learning to anticipate what each play would need, I was able to cut the time down to 1.5-2 hours per play. This was a long process, but a very rewarding learning experience! I learned how to use Notepad++ and RegEx, both of which are very useful and which I plan on using in future projects. The next step of this project is to upload and publish the work I have been doing by March 8th. I plan on explaining my process in detail and describing how I was able to cut down the time spent on each play.
Total hours: 12
Posting: 3/1-3/7
With all the .txt files cleaned and ready to go, all that was left for this project was to post them on the website! After reviewing my knowledge of WordPress and the posting process, I was ready to go. I was able to get all the files uploaded correctly in about 3 hours, spread out over the weekend. I used a posting script, so all the posts look uniform and professional! Next, I look forward to starting new projects, either using my design skills for promotional material or diving back into Tableau and examining the relationships within each play.
Total hours: 4
First Degree of Marlowe: 3/20-3/25
After finishing the .txt files, I started working on the networking data from the Six Degrees of Francis Bacon. I took the first-degree connections to Christopher Marlowe and started collecting data on each figure. The original plan was to make a network map in Tableau to show the connections. Unfortunately, Tableau is not capable of graphing this kind of data, so we shifted gears and are now working in Graph Commons. This program allows us to make connections between several historical figures and to add additional information, including links to other websites, within the graph. I am now collecting information about each figure who has a first-degree connection to Kit Marlowe and formatting the Google Sheets correctly so the data will work in Graph Commons. This step of the process should be ready by 3/25.
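To illustrate the kind of sheet formatting described above, here is a rough sketch in Python of turning a list of first-degree connections into edge rows, one row per connection. Everything in it is an assumption for illustration: the column names, the "Person" type, the "KNEW" edge label, and the figure names "Figure A" and "Figure B" are placeholders, and the real column headers should be checked against the Graph Commons import template before use.

```python
import csv
import io

# Assumed column layout for an edge sheet; verify against the actual import template.
EDGE_COLUMNS = ["From Type", "From Name", "Edge", "To Type", "To Name"]


def build_edge_rows(center, connections, node_type="Person", edge_label="KNEW"):
    """Create one edge row from the central figure to each connected figure."""
    return [
        {
            "From Type": node_type,
            "From Name": center,
            "Edge": edge_label,
            "To Type": node_type,
            "To Name": name,
        }
        for name in connections
    ]


def rows_to_csv(rows):
    """Serialize the rows as CSV text that could be pasted or imported into a sheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=EDGE_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder names; the real list would come from the Six Degrees of Francis Bacon data.
rows = build_edge_rows("Christopher Marlowe", ["Figure A", "Figure B"])
edge_csv = rows_to_csv(rows)
print(edge_csv)
```

Keeping every row in the same column order is the main point; a single misnamed or misplaced column is the sort of thing that can make an import fail.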
Total hours: 6