Friday, November 2, 2012

Digitization and Holistic Approaches to Data Sets


Over the past two years I have spent a lot of time working with Residential School quarterly return reports.  These reports were completed four times a year by school principals and contain the names, admission dates, ages and discharge information of the students who were in attendance at the school.  The set of returns we have is far from complete, but the returns are among the best documents for providing proof that an individual attended an Indian Residential School (IRS).

The majority of the work I do with quarterly returns is dictated by reference requests from former students, staff, families, and communities looking for information about a particular individual.  While processing a typical reference request, I flip through a binder that contains all the quarterly returns for a school and pick out any references to that individual.  I then scan or photocopy the relevant pages and send them to the person who made the request.  Though this process involves the quarterly returns, it never led me to consider the returns as a whole.

A recent project I've worked on helped me take a more holistic approach to looking at some quarterly returns.  The returns are among the most frequently accessed documents in the archive where I work, and staff spend a considerable amount of time manually searching them for relevant information.  The majority of the quarterly returns are handwritten and many of them are poor-quality copies.

The handwritten nature of the returns means that using Optical Character Recognition (OCR) software to make the documents full-text searchable isn't possible.  But given the importance of these records and the frequency of access, creating a searchable transcribed version of the quarterly returns was seen as valuable.  So far, we have only transcribed the records for the schools that are accessed most frequently.

Though time consuming, this task has not only increased access to the quarterly returns but also provided some insight into the schools as a whole.  For example, the returns often indicate if a student is in the hospital or infirmary.  Transcribing these records and making them searchable has made it easier to track outbreaks of illness within the schools.  For example, at the schools in Spanish, Ontario, there were 28 boys sick in December 1943, which is almost 20% of the entire male student population.

Similarly, looking at the returns more holistically has also highlighted education trends within the school.  Often a trade or industry of study is listed for students at the boys' school.  The most common trades include: farming, dairy, carpentry, poultry, cooking, tailoring and shoemaking.  It is now possible to group data by trade and determine which trades were more popular at particular periods.
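
As a rough illustration of that kind of grouping, a minimal Python sketch might look like the following.  The row layout (year, quarter, trade) and every sample value are invented for illustration; they are not taken from the actual returns, and the real transcriptions may be organized quite differently.

```python
from collections import Counter, defaultdict

# Hypothetical transcribed rows: (year, quarter, trade).
# These values are invented for illustration only.
rows = [
    (1940, "Q1", "farming"),
    (1940, "Q1", "carpentry"),
    (1940, "Q1", "Farming "),
    (1941, "Q1", "dairy"),
    (1941, "Q1", "farming"),
    (1941, "Q1", "dairy"),
    (1941, "Q1", "dairy"),
]

def trades_by_year(rows):
    """Count how often each trade is listed in each year."""
    counts = defaultdict(Counter)
    for year, _quarter, trade in rows:
        # Normalize spacing and case so "Farming " and "farming" match.
        counts[year][trade.strip().lower()] += 1
    return counts

counts = trades_by_year(rows)
for year in sorted(counts):
    # most_common() lists trades from most to least frequent.
    print(year, counts[year].most_common())
```

Counting by year rather than by individual quarter keeps a student who appears on all four returns from dominating a single period, though a real analysis would probably also need to count each student only once per year.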

The transcription process has also illuminated the fallibility of these records.  One of the most common mistakes in the returns is the misspelling of family names.  Transcribing the returns made it clear that the spelling of a name could change depending on who filed the return (e.g. Corbiere or Corbier or Coribiere).
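
One way a searchable transcription could cope with spelling variants like these is approximate string matching.  The sketch below uses Python's standard difflib module to group similar spellings together.  It is only an illustration under assumptions: the 0.8 threshold is an arbitrary choice, "Smith" is an invented name added purely for contrast, and this is not a description of how our own transcriptions are searched.

```python
from difflib import SequenceMatcher

def similar(a, b, threshold=0.8):
    """Return True when two names are close enough to be likely variants."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

def group_variants(names, threshold=0.8):
    """Group each name with the first existing group whose lead name it resembles."""
    groups = []
    for name in names:
        for group in groups:
            if similar(name, group[0], threshold):
                group.append(name)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([name])
    return groups

# The three Corbiere spellings come from the returns discussed above;
# "Smith" is a hypothetical name included only for contrast.
names = ["Corbiere", "Corbier", "Coribiere", "Smith"]
groups = group_variants(names)
print(groups)
```

Any automatic grouping like this would still need a person to review it, since two genuinely different family names can look alike and a badly misspelled name can fall below the threshold.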

Overall, being able to access information and process requests more efficiently is always a great thing in my mind.  More importantly, this experience has highlighted the potential of records to provide contextual information when looked at holistically.  Considering the difficulties (e.g. missing records) that many people working with residential school records come across, it is important to use the information that does exist to its fullest potential.

Library and Archives Canada has compiled a guide to conducting Residential School research that might be useful to anyone beginning to work with IRS documents.
