Archives for category: Census

I attended a very informative and thought-provoking presentation by Peter Baskerville at the University of Guelph today. He postulates that the shift in wealth from men to women during the period 1860–1930 was of similar magnitude to that of the land grab by European settlers from native Canadians. His presentation was based on material from his forthcoming book “Silent Revolution: Wealth and Gender in Canada, 1860–1930.” Baskerville’s work in this book, as in the past, rests on his impressive use of cutting-edge quantitative analysis and synthesis of census data with other official records. His record of articles, books and edited volumes has shed new light on the lives of ordinary Canadians during the nineteenth and early twentieth centuries.

I have lately been exploring various means for the automated longitudinal matching of census manuscript records. It’s a huge challenge, and I seem to have spent as much time identifying potential problems as identifying potential solutions. This is not to say I haven’t pondered a couple of solutions, but the list of challenges remains much longer and seems to be growing much faster. All this means is a more challenging research problem, demanding some innovation in methodology. Fun!

But there is a paradigm shift happening, one that I have been participating in, and certainly embrace, but am not always cognizant of. The idea of online collaboration continues to permeate more and more of our everyday tasks. It emerged from specialized research objectives such as the SETI@Home initiative, which sought to use excess personal computing capacity distributed around the world, and has grown into efforts today that take advantage not only of excess processor cycles but also of the masses, who are engaged to carry out specific manual tasks.

I started playing with the Google Image identification programme a few months back. If you haven’t tried it, it basically involves matching you with a random online user, and the two of you spend 90 seconds typing in words to describe a picture displayed to both users. You quickly type words that come to mind until both users type in the same word, at which point the engine accepts that the word is likely to be a relevant descriptor. The key to participation is that the exercise is fun and fast, you can hop on at any time, and, given the global scope, you will quickly be paired with an online user. Moreover, you have the small satisfaction of being part of a bigger exercise of improving the descriptors attached to Google’s image search repository.

This little ‘game’ also clearly illustrates one of the downsides of Google’s repository, as these descriptors are determined through a process which renders them simple rather than specialized. As I ‘play’ I realize that I may recognize the image as a particular movie poster, but I also think that my online partner may not catch the subtleties, so I may resort to simply choosing a predominant colour as a suggested word, rather than the name of the movie or, say, an actor in the movie. As a result I choose the more obvious descriptor to encourage a faster match. The objective in the Google game is to match words for the highest number of images during the 90-second period, which may not yield the best descriptions. However, the process does deliver some basic descriptive terms that an automated process would miss. The key is making it fun for the participants.
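
The agreement rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Google’s actual implementation: the function name `first_agreed_label` is hypothetical, and the sketch assumes the two players alternate turns and that words are compared case-insensitively.

```python
def first_agreed_label(words_a, words_b):
    """Return the first word that both players have typed, or None.

    words_a and words_b are the words each player typed, in order.
    Assumption: players alternate turns (a, b, a, b, ...), and words
    are compared after trimming whitespace and lower-casing.
    """
    seen_a, seen_b = set(), set()
    for i in range(max(len(words_a), len(words_b))):
        if i < len(words_a):
            word = words_a[i].strip().lower()
            if word in seen_b:      # partner already typed this word
                return word
            seen_a.add(word)
        if i < len(words_b):
            word = words_b[i].strip().lower()
            if word in seen_a:      # partner already typed this word
                return word
            seen_b.add(word)
    return None


# A specialized word ("casablanca") never matches; the obvious one does.
label = first_agreed_label(["casablanca", "red", "film"],
                           ["blue", "red", "actor"])
```

Because only the first shared word is accepted, the rule rewards obvious descriptors over specialized ones, which is the downside noted above.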

In this same vein, Kris Inwood pointed me to a census initiative, Automated Genealogy. Working from the same premise of making fun a process that requires mass user intervention, the Automated Genealogy site is a meeting point for genealogists to sign up for and manually enter manuscript census records into a database. The hope here is to engage that vast army of genealogists out there to contribute time to help their fellow genealogists and to gain access to records which benefit their own research efforts. Collaboration at its best. Additionally, they have begun a similar process to match Canadian manuscript census records between the 1901 and 1911 censuses. This is the same task for which I have been ruminating over developing an automated process. At AG they are using automated means to do simple matching and then allowing users to refine the match where human discretion is required. This is a clever approach to a real-world research problem. As to progress, the published results indicate that they have transcribed 93.15% of the entire Canadian census for 1911 and 99.99% of the 1901 census, with 55.15% of the proofing carried out on this one.
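
To make the “automated simple matching, then human refinement” idea concrete, here is a minimal Python sketch. It is not Automated Genealogy’s method, which is not described here. The record fields (`name`, `age`), the similarity threshold, and the age tolerance are all assumptions chosen for illustration. It uses the standard library’s `difflib.SequenceMatcher` for a rough name comparison.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Rough string similarity between two names, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_record(rec_1901, records_1911, min_sim=0.85, age_tolerance=2):
    """Classify the 1911 candidates for a single 1901 record.

    Returns ("match", record) when exactly one candidate survives,
    ("review", candidates) when a human has to choose between several,
    and ("none", []) when nothing plausible is found.
    The thresholds are illustrative assumptions, not tuned values.
    """
    expected_age = rec_1901["age"] + 10  # ten years between the censuses
    candidates = [
        r for r in records_1911
        if abs(r["age"] - expected_age) <= age_tolerance
        and name_similarity(rec_1901["name"], r["name"]) >= min_sim
    ]
    if len(candidates) == 1:
        return ("match", candidates[0])
    if candidates:
        return ("review", candidates)
    return ("none", [])
```

Only the ambiguous cases are handed to volunteers, so human discretion is spent where the simple rules cannot decide.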

This is a great example of the emerging trend to mobilize individual efforts en masse to assist with processes that in the past would have been carried out by a small group of specialized researchers. Both processes recognize that tasks can be divided, with different and appropriate resources applied at each stage. Mass collaboration on simple tasks made fun!

On Tuesday, I had the pleasure of meeting with Kris Inwood, Director of the 1891 Census Project at the University of Guelph, along with his staff, at a review of this exciting project.

Census project staff have been entering data since 2002 and, as of last Friday, have completed the data entry phase. They have compiled a database comprising 328,000 records, which represents a 5% sample of the entire population of Canada in 1891. They have oversampled, to 10%, in certain urban areas as well as in the west of Canada. There is also a 100% capture of group quarters (households with more than 30 residents indicated in the manuscript census records). The next step in the project is to begin coding columns such as religion and occupation to allow for systematic use by researchers.
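
The sampling design just described can be expressed as a small Python sketch. This is purely illustrative and not the project’s actual procedure: it assumes the household is the sampling unit, and the field names (`residents`, `area`) and function names are hypothetical.

```python
import random


def sampling_rate(household, oversampled_areas):
    """Sampling probability for one household under the stated design."""
    if household["residents"] > 30:            # group quarters: capture all
        return 1.0
    if household["area"] in oversampled_areas:  # selected urban and western areas
        return 0.10
    return 0.05                                 # base 5% sample


def draw_sample(households, oversampled_areas, seed=0):
    """Keep each household independently with its sampling probability."""
    rng = random.Random(seed)  # seeded so the draw is reproducible
    return [h for h in households
            if rng.random() < sampling_rate(h, oversampled_areas)]
```

Any analysis of such a sample would need to weight records by the inverse of their sampling rate, since oversampled areas and group quarters are over-represented relative to the base 5%.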

Over the life of the pro­ject par­ti­cipants have also been con­duct­ing research on their own interests using census data. A num­ber have com­pleted very inter­est­ing papers examin­ing top­ics such as the char­ac­ter and nature of the enu­mer­at­ors, the foibles of the enu­mer­a­tion pro­cess, meth­od­o­logy involved in loc­at­ing abori­ginal per­sons in the census and a sur­vey of con­tem­por­ary news­pa­per cov­er­age of the census itself.

Equally impressive, many of the participants have contributed to a series of mini-biographies of individuals and families in the census, which will hopefully be shared via the census website. These papers illuminate the human side of manuscript census records, and they also provide very useful case studies demonstrating how census manuscript data can be used in a variety of research contexts.

Kris sug­gests that they are very close to being able to provide research­ers with the oppor­tun­ity to begin to use this data out­side the pro­ject and aven­ues are now being explored to provide sys­tem­atic dis­sem­in­a­tion of the dataset.