I wrote this paper in 2014 as a Master's-level project in the first year of my PhD. At the time, my supervisor urged me to either publish it or at least ‘put it on the web’, but I didn’t, partly because I thought it was incomplete (it is, and it was my first attempt at LaTeX), and partly because of imposter syndrome etc etc etc.
Last week (March 2024) I was at a workshop where one of the presenters claimed there was no research on bias in Google search…
And it’s not just search engines: the recent excitement about LLMs seems to have spawned a new wave of interest in language in a digital age. But this is not a new field.
Obviously my dissertation wasn’t published, but there are MOUNTAINS and DECADES of papers and books that have been published (and I mean pre-‘stochastic parrots’), and I feel they are being overlooked in the bandwagon of GenAI and language/linguistics content being churned out at the moment. Some of those works are cited in this dissertation, and indeed in my PhD thesis, but many more have been written since (including a few of my own contributions).
So here it is now, on its way to join the soup of linguistic data on which ChatGPT etc feed…