JASP: a pleasant discovery

When it comes to linguistic analysis, I am a self-confessed statistics-phobe, and, for me, any instance in my research where complex statistical analysis and visualisation are required has always been the most challenging part of the process. I have solid foundational knowledge, thanks to an A-level in maths, but my confidence in whipping up pretty graphs and analysing complex datasets is low.

I have long been aware of the best solutions to my woes: one is to find the time to knuckle down and learn how to understand and use stats software like R. I know other researchers who have done just that and seem to be much more confident when complex statistical analysis is required. But I have always worried that the learning curve would be so steep that I may ‘waste’ too much time trying and failing to get things to work. Another solution is to work with someone who CAN do what you can’t do (and, ideally, offer them something in return!). But sometimes you just want to be able to sit and get on with something yourself.

This week, I was asked to cover a lecture for a colleague who had to go off work at short notice. It was an introductory lecture about quantitative research methods for language analysis. The lecture was ‘introductory’ enough that I felt perfectly comfortable with teaching the content. The only new aspect was a practical activity using a piece of software called JASP, which I had not heard of.

JASP is a free, open-source statistics tool supported by the University of Amsterdam. Their website boasts of “an intuitive interface that was designed with the user in mind”. Tasked with demonstrating how to use JASP with little time to prepare, I put this claim to the test.

I’m very happy to say that I found it very easy to use. What’s more, the support materials on their website are excellent, and offered in a range of formats including text, GIFs and YouTube videos. In addition, I found this guide for students which is super helpful.

Before long, I found myself able to run analyses and produce plots that had previously given me a headache.

Plots like this may be fairly basic for some, but for me this is a massive first step in feeling more confident with calculating and presenting statistical analyses in my work. What’s more, with the recent release of corpus-specific resources like Lancaster Stats Tools online (and the associated book by Dr Vaclav Brezina), I’m increasingly confident that moments like this will become more common!

The Spoken BNC2014 is now available!

On behalf of Lancaster University and Cambridge University Press, it gives us great pleasure to announce the public release of the Spoken British National Corpus 2014 (Spoken BNC2014).

The Spoken BNC2014 contains 11.5 million words of transcribed informal British English conversation, recorded by (mainly English) speakers between the years 2012 and 2016. The situational context of the recordings – casual conversation among friends and family members – is designed to make the corpus broadly comparable to the demographically-sampled component of the original spoken British National Corpus.

The Spoken BNC2014 is now accessible online in full, free of charge, for research and teaching purposes. To access the corpus, you should first create a free account on Lancaster University’s CQPweb server (https://cqpweb.lancs.ac.uk/) if you do not already have one. Once registered, please visit the BNC2014 website (http://corpora.lancs.ac.uk/bnc2014) to (a) sign the corpus’ end-user licence and (b) register your CQPweb account – following the instructions on the site. When you return to CQPweb, you will have access to the Spoken BNC2014 via the link that appears in the list of ‘Present-day English’ corpora. While access is initially only via the CQPweb platform, the underlying corpus XML files and associated metadata will be available for download in Autumn 2018.

The BNC2014 website also contains lots of useful information about the corpus, and in particular a downloadable manual and reference guide, which will be available soon. Further information, as well as the first research articles to use Spoken BNC2014 data, will be available in two in-press publications associated with the project: a special issue of the International Journal of Corpus Linguistics (due next month) and an edited collection in the Routledge ‘Advances in Corpus Linguistics’ series (due early 2018).

The BNC2014 does not end here – we are currently transcribing materials provided to us by the British Library to provide a substantial supplement to the corpus – find out more about that here: http://cass.lancs.ac.uk/?p=2241. For now, we will be waiting and watching with interest to see what work the corpus released today stimulates. As ever with corpus data, it does not enable all questions to be answered, but it does allow a very wide range of questions to be investigated.

The Spoken BNC2014 research team would like to express our gratitude to all who have had a hand in the creation of the corpus, and hope that you enjoy exploring the data. We are, of course, keen to hear your feedback about the corpus; this, as well as any questions, can be directed to Robbie Love (r.m.love@lancaster.ac.uk) or Andrew Hardie (a.hardie@lancaster.ac.uk).

Source: http://cass.lancs.ac.uk/?p=2378

Not just a linguistic resource but a unique record of humanity

ESRC blog

Robbie Love is a PhD student at the ESRC Centre for Corpus Approaches to Social Science (CASS) at Lancaster University, where he spent four years working on the Spoken British National Corpus 2014 project.

Harry Strawson is a writer living in London who contributed recordings to the Spoken British National Corpus 2014.

Here Robbie and Harry share two different perspectives on the Spoken British National Corpus project ahead of its release next week.

Every day billions of words are uttered in hundreds of languages all over the world. For corpus linguists, that is, people who study the form, use and function of language using specialised computer software, speech is like the golden snitch in a game of Quidditch. It appears to be everywhere around you and yet it is incredibly difficult to capture.
