We will be holding two two-day workshops, on 24-25 April and 27-28 April, on text analysis using R. The focus is on a mix of the foundations of working with texts in R and the specific use of the quanteda (http://quanteda.io) package developed by Kenneth Benoit and a team of collaborators at the LSE.
Who May Participate
Anyone may apply to participate, but priority will be given to, in this order:
- PhD students
- early career academics
- other academics
- everyone else
Applicants should have some prior experience of programming in R and in text analysis, although the first day is pitched at an introductory level.
Apply to participate here.
Once your application has been approved, we will send you a link to register. We will only book travel and accommodation for applicants once they have registered for this workshop. The closing date for applications is Wednesday 22nd March and registrations will close on Friday 31st March.
The workshop is free to attend, and we will also cover the cost of travel and accommodation up to £300. If you provide us with the details of your requirements, we will book flights and accommodation directly. Lunch and refreshments will be provided on both days, and there will be a reception on the evening of 24th April. Breakfast will be provided on the morning of 25th April for those who stayed overnight on the 24th. We will only cover accommodation for the night of 24th April. If you require additional nights, we can book them for you, but you will be responsible for covering those costs.
Day 1: Introduction to Text Analysis Using R
1:30pm – 6pm, 24th April, commencing with lunch from 12:30pm. The workshop on 27th April will be from 10am – 4pm with coffee available from 9:30am.
We will cover how to format and input source texts, how to structure their metadata, and how to prepare them for analysis. This includes common tasks such as tokenisation (including constructing ngrams and “skip-grams”), removing stopwords, stemming words, and other forms of feature selection. We show how to: get summary statistics from text, search for and analyse keywords and phrases, analyse text for lexical diversity and readability, detect collocations, apply dictionaries, and measure term and document associations using distance measures. Our analysis covers basic text-related data processing in the R base language, but most of it relies on the quanteda package (https://github.com/kbenoit/quanteda) for the quantitative analysis of textual data.
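To give a flavour of the preparation steps listed above, here is a minimal sketch in base R only. It is not taken from the workshop materials: the two toy texts, the three-word stopword list, and the helper names (tokenise, make_bigrams) are all invented for illustration, and quanteda provides far more complete tools for each step.

```r
# Toy documents, invented for illustration
texts <- c(doc1 = "The cat sat on the mat.",
           doc2 = "The dog chased the cat!")

# Tokenise: lower-case, replace punctuation with spaces, split on whitespace
tokenise <- function(x) {
  nms <- names(x)
  x <- gsub("[[:punct:]]+", " ", tolower(x))
  toks <- strsplit(trimws(x), "[[:space:]]+")
  names(toks) <- nms
  toks
}

# A tiny, hypothetical stopword list (real lists are much longer)
stops <- c("the", "on", "a")

toks <- tokenise(texts)
toks_clean <- lapply(toks, function(t) t[!t %in% stops])

# Bigrams: join each remaining token with the one that follows it
make_bigrams <- function(t) {
  if (length(t) < 2) return(character(0))
  paste(head(t, -1), tail(t, -1), sep = "_")
}
bigrams <- lapply(toks_clean, make_bigrams)

# Document-feature matrix: one row per document, one column per word type
vocab <- sort(unique(unlist(toks_clean)))
doc_feature <- t(vapply(
  toks_clean,
  function(t) as.integer(table(factor(t, levels = vocab))),
  integer(length(vocab))
))
colnames(doc_feature) <- vocab

print(toks_clean)
print(bigrams)
print(doc_feature)
```

Because the bigrams are built after stopword removal, they pair words that were not necessarily adjacent in the original text; whether that is desirable depends on the analysis.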
Day 2: Advanced Text Analysis Using R
9am – 5pm, 25th April, with coffee and refreshments at the start. The workshop on 28th April will be from 10am – 4pm with coffee available from 9:30am.
This day will cover more advanced methods of text analysis using R, including how to pass the structured objects from quanteda into other text analytic packages for topic modelling, latent semantic analysis, regression models, and other forms of machine learning.
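As a rough illustration of handing a document-feature matrix to another tool, the sketch below runs a simple latent semantic analysis using the base R svd() function. It is an assumption-laden example, not the workshop's own code: lsa_coords is a hypothetical helper, the three texts are invented, and the quanteda part runs only if the package is installed. quanteda's convert() function offers dedicated export formats for other packages.

```r
# Hypothetical helper: project documents onto the first k latent dimensions
# of a document-feature matrix using a truncated singular value decomposition
lsa_coords <- function(m, k = 2) {
  decomp <- svd(m)
  k <- min(k, length(decomp$d))
  coords <- decomp$u[, seq_len(k), drop = FALSE] %*% diag(decomp$d[seq_len(k)], k)
  rownames(coords) <- rownames(m)
  coords
}

# Run the quanteda part only when the package is available
if (requireNamespace("quanteda", quietly = TRUE)) {
  library(quanteda)

  # Toy texts, invented for illustration
  texts <- c(d1 = "Taxes and spending dominate the budget debate.",
             d2 = "The budget raises taxes on high earners.",
             d3 = "The football match ended in a draw.")

  # Build a document-feature matrix, then pass it on as an ordinary matrix
  dfmat <- dfm(tokens(texts, remove_punct = TRUE))
  print(lsa_coords(as.matrix(dfmat), k = 2))
}
```

Keeping the decomposition in a function that accepts any numeric matrix means the same step works on output from quanteda or from any other source.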
An illustrative workshop given previously can be viewed at https://github.com/kbenoit/ITAUR.