Web scraping with Python and Introduction to Text Data with Python
in partnership with NCRM
This course provides the foundations for you to understand, execute and communicate text data analysis in Python, a widely recognised programming language for data analysis. Specifically, it will introduce additional skills using Python and requires prior introductory experience with the language.
The University of Exeter Q-Step Centre workshop | |
---|---|
Date | 24 - 25 April 2024 |
Place | Clayden Computational Lab |
Provider | The University of Exeter Q-Step Centre |
Event details
This practical, face-to-face workshop will be delivered over two days. It will give you both the technical programming skills and the understanding of data science techniques that you need to research pre-existing and novel socio-political and economic issues, along with transferable skills that are currently in demand in the job market.
Text data surrounds us in our lives and comes in different shapes and sizes, e.g. newspaper articles, tweets, product reviews and song lyrics. Although at first glance this information might seem hard to summarise and compare, computational techniques make it possible to extract meaningful information from text data.
Requirements:
This training can be taken as a standalone course, provided you have prior Python experience, or as a follow-on from Introduction to Python and Python for Data Analysis on 22 and 23 April 2024.
Day 3: Web scraping with Python
- Introduction to Google Colab (students need a working Gmail/Google account that they can log into)
- Pandas dataframes and uploading external data to Colab
- How to scrape a web page and extract text with Beautiful Soup
- How to analyse and visualise text content using the Seaborn library
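To give a flavour of the scraping step, here is a minimal sketch of extracting paragraph text from a page and counting words. It is an illustration, not the course material: the workshop uses Beautiful Soup, whereas this sketch uses only the standard library's `html.parser` so that it runs without installing anything. The sample HTML and the names `ParagraphExtractor`, `extract_paragraphs` and `word_counts` are hypothetical.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical page content, inlined so the sketch runs offline.
SAMPLE_HTML = """
<html><head><title>Demo</title><style>p {color: red;}</style></head>
<body>
  <h1>Text data</h1>
  <p>Text data surrounds us.</p>
  <p>Text data comes in different <b>shapes</b> and sizes.</p>
</body></html>
"""


class ParagraphExtractor(HTMLParser):
    """Collect the text found inside each <p> element."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            # Join the pieces and normalise whitespace.
            self.paragraphs.append(" ".join("".join(self._chunks).split()))
            self._in_p = False

    def handle_data(self, data):
        # Text inside nested tags such as <b> also arrives here.
        if self._in_p:
            self._chunks.append(data)


def extract_paragraphs(html):
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


def word_counts(texts):
    """Count lowercase word occurrences across a list of strings."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


paragraphs = extract_paragraphs(SAMPLE_HTML)
counts = word_counts(paragraphs)
print(paragraphs)
print(counts.most_common(3))
```

With Beautiful Soup, the extraction step would typically be written as `BeautifulSoup(html, "html.parser").find_all("p")` followed by `get_text()` on each element, and the resulting counts could then be loaded into a Pandas dataframe for plotting with Seaborn.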
Day 4: Introduction to Text Data with Python
- Text preprocessing
- Bag-of-words modelling and the count vectorizer
- Lexicon-based sentiment analysis using spaCy
- Comparative visualisation
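The ideas behind the Day 4 topics can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the stop-word list, the sentiment lexicon and the sample reviews are hypothetical and far smaller than anything used in practice, and the function names are invented for this example. Libraries such as scikit-learn (`CountVectorizer`) and spaCy provide fuller versions of these steps, with different tokenisation defaults.

```python
import re
from collections import Counter

# Hypothetical stop-word list and sentiment lexicon, for illustration only.
STOP_WORDS = {"the", "a", "is", "and", "but", "was", "this", "it"}
LEXICON = {"great": 1, "love": 1, "good": 1,
           "bad": -1, "terrible": -1, "boring": -1}


def preprocess(text):
    """Lowercase, tokenise on runs of letters, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(docs):
    """Return (vocabulary, matrix); matrix[i][j] counts vocabulary[j] in docs[i]."""
    tokenised = [preprocess(d) for d in docs]
    vocabulary = sorted({t for tokens in tokenised for t in tokens})
    matrix = []
    for tokens in tokenised:
        counts = Counter(tokens)
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


def sentiment_score(text):
    """Sum the lexicon values of the tokens; unknown words score 0."""
    return sum(LEXICON.get(t, 0) for t in preprocess(text))


# Hypothetical sample documents.
reviews = [
    "The film was great and the cast was great",
    "A boring plot but good music",
    "This is terrible",
]

vocabulary, matrix = bag_of_words(reviews)
scores = [sentiment_score(r) for r in reviews]

print(vocabulary)
for row, score in zip(matrix, scores):
    print(row, score)
```

Each row of `matrix` is one document's word counts over the shared vocabulary, which is the representation a count vectorizer produces. The per-document scores are the kind of quantity that can then be compared across texts in a chart.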
Each day, refreshments will be available from 09:30 for a 10:00 start, lunch will be provided, and the workshop will conclude at 16:00. The duration of sessions may vary from day to day, with the overall format being a mix of lectures, demonstrations and practical exercises.