POP77001 Computer Programming for Social Scientists
2021
About This Module
This module provides foundational knowledge of computer programming concepts and software engineering practices. It introduces students to major data science programming languages and workflows, with a focus on social science data and research questions. Students will be introduced to Python and R, two principal data science programming languages. The module covers basic and intermediate programming concepts, such as object types, functions, control flow, testing and debugging. Particular emphasis will be placed on data handling and analytical tasks, with a focus on problems in the social sciences. Homework will include hands-on coding exercises. In addition, students will apply their programming knowledge to a research project at the end of the module.
Instructors
- Tom Paskhalis, Office Hours: Thursday 11:00-13:00 online
- Martyn Egan
Module Meetings
- 11 two-hour lectures
- Monday 11:00 in East End Development 4/5 LTEE2
- 11 one-hour tutorials
- Group 1: Wednesday 14:00 in 1 College Green 2.04
- Group 2: Thursday 10:00 in East End Development 4/5 LTEE2
- No lecture/tutorial in Week 7
Week | Language | Topic |
---|---|---|
1 | - | What is computation? |
2 | Python | Python Basics |
3 | Python | Control Flow in Python |
4 | Python | Functions in Python |
5 | Python | Debugging and Testing in Python |
6 | Python | Data Wrangling in Python |
7 | - | - |
8 | R | Fundamentals of R Programming I |
9 | R | Fundamentals of R Programming II |
10 | R | Data Wrangling in R |
11 | R, Python | Complexity and Performance |
12 | R, Python | Web scraping |
Prerequisites
This is an introductory class and no prior experience with programming is required.
Hardware and Software
- Laptop with Windows/Mac/Linux OS (no Chromebooks)
- Software:
Materials
The following texts provide a good introduction to Python and R programming with a focus on data analysis applications:
Guttag, John. 2021. Introduction to Computation and Programming Using Python: With Application to Computational Modeling and Understanding Data. 3rd ed. Cambridge, MA: The MIT Press.
Matloff, Norman. 2011. The Art of R Programming: A Tour of Statistical Software Design. San Francisco, CA: No Starch Press.
McKinney, Wes. 2017. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. 2nd ed. Sebastopol, CA: O’Reilly Media.
Wickham, Hadley. 2019. Advanced R. 2nd ed. Boca Raton, FL: Chapman and Hall/CRC.
Wickham, Hadley, and Garrett Grolemund. 2017. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. Sebastopol, CA: O’Reilly Media.
Additional online resources:
Python 3 Documentation (intermediate and advanced)
Assessment
- 5 problem sets (50%)
- Bi-weekly programming assignments
- Due at 11:00 on Monday of weeks 3, 5, 7, 10 and 12 on Blackboard
- Research project (50%)
- Final Python/R project demonstrating familiarity with programming concepts and ability to communicate results
- Due at 11:00 on Monday, 20 December 2021
Assessment criteria
- ✔️ Code exists
- ⌚ Code runs and does what it has to do
- 📜 Code is legible (meaningful naming, comments)
- ⚙️ Code is modular (no redundancies, use of abstractions)
- 🏎️ Code is optimized (no needless loops, runs fast)
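As a rough illustration of the criteria above, the following minimal Python sketch shows what legible, modular and reasonably efficient code might look like. The function names, the dictionary keys and the turnout figures are hypothetical and chosen only for illustration; they are not taken from any assignment in this module.

```python
from statistics import mean


def turnout_rate(votes_cast: int, registered_voters: int) -> float:
    """Return turnout as a share of registered voters (between 0 and 1)."""
    if registered_voters <= 0:
        raise ValueError("registered_voters must be positive")
    return votes_cast / registered_voters


def average_turnout(districts: list[dict]) -> float:
    """Return the unweighted mean turnout rate across districts."""
    # Reuse turnout_rate() rather than repeating the division (modularity)
    return mean(
        turnout_rate(d["votes_cast"], d["registered_voters"]) for d in districts
    )


# Hypothetical data, for illustration only
districts = [
    {"name": "A", "votes_cast": 500, "registered_voters": 1000},
    {"name": "B", "votes_cast": 675, "registered_voters": 900},
]

print(average_turnout(districts))  # 0.625
```

Meaningful names and docstrings address legibility, the small reusable function addresses modularity, and a single pass over the data avoids needless loops.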
Plagiarism
- Plagiarising computer code is as serious as plagiarising text (see Google LLC v. Oracle America, Inc.)
- All submitted programming assignments and the final project should be done individually
- You may discuss general approaches to solutions with your peers
- But do not share or view each other's code
- You can use online resources but give credit in the comments
Check Trinity’s guide on the levels and consequences of plagiarism.