Data Visualisation Assignment
PNB 3EE3; Fink
While learning how to code experiments and collect data is a huge part of studying human perception, another massively important aspect is learning how to think about data, visualise it, summarize it, and talk about it.
In this assignment, you should use the programming language of your choice (e.g., R, Python, etc.) to do the following:
Create a new notebook, in which you will conduct all of your analyses. Name your file
dataVisualisation_analyses.[file ext here]
Load in the
assignments_dataVis_dataset.csv
file.Print summary statistics for each condition in the data set (e.g., mean, standard deviation)
Visualize the results of each condition. At very least, you should plot y as a function of x for each condition. You are welcome to create as many plots as you like. Try to apply the visualization best practices presented during lecture. Please create your plots in line in your notebook.
In a text cell in your code notebook, answer the following questions:
What can be concluded from the dataset?
What did you find most challenging about this assignment?
What did you learn in completing this assignment? What are you still curious about?
Why is it important to visualize data?
- Make sure your code can run from start to finish, without error. Include your notebook file within the assignments folder of your personal repository. Please also include a pdf or html version of your notebook in your repository, in addition to the python or R version.
NOTE you do not need to re-include the dataset in your submission.
- Be sure to push your changes to GitHub.
Assessment
This assignment is worth 10% of your grade. The breakdown of grading for this assignment is as follows:
Content (the statistics and visualizations make sense for the dataset): 30%
Documentation (code has been appropriately commented so that others can understand): 30%
Reflection (clear evidence of critical thinking and meaningful reflection): 40%
IMPORTANT: Generative AI (e.g., chatGPT) should not be used to complete this assignment.
Tips & Tricks
Use this as an opportunity to get more familiar with your programming language of choice and to explore the fundamentals of data visualization! It is my understanding that your PNB coursework already involves R. I also know there is a Python for PNB course some of you may have taken. If you find yourself needing a refresher on either of these languages, there are so many great online resources! Here is a resource on R essentials and here is one on python essentials. There are even more helpful links listed on the course Resources page.
I highly recommend doing all of your coding in R markdown, Quarto, or a jupyter notebook. Such notebooks allow you to combine natural language with code. Why would you want to do that? Well, imagine you need to communicate your findings to someone else, or you want someone else to be able to reproduce the exact analyses you did. By writing code, creating detailed descriptions, and plotting figures in line, everything you need to do can all be in one place! In research jargon, we sometimes refer to this as a “reproducible workflow” – if you’re interested in learning about such workflows, here is a primer. Note that there are many types of notebooks to choose from, for example, Jupyter Notebooks, Google Collab Notebooks, and R Notebooks. Choose whichever one you are familiar with or which seems most intuitive to you!
Under no circumstances should you alter the raw data file. If you need the data table in a different format, read in the provided file and manipulate the dataframe within your programming environment. DO NOT create new raw data files that need to be read in.
Try to be efficient with your code. If you find yourself copying and pasting the same chunk of code and changing one variable name each time, there is likely a much better way to do things. Here are some issues to look out for and some ways to mitigate them.