Lab #1a - Speech Perception, Classic Paradigms
Updated September 24th, 2020
Lab #1A: complete data collection by Monday September 14th; submit report by Monday September 21st (by email attachment).
Lab #1B: complete data collection by Thursday September 24th; submit report by Friday October 2nd.
In this two-part lab project you will be a participant in a series of short experiments on speech perception, and you will analyze the data that you generate in the experiments. The goal of the lab is to provide experience with:
- Effects of native language on auditory perception
- Acoustic vs. phonological encoding of sounds
- Basic analysis and presentation of judgment and reaction time data
- Description of hypotheses and outcomes
The experiments in the lab are based on classic identification and discrimination paradigms that are widely used in the speech perception literature (Lab #1A), and on more recently developed paradigms that have been used to probe higher-level encoding of speech sounds (Lab #1B).
The lab consists of a number of steps.
- Part A. Running the experiments
- Part A. Analyzing the data using pivot tables
- Part A. Calculating sensitivity scores
- Part A. Graphing the data
- Part B. Analyzing group data
- Parts A & B. Comparing predicted to observed outcomes
The English sounds used in the lab were generated using a speech synthesizer by Colin Phillips; the Russian sounds are computer-edited natural speech samples, created by Nina Kazanina (now at the U. of Bristol, UK). The experiments in Part A of the lab are run using PsychoPy scripts developed by Julia Buffinton, based on Psyscope scripts originally created by Colin Phillips and improved by Henny Yeung (now at Simon Fraser U) and Brian Dillon (now at U of Massachusetts). The experiments in Part B of the lab are based on scripts created by Sunyoung Lee (now at US Dept of State) together with Brian Dillon.
[Note: If you are a course instructor who is interested in using this lab exercise in a course, feel free to copy the files and modify them as you see fit. Please do so under the two following conditions: (i) you acknowledge the source of the lab if you make it available electronically, and (ii) you share any improvements that you make to the lab. In this way, the resources will continue to be improved for the benefit of everybody. Contact Colin Phillips if you have any questions or comments.]
Running the Lab #1A Experiments
This section guides you through the steps needed to run the Speech Perception Lab (Part A) Experiments. After you have completed this, you should proceed to the Analysis Instructions section.
Description of Experiments
There are four short experiments on speech perception in Part A of this lab. You will run all of them on yourself, then analyze your own data.
The experiments examine your perception of sounds from two languages: English and Russian. Two experiments focus on the contrast between voiced /d/ and voiceless aspirated /t/ found in English and many other languages. The other two experiments focus on the contrast between the voiced /d/ and voiceless unaspirated /t/ sounds found in Russian and many other languages.
The experiments examine performance on two tasks: identification and discrimination. In the identification task, you will hear sounds individually and must decide which of two categories the sound belongs to.
In the discrimination task, the experiments use an interstimulus interval (ISI) of 1200ms – in other words, the pairs of sounds are presented with 1200ms of silence between them. Since the sounds themselves are 300ms long, this amounts to a stimulus onset asynchrony (SOA) of 1500ms.
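The timing arithmetic above can be checked with a few lines of Python. This is only an illustration; the function name is made up here, and the values come from the paragraph above.

```python
# Timing values from the description above (all in milliseconds).
STIMULUS_DURATION_MS = 300
ISI_MS = 1200


def stimulus_onset_asynchrony(duration_ms, isi_ms):
    """SOA is onset-to-onset time: the first sound's duration plus the silent gap."""
    return duration_ms + isi_ms


print(stimulus_onset_asynchrony(STIMULUS_DURATION_MS, ISI_MS))  # 1500
```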
The four experiments are:
- English DT identification
- Russian identification
- English DT discrimination
- Russian discrimination
Setting up the Experiments
PsychoPy is an open-source, platform-independent software package that runs on Windows, OS X, and Linux. It includes both a graphical interface (Builder) and a programming interface (Coder) based on Python. We will use PsychoPy a number of times in the course. We selected it for its price (free!), cross-platform flexibility, strong user community, and straightforward graphical user interface. You could use it in the future in your own research. This lab will be run solely through the graphical Builder interface.
You can download PsychoPy free of charge from the open source software site Sourceforge. PsychoPy has extensive documentation.
The Speech Perception Lab files can be found here. The files should be in a compressed folder called SpeechLab.zip that should automatically decompress into a normal “SpeechLab” folder.
[If the folder does not automatically decompress, then you may need to just double-click on the .zip file, or you may need to download a utility to decompress the folder. Note: on a PC, you may need to use the “extract all” command (found when right-clicking the .zip file or in the menu of your File Explorer) to extract everything correctly. If you receive an error like the following: “ValueError: setSound: could not find a sound named [sound]”, the most likely cause is that the folder was not extracted properly.]
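If the built-in tools give you trouble, the archive can also be extracted with Python's standard `zipfile` module. The sketch below is one possible approach, not part of the official lab instructions; the function name `extract_lab` and the paths in the usage note are assumptions.

```python
import zipfile
from pathlib import Path


def extract_lab(zip_path, dest_dir):
    """Extract every file in the archive and return the extracted file names."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        # namelist() also includes directory entries (ending in "/"); skip those.
        return sorted(n for n in archive.namelist() if not n.endswith("/"))
```

For example, `extract_lab("SpeechLab.zip", ".")` would unpack the folder next to the .zip file, assuming the download was saved in the current directory. The returned list lets you confirm that the sound files are actually present.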
As you run through the experiments, PsychoPy will automatically save your data. If you are running the experiments on your own computer, you do not need to do anything special to access the data later. If you are running the experiments on a shared computer, remember to bring along a USB stick to save the files to, or email them to yourself before you leave the computer.
Headphones & Volume
In order to improve the sound quality of the sounds that you are listening to (and, if you’re in a public area, to avoid driving the person sitting next to you crazy) you should listen to the sounds through headphones. You will also find that the experimental tasks are easier if you’re listening through headphones.
Order of Experiments
There are 4 short experiments in this lab. If we were planning to use the results from these experiments in a real research project, we would randomize the order of experiments across participants, so that different people complete the experiments in a different order. But in this case we recommend that you follow the order in the list above. Run the identification experiments before the discrimination experiments, and run the English variant before the Russian variant.
Opening the Experiments
First, check that PsychoPy is available on the computer you are using. If not, see above for instructions on downloading it. All of the files needed for the lab can be found in the folder “SpeechLab”. It should look something like the folder in the image below.
Each experiment for this lab is a separate PsychoPy file (extension .psyexp). You’ll need to open the following:
- English DT identification: ID_DT.psyexp
- Russian identification: ID_Russian.psyexp
- English DT discrimination: Dis_DT.psyexp
- Russian discrimination: Dis_Russian.psyexp
Upon clicking on any of the experiment files (DT identification, Russian identification, DT discrimination, and Russian discrimination) the experiments should automatically open in PsychoPy. Unless disabled, PsychoPy opens with a ‘Tip of the Day’ pop-up. You can exit out of this. You should now see the builder interface of PsychoPy, that looks like this.
To run the experiment you have opened, go to the top ribbon and click on the green circle icon with an arrow on it. Alternatively, use Ctrl + R.
In a few moments, a new window should appear, prompting you to input participant and session identification information. Use your initials to fill in the participant section, and leave the session section as is (it should read ‘001’.) Then, click ‘OK’. The experiment should begin now.
Running the Experiments
At the beginning of each experiment, a couple of screens of instructions will appear. Read the instructions carefully.
Each experiment begins with a brief practice session, so you always get a chance to practice listening to the sounds before data collection begins.
When you press the “f” or the “j” key to indicate your response after each trial, you should immediately hear a beep, and then the next trial will begin. If you do not give any response the beep will automatically sound after 6 seconds, and then the next trial will begin. This means that the trial has ‘timed out’, and no response is recorded.
In the identification experiments, which you will attempt first, you will hear 100 sounds, in a single block. In each of the discrimination experiments you will hear around 200 pairs of sounds.
It should take around 30 minutes to complete all 4 experiments. Feel free to take a break between experiments if you get too tired during the experiment. You will perform more consistently if you are not tired.
At the end of the experiment, you may see a warning that “Movie2 stim could not be imported and won’t be available” – this is okay. You can ignore this and proceed.
Switching between Experiments
After completing the first experiment, close out of the current PsychoPy window, and open the next experiment file, repeating the above process.
Completing the Experiment
Once you are done with all 4 experiments, you need to check that your data files have been saved. PsychoPy will produce three files for each experiment. The first is a .csv file that can be opened in Microsoft Excel or a similar spreadsheet program. This contains your responses, response time, etc. The .log file contains a chronological record of everything that happened in the experiment. You shouldn’t need to use this unless you encounter an error. The last file is a .psydat file – according to PsychoPy documentation, these files are “designed to be used by experienced users with previous experience of python and, probably, matplotlib. The contents of the file can be explored with dir(), as any other python object.” You will only need to analyze the .csv file.
You can analyze the results using any spreadsheet or similar program (e.g. Microsoft Excel, Google Docs Spreadsheet), on any computing platform (Mac, PC, Linux).
After any individual experiment, you can check that the data has been saved by going to File > Open, or using Ctrl+O, to open the folder within PsychoPy. The data should be in a file in the ‘data’ folder.
The next step is to analyze your data.
Last updated: September 24th, 2020. We believe these issues are now all fixed in the updated experiment files linked above.
Updates to PsychoPy have led some scripts and files that previously worked flawlessly to cause a number of problems. Apologies for the headaches that this has created. And thank you to all who have helped to identify the issues and the solutions. Here are some troubleshooting tips.
1. (Russian) discrimination experiment fails. The instructions work fine, but the main experiment does not run. Solution: change the name of the keyboard response in “trial” routines from “response” to “resp”, i.e., some name other than “response”.
This issue was discovered on PsychoPy 3, v.2020.1.2 on MacOS. It seems to be related to a conflict caused by a default variable name (“response”) that is generated by the practice session.
2. Unplayable sound files in discrimination tasks (English and Russian). Windows users appear to be impacted by sound files in mono rather than stereo format. This problem can be fixed by replacing the sound files with new stereo-only versions found in this folder. Thanks to Xinchi Yu, Luisa Seguin, and Jack Ying for figuring out this issue.
3. Extracting the SpeechLab.zip folder on Windows. The files should be extracted to a specific folder.
Analyzing the Results
Having run yourself as a participant in the experiments, your task now is to analyze your data. This is best done using a spreadsheet program such as Microsoft Excel or Google Docs Spreadsheets. Most spreadsheet programs also allow you to plot graphs. Since the PsychoPy data files are .csv files, they can be read on any computer platform.
Identification Tasks: graph judgments and reaction times for both the English voicing contrast and the Russian voicing contrast. You should do this using a PivotTable. Instructions below.
Discrimination Tasks: graph percentage accuracy and reaction times and d’ sensitivity scores (“d-prime”) for both the English and the Russian contrasts. Another tab in these instructions provides notes on calculating sensitivity scores.
Discussion: write up a description and analysis of your results. What differences, if any, do you observe between your performance with the English and the Russian contrasts? Do you observe any patterns in your reaction times? For example, do shorter reaction times always correspond to more accurate judgments? For the patterns that you observe in the data, try to suggest explanations. The readings by Werker (1994, 1995) should be helpful in this regard.
There is no required format for displaying the results; you should figure out for yourself how your findings can be presented most clearly. That said, line graphs are most likely to yield easy-to-read results.
Note: you are encouraged to work with other students on data analysis. However, you should submit your report based upon your own data.
Analyzing the Data
Keep a back-up copy of your original data files, in case you accidentally lose or corrupt data at some point in your analysis, and need to go back to your original files. This is always a good idea.
PsychoPy data files are generated based on the Participant Name, Experiment Name, and date/time of completing the experiment. It may seem overwhelming but they will all be of the same format. For example, if Shaun the Sheep (participant: SS) completed the ID_DT experiment at 11:30 on September 1, 2016, the data file would be SS_ID_DT_2016_Sep_01_1130.csv.
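If you end up with many data files, the naming pattern above can be unpacked programmatically. The following is a minimal sketch that assumes the participant ID contains no underscore and that month abbreviations are in English; `parse_data_filename` is a hypothetical helper name.

```python
import re
from datetime import datetime

# participant _ experiment _ YYYY_Mon_DD_HHMM .csv
FILENAME_PATTERN = re.compile(
    r"^(?P<participant>[^_]+)_(?P<experiment>.+)_"
    r"(?P<stamp>\d{4}_[A-Za-z]{3}_\d{2}_\d{4})\.csv$"
)


def parse_data_filename(filename):
    """Split a data file name into participant, experiment, and completion time."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"Unexpected file name: {filename}")
    return {
        "participant": match["participant"],
        "experiment": match["experiment"],
        # %b parses the month abbreviation (locale-dependent; English assumed).
        "completed": datetime.strptime(match["stamp"], "%Y_%b_%d_%H%M"),
    }


info = parse_data_filename("SS_ID_DT_2016_Sep_01_1130.csv")
print(info["participant"], info["experiment"], info["completed"])
```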
Open the data files in Excel or another spreadsheet program. The data files contain much more information than you need for these analyses, so you can (optionally) delete extraneous columns. Then save the file as a .xls(x) file, because the .csv file format does not support PivotTables.
For the identification experiment, you will need the following columns: condition, resp.keys, resp.rt, and participant. The resp columns indicate which key was pressed and the response time.
For the discrimination experiment, you will need the following columns: sound1, sound2, corrAns, resp.keys, resp.rt, and participant. The corrAns column indicates whether the sounds were the same (0) or different (1), which you can also confirm by comparing the sounds listed in the sound1 and sound2 columns.
Next, replace the letters ‘f’ and ‘j’ in the response column with ‘0’ and ‘1’. Changing the letters to numbers is useful when you want to compute average accuracy scores. Do not retype the values by hand, as this is not a good use of your time. Instead, highlight the column and use Edit > Replace … to replace all instances of ‘f’ with ‘0’ in a single step, then do the same for ‘j’ and ‘1’. Don’t forget to save as a .xls(x) file.
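If you prefer to script this step rather than use Find and Replace, the same recoding can be sketched in Python. This is an optional alternative, not the lab's required method; the function name is hypothetical, and the rows below are hand-made examples rather than real data.

```python
KEY_TO_CODE = {"f": 0, "j": 1}


def recode_keys(rows, key_column="resp.keys"):
    """Return copies of the rows with 'f'/'j' replaced by 0/1.

    Anything else (for example an empty cell from a timed-out trial) becomes None.
    """
    recoded = []
    for row in rows:
        new_row = dict(row)
        raw = (row.get(key_column) or "").strip()
        new_row[key_column] = KEY_TO_CODE.get(raw)
        recoded.append(new_row)
    return recoded


# Tiny hand-made example (not real data).
rows = [
    {"condition": "00dt", "resp.keys": "f"},
    {"condition": "60dt", "resp.keys": "j"},
    {"condition": "32dt", "resp.keys": ""},
]
print([r["resp.keys"] for r in recode_keys(rows)])  # [0, 1, None]
```

Rows in this shape can be obtained from a real data file with `csv.DictReader` from the standard library.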
To compute average values for the conditions in your data, use a PivotTable. Select any cell in your data and go to Data > PivotTable… The appropriate range should already be selected, so click OK. It will present a layout and allow you to specify rows, columns, values, and filters to determine what analyses are computed and how they are displayed.
Below you can find instructions for creating a PivotTable for the average response in the identification task. You can then modify these instructions to get average reaction times, and various other summaries of your data. After doing this once or twice you will be able to get through the task of analyzing your data very quickly.
In the PivotTable Builder, drag “condition” into the “Row Labels” area. This indicates that you want your analyses to be organized according to the categories specified in the Condition column of your data. This is known as the Independent Variable.
Then “resp.keys” (or if you renamed the column after converting to 0 and 1, use that column name) should be dragged into the “Values” (or “Data”) area of the Builder. This indicates that you want your analysis to use the results shown in the “resp.keys” column. This is known as the Dependent Variable.
In the Values box, Excel will by default indicate that you want to display the “Sum of Response”, but you should change it to “Average” by clicking on the info button next to the field in “Value” and then selecting “Average.”
When you have done that, you should have a table showing, for each condition, the proportion of dae responses (multiply by 100 if you want percentages). You can modify these steps to complete your other analyses.
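For readers who want to cross-check their PivotTable, the same "average per condition" computation can be sketched in plain Python. The helper name and the example rows are illustrative assumptions; the column names follow the ones listed earlier. Because responses are coded 0 and 1, each mean is the proportion of responses coded 1.

```python
from collections import defaultdict
from statistics import mean


def average_by_condition(rows, value_column, group_column="condition"):
    """Mean of value_column for each value of group_column, skipping missing values."""
    groups = defaultdict(list)
    for row in rows:
        value = row[value_column]
        if value is None or value == "":
            continue  # timed-out trial: no response recorded
        groups[row[group_column]].append(float(value))
    return {condition: mean(values) for condition, values in groups.items()}


# Hand-made example rows (not real data).
rows = [
    {"condition": "00dt", "resp.keys": 0},
    {"condition": "00dt", "resp.keys": 0},
    {"condition": "32dt", "resp.keys": 0},
    {"condition": "32dt", "resp.keys": 1},
    {"condition": "60dt", "resp.keys": 1},
    {"condition": "60dt", "resp.keys": None},
]
print(average_by_condition(rows, "resp.keys"))
# {'00dt': 0.0, '32dt': 0.5, '60dt': 1.0}
```

Passing `"resp.rt"` as the value column gives average reaction times in the same way.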
In the data files for the identification experiments, the sounds are listed by condition name. The table below shows the condition names, and what sounds they represent.
The English stimuli are drawn from a computer-synthesized continuum, created using the Klatt synthesizer for Macintosh. The only property of the sounds that varies is the voice onset time (VOT), i.e. the time lag between the release of the stop closure and the onset of voicing. The continuum spans VOT values from 0ms (dae) to 60ms (tae).
The Russian stimuli are based on natural recordings from a native speaker that were subsequently computer-edited to create a continuum. The Russian continuum includes sounds with a voicing lag, i.e., the voicing onset follows the consonant release, as in the English sounds, and sounds with a voicing lead, i.e., voicing precedes the consonant release, also known as prevoicing. Prevoiced sounds are listed below as having negative VOT values. Russian speakers tend to classify sounds with positive VOTs, and sounds with only a short voicing lead (negative VOTs of less than about 15ms in magnitude), as voiceless /tae/, and sounds with more negative VOTs as voiced /dae/.
| English ID | Russian ID | English Discrim (VOT pair, ms) | Russian Discrim (VOT pair, ms) |
|---|---|---|---|
| 00dt | ADA+10: +10ms | 1S: 0/0 | 1RS: +8/+8 |
| 10dt | ADA+2: +2ms | 2S: 7/7 | 2RS: +2/+2 |
| 20dt | ADA06: -6ms | 3S: 14/14 | 3RS: -4/-4 |
| 24dt | ADA14: -14ms | 4S: 21/21 | 4RS: -10/-10 |
| 28dt | ADA16: -16ms | 5S: 28/28 | 5RS: -16/-16 |
| 32dt | ADA18: -18ms | 6S: 35/35 | 6RS: -24/-24 |
| 36dt | ADA20: -20ms | 7S: 42/42 | 7RS: -28/-28 |
| 40dt | ADA28: -28ms | 8S: 49/49 | 8RS: -34/-34 |
| 50dt | ADA36: -36ms | 9S: 56/56 | 9RS: -40/-40 |
| 60dt | ADA44: -44ms | 1: 0/14 | 1R: +8/-4 |
| | | 2: 7/21 | 2R: +2/-10 |
| | | 3: 14/28 | 3R: -4/-16 |
| | | 4: 21/35 | 4R: -10/-24 |
| | | 5: 28/42 | 5R: -16/-28 |
| | | 6: 35/49 | 6R: -24/-34 |
| | | 7: 42/56 | 7R: -28/-40 |
| | | 1L: 7/35 | 1RL: +2/-16 |
| | | 2L: 14/42 | 2RL: -4/-24 |
| | | 3L: 21/49 | 3RL: -10/-28 |
| | | 1L5: 0/35 | 1RL5: +8/-24 |
| | | 2L5: 7/42 | 2RL5: +2/-28 |
| | | 3L5: 14/49 | 3RL5: -4/-34 |
| | | 4L5: 21/56 | 4RL5: -10/-40 |
Below are the instructions for creating condition names in Excel for the discrimination experiments (thanks to Masato):
1. Copy sound1 and sound2 and paste them into new columns.
2.1. For Russian data: select these columns and delete “da+” (replace “da+” with blank), and then replace “da” with “-”.
2.2. For English data: select these columns and delete “dt”.
3. Paste =CONCATENATE(MAX(X2,Y2),”/”,
4. Paste =ABS(X2-Y2) into the next cell to the right (and change X and Y again). This calculates the difference between the VOTs of the two sounds.
5. Select these two new cells, hold the bottom right corner and pull it all the way down to the end of the data.
6. Now you have the names for the conditions (and differences between the VOT of the two sounds)
Note: for the Russian data, you will need to use this function since the VOT differences are not constant within conditions: =IF(Z2=0,”RS”,IF(Z2<15,”R”,IF(
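The numbered steps above can also be sketched in Python. Several details here are assumptions and should be checked against your own data: the example sound names (such as "da+8", "da4", "14dt") are guesses at what the sound1 and sound2 columns contain, the "max/min" label format is inferred from the start of the formula in step 3, and the cutoff of 25ms between the "RL" and "RL5" conditions is chosen only because it separates the VOT differences listed in the table above (18-20ms vs. 30-32ms).

```python
def vot_from_name(sound_name, language):
    """Turn a sound name into a signed VOT in ms, following steps 2.1 and 2.2."""
    if language == "russian":
        # Step 2.1: delete "da+", then replace "da" with a minus sign.
        text = sound_name.replace("da+", "").replace("da", "-")
    elif language == "english":
        # Step 2.2: delete "dt".
        text = sound_name.replace("dt", "")
    else:
        raise ValueError(f"Unknown language: {language}")
    return int(text)


def describe_pair(sound1, sound2, language):
    """Return an assumed 'max/min' label and the absolute VOT difference (step 4)."""
    a = vot_from_name(sound1, language)
    b = vot_from_name(sound2, language)
    return f"{max(a, b)}/{min(a, b)}", abs(a - b)


def russian_condition_type(vot_difference):
    """Classify a Russian pair by VOT difference; the 25ms cutoff is an assumption."""
    if vot_difference == 0:
        return "RS"
    if vot_difference < 15:
        return "R"
    if vot_difference < 25:
        return "RL"
    return "RL5"


label, diff = describe_pair("da+8", "da4", "russian")
print(label, diff, russian_condition_type(diff))  # 8/-4 12 R
```

If your sound names include a file extension or other extra characters, strip those first; `int()` will raise an error on anything that is not a plain signed number.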
Note on Graphing Accuracy Data
In the identification experiment, you will probably want to plot sound category (x-axis), against responses (e.g. % [dæ] responses). Remember that in this study you simply made judgments about individual sounds. There is no right or wrong answer about which category each sound belongs to. Nevertheless, your judgments probably were quite consistent.
In the discrimination experiment, you made a same/different judgment. In this task, there is a fact of the matter about whether the sounds in each pair are the same or different. For some pairs of sounds the answer “same” is correct, for other pairs of sounds the answer “different” is correct. Therefore, if you want to graph your accuracy across all pairs of stimuli, you should not just plot your percentage of “same” or “different” judgments. You will need to convert the scores from percentage same/different to percentage correct/incorrect. You can do this easily by comparing the “corrAns” column with the column that holds your responses (“resp.keys”).
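As an illustration of that comparison, the sketch below scores each trial as correct when the coded response matches corrAns. It assumes that the responses have already been recoded to 0 and 1 using the same coding as corrAns (0 = same, 1 = different) and that timed-out trials are excluded; the function name and example rows are hypothetical.

```python
def percent_correct(rows):
    """Percentage of scorable trials where the coded response equals corrAns."""
    scored = [r for r in rows if r["resp.keys"] in (0, 1)]
    if not scored:
        raise ValueError("No scorable trials")
    n_correct = sum(1 for r in scored if r["resp.keys"] == r["corrAns"])
    return 100 * n_correct / len(scored)


# Hand-made example (not real data): 3 of the 4 scorable trials are correct.
trials = [
    {"corrAns": 0, "resp.keys": 0},
    {"corrAns": 1, "resp.keys": 1},
    {"corrAns": 1, "resp.keys": 0},
    {"corrAns": 0, "resp.keys": 0},
    {"corrAns": 1, "resp.keys": None},  # timed out, not scored
]
print(percent_correct(trials))  # 75.0
```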
For tips on how to create graphs in Excel, see the on-line Help that is included with the program.
Note on Reporting Numbers
When you calculate average response proportions or reaction times, you will often be shown a rather detailed number, e.g., average reaction time is 452.12865 milliseconds. You should not simply copy this into your report. The 5 decimal places (100,000ths of milliseconds) are far beyond the meaningful resolution of your measurements, and they convey false precision. To report data like a pro, look at how similar measures are typically reported in publications in the relevant field. For example, in psycholinguistics it is standard to report reaction times in numbers of milliseconds, with no fractions of milliseconds.
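If you compute your summaries in a script, rounding for the report might look like the sketch below. The helper names are made up, and the choice of two decimal places for proportions is an assumption rather than a rule stated in this lab.

```python
def format_rt(milliseconds):
    """Report a reaction time as whole milliseconds."""
    return f"{round(milliseconds)} ms"


def format_proportion(proportion, places=2):
    """Report a proportion with a fixed, small number of decimal places."""
    return f"{proportion:.{places}f}"


print(format_rt(452.12865))          # 452 ms
print(format_proportion(0.8333333))  # 0.83
```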
When you analyze your accuracy in the discrimination task, one straightforward approach is to simply calculate for each pair of sounds the percentage of times that you correctly responded “same” or correctly responded “different”.
However, there is a serious weakness with this approach. It would be possible for you to score 100% accuracy in detecting sound pairs that are different, simply by answering “different” on all of the trials. You would then, of course, also score 0% accuracy on detecting sound pairs that are the same. In this situation, you would be showing a strong response bias. You would not be showing any success in telling apart those sound pairs that were the same and those pairs that were different. We therefore need a measure that assigns credit in situations where you successfully distinguish the different types of trials, and that takes account of response bias.
For this, we turn to Signal Detection Theory, and to a measure known as d’ (“d-prime”). d’ scores are sometimes referred to as sensitivity scores.
- Interactive tutorial on Signal Detection Theory
- The Claremont Colleges WISE project provides additional interactive tutorials on basic statistical concepts
- Excel worksheet for calculating d’ scores for the speech perception lab in this course
- A lecture notes page on Signal Detection Theory by David Heeger (New York University)
The basic idea, which we will discuss in class, is that the d’ measure combines information about the likelihood that the participant successfully detects target trials (“hits”), with information about the likelihood that the subject incorrectly classifies non-target trials as targets (“false alarms”). Credit is assigned for hits, and a penalty is deducted for false alarms. The unit of credit and penalty is z-scores, a measure based on standard deviations that is discussed in the WISE project notes, any basic statistics text, and hundreds of locations on the internet.
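In its standard form, d’ = z(hit rate) − z(false alarm rate), where z is the inverse of the standard normal cumulative distribution function. The sketch below computes this with Python's standard library. It treats “different” pairs as the target trials, which is an assumption about how the task is scored, and it nudges rates of exactly 0 or 1 to 1/(2N) and 1 − 1/(2N) so that the z-scores stay finite. That adjustment is one common convention; the course worksheet may handle these cases differently.

```python
from statistics import NormalDist


def _clamp_rate(rate, n_trials):
    """Keep a rate away from exactly 0 or 1, where the z-score would be infinite."""
    lowest = 1 / (2 * n_trials)
    return min(max(rate, lowest), 1 - lowest)


def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' = z(hit rate) - z(false alarm rate)."""
    n_targets = hits + misses
    n_nontargets = false_alarms + correct_rejections
    if n_targets == 0 or n_nontargets == 0:
        raise ValueError("Need at least one target and one non-target trial")
    hit_rate = _clamp_rate(hits / n_targets, n_targets)
    fa_rate = _clamp_rate(false_alarms / n_nontargets, n_nontargets)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# 80% hits and 20% false alarms: clear sensitivity.
print(round(d_prime(40, 10, 10, 40), 2))  # 1.68
# Answering "different" on every trial: 100% hits but no sensitivity.
print(round(d_prime(50, 0, 50, 0), 2))    # 0.0
```

The second example is the response-bias case described above: every “different” pair is detected, yet d’ is zero because the false alarm rate is just as high as the hit rate.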
More on Analyzing & Graphing the Results
This page contains some tips on creating graphs for the results from the Speech Perception experiments. It assumes that you are using Microsoft Excel to create your graphs. Similar functionality should be available in Google Docs Spreadsheets or other programs.
Preparing the Data
The page on Analysis of the Results provides guidelines for preparing the data files and replacing “f” and “j” responses with numerical values (e.g. “0” and “1”). The Analysis page also contains information on creating averages of all of the responses to the same sound using PivotTables.
In order to prepare your results for graphing, you’ll want to group all of the average judgments and reaction times together in a single block of cells as follows. You can use the values from the PivotTables to do so.
Generating the Graphs
If you want to plot “Category” against “Percentage ‘d’ Responses”, then you’ll need to highlight the appropriate columns of data (category and % d-responses). Once you have selected the data that you want to graph, you’ll make a line chart using the Chart Wizard tool. It should generate a graph like the following.
The above graph is a sample graph of judgments for the English DT Identification. The below graph is a sample graph of reaction times for DT Identification.
But one thing is misleading in the figure. The y-axis scale goes from 0 – 1.2, but the heading refers to percentage of ‘d’ responses. This is inaccurate. The y-axis shows proportions, not percentages. And the maximum possible proportion is 1.0. The y-axis should be edited so that it does not go beyond the maximum possible value.
You’ll follow the same steps to graph response times and correct/incorrect responses for the Discrimination tasks. Remember: in order to graph accuracy for the discrimination experiments, you need to convert same/different judgments into correct/incorrect scores. This should be easy to do by comparing the “corrAns” column with the column that holds your responses (“resp.keys”) in your data spreadsheet.