Phonological Similarity versus Semantic Similarity on False Memory Induction


The purpose of this study was to compare the efficacy of semantically and phonologically related words in inducing false memories in working memory. Three groups of individuals participated. One group viewed lists of words that were phonologically similar, another group viewed semantically similar word lists, and the third group viewed arbitrarily related word lists. After viewing each list, participants completed a corresponding recognition task embedded with unstudied words used to induce false memories. A one-way ANOVA showed no difference between phonologically and semantically similar words in their effectiveness at inducing false memories; however, participants created more false memories when viewing these words than when viewing arbitrarily related words. These findings suggest that differences between semantically and phonologically related words in their capacity to induce false memories could be limited to long-term memory.

Keywords: false memory, phonological, semantic, working memory

False memories are misrepresentations of past experiences and distortions of the source of information for those experiences (Flegal, Atkins, & Reuter-Lorenz, 2010). Remembering a situation erroneously can have serious legal and emotional repercussions. In one instance, a psychiatrist used suggestive techniques to convince a woman she had participated in a satanic cult and even eaten babies (Loftus, 1974). Frederic Bartlett (1932) performed early research on false memories. Bartlett read participants a series of fables and short stories to determine if repeated exposures to source material would improve memory over time. After the reading, he instructed participants to recall the stories to the best of their ability. His results indicated that participants recalled the stories better after repeated exposures than before. However, Bartlett also discovered that when participants were unable to remember a specific part of the story, they used preexisting schemas to fill in the gaps of their memory, creating false memories (Bartlett, 1932).

About forty years after Bartlett performed his research, Loftus and Palmer (1974) investigated the nature of false memory induction. They instructed participants to watch a video of a car accident at an intersection. After viewing the film, the participants responded to a question regarding the speed of the two cars at impact. Participants questioned with strong verbs such as smashed or collided recalled seeing the cars wreck at a greater speed than participants questioned with weaker verbs such as bumped or contacted (Loftus & Palmer, 1974). These findings provide evidence of how susceptible individuals are to false memory induction.

While Loftus was concerned with false memory induction in episodic memories, James Deese (1959) had created a paradigm for investigating the cognitive mechanisms of false memory induction fifteen years earlier (McEvoy, Nelson, & Komatsu, 1999). Deese's false memory induction paradigm was more streamlined, although less experimentally rigorous, than previous research on false memories (Roediger & McDermott, 1995). Deese (1959) composed 36 lists of 12 words, with each list consisting of semantically related words and a critical lure that was not present in the original list. Deese used the critical lures to induce false memories in participants. He discovered that in a free recall task, many participants would erroneously report the critical lure as being part of the original list.

Expanding on Deese's (1959) methodology, Roediger and McDermott (1995) devised a new paradigm in the hope of replicating Deese's results. This new paradigm included a recognition task in addition to the previous recall task and implemented only six lists of words, each with a critical lure, in comparison to Deese's 36 lists. After participants' exposure to the lists, Roediger and McDermott asked the participants to perform a serial recall task, writing down the words from the previous list in the correct order. After the recall task was completed, the participants began work on a 42-item recognition task. Twelve words in the task were words that appeared on the previous lists. Six of the words were critical words from which the lists were generated (these served as critical lures to confuse the participants). Twelve words in the recognition task were unrelated to any of the words appearing in any of the previous six lists. Lastly, the remaining 12 words were weakly related to the words on the previous lists (these served as weak lures). The results indicated that on both the recall task and the recognition task, the participants reported the critical lures as appearing on the previous lists almost as often as the previously studied items (Roediger & McDermott, 1995). This expansion of Deese's original study has become the gold standard for studying false memories. The new methodology is known as the Deese/Roediger & McDermott (DRM) paradigm (McEvoy, Nelson, & Komatsu, 1999).

The DRM is not limited to strictly semantic lists of words; many researchers have applied the DRM to studies of false memory induction using phonological lists of words. Consistent with the research involving strictly semantic lists, participants recall and recognize critical phonological lures almost as often as they recognize previously studied material (Watson, Balota, & Roediger, 2003). Prior to this research, a direct comparison between strictly phonological lists and strictly semantic lists had not been performed under ideal experimental conditions. Watson, Balota, and Roediger (2003) examined the interaction between a strictly semantic list and a composite list composed of semantically related words and select phonological cues associated with the critical semantic lure. For example, if the critical semantic word was sports, the phonological cues could be ball, wall, call, and doll. The experimenters determined that hybrid lists of semantic and phonological lures produced twice as many false recalls of semantic critical lures as strictly semantic lists (Watson et al., 2003). The experimenters also determined that purely semantic lists produce a greater number of false memories than purely phonological lists (Watson et al., 2003).

To account for the aforementioned findings, Meade, Watson, Balota, and Roediger (2007) reviewed the theories and evidence explaining the DRM's ability to induce false memories. They argued that the most convincing theory is the Implicit Associative Response (IAR) model. Based on the spreading activation theory of memory retrieval, the IAR model suggests that when an individual views a word during the DRM paradigm, the word activates its meaning in a neural network and implicitly spreads activation to other related words and concepts. According to the IAR model, the activation of these related words and concepts leads individuals to erroneously remember highly associated words and concepts as being part of the original list (Meade et al., 2007). One study conducted by Robinson and Roediger (1997) provides evidence for this theory. They tested the efficacy of short word lists versus long word lists in inducing false memories and found that longer lists induced more false memories. Meade et al. (2007) hypothesized that the greater number of words presented created a larger network of associations, increasing the chances of inducing false memories in individuals.

In addition to the IAR model, Meade et al. (2007) explained that spreading activation alone cannot account for false memory creation and that a decision-making process can mitigate the formation of false memories. The Activation-Monitoring Theory (AMT) suggests that during the DRM task, an individual's decision about the source of a word can determine how effectively that word creates a false memory. Meade et al. (2007) reviewed a study that provided evidence for this theory. The researchers found that warning subjects about critical lures meant to induce false memories significantly reduced the number of false memories created.

The production of false memories is not merely a result of external factors in the environment; individual differences also play a part. However, much of the research on individual differences in the production of false memories has been limited to geriatric and clinical populations (Watson, Balota, & Sergent-Marshall, 2001). In many cases, the clinical populations studied have low-grade Alzheimer's disease in conjunction with dementia. Alzheimer's disease is characterized by ongoing degeneration of brain structures and continued impairments in cognitive functioning. Watson, Balota, and Sergent-Marshall (2001) viewed this population as a chance to measure individual differences between veridical recall and false recall. Using a modified version of the DRM, the experimenters were able to compare false memory induction across three separate types of word lists (semantic, phonological, and hybrid). The results indicated that as individuals age, their ability to recall previously learned information (veridical recall) decreases while the probability that they will falsely recall something increases (Watson et al., 2001).

These effects are further exacerbated by the prevalence of low-grade Alzheimer's in some of the participants. With episodic memory being one of the first areas to be affected by Alzheimer's, it makes sense that this particular population reported greater instances of false memory induction in the semantic group than in the phonological group. Watson, Balota, and Roediger (2003) had already determined that hybrid lists of semantic and phonological words create more false memories than lists that are purely semantic or phonological, but participants with low-grade Alzheimer's (DAT) reported three times as many false memories when exposed to the hybrid list as any other population group.

A considerable amount of the aforementioned false memory research (Deese, 1959; Loftus & Palmer, 1974; Roediger & McDermott, 1995; Watson, Balota, & Roediger, 2003) has involved the investigation of false memory induction in long-term memory (LTM). In addition, although researchers have investigated phonological word lists (Chan, McDermott, Watson, & Gallo, 2005) and semantic word lists (Roediger & McDermott, 1995), only some researchers (Watson, Balota, & Roediger, 2003) have compared the efficacy of the two, and these comparisons are limited to LTM. Additionally, research investigating false memory induction in working memory (WM) (Atkins & Reuter-Lorenz, 2008; Tehan, 2010) has been limited to either strictly semantic or strictly phonological word lists.

In contrast to past research on the efficacy of phonologically and semantically related words in inducing false memories in LTM, we sought to compare the efficacy of phonologically and semantically related words in WM. Past researchers using the DRM have implemented only one critical lure per word list. We created a modified DRM with five critical lures per list, allowing participants to make more critical lure errors in all of our conditions. We hoped this modified paradigm would be a more robust method for measuring the quantity of false memories at one time than methods with a limited number of lures. In addition, we were interested in seeing if our instrument was sensitive enough to detect the significant differences found in previous false LTM research between semantically related word lists and phonologically related word lists. We hypothesized, in congruence with previous research and the spreading activation theory of false memories, that semantic lures would create more false memories than phonological lures in WM.

Method

Participants

The sample consisted of 63 Longwood University undergraduate psychology students. We recruited participants using convenience sampling via an online management system and active solicitation. Participants received one extra credit point in their psychology course for their participation. There were 22 males and 41 females with ages ranging from 18 to 40 (M = 20.17, SD = 2.89). Twenty-one freshmen, 9 sophomores, 18 juniors, 14 seniors, and one individual who omitted their class rank participated in the experiment. Twenty-one of the participants viewed the semantic word lists, 20 viewed the phonological word lists, and 22 viewed the arbitrary word lists.

Materials and Procedure

In a between-groups design, we randomly assigned participants to one of three groups: a semantically related words group, a phonologically related words group, or an arbitrarily related words group that served as a control. We deceived the participants into believing they were participating in a study measuring the effects of color on memory. At the beginning of the experiment, participants received instructions on what they would be doing during the experiment and signed informed consent forms.

In each group, participants viewed two separate lists of 16 words via PowerPoint presentation. The phonological group viewed words that were rhythmically similar (Table 1), the semantic group viewed words that were conceptually or categorically similar (Table 2), and the arbitrary group viewed words that were unrelated conceptually and rhythmically (Table 3). Each word was displayed individually at 1 s intervals. After participants viewed each list of words, the PowerPoint presentation cycled to a blank screen. At this point, the presentation stopped and participants had 90 s to complete a memory recognition task that corresponded to the previously viewed word list (see Appendices A, C, and E). Each recognition task consisted of 11 words that appeared in the recently viewed presentation and five critical lures (Table 4). Critical lures were words similar to those previously viewed, but not actually present during the presentation, used to induce false memories. The researchers instructed participants to circle words they remembered seeing in the PowerPoint presentation. After the completion of the first recognition task, the experimenters began the second word list in the PowerPoint presentation. After the second presentation finished, the researchers gave the participants another memory recognition task similar to the first (see Appendices B, D, and F). Once both memory tasks were completed, the experimenters collected the participants' worksheets and then debriefed them about the true nature of the study.

After collecting all the data, we counted the total number of critical lures falsely remembered as being part of the original list in the PowerPoint presentation for each group. We then compared the totals to determine which group falsely recognized the most critical lures and thus which type of word list was most efficacious in inducing false memories. In addition, we totaled the number of words that participants in each group correctly recognized in order to determine if the recognition tasks were similar in difficulty. This was to ensure that differences in task difficulty did not confound our study.
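The scoring described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure the researchers used: the function name `score_sheet` and the example words are hypothetical, since the actual word lists and critical lures appear only in the tables and appendices.

```python
def score_sheet(circled, studied, lures):
    """Score one recognition sheet.

    circled: words the participant circled
    studied: set of words that actually appeared in the presentation
    lures:   set of critical lures (never presented)
    Returns (correct_words, false_memories).
    """
    # Normalize case and whitespace so "Bed " and "bed" count as the same word.
    circled = {w.strip().lower() for w in circled}
    correct_words = len(circled & studied)   # studied words correctly recognized
    false_memories = len(circled & lures)    # critical lures falsely recognized
    return correct_words, false_memories


# Hypothetical example words, not taken from the study materials.
example_studied = {"bed", "rest", "tired", "dream"}
example_lures = {"sleep", "nap"}
example_score = score_sheet(["Bed", "sleep", "dream"], example_studied, example_lures)
print(example_score)
```

Summing the two counts over both recognition tasks gives each participant's totals, which are the dependent measures analyzed in the group comparisons.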

Results

A one-way analysis of variance (ANOVA) showed significant differences in the number of false memories reported across the word groups, F(2, 60) = 8.522, p = .001. A Scheffé post hoc analysis revealed no significant difference in the number of critical lures reported between the semantic group (M = 1.29; SD = 0.902) and the phonological group (M = 1.70; SD = 1.302), p = .42. However, the semantic group reported significantly more false memories on the recognition task than the arbitrary group (M = 0.45; SD = 0.739), p = .03. The phonological group also reported significantly more false memories on the recognition task than the arbitrary group, p = .001 (see Figure 1).

To ensure that the recognition task in one group was no more difficult than the recognition task in another group, we performed a second ANOVA comparing the number of correct words reported on the corresponding recognition tasks. There were no significant differences in the number of correct words reported on the recognition tasks across the word groups, F(2, 60) = 0.673, p = .514. The participants in the semantic group (M = 13.90; SD = 3.618), the phonological group (M = 15.05; SD = 2.188), and the arbitrary group (M = 14.73; SD = 3.718) performed comparably, with no group recognizing significantly more correct words than any other group (see Figure 2).
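As a rough check, the F ratios of the two ANOVAs can be approximated from the reported group sizes, means, and standard deviations alone. The sketch below is not the analysis the authors ran (they presumably worked from raw scores in a statistics package); the function name `anova_from_summary` is hypothetical, and because the inputs are rounded summary values, the output should land close to, but not exactly on, the reported statistics.

```python
def anova_from_summary(groups):
    """One-way ANOVA F ratio from per-group (n, mean, sd) summaries.

    Returns (F, df_between, df_within).
    """
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / n_total
    # Between-groups sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Within-groups sum of squares: recovered from each group's sample SD.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# (n, M, SD) for the semantic, phonological, and arbitrary groups, as reported.
false_memories = [(21, 1.29, 0.902), (20, 1.70, 1.302), (22, 0.45, 0.739)]
correct_words = [(21, 13.90, 3.618), (20, 15.05, 2.188), (22, 14.73, 3.718)]

for label, data in [("false memories", false_memories), ("correct words", correct_words)]:
    f_ratio, df_b, df_w = anova_from_summary(data)
    print(f"{label}: F({df_b}, {df_w}) = {f_ratio:.2f}")
```

Working from summary statistics is convenient when raw data are unavailable, but it cannot reproduce the post hoc p values, which require the full analysis.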

To determine if there were sex differences in the number of critical lures reported on the recognition tasks, we performed an independent-samples t test. The test showed that females (M = 1.41; SD = 1.161) reported more critical lures than males (M = 0.59; SD = 0.796) on the recognition tasks, t(61) = 3.976, p < .001 (see Figure 3).

Discussion

Contrary to our hypothesis, the data did not suggest that semantic word lists were more efficacious than phonological lists in inducing false memories in WM. However, the findings did reveal that participants in both the phonological and semantic groups created more false memories than participants in the arbitrary group. These findings are incongruent with previous false LTM research (Chan, McDermott, Watson, & Gallo, 2005; Watson, Balota, & Roediger, 2003), which found that semantic word lists tend to induce false memories at greater frequencies than phonological lists. These differences in results might be attributed to the paradigm of the present study. In contrast to past research that has focused on inducing false LTM (Roediger & McDermott, 1995), we investigated false memory induction in WM. It is possible that the semantic networks responsible for creating false memories during LTM semantic recognition tasks are not as efficacious during WM recognition tasks. This could account for the similar rates of false recognition in the phonological and semantic groups.

In addition, unlike previous research, we used a modified DRM paradigm. Instead of using 55 lists of words with only one critical lure per list, our research consisted of only two lists of words with multiple critical lures on each recognition task. We were interested in seeing if we could create a simpler DRM paradigm that could create more than one false memory. However, on average, participants tended to report only one or two critical lures. It is possible that these findings demonstrate a capacity limit for false memories or a limitation of the DRM paradigm to creating only one or two false memories. Although more research is necessary to determine if there is a limitation in the efficacy of the DRM to induce false memories in both WM and LTM, a limit of only one to two critical lures would have serious implications for the ecological validity of the DRM. It would mean that the DRM paradigm is likely not a valid measure of the cognitive processes that account for the creation of more enduring false episodic memories. Moreover, significant results were not limited to the experimental conditions.

Our results revealed sex differences, with females creating more false memories than males on the recognition tasks. These results are contrary to past research on sex differences in false memory induction by Bauste and Ferraro (2004). In Bauste and Ferraro's study, a modified form of the DRM paradigm was shortened to include five critical word lists, with the experimenters predicting that males and females would report more false memories with regard to remembering gender-typed words. Their results suggested that there was no significant main effect for sex on the production of false memories. However, a limitation of their study was that their word lists failed to include a phonological subset category (Bauste & Ferraro, 2004). Taken together, these results suggest that a main effect of sex may not exist for semantic false memory induction but may exist for phonological false memory induction. An idea for future research would be to drastically increase our sample size and use a two-by-two factorial design crossing type of word list with sex to see if there are main effects and an interaction on false memory induction.

A limitation of the current research was that we were not able to determine if a particular critical lure in either the phonological group or the semantic group was more likely to be reported as a false memory than any other lure. We took the semantic word lists from the strongly correlated word lists in McEvoy, Nelson, and Komatsu's (1999) study. However, we created our critical lures for the semantic lists without any analysis to determine if participants reported certain lures at greater frequencies than other lures. In addition, the phonological words in list one and list two were both products of our own creation and were not analyzed to determine if the words were highly correlated with one another. For future research, we would replace the words in the semantic group with highly correlated words from the DRM and replace our lures with other words from the validated measure.

In addition, it would be interesting for future researchers to compare the efficacy of LTM recognition tasks and WM recognition tasks in false memory induction. Researchers could use a similar DRM paradigm that allows participants to create more than one false memory and then manipulate the amount of time between word presentation and the recognition task. Additionally, future research should investigate the limitations of the DRM's ecological validity and determine if a capacity limit on the quantity of false memories exists.

The current paradigm appeared to be capable of inducing false memories, although it did not produce the hypothesized differences between the phonological and semantic groups. Although we did not find differences using our version of the DRM in WM, we believe our findings are of interest to the false memory field and, with future research, might have important implications for false memory induction in both WM and LTM.

References

Atkins, A. S., & Reuter-Lorenz, P. A. (2008). False working memories? Semantic distortion in a mere 4 seconds. Memory & Cognition, 36(1), 74-81. doi: 10.3758/MC.36.1.74

Bartlett, F. C. (1932). Remembering: A study in experimental and social psychology. Cambridge, England: Cambridge University Press.

Bauste, G., & Ferraro, F. R. (2004). Gender differences in false memory production. Current Psychology: A Journal for Diverse Perspectives on Diverse Psychological Issues, 23(3), 238-244. doi: 10.1007/s12144-004-1023-0

Chan, J. C., McDermott, K. B., Watson, J. M., & Gallo, D. A. (2005). The importance of material processing interactions in inducing false memories. Memory & Cognition, 33(3), 389-395. Retrieved from

Deese, J. (1959). On the prediction of occurrence of particular verbal intrusions in immediate recall. Journal of Experimental Psychology, 58(1), 17-22. doi: 10.1037/h0046671

Flegal, K. E., Atkins, A. S., & Reuter-Lorenz, P. A. (2010). False memories seconds later: The rapid and compelling onset of illusory recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 36(5), 1331-1338. doi: 10.1037/a0019903

Loftus, E. F., & Palmer, J. C. (1974). Reconstruction of automobile destruction: An example of the interaction between language and memory. Journal of Verbal Learning & Verbal Behavior, 13(5), 585-589. doi: 10.1016/S0022-5371(74)80011-3

McEvoy, C. L., Nelson, D. L., & Komatsu, T. (1999). What is the connection between true and false memories? The differential roles of interitem associations in recall and recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25(5), 1177-1194. doi: 10.1037/0278-7393.25.5.1177

Meade, M. L., Watson, J. M., Balota, D. A., & Roediger, H. L. (2007). The roles of spreading activation and retrieval mode in producing false recognition in the DRM paradigm. Journal of Memory and Language, 56(3), 305-320. doi: 10.1016/j.jml.2006.07.007

Roediger, H. L., & McDermott, K. B. (1995). Creating false memories: Remembering words not presented in lists. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21(4), 803-814. Retrieved from

Tehan, G. (2010). Associative relatedness enhances recall and produces false memories in immediate serial recall. Canadian Journal of Experimental Psychology, 64(4), 266-272. doi: 10.1037/a0021375

Watson, J. M., Balota, D. A., & Roediger, H. L., III. (2003). Creating false memories with hybrid lists and phonological associates: Over-additive false memories produced by converging associative networks. Journal of Memory and Language, 49, 95-118. doi: 10.1016/S0749-596X(03)00019-6

Watson, J. M., Balota, D. A., & Sergent-Marshall, S. D. (2001). Semantic, phonological, and hybrid veridical and false memories in healthy older adults and in individuals with dementia of the Alzheimer type. Neuropsychology, 15(2), 254-267. doi: 10.1037/0894-4105.15.2.254
