Language Shift Effect on Memory Generalization of Chinese-English Bilinguals

Language shift occurs when people learn information in one language but recall it in another. This mismatch between encoding and retrieval language has been found to impair memory accuracy when memory is tested immediately after learning. However, does the language shift effect persist after a delay? Does it influence other aspects of memory, especially memory generalization? To address these two questions, we conducted a memory experiment with unbalanced Chinese-English bilinguals. Participants read two stories (one in English, one in Chinese) and afterward retold the stories in Chinese from memory. Delay interval was manipulated: participants took the recall test either immediately after reading the stories or after a 24-hour delay. To analyze memory generalization, we coded the generalized words participants used to retell the stories. The results suggest that language shift (encoding in English and retrieving in Chinese) leads to more generalized descriptions in a memory recall task. However, the observed language shift effect disappears after a 24-hour delay. We conclude that language shift affects bilingual learners' memory generalization in immediate recall tests but not after a 24-hour delay, which indicates the key role of delay interval in modulating the language shift effect.


Introduction
As English becomes increasingly important as a lingua franca worldwide, more and more college students in China, including non-English majors (in fields such as computer science, medicine, and finance), take major courses in English and read English textbooks (Pang & Ding, 2005). However, most of them will have to communicate what they have learned in Chinese in their future careers, in which case the encoding language does not match the retrieval language. Does this language mismatch influence their memory representation? This question is critical for bilingual education and bilingual memory, and a great deal of research has addressed it (Matsumoto & Stanny, 2006). However, the majority of studies in this area have focused only on memory accuracy and paid little attention to other aspects of memory, especially memory generalization. In addition, most relevant research has not discussed the influence of delay interval, which is believed to be a mediating factor in memory recollection. Therefore, this study aims to explore the influence of language shift on memory generalization and to investigate how the language shift effect is modulated by delay interval.
A considerable amount of research has focused on the language shift effect on memory, that is, the finding that memory performance decreases considerably when people use different languages for encoding and retrieval (Altarriba, 2000). One earlier study examined academic learning among Russian-English bilingual college students. In that study, participants first learned a fictitious history lesson in one of their languages and a fictitious geography lesson in the other. They were then tested on the materials in two sessions, one in Russian and one in English. Results revealed that participants retrieved the learned material faster and more accurately when tested in the language of training, which indicated a language shift effect on academic learning. Similarly, Marian and Fausey (2006) examined how language shift influenced students' academic learning by testing 24 Spanish-English bilinguals with academic stories. In the experiments, stories and judgment questions were presented to the participants in either English or Spanish, and the languages of the stories and questions were either consistent or inconsistent. The results showed that when the same language was used at learning and testing, participants' judgment accuracy increased and reaction time decreased significantly. Grabner, Saalbach, and Eckstein (2012) further explored the cognitive and neural mechanisms of the language shift effect in an fMRI study with Italian-German bilinguals. They found that language shift elicited additional task-specific processing, resulting in greater brain activity during language switching. These studies agree with prior suggestions that a language shift between encoding and retrieval affects recalled memory.
However, the studies reviewed above mainly used narratives introducing novel scientific knowledge and paid limited attention to memory for events that are highly familiar to everyone and closely related to everyday life experience. It should be noted that memorizing scientific concepts and memorizing daily stories may rely on different encoding mechanisms, which can be divided into laboratory memory and autobiographical memory (Roediger & McDermott, 2013). When memorizing a daily event, people can easily understand and interpret the story using their prior experience and knowledge (Bartlett, 1932). Hence, their memories are more likely to be influenced by both prior experience and the actual text input. In contrast, when learning novel scientific concepts, people's memories depend almost exclusively on the text input because of their limited knowledge of these topics. Additionally, the information presented in scientific passages usually does not have an inherent causal structure or an event gist, whereas stories about daily life usually do. As a consequence, people tend to remember the details of scientific narratives but retain both the details and the gist of daily stories. Therefore, examining the language shift effect with daily events allows us to further explore how language shift modulates the integration of prior knowledge and text representation, and how it influences the encoding and retrieval processes of memory.
In addition, prior studies of the language shift effect emphasized only memory accuracy, and it remains unclear whether language shift influences memory generalization (Marian & Fausey, 2006). It has been suggested that human memory is not an exact replica of the original experience but a reconstruction of past experience that relies on the event gist to rebuild lost details (Guerin et al., 2012). For example, Sulin and Dooling (1974) tested participants' memories with two short biographical stories. In their experiment, participants were required to read every sentence of the stories aloud. After a delay, they were asked to judge whether a given sentence was the same as one they had read. It was found that their memories for the gist were retained, but details were forgotten with delay. The study suggested that memory is recollected in a generalized way. Hence, when we discuss how language shift affects memory representation, it is essential to consider its influence on memory generalization. Therefore, this study investigates how language shift affects memory generalization.
Moreover, previous studies of the language shift effect rarely discussed how the effect varies with delay interval (Gablasova, 2014). It has been found that memories for texts and words decay with time and that the story schema gradually becomes more dominant in memory representations, as more and more story details are assimilated over longer retention intervals (Bartlett, 1932). Schmalhofer and Glavanov (1986) also proposed that text information is generally encoded at three levels: surface structure (word-by-word text), propositional structure (abstract meanings or propositions), and situational structure (story gist). These three levels of information decay at different rates. More specifically, memory for the three structures is equally complete in an immediate recall test, but the surface structure barely survives even 40 minutes, while memory for the situational structure can last for four days after learning (Kintsch et al., 1990). As time is a critical factor in both memory decay and generalization, it is necessary to test the language shift effect on memory at different delay intervals. Hence, the current study examines how language shift influences memory recollection immediately after learning and one day after learning. Presumably, more generalized memory will be observed with longer delay intervals.
In sum, this study aims to investigate how language shift influences memory generalization and how this effect varies with delay interval. To that end, we use daily stories as experimental materials and ask participants to retell the learned stories either immediately or after 24 hours. Participants learn the stories in either English or Chinese and recollect them in Chinese afterward. In this way, stories are recalled either in the same language as they were learned (no language shift condition, CH-CH) or in a different language (language shift condition, EN-CH). In the data analysis, to measure how memory generalization varies with language shift condition and delay interval, we code the generalized words in participants' retellings and count the sub-events containing generalized words. Following previous studies in this field, we predict that (1) language shift influences the process of memory generalization; (2) a longer delay interval may lead to more generalized memory; and (3) delay interval may mediate the language shift effect on memory generalization.

Methodology
A mixed experimental design was used to test the influence of language shift and delay interval on memory generalization. Specifically, all participants learned one story in the language shift condition (EN-CH) and another story in the no language shift condition (CH-CH). Their memories for the stories were then tested either immediately (immediate test group) or 24 hours after learning (delayed test group).

Participants
A total of 60 native Chinese speakers were recruited for the experiment. Thirty were randomly allocated to the immediate test group (20 females and 10 males, mean age: 21.7) and 30 to the delayed test group (24 females and 6 males, mean age: 21.1).
All participants had been learning English as a second language for at least 7 years. Their English proficiency ranged from intermediate to advanced. Most of them had passed the CET-4, an English test for college students that assesses intermediate-level listening, reading, translating, and writing skills. In addition, their vocabulary knowledge was tested on the online platform Lextale, which has a maximum score of 100. The results showed that participants' vocabulary was at an intermediate or advanced level (mean = 73.38, SE = 9.03). Further analysis of CET-4 scores and Lextale scores showed no significant difference in English proficiency between the immediate test group and the delayed test group (p>0.05).

Materials
Two short stories about daily life events were composed specifically for the study. They were closely matched in topic, story length, story structure, amount of conveyed information, vocabulary difficulty, and the grammatical structure of story details (see Appendix and Table 1). We could therefore rule out confounding effects of the experimental materials on the results. We also administered an online familiarity questionnaire to examine participants' familiarity with the story details. In the questionnaire, participants rated their familiarity with the objects and events contained in the stories on a 7-point scale, on which 1 indicated "not familiar at all" and 7 indicated "very familiar". The results showed that the two stories were not significantly different in familiarity (p>0.05). In addition, both stories had a Chinese version and an English version, which were translated precisely in terms of grammar and lexicon. To guarantee that the meaning and grammar of the two versions were consistent, three translators translated the stories independently and then discussed their versions to decide on the final translations. Moreover, to reduce comprehension difficulty and avoid ambiguity, polysemous words were excluded from the stories, and all the words in the English versions were selected from the required vocabulary list for the National College Entrance Examination, which was highly familiar to all participants.

Design and Procedure
There were two lists in the experiment, each of which contained two stories, one in Chinese and one in English. Specifically, List 1 included the Chinese version of Story1 and the English version of Story2. List 2 included the English version of Story1 and the Chinese version of Story2. The two stories in each list were presented in random order in the experiment.
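The counterbalancing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `LISTS` and `presentation_order` are hypothetical, and the text does not say how participants were assigned to lists or how the random order was generated.

```python
import random

# The two counterbalanced lists described in the text:
# each story appears once in Chinese and once in English across lists.
LISTS = {
    1: [("Story1", "Chinese"), ("Story2", "English")],
    2: [("Story1", "English"), ("Story2", "Chinese")],
}

def presentation_order(list_id, rng):
    """Return the two stories of a list in random order (hypothetical helper)."""
    stories = list(LISTS[list_id])  # copy so the template list is not mutated
    rng.shuffle(stories)
    return stories

rng = random.Random(2021)  # seed chosen arbitrarily for reproducibility
for list_id in (1, 2):
    print(list_id, presentation_order(list_id, rng))
```

Whatever the order, every participant reads one story in each language, so each participant contributes one EN-CH recall and one CH-CH recall.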
Due to COVID-19, the experiment had to be carried out online. We used the online survey platform Qualtrics to present the stories. During the presentation, the experimenters communicated with participants and gave instructions through video calls on the social media platform Tencent QQ, and recorded participants' voices in the memory recall tasks with a voice recorder. Participants were instructed to memorize the stories in as much detail as possible while reading them aloud at their own speed. They were given only one chance to read through the two stories, which were presented one by one in random order. After reading the two stories, participants in the immediate test group took the free recall task, in which they were required to retell the learned stories in Chinese. In this task, a cue word referring to the story theme (e.g., shopping and zoo) was shown on the screen as a retrieval cue, and 30 seconds were then given for preparation. Participants were told to organize their words freely according to their memories and preferences. They were encouraged to describe whatever they remembered in as much detail as possible, with no time limit and no requirement on speaking speed. Participants in the delayed test group were not tested immediately after learning. Instead, they were asked to follow their regular schedules and to take the free recall task the next day, exactly 24 hours after learning.
After the memory tests, participants in both groups completed a questionnaire to provide the required personal information, including age, gender, birthplace, educational background, and CET-4 scores. They then took the online vocabulary test on the platform Lextale.

Data Treatment
Firstly, the voice recordings of all participants were automatically transcribed into text using Xunfeitingjian, a speech recognition and transcription app launched by Iflytek Co., Ltd. Three coders then checked the transcriptions manually and independently to guarantee their accuracy, and discussed them to produce the final versions.
Next, we coded the generalized words that appeared in participants' recalled descriptions. Superordinate category names that related to the story details but were not directly presented in the original texts were regarded as generalized words. For example, in the original text of Story 1, participants saw "orange", "tomato", "beef", "eggs" and "strawberries", which are closely related to unseen category names such as "fruit" and "vegetables". These category names could be extracted from the text information of the stories, which reflects the process of memory generalization. In total, 175 generalized words (including synonyms) were coded for data analysis, 97 in the delayed test group and 78 in the immediate test group. Two coders coded the generalized words independently, blind to the experimental conditions. As participants did not use generalized words frequently in their story descriptions, we further coded whether each story recall included generalized words. If a story recall included at least one generalized word, it was coded as 1, indicating "with the generalized word"; otherwise it was coded as 0, indicating "without the generalized word". Inter-coder agreement, assessed with Cohen's kappa, ranged from 84% to 97%, which indicates strong agreement and suggests that 64-81% of the data are reliable. In addition, as information omission may influence our data analysis and interpretation, we also coded the omission of sub-events (shopping, cleaning, dressing up, and watching animals). If a story recall omitted one or more sub-events, we coded the story as 1, indicating "sub-event omission". If no sub-event was omitted in the story recall, it was coded as 0, indicating "no sub-event omission".
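The binary coding rule and the agreement measure can be illustrated with a short Python sketch. Everything here is hypothetical: the recall sentences are English glosses invented for illustration (the actual recalls were in Chinese and were coded by hand), the two coders' codes are made-up values, and the function names are not taken from the study. The kappa formula is the standard two-rater definition, (observed agreement - chance agreement) / (1 - chance agreement).

```python
def contains_generalized_word(recall, generalized_words):
    """Code a recall as 1 if it contains at least one generalized word, else 0."""
    return int(any(word in recall for word in generalized_words))

def cohen_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who rated the same items."""
    n = len(codes_a)
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    labels = set(codes_a) | set(codes_b)
    # Chance agreement: product of the two coders' marginal proportions per label.
    expected = sum((codes_a.count(l) / n) * (codes_b.count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected)  # undefined if expected == 1

# Hypothetical category names and recalls (English glosses).
GENERALIZED = {"fruit", "vegetables"}
print(contains_generalized_word("She bought some fruit and beef", GENERALIZED))
print(contains_generalized_word("She bought oranges, tomatoes and beef", GENERALIZED))

# Hypothetical binary codes from two coders for ten story recalls.
coder_a = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
coder_b = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
print(round(cohen_kappa(coder_a, coder_b), 3))
```

In this toy example the coders disagree on one of ten recalls, so raw agreement is 90% while kappa is lower because it discounts agreement expected by chance.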

Language Shift Effect on Memory Generalization in Immediate Recall and Delayed Recall
Immediate Recall: To test whether language shift (English-Chinese or Chinese-Chinese) influences participants' production of generalized descriptions, we performed a Chi-square test comparing the number of stories with and without generalized words under the two conditions. The results showed a significant influence of encoding language on the production of generalized descriptions (χ2 = 5.55, p = 0.035) (see Figure 1). Specifically, when stories were encoded in English and then recalled in Chinese, participants were more likely to include generalized words in their descriptions (73%, 22 of 30) than when stories were encoded and retrieved in Chinese (43%, 13 of 30).

Figure 1 Language shift effect on memory generalization in an immediate recall test
Delayed Recall: Although we observed a language shift effect on memory generalization in the immediate recall test, we did not obtain this effect in the delayed test group. We applied the same coding and Chi-square analysis to the delayed test group, but there was no significant difference between the language shift condition and the no language shift condition (χ2 = 0.29, p = 0.79) (see Figure 2). Specifically, when stories were both encoded and retrieved in Chinese, 60% (18 of 30) of the recalled stories contained generalized words. When stories were encoded in English and recollected in Chinese, 67% (20 of 30) of the recalled stories contained generalized words.
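As a rough check, the Chi-square statistics for the two language shift comparisons can be recomputed from the reported counts. The sketch below assumes a Pearson Chi-square test on a 2 x 2 table without continuity correction; the function name `chi_square_2x2` is hypothetical. The p-value it returns is derived from that uncorrected statistic with one degree of freedom, so it need not match the reported p-values if a continuity correction or an exact test was used, which the text does not specify.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # Upper-tail probability of the chi-square distribution with df = 1.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Rows: EN-CH, CH-CH. Columns: stories with / without generalized words.
immediate = chi_square_2x2(22, 8, 13, 17)
delayed = chi_square_2x2(20, 10, 18, 12)

print(f"immediate: chi2 = {immediate[0]:.2f}")  # 5.55
print(f"delayed:   chi2 = {delayed[0]:.2f}")    # 0.29
```

Both statistics agree with the values reported above, which suggests that the reported χ2 values were computed without a continuity correction.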

Memory Generalization and Delay Interval
Memory Generalization: To investigate how memory generalization changes with time, we also performed a Chi-square analysis comparing the immediate test group and the delayed test group. Although there was a tendency toward more generalized descriptions in the delayed test group, we did not observe a significant increase in the use of generalized words (χ2 = 0.32, p = 0.71). Specifically, 58% (35 of 60) of the stories in the immediate test group and 63% (38 of 60) of the stories in the delayed test group included generalized words.
Memory Omission: Why did we fail to observe an influence of delay on memory generalization? One possible reason lies in memory omission. More specifically, instead of producing a generalized description, people may omit an entire sub-event (shopping, cleaning, dressing up, or watching animals) from the story, which would leave no obvious delay effect on generalization. To test this possibility, we conducted a Chi-square test on the omission data. The results showed a significant difference in the number of stories with omissions between the immediate test group and the delayed test group (χ2 = 7.60, p = 0.01) (see Figure 3). Specifically, 57% (34 of 60) of the stories in the delayed test group were recalled with omitted sub-events, compared with 32% (19 of 60) of the stories in the immediate test group.
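The omission comparison can be reproduced from the binary codes themselves. The sketch below is an illustration, not the authors' analysis script: it rebuilds the 0/1 omission codes from the reported counts (the order of individual stories is unknown and irrelevant here), then computes a Pearson Chi-square statistic from observed and expected cell counts, assuming no continuity correction. The helper names are hypothetical.

```python
import math

def codes_from_counts(n_flagged, n_total):
    """Rebuild a list of 0/1 codes from a reported count (story order is arbitrary)."""
    return [1] * n_flagged + [0] * (n_total - n_flagged)

def pearson_chi_square(group_a, group_b):
    """Pearson chi-square for two groups of binary codes, via expected counts."""
    observed = [
        [sum(group_a), len(group_a) - sum(group_a)],
        [sum(group_b), len(group_b) - sum(group_b)],
    ]
    total = len(group_a) + len(group_b)
    row_totals = [sum(row) for row in observed]
    col_totals = [observed[0][j] + observed[1][j] for j in range(2)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail, df = 1
    return chi2, p_value

delayed_omission = codes_from_counts(34, 60)    # 34 of 60 stories with omission
immediate_omission = codes_from_counts(19, 60)  # 19 of 60 stories with omission

chi2, p_value = pearson_chi_square(delayed_omission, immediate_omission)
print(f"omission: chi2 = {chi2:.2f}")  # 7.60
```

Working from story-level codes rather than a pre-built table keeps the analysis tied to the coding scheme, so the same function could be applied to the generalized-word codes.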

General Discussion
In line with our predictions and previous findings, we found that stories encoded in L2 and then recalled in L1 elicited more generalized descriptions in the immediate recall test. However, this effect disappeared after a 24-hour delay. In addition, contrary to our predictions, we did not observe a delay effect on memory generalization, which could be explained by increased omission in the delayed test group. We discuss these findings in detail in this section.
Firstly, we observed a language shift effect on memory generalization, which supports the memory-dependent theory and is consistent with previous findings in this area (Marian & Neisser, 2000). Why would memory generalization be influenced by language shift in the immediate group? One possibility is that when people recall stories in a language different from the encoding language, they have to translate the encoded text into the other language according to its meaning during recollection. People's lexicon, which is based on previous knowledge, may therefore be involved in the translation process. Moreover, as suggested by the Hierarchical Network Model and the Spreading Activation Model, related superordinate concepts are automatically activated during semantic concept retrieval (Heredia, 1997; Martin et al., 1994). As a result, superordinate concepts, that is, generalized words, are reported in participants' verbal descriptions of the stories. In contrast, if participants learn and immediately recall stories in the same language, they can recollect the story directly by orally repeating the text they have just seen, because the original text is stored in memory and needs no translation. In this way, when memory is tested immediately after learning, more generalized descriptions are observed when the encoding and retrieval languages are mismatched than when they are matched.
However, it should be noted that the observed language shift effect is confounded with an encoding effect. In this study, the language shift condition means that participants learn stories in English, whereas in the no language shift condition they learn stories in Chinese. Encoding stories in English (language shift condition) requires larger processing costs than encoding stories in Chinese. Therefore, when stories are initially encoded in English, participants cannot encode the story details efficiently, and the details are poorly encoded. As a result, with only limited story details in memory, people have to describe the events in a more generalized way during retrieval. In contrast, when stories are initially encoded in Chinese (no language shift condition), the story details can be easily encoded, stored, and recollected, so people do not have to use generalized words to describe the stories. The two explanations are not mutually exclusive, because the two processes may occur at the same time (Dell, 1986).
Secondly, why did the language shift effect on memory generalization disappear after 24 hours? The reason possibly lies in the mechanism of memory consolidation (Levy & Wagner, 2013). Specifically, when people are tested immediately after learning the stories, their memories are vivid and they can remember most of the specific words they saw in the stories. In the immediate recall test, therefore, bilinguals' memories for texts are presumably language-based or text-based. Hence, the consistency of language between encoding and retrieval has an impact on memory generalization (as discussed earlier). However, memory for the stories decays gradually with time, and the specific words used in the texts are eventually forgotten. People then have to reconstruct the story based on its gist and their own interpretation of its content. Memory becomes text-independent or language-independent, so translation between the two languages is no longer needed. As a result, the influence of encoding language on generalized description disappears as well.
Thirdly, contrary to our predictions, we did not observe a main effect of delay on memory generalization. How can this be explained? One possibility is that we did not code every generalized detail of the whole story but focused only on the target events (shopping, cleaning, dressing up, and watching animals). Another possibility is that omission rather than generalization is the core feature of memory decay. The analysis of sub-event omission showed significantly more omissions in the delayed test group than in the immediate test group, which may have made memory generalization harder to detect.

Conclusion
This research is the first to discuss the influence of language shift on memory generalization, whereas previous studies focused only on the language shift effect on memory accuracy. It therefore contributes to a better understanding of the language shift effect. In addition, this study sheds light on the effects of delay interval and language shift on generalized description in memory recall. Moreover, this study has implications for Chinese-English bilingual education: it indicates that language shift and L2 encoding give rise to more generalized memory for the learned information and therefore affect English learners' memory retrieval, an issue of which English educators in China should be aware.
However, this study also has some limitations. Firstly, the L2 proficiency of participants was not measured specifically with a standard test, which may have influenced the final results. Future work could therefore focus on the effect of language proficiency. In addition, this study tested the language shift effect only in the L1 retrieval condition, and it remains unknown how the language shift effect works when information is retrieved in L2. Hence, further studies should compare all four language conditions (L1 learning-L2 testing, L1 learning-L1 testing, L2 learning-L1 testing, L2 learning-L2 testing) to further examine the language shift effect and the influence of the encoding and retrieval languages.
To conclude, the present research investigated how language shift and delay interval influence the memory generalization of Chinese-English bilinguals. The results showed that language shift led to more generalized memory for daily stories and that this effect was modulated by delay interval, possibly through the mechanism of memory consolidation. The findings of this study have important implications for our understanding of bilingual memory and bilingual education.