Linking Third Language Learner Corpora to Foreign Language Research and Pedagogy

Principal Supervisor

Professor Zhang Yanhui, Department of Linguistics and Modern Languages

Duration

1 year and 4 months

Approved Budget

HK $409,252

 
  • Project Objectives
  • Description of process, outcomes or deliverables
  • Evaluation
  • Dissemination, diffusion and impact

Project Objectives

Local students enrolled in CUHK modern language courses typically have Cantonese as their first language (L1), English as their second language (L2), and the additional modern language as their third language (L3). The project has developed a sizable L3 learner language corpus, the first and only L3 corpus in existence to date. The corpus aims to facilitate data-driven language teaching and learning activities, including (1) providing an authentic source of L3 learner language to students taking language acquisition courses, to enhance their understanding of various acquisition issues; and (2) presenting language learning errors and learning curves to pre-service and in-service teachers so that they can gain more insight into language pedagogy. The project also introduces a transformative data-driven approach to language teaching and learning.

Description of process, outcomes or deliverables

The corpus includes learners’ language productions from four modern language courses: French, German, Korean, and Spanish. The learner productions will be stored in the CHILDES corpus. The sound files and their transcriptions, along with the learners’ demographic profiles, can be retrieved by target language. A companion website with guidelines on how to use the data for teaching, learning, and research is openly accessible to language instructors and students.
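As an illustration only, retrieval by target language could be sketched as follows. This is a minimal sketch under stated assumptions: the record fields, file names, and the helper `by_target_language` are hypothetical and do not reflect the actual CHILDES storage format or the companion website's interface.

```python
from dataclasses import dataclass, field


@dataclass
class LearnerRecord:
    # Hypothetical schema: the project summary does not publish field names.
    target_language: str          # one of the four course languages
    audio_file: str               # name of the sound file
    transcription: str            # transcription of the recording
    learner_profile: dict = field(default_factory=dict)  # demographic details


def by_target_language(records, language):
    """Return the records whose target language matches, ignoring case."""
    wanted = language.casefold()
    return [r for r in records if r.target_language.casefold() == wanted]


# Placeholder records for illustration; these are not real corpus data.
records = [
    LearnerRecord("French", "fr_001.wav", "placeholder transcription",
                  {"L1": "Cantonese", "L2": "English"}),
    LearnerRecord("Korean", "ko_001.wav", "placeholder transcription",
                  {"L1": "Cantonese", "L2": "English"}),
]

french = by_target_language(records, "french")
```

Grouping each sound file with its transcription and learner profile in one record keeps the three pieces of information together when a subset is selected by language.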

Evaluation

The project has two evaluation measures: (1) the qualitative evaluation will take the form of user feedback; and (2) the quantitative evaluation will be based on tracked records of hit rates and data retrieval rates.

Dissemination, diffusion and impact

The corpus has been introduced to modern language instructors in three briefing workshops. Once the large, high-quality, open data sets are available online, language teachers, students, and researchers can retrieve information to gain a concrete understanding of the process and characteristics of language learning and to improve language pedagogy. The corpus will further promote the concept of data-driven foreign language teaching and learning.