Presenters

Contagious Words/Epidemic Behaviours:
Language Big Data Approaches to COVID-19

17 October 2020 (Saturday) @Zoom, and @The Hong Kong Polytechnic University

Jinghang GU

Title: Extreme Multilabel Classification on Covid19 Literature.

Abstract
The extreme multi-label classification (XMC) problem arises in many biomedical applications, such as document indexing and disease categorization. Recently, with the rapid development of deep neural networks, deep learning methods have achieved outstanding performance in XMC tasks. This paper describes our work on the XMC task for COVID-19 articles, where the objective is to assign each document the most relevant semantic labels from an extremely large label set. We first constructed the COVID-19 semantic indexing corpus (CSIC), which consists of more than 80,000 COVID-19 articles labelled with MeSH terms. We then proposed to leverage the correlation neural network to represent the latent label correlations and thereby enhance the model predictions. Experimental results show that the correlation neural network can significantly improve prediction performance and can be easily extended to other existing deep XMC models.

Short Biography
Dr. Jinghang Gu is a postdoctoral researcher in the Department of Chinese and Bilingual Studies (CBS), The Hong Kong Polytechnic University (PolyU), under the supervision of Prof. Chu-Ren Huang. He received his M.S. and Ph.D. degrees from the Department of Computer Science and Technology, Soochow University, China, in 2014 and 2017, respectively. Before joining PolyU CBS, Dr. Gu worked as a senior natural language processing engineer in the Big Data Group at Baidu. His research focuses on Natural Language Understanding and Big Data Mining. He is currently working on biomedical information extraction, knowledge discovery and extreme multi-label classification.

Christine M. JI

Title: Bayesian Network analysis of the Australian Bureau of Statistics COVID-19 Household Survey.

Short Biography
Christine Meng Ji specialises in empirical translation studies, especially data-driven multilingual corpus analyses. She has published on environmental translation, healthcare translation, statistical translation stylistics/authorship attribution, and international multilingual education (statistical translation quality evaluation). She is the author/editor of more than two dozen research books (with Cambridge University Press, Oxford University Press, Routledge, Palgrave, Springer, John Benjamins, Waseda University Press in Tokyo, University of Montréal Press), the editor of two special journal issues published by the MIT Press (Leonardo), USA and University of Montréal Press in Canada (Meta: Journal des traducteurs), and the author of more than 50 journal papers and book chapters on empirical translation studies. She is the editor of The Oxford Handbook of Translation and Social Practices, New York: Oxford University Press (with Professor Sara Laviosa) (2020); editor of Advances in Empirical Translation Studies, Cambridge University Press (with Professor Michael Oakes) (2019); guest special section editor of Leonardo: TransCreation: Creativity and Innovation in Translation, Cambridge: The MIT Press (2020); founding series editor of the Cambridge Studies in Language Practices and Social Development, Cambridge University Press; founding editorial board member of the series of Cambridge Elements of Translation and Interpreting, Cambridge University Press; and the founding editor of Routledge Studies of Empirical Translation and Multilingual Communication, New York/Oxon: Routledge. Her research has been supported by the British Academy, Japanese Society for the Promotion of Sciences, the Australian Research Council, Economic and Social Research Council of the UK, Toshiba International Foundation, Worldwide University Networks Research Development Fund, and a number of leading universities in Europe, North America, Japan, South Korea, and Brazil.
She is also a qualified professional translator between English, Spanish and Chinese, having previously worked for major international organisations before teaching at universities.

Menghan JIANG

Title: Epidemic or Memetic: Modelling Chinese Neologisms with internet usage data.

Abstract
This paper adopts models from epidemiology and memetics to account for the development and decline of neologisms based on internet usage. The model-fitting research design focuses on the important issue of whether a meme-driven memetic model or a host-driven epidemic model is better suited to explain human behavior regarding neologisms. We extract search frequency data from Google Trends covering the ninety most influential Chinese neologisms from 2008 to 2016 and find that most of them (62 out of 90) exhibit a similar pattern of rapid rise and decay. Memetic and epidemic models are used to fit the evolution of these Internet-based neologisms. Although both models fit the rapid growth phase well, the epidemic model is able to predict the peak point in the neologism’s life cycle. This result underlines the role of human agents in the life cycle of neologisms and supports the macro-theory that the evolution of human languages mirrors the biological evolution of human beings.

Short Biography
Dr. Menghan Jiang is currently a postdoctoral researcher under a joint programme between the Department of Chinese Language and Literature, Peking University and the Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University. Her research interests are corpus linguistics, language variation and language change, language modeling (on human collective behaviours), Chinese syntax, and conceptual metaphor.

Siyu LEI

Title: Contagious words and epidemic behaviors: A usage-based exploration of internet neologisms related to COVID-19.

Abstract
Internet neologisms are contagious and reflect collective human behavior. Can this observation be leveraged to reflect an epidemic situation by tracking the development of internet neologisms over time? This paper proposes an innovative approach to correlate the competition of neologisms with the development of an epidemic as collective human behavior changes, enhancing our understanding of these two phenomena. Specifically, this study tracks the use of COVID-19 neologisms from late December 2019 to the end of June 2020 based on the Baidu Index. The neologisms are classified into five categories: under-specified references, pre-official names, pejorative names, official names, and English abbreviations. Qualitative analyses based on these categories show the impact of language-internal factors (i.e., frequency) and changes in the social-psychological situation (i.e., policy and emotion) on lexical competition and evolution. Quantitative analyses show a strong correlation between neologisms and pandemic development that can be expressed by a binomial formulation. These results are summarized in a flowchart showing the different developmental stages of COVID-19, neologism uses, and the changes in collective behaviors, especially in terms of emotion. In sum, this innovative approach of leveraging internet usage data to study emergent events is shown to be effective when observational data is inadequate or inaccessible.

Keywords: neologisms; internet usage data; COVID-19 pandemic; frequency; social psychological factors

Short Biography
Ms. Siyu Lei is a dual PhD award student. She is currently a third-year PhD student at Xi’an Jiaotong University under the supervision of Prof. Ruiying Yang and a first-year PhD student at The Hong Kong Polytechnic University under the supervision of Prof. Chu-Ren Huang. Her research interests are corpus linguistics, English for academic purposes, second language acquisition, and genre analysis. To date, she has published one paper in the Journal of English for Academic Purposes.

Jing LI

Title: Social Media Keyphrase Prediction for COVID-19 Context Modelling.

Abstract
As social media continues its worldwide expansion, the way we communicate with each other has been profoundly revolutionized. The exposure to new information and the exchange of personal opinions have been mediated through online platforms. Especially in the crisis of COVID-19, social media has become the only channel for individuals to stay in touch with the outside world. Facing the explosive growth of online messages, which far outpaces human beings’ reading and understanding capacity, how shall we help individuals quickly access the key information they need? In this talk, I will present our recent work to automatically predict keyphrases concerning the salient contents on social media. Most existing keyphrase prediction methods, though they work well on formally written and well-edited texts, suffer from the data sparseness issue widely exhibited in short and informal social texts. This talk presents two possible ways to enrich contexts for short messages: one is to exploit explicit contexts formed with user responses; the other is to explore implicit contexts by discovering latent topic clusters underlying the corpus. Beyond the text signals, we also strive to understand cross-modality contexts from texts and images and their joint effects in indicating keyphrases. Finally, I will discuss the big picture of how social media keyphrase prediction methods can help us fight the COVID-19 crisis.

Short Biography
Dr. Jing Li has been an Assistant Professor in the Department of Computing, The Hong Kong Polytechnic University (PolyU) since 2019. She is a member of the Data Science and AI Lab (DaSAIL) of the Department of Computing and the COMP representative for the Doctor of Applied Language Sciences Programme (DALS). Before joining PolyU, she worked in the Natural Language Processing Center, Tencent AI Lab as a senior researcher from 2017 to 2019. Jing obtained her PhD degree from the Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong in 2017. Before that, she received her B.S. degree from the Department of Machine Intelligence, Peking University in 2013. Jing has broad research interests in natural language processing, computational social science, and machine learning. In particular, she works on novel algorithms for topic modeling, information extraction, and discourse analysis, and on their applications to social interaction understanding. Jing publishes regularly in top-tier NLP conferences and journals, such as ACL, EMNLP, NAACL, TACL, and CL. For academic services, she served as a publication co-chair for EMNLP 2020, a tutorial co-chair for ICONIP 2020, and a program committee member for many premier conferences (e.g., ACL, EMNLP, NAACL, AAAI, IJCAI), and she received a best reviewer award at EMNLP 2018.

Cindy SB NGAI

Title: Grappling with the COVID-19 health crisis: Analysis of communication strategies and their effects on public engagement on social media.

Abstract
Background: COVID-19 has posed an unprecedented challenge to governments worldwide. Effective government communication of COVID-19 information with the public is of crucial importance.

Objective: We investigated how the most-read state-owned newspaper in China, People’s Daily, utilized an online social networking site, Sina Weibo, to communicate about COVID-19 and whether this could engage the public. The objective of this study was to develop an integrated framework to examine the content, message style, and interactive features of COVID-19-related posts, and determine their effects on public engagement in the largest social media network in China.

Methods: Content analysis was employed to scrutinize 608 COVID-19 posts and coding was performed on three main dimensions: content, message style, and interactive features. The content dimension was coded into six sub-dimensions: (C1) Action, (C2) New evidence, (C3) Reassurance, (C4) Disease prevention, (C5) Healthcare services, and (C6) Uncertainty, while the style dimension was coded into the sub-dimensions of (S1) Narrative and (S2) Non-narrative. As for interactive features, they were coded into: (I1) Links to external sources, (I2) Use of hashtags, (I3) Use of questions to solicit feedback, and (I4) Use of multimedia. Public engagement was measured in the form of the number of shares, comments, and likes on the People’s Daily’s Sina Weibo account from 20 January 2020 to 11 March 2020 to reveal the association between different levels of public engagement and communication strategies. One-way ANOVA followed by post-hoc Tukey test, and negative binomial regression analysis were employed to generate the results.

Results: We found that although the content frames of (C1) Action, (C2) New evidence, and (C3) Reassurance delivered in a (S2) Non-narrative style were predominant in COVID-19 communication by the government, posts related to (C2) New evidence and a (S2) Non-narrative style were strong negative predictors of the number of shares. In terms of generating a high number of shares, it was found that (C4) Disease prevention posts delivered in a (S1) Narrative style were able to achieve this purpose. Additionally, an interaction effect was found between content and style. The use of a (S1) Narrative style in (C4) Disease prevention posts had a significant positive effect on generating comments and likes by the Chinese public while links to external sources fostered sharing.

Conclusions: These results have implications for governments, health organizations, medical professionals, the media, and researchers on their epidemic communication to engage the public. Selecting suitable communication strategies may foster active liking and sharing of posts on social media, which in turn, might raise the public’s awareness of COVID-19 and motivate them to take preventive measures. The sharing of COVID-19 posts is particularly important because this action can reach out to a large audience, potentially helping to contain the spread of the virus.

Short Biography
Cindy SB Ngai (PhD) is an Assistant Professor cum Programme Leader of the MA in Bilingual Corporate Communication at The Hong Kong Polytechnic University (PolyU). By adopting an interdisciplinary research approach, she integrates her knowledge of language, media and communication into the business, health and science disciplines. Her work has appeared in SCI and SSCI journals including Discourse and Communication, English for Specific Purposes, International Journal of Business Communication, Journal of Business and Technical Communication, Journal of Medical Internet Research, PLOS One, Public Relations Review and Studies in Higher Education.

Mingyu WAN

Title: Understanding and Combating ‘Infodemic’: A Corpus Linguistic Approach to Analyzing COVID19 Misinformation.

Abstract
COVID-19 misinformation, also known as ‘infodemic’, presents a serious risk to public health and public policies. The urgency of finding COVID-19 misinformation can be attested by scores of already published papers (e.g. Brennen et al. 2020, Pennycook et al. 2020) and constant discussion in the press and on social media. Misinformation refers to information that is fabricated, deceptive or distorted to various degrees, which can mislead people’s decision-making, harm public trust, and even lead to global tragedies (e.g. Grinberg et al. 2019, Su et al. 2020). To mitigate its risks to society, it is of vital importance for us to understand its key properties (linguistic generalization patterns in particular) before taking the right actions. There has not been any major corpus linguistic work on the analysis of COVID-19 misinformation. A few published papers in corpus linguistics, such as Wolfer et al. (2020), focus on applying corpus linguistics tools to analyzing COVID-19 texts without dealing with information quality issues. Past work on misinformation relied mostly on computational methods of misinformation detection without in-depth analysis of how misinformation is constructed (e.g. Guacho et al. 2018, Torabi & Taboada 2019). As such, these studies do not contribute to our understanding of language and cannot be viewed as corpus linguistic research. In addition, automatic textual classification studies by themselves do not help to pinpoint the fake part of the news or how such fake news misinforms. These studies contribute little to ameliorating the negative effect of misinformation. To urgently combat the negative impact of COVID-19, we propose a corpus-driven analysis of misinformation aiming to analyze the intrinsic (linguistic) properties of texts containing COVID-19 misinformation.
Preliminary analysis found that, in addition to the frequently mentioned keywords of “virus, coronavirus, China, Chinese, spread, death, kill”, misinformation tends to employ more negative emotion words (e.g. fear, worthless, deadliest), expressions of exclusion (e.g. cannot, without, except), references to vulnerable population groups (e.g. children, young people, older adults), and verbs of elimination (e.g. reduce, die, kill). In addition, false information is less formal, linguistically simpler, and less specific than factual information. Our provisional theory is that misinformation typically focuses on 1) people’s inherent fear for particular people and/or especially vulnerable groups; or 2) both fear and anger against specific groups of the society. These groups could be elite or could be socially marginalized, but it is crucial that they are easily separated from the identities of the speakers. Based on our hypothesis and our data, we will adopt the theory of Bronstein et al. (2019) to investigate the identification of constructions, logical incongruities, metaphorical expressions, etc., in order to examine the salient cognitive mechanisms of information generators.

Short Biography
Dr. Mingyu Wan currently works for Prof. Huang, Chu-Ren and Dr. Su, Qi under the Boya Joint Post-doctoral Project. Her recent research focuses on misinformation detection, metaphor detection, complexity analysis and Mandarin Alphabetical Words (code-mixing words) with linguistics-motivated, corpus-based and NLP-oriented methodologies.

Vincent Xian WANG

Title: Two Tales of One city: Unveiling the sentiments and conceptual metaphors for a pandemic in Macao.

Abstract
This study seeks to understand Macao residents’ lives during the COVID-19 pandemic. We gathered data from two main sources – (a) articles published in Macao Daily (MD) and (b) postings on Chuchu Channel (CCC) on YouTube. The results showed that Macao Daily systematically lays out the measures initiated by the local government to combat the disease and to ease the financial distress of both the residents and local businesses, while the netizens who posted on Chuchu Channel elaborated on topics about casinos, local economy and politics, making critical remarks on the Hong Kong and Macao governments. MD and CCC differed in their use of conceptual metaphors. MD predominantly employed WAR, TIDING and JOURNEY metaphors that entailed a collective and lasting battle against the pandemic, and compared the virus to a PERSON or a MONSTER. By contrast, CCC only favoured some subtypes of WAR metaphors – e.g., 撐 chēng ‘endure, hold on’ – and 執(笠) zhí (lì) ‘close down, shut down (business)’. MD tended to convey positive sentiments consistently, whereas the viewers of Chuchu Channel exhibited sharply divided views on political matters and on the shutdown of casinos, provoking bitter arguments between the posters. We interpret our findings in relation to the two-dimensional model proposed by Bentley, O’Brien and Brock (2014) for mapping collective behaviour. From the tales told by the two groups, we lend support to the central argument of Chater’s (2020) editorial that policy makers need to pursue multiple interpretations of the pandemic rather than a single one – interpretations that can be effectively captured by distinct conceptual metaphors – amid unprecedented uncertainty.

Short Biography
Vincent X. Wang, associate professor at the University of Macau and a NAATI-certified translator, received his PhD in Applied Linguistics from the University of Queensland (2006). He has published journal articles in Sage Open, Target, Journal of Language, Literature and Culture and TESOL-related periodicals, book chapters with Springer, Routledge and Brill, among others, and a monograph, Making Requests by Chinese EFL Learners (John Benjamins).

Xiaowen WANG

Title: From Contact Prevention to Social Distancing: The Co-evolution of Bilingual Neologisms and Public Health Campaigns in Two Cities in the Time of COVID-19.

Abstract
This paper investigates the evolution of social distancing expressions in Chinese and English in two geographically close yet culturally distinct metropolitan cities: Hong Kong and Guangzhou. Our study of bilingual public health campaign posters during the COVID-19 pandemic focuses on how the evolution of neologisms and linguistic strategies in public health campaigns adapts to different societal contexts. Baseline meanings of the re-purposed linguistic expressions were established based on the BNC corpus for English and the Chinese Gigaword Corpus for Chinese. To establish the links between linguistic expressions and public health events, we converted them to eventive structures using the Module-Attribute Representation of Verbs and added interpersonal meaning interpretations based on Systemic Functional Linguistics. The two cities are found to take divergent approaches. Guangzhou prefers “contact prevention” with behavior-inhibiting imperatives and high-value modality. By contrast, the original use of “contact prevention” in Hong Kong was gradually replaced by the neologism “social distancing” in English, triggering competing loan translations in Chinese. Predominantly behavior-encouraging expressions are used, with positive polarity and with modality and mood devices that vary to track the epidemic curve of COVID-19. We conclude that lexical evolution interacts with social realities. Different speech acts, prohibition in Guangzhou but advice and warning in Hong Kong, are constructed with careful bilingual reconfiguration of eventive information, mood, modality and polarity to tactfully cope with the social dynamics in the two cities.

Keywords: COVID-19; social distancing; event representation; health communication; bilingual communication

Short Biography
Xiaowen Wang, Annie, is an Associate Professor of Applied Linguistics and director of the Research Center for English Education and Linguistic Studies at the School of English Education, Guangdong University of Foreign Studies. She also serves as the associate editor for the Asian English for Specific Purposes Journal. Currently, she is pursuing her doctoral studies under the supervision of Professor Huang Chu-ren at The Hong Kong Polytechnic University. Her research interests cover English for Medical Purposes, corpus linguistics, computational linguistics, discourse analysis, lexical semantics, translation and English education. She has been the principal investigator of six provincial-level or university-level projects in China.

Enquiry

rp2u2@polyu.edu.hk