177 Huntington Ave
Boston, MA 02115
ATTN: Ha Trinh, 910 - 177
360 Huntington Avenue
Boston, MA 02115
- PhD in Computing, University of Dundee
- BSc (Hons) in Applied Computing, University of Dundee
Dr. Ha Trinh is a Senior Research Scientist in the College of Computer and Information Science. She received a PhD in Computing in 2013 and a BSc (Hons) in Applied Computing in 2009, both from the University of Dundee (UK).
Dr. Trinh’s primary research interests are in the design and evaluation of technologies to support communication and healthcare. Her PhD work explored the potential of predictive communication aids to support people with severe speech and physical impairments. Her recent research focuses on the use of intelligent virtual agents in a number of domains, from public speaking to health education and health behavior change interventions. At the heart of her research is the application of multimodal interaction paradigms, natural language processing techniques, and user-centered design methodologies to improve the usability and accessibility of interactive systems.
Dr. Trinh’s research has received awards at premier HCI conferences, including an honorable mention award at CHI ’14 and the 2012 SIGACCESS Best Student Paper award.
Field of research/teaching
Human-Computer Interaction, Assistive Technologies, Personal Health Informatics, Intelligent Virtual Agents, Natural Language Processing
What is your educational background?
My background is in computer science. I received a BSc (Hons) in Applied Computing and a PhD in Computing from the University of Dundee in 2009 and 2013, respectively.
What is your research focus?
The focus of my research is on technologies to support communication and healthcare for individuals with diverse language, speech, physical and cognitive abilities. In my current projects, I investigate the use of intelligent virtual agents to improve oral presentations, as well as to support health education and health behavior change interventions.
Asadi, R., Fell, H. J., Bickmore, T., & Trinh, H. (2016). Real-Time Presentation Tracking Using Semantic Keyword Spotting. Interspeech 2016, 3081-3085.
Given presentation slides with detailed written speaking notes, automatic tracking of oral presentations can help speakers ensure they cover their planned content, and can reduce their anxiety during the speech. Tracking is a more complex problem than speech-to-text alignment, since presenters rarely follow their exact presentation notes, and it must be performed in real time. In this paper, we propose a novel system that can track the current degree of coverage of each slide’s contents. To do this, the presentation notes for each slide are segmented into sentences, and the words are filtered into keyword candidates. These candidates are then scored based on word specificity and semantic similarity measures to find the most useful keywords for the tracking task. Real-time automatic speech recognition results are matched against the keywords and their synonyms. Sentences are scored based on detected keywords, and the ones with scores higher than a threshold are tagged as covered. We manually and automatically annotated 150 slide presentation recordings to evaluate the system. A simple tracking method, matching speech recognition results against the notes, was used as the baseline. The results show that our approach led to higher accuracy measures compared to the baseline method.
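The coverage-tracking idea above can be sketched in a few lines. This is our own minimal illustration, not the paper's implementation: the keyword lists, the simple overlap score, and the threshold value are all assumptions standing in for the specificity/semantic-similarity scoring the paper describes.

```python
def sentence_coverage(sentences_keywords, recognized_words, threshold=0.5):
    """Mark each slide-note sentence as covered once the fraction of its
    keywords heard in the recognized speech reaches the threshold.

    sentences_keywords: list of keyword lists, one per sentence.
    recognized_words: words produced so far by the speech recognizer.
    """
    heard = {w.lower() for w in recognized_words}
    covered = []
    for keywords in sentences_keywords:
        kws = {k.lower() for k in keywords}
        score = len(kws & heard) / len(kws) if kws else 0.0
        covered.append(score >= threshold)
    return covered

# Example: the first sentence's keywords are mostly heard, the second's are not.
slides = [["semantic", "keyword", "spotting"], ["baseline", "alignment"]]
asr = ["we", "use", "semantic", "keyword", "matching"]
print(sentence_coverage(slides, asr))  # [True, False]
```

Because the score only ever grows as more speech is recognized, the function can be re-run on each new recognizer result to update coverage in real time.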
Bickmore, T., Trinh, H., Hoppmann, M., & Asadi, R. (2016, September). Virtual Agents in the Classroom: Experience Fielding a Co-presenter Agent in University Courses. In International Conference on Intelligent Virtual Agents (pp. 154-163). Springer International Publishing.
The design of a conversational virtual agent that assists professors and students in giving in-class oral presentations is described, along with preliminary evaluation results. The life-sized agent is integrated with PowerPoint presentation software and can deliver presentations in conjunction with a human presenter using appropriate verbal and nonverbal behavior. Results from evaluation studies in two courses—business and professional speaking, and computer science research methods—indicate that the agent is widely accepted in the classroom by students, and can serve to increase engagement in presentations given both by professors and students.
Trinh, H., Asadi, R., Edge, D., and Bickmore, T. 2017. RoboCOP: a robotic coach for oral presentations. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 1(2), Article 27. ACM.
Rehearsing in front of a live audience is invaluable when preparing for important presentations. However, not all presenters take the opportunity to engage in such rehearsal, due to time constraints, availability of listeners who can provide constructive feedback, or public speaking anxiety. We present RoboCOP, an automated anthropomorphic robot head that acts as a coach to provide spoken feedback during presentation rehearsals at both the individual slide and overall presentation level. The robot offers conversational coaching on three key aspects of presentations: speech quality, content coverage, and audience orientation. The design of the feedback strategies was informed by findings from an exploratory study with academic professionals who were experienced in mentoring students on their presentations. In a within-subjects study comparing RoboCOP to visual feedback and spoken feedback without a robot, the robotic coach was shown to lead to significant improvement in the overall experience of presenters. Results of a second within-subjects evaluation study comparing RoboCOP with existing rehearsal practices show that our system creates a natural, interactive, and motivating rehearsal environment that leads to improved presentation quality.
Trinh, H., Asadi, R., and Bickmore, T. 2017. Designing health conversations with relational agents. ACM CHI 2017 Workshop on Conversational UX Design.
Automated dialogue systems represent a promising approach for health care promotion, thanks to their ability to emulate the experience of face-to-face interactions between health providers and patients. In this position paper we describe our framework for designing health dialogue systems using embodied conversational agents, and discuss our strategies for building health counseling dialogues and maintaining user engagement in longitudinal health interventions.
Shamekhi, A., Trinh, H., Bickmore, T., Ellis, T., Houlihan, B., Latham, N. 2017. A friendly face in the storm: self-care support system requirements for individuals with spinal cord injury. ACM CHI 2017 Workshop on Interactive Systems in Healthcare (WISH).
Individuals with Spinal Cord Injury (SCI) have to deal with myriad lifestyle changes including significant reductions in physical mobility, physical rehabilitation, and a variety of self-monitoring procedures. They are also at risk of depression and social isolation. We explored the design requirements for an interactive system to best address the needs of people newly diagnosed with SCI through one-on-one interviews and a technology probe. We gathered qualitative information from ten participants with SCI, and identified key themes regarding the challenges the target population faces in their transition to a new lifestyle, and illustrate how technology can address these. We present rationale and design requirements for a conversational agent-based system as an ideal medium to provide health education, self-care management coaching and emotional and social support for persons with SCI.
Shamekhi, A., Trinh, H., Bickmore, T., DeAngelis, T. R., Ellis, T., Houlihan, B. V., and Latham, N. K. 2016. A virtual self-care coach for individuals with spinal cord injury. 18th ACM SIGACCESS Conference on Computers and Accessibility, 327-328. ACM.
Most persons with spinal cord injury (SCI) require training and support for self-care management to help prevent the development of serious secondary conditions after hospital discharge. We have designed a virtual coach system, in which an animated character engages users in simulated face-to-face conversation to provide health education and motivate healthy behavior. We conducted an exploratory study with nine participants who have SCI to examine the acceptance and attitudes towards our system. Results of the study show that participants are highly receptive of the virtual coach technology and recognize it as an effective medium to promote self-care.
Trinh, H., Edge, D., Ring, L., and Bickmore, T. 2016. Thinking outside the box: co-planning scientific presentations with virtual agents. 16th International Conference on Intelligent Virtual Agents, 306-316. Springer.
Oral presentations are central to scientific communication, yet the quality of many scientific presentations is poor. To improve presentation quality, scientists need to invest greater effort in the creative design of presentation content. We present AceTalk, a presentation planning system supported by a virtual assistant. This assistant motivates and collaborates with users in a structured brainstorming process to explore engaging presentation structures and content types. Our study of AceTalk demonstrates the potential of human-agent collaboration to facilitate the design of audience-centered presentations, while highlighting the need for rich modelling of audiences, presenters and talk contexts.
Kimani, E., Bickmore, T., Trinh, H., Ring, L., Paasche-Orlow, M. K., and Magnani, J. W. 2016. A smartphone-based virtual agent for atrial fibrillation education and counseling. 16th International Conference on Intelligent Virtual Agents, 120-127. Springer.
When deployed on smartphones, virtual agents have the potential to deliver life-saving advice regarding emergency medical conditions, as well as provide a convenient channel for health education to help improve the safety and efficacy of pharmacotherapy. This paper describes the use of a smartphone-based virtual agent that provides counseling to patients with atrial fibrillation, along with the results from a pilot acceptance study among patients with the condition. Atrial fibrillation is a highly prevalent heart rhythm disorder and is known to significantly increase the risk of stroke, heart failure and death. In this study, a virtual agent is deployed in conjunction with a smartphone-based heart rhythm monitor that lets patients obtain real-time diagnostic information on the status of their atrial fibrillation and determine whether immediate action may be needed. The results of the study indicate that participants are satisfied with receiving information about atrial fibrillation via the virtual agent.
Trinh, H., Ring, L. and Bickmore, T., 2015. DynamicDuo: Co-presenting with Virtual Agents. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI '15), (pp. 1739-1748). ACM.
The quality of most professional oral presentations is often poor, owing to a number of factors, including public speaking anxiety. We present DynamicDuo, a system that uses an automated, life-sized, animated agent to help inexperienced speakers deliver their presentations in front of an audience. The design of the system was informed by an analysis of TED talks given by two human presenters to identify the most common dual-presentation formats and transition behaviors used. In a within-subjects study (N=12) comparing co-presenting with DynamicDuo against solo-presenting with conventional presentation software, we demonstrated that our system led to significant improvements in public speaking anxiety and speaking confidence for non-native English speakers. Judges who viewed videotapes of these presentations rated those with DynamicDuo significantly higher on speech quality and overall presentation quality for all presenters.
Bickmore, T., Asadi, R., Ehyaei, A., Fell, H., Henault, L., Intille, S., Quintiliani, L., Shamekhi, A., Trinh, H., Waite, K. and Shanahan, C., 2015. Context-Awareness in a Persistent Hospital Companion Agent. In Intelligent Virtual Agents (IVA '15) (pp. 332-342). Springer International Publishing.
We describe the design and preliminary evaluation of a virtual agent that provides continual bedside companionship and a range of health, information, and entertainment functions to hospital patients during their stay. The agent system uses sensors to enable it to be aware of events in the hospital room and the status of the patient, in order to provide context-sensitive health counseling. Patients in the pilot study responded well to having the agent in their rooms for 1–3 days and engaged in 9.4 conversations per day with the agent on average, using all available functions.
Trinh, H., Yatani, K. and Edge, D., 2014. PitchPerfect: integrated rehearsal environment for structured presentation preparation. In Proceedings of the 32nd Annual ACM Conference on Human Factors in Computing Systems (CHI '14) (pp. 1571-1580). ACM.
Rehearsal is a critical component of preparing to give an oral presentation, yet it is frequently abbreviated, performed in ways that are inefficient or ineffective, or simply omitted. We conducted an exploratory study to understand the relationship between the theory and practice of presentation rehearsal, classifying our qualitative results into five themes to motivate more structured rehearsal support deeply integrated in slide presentation software. In a within-subject study (N=12) comparing against participants’ existing rehearsal practices, we found that our resulting PitchPerfect system significantly improved overall presentation quality and content coverage as well as provided greater support for content mastery, time management, and confidence building.
Trinh, H., Waller, A., Vertanen, K., Kristensson, P.O. and Hanson, V.L., 2014. Phoneme-based predictive text entry interface. In Proceedings of the 16th international ACM SIGACCESS Conference on Computers & Accessibility (ASSETS '14) (pp. 351-352). ACM.
Phoneme-based text entry provides an alternative typing method for nonspeaking individuals who often experience difficulties in orthographic spelling. In this paper, we investigate the application of rate enhancement strategies to improve the user performance of phoneme-based text entry systems. We have developed a phoneme-based predictive typing system, which employs statistical language modeling techniques to dynamically reduce the phoneme search space and offer accurate word predictions. Results of a case study with a nonspeaking participant demonstrated that our rate enhancement strategies led to improved text entry speed and error rates.
Trinh, H., Waller, A., Vertanen, K., Kristensson, P.O. and Hanson, V.L., 2012. iSCAN: a phoneme-based predictive communication aid for nonspeaking individuals. In Proceedings of the 14th international ACM SIGACCESS conference on Computers and Accessibility (ASSETS '12) (pp. 57-64). ACM.
The high incidence of literacy deficits among people with severe speech impairments (SSI) has been well documented. Without literacy skills, people with SSI are unable to effectively use orthographic-based communication systems to generate novel linguistic items in spontaneous conversation. To address this problem, phoneme-based communication systems have been proposed which enable users to create spoken output from phoneme sequences. In this paper, we investigate whether prediction techniques can be employed to improve the usability of such systems. We have developed iSCAN, a phoneme-based predictive communication system, which offers phoneme prediction and phoneme-based word prediction. A pilot study with 16 able-bodied participants showed that our predictive methods led to a 108.4% increase in phoneme entry speed and a 79.0% reduction in phoneme error rate. The benefits of the predictive methods were also demonstrated in a case study with a participant with cerebral palsy. Moreover, results of a comparative evaluation conducted with the same participant after 16 sessions using iSCAN indicated that our system outperformed an orthographic-based predictive communication device that the participant had used for over 4 years.
Trinh, H., Waller, A., Vertanen, K., Kristensson, P.O. and Hanson, V.L., 2012. Applying prediction techniques to phoneme-based AAC systems. In Proceedings of the SIGSLPAT Third Workshop on Speech and Language Processing for Assistive Technologies (pp. 19-27). ACL.
It is well documented that people with severe speech and physical impairments (SSPI) often experience literacy difficulties, which hinder them from effectively using orthographic-based AAC systems for communication. To address this problem, phoneme-based AAC systems have been proposed, which enable users to access a set of spoken phonemes and combine phonemes into speech output. In this paper we investigate how prediction techniques can be applied to improve user performance of such systems. We have developed a phoneme-based prediction system, which supports single phoneme prediction and phoneme-based word prediction using statistical language models generated using a crowdsourced AAC-like corpus. We incorporated our prediction system into a hypothetical 12-key reduced phoneme keyboard. A computational experiment showed that our prediction system led to 56.3% average keystroke savings.
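The keystroke-savings metric reported above can be made concrete with a toy simulation. This is our own illustration, not the paper's system: the two-word phoneme lexicon, its frequency counts, and the one-keystroke selection cost are all hypothetical, standing in for the statistical language models the paper describes.

```python
# Hypothetical phoneme lexicon: phoneme sequence -> frequency count.
LEXICON = {
    ("HH", "AH", "L", "OW"): 10,  # "hello"
    ("HH", "AE", "T"): 5,         # "hat"
}

def predict(prefix):
    """Return the most frequent lexicon word starting with the typed phonemes."""
    matches = [w for w in LEXICON if w[:len(prefix)] == prefix]
    return max(matches, key=LEXICON.get) if matches else None

def simulate_entry(word):
    """Type phonemes one at a time, accepting the top prediction (at a cost
    of one extra keystroke) as soon as it matches the target word."""
    typed = ()
    for p in word:
        typed += (p,)
        if predict(typed) == word:
            return len(typed) + 1  # +1 keystroke to select the prediction
    return len(typed)

def keystroke_savings(word, keystrokes):
    """KS = (nominal keystrokes - actual keystrokes) / nominal * 100."""
    return (len(word) - keystrokes) / len(word) * 100.0

# Typing "hello" (4 phonemes) needs only 2 keystrokes: "HH" plus one selection.
used = simulate_entry(("HH", "AH", "L", "OW"))
print(used, keystroke_savings(("HH", "AH", "L", "OW"), used))  # 2 50.0
```

Averaging this saving over a test corpus yields a figure like the 56.3% reported above; real predictors would rank candidates with an n-gram language model rather than raw counts.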
Trinh, H., 2011. Using a computer intervention to support phonological awareness development of nonspeaking adults. In Proceedings of the 13th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '11) (pp. 329-330). ACM.
The present study investigates the effectiveness of a computer-based intervention to support adults with severe speech and physical impairments (SSPI) in developing their phonological awareness, an essential contributory factor to literacy acquisition. Three participants with SSPI undertook seven intervention sessions during which they were asked to play a training game on an iPad. The game was designed to enable learners to practice their phonological awareness skills independently with minimal instruction from human instructors. Preliminary results of post-intervention assessments demonstrate general positive effects of the intervention upon the phonological awareness and literacy skills of the participants. These results support the use of mainstream technologies to aid learning for individuals with disabilities.
Trinh, H., 2011. Developing a phoneme-based talking joystick for nonspeaking individuals. ACM SIGACCESS Accessibility and Computing, (99), pp.50-54. ACM.
This research investigates the potential of developing a novel phoneme-based assistive communication system for pre-literate individuals with severe speech and physical impairments. Using a force-feedback joystick-like game controller as the access tool, the system enables users to select forty-two phonemes (i.e., English sounds) used in literacy teaching and combine them together to generate spoken messages. What distinguishes this phoneme-based device from other communication systems currently available on the market is that it allows users who have not mastered literacy skills to create novel words and sentences without the need for a visual interface. Natural Language Processing (NLP) technologies, including phoneme-to-speech synthesis, phoneme-based disambiguation and prediction, and haptic force feedback technology are being incorporated into the device to improve its accessibility and usability.