Yejin Choi

Assistant Professor

Department of Computer Science 
Stony Brook University
Stony Brook, NY 11794-2424

ychoi [at]


Natural Language Processing, Computational Linguistics.


Yejin Choi is affiliated with the Department of Computer Science at the University of Washington. She received her Ph.D. in Computer Science from Cornell University and her BS in Computer Science and Engineering from Seoul National University. She spent the summer of 2009 as a research intern at Yahoo! Research and joined the faculty of the Computer Science Department at Stony Brook University in September 2010.


  • Integrative Models for Natural Language and Images, Language Grounding

    Web data today is increasingly multi-modal, creating both opportunities and the need for integrative models that bridge Natural Language Processing and Computer Vision. Recent explorations in this project include
    - Generating natural language descriptions of images by guiding object detection with a language prior [CVPR-11], by predicting likely action verbs from language-driven world knowledge [CoNLL-11], and by composing phrases retrieved by partial image matching [ACL-12].
    - Understanding characteristics of visual descriptions [NAACL-12].
    - Constructing a new image-text parallel corpus by reducing information misalignment between images and text [ACL-13].

  • Writing Styles, Deception Detection, Personal Analytics, Forensic Language Technologies

    Language is a window into people's minds. This project explores data-driven approaches to statistical stylometry (i.e., the study of linguistic styles) and forensic language technologies (e.g., authorship verification, obfuscation, deception detection). This research is naturally interdisciplinary, with broad connections to Psychology, Social Science, Cognitive Science, Psycholinguistics, and Literature.
    Recent developments include
    - Detecting socio-cognitive identities, such as authorship [EMNLP-12], gender [CoNLL-11], and nationality.
    - Uncovering the (hidden) intent of authors, such as deception [ACL-11, ACL-12, ICWSM-12] and textual vandalism [ACL-11].

  • Learning Connotation from a Network of Words

    This project presents algorithms that learn the subtle, nuanced connotations of words using large-scale constraint optimization over a network of words [ACL-13, EMNLP-11].
    Details are at connotation lexicon.
    Past explorations on opinion analysis include
    - Lexicon induction and adaptation for sentiment analysis [EMNLP-09, EMNLP-11].
    - Analysis in light of compositional semantics [EMNLP-08, WWW-10].
    - Fine-grained opinion analysis [ACL-10, IJCAI-07, EMNLP-06, EMNLP-05].
    Some of the work has been adopted by


Yejin Choi is a recipient of the Google Research Grant.

Teaching Summary

CSE 300, CSE 392, CSE 507, CSE 628