Yejin Choi
    Assistant Professor
    Stony Brook University (SUNY Stony Brook)
    1422 Computer Science
    Stony Brook, NY 11794-4400
    (phone) 631-632-8457
    (fax) 631-632-8334
    email:

News:

  • Area chair for EMNLP 2014
  • 1 paper at ACL
  • Our EMNLP paper on predicting successful novels is featured in numerous media outlets: IEEE Spectrum Podcast; Toronto Star; NPR; CBS Radio Canada; Phys.org
  • 3 papers (2 long + 1 short) at EMNLP 2013
  • 1 paper at ICCV 2013 -- Best Paper Award
  • New media coverage: our collaboration with Mike Luca at Harvard Business School is featured in The Atlantic
  • 2 papers at ACL 2013: one on a connotation lexicon, another on a new image-text parallel corpus
  • New media coverage: our work on the connotation lexicon is featured by FastCompany
  • 1 journal paper to appear in TPAMI 2013
  • Invited speaker at Vision+NLP Workshop at NAACL 2013
  • Panel speaker at Student Research Workshop at NAACL 2013
  • Interview with News for New York @ WNBC on deception cues in product reviews
  • Area chair for EMNLP 2012
  • Area chair for NAACL 2012

Office Hours: MON 2:30pm-4:00pm

Teaching:

    CSE 628 (grad) Introduction to Natural Language Processing [Spring 2014, Fall 2012, Fall 2011, Fall 2010]
    CSE 507 (grad) Computational Linguistics [Spring 2013, Spring 2012, Spring 2011]
    CSE 392 (ugrad) Introduction to Natural Language Processing [Fall 2013]
    CSE 300 (ugrad) Technical Communications [Fall 2013]


Recent Research Projects:

  • Language and Vision; Language Grounding

    Web data is increasingly multi-modal, which creates both an opportunity and a need for integrative models that bridge Natural Language Processing and Computer Vision. Our recent explorations include:
    - Generating natural language descriptions of images by guiding object detection with a language prior [CVPR-11], by predicting likely action verbs from language-driven world knowledge [CoNLL-11], and by composing phrases retrieved by partial image matching [ACL-12].
    - Understanding characteristics of visual descriptions [NAACL-12].
    - Constructing a new image-text parallel corpus by reducing information misalignment between images and text [ACL-13].
  • Writing Styles, Deception Detection, Personal Analytics, Forensic Language Technologies

    Language is a window into people's minds. We explore data-driven approaches to statistical stylometry (i.e., the study of linguistic styles) and forensic language technologies (e.g., authorship verification, obfuscation, deception detection). This research is naturally interdisciplinary, with broad connections to Psychology, Social Science, Cognitive Science, Psycholinguistics, and Literature.

    Our recent work includes:
    - Predicting the success of novels [EMNLP-13a], and creative lexical compositions [EMNLP-13b].
    - Uncovering the (hidden) intent of authors, such as deception [ACL-11, ACL-12, ICWSM-12] and textual vandalism [ACL-11].
    - Detecting socio-cognitive identities, such as authorship [EMNLP-12], gender [CoNLL-11], and nationality.

Publications:


Students:

Short Bio:

Personal: