
Arman Cohan, PhD

Faculty Member

Center for Neurocomputation and Machine Intelligence


Natural language processing and deep learning

Arman Cohan's research broadly encompasses natural language processing (NLP) and artificial intelligence (AI). Language is a crucial aspect of human cognition: it enables individuals to express thoughts, ideas, and emotions, communicate with others, shape perceptions of the world, and influence thinking. Within NLP, his focus is on large language models and generation. The development and study of large language models have direct implications for understanding human cognition, specifically language processing. These models are a powerful tool for conducting experiments and simulations at scale, facilitating the testing of theories about language processing in the brain. Conversely, insights from neuroscience, particularly concerning language acquisition and processing, contribute to the development of better models.

Cohan is Assistant Professor of Computer Science at Yale. His work covers a range of issues at the nexus of NLP and Machine Learning, including Language Modeling, Generation, Representation Learning, and Applications in Specialized Domains. Cohan earned his PhD from Georgetown University in 2018. Before his appointment at Yale in 2023, he was a Research Scientist at the Allen Institute for AI (AI2).

Research Contributions

ABNIRML: Analyzing the Behavior of Neural IR Models

Transactions of the Association for Computational Linguistics (2022)

PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

ACL: Proceedings of the Association for Computational Linguistics (2021)

FLEX: Unifying Evaluation for Few-Shot NLP

NeurIPS: Advances in Neural Information Processing Systems (2021)

SPECTER: Document-level Representation Learning using Citation-informed Transformers

ACL: Proceedings of the Association for Computational Linguistics (2020)

TLDR: Extreme Summarization of Scientific Documents

Findings of the Association for Computational Linguistics: EMNLP 2020 (2020)