Bodhisattwa Majumder - Allen Institute for AI

Bodhisattwa Prasad Majumder
bodhisattwam[at]allenai.org
Office @ Northlake Commons
Allen Institute for AI

I am a Research Scientist at Allen Institute for AI. I lead research on automating data-driven discovery for Asta, Ai2's agentic ecosystem for scientific research.

I received my Ph.D. in Computer Science from UC San Diego, advised by Julian McAuley. I was fortunate to receive UCSD CSE Doctoral Award for Excellence in Research (2022), Adobe Research Fellowship (2022), Qualcomm Innovation Fellowship (2020), and led UCSD (Team Bernard) in Amazon Alexa Prize, 2019.

Google Scholar

Github | LinkedIn

Previously, I spent wonderful summers at Facebook AI Research, Microsoft Research, and Google AI Research. I worked as a technical editor for The Batch, a weekly newsletter by deeplearning.ai. I also co-authored a best-selling book on NLP with O'Reilly Media.

Research

My research goal is to develop communicative reasoners that can learn, adapt, and reason by interacting with the world and produce effective, explainable, and equitable outcomes. I apply my research to advance the frontier of science.

I led the development of Asta DataVoyager, a data-driven discovery tool to accelerate science with generative AI. We are the first to introduce the trustworthy evaluation harness for AI agents solving autonomous scientific discovery tasks.

Research highlights includes DataVoyager, Auto-discovery with Surprisal, DiscoveryBench, DiscoveryWorld, CodeScientist. I also design, train, and evaluate LLM agents at scale. See my full body of work in Publications.

Awards

[2025] Outstanding Position Paper in International Conference on Machine Learning (ICML), 2025
[2024] Receipient of UCSD CSE Doctoral Dissertation Award
[2022] Work recognized as Highlights of ACM RecSys' 22; invited for ACM Transactions on Recommendation Systems
[2022] Receipient of TrustNLP Travel Grant award for NAACL 2022
[2022] Receipient of UCSD CSE Doctoral Award for Excellence in Research, 2022
[2022] Receipient of Adobe Research Fellowship, 2022
[2021] Receipient of Friends of the International Center Fellowship, UC San Diego
[2020] Receipient of Qualcomm Innovation Fellowship, 2020 from North America
[2019] Intern Spotlight in Google-wide Engineering Newsletter for summer internship project with the Juicer Team
[2019] Awarded $250,000 for leading UC San Diego (Team Bernard) in the finals of Alexa Prize 2019
[2018] Department Fellowship, 1st-year of PhD, Dept. of CSE, UC San Diego
[2017] Gold medal and Endowment for the highest academic performance (Rank-1) in Masters, IIT Kharagpur
[2016] Finalist, Data Science Game '16, Paris; Represented India (1 out of 3 teams), International Rank 14
[2015] Scholarship for academic excellence (obtaining CGPA > 9.5), Indian Statistical Institute
[2011] 4-year scholarship for academic excellence, Ministry of Human Resource & Development, India

Book: Practical Natural Language Processing by O'Reilly

Practical Natural Language Processing
O'Reilly Media, 2020
Sowmya Vajjala, Bodhisattwa P. Majumder, Anuj Gupta, Harshit Surana
amazon | safari online | website

Practical Natural Language Processing distills our collective wisdom on building real world applications such as data collection, working with noisy data and signals, incremental development of solutions, and issues involved in deploying the solutions as a part of a larger application - bridging a gap between current textbooks and online offerings.

Highlights:

Endorsed by Zach Lipton, Sebastian Ruder, Marc Najork et al.
#1 Best seller in Amazon.com in Data Mining category
#1 New release in Amazon.com in Natural Language Processing category
Read and adapted by 20+ AI companies and 6 academic courses internationally

Talks

Continual Learning with Language Agents | slides

[2024] at Commonsense Reasoning in Natural Language Processing, University of British Columbia

User-centic Natural Language Processing | video

[2023] PhD Defense, CSE, UC San Diego

Effective, Explainable, and Equitable NLP with Knowledge and Interactions | slides

[2022] at Stanford University
[2022] at Allen Institute for AI
[2022] at University of Southern California/USC-ISI
[2022] at Harvard University/Harvard Business School
[2022] at University College London
[2022] at UC Irvine
[2022] at University of British Columbia
[2022] at UC San Diego

Producing Explanations with Commonsense and Interactions | slides

[2022] at AI Research Seminar, UC San Diego
[2021] at Allen Institute for AI

Explainable Language Generation with Commonsense | slides

[2021] at Facebook AI Research
[2021] at Machine Learning Group, Oxford University

Grounding Language Generation with World Knowledge | slides

[2021] at Microsoft Research, India
[2021] at IIT Kharapgur
[2020] at NC State, AI Club
[2020] at INFORMS 2020, Mining and Learning on Graphs session, Washington, DC

Clarification Question Generation using Global Knowledge | slides

[2021] at Microsoft Research, Redmond
[2021] at AI Research Seminar, UC San Diego

Personalization, NLP and others

[2020] at UC San Diego, CSE Research Open House, on Personalization in Natural Language Generation
[2018] at Indian Inst of Management Calcutta, Industry Conclave & Graduate Orientation, on NLP - a primer
[2017] at Walmart Labs, on Information Extraction from Images - Application in e-Commerce
[2017] at Indian Statistical Institute, on Deep Neural Network: in light of Optimization and Regularization

Thanks to Jon Barron for this nice template!
Gorgeous Geisel Library cover art from here.