
Bodhisattwa Prasad Majumder
bodhisattwam[at]allenai.org
Office @ Northlake Commons
Allen Institute for AI

I am a Research Scientist at the Allen Institute for AI. I lead research on automating data-driven discovery for Asta, Ai2's agentic ecosystem for scientific research.

I received my Ph.D. in Computer Science from UC San Diego, advised by Julian McAuley. I was fortunate to receive the UCSD CSE Doctoral Award for Excellence in Research (2022), an Adobe Research Fellowship (2022), and a Qualcomm Innovation Fellowship (2020), and I led UCSD's Team Bernard in the Amazon Alexa Prize (2019).


Google Scholar  |  GitHub  |  LinkedIn

Previously, I spent wonderful summers at Facebook AI Research, Microsoft Research, and Google AI Research. I worked as a technical editor for The Batch, a weekly newsletter by deeplearning.ai. I also co-authored a best-selling book on NLP with O'Reilly Media.


Research

My research goal is to develop communicative reasoners that can learn, adapt, and reason by interacting with the world and produce effective, explainable, and equitable outcomes. I apply my research to advance the frontier of science.

I led the development of Asta DataVoyager, a data-driven discovery tool that accelerates science with generative AI. We are the first to introduce a trustworthy evaluation harness for AI agents solving autonomous scientific discovery tasks.

Research highlights include DataVoyager, Auto-discovery with Surprisal, DiscoveryBench, DiscoveryWorld, and CodeScientist. I also design, train, and evaluate LLM agents at scale. See my full body of work in Publications.


Awards
Book: Practical Natural Language Processing by O'Reilly

Practical Natural Language Processing
O'Reilly Media, 2020
Sowmya Vajjala, Bodhisattwa P. Majumder, Anuj Gupta, Harshit Surana
amazon | safari online | website

Practical Natural Language Processing distills our collective wisdom on building real-world NLP applications. It covers data collection, working with noisy data and signals, incremental development of solutions, and the issues involved in deploying those solutions as part of a larger application, bridging a gap between current textbooks and online offerings.

Highlights:

  • Endorsed by Zach Lipton, Sebastian Ruder, Marc Najork et al.
  • #1 Best Seller on Amazon.com in the Data Mining category
  • #1 New Release on Amazon.com in the Natural Language Processing category
  • Read and adopted by 20+ AI companies and 6 academic courses internationally

Talks

Continual Learning with Language Agents | slides

  • [2024] at Commonsense Reasoning in Natural Language Processing, University of British Columbia

User-centric Natural Language Processing | video

  • [2023] PhD Defense, CSE, UC San Diego

Effective, Explainable, and Equitable NLP with Knowledge and Interactions | slides

  • [2022] at Stanford University
  • [2022] at Allen Institute for AI
  • [2022] at University of Southern California/USC-ISI
  • [2022] at Harvard University/Harvard Business School
  • [2022] at University College London
  • [2022] at UC Irvine
  • [2022] at University of British Columbia
  • [2022] at UC San Diego

Producing Explanations with Commonsense and Interactions | slides

  • [2022] at AI Research Seminar, UC San Diego
  • [2021] at Allen Institute for AI

Explainable Language Generation with Commonsense | slides

  • [2021] at Facebook AI Research
  • [2021] at Machine Learning Group, Oxford University

Grounding Language Generation with World Knowledge | slides

  • [2021] at Microsoft Research, India
  • [2021] at IIT Kharagpur
  • [2020] at NC State, AI Club
  • [2020] at INFORMS 2020, Mining and Learning on Graphs session, Washington, DC

Clarification Question Generation using Global Knowledge | slides

  • [2021] at Microsoft Research, Redmond
  • [2021] at AI Research Seminar, UC San Diego

Personalization, NLP and others

  • [2020] at UC San Diego, CSE Research Open House, on Personalization in Natural Language Generation
  • [2018] at Indian Institute of Management Calcutta, Industry Conclave & Graduate Orientation, on NLP - a primer
  • [2017] at Walmart Labs, on Information Extraction from Images - Application in e-Commerce
  • [2017] at Indian Statistical Institute, on Deep Neural Network: in light of Optimization and Regularization

