Bodhisattwa P. Majumder
Office @ 4146, CSE (EBU3B)
UC San Diego

I am a second-year Ph.D. student in the Artificial Intelligence Group, Computer Science Department, UC San Diego, advised by Prof. Julian McAuley. I work on Natural Language Generation and Conversational AI, with a focus on personalization and commonsense reasoning; my broader research interests lie at the intersection of Natural Language Processing and Machine Learning.

Currently I lead Team Bernard from UC San Diego in the Amazon Alexa Prize. I also spent a wonderful summer in 2019 at Google AI Research with Sandeep Tata and Navneet Potti.

Previously, I graduated (2017) summa cum laude from IIT Kharagpur with a Master's degree in Machine Learning, advised by Prof. Animesh Mukherjee and Prof. Pawan Goyal. Before joining UC San Diego, I was a Research Engineer at Walmart Labs, building large-scale NLP and Machine Learning applications for eCommerce.

CV  |  Google Scholar  |  Github  |  LinkedIn  |  Twitter

By Note-to-Self
Museum of Photographic Arts (MOPA)
A fusion of piano notes and digital imagery

Research  |  Experiences  |  Book & Talks

Here in xkcd.


I explore dialog systems, question answering, assistive generation, and natural language generation tasks broadly. I am interested in developing generative models for personalization, commonsense reasoning, and subjective question answering, all connecting to conversational modeling. My previous research in NLP includes sequence labeling, sequence generation, and natural language parsers. I have also worked on statistical modeling, game theory, and machine learning applications.

Selected research projects are listed here. The complete list of my publications is available on my Google Scholar page.

(* denotes equal contribution)

ReZero is All You Need: Fast Convergence at Large Depth
Thomas Bachlechner*, Bodhisattwa P. Majumder*, Henry Mao*, Gary Cottrell, Julian McAuley
Preprint. Work In Progress. arXiv, 2020
pdf | code

To facilitate deep signal propagation, we propose ReZero, a simple change to the architecture that initializes an arbitrary layer as the identity map, using a single additional learned parameter per layer. When applied to 12-layer Transformers, ReZero converges 56% faster on enwik8. ReZero applies beyond Transformers to other residual networks, enabling 1,500% faster convergence for deep fully connected networks and 32% faster convergence for a ResNet-56 trained on CIFAR-10.
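The core idea can be sketched in a few lines. This is a minimal NumPy illustration, not the paper's implementation: the sub-layer F here is an arbitrary toy transformation, and the class and parameter names are mine.

```python
import numpy as np

class ReZeroBlock:
    """Residual block computing x + alpha * F(x).

    alpha is the single additional learned parameter per layer; it is
    initialized to 0, so the block starts as the exact identity map.
    """

    def __init__(self, dim, rng):
        # Toy sub-layer F: a random linear map followed by tanh.
        self.W = rng.standard_normal((dim, dim)) / np.sqrt(dim)
        self.alpha = 0.0  # the learned residual weight, initialized to zero

    def forward(self, x):
        fx = np.tanh(x @ self.W)      # arbitrary sub-layer transformation F(x)
        return x + self.alpha * fx    # identity map while alpha == 0

rng = np.random.default_rng(0)
block = ReZeroBlock(4, rng)
x = rng.standard_normal((2, 4))
out = block.forward(x)  # equals x at initialization, since alpha == 0
```

Because every block starts as the identity, signals propagate through arbitrarily deep stacks unchanged at initialization; training then learns how much of each sub-layer's output to mix in via alpha.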


Generating Personalized Recipes from Historical User Preferences
Bodhisattwa P. Majumder*, Shuyang Li*, Jianmo Ni, Julian McAuley
2019 Conference on Empirical Methods in Natural Language Processing (EMNLP)
pdf | code | data

Press: UCSD CSE News, UCSD JSOE News

We propose a new task of personalized recipe generation: expanding a recipe name and incomplete ingredient details into complete natural-text instructions aligned with the user's historical preferences.


Improving Neural Story Generation by Targeted Common Sense Grounding
Henry Mao, Bodhisattwa P. Majumder, Julian McAuley, Gary Cottrell
2019 Conference on Empirical Methods in Natural Language Processing (EMNLP)
pdf | code

We propose a simple multi-task learning scheme to achieve quantitatively better common sense reasoning in language models by leveraging auxiliary training signals from datasets designed to provide common sense grounding.


Upcycle Your OCR: Reusing OCRs for Post-OCR Text Correction in Romanised Sanskrit
Amrith Krishna, Bodhisattwa P. Majumder, Rajesh S. Bhat, Pawan Goyal
2018 Conference on Computational Natural Language Learning (CoNLL), co-located with EMNLP
pdf | code+data | supplementary

We propose a post-OCR text correction approach for digitising texts in Romanised Sanskrit. We find that the use of a copying mechanism (Gu et al., 2016) yields a 7.69 percentage-point increase in Character Recognition Rate (CRR) over the current SOTA model for monotone sequence-to-sequence tasks (Schnober et al., 2016). This work was done in collaboration with CNeRG.


An 'Eklavya' approach to learning Context Free Grammar rules for Sanskrit using Adaptor Grammar
Amrith Krishna, Bodhisattwa P. Majumder, Anil K. Boga, Pawan Goyal
17th World Sanskrit Conference, 2018

This work presents the use of Adaptor Grammars, a non-parametric Bayesian approach, for learning (Probabilistic) Context Free Grammar productions from data. We discuss the effect of using Adaptor Grammars for the Sanskrit language on word-level supervised tasks, such as compound type identification and identification of source and derived words for derivational nouns, as well as on sentence-level structured prediction. This work was done in collaboration with CNeRG.


Deep Recurrent Neural Networks for Product Attribute Extraction in eCommerce
Bodhisattwa P. Majumder*, Aditya Subramanian*, Abhinandan Krishnan, Shreyansh Gandhi, Ajinkya More
arXiv, 2017
pdf | system description | video

We demonstrate the potential of recurrent neural structures in product attribute extraction, improving overall F1 scores over the previous benchmarks (More et al., 2016) by at least 0.0391. This enabled Walmart eCommerce to achieve significant coverage of important facets or attributes of products. This work was done at Walmart Labs and was followed by a US patent from Walmart.


Distributed Semantic Representations of Retail Products based on Large-scale Transaction Logs
Bodhisattwa P. Majumder*, Sumanth S Prabhu*, Julian McAuley

We processed 18 million transactions consisting of 325,548 unique products from 1,551 categories to obtain vector representations that preserve product analogies. These representations were effective in identifying substitutes and complements. This work was done at Walmart Labs.


When lolcats meet philosoraptors! - What's in a 'meme'?
Bodhisattwa P. Majumder, Amrith Krishna, Unni Krishnan, Anil K. Boga, Animesh Mukherjee
arXiv, 2018
pdf | presentation

How similar are the dynamics of meme-based communities to those of text-based communities? We explain the community dynamics by categorising each day based on temporal variations in user engagement. This work was done in collaboration with CNeRG.

  • [US patent] REDCLAN - RElative Density based CLustering and Anomaly Detection, Wal-mart, 2018
  • [US patent] Automated Extraction of Product Attributes from Images, Wal-mart, 2018
  • [US patent] System and Method for Product Attribute Extraction Using a Deep Recurrent System, Wal-mart, 2017
  • [US patent] Analytical Determination of Competitive Interrelationship between Item Pairs, Wal-mart, 2017

Amazon Alexa Prize
Team Leader of Bernard, UC San Diego.

Press: cnet

Building a free-form social conversational agent as a finalist in the Amazon Alexa Prize Challenge 2019-2020, alongside 9 other finalist universities. We have been awarded $250,000 to support our research on dialog systems.


Google AI, Mountain View
Summer, 2019
Team Juicer with Sandeep Tata and Navneet Potti.

Developed an information extraction framework for form-like documents using representation learning. The work was published as an intern spotlight article in the Google-wide newsletter and is being integrated into Google Cloud's Document AI.


Walmart Labs
Research Engineer

Developed a neural multimodal attribute-tagging framework to improve faceted product search using both product descriptions and product images. The work produced two US patents and a technical report published on arXiv. Other work on user modeling and product embeddings has also been patented in the US.

Book: Practical NLP with O'Reilly

Practical Natural Language Processing
O'Reilly Media, 2020
Sowmya Vajjala, Bodhisattwa P. Majumder, Anuj Gupta, Harshit Surana
pre-order | early release (requires login) | website

Practical Natural Language Processing is a guide to building, iterating on, and scaling NLP systems in a business setting, and to tailoring them for various industry verticals. The book distills our collective wisdom on building real-world applications: data collection, working with noisy data and signals, incremental development of solutions, and the issues involved in deploying solutions as part of a larger application, bridging the gap between current textbooks and online offerings.

Invited Talks
  • [2020] at UC San Diego, CSE Research Open House, on Personalization in Natural Language Generation
  • [2018] at Indian Institute of Management Calcutta, Industry Conclave & Graduate Orientation, on NLP - a primer
  • [2017] at Walmart Labs, on Information Extraction from Images - Application in e-Commerce
  • [2017] at Indian Statistical Institute, on Deep Neural Network: in light of Optimization and Regularization

Thanks to Jon Barron for this nice template! Art by Bekin M ~