Brainstorm | Precog Research Group

Brainstorm is the group’s reading session, focused on building well-rounded knowledge of the topics under active research at Precog. It provides an opportunity for focused discussion and for exploring the current state of the art. We believe these sessions help develop a scientific approach to problem articulation, sound research practices, and fact-based learning through accepted publications and case studies. The current reading group comprises vivacious graduate and undergraduate students at IIIT-H with a common interest in improving and contributing to the above-mentioned areas of work. Please join the mailing list to get meeting details and reminders.

When: 2030-2130 hrs IST, Tuesdays

07 January 2025

Toy Models of Superposition
by Nelson Elhage, Tristan Hume, Catherine Olsson, Nicholas Schiefer, Tom Henighan, Shauna Kravec, Zac Hatfield-Dodds, Robert Lasenby, Dawn Drain, Carol Chen, Roger Grosse, Sam McCandlish, Jared Kaplan, Dario Amodei, Martin Wattenberg, Christopher Olah

Presenter: Vaishnavi Shivkumar

Paper

15 January 2025

Large Concept Models: Language Modeling in a Sentence Representation Space
by LCM team, Loïc Barrault, Paul-Ambroise Duquenne, Maha Elbayad, Artyom Kozhevnikov, Belen Alastruey, Pierre Andrews, Mariano Coria, Guillaume Couairon, Marta R. Costa-jussà, David Dale, Hady Elsahar, Kevin Heffernan, João Maria Janeiro, Tuan Tran, Christophe Ropers, Eduardo Sánchez, Robin San Roman, Alexandre Mourachko, Safiyyah Saleem, Holger Schwenk

Presenter: Varshita Kolipaka

Paper

21 January 2025

GraphAny: A Foundation Model for Node Classification on Any Graph
by Jianan Zhao, Hesham Mostafa, Mikhail Galkin, Michael Bronstein, Zhaocheng Zhu, Jian Tang

Presenter: Akshit Sinha

Paper

28 January 2025

Towards Foundation Models for Knowledge Graph Reasoning
by Mikhail Galkin, Xinyu Yuan, Hesham Mostafa, Jian Tang, Zhaocheng Zhu

Presenter: Sumit Kumar

Paper

11 February 2025

OLMo: Accelerating the Science of Language Models
by Dirk Groeneveld, Iz Beltagy, Pete Walsh and others

Presenter: Sreeram Vennam

Paper

11 February 2025

Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
by Luca Soldaini, Rodney Kinney, Akshita Bhagia and others

Presenter: Sreeram Vennam

Paper

11 March 2025

DeepSeek V3 Technical Report
by DeepSeek AI, Aixin Liu, Bei Feng, Bing Xue and others

Presenter: Ishwar B and Hemang Jain

Paper

25 March 2025

Large Language Diffusion Models
by Shen Nie, Fengqi Zhu and others

Presenter: Elamparithy M

Paper

When: 2045-2145 hrs IST, Mondays

12 October 2024

Linformer: Self-Attention with Linear Complexity
by Wang, Sinong, et al.

Presenter: Debangan Mishra

Paper

09 September 2024

Graph neural networks are dynamic programmers
by Andrew J. Dudzik, Petar Veličković

Presenter: Monish Singhal

Paper Slides

31 August 2024

Understanding convolutions on graphs
by Ameya Daigavane, Balaraman Ravindran, Gaurav Aggarwal

Presenter: Akshit Sinha

Paper Slides Contact

24 August 2024

Auto-regressive next-token predictors are universal learners
by Malach, Eran

Presenter: Sreeram Vennam

Paper Slides Contact

17 August 2024

Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks
by Wu, Zhaofeng, Linlu Qiu, Alexis Ross, Ekin Akyürek, Boyuan Chen, Bailin Wang, Najoung Kim, Jacob Andreas, Yoon Kim

Presenter: Vamshi Krishna

Paper

When: 1530-1630 hrs IST, Saturdays

08 June 2024

Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
by Collin Burns, Pavel Izmailov, Jan Hendrik Kirchner, Bowen Baker, Leo Gao, Leopold Aschenbrenner, Yining Chen, Adrien Ecoffet, Manas Joglekar, Jan Leike, Ilya Sutskever, Jeff Wu

Presenter: Swarang Joshi, Prishanshul Govil

Paper

01 June 2024

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
by Shuming Ma, Hongyu Wang, Lingxiao Ma, Lei Wang, Wenhui Wang, Shaohan Huang, Li Dong, Ruiping Wang, Jilong Xue, Furu Wei

Presenter: Hari, Vamsi Krishna

Paper

25 May 2024

Stealing Part of a Production Language Model
by Nicholas Carlini, Daniel Paleka, Krishnamurthy Dj Dvijotham, Thomas Steinke, Jonathan Hayase, A. Feder Cooper, Katherine Lee, Matthew Jagielski, Milad Nasr, Arthur Conmy, Eric Wallace, David Rolnick, Florian Tramèr

Presenter: Sumit Kumar, Shiven Sinha

Paper

18 May 2024

Evolutionary Optimization of Model Merging Recipes
by Takuya Akiba, Makoto Shing, Yujin Tang, Qi Sun, David Ha

Presenter: Anish R Joishy, Varshita Kolipaka

Paper

11 May 2024

Mamba: Linear-Time Sequence Modeling with Selective State Spaces
by Albert Gu, Tri Dao

Presenter: Ameya Rathod, Ishan Kavathekar

Paper
