Teaching AI to Read and Group Like I Bookmark the Web: A Journey into Dynamic Topic Modeling

Quick Links: Dataset on HuggingFace
The Topic Modeling Challenge
You know that feeling when you have 50 browser tabs open, and you’re desperately trying to organize them into bookmark folders? “ML Papers To Read,” “Funny Cat Videos,” “Recipes I’ll Never Make”… We all have our system. And apparently, it’s such a universal problem that every tech company is launching its own solution: Arc Browser with its “Spaces,” Chrome with its tab groups, and about 500 extensions promising to color-code your digital hoarding habits into submission....

November 11, 2024 | 4 min | Aman Priyanshu

Contra-Topic-bottleneck-t5: Efficient Topic Extraction Without the Computational Overhead

Quick Links: Model on HuggingFace | Interactive Demo
When it comes to topic extraction, the AI world seems fixated on massive models and expensive compute. But what if there was a simpler way? 🤔
The Genesis: Simplicity Through Linear Transformation
Picture this: There I was, looking for an open-source solution to extract topics from text at scale. The available options were either massive language models or complex fine-tuning pipelines. That’s when it hit me – what if we could leverage the semantic structure of existing embeddings with just a linear transformation?...

November 6, 2024 | 3 min | Aman Priyanshu

LinearCosine: When AI Researchers Decided Multiplication was Too Mainstream

Hey there, optimization seekers and efficiency enthusiasts! 📊🧮 Today, we’re diving into a world where even basic arithmetic operations are up for debate. Buckle up as we explore LinearCosine, an experiment that asks: “Do we really need multiplication for AI?”
Quick Links to skip the talk: Project Website - Linear Cosine | GitHub Repo | Original Paper
The Paper That Started It All
During my fall break, while I was supposed to be relaxing, my roommate Yash Maurya forwarded me a fascinating paper by Hongyin Luo and Wei Sun titled “Addition is All You Need for Energy-efficient Language Models”....

October 21, 2024 | 6 min | Aman Priyanshu

AdaptKeyBERT: Stumbling Through Two Years of Keyword Extraction

Quick links (in case you want to skip my ramblings): PyPI Package | GitHub Repository
Alright, gather ’round, word enthusiasts and syntax sorcerers! 🧙‍♂️📚 Remember that time you tried to explain machine learning to your grandma and ended up comparing neural networks to her knitting patterns? Well, buckle up, because we’re about to dive into a similar realm of “What was I thinking?” – the saga of AdaptKeyBERT. It’s been two trips around the sun since I cobbled together this quirky little keyword extractor and sent it off into the wild world of NLP....

September 22, 2024 | 3 min | Aman Priyanshu

YC-Dendrolinguistics: Planting Linguistic Trees in the Startup Forest

Hey there, fellow AI adventurers and startup enthusiasts! 🌳🚀 Today, I’m excited to give you a peek into my latest passion project: YC-Dendrolinguistics. Buckle up as we embark on a journey through the linguistic forests of Y Combinator pitches!
The Seed of an Idea
Picture this: It’s 2 AM, I’m knee-deep in YC application videos, and suddenly it hits me – what if startup pitches are like trees? 🤔 Each word a branch, each phrase a limb, growing into this complex organism we call a pitch....

September 12, 2024 | 4 min | Aman Priyanshu

Synaptic Sparks: Why I'm Wiring My Thoughts into a Neural Blogosphere

Hey there, fellow AI enthusiasts and curious minds! 🧠🤖 Today, I just want to document what’s leading to this new adventure in regular blogging.
The Knowledge Synapse
Picture me back in 2019, a wide-eyed novice bouncing around the vast landscape of machine learning. I was devouring every GitHub gist, Medium post, and arXiv paper I could find, growing and learning at a dizzying pace. Fast forward to today, and it feels like I’ve stepped into an alternate universe....

September 9, 2024 | 4 min | Aman Priyanshu

FRACTURED-SORRY-Bench: Unraveling AI Safety through Decomposing Malicious Intents

Hello, fellow AI enthusiasts! 🤖 Today, I wanted to dive into the FRACTURED-SORRY-Bench framework and dataset we just released. Check out the dataset, the website, and the GitHub repo for the dataset!
The FRACTURED-SORRY Saga: A Tale of Adaptation and Decomposition
Picture this: you’re wandering through the lush collection of prompt-injection and LLM red-teaming papers, marveling at some of the weirder and crazier attack mechanisms that have been released recently, when suddenly you realize that there aren’t many Proof-of-Concept resources for multi-shot red-teaming....

August 28, 2024 | 3 min | Aman Priyanshu