CANINE — Transformers without Tokenization 🔤 | How to train Transformers directly on Characters | Mar 18
Publication Data Reveals Shifting Reader Preferences in 2025 | Which publications are winning in 2025 and why | Mar 11
Pics of Europe trip — Germany | Real pics from my 3 months: Koblenz, Cologne, Rothenburg ob der Tauber, Munich, Füssen, Berlin. | Mar 11
A late review of OpenAI’s “Training Verifiers to Solve Math Word Problems” | Solving math problems with GPT. Verifiers, GSM-8K dataset, mathematical reasoning | Mar 14, 2024
Sentencepiece: A simple and language independent subword tokenizer and detokenizer for neural text… | The tokenization method for Alpaca, LLaMA, T5, XLNet | Published in CodeX | May 19, 2023
An Introduction to Multiprocessing Using Python | Is multithreading or multiprocessing faster than the other? | Published in CodeX | Apr 16, 2023
Subword Regularization | Prerequisite for understanding the commonly used `SentencePiece` tokenizer. Improve your tokenization using subword regularization! | Mar 5, 2023
Details of Faster, Mask, Cascade R-CNN 🔥 | WHY should you care? Faster R-CNN is an actively cited benchmark for comparing object detection performance of modern network… | Jun 7, 2022