- Gradient Checkpointing, a hack for training deep NNs
- Case Study: Transformer-Based architecture development
- Transformer Key-Value caching for fast inference
- I challenged myself to visualize attention (nothing special)
- Transformers
- Parameter updates
- Understanding loss.backward()
- Batch Normalization
- Dead Neurons
- Gated Recurrent Unit (GRU)
- Long Short-Term Memory (LSTM)
- Stateful Recurrent Neural Networks
- Tokenize & Numericalize
- Self-Supervised & Transfer Learning in Language Models
- Embeddings in sequential Neural Networks
- Embeddings in Recommendation Systems
- Cross Entropy in Classification
- Logarithms in Deep Learning
- Algorithm behind the universal function approximator
- Traffic on blocked ports
- I relearned entropy