Giles' Blog

Giles' Blog is the personal technical blog for Giles Thomas, a software engineer and entrepreneur. Current information about Giles Thomas can be found on the about page.

This page lists the 20 most recent posts, followed by all categorised posts grouped by category (many posts belong to more than one category).

Posts in category AI

Evolution in action posted on 2008-10-03T17:52:56+00:00
Building an AI chatbot for beginners: part 0 posted on 2023-03-19T20:45:00+00:00
Building an AI chatbot for beginners: part 1 posted on 2023-03-19T21:45:00+00:00
Building an AI chatbot for beginners: part 2 posted on 2023-04-04T19:45:00+00:00
Giving up on the AI chatbot tutorial (for now) posted on 2024-02-27T20:45:00+00:00
LLM Quantisation Weirdness posted on 2024-02-27T22:45:00+00:00
Messing around with fine-tuning LLMs posted on 2024-04-27T22:45:00+00:00
Messing around with fine-tuning LLMs, part 2 -- to the cloud! posted on 2024-04-28T22:45:00+00:00
Messing around with fine-tuning LLMs, part 3 -- moar GPUs posted on 2024-05-15T23:45:00+00:00
Messing around with fine-tuning LLMs, part 4 -- training cross-GPU. posted on 2024-05-21T21:45:00+00:00
Messing around with fine-tuning LLMs, part 5 -- exploring memory usage posted on 2024-07-05T17:45:00+00:00
Messing around with fine-tuning LLMs, part 6 -- measuring memory usage more systematically posted on 2024-07-10T23:45:00+00:00
Messing around with fine-tuning LLMs, part 7 -- detailed memory usage across sequence lengths for an 8B model posted on 2024-08-16T23:45:00+00:00
Messing around with fine-tuning LLMs, part 8 -- detailed memory usage across batch sizes posted on 2024-08-25T23:00:00+00:00
Messing around with fine-tuning LLMs, part 9 -- gradient checkpointing posted on 2024-09-03T23:00:00+00:00
Messing around with fine-tuning LLMs, part 10 -- finally training the model! posted on 2024-12-22T19:00:00+00:00
Writing an LLM from scratch, part 1 posted on 2024-12-22T21:00:00+00:00
Writing an LLM from scratch, part 2 posted on 2024-12-23T21:00:00+00:00
Writing an LLM from scratch, part 3 posted on 2024-12-26T22:30:00+00:00
Writing an LLM from scratch, part 4 posted on 2024-12-28T22:30:00+00:00
An AI chatroom (beginnings) posted on 2024-12-29T23:15:00+00:00
An AI chatroom (a few steps further) posted on 2024-12-30T23:15:00+00:00
Writing an LLM from scratch, part 5 -- more on self-attention posted on 2025-01-11T23:30:00+00:00
Do reasoning LLMs need their own Philosophical Language? posted on 2025-01-16T23:30:00+00:00
Writing an LLM from scratch, part 6 -- starting to code self-attention posted on 2025-01-21T22:30:00+00:00
Writing an LLM from scratch, part 6b -- a correction posted on 2025-01-28T22:30:00+00:00
Writing an LLM from scratch, part 7 -- wrapping up non-trainable self-attention posted on 2025-02-07T21:30:00+00:00
On the perils of AI-first debugging -- or, why Stack Overflow still matters in 2025 posted on 2025-02-19T02:30:00+00:00
Basic matrix maths for neural networks: the theory posted on 2025-02-20T22:45:00+00:00
Basic matrix maths for neural networks: in practice posted on 2025-02-22T23:45:00+00:00
Writing an LLM from scratch, part 8 -- trainable self-attention posted on 2025-03-04T21:30:00+00:00
Writing an LLM from scratch, part 9 -- causal attention posted on 2025-03-09T23:30:00+00:00
Adding /llms.txt posted on 2025-03-18T22:30:00+00:00
Writing an LLM from scratch, part 10 -- dropout posted on 2025-03-19T23:30:00+00:00
Dropout and mandatory vacation posted on 2025-03-24T23:45:00+00:00
Writing an LLM from scratch, part 11 -- batches posted on 2025-04-19T23:00:00+00:00
Writing an LLM from scratch, part 12 -- multi-head attention posted on 2025-04-21T23:00:00+00:00
Writing an LLM from scratch, part 13 -- the 'why' of attention, or: attention heads are dumb posted on 2025-05-08T22:00:00+00:00
Writing an LLM from scratch, part 14 -- the complexity of self-attention at scale posted on 2025-05-14T21:00:00+00:00
Writing an LLM from scratch, part 15 -- from context vectors to logits; or, can it really be that simple?! posted on 2025-05-31T23:55:00+00:00
Writing an LLM from scratch, part 16 -- layer normalisation posted on 2025-07-08T18:50:00+00:00
Writing an LLM from scratch, part 17 -- the feed-forward network posted on 2025-08-12T23:00:00+00:00
The fixed length bottleneck and the feed forward network posted on 2025-08-14T23:00:00+00:00
Writing an LLM from scratch, part 18 -- residuals, shortcut connections, and the Talmud posted on 2025-08-18T20:20:00+00:00
Writing an LLM from scratch, part 19 -- wrapping up Chapter 4 posted on 2025-08-29T17:00:00+00:00
What AI chatbots are actually doing under the hood posted on 2025-08-29T20:00:00+00:00
The maths you need to start understanding LLMs posted on 2025-09-02T23:30:00+00:00
An addendum to 'the maths you need to start understanding LLMs' posted on 2025-09-08T18:15:00+00:00
How do LLMs work? posted on 2025-09-15T23:20:00+00:00
Writing an LLM from scratch, part 20 -- starting training, and cross entropy loss posted on 2025-10-02T22:10:00+00:00
Writing an LLM from scratch, part 21 -- perplexed by perplexity posted on 2025-10-07T20:00:00+00:00
Revisiting Karpathy’s 'The Unreasonable Effectiveness of Recurrent Neural Networks' posted on 2025-10-11T01:00:00+00:00
Writing an LLM from scratch, part 22 -- finally training our LLM! posted on 2025-10-15T23:40:00+00:00
Writing an LLM from scratch, part 23 -- fine-tuning for classification posted on 2025-10-22T23:40:00+00:00
A classifier using Qwen3 posted on 2025-10-24T23:30:00+00:00
Retro Language Models: Rebuilding Karpathy’s RNN in PyTorch posted on 2025-10-24T19:00:00+00:00
Writing an LLM from scratch, part 24 -- the transcript hack posted on 2025-10-28T20:15:00+00:00
Writing an LLM from scratch, part 25 -- instruction fine-tuning posted on 2025-10-29T23:40:00+00:00
Writing an LLM from scratch, part 26 -- evaluating the fine-tuned model posted on 2025-11-03T19:40:00+00:00
Writing an LLM from scratch, part 27 -- what's left, and what's next? posted on 2025-11-04T00:40:00+00:00
Why smart instruction-following makes prompt injection easier posted on 2025-11-12T19:00:00+00:00
Writing an LLM from scratch, part 28 -- training a base model from scratch on an RTX 3090 posted on 2025-12-02T18:15:00+00:00
Writing an LLM from scratch, part 29 -- using DistributedDataParallel to train a base model from scratch in the cloud posted on 2026-01-07T20:40:00+00:00
Writing an LLM from scratch, part 30 -- digging into the LLM-as-a-judge results posted on 2026-01-09T01:15:00+00:00
Writing an LLM from scratch, part 31 -- the models are now on Hugging Face posted on 2026-01-17T19:45:00+00:00
Getting a custom PyTorch LLM onto the Hugging Face Hub (Transformers: AutoModel, pipeline, and Trainer) posted on 2026-01-28T23:00:00+00:00
Writing an LLM from scratch, part 32a -- Interventions: training a baseline model posted on 2026-02-04T01:45:00+00:00
Writing an LLM from scratch, part 32b -- Interventions: gradient clipping posted on 2026-02-05T01:20:00+00:00
Writing an LLM from scratch, part 32c -- Interventions: removing dropout posted on 2026-02-05T23:35:00+00:00
Writing an LLM from scratch, part 32d -- Interventions: adding attention bias posted on 2026-02-06T23:55:00+00:00
Writing an LLM from scratch, part 32e -- Interventions: the learning rate posted on 2026-03-10T23:55:00+00:00
Writing an LLM from scratch, part 32f -- Interventions: weight decay posted on 2026-03-23T23:55:00+00:00
Writing an LLM from scratch, part 32g -- Interventions: weight tying posted on 2026-03-24T19:50:00+00:00
Automating starting Lambda Labs instances posted on 2026-04-02T23:30:00+00:00
Writing an LLM from scratch, part 32h -- Interventions: full fat float32 posted on 2026-04-03T23:50:00+00:00
Writing an LLM from scratch, part 32i -- Interventions: what is in the noise? posted on 2026-04-07T21:00:00+00:00
Writing an LLM from scratch, part 32j -- Interventions: trying to train a better model in the cloud posted on 2026-04-09T20:00:00+00:00
Writing an LLM from scratch, part 32k -- Interventions: training a better model locally with gradient accumulation posted on 2026-04-15T20:00:00+00:00
How an LLM becomes more coherent as we train it posted on 2026-04-17T23:30:00+00:00
Writing an LLM from scratch, part 32l -- Interventions: updated instruction fine-tuning results posted on 2026-04-20T20:00:00+00:00
Writing an LLM from scratch, part 32m -- Interventions: conclusion posted on 2026-04-21T18:15:00+00:00
Writing an LLM from scratch, part 33 -- what I learned from finally getting round to the appendices posted on 2026-04-22T17:30:00+00:00
On first looking into JAX posted on 2026-05-30T18:45:00+00:00

Posts in category TIL deep dives

Getting phpBB to accept Django sessions posted on 2008-12-10T16:44:46+00:00
OpenCL: first investigations with an NVIDIA card posted on 2010-02-24T17:54:27+00:00
OpenCL: .NET, C# and Resolver One integration -- the very beginnings posted on 2010-03-18T20:16:47+00:00
SNI-based reverse proxying with Go(lang) posted on 2013-07-18T20:10:02+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, part 0: introduction posted on 2013-08-08T14:18:07+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, part 1: a trivial single-threaded proxy posted on 2013-08-12T19:02:48+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, part 2: handling multiple connections with epoll posted on 2013-09-07T16:21:36+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, part 3: Lua-based configuration posted on 2013-09-11T19:39:45+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, pause to regroup: non-blocking output posted on 2013-09-28T22:08:46+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, pause to regroup: fixed it! posted on 2013-09-29T23:09:39+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, part 4: Dealing with slow writes to the network posted on 2013-10-10T21:09:34+00:00
SHA-1 sunset in Chromium, and libnss3 posted on 2015-08-06T12:18:50+00:00
pam-unshare: a PAM module that switches into a PID namespace posted on 2016-04-15T19:03:27+00:00
Python code to generate Let's Encrypt certificates posted on 2018-11-16T19:55:00+00:00
Fun with network namespaces posted on 2021-03-13T19:30:00+00:00
Messing around with fine-tuning LLMs posted on 2024-04-27T22:45:00+00:00
Messing around with fine-tuning LLMs, part 2 -- to the cloud! posted on 2024-04-28T22:45:00+00:00
Messing around with fine-tuning LLMs, part 3 -- moar GPUs posted on 2024-05-15T23:45:00+00:00
Messing around with fine-tuning LLMs, part 4 -- training cross-GPU. posted on 2024-05-21T21:45:00+00:00
Messing around with fine-tuning LLMs, part 5 -- exploring memory usage posted on 2024-07-05T17:45:00+00:00
Messing around with fine-tuning LLMs, part 6 -- measuring memory usage more systematically posted on 2024-07-10T23:45:00+00:00
Messing around with fine-tuning LLMs, part 7 -- detailed memory usage across sequence lengths for an 8B model posted on 2024-08-16T23:45:00+00:00
Messing around with fine-tuning LLMs, part 8 -- detailed memory usage across batch sizes posted on 2024-08-25T23:00:00+00:00
Messing around with fine-tuning LLMs, part 9 -- gradient checkpointing posted on 2024-09-03T23:00:00+00:00
Messing around with fine-tuning LLMs, part 10 -- finally training the model! posted on 2024-12-22T19:00:00+00:00
Writing an LLM from scratch, part 1 posted on 2024-12-22T21:00:00+00:00
Writing an LLM from scratch, part 2 posted on 2024-12-23T21:00:00+00:00
Writing an LLM from scratch, part 3 posted on 2024-12-26T22:30:00+00:00
Writing an LLM from scratch, part 4 posted on 2024-12-28T22:30:00+00:00
Writing an LLM from scratch, part 5 -- more on self-attention posted on 2025-01-11T23:30:00+00:00
Writing an LLM from scratch, part 6 -- starting to code self-attention posted on 2025-01-21T22:30:00+00:00
Writing an LLM from scratch, part 7 -- wrapping up non-trainable self-attention posted on 2025-02-07T21:30:00+00:00
Basic matrix maths for neural networks: the theory posted on 2025-02-20T22:45:00+00:00
Basic matrix maths for neural networks: in practice posted on 2025-02-22T23:45:00+00:00
Writing an LLM from scratch, part 8 -- trainable self-attention posted on 2025-03-04T21:30:00+00:00
Writing an LLM from scratch, part 9 -- causal attention posted on 2025-03-09T23:30:00+00:00
Writing an LLM from scratch, part 10 -- dropout posted on 2025-03-19T23:30:00+00:00
Writing an LLM from scratch, part 11 -- batches posted on 2025-04-19T23:00:00+00:00
Writing an LLM from scratch, part 12 -- multi-head attention posted on 2025-04-21T23:00:00+00:00
Writing an LLM from scratch, part 13 -- the 'why' of attention, or: attention heads are dumb posted on 2025-05-08T22:00:00+00:00
Writing an LLM from scratch, part 14 -- the complexity of self-attention at scale posted on 2025-05-14T21:00:00+00:00
Writing an LLM from scratch, part 15 -- from context vectors to logits; or, can it really be that simple?! posted on 2025-05-31T23:55:00+00:00
Writing an LLM from scratch, part 16 -- layer normalisation posted on 2025-07-08T18:50:00+00:00
Writing an LLM from scratch, part 17 -- the feed-forward network posted on 2025-08-12T23:00:00+00:00
Writing an LLM from scratch, part 18 -- residuals, shortcut connections, and the Talmud posted on 2025-08-18T20:20:00+00:00
Writing an LLM from scratch, part 19 -- wrapping up Chapter 4 posted on 2025-08-29T17:00:00+00:00
Writing an LLM from scratch, part 20 -- starting training, and cross entropy loss posted on 2025-10-02T22:10:00+00:00
Writing an LLM from scratch, part 21 -- perplexed by perplexity posted on 2025-10-07T20:00:00+00:00
Revisiting Karpathy’s 'The Unreasonable Effectiveness of Recurrent Neural Networks' posted on 2025-10-11T01:00:00+00:00
Writing an LLM from scratch, part 22 -- finally training our LLM! posted on 2025-10-15T23:40:00+00:00
Writing an LLM from scratch, part 23 -- fine-tuning for classification posted on 2025-10-22T23:40:00+00:00
Retro Language Models: Rebuilding Karpathy’s RNN in PyTorch posted on 2025-10-24T19:00:00+00:00
Writing an LLM from scratch, part 24 -- the transcript hack posted on 2025-10-28T20:15:00+00:00
Writing an LLM from scratch, part 25 -- instruction fine-tuning posted on 2025-10-29T23:40:00+00:00
Writing an LLM from scratch, part 26 -- evaluating the fine-tuned model posted on 2025-11-03T19:40:00+00:00
Writing an LLM from scratch, part 27 -- what's left, and what's next? posted on 2025-11-04T00:40:00+00:00
Writing an LLM from scratch, part 28 -- training a base model from scratch on an RTX 3090 posted on 2025-12-02T18:15:00+00:00
Writing an LLM from scratch, part 29 -- using DistributedDataParallel to train a base model from scratch in the cloud posted on 2026-01-07T20:40:00+00:00
Writing an LLM from scratch, part 30 -- digging into the LLM-as-a-judge results posted on 2026-01-09T01:15:00+00:00
Writing an LLM from scratch, part 31 -- the models are now on Hugging Face posted on 2026-01-17T19:45:00+00:00
Getting a custom PyTorch LLM onto the Hugging Face Hub (Transformers: AutoModel, pipeline, and Trainer) posted on 2026-01-28T23:00:00+00:00
Writing an LLM from scratch, part 32a -- Interventions: training a baseline model posted on 2026-02-04T01:45:00+00:00
Writing an LLM from scratch, part 32b -- Interventions: gradient clipping posted on 2026-02-05T01:20:00+00:00
Writing an LLM from scratch, part 32c -- Interventions: removing dropout posted on 2026-02-05T23:35:00+00:00
Writing an LLM from scratch, part 32d -- Interventions: adding attention bias posted on 2026-02-06T23:55:00+00:00
Writing an LLM from scratch, part 32e -- Interventions: the learning rate posted on 2026-03-10T23:55:00+00:00
Writing an LLM from scratch, part 32f -- Interventions: weight decay posted on 2026-03-23T23:55:00+00:00
Writing an LLM from scratch, part 32g -- Interventions: weight tying posted on 2026-03-24T19:50:00+00:00
Writing an LLM from scratch, part 32h -- Interventions: full fat float32 posted on 2026-04-03T23:50:00+00:00
Writing an LLM from scratch, part 32i -- Interventions: what is in the noise? posted on 2026-04-07T21:00:00+00:00
Writing an LLM from scratch, part 32j -- Interventions: trying to train a better model in the cloud posted on 2026-04-09T20:00:00+00:00
Writing an LLM from scratch, part 32k -- Interventions: training a better model locally with gradient accumulation posted on 2026-04-15T20:00:00+00:00
Writing an LLM from scratch, part 32l -- Interventions: updated instruction fine-tuning results posted on 2026-04-20T20:00:00+00:00
Writing an LLM from scratch, part 32m -- Interventions: conclusion posted on 2026-04-21T18:15:00+00:00
Writing an LLM from scratch, part 33 -- what I learned from finally getting round to the appendices posted on 2026-04-22T17:30:00+00:00

Posts in category Python

Resolver One as a Python Success Story posted on 2008-08-01T10:41:47+00:00
A bit of fun posted on 2008-09-29T22:52:13+00:00
Why use IronPython? posted on 2008-10-07T11:00:35+00:00
Ironclad 0.7 released posted on 2008-11-27T22:16:36+00:00
Getting phpBB to accept Django sessions posted on 2008-12-10T16:44:46+00:00
xmlrpc posted on 2009-02-13T14:07:16+00:00
R in Resolver One (and perhaps IronPython generally) posted on 2009-03-02T19:19:04+00:00
Fix for pygame/PyOpenGL/NeHe tutorial windows not disappearing when run from IDLE posted on 2009-08-30T20:44:04+00:00
3D graphics in Resolver One using OpenGL and Tao, part II: an orrery posted on 2009-09-17T15:06:10+00:00
London Financial Python Users Group posted on 2009-11-11T18:19:30+00:00
New York Financial Users Group posted on 2009-11-13T14:50:28+00:00
A website for LFPUG posted on 2009-12-07T18:39:50+00:00
Next London Financial Python Users Group meeting posted on 2010-01-28T15:57:15+00:00
London Financial Python Users' Group posted on 2010-02-16T12:01:54+00:00
Playing with NLTK posted on 2010-02-18T18:21:19+00:00
OpenCL: .NET, C# and Resolver One integration -- the very beginnings posted on 2010-03-18T20:16:47+00:00
Regular expressions and Resolver One column-level formulae posted on 2010-04-26T17:26:29+00:00
Generating political news using NLTK posted on 2010-05-04T17:16:25+00:00
London Financial User Group Meeting: September 15 posted on 2010-08-24T15:06:05+00:00
A big announcement from Resolver Systems posted on 2010-10-01T18:39:24+00:00
London Financial User Group Meeting: 17 January posted on 2011-01-10T19:52:19+00:00
Busy, busy, busy posted on 2011-04-27T14:37:28+00:00
Teaching programming posted on 2011-10-14T14:16:41+00:00
PythonAnywhereAnywhere posted on 2012-02-27T15:31:40+00:00
Running Django unit tests on PythonAnywhere posted on 2012-05-21T19:35:19+00:00
How many Python programmers are there in the world? posted on 2013-06-24T18:13:05+00:00
An HTTP request's journey through a platform-as-a-service posted on 2014-08-20T12:32:33+00:00
Parsing website SSL certificates in Python posted on 2016-12-09T17:31:52+00:00
Creating a time series from existing data in pandas posted on 2017-05-09T12:31:40+00:00
Python code to generate Let's Encrypt certificates posted on 2018-11-16T19:55:00+00:00
Building an AI chatbot for beginners: part 0 posted on 2023-03-19T20:45:00+00:00
Building an AI chatbot for beginners: part 1 posted on 2023-03-19T21:45:00+00:00
Building an AI chatbot for beginners: part 2 posted on 2023-04-04T19:45:00+00:00
Messing around with fine-tuning LLMs posted on 2024-04-27T22:45:00+00:00
Messing around with fine-tuning LLMs, part 2 -- to the cloud! posted on 2024-04-28T22:45:00+00:00
Messing around with fine-tuning LLMs, part 3 -- moar GPUs posted on 2024-05-15T23:45:00+00:00
Messing around with fine-tuning LLMs, part 4 -- training cross-GPU. posted on 2024-05-21T21:45:00+00:00
Messing around with fine-tuning LLMs, part 5 -- exploring memory usage posted on 2024-07-05T17:45:00+00:00
Messing around with fine-tuning LLMs, part 6 -- measuring memory usage more systematically posted on 2024-07-10T23:45:00+00:00
Messing around with fine-tuning LLMs, part 7 -- detailed memory usage across sequence lengths for an 8B model posted on 2024-08-16T23:45:00+00:00
Messing around with fine-tuning LLMs, part 8 -- detailed memory usage across batch sizes posted on 2024-08-25T23:00:00+00:00
Messing around with fine-tuning LLMs, part 9 -- gradient checkpointing posted on 2024-09-03T23:00:00+00:00
Messing around with fine-tuning LLMs, part 10 -- finally training the model! posted on 2024-12-22T19:00:00+00:00
Writing an LLM from scratch, part 2 posted on 2024-12-23T21:00:00+00:00
Writing an LLM from scratch, part 3 posted on 2024-12-26T22:30:00+00:00
An AI chatroom (beginnings) posted on 2024-12-29T23:15:00+00:00
An AI chatroom (a few steps further) posted on 2024-12-30T23:15:00+00:00
Michael Foord: RIP posted on 2025-01-26T20:30:00+00:00
Writing an LLM from scratch, part 9 -- causal attention posted on 2025-03-09T23:30:00+00:00
Writing an LLM from scratch, part 11 -- batches posted on 2025-04-19T23:00:00+00:00
Writing an LLM from scratch, part 12 -- multi-head attention posted on 2025-04-21T23:00:00+00:00
Writing an LLM from scratch, part 15 -- from context vectors to logits; or, can it really be that simple?! posted on 2025-05-31T23:55:00+00:00
Moving from Fabric3 to Fabric posted on 2025-06-15T01:30:00+00:00
The fixed length bottleneck and the feed forward network posted on 2025-08-14T23:00:00+00:00
Writing an LLM from scratch, part 23 -- fine-tuning for classification posted on 2025-10-22T23:40:00+00:00
Retro Language Models: Rebuilding Karpathy’s RNN in PyTorch posted on 2025-10-24T19:00:00+00:00
Writing an LLM from scratch, part 28 -- training a base model from scratch on an RTX 3090 posted on 2025-12-02T18:15:00+00:00
Writing an LLM from scratch, part 29 -- using DistributedDataParallel to train a base model from scratch in the cloud posted on 2026-01-07T20:40:00+00:00
Writing an LLM from scratch, part 31 -- the models are now on Hugging Face posted on 2026-01-17T19:45:00+00:00
Getting a custom PyTorch LLM onto the Hugging Face Hub (Transformers: AutoModel, pipeline, and Trainer) posted on 2026-01-28T23:00:00+00:00
Writing an LLM from scratch, part 32b -- Interventions: gradient clipping posted on 2026-02-05T01:20:00+00:00
Writing an LLM from scratch, part 32d -- Interventions: adding attention bias posted on 2026-02-06T23:55:00+00:00
Writing an LLM from scratch, part 32e -- Interventions: the learning rate posted on 2026-03-10T23:55:00+00:00
Writing an LLM from scratch, part 32f -- Interventions: weight decay posted on 2026-03-23T23:55:00+00:00
Writing an LLM from scratch, part 32g -- Interventions: weight tying posted on 2026-03-24T19:50:00+00:00
Writing an LLM from scratch, part 32h -- Interventions: full fat float32 posted on 2026-04-03T23:50:00+00:00
Writing an LLM from scratch, part 32i -- Interventions: what is in the noise? posted on 2026-04-07T21:00:00+00:00
Writing an LLM from scratch, part 32k -- Interventions: training a better model locally with gradient accumulation posted on 2026-04-15T20:00:00+00:00
On first looking into JAX posted on 2026-05-30T18:45:00+00:00
Using Safetensors with Flax posted on 2026-06-04T23:30:00+00:00
JAX backends and devices posted on 2026-06-05T19:30:00+00:00

Posts in category LLM from scratch

Writing an LLM from scratch, part 1 posted on 2024-12-22T21:00:00+00:00
Writing an LLM from scratch, part 2 posted on 2024-12-23T21:00:00+00:00
Writing an LLM from scratch, part 3 posted on 2024-12-26T22:30:00+00:00
Writing an LLM from scratch, part 4 posted on 2024-12-28T22:30:00+00:00
Writing an LLM from scratch, part 5 -- more on self-attention posted on 2025-01-11T23:30:00+00:00
Writing an LLM from scratch, part 6 -- starting to code self-attention posted on 2025-01-21T22:30:00+00:00
Writing an LLM from scratch, part 6b -- a correction posted on 2025-01-28T22:30:00+00:00
Writing an LLM from scratch, part 7 -- wrapping up non-trainable self-attention posted on 2025-02-07T21:30:00+00:00
Writing an LLM from scratch, part 8 -- trainable self-attention posted on 2025-03-04T21:30:00+00:00
Writing an LLM from scratch, part 9 -- causal attention posted on 2025-03-09T23:30:00+00:00
Writing an LLM from scratch, part 10 -- dropout posted on 2025-03-19T23:30:00+00:00
Writing an LLM from scratch, part 11 -- batches posted on 2025-04-19T23:00:00+00:00
Writing an LLM from scratch, part 12 -- multi-head attention posted on 2025-04-21T23:00:00+00:00
Writing an LLM from scratch, part 13 -- the 'why' of attention, or: attention heads are dumb posted on 2025-05-08T22:00:00+00:00
Writing an LLM from scratch, part 14 -- the complexity of self-attention at scale posted on 2025-05-14T21:00:00+00:00
Writing an LLM from scratch, part 15 -- from context vectors to logits; or, can it really be that simple?! posted on 2025-05-31T23:55:00+00:00
Writing an LLM from scratch, part 16 -- layer normalisation posted on 2025-07-08T18:50:00+00:00
Writing an LLM from scratch, part 17 -- the feed-forward network posted on 2025-08-12T23:00:00+00:00
Writing an LLM from scratch, part 18 -- residuals, shortcut connections, and the Talmud posted on 2025-08-18T20:20:00+00:00
Writing an LLM from scratch, part 19 -- wrapping up Chapter 4 posted on 2025-08-29T17:00:00+00:00
Writing an LLM from scratch, part 20 -- starting training, and cross entropy loss posted on 2025-10-02T22:10:00+00:00
Writing an LLM from scratch, part 21 -- perplexed by perplexity posted on 2025-10-07T20:00:00+00:00
Writing an LLM from scratch, part 22 -- finally training our LLM! posted on 2025-10-15T23:40:00+00:00
Writing an LLM from scratch, part 23 -- fine-tuning for classification posted on 2025-10-22T23:40:00+00:00
Writing an LLM from scratch, part 24 -- the transcript hack posted on 2025-10-28T20:15:00+00:00
Writing an LLM from scratch, part 25 -- instruction fine-tuning posted on 2025-10-29T23:40:00+00:00
Writing an LLM from scratch, part 26 -- evaluating the fine-tuned model posted on 2025-11-03T19:40:00+00:00
Writing an LLM from scratch, part 27 -- what's left, and what's next? posted on 2025-11-04T00:40:00+00:00
Writing an LLM from scratch, part 28 -- training a base model from scratch on an RTX 3090 posted on 2025-12-02T18:15:00+00:00
Writing an LLM from scratch, part 29 -- using DistributedDataParallel to train a base model from scratch in the cloud posted on 2026-01-07T20:40:00+00:00
Writing an LLM from scratch, part 30 -- digging into the LLM-as-a-judge results posted on 2026-01-09T01:15:00+00:00
Writing an LLM from scratch, part 31 -- the models are now on Hugging Face posted on 2026-01-17T19:45:00+00:00
Writing an LLM from scratch, part 32a -- Interventions: training a baseline model posted on 2026-02-04T01:45:00+00:00
Writing an LLM from scratch, part 32b -- Interventions: gradient clipping posted on 2026-02-05T01:20:00+00:00
Writing an LLM from scratch, part 32c -- Interventions: removing dropout posted on 2026-02-05T23:35:00+00:00
Writing an LLM from scratch, part 32d -- Interventions: adding attention bias posted on 2026-02-06T23:55:00+00:00
Writing an LLM from scratch, part 32e -- Interventions: the learning rate posted on 2026-03-10T23:55:00+00:00
Writing an LLM from scratch, part 32f -- Interventions: weight decay posted on 2026-03-23T23:55:00+00:00
Writing an LLM from scratch, part 32g -- Interventions: weight tying posted on 2026-03-24T19:50:00+00:00
Writing an LLM from scratch, part 32h -- Interventions: full fat float32 posted on 2026-04-03T23:50:00+00:00
Writing an LLM from scratch, part 32i -- Interventions: what is in the noise? posted on 2026-04-07T21:00:00+00:00
Writing an LLM from scratch, part 32j -- Interventions: trying to train a better model in the cloud posted on 2026-04-09T20:00:00+00:00
Writing an LLM from scratch, part 32k -- Interventions: training a better model locally with gradient accumulation posted on 2026-04-15T20:00:00+00:00
Writing an LLM from scratch, part 32l -- Interventions: updated instruction fine-tuning results posted on 2026-04-20T20:00:00+00:00
Writing an LLM from scratch, part 32m -- Interventions: conclusion posted on 2026-04-21T18:15:00+00:00
Writing an LLM from scratch, part 33 -- what I learned from finally getting round to the appendices posted on 2026-04-22T17:30:00+00:00

Posts in category Resolver One

Screencast posted on 2007-12-18T14:23:17+00:00
Off to visit the Beast of Redmond ;-) posted on 2008-01-26T01:40:02+00:00
Resolver One as a Python Success Story posted on 2008-08-01T10:41:47+00:00
Evolution in action posted on 2008-10-03T17:52:56+00:00
Why use IronPython? posted on 2008-10-07T11:00:35+00:00
Do one thing and do it well posted on 2008-11-20T10:34:55+00:00
Resolver One plug posted on 2008-11-26T20:46:08+00:00
Ironclad 0.7 released posted on 2008-11-27T22:16:36+00:00
VAT calculations posted on 2008-11-28T20:01:00+00:00
Money for spreadsheets posted on 2008-12-18T17:18:49+00:00
How much should I charge for my software? posted on 2009-01-05T18:26:12+00:00
Resolver Systems competition closing soon posted on 2009-01-15T15:20:33+00:00
The Resolver One Spreadsheet Challenge: a winner for round one! posted on 2009-01-20T16:18:05+00:00
How much we decided to charge for our software posted on 2009-01-23T15:06:41+00:00
xmlrpc posted on 2009-02-13T14:07:16+00:00
Usability testers needed posted on 2009-02-27T13:21:42+00:00
R in Resolver One (and perhaps IronPython generally) posted on 2009-03-02T19:19:04+00:00
One-day discount posted on 2009-03-17T14:56:09+00:00
Resolver One and Digipede posted on 2009-04-30T17:35:30+00:00
Talk at London Geek Night posted on 2009-05-01T17:57:52+00:00
A Resolver One model on the FT politics blog posted on 2009-07-23T17:07:48+00:00
Clicking the tabs from left to right posted on 2009-08-05T17:24:28+00:00
3D graphics in Resolver One using OpenGL and Tao, part I posted on 2009-09-09T16:43:57+00:00
3D graphics in Resolver One using OpenGL and Tao, part II: an orrery posted on 2009-09-17T15:06:10+00:00
3D graphics in Resolver One using OpenGL and Tao, part III: Stock prices posted on 2009-11-20T20:00:42+00:00
Joining TheyWorkForYou to Twitter posted on 2010-01-20T00:06:56+00:00
OpenCL: .NET, C# and Resolver One integration -- the very beginnings posted on 2010-03-18T20:16:47+00:00
Regular expressions and Resolver One column-level formulae posted on 2010-04-26T17:26:29+00:00
Generating political news using NLTK posted on 2010-05-04T17:16:25+00:00
Running Resolver One on Mono for Windows posted on 2010-05-28T16:52:07+00:00
A big announcement from Resolver Systems posted on 2010-10-01T18:39:24+00:00
A Dirigible screencast posted on 2010-11-15T18:16:46+00:00
Busy, busy, busy posted on 2011-04-27T14:37:28+00:00
Resolver is hiring posted on 2011-05-16T17:24:25+00:00

Posts in category PyTorch

Writing an LLM from scratch, part 3 posted on 2024-12-26T22:30:00+00:00
Writing an LLM from scratch, part 9 -- causal attention posted on 2025-03-09T23:30:00+00:00
Writing an LLM from scratch, part 11 -- batches posted on 2025-04-19T23:00:00+00:00
Writing an LLM from scratch, part 12 -- multi-head attention posted on 2025-04-21T23:00:00+00:00
Writing an LLM from scratch, part 15 -- from context vectors to logits; or, can it really be that simple?! posted on 2025-05-31T23:55:00+00:00
Writing an LLM from scratch, part 23 -- fine-tuning for classification posted on 2025-10-22T23:40:00+00:00
Retro Language Models: Rebuilding Karpathy’s RNN in PyTorch posted on 2025-10-24T19:00:00+00:00
Writing an LLM from scratch, part 28 -- training a base model from scratch on an RTX 3090 posted on 2025-12-02T18:15:00+00:00
Writing an LLM from scratch, part 29 -- using DistributedDataParallel to train a base model from scratch in the cloud posted on 2026-01-07T20:40:00+00:00
Writing an LLM from scratch, part 31 -- the models are now on Hugging Face posted on 2026-01-17T19:45:00+00:00
Getting a custom PyTorch LLM onto the Hugging Face Hub (Transformers: AutoModel, pipeline, and Trainer) posted on 2026-01-28T23:00:00+00:00
Writing an LLM from scratch, part 32b -- Interventions: gradient clipping posted on 2026-02-05T01:20:00+00:00
Writing an LLM from scratch, part 32d -- Interventions: adding attention bias posted on 2026-02-06T23:55:00+00:00
Writing an LLM from scratch, part 32e -- Interventions: the learning rate posted on 2026-03-10T23:55:00+00:00
Writing an LLM from scratch, part 32f -- Interventions: weight decay posted on 2026-03-23T23:55:00+00:00
Writing an LLM from scratch, part 32g -- Interventions: weight tying posted on 2026-03-24T19:50:00+00:00
Writing an LLM from scratch, part 32h -- Interventions: full fat float32 posted on 2026-04-03T23:50:00+00:00
Writing an LLM from scratch, part 32i -- Interventions: what is in the noise? posted on 2026-04-07T21:00:00+00:00
Writing an LLM from scratch, part 32k -- Interventions: training a better model locally with gradient accumulation posted on 2026-04-15T20:00:00+00:00
Writing an LLM from scratch, part 33 -- what I learned from finally getting round to the appendices posted on 2026-04-22T17:30:00+00:00
On first looking into JAX posted on 2026-05-30T18:45:00+00:00

Posts in category Blogkeeping

Hello, world! posted on 2006-11-10T22:04:01+00:00
Back again posted on 2006-11-29T00:59:44+00:00
Recovered! posted on 2008-10-29T01:04:55+00:00
Click-through ratios posted on 2009-03-16T15:34:27+00:00
Shiny new blog theme posted on 2009-11-19T01:37:14+00:00
A brief sidetrack: Varnish posted on 2013-10-02T19:18:38+00:00
...and another sidetrack -- a new theme! posted on 2013-10-03T20:08:30+00:00
...just resting... posted on 2013-12-12T19:51:40+00:00
A new beginning posted on 2021-02-16T22:43:00+00:00
Comments are back! posted on 2021-02-22T00:39:00+00:00
Blog design update posted on 2025-02-07T22:45:00+00:00
Adding mathematical typesetting to the blog posted on 2025-02-09T20:00:00+00:00
On the benefits of learning in public posted on 2025-02-23T19:00:00+00:00
Going through the archives posted on 2025-02-23T23:52:55+00:00
It's still worth blogging in the age of AI posted on 2025-02-24T23:52:55+00:00
Should RSS feeds contain the full blog post? posted on 2025-03-16T23:30:00+00:00
The RSS feed now has the full text posted on 2025-03-18T19:30:00+00:00
Adding /llms.txt posted on 2025-03-18T22:30:00+00:00

Posts in category TIL

MSBuild WTF: 'The error was:' posted on 2006-11-15T19:22:16+00:00
Workaround for Vista stupidity posted on 2008-07-01T17:23:20+00:00
Fix for pygame/PyOpenGL/NeHe tutorial windows not disappearing when run from IDLE posted on 2009-08-30T20:44:04+00:00
An odd crontab problem posted on 2010-05-18T12:40:08+00:00
Bare Git repositories posted on 2010-07-01T18:55:05+00:00
Running Django unit tests on PythonAnywhere posted on 2012-05-21T19:35:19+00:00
Raspberry Pi setup notes: getting the display to work! posted on 2012-06-20T19:13:28+00:00
Reverse proxying HTTP and WebSockets with virtual hosts using nginx and tcp_proxy_module posted on 2012-10-05T19:03:58+00:00
Parsing website SSL certificates in Python posted on 2016-12-09T17:31:52+00:00
Creating a time series from existing data in pandas posted on 2017-05-09T12:31:40+00:00
Installing the unifi controller on Arch posted on 2019-08-20T22:13:32+00:00
Adding mathematical typesetting to the blog posted on 2025-02-09T20:00:00+00:00
Getting MathML to render properly in Chrome, Chromium and Brave posted on 2025-02-16T20:00:00+00:00
10Gb/s Ethernet: what I had to (re)learn posted on 2026-04-28T18:45:00+00:00
10Gb/s Ethernet: what I actually did to get it working in my home posted on 2026-04-29T14:15:00+00:00
10Gb/s Ethernet: using mini-heatsinks with a 10GBASE-T SFP+ module posted on 2026-05-18T19:15:00+00:00
Using Safetensors with Flax posted on 2026-06-04T23:30:00+00:00
JAX backends and devices posted on 2026-06-05T19:30:00+00:00

Posts in category PythonAnywhere

Busy, busy, busy posted on 2011-04-27T14:37:28+00:00
Resolver is hiring posted on 2011-05-16T17:24:25+00:00
Teaching programming posted on 2011-10-14T14:16:41+00:00
PythonAnywhereAnywhere posted on 2012-02-27T15:31:40+00:00
Running Django unit tests on PythonAnywhere posted on 2012-05-21T19:35:19+00:00
Reverse proxying HTTP and WebSockets with virtual hosts using nginx and tcp_proxy_module posted on 2012-10-05T19:03:58+00:00
A super-simple chat app with AngularJS, SockJS and node.js posted on 2013-02-12T20:13:01+00:00
How many Python programmers are there in the world? posted on 2013-06-24T18:13:05+00:00
SNI-based reverse proxying with Go(lang) posted on 2013-07-18T20:10:02+00:00
A fun bug posted on 2014-03-28T17:40:07+00:00
An HTTP request's journey through a platform-as-a-service posted on 2014-08-20T12:32:33+00:00
pam-unshare: a PAM module that switches into a PID namespace posted on 2016-04-15T19:03:27+00:00
Parsing website SSL certificates in Python posted on 2016-12-09T17:31:52+00:00
A somewhat indirect way of reporting stolen cards to the bank posted on 2022-02-06T20:45:00+00:00
Acquired! posted on 2022-09-28T20:45:00+00:00
Giving up on the AI chatbot tutorial (for now) posted on 2024-02-27T20:45:00+00:00
Leaving PythonAnywhere posted on 2025-06-05T19:30:00+00:00

Posts in category Linux

Dear lazyweb: what is is about Linux and WPA? posted on 2008-01-11T00:41:40+00:00
New gadget: Nokia N900 posted on 2009-12-23T01:44:55+00:00
An odd crontab problem posted on 2010-05-18T12:40:08+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, part 0: introduction posted on 2013-08-08T14:18:07+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, part 1: a trivial single-threaded proxy posted on 2013-08-12T19:02:48+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, part 2: handling multiple connections with epoll posted on 2013-09-07T16:21:36+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, part 3: Lua-based configuration posted on 2013-09-11T19:39:45+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, pause to regroup: non-blocking output posted on 2013-09-28T22:08:46+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, pause to regroup: fixed it! posted on 2013-09-29T23:09:39+00:00
A brief sidetrack: Varnish posted on 2013-10-02T19:18:38+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, part 4: Dealing with slow writes to the network posted on 2013-10-10T21:09:34+00:00
SHA-1 sunset in Chromium, and libnss3 posted on 2015-08-06T12:18:50+00:00
pam-unshare: a PAM module that switches into a PID namespace posted on 2016-04-15T19:03:27+00:00
Installing the unifi controller on Arch posted on 2019-08-20T22:13:32+00:00
Fun with network namespaces posted on 2021-03-13T19:30:00+00:00
Moving from Fabric3 to Fabric posted on 2025-06-15T01:30:00+00:00

Posts in category Startups

Screencast posted on 2007-12-18T14:23:17+00:00
Making a fool of yourself in public posted on 2008-05-06T17:59:42+00:00
Resolver One as a Python Success Story posted on 2008-08-01T10:41:47+00:00
Off to BoS posted on 2008-09-02T01:56:25+00:00
Evolution in action posted on 2008-10-03T17:52:56+00:00
Do one thing and do it well posted on 2008-11-20T10:34:55+00:00
Product management with Google AdWords posted on 2008-12-04T19:07:09+00:00
How much should I charge for my software? posted on 2009-01-05T18:26:12+00:00
How much we decided to charge for our software posted on 2009-01-23T15:06:41+00:00
Talk at London Geek Night posted on 2009-05-01T17:57:52+00:00
IT headhunters considered harmful posted on 2010-01-07T18:23:18+00:00
Busy, busy, busy posted on 2011-04-27T14:37:28+00:00
Does #EUVAT make accepting bitcoins impossible for EU-based digital services businesses? posted on 2014-12-19T16:14:08+00:00
A somewhat indirect way of reporting stolen cards to the bank posted on 2022-02-06T20:45:00+00:00
Acquired! posted on 2022-09-28T20:45:00+00:00

Posts in category Hugging Face

Messing around with fine-tuning LLMs posted on 2024-04-27T22:45:00+00:00
Messing around with fine-tuning LLMs, part 2 -- to the cloud! posted on 2024-04-28T22:45:00+00:00
Messing around with fine-tuning LLMs, part 3 -- moar GPUs posted on 2024-05-15T23:45:00+00:00
Messing around with fine-tuning LLMs, part 4 -- training cross-GPU. posted on 2024-05-21T21:45:00+00:00
Messing around with fine-tuning LLMs, part 5 -- exploring memory usage posted on 2024-07-05T17:45:00+00:00
Messing around with fine-tuning LLMs, part 6 -- measuring memory usage more systematically posted on 2024-07-10T23:45:00+00:00
Messing around with fine-tuning LLMs, part 7 -- detailed memory usage across sequence lengths for an 8B model posted on 2024-08-16T23:45:00+00:00
Messing around with fine-tuning LLMs, part 8 -- detailed memory usage across batch sizes posted on 2024-08-25T23:00:00+00:00
Messing around with fine-tuning LLMs, part 9 -- gradient checkpointing posted on 2024-09-03T23:00:00+00:00
Messing around with fine-tuning LLMs, part 10 -- finally training the model! posted on 2024-12-22T19:00:00+00:00
A classifier using Qwen3 posted on 2025-10-24T23:30:00+00:00
Writing an LLM from scratch, part 31 -- the models are now on Hugging Face posted on 2026-01-17T19:45:00+00:00
Getting a custom PyTorch LLM onto the Hugging Face Hub (Transformers: AutoModel, pipeline, and Trainer) posted on 2026-01-28T23:00:00+00:00

Posts in category NSLU2 offsite backup project

Project: Automated offsite backups for an NSLU2 -- part 1 posted on 2006-11-11T01:41:32+00:00
Project: Automated offsite backups for an NSLU2 -- part 2 posted on 2006-11-11T02:08:39+00:00
Project: Automated offsite backups for an NSLU2 -- part 3 posted on 2006-11-11T02:53:34+00:00
Project: Automated offsite backups for an NSLU2 -- part 4 posted on 2006-11-11T19:25:51+00:00
Project: Automated offsite backups for an NSLU2 -- part 5 posted on 2006-11-11T22:13:19+00:00
Project: Automated offsite backups for an NSLU2 -- part 6 posted on 2006-11-11T23:12:06+00:00
Project: Automated offsite backups for an NSLU2 -- part 7 posted on 2006-11-12T02:01:59+00:00
Project: Automated offsite backups for an NSLU2 -- part 8 posted on 2006-11-12T04:00:21+00:00
Project: Automated offsite backups for an NSLU2 -- part 9 posted on 2006-11-14T00:07:07+00:00
Project: Automated offsite backups for an NSLU2 -- part 10 posted on 2006-11-14T01:15:21+00:00
Project: Automated offsite backups for an NSLU2 -- part 11 posted on 2006-11-14T23:00:09+00:00
Project: Automated offsite backups for an NSLU2 -- part 12 posted on 2006-11-16T00:50:41+00:00
Project: Automated offsite backups for an NSLU2 -- part 13 posted on 2006-11-17T01:01:59+00:00

Posts in category Funny

HTML tattoo posted on 2007-03-04T03:30:53+00:00
A Thinking Ape's Critique of Trans-Simianism posted on 2008-06-11T18:04:45+00:00
Best. Video. Ever. posted on 2008-11-20T01:17:59+00:00
Spam subject line of the day posted on 2009-01-22T08:29:15+00:00
Praying that this isn't a hoax... posted on 2009-02-19T14:00:35+00:00
hExcel -- A Hexagonal Spreadsheet posted on 2009-08-06T13:54:52+00:00
New splogging technique? posted on 2009-11-04T01:45:48+00:00
Generating political news using NLTK posted on 2010-05-04T17:16:25+00:00
'Your Support Request has been submitted to the Support Request' posted on 2010-09-08T16:32:46+00:00
And the same to you too, Google! posted on 2010-12-23T14:35:21+00:00
If programming languages were literary genres... posted on 2011-06-24T23:09:33+00:00

Posts in category Gadgets

Eee PC posted on 2007-12-21T00:22:43+00:00
Eee, day 2 posted on 2007-12-22T01:10:42+00:00
New gadget! posted on 2008-01-10T23:54:09+00:00
Dear lazyweb: what is is about Linux and WPA? posted on 2008-01-11T00:41:40+00:00
SSDs posted on 2009-03-18T01:16:13+00:00
Dear lazyweb: To i7 or not to i7? posted on 2009-10-28T02:26:32+00:00
New gadget: Nokia N900 posted on 2009-12-23T01:44:55+00:00
New laptop! posted on 2010-01-28T01:44:27+00:00
10Gb/s Ethernet: what I had to (re)learn posted on 2026-04-28T18:45:00+00:00
10Gb/s Ethernet: what I actually did to get it working in my home posted on 2026-04-29T14:15:00+00:00
10Gb/s Ethernet: using mini-heatsinks with a 10GBASE-T SFP+ module posted on 2026-05-18T19:15:00+00:00

Posts in category Musings

Why should the government fund space exploration? posted on 2008-01-13T00:06:53+00:00
Making a fool of yourself in public posted on 2008-05-06T17:59:42+00:00
An aside: SEO for restaurants posted on 2010-03-19T23:42:05+00:00
Do reasoning LLMs need their own Philosophical Language? posted on 2025-01-16T23:30:00+00:00
On the perils of AI-first debugging -- or, why Stack Overflow still matters in 2025 posted on 2025-02-19T02:30:00+00:00
On the benefits of learning in public posted on 2025-02-23T19:00:00+00:00
It's still worth blogging in the age of AI posted on 2025-02-24T23:52:55+00:00
Dropout and mandatory vacation posted on 2025-03-24T23:45:00+00:00
The fixed length bottleneck and the feed forward network posted on 2025-08-14T23:00:00+00:00
Why smart instruction-following makes prompt injection easier posted on 2025-11-12T19:00:00+00:00
On first looking into JAX posted on 2026-05-30T18:45:00+00:00

Posts in category Finance

I came for the article, I stayed for the comments posted on 2008-10-09T00:20:11+00:00
VAT calculations posted on 2008-11-28T20:01:00+00:00
London Financial Python Users Group posted on 2009-11-11T18:19:30+00:00
New York Financial Users Group posted on 2009-11-13T14:50:28+00:00
A website for LFPUG posted on 2009-12-07T18:39:50+00:00
Next London Financial Python Users Group meeting posted on 2010-01-28T15:57:15+00:00
London Financial Python Users' Group posted on 2010-02-16T12:01:54+00:00
London Financial User Group Meeting: September 15 posted on 2010-08-24T15:06:05+00:00
London Financial User Group Meeting: 17 January posted on 2011-01-10T19:52:19+00:00
How to bet on the bubble? posted on 2011-03-30T17:57:46+00:00

Posts in category Fine-tuning LLMs

Messing around with fine-tuning LLMs posted on 2024-04-27T22:45:00+00:00
Messing around with fine-tuning LLMs, part 2 -- to the cloud! posted on 2024-04-28T22:45:00+00:00
Messing around with fine-tuning LLMs, part 3 -- moar GPUs posted on 2024-05-15T23:45:00+00:00
Messing around with fine-tuning LLMs, part 4 -- training cross-GPU. posted on 2024-05-21T21:45:00+00:00
Messing around with fine-tuning LLMs, part 5 -- exploring memory usage posted on 2024-07-05T17:45:00+00:00
Messing around with fine-tuning LLMs, part 6 -- measuring memory usage more systematically posted on 2024-07-10T23:45:00+00:00
Messing around with fine-tuning LLMs, part 7 -- detailed memory usage across sequence lengths for an 8B model posted on 2024-08-16T23:45:00+00:00
Messing around with fine-tuning LLMs, part 8 -- detailed memory usage across batch sizes posted on 2024-08-25T23:00:00+00:00
Messing around with fine-tuning LLMs, part 9 -- gradient checkpointing posted on 2024-09-03T23:00:00+00:00
Messing around with fine-tuning LLMs, part 10 -- finally training the model! posted on 2024-12-22T19:00:00+00:00

Posts in category C

A bit of fun posted on 2008-09-29T22:52:13+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, part 0: introduction posted on 2013-08-08T14:18:07+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, part 1: a trivial single-threaded proxy posted on 2013-08-12T19:02:48+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, part 2: handling multiple connections with epoll posted on 2013-09-07T16:21:36+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, part 3: Lua-based configuration posted on 2013-09-11T19:39:45+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, pause to regroup: non-blocking output posted on 2013-09-28T22:08:46+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, pause to regroup: fixed it! posted on 2013-09-29T23:09:39+00:00
Writing a reverse proxy/loadbalancer from the ground up in C, part 4: Dealing with slow writes to the network posted on 2013-10-10T21:09:34+00:00
pam-unshare: a PAM module that switches into a PID namespace posted on 2016-04-15T19:03:27+00:00

Posts in category Personal

Feelix Growing posted on 2007-02-23T19:20:48+00:00
Making a fool of yourself in public posted on 2008-05-06T17:59:42+00:00
A bit of fun posted on 2008-09-29T22:52:13+00:00
Ada Lovelace day posted on 2009-03-25T01:19:28+00:00
COVID-19 breakthrough / re-infection: a personal tale posted on 2021-11-30T20:59:00+00:00
Happy New Year! posted on 2025-01-05T23:15:00+00:00
Michael Foord: RIP posted on 2025-01-26T20:30:00+00:00
Leaving PythonAnywhere posted on 2025-06-05T19:30:00+00:00

Posts in category Robotics

Christmas has come early posted on 2006-11-13T20:48:19+00:00
Make:06 Trimet... hmmm. posted on 2006-12-11T01:03:14+00:00
Another robot posted on 2006-12-31T18:31:13+00:00
Building the Coat Hanger Walker posted on 2007-01-09T23:25:07+00:00
'Dancing mule' posted on 2007-01-09T23:57:30+00:00
Feelix Growing posted on 2007-02-23T19:20:48+00:00
Best. Video. Ever. posted on 2008-11-20T01:17:59+00:00
Affective robots posted on 2008-11-26T00:08:16+00:00

Posts in category Website design

Clicking the tabs from left to right posted on 2009-08-05T17:24:28+00:00
A new beginning posted on 2021-02-16T22:43:00+00:00
Blog design update posted on 2025-02-07T22:45:00+00:00
Adding mathematical typesetting to the blog posted on 2025-02-09T20:00:00+00:00
Getting MathML to render properly in Chrome, Chromium and Brave posted on 2025-02-16T20:00:00+00:00
Should RSS feeds contain the full blog post? posted on 2025-03-16T23:30:00+00:00
The RSS feed now has the full text posted on 2025-03-18T19:30:00+00:00
Adding /llms.txt posted on 2025-03-18T22:30:00+00:00

Posts in category 3D

Fix for pygame/PyOpenGL/NeHe tutorial windows not disappearing when run from IDLE posted on 2009-08-30T20:44:04+00:00
3D graphics in Resolver One using OpenGL and Tao, part I posted on 2009-09-09T16:43:57+00:00
3D graphics in Resolver One using OpenGL and Tao, part II: an orrery posted on 2009-09-17T15:06:10+00:00
WebGL posted on 2009-10-15T14:09:16+00:00
3D graphics in Resolver One using OpenGL and Tao, part III: Stock prices posted on 2009-11-20T20:00:42+00:00

Posts in category Rants

Patronising messages posted on 2007-03-24T23:43:24+00:00
Dear lazyweb: what is is about Linux and WPA? posted on 2008-01-11T00:41:40+00:00
Workaround for Vista stupidity posted on 2008-07-01T17:23:20+00:00
IT headhunters considered harmful posted on 2010-01-07T18:23:18+00:00
New business idea posted on 2012-04-10T09:53:52+00:00

Posts in category Cryptography

SNI-based reverse proxying with Go(lang) posted on 2013-07-18T20:10:02+00:00
SHA-1 sunset in Chromium, and libnss3 posted on 2015-08-06T12:18:50+00:00
Parsing website SSL certificates in Python posted on 2016-12-09T17:31:52+00:00
Python code to generate Let's Encrypt certificates posted on 2018-11-16T19:55:00+00:00

Posts in category JavaScript

A bit of fun posted on 2008-09-29T22:52:13+00:00
Fun with the Audio Data API posted on 2010-12-06T20:03:54+00:00
Some old JavaScript posted on 2011-02-03T02:04:07+00:00
A super-simple chat app with AngularJS, SockJS and node.js posted on 2013-02-12T20:13:01+0
