{"product_id":"9798868822964","title":"Building Large Language Models from Scratch: Design, Train, and Deploy LLMs with PyTorch","description":"\u003ch1\u003eBuilding Large Language Models from Scratch: Design, Train, and Deploy LLMs with PyTorch\u003c\/h1\u003e \u003ch2\u003eGrigorov, Dilyan\u003c\/h2\u003e \u003cp\u003e\u003c\/p\u003e\u003cp class=\"MsoNormal\"\u003eThis book is a complete, hands-on guide to designing, training, and deploying your own Large Language Models (LLMs)—from the foundations of tokenization to the advanced stages of fine-tuning and reinforcement learning. Written for developers, data scientists, and AI practitioners, it bridges core principles and state-of-the-art techniques, offering a rare, transparent look at how modern transformers truly work beneath the surface.\u003c\/p\u003e\n\u003cp class=\"MsoNormal\"\u003eStarting from the essentials, you’ll learn how to set up your environment with Python and PyTorch, manage datasets, and implement critical fundamentals such as tensors, embeddings, and gradient descent. You’ll then progress through the architectural heart of modern models, covering RMS normalization, rotary positional embeddings (RoPE), scaled dot-product attention, Grouped Query Attention (GQA), Mixture of Experts (MoE), and SwiGLU activations, each explored in depth and built step by step in code. As you advance, the book introduces custom CUDA kernel integration, teaching you how to optimize key components for speed and memory efficiency at the GPU level—an essential skill for scaling real-world LLMs. 
You’ll also gain mastery over the phases of training that define today’s leading models:\u003c\/p\u003e\n\u003cul style=\"margin-top: 0in;\" type=\"disc\"\u003e\n\u003cli class=\"MsoNormal\" style=\"mso-list: l0 level1 lfo1; tab-stops: list .5in;\"\u003ePretraining - Building general linguistic and semantic understanding.\u003c\/li\u003e\n\u003cli class=\"MsoNormal\" style=\"mso-list: l0 level1 lfo1; tab-stops: list .5in;\"\u003eMidtraining - Expanding domain-specific capabilities and adaptability.\u003c\/li\u003e\n\u003cli class=\"MsoNormal\" style=\"mso-list: l0 level1 lfo1; tab-stops: list .5in;\"\u003eSupervised Fine-Tuning (SFT) - Aligning behavior with curated, task-driven data.\u003c\/li\u003e\n\u003cli class=\"MsoNormal\" style=\"mso-list: l0 level1 lfo1; tab-stops: list .5in;\"\u003eReinforcement Learning from Human Feedback (RLHF) - Refining responses through reward-based optimization for human alignment.\u003c\/li\u003e\n\u003c\/ul\u003e\n\u003cp class=\"MsoNormal\"\u003eThe final chapters guide you through dataset preparation, filtering, deduplication, and training optimization, culminating in model evaluation and real-world prompting with a custom TokenGenerator for text generation and inference.\u003c\/p\u003e\n\u003cp class=\"MsoNormal\"\u003eBy the end of this book, you’ll have the knowledge and confidence to architect, train, and deploy your own transformer-based models, equipped with both the theoretical depth and practical expertise to innovate in the rapidly evolving world of AI.\u003c\/p\u003e\n\u003cp class=\"MsoNormal\"\u003e\u003cstrong\u003eWhat You’ll Learn\u003c\/strong\u003e\u003c\/p\u003e\n\u003cul style=\"margin-top: 0in;\" type=\"disc\"\u003e\n\u003cli class=\"MsoNormal\" style=\"mso-list: l1 level1 lfo2; tab-stops: list .5in;\"\u003eHow to configure and optimize your development environment using PyTorch.\u003c\/li\u003e\n\u003cli class=\"MsoNormal\" style=\"mso-list: l1 level1 lfo2; tab-stops: list .5in;\"\u003eThe mechanics of 
tokenization, embeddings, normalization, and attention mechanisms.\u003c\/li\u003e\n\u003cli class=\"MsoNormal\" style=\"mso-list: l1 level1 lfo2; tab-stops: list .5in;\"\u003eHow to implement transformer components like RMSNorm, RoPE, GQA, MoE, and SwiGLU from scratch.\u003c\/li\u003e\n\u003cli class=\"MsoNormal\" style=\"mso-list: l1 level1 lfo2; tab-stops: list .5in;\"\u003eHow to integrate custom CUDA kernels to accelerate transformer computations.\u003c\/li\u003e\n\u003cli class=\"MsoNormal\" style=\"mso-list: l1 level1 lfo2; tab-stops: list .5in;\"\u003eThe full LLM training pipeline: pretraining, midtraining, supervised fine-tuning, and RLHF.\u003c\/li\u003e\n\u003cli class=\"MsoNormal\" style=\"mso-list: l1 level1 lfo2; tab-stops: list .5in;\"\u003eTechniques for dataset preparation, deduplication, model debugging, and GPU memory management.\u003c\/li\u003e\n\u003cli class=\"MsoNormal\" style=\"mso-list: l1 level1 lfo2; tab-stops: list .5in;\"\u003eHow to train, evaluate, and deploy a complete GPT-like architecture for real-world tasks.\u003c\/li\u003e\n\u003c\/ul\u003e\n\u003cp class=\"MsoNormal\"\u003e\u003cstrong\u003eWho this book is for:\u003c\/strong\u003e\u003c\/p\u003e\n\u003cp class=\"MsoNormal\"\u003eSoftware developers, data scientists, machine learning engineers, and AI enthusiasts looking to build their own models from scratch.\u003c\/p\u003e \u003ch3\u003eDetails\u003c\/h3\u003e \u003cp\u003ePublished by: Apress\u003c\/p\u003e \u003cp\u003ePublication Date: 2026-04-28\u003c\/p\u003e \u003cp\u003eFormat: Paperback\u003c\/p\u003e \u003cp\u003eISBN-13: 9798868822964\u003c\/p\u003e \u003cp\u003eDOI: 10.1007\/979-8-8688-2297-1\u003c\/p\u003e \u003cp\u003eDimensions: 254mm x 178mm\u003c\/p\u003e \u003cp\u003ePages: 530\u003c\/p\u003e 
","brand":"Apress","offers":[{"title":"Default Title","offer_id":44698929692812,"sku":"9798868822964","price":53.99,"currency_code":"USD","in_stock":true}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/0710\/9545\/1788\/files\/9798868822964.jpg?v=1779654341","url":"https:\/\/lateknightbooks.com\/products\/9798868822964","provider":"Late Knight Books and Services, LLC","version":"1.0","type":"link"}