Model Parallelism: Building and Deploying Large Neural Networks Training
Commitment | 1 day (7–8 hours) |
Language | English |
User Ratings | Average User Rating 4.8 See what learners said |
Price | REQUEST |
Delivery Options | Instructor-Led Onsite, Online, and Classroom Live |
COURSE OVERVIEW
Model Parallelism: Building and Deploying Large Neural Networks Training: Very large deep neural networks (DNNs), whether applied to natural language processing (e.g., GPT-3), computer vision (e.g., huge Vision Transformers), or speech AI (e.g., wav2vec 2.0), have certain properties that set them apart from their smaller counterparts. As DNNs grow larger and are trained on progressively larger datasets, they can adapt to new tasks with just a handful of training examples, accelerating the route toward artificial general intelligence. Training models that contain tens to hundreds of billions of parameters on vast datasets isn’t trivial and requires a unique combination of AI, high-performance computing (HPC), and systems knowledge.
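A rough memory estimate shows why such models cannot be trained on a single accelerator. The sketch below is illustrative, not course material; it assumes the common rule of thumb of roughly 16 bytes of model state per parameter for mixed-precision training with the Adam optimizer.

```python
# Rough training-memory estimate for a large model (illustrative sketch).
# Assumption: mixed-precision Adam training needs roughly 16 bytes per
# parameter (fp16 weights + fp16 gradients + fp32 master weights + two
# fp32 Adam moment buffers), before counting activation memory.

BYTES_PER_PARAM = 16  # assumed rule of thumb, no sharding applied

def training_memory_gb(num_params: int) -> float:
    """Approximate memory (GB) needed just for model and optimizer state."""
    return num_params * BYTES_PER_PARAM / 1024**3

# A 175-billion-parameter model (GPT-3 scale) needs thousands of GB of
# state -- far beyond any single GPU, hence model parallelism:
print(f"{training_memory_gb(175_000_000_000):,.0f} GB")
```

Even before activations are counted, the state alone is tens of times larger than an 80 GB GPU, which is the core motivation for the techniques this course covers.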
Please note that once a booking has been confirmed, it is non-refundable. This means that after you have confirmed your seat for an event, it cannot be canceled, and no refund will be issued, regardless of attendance.
WHAT'S INCLUDED?
- 1 day of Model Parallelism: Building and Deploying Large Neural Networks Training with an expert instructor
- Model Parallelism: Building and Deploying Large Neural Networks Electronic Course Guide
- Certificate of Completion
- 100% Satisfaction Guarantee
RESOURCES
- Model Parallelism: Building and Deploying Large Neural Networks – https://www.wiley.com/
- Model Parallelism: Building and Deploying Large Neural Networks – https://www.packtpub.com/
- Model Parallelism: Building and Deploying Large Neural Networks – https://store.logicaloperations.com/
- Model Parallelism: Building and Deploying Large Neural Networks – https://us.artechhouse.com/
- Model Parallelism: Building and Deploying Large Neural Networks Training – https://www.amazon.com/
RELATED COURSES
- Applications of AI for Anomaly Detection Training
- Data Parallelism: How to Train Deep Learning Models on Multiple GPUs Training
- Fundamentals of Deep Learning Training
- Building Conversational AI Applications Training
- Applications of AI for Predictive Maintenance Training
- Building Transformer-Based Natural Language Processing Applications Training
- Computer Vision for Industrial Inspection Training
- Building AI-Based Cybersecurity Pipelines Training
ADDITIONAL INFORMATION
COURSE OBJECTIVES
Upon completion of this Model Parallelism: Building and Deploying Large Neural Networks Training course, participants will be able to:
- Train neural networks across multiple servers
- Use techniques such as activation checkpointing, gradient accumulation, and various forms of model parallelism to overcome the challenges associated with large-model memory footprint
- Capture and understand training performance characteristics to optimize model architecture
- Deploy very large multi-GPU models to production using NVIDIA Triton™ Inference Server
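Gradient accumulation, named in the objectives above, can be illustrated without any GPU: micro-batch gradients are summed across several backward passes and the optimizer steps once, matching the update of one large batch that would not fit in memory. A minimal pure-Python sketch with made-up per-example "gradients":

```python
# Gradient accumulation sketch (pure Python, hypothetical numbers):
# process micro-batches sequentially, accumulate their mean gradients,
# and rescale once before the optimizer step.

def mean(xs):
    return sum(xs) / len(xs)

full_batch = [0.5, -1.0, 2.0, 0.25, -0.75, 1.5, 0.0, 1.0]

# One big step (would need the whole batch in memory at once):
big_batch_grad = mean(full_batch)

# Accumulated steps over micro-batches of 2:
micro_size = 2
accum, steps = 0.0, 0
for i in range(0, len(full_batch), micro_size):
    micro = full_batch[i:i + micro_size]
    accum += mean(micro)          # in PyTorch, backward() adds into .grad
    steps += 1
accumulated_grad = accum / steps  # rescale before the optimizer step

print(big_batch_grad, accumulated_grad)  # identical: 0.4375 0.4375
```

Because all micro-batches here are the same size, the accumulated gradient equals the full-batch gradient exactly; the memory needed at any instant is only one micro-batch's worth.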
CUSTOMIZE IT
- We can adapt this Model Parallelism: Building and Deploying Large Neural Networks Training course to your group’s background and work requirements at little to no added cost.
- If you are familiar with some aspects of this Model Parallelism: Building and Deploying Large Neural Networks course, we can omit or shorten their discussion.
- We can adjust the emphasis placed on the various topics or build the Model Parallelism: Building and Deploying Large Neural Networks course around the mix of technologies of interest to you (including technologies other than those in this outline).
- If your background is nontechnical, we can exclude the more technical topics, include the topics that may be of special interest to you (e.g., as a manager or policymaker), and present the Model Parallelism: Building and Deploying Large Neural Networks course in a manner understandable to lay audiences.
AUDIENCE/TARGET GROUP
The target audience for this Model Parallelism: Building and Deploying Large Neural Networks Training course:
- Open to all audiences
CLASS PREREQUISITES
The knowledge and skills that a learner must have before attending this Model Parallelism: Building and Deploying Large Neural Networks Training course are:
- Good understanding of PyTorch
- Good understanding of deep learning and data parallel training concepts
- Hands-on experience with deep learning and data parallel training is useful but optional
COURSE SYLLABUS
Introduction
- Meet the instructor.
- Create an account at courses.nvidia.com/join.
Introduction to Training of Large Models
- Learn about the motivation behind and key challenges of training large models.
- Get an overview of the basic techniques and tools needed for large-scale training.
- Get an introduction to distributed training and the Slurm job scheduler.
- Train a GPT model using data parallelism.
- Profile the training process and understand execution performance.
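The data-parallel step practiced in this module can be previewed with a pure-Python simulation (hypothetical values, not the workshop's reference code): each worker computes a gradient on its own shard of the batch, and an all-reduce averages the gradients so every model replica applies the same update.

```python
# Data-parallel training step, simulated in pure Python.
# Each "worker" holds a shard of the batch, computes a local gradient,
# then all workers average their gradients (the all-reduce) so every
# model replica stays in sync.

def local_gradient(shard):
    """Stand-in for forward + backward on one worker's shard."""
    return sum(shard) / len(shard)

def all_reduce_mean(grads):
    """Stand-in for an NCCL all-reduce followed by division by world size."""
    avg = sum(grads) / len(grads)
    return [avg] * len(grads)  # every rank ends up with the same value

batch = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
world_size = 4
shards = [batch[r::world_size] for r in range(world_size)]  # round-robin split

grads = [local_gradient(s) for s in shards]
synced = all_reduce_mean(grads)
print(synced[0])  # identical on every rank: 3.875
```

With equal shard sizes, the synchronized gradient equals the full-batch gradient, which is why data parallelism preserves the single-GPU training result while splitting the work.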
Model Parallelism: Advanced Topics
- Increase the model size using a range of memory-saving techniques.
- Get an introduction to tensor and pipeline parallelism.
- Go beyond natural language processing and get an introduction to DeepSpeed.
- Auto-tune model performance.
- Learn about mixture-of-experts models.
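Pipeline parallelism, introduced in this module, starts from a simple idea: split the model's layers into contiguous stages, one stage per GPU. A small illustrative sketch (the partitioning function and its balancing rule are this author's assumption, not the workshop's code):

```python
# Pipeline parallelism sketch: split a model's layers into contiguous
# stages, one stage per GPU. Balanced stage sizes keep pipeline
# "bubbles" (idle time while stages wait for each other) small.

def partition_layers(num_layers: int, num_stages: int):
    """Return contiguous (start, end) layer ranges, one per pipeline stage."""
    base, extra = divmod(num_layers, num_stages)
    ranges, start = [], 0
    for stage in range(num_stages):
        size = base + (1 if stage < extra else 0)  # spread the remainder
        ranges.append((start, start + size))
        start += size
    return ranges

# A 24-layer transformer over 4 pipeline stages:
print(partition_layers(24, 4))  # [(0, 6), (6, 12), (12, 18), (18, 24)]
```

Tensor parallelism is the orthogonal axis: instead of whole layers, each weight matrix is split across GPUs; frameworks such as Megatron-LM and DeepSpeed combine both with data parallelism.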
Inference of Large Models
- Understand the challenges of deployment associated with large models.
- Explore techniques for model reduction.
- Learn how to use TensorRT-LLM.
- Learn how to use Triton Inference Server.
- Understand the process of deploying a GPT checkpoint to production.
- See an example of prompt engineering.
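The payoff of model reduction can be previewed with simple arithmetic: quantizing weights shrinks the memory an inference server must hold. The figures below are illustrative assumptions, not benchmarks from the course.

```python
# Model-reduction sketch: estimate how weight quantization shrinks the
# memory an inference server must hold (illustrative numbers only).

def weight_memory_gb(num_params: int, bytes_per_weight: float) -> float:
    """Memory (GB) for weights alone, ignoring KV cache and activations."""
    return num_params * bytes_per_weight / 1024**3

params = 20_000_000_000                # hypothetical 20B-parameter model
fp16 = weight_memory_gb(params, 2)    # 16-bit weights
int8 = weight_memory_gb(params, 1)    # 8-bit quantized weights
int4 = weight_memory_gb(params, 0.5)  # 4-bit quantized weights

print(f"fp16: {fp16:.1f} GB, int8: {int8:.1f} GB, int4: {int4:.1f} GB")
```

Halving the bits halves the weight memory, which is often the difference between needing multiple GPUs and fitting a model on one; the course's TensorRT-LLM and Triton modules put such reductions into practice.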
Final Review
- Review key learnings and answer questions.
- Complete the assessment and earn a certificate.
- Complete the workshop survey.