Enhancing Data Science Outcomes with Efficient Workflow Training
Commitment | 1 Day, 7-8 hours a day. |
Language | English |
User Ratings | Average User Rating 4.8 |
Price | REQUEST |
Delivery Options | Instructor-Led Onsite, Online, and Classroom Live |
COURSE OVERVIEW
In Enhancing Data Science Outcomes with Efficient Workflow Training, participants learn how to create an end-to-end, hardware-accelerated machine learning pipeline for large datasets. Throughout the development process, participants use diagnostic tools to identify delays and learn to mitigate common pitfalls.
Please note that once a booking has been confirmed, it is non-refundable. This means that after you have confirmed your seat for an event, it cannot be canceled, and no refund will be issued, regardless of attendance.
WHAT'S INCLUDED?
- 1 day of Enhancing Data Science Outcomes with Efficient Workflow Training with an expert instructor
- Enhancing Data Science Outcomes with Efficient Workflow Electronic Course Guide
- Certificate of Completion
- 100% Satisfaction Guarantee
COURSE OBJECTIVES
Upon completion of this Enhancing Data Science Outcomes with Efficient Workflow Training course, participants will be able to:
- Develop and deploy an accelerated end-to-end data processing pipeline for large datasets
- Scale data science workflows using distributed computing
- Perform DataFrame transformations that take advantage of hardware acceleration and avoid hidden slowdowns
- Enhance machine learning solutions through feature engineering and rapid experimentation
- Improve data processing pipeline performance by optimizing memory management and hardware utilization
CUSTOMIZE IT
- We can adapt this Enhancing Data Science Outcomes with Efficient Workflow Training course to your group’s background and work requirements at little to no added cost.
- If you are familiar with some aspects of this Enhancing Data Science Outcomes with Efficient Workflow course, we can omit or shorten their discussion.
- We can adjust the emphasis placed on the various topics or build the Enhancing Data Science Outcomes with Efficient Workflow Training course around the mix of technologies of interest to you (including technologies other than those in this outline).
- If your background is nontechnical, we can exclude the more technical topics, include the topics that may be of special interest to you (e.g., as a manager or policymaker), and present the Enhancing Data Science Outcomes with Efficient Workflow course in a manner understandable to lay audiences.
AUDIENCE/TARGET GROUP
The target audience for this Enhancing Data Science Outcomes with Efficient Workflow course:
- ALL
CLASS PREREQUISITES
The knowledge and skills that a learner must have before attending this Enhancing Data Science Outcomes with Efficient Workflow course are:
- Basic knowledge of a standard data science workflow on tabular data. To gain an adequate understanding, we recommend this article.
- Knowledge of distributed computing using Dask. To gain an adequate understanding, we recommend the “Get Started” guide from Dask.
- Completion of the NVIDIA Deep Learning Institute (DLI) course Fundamentals of Accelerated Data Science, or the ability to manipulate data using cuDF and some experience building machine learning models using cuML.
COURSE SYLLABUS
Introduction
- Meet the instructor.
- Create an account at courses.nvidia.com/join
Advanced Extract, Transform, and Load (ETL)
- Learn to process large volumes of data efficiently for downstream analysis:
  - Discuss current challenges of growing data sizes.
  - Perform ETL efficiently on large datasets.
  - Discuss hidden slowdowns and perform DataFrame transformations properly.
  - Discuss diagnostic tools to monitor and optimize hardware utilization.
  - Persist data in a way that is conducive to downstream analytics.
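As an illustration of the "hidden slowdowns" topic above, the following is a minimal sketch, not course material. It uses pandas on the CPU as a stand-in, on the assumption that the same idea carries over to cuDF, which offers a pandas-like DataFrame API on the GPU. The column names and data are hypothetical. It contrasts a row-wise `apply`, which calls a Python function once per row, with an equivalent columnar expression.

```python
import numpy as np
import pandas as pd

# Hypothetical trip data; the column names are invented for illustration.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "distance_km": rng.uniform(1.0, 50.0, size=5_000),
    "duration_min": rng.uniform(5.0, 120.0, size=5_000),
})


def speed_rowwise(frame: pd.DataFrame) -> pd.Series:
    # Slow path: a Python function is invoked once for every row.
    return frame.apply(
        lambda row: row["distance_km"] / (row["duration_min"] / 60.0),
        axis=1,
    )


def speed_vectorized(frame: pd.DataFrame) -> pd.Series:
    # Fast path: one columnar expression evaluated over whole columns.
    return frame["distance_km"] / (frame["duration_min"] / 60.0)


slow = speed_rowwise(df)
fast = speed_vectorized(df)
```

Both functions return the same values; only the execution strategy differs. Expressing a transformation as whole-column operations is generally what allows a DataFrame library to run it efficiently, whether on CPU or GPU.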
Training on Multiple GPUs With PyTorch Distributed Data Parallel (DDP)
- Learn how to improve data analysis on large datasets:
  - Build and compare classification models.
  - Perform feature selection based on predictive power of new and existing features.
  - Perform hyperparameter tuning.
  - Create embeddings using deep learning and clustering on embeddings.
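One simple way to picture feature selection by predictive power is a filter that scores each feature against the label. The sketch below is an assumption for illustration only, not the method taught in the course: it ranks synthetic features by absolute Pearson correlation with a binary label using NumPy. The feature names and the data-generating rule are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2_000

# Synthetic features: one strong, one weak, one unrelated to the label.
signal = rng.normal(size=n)
weak = rng.normal(size=n)
noise = rng.normal(size=n)

# Hypothetical binary label driven mostly by `signal`, slightly by `weak`.
label = (signal + 0.3 * weak + 0.5 * rng.normal(size=n) > 0).astype(float)

features = {"signal": signal, "weak": weak, "noise": noise}


def rank_by_abs_correlation(columns, target):
    """Return (name, score) pairs sorted from most to least correlated."""
    scores = {
        name: abs(np.corrcoef(values, target)[0, 1])
        for name, values in columns.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_by_abs_correlation(features, label)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A correlation filter only captures linear, single-feature relationships. In practice it is usually a first pass before comparing models trained on candidate feature sets.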
Deployment
- Learn how to deploy and measure the performance of an accelerated data processing pipeline:
  - Deploy a data processing pipeline with Triton Inference Server.
  - Discuss various tuning parameters to optimize performance.
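To make "measure the performance" concrete, here is a minimal latency and throughput harness using only the Python standard library. It is a sketch under stated assumptions: `fake_infer` is a hypothetical stand-in for a request to a deployed model, and nothing here calls Triton Inference Server itself. Triton ships its own benchmarking tool, perf_analyzer, which would normally be used against a real deployment.

```python
import statistics
import time


def fake_infer(batch):
    # Hypothetical stand-in for sending one request to a deployed model.
    return [x * 2.0 for x in batch]


def benchmark(infer, batch, n_requests=200):
    """Time repeated calls and summarize latency and throughput."""
    latencies = []
    start = time.perf_counter()
    for _ in range(n_requests):
        t0 = time.perf_counter()
        infer(batch)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start

    latencies.sort()
    p95_index = int(0.95 * (len(latencies) - 1))
    return {
        "p50_ms": 1000 * statistics.median(latencies),
        "p95_ms": 1000 * latencies[p95_index],
        "throughput_rps": n_requests / elapsed,
    }


report = benchmark(fake_infer, batch=[float(i) for i in range(1024)])
print(report)
```

Reporting a tail percentile alongside the median matters when tuning, because changes such as a larger batch size can raise throughput while making the slowest requests slower.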
Assessment and Q&A