{"id":110348,"date":"2025-09-17T07:11:36","date_gmt":"2025-09-17T07:11:36","guid":{"rendered":"https:\/\/www.dumpsbase.com\/freedumps\/?p=110348"},"modified":"2025-10-03T03:49:25","modified_gmt":"2025-10-03T03:49:25","slug":"download-the-nvidia-ai-infrastructure-ncp-aii-dumps-v8-02-and-start-preparation-today-continue-to-read-ncp-aii-free-dumps-part-2-q41-q80","status":"publish","type":"post","link":"https:\/\/www.dumpsbase.com\/freedumps\/download-the-nvidia-ai-infrastructure-ncp-aii-dumps-v8-02-and-start-preparation-today-continue-to-read-ncp-aii-free-dumps-part-2-q41-q80.html","title":{"rendered":"Download the NVIDIA AI Infrastructure NCP-AII Dumps (V8.02) and Start Preparation Today: Continue to Read NCP-AII Free Dumps (Part 2, Q41-Q80)"},"content":{"rendered":"<p>According to the feedback, most candidates have completed their NVIDIA Certified Professional AI Infrastructure (NCP-AII) certification with DumpsBase. The NCP-AII dumps (V8.02) are the ideal choice for busy professionals seeking reliable, high-impact results. You can check our <a href=\"https:\/\/www.dumpsbase.com\/freedumps\/new-ncp-aii-dumps-v8-02-become-the-preferred-choice-for-making-preparations-check-the-nvidia-ncp-aii-free-dumps-part-1-q1-q40.html\"><em><strong>NCP-AII free dumps (Part 1, Q1-Q40)<\/strong><\/em><\/a> online and verify our quality. Download the NVIDIA NCP-AII dumps (V8.02) and practice our questions and answers in a PDF file and testing engine software. Trust, our expert-crafted exam questions deepen your understanding and prepare you for every possible scenario. Regularly updated to reflect the latest exam format, our dumps build your knowledge and confidence. 
With DumpsBase, your NVIDIA Certified Professional AI Infrastructure (NCP-AII) exam preparation is strategic, focused, and designed for success.<\/p>\n<h2>Today, we have <span style=\"background-color: #ffcc99;\"><em>NCP-AII free dumps (Part 2, Q41-Q80) online<\/em><\/span>, then you can continue to read demos:<\/h2>\n
<div  id=\"watupro_quiz\" class=\"quiz-area single-page-quiz\">\n<p id=\"submittingExam10794\" style=\"display:none;text-align:center;\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.dumpsbase.com\/freedumps\/wp-content\/plugins\/watupro\/img\/loading.gif\" width=\"16\" height=\"16\"><\/p>\n\n<div class=\"watupro-exam-description\" id=\"description-quiz-10794\"><\/div>\n\n<form action=\"\" method=\"post\" class=\"quiz-form\" id=\"quiz-10794\"  enctype=\"multipart\/form-data\" >\n<div class='watu-question ' id='question-1' style=';'><div id='questionWrap-1'  class='   watupro-question-id-426149'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>1. <\/span>In an InfiniBand fabric, what is the primary role of the Subnet Manager (SM) with respect to routing?<\/div><input type='hidden' name='question_id[]' id='qID_1' value='426149' \/><input type='hidden' id='answerType426149' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426149[]' id='answer-id-1649911' class='answer   answerof-426149 ' value='1649911'   \/><label for='answer-id-1649911' id='answer-label-1649911' class=' answer'><span>To forward packets based on destination IP addresses, similar to a traditional IP router.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426149[]' id='answer-id-1649912' class='answer   answerof-426149 ' value='1649912'   \/><label for='answer-id-1649912' id='answer-label-1649912' class=' answer'><span>To discover the network topology, calculate routing paths, and program the forwarding tables (LID tables) in the switches.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426149[]' id='answer-id-1649913' 
class='answer   answerof-426149 ' value='1649913'   \/><label for='answer-id-1649913' id='answer-label-1649913' class=' answer'><span>To monitor the network for congestion and dynamically adjust packet priorities using Quality of Service (QoS) mechanisms.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426149[]' id='answer-id-1649914' class='answer   answerof-426149 ' value='1649914'   \/><label for='answer-id-1649914' id='answer-label-1649914' class=' answer'><span>To provide a command-line interface for users to manually configure routing tables on each InfiniBand switch.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426149[]' id='answer-id-1649915' class='answer   answerof-426149 ' value='1649915'   \/><label for='answer-id-1649915' id='answer-label-1649915' class=' answer'><span>To act as a firewall, blocking unauthorized traffic based on pre-defined rules.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-2' style=';'><div id='questionWrap-2'  class='   watupro-question-id-426150'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>2. <\/span>You\u2019re debugging performance issues in a distributed training job. \u2018nvidia-smi\u2019 shows consistently high GPU utilization across all nodes, but the training speed isn\u2019t increasing linearly with the number of GPUs. Network bandwidth is sufficient. 
<br \/>\r<br>What is the most likely bottleneck?<\/div><input type='hidden' name='question_id[]' id='qID_2' value='426150' \/><input type='hidden' id='answerType426150' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426150[]' id='answer-id-1649916' class='answer   answerof-426150 ' value='1649916'   \/><label for='answer-id-1649916' id='answer-label-1649916' class=' answer'><span>Inefficient data loading and preprocessing pipeline, causing GPUs to wait for data.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426150[]' id='answer-id-1649917' class='answer   answerof-426150 ' value='1649917'   \/><label for='answer-id-1649917' id='answer-label-1649917' class=' answer'><span>NCCL is not configured optimally for the network topology, leading to high communication overhead.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426150[]' id='answer-id-1649918' class='answer   answerof-426150 ' value='1649918'   \/><label for='answer-id-1649918' id='answer-label-1649918' class=' answer'><span>The learning rate is not adjusted appropriately for the increased batch size across multiple GPUs.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426150[]' id='answer-id-1649919' class='answer   answerof-426150 ' value='1649919'   \/><label for='answer-id-1649919' id='answer-label-1649919' class=' answer'><span>The global batch size has exceeded the optimal point for the model, reducing per-sample accuracy and slowing convergence.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426150[]' id='answer-id-1649920' class='answer   answerof-426150 ' value='1649920'   \/><label for='answer-id-1649920' 
id='answer-label-1649920' class=' answer'><span>CUDA Graphs is not being utilized.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-3' style=';'><div id='questionWrap-3'  class='   watupro-question-id-426151'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>3. <\/span>You are tasked with designing a high-performance network for a large-scale recommendation system. The system requires low latency and high throughput for both training and inference. <br \/>\r<br>Which interconnect technology is MOST suitable for connecting the nodes within the cluster?<\/div><input type='hidden' name='question_id[]' id='qID_3' value='426151' \/><input type='hidden' id='answerType426151' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426151[]' id='answer-id-1649921' class='answer   answerof-426151 ' value='1649921'   \/><label for='answer-id-1649921' id='answer-label-1649921' class=' answer'><span>Gigabit Ethernet<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426151[]' id='answer-id-1649922' class='answer   answerof-426151 ' value='1649922'   \/><label for='answer-id-1649922' id='answer-label-1649922' class=' answer'><span>10 Gigabit Ethernet<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426151[]' id='answer-id-1649923' class='answer   answerof-426151 ' value='1649923'   \/><label for='answer-id-1649923' id='answer-label-1649923' class=' answer'><span>InfiniBand<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426151[]' id='answer-id-1649924' class='answer   answerof-426151 ' value='1649924'   \/><label for='answer-id-1649924' id='answer-label-1649924' class=' 
answer'><span>Fibre Channel<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426151[]' id='answer-id-1649925' class='answer   answerof-426151 ' value='1649925'   \/><label for='answer-id-1649925' id='answer-label-1649925' class=' answer'><span>100 Gigabit Ethernet<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-4' style=';'><div id='questionWrap-4'  class='   watupro-question-id-426152'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>4. <\/span>An AI inferencing server, using NVIDIA Triton Inference Server, experiences intermittent crashes under peak load. The logs reveal CUDA out-of-memory errors (OOM) despite sufficient system RAM. You suspect a GPU memory leak within one of the models. <br \/>\r<br>Which strategy BEST addresses this issue?<\/div><input type='hidden' name='question_id[]' id='qID_4' value='426152' \/><input type='hidden' id='answerType426152' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426152[]' id='answer-id-1649926' class='answer   answerof-426152 ' value='1649926'   \/><label for='answer-id-1649926' id='answer-label-1649926' class=' answer'><span>Increase the system RAM to accommodate the growing memory footprint.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426152[]' id='answer-id-1649927' class='answer   answerof-426152 ' value='1649927'   \/><label for='answer-id-1649927' id='answer-label-1649927' class=' answer'><span>Implement CUDA memory pooling within the Triton Inference Server configuration to reuse memory allocations efficiently.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426152[]' 
id='answer-id-1649928' class='answer   answerof-426152 ' value='1649928'   \/><label for='answer-id-1649928' id='answer-label-1649928' class=' answer'><span>Reduce the batch size and concurrency of the offending model in the Triton configuration.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426152[]' id='answer-id-1649929' class='answer   answerof-426152 ' value='1649929'   \/><label for='answer-id-1649929' id='answer-label-1649929' class=' answer'><span>Upgrade the GPUs to models with larger memory capacity.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426152[]' id='answer-id-1649930' class='answer   answerof-426152 ' value='1649930'   \/><label for='answer-id-1649930' id='answer-label-1649930' class=' answer'><span>Disable other models running on the same GPU to free up memory.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-5' style=';'><div id='questionWrap-5'  class='   watupro-question-id-426153'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>5. 
<\/span>When setting up a multi-server, multi-GPU environment using NVLink switches, what is the primary consideration when planning the network topology for optimal performance?<\/div><input type='hidden' name='question_id[]' id='qID_5' value='426153' \/><input type='hidden' id='answerType426153' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426153[]' id='answer-id-1649931' class='answer   answerof-426153 ' value='1649931'   \/><label for='answer-id-1649931' id='answer-label-1649931' class=' answer'><span>Minimizing the number of hops between GPUs that need to communicate frequently.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426153[]' id='answer-id-1649932' class='answer   answerof-426153 ' value='1649932'   \/><label for='answer-id-1649932' id='answer-label-1649932' class=' answer'><span>Maximizing the distance between servers to improve cooling.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426153[]' id='answer-id-1649933' class='answer   answerof-426153 ' value='1649933'   \/><label for='answer-id-1649933' id='answer-label-1649933' class=' answer'><span>Using a star topology for simplified management.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426153[]' id='answer-id-1649934' class='answer   answerof-426153 ' value='1649934'   \/><label for='answer-id-1649934' id='answer-label-1649934' class=' answer'><span>Ensuring all servers are on the same subnet for ease of configuration.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426153[]' id='answer-id-1649935' class='answer   answerof-426153 ' value='1649935'   \/><label for='answer-id-1649935' id='answer-label-1649935' class=' 
answer'><span>Placing servers near the network\u2019s edge to reduce latency.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-6' style=';'><div id='questionWrap-6'  class='   watupro-question-id-426154'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>6. <\/span>Consider a scenario where you are using NCCL (NVIDIA Collective Communications Library) for multi-GPU training across multiple servers connected via NVLink switches. <br \/>\r<br>Which NCCL environment variable would you use to specify the network interface to be used for communication?<\/div><input type='hidden' name='question_id[]' id='qID_6' value='426154' \/><input type='hidden' id='answerType426154' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426154[]' id='answer-id-1649936' class='answer   answerof-426154 ' value='1649936'   \/><label for='answer-id-1649936' id='answer-label-1649936' class=' answer'><span>NCCL_PORT<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426154[]' id='answer-id-1649937' class='answer   answerof-426154 ' value='1649937'   \/><label for='answer-id-1649937' id='answer-label-1649937' class=' answer'><span>NCCL_SOCKET_IFNAME<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426154[]' id='answer-id-1649938' class='answer   answerof-426154 ' value='1649938'   \/><label for='answer-id-1649938' id='answer-label-1649938' class=' answer'><span>NCCL_NET_INTERFACE<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426154[]' id='answer-id-1649939' class='answer   answerof-426154 ' value='1649939'   \/><label for='answer-id-1649939' id='answer-label-1649939' class=' 
answer'><span>NCCL_IB_HCA<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426154[]' id='answer-id-1649940' class='answer   answerof-426154 ' value='1649940'   \/><label for='answer-id-1649940' id='answer-label-1649940' class=' answer'><span>NCCL_COMM_ID<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-7' style=';'><div id='questionWrap-7'  class='   watupro-question-id-426155'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>7. <\/span>You are tasked with setting up network fabric ports to connect several servers, each with multiple NVIDIA GPUs, to an InfiniBand switch. Each server has two ConnectX-6 adapters. <br \/>\r<br>What is the best strategy to maximize bandwidth and redundancy between the servers and the InfiniBand fabric?<\/div><input type='hidden' name='question_id[]' id='qID_7' value='426155' \/><input type='hidden' id='answerType426155' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426155[]' id='answer-id-1649941' class='answer   answerof-426155 ' value='1649941'   \/><label for='answer-id-1649941' id='answer-label-1649941' class=' answer'><span>Connect only one adapter from each server to the switch to minimize cable clutter.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426155[]' id='answer-id-1649942' class='answer   answerof-426155 ' value='1649942'   \/><label for='answer-id-1649942' id='answer-label-1649942' class=' answer'><span>Connect both adapters from each server to the same switch, but do not configure link aggregation.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426155[]' id='answer-id-1649943' class='answer   
answerof-426155 ' value='1649943'   \/><label for='answer-id-1649943' id='answer-label-1649943' class=' answer'><span>Connect both adapters from each server to the same switch and configure link aggregation (LACP or static LAG) on both the server and the switch.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426155[]' id='answer-id-1649944' class='answer   answerof-426155 ' value='1649944'   \/><label for='answer-id-1649944' id='answer-label-1649944' class=' answer'><span>Connect one adapter from each server to one switch, and the second adapter to a different switch, without link aggregation.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426155[]' id='answer-id-1649945' class='answer   answerof-426155 ' value='1649945'   \/><label for='answer-id-1649945' id='answer-label-1649945' class=' answer'><span>Connect one adapter from each server to one switch, and the second adapter to a different switch, and configure multi-pathing on the servers.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-8' style=';'><div id='questionWrap-8'  class='   watupro-question-id-426156'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>8. 
<\/span>Given the following \u2018nvswitch-cli\u2019 output, what does the \u2018Link Speed\u2019 indicate, and what potential bottleneck might a low \u2018Link Speed\u2019 suggest?<\/div><input type='hidden' name='question_id[]' id='qID_8' value='426156' \/><input type='hidden' id='answerType426156' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426156[]' id='answer-id-1649946' class='answer   answerof-426156 ' value='1649946'   \/><label for='answer-id-1649946' id='answer-label-1649946' class=' answer'><span>It indicates the effective bandwidth of the NVLink connection; a low value suggests a potential cable issue or misconfiguration.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426156[]' id='answer-id-1649947' class='answer   answerof-426156 ' value='1649947'   \/><label for='answer-id-1649947' id='answer-label-1649947' class=' answer'><span>It indicates the clock speed of the GPU memory; a low value suggests a memory bottleneck.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426156[]' id='answer-id-1649948' class='answer   answerof-426156 ' value='1649948'   \/><label for='answer-id-1649948' id='answer-label-1649948' class=' answer'><span>It indicates the PCIe generation supported by the GPU; a low value suggests an outdated GPU.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426156[]' id='answer-id-1649949' class='answer   answerof-426156 ' value='1649949'   \/><label for='answer-id-1649949' id='answer-label-1649949' class=' answer'><span>It indicates the NVLink protocol version; a low value suggests firmware incompatibility.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426156[]' 
id='answer-id-1649950' class='answer   answerof-426156 ' value='1649950'   \/><label for='answer-id-1649950' id='answer-label-1649950' class=' answer'><span>It indicates the power consumption of the NVLink switch; a high value suggests overheating issues.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-9' style=';'><div id='questionWrap-9'  class='   watupro-question-id-426157'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>9. <\/span>You\u2019re optimizing an Intel Xeon server with 4 NVIDIA GPUs for inference serving using Triton Inference Server. You\u2019ve deployed multiple models concurrently. You observe that the overall throughput is lower than expected, and the GPU utilization is not consistently high. <br \/>\r<br>What are potential bottlenecks and optimization strategies? (Select all that apply)<\/div><input type='hidden' name='question_id[]' id='qID_9' value='426157' \/><input type='hidden' id='answerType426157' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426157[]' id='answer-id-1649951' class='answer   answerof-426157 ' value='1649951'   \/><label for='answer-id-1649951' id='answer-label-1649951' class=' answer'><span>Model loading and unloading overhead. Use model ensemble or dynamic batching to reduce frequency.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426157[]' id='answer-id-1649952' class='answer   answerof-426157 ' value='1649952'   \/><label for='answer-id-1649952' id='answer-label-1649952' class=' answer'><span>Insufficient CPU cores to handle the model loading and preprocessing requests. 
Increase the number of Triton instance groups for CPU-based models.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426157[]' id='answer-id-1649953' class='answer   answerof-426157 ' value='1649953'   \/><label for='answer-id-1649953' id='answer-label-1649953' class=' answer'><span>The models are memory-bound. Reduce the model precision (e.g., FP32 to FP16 or INT8).<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426157[]' id='answer-id-1649954' class='answer   answerof-426157 ' value='1649954'   \/><label for='answer-id-1649954' id='answer-label-1649954' class=' answer'><span>The GPUs are underutilized due to small batch sizes. Implement dynamic batching to increase batch sizes.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426157[]' id='answer-id-1649955' class='answer   answerof-426157 ' value='1649955'   \/><label for='answer-id-1649955' id='answer-label-1649955' class=' answer'><span>Insufficient PCIe bandwidth between CPU and GPUs. Reconfigure PCIe lanes to improve bandwidth allocation to each GPU.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-10' style=';'><div id='questionWrap-10'  class='   watupro-question-id-426158'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>10. <\/span>A distributed training job using multiple nodes, each with eight NVIDIA GPUs, experiences significant performance degradation. You notice that the network bandwidth between nodes is consistently near its maximum capacity. However, \u2018nvidia-smi\u2019 shows low GPU utilization on some nodes. 
<br \/>\r<br>What is the MOST likely cause?<\/div><input type='hidden' name='question_id[]' id='qID_10' value='426158' \/><input type='hidden' id='answerType426158' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426158[]' id='answer-id-1649956' class='answer   answerof-426158 ' value='1649956'   \/><label for='answer-id-1649956' id='answer-label-1649956' class=' answer'><span>The GPUs are overheating, causing thermal throttling.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426158[]' id='answer-id-1649957' class='answer   answerof-426158 ' value='1649957'   \/><label for='answer-id-1649957' id='answer-label-1649957' class=' answer'><span>Data is not being distributed evenly across the nodes; some nodes are waiting for data from others.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426158[]' id='answer-id-1649958' class='answer   answerof-426158 ' value='1649958'   \/><label for='answer-id-1649958' id='answer-label-1649958' class=' answer'><span>The NVIDIA drivers are outdated, causing communication bottlenecks.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426158[]' id='answer-id-1649959' class='answer   answerof-426158 ' value='1649959'   \/><label for='answer-id-1649959' id='answer-label-1649959' class=' answer'><span>The network interface cards (NICs) are faulty, causing packet loss and retransmissions.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426158[]' id='answer-id-1649960' class='answer   answerof-426158 ' value='1649960'   \/><label for='answer-id-1649960' id='answer-label-1649960' class=' answer'><span>The CPU is heavily loaded, causing contention for network 
resources.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-11' style=';'><div id='questionWrap-11'  class='   watupro-question-id-426159'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>11. <\/span>You are tasked with optimizing storage performance for a deep learning training job on an NVIDIA DGX server. The training data consists of millions of small image files. <br \/>\r<br>Which of the following storage optimization techniques would be MOST effective in reducing I\/O bottlenecks?<\/div><input type='hidden' name='question_id[]' id='qID_11' value='426159' \/><input type='hidden' id='answerType426159' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426159[]' id='answer-id-1649961' class='answer   answerof-426159 ' value='1649961'   \/><label for='answer-id-1649961' id='answer-label-1649961' class=' answer'><span>Implementing RAID 0 across all storage devices.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426159[]' id='answer-id-1649962' class='answer   answerof-426159 ' value='1649962'   \/><label for='answer-id-1649962' id='answer-label-1649962' class=' answer'><span>Using a distributed file system with data striping across multiple storage nodes.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426159[]' id='answer-id-1649963' class='answer   answerof-426159 ' value='1649963'   \/><label for='answer-id-1649963' id='answer-label-1649963' class=' answer'><span>Enabling data compression on the storage volume.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426159[]' id='answer-id-1649964' class='answer   answerof-426159 ' value='1649964'   
\/><label for='answer-id-1649964' id='answer-label-1649964' class=' answer'><span>Increasing the block size of the file system to the maximum supported value.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426159[]' id='answer-id-1649965' class='answer   answerof-426159 ' value='1649965'   \/><label for='answer-id-1649965' id='answer-label-1649965' class=' answer'><span>Implementing a tiered storage system with NVMe drives for frequently accessed data and HDDs for less frequently accessed data.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-12' style=';'><div id='questionWrap-12'  class='   watupro-question-id-426160'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>12. <\/span>Which of the following statements regarding VXLAN (Virtual Extensible LAN) is MOST accurate in the context of data center networking for AI\/ML workloads?<\/div><input type='hidden' name='question_id[]' id='qID_12' value='426160' \/><input type='hidden' id='answerType426160' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426160[]' id='answer-id-1649966' class='answer   answerof-426160 ' value='1649966'   \/><label for='answer-id-1649966' id='answer-label-1649966' class=' answer'><span>VXLAN provides Layer 2 connectivity across Layer 3 networks, enabling virtual machine mobility.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426160[]' id='answer-id-1649967' class='answer   answerof-426160 ' value='1649967'   \/><label for='answer-id-1649967' id='answer-label-1649967' class=' answer'><span>VXLAN primarily improves network security by encrypting all traffic.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' 
><input type='radio' name='answer-426160[]' id='answer-id-1649968' class='answer   answerof-426160 ' value='1649968'   \/><label for='answer-id-1649968' id='answer-label-1649968' class=' answer'><span>VXLAN is only suitable for small-scale networks due to its limited scalability.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426160[]' id='answer-id-1649969' class='answer   answerof-426160 ' value='1649969'   \/><label for='answer-id-1649969' id='answer-label-1649969' class=' answer'><span>VXLAN reduces network overhead compared to traditional VLANs.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426160[]' id='answer-id-1649970' class='answer   answerof-426160 ' value='1649970'   \/><label for='answer-id-1649970' id='answer-label-1649970' class=' answer'><span>VXLAN requires specialized hardware and cannot be implemented in software.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-13' style=';'><div id='questionWrap-13'  class='   watupro-question-id-426161'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>13. <\/span>A GPU in your AI server consistently overheats during inference workloads. You\u2019ve ruled out inadequate cooling and software bugs. <br \/>\r<br>Running \u2018nvidia-smi\u2019 shows high power draw even when idle. 
<br \/>\r<br>Which of the following hardware issues are the most likely causes?<\/div><input type='hidden' name='question_id[]' id='qID_13' value='426161' \/><input type='hidden' id='answerType426161' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426161[]' id='answer-id-1649971' class='answer   answerof-426161 ' value='1649971'   \/><label for='answer-id-1649971' id='answer-label-1649971' class=' answer'><span>Degraded thermal paste between the GPU die and the heatsink.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426161[]' id='answer-id-1649972' class='answer   answerof-426161 ' value='1649972'   \/><label for='answer-id-1649972' id='answer-label-1649972' class=' answer'><span>A failing voltage regulator module (VRM) on the GPU board, causing excessive power leakage.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426161[]' id='answer-id-1649973' class='answer   answerof-426161 ' value='1649973'   \/><label for='answer-id-1649973' id='answer-label-1649973' class=' answer'><span>Incorrectly seated GPU in the PCIe slot, leading to poor power delivery.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426161[]' id='answer-id-1649974' class='answer   answerof-426161 ' value='1649974'   \/><label for='answer-id-1649974' id='answer-label-1649974' class=' answer'><span>A BIOS setting that is overvolting the GPU.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426161[]' id='answer-id-1649975' class='answer   answerof-426161 ' value='1649975'   \/><label for='answer-id-1649975' id='answer-label-1649975' class=' answer'><span>Insufficient system RAM.<\/span><\/label><\/div><!-- end 
question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-14' style=';'><div id='questionWrap-14'  class='   watupro-question-id-426162'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>14. <\/span>You are configuring a Mellanox InfiniBand network for a DGX A100 cluster. <br \/>\r<br>What is the RECOMMENDED subnet manager for a large, high-performance AI training environment, and why?<\/div><input type='hidden' name='question_id[]' id='qID_14' value='426162' \/><input type='hidden' id='answerType426162' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426162[]' id='answer-id-1649976' class='answer   answerof-426162 ' value='1649976'   \/><label for='answer-id-1649976' id='answer-label-1649976' class=' answer'><span>OpenSM, because it\u2019s the default and easiest to configure.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426162[]' id='answer-id-1649977' class='answer   answerof-426162 ' value='1649977'   \/><label for='answer-id-1649977' id='answer-label-1649977' class=' answer'><span>UFM (Unified Fabric Manager), because it provides advanced management, monitoring, and optimization capabilities.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426162[]' id='answer-id-1649978' class='answer   answerof-426162 ' value='1649978'   \/><label for='answer-id-1649978' id='answer-label-1649978' class=' answer'><span>IBA management tools that ship with the OS (e.g., \u2018ibnetdiscover\u2019).<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426162[]' id='answer-id-1649979' class='answer   answerof-426162 ' value='1649979'   \/><label for='answer-id-1649979' id='answer-label-1649979' class=' 
answer'><span>Any subnet manager; the performance difference is negligible.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426162[]' id='answer-id-1649980' class='answer   answerof-426162 ' value='1649980'   \/><label for='answer-id-1649980' id='answer-label-1649980' class=' answer'><span>A custom-built subnet manager using the InfiniBand verbs API.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-15' style=';'><div id='questionWrap-15'  class='   watupro-question-id-426163'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>15. <\/span>You are troubleshooting a performance issue on an Intel Xeon server with NVIDIA A100 GPUs. Your application involves frequent data transfers between CPU memory and GPU memory. You suspect that the PCIe bus is a bottleneck. <br \/>\r<br>How can you verify and mitigate this bottleneck?<\/div><input type='hidden' name='question_id[]' id='qID_15' value='426163' \/><input type='hidden' id='answerType426163' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426163[]' id='answer-id-1649981' class='answer   answerof-426163 ' value='1649981'   \/><label for='answer-id-1649981' id='answer-label-1649981' class=' answer'><span>Use \u2018nvidia-smi\u2019 to monitor the PCIe bandwidth utilization of the GPUs. If it\u2019s consistently high (near the theoretical limit), the PCIe bus is likely a bottleneck. 
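Note: a minimal Python sketch of the bandwidth arithmetic behind such a check. The query fields mirror nvidia-smi's CSV-style reporting of the current PCIe link, but the sample row is illustrative and the per-lane figures are approximate:

```python
# Sketch: estimate the PCIe ceiling from a reported link generation and
# width, so measured transfer throughput can be compared against it.
# The sample CSV row is illustrative, not captured from a real system.

def parse_pcie_link(csv_line: str) -> tuple[int, int]:
    """Parse 'gen, width' from a CSV data row such as '4, 16'."""
    gen, width = (field.strip() for field in csv_line.split(","))
    return int(gen), int(width)

def max_bandwidth_gbs(gen: int, width: int) -> float:
    """Approximate per-direction PCIe bandwidth in GB/s."""
    per_lane = {3: 0.985, 4: 1.969, 5: 3.938}  # GB/s per lane (approx.)
    return per_lane[gen] * width

gen, width = parse_pcie_link("4, 16")
ceiling = max_bandwidth_gbs(gen, width)  # ~31.5 GB/s for Gen4 x16
```

If sustained host-device copy rates sit near this ceiling, the bus, not the GPU, is the limiting resource.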
Mitigate by reducing the frequency of CPU-GPU data transfers, using pinned (page-locked) memory, and ensuring that the GPUs are connected to PCIe slots with sufficient bandwidth.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426163[]' id='answer-id-1649982' class='answer   answerof-426163 ' value='1649982'   \/><label for='answer-id-1649982' id='answer-label-1649982' class=' answer'><span>Check the CPU utilization. If it\u2019s low, the PCIe bus is likely the bottleneck. Mitigate by increasing the number of CPU cores assigned to the data transfer tasks.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426163[]' id='answer-id-1649983' class='answer   answerof-426163 ' value='1649983'   \/><label for='answer-id-1649983' id='answer-label-1649983' class=' answer'><span>Examine the system logs for PCIe errors. If there are many errors, the PCIe bus is likely unstable. Mitigate by reseating the GPUs and checking the power supply.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426163[]' id='answer-id-1649984' class='answer   answerof-426163 ' value='1649984'   \/><label for='answer-id-1649984' id='answer-label-1649984' class=' answer'><span>Monitor the GPU temperature. If it\u2019s high, the PCIe bus is likely overheating. Mitigate by improving the server\u2019s cooling.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426163[]' id='answer-id-1649985' class='answer   answerof-426163 ' value='1649985'   \/><label for='answer-id-1649985' id='answer-label-1649985' class=' answer'><span>Use \u2018nvprof\u2019 to profile the application and identify the exact lines of code that are causing the high PCIe traffic. 
Optimize those sections of code to reduce data transfers.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-16' style=';'><div id='questionWrap-16'  class='   watupro-question-id-426164'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>16. <\/span>You are managing a cluster of GPU servers for deep learning. You observe that one server consistently exhibits high GPU temperature during training, causing thermal throttling and reduced performance. You\u2019ve already ensured adequate airflow. <br \/>\r<br>Which of the following actions would be MOST effective in addressing this issue?<\/div><input type='hidden' name='question_id[]' id='qID_16' value='426164' \/><input type='hidden' id='answerType426164' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426164[]' id='answer-id-1649986' class='answer   answerof-426164 ' value='1649986'   \/><label for='answer-id-1649986' id='answer-label-1649986' class=' answer'><span>Reduce the ambient temperature of the data center.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426164[]' id='answer-id-1649987' class='answer   answerof-426164 ' value='1649987'   \/><label for='answer-id-1649987' id='answer-label-1649987' class=' answer'><span>Lower the GPU power limit using \u2018nvidia-smi --power-limit\u2019.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426164[]' id='answer-id-1649988' class='answer   answerof-426164 ' value='1649988'   \/><label for='answer-id-1649988' id='answer-label-1649988' class=' answer'><span>Update the NVIDIA drivers to the latest version.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' 
name='answer-426164[]' id='answer-id-1649989' class='answer   answerof-426164 ' value='1649989'   \/><label for='answer-id-1649989' id='answer-label-1649989' class=' answer'><span>Re-seat the GPU in its PCIe slot to ensure proper contact and heat dissipation.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426164[]' id='answer-id-1649990' class='answer   answerof-426164 ' value='1649990'   \/><label for='answer-id-1649990' id='answer-label-1649990' class=' answer'><span>Increase the fan speed of the GPU cooler using \u2018nvidia-smi --fan\u2019.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-17' style=';'><div id='questionWrap-17'  class='   watupro-question-id-426165'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>17. <\/span>After upgrading the network card drivers on your AI inference server, you experience intermittent network connectivity issues, including packet loss and high latency. You\u2019ve verified that the physical connections are secure. 
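Note: a small Python sketch of how figures from such diagnostics (ping-style counters and latency samples) could be summarized for a before/after driver comparison; the counters and samples are illustrative:

```python
# Sketch: reduce ping-style statistics to loss percentage and latency
# summary, to compare a driver version against its predecessor.
# All numbers below are illustrative.

def packet_loss_pct(sent: int, received: int) -> float:
    """Packet loss as a percentage of packets sent."""
    return 100.0 * (sent - received) / sent

def summarize(samples_ms: list[float]) -> dict:
    """Min/avg/max of round-trip-time samples in milliseconds."""
    return {
        "min": min(samples_ms),
        "avg": sum(samples_ms) / len(samples_ms),
        "max": max(samples_ms),
    }

loss = packet_loss_pct(sent=100, received=93)      # 7.0 % loss
stats = summarize([0.4, 0.5, 12.0, 0.6])           # one outlier: jitter
```

A large max/min spread with nonzero loss after the upgrade, absent before, points back at the driver rather than cabling.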
<br \/>\r<br>Which of the following steps would be most effective in troubleshooting this issue?<\/div><input type='hidden' name='question_id[]' id='qID_17' value='426165' \/><input type='hidden' id='answerType426165' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426165[]' id='answer-id-1649991' class='answer   answerof-426165 ' value='1649991'   \/><label for='answer-id-1649991' id='answer-label-1649991' class=' answer'><span>Roll back the network card drivers to the previous version.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426165[]' id='answer-id-1649992' class='answer   answerof-426165 ' value='1649992'   \/><label for='answer-id-1649992' id='answer-label-1649992' class=' answer'><span>Check the system logs for error messages related to the network card or driver.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426165[]' id='answer-id-1649993' class='answer   answerof-426165 ' value='1649993'   \/><label for='answer-id-1649993' id='answer-label-1649993' class=' answer'><span>Run network diagnostic tools like \u2018ping\u2019, \u2018traceroute\u2019, and \u2018iperf3\u2019 to assess the network performance.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426165[]' id='answer-id-1649994' class='answer   answerof-426165 ' value='1649994'   \/><label for='answer-id-1649994' id='answer-label-1649994' class=' answer'><span>Reinstall the operating system.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426165[]' id='answer-id-1649995' class='answer   answerof-426165 ' value='1649995'   \/><label for='answer-id-1649995' id='answer-label-1649995' class=' answer'><span>Update 
the server\u2019s BIOS.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-18' style=';'><div id='questionWrap-18'  class='   watupro-question-id-426166'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>18. <\/span>You are deploying a multi-tenant AI infrastructure with strict isolation requirements. <br \/>\r<br>Which network technology would be most suitable for creating isolated virtual networks for each tenant?<\/div><input type='hidden' name='question_id[]' id='qID_18' value='426166' \/><input type='hidden' id='answerType426166' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426166[]' id='answer-id-1649996' class='answer   answerof-426166 ' value='1649996'   \/><label for='answer-id-1649996' id='answer-label-1649996' class=' answer'><span>VLANs (Virtual LANs)<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426166[]' id='answer-id-1649997' class='answer   answerof-426166 ' value='1649997'   \/><label for='answer-id-1649997' id='answer-label-1649997' class=' answer'><span>VXLAN (Virtual Extensible LAN)<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426166[]' id='answer-id-1649998' class='answer   answerof-426166 ' value='1649998'   \/><label for='answer-id-1649998' id='answer-label-1649998' class=' answer'><span>QinQ (802.1ad)<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426166[]' id='answer-id-1649999' class='answer   answerof-426166 ' value='1649999'   \/><label for='answer-id-1649999' id='answer-label-1649999' class=' answer'><span>GRE (Generic Routing Encapsulation)<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426166[]' id='answer-id-1650000' class='answer   answerof-426166 ' value='1650000'   \/><label for='answer-id-1650000' id='answer-label-1650000' class=' answer'><span>IPsec<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-19' style=';'><div id='questionWrap-19'  class='   watupro-question-id-426167'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>19. <\/span>You\u2019re profiling the performance of a PyTorch model running on an AMD server with multiple NVIDIA GPUs. You notice significant overhead in the data loading pipeline. <br \/>\r<br>Which of the following strategies can help optimize data loading and improve GPU utilization? 
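Note: the worker-based loading idea (which PyTorch's DataLoader implements with multiple processes) can be sketched framework-independently: a background thread fills a bounded queue so batch preparation overlaps with consumption. Batch contents here are dummy values:

```python
# Sketch: overlap batch preparation with consumption using a
# background producer thread and a bounded queue, the core idea
# behind multi-worker data loaders.
import queue
import threading

def prefetch(batches, depth: int = 2):
    """Yield batches while a background thread keeps `depth` ready."""
    q: queue.Queue = queue.Queue(maxsize=depth)
    stop = object()  # sentinel marking the end of the stream

    def worker():
        for b in batches:
            q.put(b)          # blocks when the buffer is full
        q.put(stop)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = q.get()
        if item is stop:
            return
        yield item

batches = list(prefetch(range(5)))  # consumes 0..4 in order
```

The bounded queue caps memory use; deeper buffers hide more preparation latency at the cost of RAM.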
Select all that apply.<\/div><input type='hidden' name='question_id[]' id='qID_19' value='426167' \/><input type='hidden' id='answerType426167' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426167[]' id='answer-id-1650001' class='answer   answerof-426167 ' value='1650001'   \/><label for='answer-id-1650001' id='answer-label-1650001' class=' answer'><span>Using the \u2018torch.utils.data.DataLoader\u2019 with multiple worker processes.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426167[]' id='answer-id-1650002' class='answer   answerof-426167 ' value='1650002'   \/><label for='answer-id-1650002' id='answer-label-1650002' class=' answer'><span>Loading the entire dataset into RAM before training.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426167[]' id='answer-id-1650003' class='answer   answerof-426167 ' value='1650003'   \/><label for='answer-id-1650003' id='answer-label-1650003' class=' answer'><span>Implementing asynchronous data prefetching using \u2018torch.Generator\u2019.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426167[]' id='answer-id-1650004' class='answer   answerof-426167 ' value='1650004'   \/><label for='answer-id-1650004' id='answer-label-1650004' class=' answer'><span>Using a faster storage system (e.g., NVMe SSD instead of HDD).<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426167[]' id='answer-id-1650005' class='answer   answerof-426167 ' value='1650005'   \/><label for='answer-id-1650005' id='answer-label-1650005' class=' answer'><span>Reducing the batch size to decrease the amount of data loaded per iteration.<\/span><\/label><\/div><!-- end 
question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-20' style=';'><div id='questionWrap-20'  class='   watupro-question-id-426168'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>20. <\/span>You are setting up network fabric ports for hosts in an NVIDIA-Certified Professional AI Infrastructure (NCP-AII) environment. You need to configure Jumbo Frames to improve network throughput. <br \/>\r<br>What is the typical MTU (Maximum Transmission Unit) size you would set on the network interfaces and switches, and why?<\/div><input type='hidden' name='question_id[]' id='qID_20' value='426168' \/><input type='hidden' id='answerType426168' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426168[]' id='answer-id-1650006' class='answer   answerof-426168 ' value='1650006'   \/><label for='answer-id-1650006' id='answer-label-1650006' class=' answer'><span>1500 bytes, as it\u2019s the default and compatible with most networks.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426168[]' id='answer-id-1650007' class='answer   answerof-426168 ' value='1650007'   \/><label for='answer-id-1650007' id='answer-label-1650007' class=' answer'><span>9000 bytes, also known as Jumbo Frames, reduces overhead and improves throughput for large data transfers.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426168[]' id='answer-id-1650008' class='answer   answerof-426168 ' value='1650008'   \/><label for='answer-id-1650008' id='answer-label-1650008' class=' answer'><span>65535 bytes, the theoretical maximum MTU size, for maximum performance.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426168[]' 
id='answer-id-1650009' class='answer   answerof-426168 ' value='1650009'   \/><label for='answer-id-1650009' id='answer-label-1650009' class=' answer'><span>576 bytes, the minimum MTU size required by IPv4.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426168[]' id='answer-id-1650010' class='answer   answerof-426168 ' value='1650010'   \/><label for='answer-id-1650010' id='answer-label-1650010' class=' answer'><span>Any MTU size between 1500 and 9000 bytes; the specific value doesn\u2019t matter.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-21' style=';'><div id='questionWrap-21'  class='   watupro-question-id-426169'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>21. <\/span>You are implementing a distributed deep learning training setup using multiple servers connected via NVLink switches. You want to ensure optimal utilization of the NVLink interconnect. 
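Note: a hedged Python sketch of environment settings commonly used to steer NCCL toward GPUDirect RDMA across nodes. The variable names are documented NCCL options, but the right values depend on the actual fabric and topology:

```python
# Sketch: NCCL environment variables often tuned for multi-node
# GPU communication over an RDMA-capable fabric. Values here are
# illustrative defaults; verify them against the fabric in use.
import os

nccl_env = {
    "NCCL_IB_DISABLE": "0",       # keep the InfiniBand transport enabled
    "NCCL_NET_GDR_LEVEL": "PHB",  # permit GPUDirect RDMA where topology allows
    "NCCL_P2P_DISABLE": "0",      # keep peer-to-peer GPU access within a node
}
os.environ.update(nccl_env)  # must be set before NCCL initializes
```

Setting these before the first collective runs lets NCCL pick the RDMA path instead of falling back to TCP sockets.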
<br \/>\r<br>Which of the following strategies would be MOST effective in achieving this goal?<\/div><input type='hidden' name='question_id[]' id='qID_21' value='426169' \/><input type='hidden' id='answerType426169' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426169[]' id='answer-id-1650011' class='answer   answerof-426169 ' value='1650011'   \/><label for='answer-id-1650011' id='answer-label-1650011' class=' answer'><span>Configure NCCL to use GPUDirect RDMA for inter-GPU communication across servers.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426169[]' id='answer-id-1650012' class='answer   answerof-426169 ' value='1650012'   \/><label for='answer-id-1650012' id='answer-label-1650012' class=' answer'><span>Use a standard TCP\/IP socket connection for inter-GPU communication across servers.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426169[]' id='answer-id-1650013' class='answer   answerof-426169 ' value='1650013'   \/><label for='answer-id-1650013' id='answer-label-1650013' class=' answer'><span>Implement a data compression algorithm that can be processed by the CPU before sending data over NVLink.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426169[]' id='answer-id-1650014' class='answer   answerof-426169 ' value='1650014'   \/><label for='answer-id-1650014' id='answer-label-1650014' class=' answer'><span>Disable peer-to-peer GPU memory access within each server to avoid contention.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426169[]' id='answer-id-1650015' class='answer   answerof-426169 ' value='1650015'   \/><label for='answer-id-1650015' 
id='answer-label-1650015' class=' answer'><span>Increase the batch size to reduce the frequency of inter-GPU communication.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-22' style=';'><div id='questionWrap-22'  class='   watupro-question-id-426170'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>22. <\/span>You are managing a server farm of GPU servers used for AI model training. You observe frequent GPU failures across different servers. <br \/>\r<br>Analysis reveals that the failures often occur during periods of peak ambient temperature in the data center. You can\u2019t immediately improve the data center cooling. <br \/>\r<br>What are TWO proactive measures you can implement to mitigate these failures without significantly impacting training performance?<\/div><input type='hidden' name='question_id[]' id='qID_22' value='426170' \/><input type='hidden' id='answerType426170' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426170[]' id='answer-id-1650016' class='answer   answerof-426170 ' value='1650016'   \/><label for='answer-id-1650016' id='answer-label-1650016' class=' answer'><span>Reduce the GPU power limit using \u2018nvidia-smi\u2019 to decrease heat generation.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426170[]' id='answer-id-1650017' class='answer   answerof-426170 ' value='1650017'   \/><label for='answer-id-1650017' id='answer-label-1650017' class=' answer'><span>Increase the fan speeds of the GPU coolers to improve heat dissipation.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426170[]' id='answer-id-1650018' class='answer   answerof-426170 ' value='1650018'   
\/><label for='answer-id-1650018' id='answer-label-1650018' class=' answer'><span>Implement a more aggressive GPU frequency scaling profile to throttle performance during peak temperatures.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426170[]' id='answer-id-1650019' class='answer   answerof-426170 ' value='1650019'   \/><label for='answer-id-1650019' id='answer-label-1650019' class=' answer'><span>Schedule training jobs to run during off-peak hours when ambient temperatures are lower.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426170[]' id='answer-id-1650020' class='answer   answerof-426170 ' value='1650020'   \/><label for='answer-id-1650020' id='answer-label-1650020' class=' answer'><span>Replace all existing GPUs with water-cooled models.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-23' style=';'><div id='questionWrap-23'  class='   watupro-question-id-426171'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>23. <\/span>You notice that one of the fans in your GPU server is running at a significantly higher RPM than the others, even under minimal load. \u2018ipmitool sensor\u2019 output shows a normal temperature for that GPU. 
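Note: a minimal Python sketch of flagging such a fan once 'ipmitool sensor' readings have been parsed into a name-to-RPM mapping; the readings and the 1.5x threshold are illustrative choices:

```python
# Sketch: flag fans whose RPM deviates sharply from their peers,
# as one might after parsing 'ipmitool sensor' output.
# Readings and the deviation factor are illustrative.

def outlier_fans(rpm: dict[str, float], factor: float = 1.5) -> list[str]:
    """Return fan names spinning more than `factor` times the median RPM."""
    median = sorted(rpm.values())[len(rpm) // 2]
    return [name for name, r in rpm.items() if r > factor * median]

readings = {"FAN1": 4200, "FAN2": 4300, "FAN3": 9800, "FAN4": 4250}
print(outlier_fans(readings))  # ['FAN3']
```

A fan pinned well above its peers while the adjacent temperature sensor reads normal points at the fan or its PWM control, not the workload.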
<br \/>\r<br>What could be the potential causes?<\/div><input type='hidden' name='question_id[]' id='qID_23' value='426171' \/><input type='hidden' id='answerType426171' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426171[]' id='answer-id-1650021' class='answer   answerof-426171 ' value='1650021'   \/><label for='answer-id-1650021' id='answer-label-1650021' class=' answer'><span>The fan\u2019s PWM control signal is malfunctioning, causing it to run at full speed.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426171[]' id='answer-id-1650022' class='answer   answerof-426171 ' value='1650022'   \/><label for='answer-id-1650022' id='answer-label-1650022' class=' answer'><span>The fan bearing is wearing out, causing increased friction and requiring higher RPM to maintain airflow.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426171[]' id='answer-id-1650023' class='answer   answerof-426171 ' value='1650023'   \/><label for='answer-id-1650023' id='answer-label-1650023' class=' answer'><span>The fan is attempting to compensate for restricted airflow due to dust buildup.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426171[]' id='answer-id-1650024' class='answer   answerof-426171 ' value='1650024'   \/><label for='answer-id-1650024' id='answer-label-1650024' class=' answer'><span>The server\u2019s BMC (Baseboard Management Controller) has a faulty temperature sensor reading, causing it to overcompensate.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426171[]' id='answer-id-1650025' class='answer   answerof-426171 ' value='1650025'   \/><label for='answer-id-1650025' 
id='answer-label-1650025' class=' answer'><span>A network connectivity issue is causing higher CPU utilization, leading to increased system-wide heat.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-24' style=';'><div id='questionWrap-24'  class='   watupro-question-id-426172'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>24. <\/span>You suspect a power supply issue is causing intermittent GPU failures in a server with four NVIDIA A100 GPUs. The server is rated for a peak power consumption of 3000W. You have a power meter available. <br \/>\r<br>Which of the following methods provides the most accurate assessment of the server\u2019s power consumption under full GPU load?<\/div><input type='hidden' name='question_id[]' id='qID_24' value='426172' \/><input type='hidden' id='answerType426172' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426172[]' id='answer-id-1650026' class='answer   answerof-426172 ' value='1650026'   \/><label for='answer-id-1650026' id='answer-label-1650026' class=' answer'><span>Run \u2018nvidia-smi\u2019 and sum the reported power consumption for each GPU.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426172[]' id='answer-id-1650027' class='answer   answerof-426172 ' value='1650027'   \/><label for='answer-id-1650027' id='answer-label-1650027' class=' answer'><span>Use the power meter to measure the server\u2019s power consumption at idle and multiply by four.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426172[]' id='answer-id-1650028' class='answer   answerof-426172 ' value='1650028'   \/><label for='answer-id-1650028' id='answer-label-1650028' class=' answer'><span>Use the 
power meter to measure the server\u2019s power consumption while running a synthetic benchmark that fully utilizes all GPUs simultaneously.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426172[]' id='answer-id-1650029' class='answer   answerof-426172 ' value='1650029'   \/><label for='answer-id-1650029' id='answer-label-1650029' class=' answer'><span>Check the server\u2019s BIOS for power consumption readings.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426172[]' id='answer-id-1650030' class='answer   answerof-426172 ' value='1650030'   \/><label for='answer-id-1650030' id='answer-label-1650030' class=' answer'><span>Add the maximum power rating of each GPU to the CPU\u2019s TDP (Thermal Design Power).<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-25' style=';'><div id='questionWrap-25'  class='   watupro-question-id-426173'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>25. <\/span>You are configuring a server with multiple GPUs for CUDA-aware MPI. 
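Note: a minimal Python sketch of per-rank GPU pinning. OMPI_COMM_WORLD_LOCAL_RANK is Open MPI's local-rank variable; other launchers expose an equivalent, and the variable must be applied before CUDA initializes:

```python
# Sketch: pin each MPI rank on a node to its own GPU by setting
# CUDA_VISIBLE_DEVICES from the launcher-provided local rank.
# OMPI_COMM_WORLD_LOCAL_RANK is Open MPI's name for that rank.
import os

def bind_rank_to_gpu(n_gpus: int) -> str:
    """Map this process's local rank to one GPU index and export it."""
    local_rank = int(os.environ.get("OMPI_COMM_WORLD_LOCAL_RANK", "0"))
    device = str(local_rank % n_gpus)
    os.environ["CUDA_VISIBLE_DEVICES"] = device
    return device

chosen = bind_rank_to_gpu(8)  # each rank now sees exactly one device
```

Because each process sees a single device, CUDA code can simply use device 0, and ranks never contend for the same GPU.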
<br \/>\r<br>Which environment variable is critical for ensuring proper GPU affinity, so that each MPI process uses the correct GPU?<\/div><input type='hidden' name='question_id[]' id='qID_25' value='426173' \/><input type='hidden' id='answerType426173' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426173[]' id='answer-id-1650031' class='answer   answerof-426173 ' value='1650031'   \/><label for='answer-id-1650031' id='answer-label-1650031' class=' answer'><span>CUDA_VISIBLE_DEVICES<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426173[]' id='answer-id-1650032' class='answer   answerof-426173 ' value='1650032'   \/><label for='answer-id-1650032' id='answer-label-1650032' class=' answer'><span>CUDA_DEVICE_ORDER<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426173[]' id='answer-id-1650033' class='answer   answerof-426173 ' value='1650033'   \/><label for='answer-id-1650033' id='answer-label-1650033' class=' answer'><span>LD_LIBRARY_PATH<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426173[]' id='answer-id-1650034' class='answer   answerof-426173 ' value='1650034'   \/><label for='answer-id-1650034' id='answer-label-1650034' class=' answer'><span>MPI_GPU_SUPPORT<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426173[]' id='answer-id-1650035' class='answer   answerof-426173 ' value='1650035'   \/><label for='answer-id-1650035' id='answer-label-1650035' class=' answer'><span>CUDA_LAUNCH_BLOCKING=1<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-26' style=';'><div id='questionWrap-26'  class='   
watupro-question-id-426174'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>26. <\/span>Consider an AMD EPYC-based server with 8 NVIDIA A100 GPUs connected via PCIe Gen4. You\u2019re running a distributed training job using Horovod. You\u2019ve noticed that communication between GPUs is a bottleneck. <br \/>\r<br>Which of the following NCCL configuration options would be MOST beneficial in this scenario? (Assume all options are syntactically correct for NCCL).<\/div><input type='hidden' name='question_id[]' id='qID_26' value='426174' \/><input type='hidden' id='answerType426174' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426174[]' id='answer-id-1650036' class='answer   answerof-426174 ' value='1650036'   \/><label for='answer-id-1650036' id='answer-label-1650036' class=' answer'><span>NCCL_SOCKET_IFNAME=eth0<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426174[]' id='answer-id-1650037' class='answer   answerof-426174 ' value='1650037'   \/><label for='answer-id-1650037' id='answer-label-1650037' class=' answer'><span>NCCL_IB_DISABLE=1<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426174[]' id='answer-id-1650038' class='answer   answerof-426174 ' value='1650038'   \/><label for='answer-id-1650038' id='answer-label-1650038' class=' answer'><span>NCCL_P2P_DISABLE=0<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426174[]' id='answer-id-1650039' class='answer   answerof-426174 ' value='1650039'   \/><label for='answer-id-1650039' id='answer-label-1650039' class=' answer'><span>NCCL_IB_HCA=mlx5_0<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426174[]' 
id='answer-id-1650040' class='answer   answerof-426174 ' value='1650040'   \/><label for='answer-id-1650040' id='answer-label-1650040' class=' answer'><span>NCCL_NET_PLUGIN=none<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-27' style=';'><div id='questionWrap-27'  class='   watupro-question-id-426175'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>27. <\/span>You\u2019re optimizing an Intel Xeon server with 4 NVIDIA A100 GPUs for a computer vision application that uses CUDA. You notice that the GPU utilization is fluctuating significantly, and performance is inconsistent. Using \u2018nvprof\u2019, you identify that there are frequent stalls in the CUDA kernels due to thread divergence. <br \/>\r<br>What are possible causes and solutions?<\/div><input type='hidden' name='question_id[]' id='qID_27' value='426175' \/><input type='hidden' id='answerType426175' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426175[]' id='answer-id-1650041' class='answer   answerof-426175 ' value='1650041'   \/><label for='answer-id-1650041' id='answer-label-1650041' class=' answer'><span>The input data is not properly aligned in memory. Ensure that data is aligned to 128-byte boundaries using aligned memory allocation techniques.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426175[]' id='answer-id-1650042' class='answer   answerof-426175 ' value='1650042'   \/><label for='answer-id-1650042' id='answer-label-1650042' class=' answer'><span>The CUDA code contains conditional branches that lead to different execution paths for different threads within the same warp. 
Rewrite the CUDA code to minimize branching and favor uniform execution paths within warps.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426175[]' id='answer-id-1650043' class='answer   answerof-426175 ' value='1650043'   \/><label for='answer-id-1650043' id='answer-label-1650043' class=' answer'><span>The GPUs are overheating, causing thermal throttling. Improve the server\u2019s cooling.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426175[]' id='answer-id-1650044' class='answer   answerof-426175 ' value='1650044'   \/><label for='answer-id-1650044' id='answer-label-1650044' class=' answer'><span>The CUDA compiler is generating suboptimal code. Try using different compiler optimization flags (e.g., \u2018-O3\u2019 or \u2018-ftz=true\u2019).<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426175[]' id='answer-id-1650045' class='answer   answerof-426175 ' value='1650045'   \/><label for='answer-id-1650045' id='answer-label-1650045' class=' answer'><span>The CUDA driver version is incompatible with the CUDA toolkit version. Update the CUDA driver to a compatible version.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-28' style=';'><div id='questionWrap-28'  class='   watupro-question-id-426176'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>28. <\/span>Your AI training pipeline involves a pre-processing step that reads data from a large HDF5 file. You notice significant delays during this step. You suspect the HDF5 file structure might be contributing to the slow read times. 
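As background, an HDF5 dataset's chunk shape determines how many chunks a partial read must fetch, which is why chunking aligned with the access pattern speeds up reads. A pure-Python sketch of the chunk-count arithmetic (no h5py needed; the shapes below are made-up examples):

```python
from math import ceil

def chunks_touched(chunk_shape, read_shape):
    """Chunks an origin-aligned contiguous read must fetch:
    the product, per axis, of ceil(read_extent / chunk_extent)."""
    n = 1
    for read, chunk in zip(read_shape, chunk_shape):
        n *= ceil(read / chunk)
    return n

# Reading one 1 x 1024 row from a 10000 x 1024 dataset:
row_friendly = chunks_touched((1, 1024), (1, 1024))  # row-shaped chunks: 1 chunk read
col_chunked = chunks_touched((1024, 1), (1, 1024))   # column-shaped chunks: 1024 chunk reads
print(row_friendly, col_chunked)  # 1 1024
```

The second layout forces three orders of magnitude more chunk fetches for the same row read, which is the kind of mismatch that reorganizing a file's contiguity and chunking fixes.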
<br \/>\r<br>What optimization technique is MOST likely to improve read performance from this HDF5 file?<\/div><input type='hidden' name='question_id[]' id='qID_28' value='426176' \/><input type='hidden' id='answerType426176' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426176[]' id='answer-id-1650046' class='answer   answerof-426176 ' value='1650046'   \/><label for='answer-id-1650046' id='answer-label-1650046' class=' answer'><span>Converting the HDF5 file to a CSV file.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426176[]' id='answer-id-1650047' class='answer   answerof-426176 ' value='1650047'   \/><label for='answer-id-1650047' id='answer-label-1650047' class=' answer'><span>Storing the HDF5 file on a network file system like NFS.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426176[]' id='answer-id-1650048' class='answer   answerof-426176 ' value='1650048'   \/><label for='answer-id-1650048' id='answer-label-1650048' class=' answer'><span>Reorganizing the HDF5 file to improve data contiguity and chunking.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426176[]' id='answer-id-1650049' class='answer   answerof-426176 ' value='1650049'   \/><label for='answer-id-1650049' id='answer-label-1650049' class=' answer'><span>Compressing the HDF5 file using gzip.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426176[]' id='answer-id-1650050' class='answer   answerof-426176 ' value='1650050'   \/><label for='answer-id-1650050' id='answer-label-1650050' class=' answer'><span>Encrypting the HDF5 file for enhanced security.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end 
questionWrap--><\/div><\/div><div class='watu-question ' id='question-29' style=';'><div id='questionWrap-29'  class='   watupro-question-id-426177'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>29. <\/span>You have an Intel Xeon Gold server with 2 NVIDIA Tesla V100 GPUs. After deploying your AI application, you observe that one GPU is consistently running at a significantly higher temperature than the other. <br \/>\r<br>What could be a plausible reason for this behavior?<\/div><input type='hidden' name='question_id[]' id='qID_29' value='426177' \/><input type='hidden' id='answerType426177' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426177[]' id='answer-id-1650051' class='answer   answerof-426177 ' value='1650051'   \/><label for='answer-id-1650051' id='answer-label-1650051' class=' answer'><span>One GPU is defective and drawing excessive power.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426177[]' id='answer-id-1650052' class='answer   answerof-426177 ' value='1650052'   \/><label for='answer-id-1650052' id='answer-label-1650052' class=' answer'><span>The server\u2019s airflow is inadequate, causing poor cooling for one of the GPUs.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426177[]' id='answer-id-1650053' class='answer   answerof-426177 ' value='1650053'   \/><label for='answer-id-1650053' id='answer-label-1650053' class=' answer'><span>The workload is not evenly distributed between the GPUs, causing one GPU to be more heavily utilized.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426177[]' id='answer-id-1650054' class='answer   answerof-426177 ' value='1650054'   \/><label 
for='answer-id-1650054' id='answer-label-1650054' class=' answer'><span>One GPU\u2019s driver version is outdated, leading to inefficient power management.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426177[]' id='answer-id-1650055' class='answer   answerof-426177 ' value='1650055'   \/><label for='answer-id-1650055' id='answer-label-1650055' class=' answer'><span>The ambient temperature in the server room is higher on one side of the rack.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-30' style=';'><div id='questionWrap-30'  class='   watupro-question-id-426178'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>30. <\/span>Which of the following are valid methods for verifying the health and connectivity of InfiniBand links in an NCP-AII environment? (Select TWO)<\/div><input type='hidden' name='question_id[]' id='qID_30' value='426178' \/><input type='hidden' id='answerType426178' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426178[]' id='answer-id-1650056' class='answer   answerof-426178 ' value='1650056'   \/><label for='answer-id-1650056' id='answer-label-1650056' class=' answer'><span>Using \u2018ping\u2019 to test basic IP connectivity over the InfiniBand interface.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426178[]' id='answer-id-1650057' class='answer   answerof-426178 ' value='1650057'   \/><label for='answer-id-1650057' id='answer-label-1650057' class=' answer'><span>Using \u2018ibstat\u2019 to check the link state, physical state, and other relevant parameters of InfiniBand ports.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input 
type='checkbox' name='answer-426178[]' id='answer-id-1650058' class='answer   answerof-426178 ' value='1650058'   \/><label for='answer-id-1650058' id='answer-label-1650058' class=' answer'><span>Using \u2018netstat\u2019 to check TCP connections.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426178[]' id='answer-id-1650059' class='answer   answerof-426178 ' value='1650059'   \/><label for='answer-id-1650059' id='answer-label-1650059' class=' answer'><span>Using \u2018sminfo\u2019 to query the Subnet Manager for network topology and status information.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426178[]' id='answer-id-1650060' class='answer   answerof-426178 ' value='1650060'   \/><label for='answer-id-1650060' id='answer-label-1650060' class=' answer'><span>Checking the system logs (\u2018\/var\/log\/messages\u2019 or equivalent) for any InfiniBand-related error messages.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-31' style=';'><div id='questionWrap-31'  class='   watupro-question-id-426179'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>31. <\/span>An AI server exhibits frequent kernel panics under heavy GPU load. 
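(Background: NVIDIA driver faults surface as numbered Xid events in the kernel log, each tagged with the PCI address of the affected GPU, so they can be extracted programmatically. A minimal sketch that parses a sample line in the NVRM format; the sample text is illustrative:)

```python
import re

# NVRM Xid lines in dmesg carry the GPU's PCI address and a numeric fault code.
XID_RE = re.compile(r"NVRM: Xid \((PCI:[0-9a-fA-F:.]+)\): (\d+),")

def parse_xid(line):
    """Return (pci_address, xid_code) from an NVRM Xid log line, or None."""
    m = XID_RE.search(line)
    if not m:
        return None
    return m.group(1), int(m.group(2))

sample = ("NVRM: Xid (PCI:0000:3B:00): 79, pid=1234, name=python, "
          "GPU has fallen off the bus.")
print(parse_xid(sample))  # ('PCI:0000:3B:00', 79)
```

Mapping the code against NVIDIA's published Xid table (79 is "GPU has fallen off the bus") is usually the first triage step before checking power, risers, and drivers.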
\u2018dmesg\u2019 reveals the following error: \u2018NVRM: Xid (PCI:0000:3B:00): 79, pid=..., name=..., GPU has fallen off the bus.\u2019 <br \/>\r<br>Which of the following is the least likely cause of this issue?<\/div><input type='hidden' name='question_id[]' id='qID_31' value='426179' \/><input type='hidden' id='answerType426179' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426179[]' id='answer-id-1650061' class='answer   answerof-426179 ' value='1650061'   \/><label for='answer-id-1650061' id='answer-label-1650061' class=' answer'><span>Insufficient power supply to the GPU, causing it to become unstable under load.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426179[]' id='answer-id-1650062' class='answer   answerof-426179 ' value='1650062'   \/><label for='answer-id-1650062' id='answer-label-1650062' class=' answer'><span>A loose or damaged PCIe riser cable connecting the GPU to the motherboard.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426179[]' id='answer-id-1650063' class='answer   answerof-426179 ' value='1650063'   \/><label for='answer-id-1650063' id='answer-label-1650063' class=' answer'><span>A driver bug in the NVIDIA drivers, leading to GPU instability.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426179[]' id='answer-id-1650064' class='answer   answerof-426179 ' value='1650064'   \/><label for='answer-id-1650064' id='answer-label-1650064' class=' answer'><span>Overclocking the GPU beyond its stable limits.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426179[]' id='answer-id-1650065' class='answer   answerof-426179 ' value='1650065'   \/><label 
for='answer-id-1650065' id='answer-label-1650065' class=' answer'><span>A faulty CPU.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-32' style=';'><div id='questionWrap-32'  class='   watupro-question-id-426180'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>32. <\/span>You suspect a faulty NVIDIA ConnectX-6 network adapter in a server used for RDMA-based distributed training. <br \/>\r<br>Which commands or tools can you use to diagnose potential issues with the adapter\u2019s hardware and connectivity?<\/div><input type='hidden' name='question_id[]' id='qID_32' value='426180' \/><input type='hidden' id='answerType426180' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426180[]' id='answer-id-1650066' class='answer   answerof-426180 ' value='1650066'   \/><label for='answer-id-1650066' id='answer-label-1650066' class=' answer'><span>lspci -v to verify the adapter is detected and its resources are allocated correctly.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426180[]' id='answer-id-1650067' class='answer   answerof-426180 ' value='1650067'   \/><label for='answer-id-1650067' id='answer-label-1650067' class=' answer'><span>ibstat to check the adapter\u2019s status, link speed, and active ports.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426180[]' id='answer-id-1650068' class='answer   answerof-426180 ' value='1650068'   \/><label for='answer-id-1650068' id='answer-label-1650068' class=' answer'><span>ethtool to examine the adapter\u2019s Ethernet settings and statistics.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' 
name='answer-426180[]' id='answer-id-1650069' class='answer   answerof-426180 ' value='1650069'   \/><label for='answer-id-1650069' id='answer-label-1650069' class=' answer'><span>ping to test basic network connectivity.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426180[]' id='answer-id-1650070' class='answer   answerof-426180 ' value='1650070'   \/><label for='answer-id-1650070' id='answer-label-1650070' class=' answer'><span>nvsmimonitord to monitor GPU metrics and detect anomalies.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-33' style=';'><div id='questionWrap-33'  class='   watupro-question-id-426181'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>33. <\/span>You are deploying a multi-GPU server for deep learning training. After installing the GPUs, the system boots, but \u2018nvidia-smi\u2019 only detects one GPU. The motherboard has multiple PCIe slots, all of which are physically capable of supporting GPUs. <br \/>\r<br>What is the most probable cause?<\/div><input type='hidden' name='question_id[]' id='qID_33' value='426181' \/><input type='hidden' id='answerType426181' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426181[]' id='answer-id-1650071' class='answer   answerof-426181 ' value='1650071'   \/><label for='answer-id-1650071' id='answer-label-1650071' class=' answer'><span>The other GPUs are not properly seated in their PCIe slots. 
Reseat the GPUs and ensure they are securely connected.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426181[]' id='answer-id-1650072' class='answer   answerof-426181 ' value='1650072'   \/><label for='answer-id-1650072' id='answer-label-1650072' class=' answer'><span>The other GPUs are faulty and need to be replaced. Test each GPU individually to confirm their functionality.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426181[]' id='answer-id-1650073' class='answer   answerof-426181 ' value='1650073'   \/><label for='answer-id-1650073' id='answer-label-1650073' class=' answer'><span>The system BIOS\/UEFI is not configured to enable all PCIe slots or the PCIe lanes are not allocated correctly. Check the BIOS\/UEFI settings to enable all slots and configure the PCIe lane allocation (e.g., x16\/x8\/x8).<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426181[]' id='answer-id-1650074' class='answer   answerof-426181 ' value='1650074'   \/><label for='answer-id-1650074' id='answer-label-1650074' class=' answer'><span>The NVIDIA drivers are not installed correctly or are incompatible with the GPUs. Reinstall the drivers and ensure they are compatible with the specific GPU model and CUDA version.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426181[]' id='answer-id-1650075' class='answer   answerof-426181 ' value='1650075'   \/><label for='answer-id-1650075' id='answer-label-1650075' class=' answer'><span>The power supply is not providing enough power to all GPUs. 
Upgrade to a higher wattage power supply.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-34' style=';'><div id='questionWrap-34'  class='   watupro-question-id-426182'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>34. <\/span>You\u2019re designing a new InfiniBand network for a distributed deep learning workload. The workload consists of a mix of large-message all- to-all communication and small-message parameter synchronization. <br \/>\r<br>Considering the different traffic patterns, what routing strategy would MOST effectively minimize latency and maximize bandwidth utilization across the fabric?<\/div><input type='hidden' name='question_id[]' id='qID_34' value='426182' \/><input type='hidden' id='answerType426182' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426182[]' id='answer-id-1650076' class='answer   answerof-426182 ' value='1650076'   \/><label for='answer-id-1650076' id='answer-label-1650076' class=' answer'><span>Rely solely on the default Subnet Manager (SM) with a Min Hop path selection algorithm.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426182[]' id='answer-id-1650077' class='answer   answerof-426182 ' value='1650077'   \/><label for='answer-id-1650077' id='answer-label-1650077' class=' answer'><span>Implement a static routing scheme with manually configured forwarding tables on each switch.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426182[]' id='answer-id-1650078' class='answer   answerof-426182 ' value='1650078'   \/><label for='answer-id-1650078' id='answer-label-1650078' class=' answer'><span>Utilize a combination of Adaptive Routing (AR) to handle dynamic traffic 
patterns and Quality of Service (QoS) to prioritize small-message parameter synchronization.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426182[]' id='answer-id-1650079' class='answer   answerof-426182 ' value='1650079'   \/><label for='answer-id-1650079' id='answer-label-1650079' class=' answer'><span>Implement a purely deterministic routing scheme, disabling all adaptive routing features.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426182[]' id='answer-id-1650080' class='answer   answerof-426182 ' value='1650080'   \/><label for='answer-id-1650080' id='answer-label-1650080' class=' answer'><span>Disable multicast.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-35' style=';'><div id='questionWrap-35'  class='   watupro-question-id-426183'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>35. <\/span>You\u2019re monitoring the storage I\/O for an AI training workload and observe high disk utilization but relatively low CPU utilization. 
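(For context, data prefetching hides read latency by overlapping storage I/O with downstream compute. A minimal sketch using a background thread and a bounded queue; the slow read is simulated with a sleep, and the function and batch names are illustrative:)

```python
import queue
import threading
import time

def prefetching_loader(read_batch, num_batches, depth=2):
    """Yield batches while a background thread keeps up to `depth`
    batches read ahead, so disk reads overlap with compute."""
    q = queue.Queue(maxsize=depth)

    def producer():
        for i in range(num_batches):
            q.put(read_batch(i))   # blocks when the read-ahead buffer is full
        q.put(None)                # sentinel: no more data

    threading.Thread(target=producer, daemon=True).start()
    while (batch := q.get()) is not None:
        yield batch

def fake_read(i):
    """Stand-in for a slow storage read (e.g., an HDF5 or record reader)."""
    time.sleep(0.01)
    return [i] * 4

batches = list(prefetching_loader(fake_read, 3))
print(batches)  # [[0, 0, 0, 0], [1, 1, 1, 1], [2, 2, 2, 2]]
```

When the disk, not the CPU, is the bottleneck, this overlap keeps the consumer fed; note that shrinking the number of loader threads would do the opposite.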
<br \/>\r<br>Which of the following actions is LEAST likely to improve the performance of the training job?<\/div><input type='hidden' name='question_id[]' id='qID_35' value='426183' \/><input type='hidden' id='answerType426183' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426183[]' id='answer-id-1650081' class='answer   answerof-426183 ' value='1650081'   \/><label for='answer-id-1650081' id='answer-label-1650081' class=' answer'><span>Switching from HDDs to NVMe SSDs.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426183[]' id='answer-id-1650082' class='answer   answerof-426183 ' value='1650082'   \/><label for='answer-id-1650082' id='answer-label-1650082' class=' answer'><span>Implementing data prefetching to load data into memory before it\u2019s needed.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426183[]' id='answer-id-1650083' class='answer   answerof-426183 ' value='1650083'   \/><label for='answer-id-1650083' id='answer-label-1650083' class=' answer'><span>Increasing the batch size of the training job.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426183[]' id='answer-id-1650084' class='answer   answerof-426183 ' value='1650084'   \/><label for='answer-id-1650084' id='answer-label-1650084' class=' answer'><span>Adding more RAM to the system.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426183[]' id='answer-id-1650085' class='answer   answerof-426183 ' value='1650085'   \/><label for='answer-id-1650085' id='answer-label-1650085' class=' answer'><span>Reducing the number of parallel data loading threads.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end 
questionWrap--><\/div><\/div><div class='watu-question ' id='question-36' style=';'><div id='questionWrap-36'  class='   watupro-question-id-426184'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>36. <\/span>You\u2019re optimizing an AMD EPYC server with 4 NVIDIA A100 GPUs for a large language model training workload. You observe that the GPUs are consistently underutilized (50-60% utilization) while the CPUs are nearly maxed out. <br \/>\r<br>Which of the following is the MOST likely bottleneck?<\/div><input type='hidden' name='question_id[]' id='qID_36' value='426184' \/><input type='hidden' id='answerType426184' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426184[]' id='answer-id-1650086' class='answer   answerof-426184 ' value='1650086'   \/><label for='answer-id-1650086' id='answer-label-1650086' class=' answer'><span>Insufficient CPU cores to prepare and feed data to the GPUs.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426184[]' id='answer-id-1650087' class='answer   answerof-426184 ' value='1650087'   \/><label for='answer-id-1650087' id='answer-label-1650087' class=' answer'><span>The PCIe interconnect between the CPUs and GPUs is saturated.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426184[]' id='answer-id-1650088' class='answer   answerof-426184 ' value='1650088'   \/><label for='answer-id-1650088' id='answer-label-1650088' class=' answer'><span>The system RAM is too small, causing excessive swapping.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426184[]' id='answer-id-1650089' class='answer   answerof-426184 ' value='1650089'   \/><label for='answer-id-1650089' id='answer-label-1650089' class=' 
answer'><span>The storage system (SSD\/NVMe) is too slow, leading to data starvation.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426184[]' id='answer-id-1650090' class='answer   answerof-426184 ' value='1650090'   \/><label for='answer-id-1650090' id='answer-label-1650090' class=' answer'><span>The NCCL (NVIDIA Collective Communications Library) is not properly configured for inter-GPU communication.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-37' style=';'><div id='questionWrap-37'  class='   watupro-question-id-426185'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>37. <\/span>You observe high latency and low bandwidth between two GPUs connected via an NVLink switch. You suspect a problem with the NVLink link itself. <br \/>\r<br>Which of the following methods would be the most effective in diagnosing the physical NVLink link health?<\/div><input type='hidden' name='question_id[]' id='qID_37' value='426185' \/><input type='hidden' id='answerType426185' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426185[]' id='answer-id-1650091' class='answer   answerof-426185 ' value='1650091'   \/><label for='answer-id-1650091' id='answer-label-1650091' class=' answer'><span>Using \u2018iperf3\u2019 to measure network throughput between the servers.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426185[]' id='answer-id-1650092' class='answer   answerof-426185 ' value='1650092'   \/><label for='answer-id-1650092' id='answer-label-1650092' class=' answer'><span>Running a CUDA-aware memory bandwidth test specifically designed for NVLink.<\/span><\/label><\/div><div class='watupro-question-choice  
' dir='auto' ><input type='checkbox' name='answer-426185[]' id='answer-id-1650093' class='answer   answerof-426185 ' value='1650093'   \/><label for='answer-id-1650093' id='answer-label-1650093' class=' answer'><span>Examining system logs for NVLink-related error messages.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426185[]' id='answer-id-1650094' class='answer   answerof-426185 ' value='1650094'   \/><label for='answer-id-1650094' id='answer-label-1650094' class=' answer'><span>Using \u2018ping\u2019 to check network connectivity between the servers.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426185[]' id='answer-id-1650095' class='answer   answerof-426185 ' value='1650095'   \/><label for='answer-id-1650095' id='answer-label-1650095' class=' answer'><span>Physically inspecting the NVLink cables for damage.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-38' style=';'><div id='questionWrap-38'  class='   watupro-question-id-426186'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>38. <\/span>You are installing a GPU server in a data center with limited cooling capacity. <br \/>\r<br>Which of the following server configuration choices would BEST help minimize the server\u2019s thermal output, without significantly compromising performance? 
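(Background on sizing the cooling budget: essentially all of a server's electrical draw is dissipated as heat, and heat load in BTU/hr is watts times 3.412; a quick sketch, using a 3000 W draw as a made-up example:)

```python
WATTS_TO_BTU_PER_HR = 3.412  # 1 W of dissipation equals 3.412 BTU/hr of heat

def heat_load_btu_per_hr(watts: float) -> float:
    """Convert a server's electrical draw (W) to the heat load (BTU/hr)
    the room's cooling must remove."""
    return watts * WATTS_TO_BTU_PER_HR

print(round(heat_load_btu_per_hr(3000)))  # 10236
```

This is why lower-TDP components reduce cooling demand watt-for-watt: every watt not drawn is a watt the limited cooling capacity never has to remove.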
Assume all options are compatible.<\/div><input type='hidden' name='question_id[]' id='qID_38' value='426186' \/><input type='hidden' id='answerType426186' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426186[]' id='answer-id-1650096' class='answer   answerof-426186 ' value='1650096'   \/><label for='answer-id-1650096' id='answer-label-1650096' class=' answer'><span>Choose GPUs with a lower TDP (Thermal Design Power), even if it means using older generation GPUs.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426186[]' id='answer-id-1650097' class='answer   answerof-426186 ' value='1650097'   \/><label for='answer-id-1650097' id='answer-label-1650097' class=' answer'><span>Use a passively cooled CPU to reduce fan noise and power consumption.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426186[]' id='answer-id-1650098' class='answer   answerof-426186 ' value='1650098'   \/><label for='answer-id-1650098' id='answer-label-1650098' class=' answer'><span>Configure the BIOS\/UEFI to aggressively throttle CPU and GPU frequencies under heavy load.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426186[]' id='answer-id-1650099' class='answer   answerof-426186 ' value='1650099'   \/><label for='answer-id-1650099' id='answer-label-1650099' class=' answer'><span>Implement liquid cooling for the GPUs and CPUs.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-426186[]' id='answer-id-1650100' class='answer   answerof-426186 ' value='1650100'   \/><label for='answer-id-1650100' id='answer-label-1650100' class=' answer'><span>Increase the ambient temperature of the data center to reduce the temperature 
differential.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-39' style=';'><div id='questionWrap-39'  class='   watupro-question-id-426187'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>39. <\/span>You are experiencing link flapping (frequent up\/down transitions) on several InfiniBand links in your AI infrastructure. This is causing intermittent connectivity issues and performance degradation. <br \/>\r<br>What are the MOST likely causes of this issue, and what steps should you take to troubleshoot and resolve it? (Select TWO)<\/div><input type='hidden' name='question_id[]' id='qID_39' value='426187' \/><input type='hidden' id='answerType426187' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426187[]' id='answer-id-1650101' class='answer   answerof-426187 ' value='1650101'   \/><label for='answer-id-1650101' id='answer-label-1650101' class=' answer'><span>Incorrect MTU (Maximum Transmission Unit) configuration on the affected interfaces.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426187[]' id='answer-id-1650102' class='answer   answerof-426187 ' value='1650102'   \/><label for='answer-id-1650102' id='answer-label-1650102' class=' answer'><span>Faulty or damaged cables, connectors, or transceivers.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426187[]' id='answer-id-1650103' class='answer   answerof-426187 ' value='1650103'   \/><label for='answer-id-1650103' id='answer-label-1650103' class=' answer'><span>Software bugs in the operating system or InfiniBand drivers.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' 
name='answer-426187[]' id='answer-id-1650104' class='answer   answerof-426187 ' value='1650104'   \/><label for='answer-id-1650104' id='answer-label-1650104' class=' answer'><span>Mismatched link speeds or duplex settings between connected devices.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426187[]' id='answer-id-1650105' class='answer   answerof-426187 ' value='1650105'   \/><label for='answer-id-1650105' id='answer-label-1650105' class=' answer'><span>Excessive broadcast traffic causing congestion.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-40' style=';'><div id='questionWrap-40'  class='   watupro-question-id-426188'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>40. <\/span>You\u2019re deploying a new cluster with multiple NVIDIA A100 GPUs per node. You want to ensure optimal inter-GPU communication performance using NVLink. 
<br \/>\r<br>Which of the following configurations are critical for achieving maximum NVLink bandwidth?<\/div><input type='hidden' name='question_id[]' id='qID_40' value='426188' \/><input type='hidden' id='answerType426188' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426188[]' id='answer-id-1650106' class='answer   answerof-426188 ' value='1650106'   \/><label for='answer-id-1650106' id='answer-label-1650106' class=' answer'><span>All GPUs within a node must be the same model and have identical firmware versions.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426188[]' id='answer-id-1650107' class='answer   answerof-426188 ' value='1650107'   \/><label for='answer-id-1650107' id='answer-label-1650107' class=' answer'><span>The motherboard must support PCIe Gen5 to maximize NVLink bandwidth.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426188[]' id='answer-id-1650108' class='answer   answerof-426188 ' value='1650108'   \/><label for='answer-id-1650108' id='answer-label-1650108' class=' answer'><span>GPUs should be physically installed in slots that maximize direct NVLink connections based on the server\u2019s architecture.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426188[]' id='answer-id-1650109' class='answer   answerof-426188 ' value='1650109'   \/><label for='answer-id-1650109' id='answer-label-1650109' class=' answer'><span>The NVIDIA driver must be configured to enable NVLink; it is disabled by default.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-426188[]' id='answer-id-1650110' class='answer   answerof-426188 ' value='1650110'   \/><label 
for='answer-id-1650110' id='answer-label-1650110' class=' answer'><span>The server must use a specific CPU model to leverage NVLink capabilities.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div style='display:none' id='question-41'>\n\t<div class='question-content'>\n\t\t<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.dumpsbase.com\/freedumps\/wp-content\/plugins\/watupro\/img\/loading.gif\" width=\"16\" height=\"16\" alt=\"Loading...\" title=\"Loading...\" \/>&nbsp;Loading...\t<\/div>\n<\/div>\n\n<br \/>\n\t\n\t\t\t<div class=\"watupro_buttons flex \" id=\"watuPROButtons10794\" >\n\t\t  <div id=\"prev-question\" style=\"display:none;\"><input type=\"button\" value=\"&lt; Previous\" onclick=\"WatuPRO.nextQuestion(event, 'previous');\"\/><\/div>\t\t  \t\t  \t\t   \n\t\t   \t  \t\t<div><input type=\"button\" name=\"action\" class=\"watupro-submit-button\" onclick=\"WatuPRO.submitResult(event)\" id=\"action-button\" value=\"View Results\"  \/>\n\t\t<\/div>\n\t\t<\/div>\n\t\t\n\t<input type=\"hidden\" name=\"quiz_id\" value=\"10794\" id=\"watuPROExamID\"\/>\n\t<input type=\"hidden\" name=\"start_time\" id=\"startTime\" value=\"2026-05-05 19:09:30\" \/>\n\t<input type=\"hidden\" name=\"start_timestamp\" id=\"startTimeStamp\" value=\"1778008170\" \/>\n\t<input type=\"hidden\" name=\"question_ids\" value=\"\" \/>\n\t<input type=\"hidden\" name=\"watupro_questions\" value=\"426149:1649911,1649912,1649913,1649914,1649915 | 426150:1649916,1649917,1649918,1649919,1649920 | 426151:1649921,1649922,1649923,1649924,1649925 | 426152:1649926,1649927,1649928,1649929,1649930 | 426153:1649931,1649932,1649933,1649934,1649935 | 426154:1649936,1649937,1649938,1649939,1649940 | 426155:1649941,1649942,1649943,1649944,1649945 | 426156:1649946,1649947,1649948,1649949,1649950 | 426157:1649951,1649952,1649953,1649954,1649955 | 426158:1649956,1649957,1649958,1649959,1649960 | 426159:1649961,1649962,1649963,1649964,1649965 | 
426160:1649966,1649967,1649968,1649969,1649970 | 426161:1649971,1649972,1649973,1649974,1649975 | 426162:1649976,1649977,1649978,1649979,1649980 | 426163:1649981,1649982,1649983,1649984,1649985 | 426164:1649986,1649987,1649988,1649989,1649990 | 426165:1649991,1649992,1649993,1649994,1649995 | 426166:1649996,1649997,1649998,1649999,1650000 | 426167:1650001,1650002,1650003,1650004,1650005 | 426168:1650006,1650007,1650008,1650009,1650010 | 426169:1650011,1650012,1650013,1650014,1650015 | 426170:1650016,1650017,1650018,1650019,1650020 | 426171:1650021,1650022,1650023,1650024,1650025 | 426172:1650026,1650027,1650028,1650029,1650030 | 426173:1650031,1650032,1650033,1650034,1650035 | 426174:1650036,1650037,1650038,1650039,1650040 | 426175:1650041,1650042,1650043,1650044,1650045 | 426176:1650046,1650047,1650048,1650049,1650050 | 426177:1650051,1650052,1650053,1650054,1650055 | 426178:1650056,1650057,1650058,1650059,1650060 | 426179:1650061,1650062,1650063,1650064,1650065 | 426180:1650066,1650067,1650068,1650069,1650070 | 426181:1650071,1650072,1650073,1650074,1650075 | 426182:1650076,1650077,1650078,1650079,1650080 | 426183:1650081,1650082,1650083,1650084,1650085 | 426184:1650086,1650087,1650088,1650089,1650090 | 426185:1650091,1650092,1650093,1650094,1650095 | 426186:1650096,1650097,1650098,1650099,1650100 | 426187:1650101,1650102,1650103,1650104,1650105 | 426188:1650106,1650107,1650108,1650109,1650110\" \/>\n\t<input type=\"hidden\" name=\"no_ajax\" value=\"0\">\t\t\t<\/form>\n\t<p>&nbsp;<\/p>\n<\/div>\n\n<script type=\"text\/javascript\">\n\/\/jQuery(document).ready(function(){\ndocument.addEventListener(\"DOMContentLoaded\", function(event) { \t\nvar question_ids = \"426149,426150,426151,426152,426153,426154,426155,426156,426157,426158,426159,426160,426161,426162,426163,426164,426165,426166,426167,426168,426169,426170,426171,426172,426173,426174,426175,426176,426177,426178,426179,426180,426181,426182,426183,426184,426185,426186,426187,426188\";\nWatuPROSettings[10794] 
= {};\nWatuPRO.qArr = question_ids.split(',');\nWatuPRO.exam_id = 10794;\t    \nWatuPRO.post_id = 110348;\nWatuPRO.store_progress = 0;\nWatuPRO.curCatPage = 1;\nWatuPRO.requiredIDs=\"0\".split(\",\");\nWatuPRO.hAppID = \"0.52391800 1778008170\";\nvar url = \"https:\/\/www.dumpsbase.com\/freedumps\/wp-content\/plugins\/watupro\/show_exam.php\";\nWatuPRO.examMode = 1;\nWatuPRO.siteURL=\"https:\/\/www.dumpsbase.com\/freedumps\/wp-admin\/admin-ajax.php\";\nWatuPRO.emailIsNotRequired = 0;\nWatuPROIntel.init(10794);\nWatuPRO.inCategoryPages=1;});    \t \n<\/script>\n<p>&nbsp;<\/p>\n<h3>NVIDIA <a href=\"https:\/\/www.dumpsbase.com\/freedumps\/complete-your-nvidia-certified-professional-ai-infrastructure-exam-with-ncp-aii-dumps-v8-02-continue-to-check-ncp-aii-free-dumps-part-3-q81-q120.html\"><span style=\"background-color: #ffcc99;\"><em>NCP-AII free dumps (Part 3, Q81-Q120)<\/em><\/span><\/a> are also available online for checking.<\/h3>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>According to the feedback, most candidates have completed their NVIDIA Certified Professional AI Infrastructure (NCP-AII) certification with DumpsBase. The NCP-AII dumps (V8.02) are the ideal choice for busy professionals seeking reliable, high-impact results. You can check our NCP-AII free dumps (Part 1, Q1-Q40) online and verify our quality. 
Download the NVIDIA NCP-AII dumps (V8.02) and [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[18718,18913],"tags":[19852,19781],"class_list":["post-110348","post","type-post","status-publish","format-standard","hentry","category-nvidia","category-nvidia-certified-professional","tag-ncp-aii-free-dumps","tag-nvidia-certified-professional-ai-infrastructure-ncp-aii"],"_links":{"self":[{"href":"https:\/\/www.dumpsbase.com\/freedumps\/wp-json\/wp\/v2\/posts\/110348","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.dumpsbase.com\/freedumps\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.dumpsbase.com\/freedumps\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.dumpsbase.com\/freedumps\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.dumpsbase.com\/freedumps\/wp-json\/wp\/v2\/comments?post=110348"}],"version-history":[{"count":2,"href":"https:\/\/www.dumpsbase.com\/freedumps\/wp-json\/wp\/v2\/posts\/110348\/revisions"}],"predecessor-version":[{"id":111635,"href":"https:\/\/www.dumpsbase.com\/freedumps\/wp-json\/wp\/v2\/posts\/110348\/revisions\/111635"}],"wp:attachment":[{"href":"https:\/\/www.dumpsbase.com\/freedumps\/wp-json\/wp\/v2\/media?parent=110348"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.dumpsbase.com\/freedumps\/wp-json\/wp\/v2\/categories?post=110348"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.dumpsbase.com\/freedumps\/wp-json\/wp\/v2\/tags?post=110348"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}