{"id":122484,"date":"2026-03-30T08:48:59","date_gmt":"2026-03-30T08:48:59","guid":{"rendered":"https:\/\/www.dumpsbase.com\/freedumps\/?p=122484"},"modified":"2026-03-30T08:49:51","modified_gmt":"2026-03-30T08:49:51","slug":"asking-for-more-ncp-aii-demo-questions-ncp-aii-free-dumps-part-2-q40-q79-of-v10-03-are-available-for-testing","status":"publish","type":"post","link":"https:\/\/www.dumpsbase.com\/freedumps\/asking-for-more-ncp-aii-demo-questions-ncp-aii-free-dumps-part-2-q40-q79-of-v10-03-are-available-for-testing.html","title":{"rendered":"Asking for More NCP-AII Demo Questions? &#8211; NCP-AII Free Dumps (Part 2, Q40-Q79) of V10.03 Are Available for Testing"},"content":{"rendered":"\r\n<p>It has been verified that the NCP-AII dumps (V10.03) with practice questions and answers are valid for passing the NVIDIA Certified Professional AI Infrastructure certification exam. And we have shared the <a href=\"https:\/\/www.dumpsbase.com\/freedumps\/ncp-aii-dumps-v10-03-ensure-your-2026-nvidia-certified-professional-ai-infrastructure-exam-preparation-ncp-aii-free-dumps-part-1-q1-q39-are-online.html\">NCP-AII free dumps (Part 1, Q1-Q39) of V10.03<\/a> online to help you check the quality. From the free demo questions, you can believe that DumpsBase helps you have a clear understanding of current objectives, hands-on troubleshooting skills, and the ability to perform under timed, performance-based testing conditions. With DumpsBase, the updated NCP-AII dumps (V10.03) make your exam easier to practice efficiently across devices, build confidence through realistic drills, and reinforce key AI infrastructure concepts through repeated exposure. Most are asking for more demo questions. 
Come here and read our NCP-AII free dumps (Part 2, Q40-Q79) of V10.03 today.<\/p>\r\n\r\n\r\n\r\n<h2 class=\"wp-block-heading has-vivid-cyan-blue-color has-text-color has-link-color wp-elements-139a20d44c6793f99e7c0a477feae901\">Below are our <span style=\"color: #ff9900;\"><em>NCP-AII free dumps (Part 2, Q40-Q79) of V10.03<\/em><\/span>, read now:<\/h2>\r\n\r\n\r\n
<div  id=\"watupro_quiz\" class=\"quiz-area single-page-quiz\">\n<p id=\"submittingExam11887\" style=\"display:none;text-align:center;\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.dumpsbase.com\/freedumps\/wp-content\/plugins\/watupro\/img\/loading.gif\" width=\"16\" height=\"16\"><\/p>\n\n<div class=\"watupro-exam-description\" id=\"description-quiz-11887\"><\/div>\n\n<form action=\"\" method=\"post\" class=\"quiz-form\" id=\"quiz-11887\"  enctype=\"multipart\/form-data\" >\n<div class='watu-question ' id='question-1' style=';'><div id='questionWrap-1'  class='   watupro-question-id-465740'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>1. <\/span>Consider an AMD EPYC-based server with 8 NVIDIA A100 GPUs connected via PCIe Gen4. You\u2019re running a distributed training job using Horovod. You\u2019ve noticed that communication between GPUs is a bottleneck. <br \/>\r<br>Which of the following NCCL configuration options would be MOST beneficial in this scenario? 
(Assume all options are syntactically correct for NCCL).<\/div><input type='hidden' name='question_id[]' id='qID_1' value='465740' \/><input type='hidden' id='answerType465740' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465740[]' id='answer-id-1800212' class='answer   answerof-465740 ' value='1800212'   \/><label for='answer-id-1800212' id='answer-label-1800212' class=' answer'><span>NCCL_SOCKET_IFNAME=eth0<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465740[]' id='answer-id-1800213' class='answer   answerof-465740 ' value='1800213'   \/><label for='answer-id-1800213' id='answer-label-1800213' class=' answer'><span>NCCL_IB_DISABLE=1<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465740[]' id='answer-id-1800214' class='answer   answerof-465740 ' value='1800214'   \/><label for='answer-id-1800214' id='answer-label-1800214' class=' answer'><span>NCCL_P2P_DISABLE=0<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465740[]' id='answer-id-1800215' class='answer   answerof-465740 ' value='1800215'   \/><label for='answer-id-1800215' id='answer-label-1800215' class=' answer'><span>NCCL_IB_HCA=mlx5_0<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465740[]' id='answer-id-1800216' class='answer   answerof-465740 ' value='1800216'   \/><label for='answer-id-1800216' id='answer-label-1800216' class=' answer'><span>NCCL_NET_PLUGIN=none<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-2' style=';'><div id='questionWrap-2'  class='   watupro-question-id-465741'>\n\t\t\t<div class='question-content'><div><span 
class='watupro_num'>2. <\/span>In an InfiniBand fabric, what is the primary role of the Subnet Manager (SM) with respect to routing?<\/div><input type='hidden' name='question_id[]' id='qID_2' value='465741' \/><input type='hidden' id='answerType465741' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465741[]' id='answer-id-1800217' class='answer   answerof-465741 ' value='1800217'   \/><label for='answer-id-1800217' id='answer-label-1800217' class=' answer'><span>To forward packets based on destination IP addresses, similar to a traditional IP router.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465741[]' id='answer-id-1800218' class='answer   answerof-465741 ' value='1800218'   \/><label for='answer-id-1800218' id='answer-label-1800218' class=' answer'><span>To discover the network topology, calculate routing paths, and program the forwarding tables (LID tables) in the switches.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465741[]' id='answer-id-1800219' class='answer   answerof-465741 ' value='1800219'   \/><label for='answer-id-1800219' id='answer-label-1800219' class=' answer'><span>To monitor the network for congestion and dynamically adjust packet priorities using Quality of Service (QOS) mechanisms.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465741[]' id='answer-id-1800220' class='answer   answerof-465741 ' value='1800220'   \/><label for='answer-id-1800220' id='answer-label-1800220' class=' answer'><span>To provide a command-line interface for users to manually configure routing tables on each InfiniBand switch.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465741[]' 
id='answer-id-1800221' class='answer   answerof-465741 ' value='1800221'   \/><label for='answer-id-1800221' id='answer-label-1800221' class=' answer'><span>To act as a firewall, blocking unauthorized traffic based on pre-defined rules.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-3' style=';'><div id='questionWrap-3'  class='   watupro-question-id-465742'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>3. <\/span>You have a large dataset stored on a BeeGFS file system. The training job is single node and uses data augmentation to generate more data on the fly. The data augmentation process is CPU-bound, but you notice that the GPU is underutilized due to the training data not being fed to the GPU fast enough. <br \/>\r<br>How can you reduce the load on the CPU and improve the overall training throughput?<\/div><input type='hidden' name='question_id[]' id='qID_3' value='465742' \/><input type='hidden' id='answerType465742' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465742[]' id='answer-id-1800222' class='answer   answerof-465742 ' value='1800222'   \/><label for='answer-id-1800222' id='answer-label-1800222' class=' answer'><span>Move the training data to a local NVMe drive on the training node.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465742[]' id='answer-id-1800223' class='answer   answerof-465742 ' value='1800223'   \/><label for='answer-id-1800223' id='answer-label-1800223' class=' answer'><span>Increase the number of BeeGFS metadata servers (MDSs) to improve metadata performance.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465742[]' id='answer-id-1800224' class='answer   
answerof-465742 ' value='1800224'   \/><label for='answer-id-1800224' id='answer-label-1800224' class=' answer'><span>Implement asynchronous I\/O in the data loading pipeline using a library like NVIDIA DALI to offload data processing tasks from the CPU to the GPU.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465742[]' id='answer-id-1800225' class='answer   answerof-465742 ' value='1800225'   \/><label for='answer-id-1800225' id='answer-label-1800225' class=' answer'><span>Decrease the batch size of the training job to reduce the amount of data being processed at each iteration.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465742[]' id='answer-id-1800226' class='answer   answerof-465742 ' value='1800226'   \/><label for='answer-id-1800226' id='answer-label-1800226' class=' answer'><span>Enable data compression on the BeeGFS file system to reduce the amount of data being transferred over the network.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-4' style=';'><div id='questionWrap-4'  class='   watupro-question-id-465743'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>4. <\/span>A server with eight NVIDIA A100 GPUs experiences frequent CUDA errors during large model training. \u2018nvidia-smi\u2019 reports seemingly normal temperatures for all GPUs. However, upon closer inspection using IPMI, the inlet temperature for GPUs 3 and 4 is significantly higher than others. 
<br \/>\r<br>What is the MOST likely cause and the immediate action to take?<\/div><input type='hidden' name='question_id[]' id='qID_4' value='465743' \/><input type='hidden' id='answerType465743' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465743[]' id='answer-id-1800227' class='answer   answerof-465743 ' value='1800227'   \/><label for='answer-id-1800227' id='answer-label-1800227' class=' answer'><span>A driver issue is causing incorrect temperature reporting; reinstall the NVIDIA driver.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465743[]' id='answer-id-1800228' class='answer   answerof-465743 ' value='1800228'   \/><label for='answer-id-1800228' id='answer-label-1800228' class=' answer'><span>The temperature sensors on GPUs 3 and 4 are faulty; replace the GPUs immediately.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465743[]' id='answer-id-1800229' class='answer   answerof-465743 ' value='1800229'   \/><label for='answer-id-1800229' id='answer-label-1800229' class=' answer'><span>There is a localized airflow problem affecting GPUs 3 and 4; check fan speeds and airflow obstructions.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465743[]' id='answer-id-1800230' class='answer   answerof-465743 ' value='1800230'   \/><label for='answer-id-1800230' id='answer-label-1800230' class=' answer'><span>The power supply is failing to provide sufficient power to GPUs 3 and 4; replace the power supply.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465743[]' id='answer-id-1800231' class='answer   answerof-465743 ' value='1800231'   \/><label for='answer-id-1800231' id='answer-label-1800231' 
class=' answer'><span>A software bug in the CUDA toolkit is causing the errors; downgrade to an earlier version.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-5' style=';'><div id='questionWrap-5'  class='   watupro-question-id-465744'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>5. <\/span>You\u2019re optimizing an Intel Xeon server with 4 NVIDIA A100 GPUs for a computer vision application that uses CUDA. You notice that the GPU utilization is fluctuating significantly, and performance is inconsistent. Using \u2018nvprof\u2019, you identify that there are frequent stalls in the CUDA kernels due to thread divergence. <br \/>\r<br>What are possible causes and solutions?<\/div><input type='hidden' name='question_id[]' id='qID_5' value='465744' \/><input type='hidden' id='answerType465744' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465744[]' id='answer-id-1800232' class='answer   answerof-465744 ' value='1800232'   \/><label for='answer-id-1800232' id='answer-label-1800232' class=' answer'><span>The input data is not properly aligned in memory. Ensure that data is aligned to 128-byte boundaries using aligned memory allocation techniques.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465744[]' id='answer-id-1800233' class='answer   answerof-465744 ' value='1800233'   \/><label for='answer-id-1800233' id='answer-label-1800233' class=' answer'><span>The CUDA code contains conditional branches that lead to different execution paths for different threads within the same warp. 
Rewrite the CUDA code to minimize branching and favor uniform execution paths within warps.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465744[]' id='answer-id-1800234' class='answer   answerof-465744 ' value='1800234'   \/><label for='answer-id-1800234' id='answer-label-1800234' class=' answer'><span>The GPUs are overheating, causing thermal throttling. Improve the server\u2019s cooling.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465744[]' id='answer-id-1800235' class='answer   answerof-465744 ' value='1800235'   \/><label for='answer-id-1800235' id='answer-label-1800235' class=' answer'><span>The CUDA compiler is generating suboptimal code. Try using different compiler optimization flags (e.g., \u2018-O3\u2019 or \u2018-ftz=true\u2019).<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465744[]' id='answer-id-1800236' class='answer   answerof-465744 ' value='1800236'   \/><label for='answer-id-1800236' id='answer-label-1800236' class=' answer'><span>The CUDA driver version is incompatible with the CUDA toolkit version. Update the CUDA driver to a compatible version.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-6' style=';'><div id='questionWrap-6'  class='   watupro-question-id-465745'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>6. <\/span>You have an Intel Xeon Gold server with 2 NVIDIA Tesla V100 GPUs. 
After deploying your AI application, you observe that one GPU is consistently running at a significantly higher temperature than the other. <br \/>\r<br>What could be a plausible reason for this behavior?<\/div><input type='hidden' name='question_id[]' id='qID_6' value='465745' \/><input type='hidden' id='answerType465745' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465745[]' id='answer-id-1800237' class='answer   answerof-465745 ' value='1800237'   \/><label for='answer-id-1800237' id='answer-label-1800237' class=' answer'><span>One GPU is defective and drawing excessive power.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465745[]' id='answer-id-1800238' class='answer   answerof-465745 ' value='1800238'   \/><label for='answer-id-1800238' id='answer-label-1800238' class=' answer'><span>The server\u2019s airflow is inadequate, causing poor cooling for one of the GPUs.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465745[]' id='answer-id-1800239' class='answer   answerof-465745 ' value='1800239'   \/><label for='answer-id-1800239' id='answer-label-1800239' class=' answer'><span>The workload is not evenly distributed between the GPUs, causing one GPU to be more heavily utilized.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465745[]' id='answer-id-1800240' class='answer   answerof-465745 ' value='1800240'   \/><label for='answer-id-1800240' id='answer-label-1800240' class=' answer'><span>One GPU\u2019s driver version is outdated, leading to inefficient power management.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465745[]' id='answer-id-1800241' class='answer   answerof-465745 ' 
value='1800241'   \/><label for='answer-id-1800241' id='answer-label-1800241' class=' answer'><span>The ambient temperature in the server room is higher on one side of the rack.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-7' style=';'><div id='questionWrap-7'  class='   watupro-question-id-465746'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>7. <\/span>You are troubleshooting slow I\/O performance in a deep learning training environment utilizing BeeGFS parallel file system. You suspect the metadata operations are bottlenecking the training process. <br \/>\r<br>How can you optimize metadata handling in BeeGFS to potentially improve performance?<\/div><input type='hidden' name='question_id[]' id='qID_7' value='465746' \/><input type='hidden' id='answerType465746' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465746[]' id='answer-id-1800242' class='answer   answerof-465746 ' value='1800242'   \/><label for='answer-id-1800242' id='answer-label-1800242' class=' answer'><span>Increase the number of storage targets (OSTs) to distribute the data across more devices.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465746[]' id='answer-id-1800243' class='answer   answerof-465746 ' value='1800243'   \/><label for='answer-id-1800243' id='answer-label-1800243' class=' answer'><span>Implement data striping across multiple OSTs.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465746[]' id='answer-id-1800244' class='answer   answerof-465746 ' value='1800244'   \/><label for='answer-id-1800244' id='answer-label-1800244' class=' answer'><span>Increase the number of metadata servers (MDSs) and 
distribute the metadata load across them.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465746[]' id='answer-id-1800245' class='answer   answerof-465746 ' value='1800245'   \/><label for='answer-id-1800245' id='answer-label-1800245' class=' answer'><span>Enable client-side caching of metadata on the training nodes.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465746[]' id='answer-id-1800246' class='answer   answerof-465746 ' value='1800246'   \/><label for='answer-id-1800246' id='answer-label-1800246' class=' answer'><span>Configure BeeGFS to use a different network protocol with lower overhead.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-8' style=';'><div id='questionWrap-8'  class='   watupro-question-id-465747'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>8. <\/span>You are setting up network fabric ports for hosts in an NVIDIA-Certified Professional AI Infrastructure (NCP-AII) environment. You need to configure Jumbo Frames to improve network throughput. 
<br \/>\r<br>What is the typical MTU (Maximum Transmission Unit) size you would set on the network interfaces and switches, and why?<\/div><input type='hidden' name='question_id[]' id='qID_8' value='465747' \/><input type='hidden' id='answerType465747' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465747[]' id='answer-id-1800247' class='answer   answerof-465747 ' value='1800247'   \/><label for='answer-id-1800247' id='answer-label-1800247' class=' answer'><span>1500 bytes, as it\u2019s the default and compatible with most networks.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465747[]' id='answer-id-1800248' class='answer   answerof-465747 ' value='1800248'   \/><label for='answer-id-1800248' id='answer-label-1800248' class=' answer'><span>9000 bytes, also known as Jumbo Frames, reduces overhead and improves throughput for large data transfers.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465747[]' id='answer-id-1800249' class='answer   answerof-465747 ' value='1800249'   \/><label for='answer-id-1800249' id='answer-label-1800249' class=' answer'><span>65535 bytes, the theoretical maximum MTU size, for maximum performance.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465747[]' id='answer-id-1800250' class='answer   answerof-465747 ' value='1800250'   \/><label for='answer-id-1800250' id='answer-label-1800250' class=' answer'><span>576 bytes, the minimum MTU size required by IPv4.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465747[]' id='answer-id-1800251' class='answer   answerof-465747 ' value='1800251'   \/><label for='answer-id-1800251' id='answer-label-1800251' class=' 
answer'><span>Any MTU size between 1500 and 9000 bytes; the specific value doesn\u2019t matter.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-9' style=';'><div id='questionWrap-9'  class='   watupro-question-id-465748'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>9. <\/span>You are deploying a new AI cluster using RoCEv2 over a lossless Ethernet fabric. <br \/>\r<br>Which of the following QoS (Quality of Service) mechanisms is critical for ensuring reliable RDMA communication?<\/div><input type='hidden' name='question_id[]' id='qID_9' value='465748' \/><input type='hidden' id='answerType465748' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465748[]' id='answer-id-1800252' class='answer   answerof-465748 ' value='1800252'   \/><label for='answer-id-1800252' id='answer-label-1800252' class=' answer'><span>DSCP (Differentiated Services Code Point) marking<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465748[]' id='answer-id-1800253' class='answer   answerof-465748 ' value='1800253'   \/><label for='answer-id-1800253' id='answer-label-1800253' class=' answer'><span>ECN (Explicit Congestion Notification)<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465748[]' id='answer-id-1800254' class='answer   answerof-465748 ' value='1800254'   \/><label for='answer-id-1800254' id='answer-label-1800254' class=' answer'><span>PFC (Priority Flow Control)<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465748[]' id='answer-id-1800255' class='answer   answerof-465748 ' value='1800255'   \/><label for='answer-id-1800255' id='answer-label-1800255' class=' 
answer'><span>ACL (Access Control List)<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465748[]' id='answer-id-1800256' class='answer   answerof-465748 ' value='1800256'   \/><label for='answer-id-1800256' id='answer-label-1800256' class=' answer'><span>Rate Limiting<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-10' style=';'><div id='questionWrap-10'  class='   watupro-question-id-465749'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>10. <\/span>You need to remotely monitor the GPU temperature and utilization of a server without installing any additional software on the server itself. <br \/>\r<br>Assuming you have network access to the server\u2019s BMC (Baseboard Management Controller), which protocol and standard data format would BEST facilitate this?<\/div><input type='hidden' name='question_id[]' id='qID_10' value='465749' \/><input type='hidden' id='answerType465749' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465749[]' id='answer-id-1800257' class='answer   answerof-465749 ' value='1800257'   \/><label for='answer-id-1800257' id='answer-label-1800257' class=' answer'><span>SNMP (Simple Network Management Protocol) with MIB (Management Information Base)<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465749[]' id='answer-id-1800258' class='answer   answerof-465749 ' value='1800258'   \/><label for='answer-id-1800258' id='answer-label-1800258' class=' answer'><span>HTTP with JSON<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465749[]' id='answer-id-1800259' class='answer   answerof-465749 ' value='1800259'   \/><label 
for='answer-id-1800259' id='answer-label-1800259' class=' answer'><span>SSH with plain text output from \u2018nvidia-smi\u2019<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465749[]' id='answer-id-1800260' class='answer   answerof-465749 ' value='1800260'   \/><label for='answer-id-1800260' id='answer-label-1800260' class=' answer'><span>IPMI (Intelligent Platform Management Interface) with SDR (Sensor Data Records)<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465749[]' id='answer-id-1800261' class='answer   answerof-465749 ' value='1800261'   \/><label for='answer-id-1800261' id='answer-label-1800261' class=' answer'><span>Syslog with CSV (Comma-separated Values)<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-11' style=';'><div id='questionWrap-11'  class='   watupro-question-id-465750'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>11. <\/span>You\u2019re monitoring the storage I\/O for an AI training workload and observe high disk utilization but relatively low CPU utilization. 
<br \/>\r<br>Which of the following actions is LEAST likely to improve the performance of the training job?<\/div><input type='hidden' name='question_id[]' id='qID_11' value='465750' \/><input type='hidden' id='answerType465750' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465750[]' id='answer-id-1800262' class='answer   answerof-465750 ' value='1800262'   \/><label for='answer-id-1800262' id='answer-label-1800262' class=' answer'><span>Switching from HDDs to NVMe SSDs.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465750[]' id='answer-id-1800263' class='answer   answerof-465750 ' value='1800263'   \/><label for='answer-id-1800263' id='answer-label-1800263' class=' answer'><span>Implementing data prefetching to load data into memory before it\u2019s needed.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465750[]' id='answer-id-1800264' class='answer   answerof-465750 ' value='1800264'   \/><label for='answer-id-1800264' id='answer-label-1800264' class=' answer'><span>Increasing the batch size of the training job.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465750[]' id='answer-id-1800265' class='answer   answerof-465750 ' value='1800265'   \/><label for='answer-id-1800265' id='answer-label-1800265' class=' answer'><span>Adding more RAM to the system.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465750[]' id='answer-id-1800266' class='answer   answerof-465750 ' value='1800266'   \/><label for='answer-id-1800266' id='answer-label-1800266' class=' answer'><span>Reducing the number of parallel data loading threads.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end 
questionWrap--><\/div><\/div><div class='watu-question ' id='question-12' style=';'><div id='questionWrap-12'  class='   watupro-question-id-465751'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>12. <\/span>Your AI inference server utilizes Triton Inference Server and experiences intermittent latency spikes. Profiling reveals that the GPU is frequently stalling due to memory allocation issues. <br \/>\r<br>Which strategy or tool would be LEAST effective in mitigating these memory allocation stalls?<\/div><input type='hidden' name='question_id[]' id='qID_12' value='465751' \/><input type='hidden' id='answerType465751' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465751[]' id='answer-id-1800267' class='answer   answerof-465751 ' value='1800267'   \/><label for='answer-id-1800267' id='answer-label-1800267' class=' answer'><span>Using CUDA memory pools to pre-allocate memory and reduce allocation overhead during inference requests.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465751[]' id='answer-id-1800268' class='answer   answerof-465751 ' value='1800268'   \/><label for='answer-id-1800268' id='answer-label-1800268' class=' answer'><span>Enabling CUDA graph capture to reduce kernel launch overhead.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465751[]' id='answer-id-1800269' class='answer   answerof-465751 ' value='1800269'   \/><label for='answer-id-1800269' id='answer-label-1800269' class=' answer'><span>Reducing the model\u2019s memory footprint by using quantization or pruning techniques.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465751[]' id='answer-id-1800270' class='answer   answerof-465751 ' value='1800270'   
\/><label for='answer-id-1800270' id='answer-label-1800270' class=' answer'><span>Increasing the GPU\u2019s TCC (Tesla Compute Cluster) mode priority.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465751[]' id='answer-id-1800271' class='answer   answerof-465751 ' value='1800271'   \/><label for='answer-id-1800271' id='answer-label-1800271' class=' answer'><span>Optimize the model using TensorRT<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-13' style=';'><div id='questionWrap-13'  class='   watupro-question-id-465752'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>13. <\/span>You are deploying a new NVLink Switch-based cluster. The GPUs are installed in different servers, but need to be configured to utilize <br \/>\r<br>NVLink interconnect. <br \/>\r<br>Which of the following should be performed during the installation phase to confirm correct configuration?<\/div><input type='hidden' name='question_id[]' id='qID_13' value='465752' \/><input type='hidden' id='answerType465752' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465752[]' id='answer-id-1800272' class='answer   answerof-465752 ' value='1800272'   \/><label for='answer-id-1800272' id='answer-label-1800272' class=' answer'><span>Run NCCL tests to verify the GPU-to-GPU bandwidth and latency between servers.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465752[]' id='answer-id-1800273' class='answer   answerof-465752 ' value='1800273'   \/><label for='answer-id-1800273' id='answer-label-1800273' class=' answer'><span>Verify that GPUDirect RDMA is enabled and functioning correctly.<\/span><\/label><\/div><div class='watupro-question-choice  ' 
dir='auto' ><input type='radio' name='answer-465752[]' id='answer-id-1800274' class='answer   answerof-465752 ' value='1800274'   \/><label for='answer-id-1800274' id='answer-label-1800274' class=' answer'><span>Check that the \u2018nvidia-smi\u2019 command shows the correct NVLink topology.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465752[]' id='answer-id-1800275' class='answer   answerof-465752 ' value='1800275'   \/><label for='answer-id-1800275' id='answer-label-1800275' class=' answer'><span>Run standard TCP\/IP network bandwidth tests to check inter-server communication.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465752[]' id='answer-id-1800276' class='answer   answerof-465752 ' value='1800276'   \/><label for='answer-id-1800276' id='answer-label-1800276' class=' answer'><span>All the GPUs are in the same IP subnet<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-14' style=';'><div id='questionWrap-14'  class='   watupro-question-id-465753'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>14. <\/span>You are configuring a Mellanox InfiniBand network for a DGX A100 cluster. 
<br \/>\r<br>What is the RECOMMENDED subnet manager for a large, high-performance AI training environment, and why?<\/div><input type='hidden' name='question_id[]' id='qID_14' value='465753' \/><input type='hidden' id='answerType465753' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465753[]' id='answer-id-1800277' class='answer   answerof-465753 ' value='1800277'   \/><label for='answer-id-1800277' id='answer-label-1800277' class=' answer'><span>OpenSM, because it\u2019s the default and easiest to configure.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465753[]' id='answer-id-1800278' class='answer   answerof-465753 ' value='1800278'   \/><label for='answer-id-1800278' id='answer-label-1800278' class=' answer'><span>UFM (Unified Fabric Manager), because it provides advanced management, monitoring, and optimization capabilities.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465753[]' id='answer-id-1800279' class='answer   answerof-465753 ' value='1800279'   \/><label for='answer-id-1800279' id='answer-label-1800279' class=' answer'><span>IBA management tools that ship with the OS (e.g., \u2018ibnetdiscover\u2019).<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465753[]' id='answer-id-1800280' class='answer   answerof-465753 ' value='1800280'   \/><label for='answer-id-1800280' id='answer-label-1800280' class=' answer'><span>Any subnet manager; the performance difference is negligible.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465753[]' id='answer-id-1800281' class='answer   answerof-465753 ' value='1800281'   \/><label for='answer-id-1800281' id='answer-label-1800281' class=' 
answer'><span>A custom-built subnet manager using the InfiniBand verbs API<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-15' style=';'><div id='questionWrap-15'  class='   watupro-question-id-465754'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>15. <\/span>Consider a scenario where you are using NCCL (NVIDIA Collective Communications Library) for multi-GPU training across multiple servers connected via NVLink switches. <br \/>\r<br>Which NCCL environment variable would you use to specify the network interface to be used for communication?<\/div><input type='hidden' name='question_id[]' id='qID_15' value='465754' \/><input type='hidden' id='answerType465754' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465754[]' id='answer-id-1800282' class='answer   answerof-465754 ' value='1800282'   \/><label for='answer-id-1800282' id='answer-label-1800282' class=' answer'><span>NCCL_PORT<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465754[]' id='answer-id-1800283' class='answer   answerof-465754 ' value='1800283'   \/><label for='answer-id-1800283' id='answer-label-1800283' class=' answer'><span>NCCL_SOCKET_IFNAME<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465754[]' id='answer-id-1800284' class='answer   answerof-465754 ' value='1800284'   \/><label for='answer-id-1800284' id='answer-label-1800284' class=' answer'><span>NCCL_NET_INTERFACE<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465754[]' id='answer-id-1800285' class='answer   answerof-465754 ' value='1800285'   \/><label for='answer-id-1800285' id='answer-label-1800285' class=' 
answer'><span>NCCL_IB_HCA<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465754[]' id='answer-id-1800286' class='answer   answerof-465754 ' value='1800286'   \/><label for='answer-id-1800286' id='answer-label-1800286' class=' answer'><span>NCCL_COMM_ID<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-16' style=';'><div id='questionWrap-16'  class='   watupro-question-id-465755'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>16. <\/span>Which of the following is the MOST important reason for using a dedicated storage network (e.g., InfiniBand or RoCE) for AI\/ML workloads compared to using the existing Ethernet network?<\/div><input type='hidden' name='question_id[]' id='qID_16' value='465755' \/><input type='hidden' id='answerType465755' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465755[]' id='answer-id-1800287' class='answer   answerof-465755 ' value='1800287'   \/><label for='answer-id-1800287' id='answer-label-1800287' class=' answer'><span>Improved security due to network isolation.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465755[]' id='answer-id-1800288' class='answer   answerof-465755 ' value='1800288'   \/><label for='answer-id-1800288' id='answer-label-1800288' class=' answer'><span>Lower latency and higher bandwidth for data transfer.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465755[]' id='answer-id-1800289' class='answer   answerof-465755 ' value='1800289'   \/><label for='answer-id-1800289' id='answer-label-1800289' class=' answer'><span>Simplified network management and configuration.<\/span><\/label><\/div><div 
class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465755[]' id='answer-id-1800290' class='answer   answerof-465755 ' value='1800290'   \/><label for='answer-id-1800290' id='answer-label-1800290' class=' answer'><span>Reduced cost compared to upgrading the existing Ethernet infrastructure.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465755[]' id='answer-id-1800291' class='answer   answerof-465755 ' value='1800291'   \/><label for='answer-id-1800291' id='answer-label-1800291' class=' answer'><span>Automatic Quality of Service (QoS) prioritization for AI\/ML traffic.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-17' style=';'><div id='questionWrap-17'  class='   watupro-question-id-465756'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>17. <\/span>Which of the following statements regarding VXLAN (Virtual Extensible LAN) is MOST accurate in the context of data center networking for AI\/ML workloads?<\/div><input type='hidden' name='question_id[]' id='qID_17' value='465756' \/><input type='hidden' id='answerType465756' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465756[]' id='answer-id-1800292' class='answer   answerof-465756 ' value='1800292'   \/><label for='answer-id-1800292' id='answer-label-1800292' class=' answer'><span>VXLAN provides Layer 2 connectivity across Layer 3 networks, enabling virtual machine mobility.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465756[]' id='answer-id-1800293' class='answer   answerof-465756 ' value='1800293'   \/><label for='answer-id-1800293' id='answer-label-1800293' class=' answer'><span>VXLAN primarily improves network 
security by encrypting all traffic.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465756[]' id='answer-id-1800294' class='answer   answerof-465756 ' value='1800294'   \/><label for='answer-id-1800294' id='answer-label-1800294' class=' answer'><span>VXLAN is only suitable for small-scale networks due to its limited scalability.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465756[]' id='answer-id-1800295' class='answer   answerof-465756 ' value='1800295'   \/><label for='answer-id-1800295' id='answer-label-1800295' class=' answer'><span>VXLAN reduces network overhead compared to traditional VLANs.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465756[]' id='answer-id-1800296' class='answer   answerof-465756 ' value='1800296'   \/><label for='answer-id-1800296' id='answer-label-1800296' class=' answer'><span>VXLAN requires specialized hardware and cannot be implemented in software.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-18' style=';'><div id='questionWrap-18'  class='   watupro-question-id-465757'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>18. <\/span>You are tasked with diagnosing performance issues on a GPU server running a large-scale HPC simulation. The simulation utilizes multiple GPUs and InfiniBand for inter-GPU communication. You suspect that RDMA (Remote Direct Memory Access) is not functioning correctly. 
<br \/>\r<br>How would you comprehensively test and verify the proper operation of RDMA between the GPUs?<\/div><input type='hidden' name='question_id[]' id='qID_18' value='465757' \/><input type='hidden' id='answerType465757' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465757[]' id='answer-id-1800297' class='answer   answerof-465757 ' value='1800297'   \/><label for='answer-id-1800297' id='answer-label-1800297' class=' answer'><span>Use \u2018ping\u2019 to verify basic network connectivity between the server\u2019s InfiniBand interfaces.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465757[]' id='answer-id-1800298' class='answer   answerof-465757 ' value='1800298'   \/><label for='answer-id-1800298' id='answer-label-1800298' class=' answer'><span>Employ tools from the \u2018perftest\u2019 suite to measure RDMA bandwidth and latency between GPUs.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465757[]' id='answer-id-1800299' class='answer   answerof-465757 ' value='1800299'   \/><label for='answer-id-1800299' id='answer-label-1800299' class=' answer'><span>Run \u2018nvidia-smi topo -m\u2019 to check the GPU interconnect topology and verify that NVLink or PCIe is being used for communication.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465757[]' id='answer-id-1800300' class='answer   answerof-465757 ' value='1800300'   \/><label for='answer-id-1800300' id='answer-label-1800300' class=' answer'><span>Utilize NCCL\u2019s internal diagnostic tools to verify proper inter-GPU communication within the simulation.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465757[]' 
id='answer-id-1800301' class='answer   answerof-465757 ' value='1800301'   \/><label for='answer-id-1800301' id='answer-label-1800301' class=' answer'><span>Monitor CPU utilization during the simulation; high CPU usage suggests that RDMA is not offloading communication effectively.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-19' style=';'><div id='questionWrap-19'  class='   watupro-question-id-465758'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>19. <\/span>You are troubleshooting a network performance issue in your NCP-AII environment. <br \/>\r<br>After running \u2018ibstat\u2019 on a host, you see the following output for one of the InfiniBand ports: <br \/>\r<br><br><img decoding=\"async\" width=649 height=8 id=\"\u56fe\u7247 33\" src=\"https:\/\/www.dumpsbase.com\/freedumps\/wp-content\/uploads\/2026\/03\/image001-37.jpg\"><br><br \/>\r<br>What does the \u2018LMC: 0\u2019 indicate, and what are the implications for network performance?<\/div><input type='hidden' name='question_id[]' id='qID_19' value='465758' \/><input type='hidden' id='answerType465758' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465758[]' id='answer-id-1800302' class='answer   answerof-465758 ' value='1800302'   \/><label for='answer-id-1800302' id='answer-label-1800302' class=' answer'><span>LMC: 0 indicates that the link is down and not functioning correctly.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465758[]' id='answer-id-1800303' class='answer   answerof-465758 ' value='1800303'   \/><label for='answer-id-1800303' id='answer-label-1800303' class=' answer'><span>LMC: 0 indicates that Link Aggregation (LAG) is not enabled on this port, meaning only a single link is 
being used for communication.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465758[]' id='answer-id-1800304' class='answer   answerof-465758 ' value='1800304'   \/><label for='answer-id-1800304' id='answer-label-1800304' class=' answer'><span>LMC: 0 indicates the port is operating at the lowest possible speed.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465758[]' id='answer-id-1800305' class='answer   answerof-465758 ' value='1800305'   \/><label for='answer-id-1800305' id='answer-label-1800305' class=' answer'><span>LMC: 0 indicates that the Subnet Manager is not running correctly.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465758[]' id='answer-id-1800306' class='answer   answerof-465758 ' value='1800306'   \/><label for='answer-id-1800306' id='answer-label-1800306' class=' answer'><span>LMC: 0 is the default and expected value; it has no impact on performance.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-20' style=';'><div id='questionWrap-20'  class='   watupro-question-id-465759'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>20. <\/span>You\u2019re designing a new InfiniBand network for a distributed deep learning workload. The workload consists of a mix of large-message all- to-all communication and small-message parameter synchronization. 
<br \/>\r<br>Considering the different traffic patterns, what routing strategy would MOST effectively minimize latency and maximize bandwidth utilization across the fabric?<\/div><input type='hidden' name='question_id[]' id='qID_20' value='465759' \/><input type='hidden' id='answerType465759' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465759[]' id='answer-id-1800307' class='answer   answerof-465759 ' value='1800307'   \/><label for='answer-id-1800307' id='answer-label-1800307' class=' answer'><span>Rely solely on the default Subnet Manager (SM) with a Min Hop path selection algorithm.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465759[]' id='answer-id-1800308' class='answer   answerof-465759 ' value='1800308'   \/><label for='answer-id-1800308' id='answer-label-1800308' class=' answer'><span>Implement a static routing scheme with manually configured forwarding tables on each switch.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465759[]' id='answer-id-1800309' class='answer   answerof-465759 ' value='1800309'   \/><label for='answer-id-1800309' id='answer-label-1800309' class=' answer'><span>Utilize a combination of Adaptive Routing (AR) to handle dynamic traffic patterns and Quality of Service (QoS) to prioritize small-message parameter synchronization.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465759[]' id='answer-id-1800310' class='answer   answerof-465759 ' value='1800310'   \/><label for='answer-id-1800310' id='answer-label-1800310' class=' answer'><span>Implement a purely deterministic routing scheme, disabling all adaptive routing features.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input 
type='radio' name='answer-465759[]' id='answer-id-1800311' class='answer   answerof-465759 ' value='1800311'   \/><label for='answer-id-1800311' id='answer-label-1800311' class=' answer'><span>Disable multicast.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-21' style=';'><div id='questionWrap-21'  class='   watupro-question-id-465760'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>21. <\/span>An AI server with 8 GPUs is experiencing random system crashes under heavy load. The system logs indicate potential memory errors, but standard memory tests (memtest86+) pass without any failures. The GPUs are passively cooled. <br \/>\r<br>What are the THREE most likely root causes of these crashes?<\/div><input type='hidden' name='question_id[]' id='qID_21' value='465760' \/><input type='hidden' id='answerType465760' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465760[]' id='answer-id-1800312' class='answer   answerof-465760 ' value='1800312'   \/><label for='answer-id-1800312' id='answer-label-1800312' class=' answer'><span>Incompatible NVIDIA driver version with the installed Linux kernel.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465760[]' id='answer-id-1800313' class='answer   answerof-465760 ' value='1800313'   \/><label for='answer-id-1800313' id='answer-label-1800313' class=' answer'><span>GPU memory errors that are not detectable by standard CPU-based memory tests.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465760[]' id='answer-id-1800314' class='answer   answerof-465760 ' value='1800314'   \/><label for='answer-id-1800314' id='answer-label-1800314' class=' 
answer'><span>Insufficient airflow within the server, leading to overheating of the GPUs and VRMs.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465760[]' id='answer-id-1800315' class='answer   answerof-465760 ' value='1800315'   \/><label for='answer-id-1800315' id='answer-label-1800315' class=' answer'><span>A faulty power supply unit (PSU) that is unable to provide stable power under peak load.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465760[]' id='answer-id-1800316' class='answer   answerof-465760 ' value='1800316'   \/><label for='answer-id-1800316' id='answer-label-1800316' class=' answer'><span>Network congestion causing intermittent data corruption during distributed training.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-22' style=';'><div id='questionWrap-22'  class='   watupro-question-id-465761'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>22. <\/span>You\u2019re optimizing an Intel Xeon server with 4 NVIDIA GPUs for inference serving using Triton Inference Server. You\u2019ve deployed multiple models concurrently. You observe that the overall throughput is lower than expected, and the GPU utilization is not consistently high. <br \/>\r<br>What are potential bottlenecks and optimization strategies? 
(Select all that apply)<\/div><input type='hidden' name='question_id[]' id='qID_22' value='465761' \/><input type='hidden' id='answerType465761' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465761[]' id='answer-id-1800317' class='answer   answerof-465761 ' value='1800317'   \/><label for='answer-id-1800317' id='answer-label-1800317' class=' answer'><span>Model loading and unloading overhead. Use model ensemble or dynamic batching to reduce frequency.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465761[]' id='answer-id-1800318' class='answer   answerof-465761 ' value='1800318'   \/><label for='answer-id-1800318' id='answer-label-1800318' class=' answer'><span>Insufficient CPU cores to handle the model loading and preprocessing requests. Increase the number of Triton instance groups for CPU-based models.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465761[]' id='answer-id-1800319' class='answer   answerof-465761 ' value='1800319'   \/><label for='answer-id-1800319' id='answer-label-1800319' class=' answer'><span>The models are memory-bound. Reduce the model precision (e.g., FP32 to FP16 or INT8).<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465761[]' id='answer-id-1800320' class='answer   answerof-465761 ' value='1800320'   \/><label for='answer-id-1800320' id='answer-label-1800320' class=' answer'><span>The GPUs are underutilized due to small batch sizes. 
Implement dynamic batching to increase batch sizes.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465761[]' id='answer-id-1800321' class='answer   answerof-465761 ' value='1800321'   \/><label for='answer-id-1800321' id='answer-label-1800321' class=' answer'><span>Insufficient PCIe bandwidth between CPU and GPUs. Reconfigure PCIe lanes to improve bandwidth allocation to each GPU<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-23' style=';'><div id='questionWrap-23'  class='   watupro-question-id-465762'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>23. <\/span>You\u2019re working with a large dataset of microscopy images stored as individual TIFF files. The images are accessed randomly during a training job. The current storage solution is a single HDD. You\u2019re tasked with improving data loading performance. <br \/>\r<br>Which of the following storage optimizations would provide the GREATEST performance improvement in this specific scenario?<\/div><input type='hidden' name='question_id[]' id='qID_23' value='465762' \/><input type='hidden' id='answerType465762' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465762[]' id='answer-id-1800322' class='answer   answerof-465762 ' value='1800322'   \/><label for='answer-id-1800322' id='answer-label-1800322' class=' answer'><span>Implementing data deduplication on the storage volume.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465762[]' id='answer-id-1800323' class='answer   answerof-465762 ' value='1800323'   \/><label for='answer-id-1800323' id='answer-label-1800323' class=' answer'><span>Migrating the data to a large, sequential 
HDD<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465762[]' id='answer-id-1800324' class='answer   answerof-465762 ' value='1800324'   \/><label for='answer-id-1800324' id='answer-label-1800324' class=' answer'><span>Replacing the HDD with a RAID 5 array of HDDs.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465762[]' id='answer-id-1800325' class='answer   answerof-465762 ' value='1800325'   \/><label for='answer-id-1800325' id='answer-label-1800325' class=' answer'><span>Replacing the HDD with a single NVMe SSD<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465762[]' id='answer-id-1800326' class='answer   answerof-465762 ' value='1800326'   \/><label for='answer-id-1800326' id='answer-label-1800326' class=' answer'><span>Compressing the TIFF files using a lossless compression algorithm.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-24' style=';'><div id='questionWrap-24'  class='   watupro-question-id-465763'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>24. <\/span>You are configuring a network bridge on a Linux host that will connect multiple physical network interfaces to a virtual machine. You need to ensure that the virtual machine receives an IP address via DHCP. <br \/>\r<br>Which of the following is the correct command sequence to create the bridge interface \u2018br0\u2019, add physical interfaces \u2018eth0\u2019 and \u2018eth1\u2019 to it, and bring up the bridge interface? Assume the required packages are installed. Consider using the \u2018ip\u2019 command. 
<br \/>\r<br>A ) <br \/>\r<br><br><img decoding=\"async\" width=649 height=18 id=\"\u56fe\u7247 32\" src=\"https:\/\/www.dumpsbase.com\/freedumps\/wp-content\/uploads\/2026\/03\/image002-31.jpg\"><br><br \/>\r<br>B ) <br \/>\r<br><br><img decoding=\"async\" width=649 height=9 id=\"\u56fe\u7247 31\" src=\"https:\/\/www.dumpsbase.com\/freedumps\/wp-content\/uploads\/2026\/03\/image003-28.jpg\"><br><br \/>\r<br>C ) <br \/>\r<br><br><img decoding=\"async\" width=649 height=13 id=\"\u56fe\u7247 30\" src=\"https:\/\/www.dumpsbase.com\/freedumps\/wp-content\/uploads\/2026\/03\/image004-26.jpg\"><br><br \/>\r<br>D ) <br \/>\r<br><br><img decoding=\"async\" width=649 height=7 id=\"\u56fe\u7247 29\" src=\"https:\/\/www.dumpsbase.com\/freedumps\/wp-content\/uploads\/2026\/03\/image005-24.jpg\"><br><br \/>\r<br>E ) <br \/>\r<br><br><img decoding=\"async\" width=650 height=12 id=\"\u56fe\u7247 28\" src=\"https:\/\/www.dumpsbase.com\/freedumps\/wp-content\/uploads\/2026\/03\/image006-21.jpg\"><br><\/div><input type='hidden' name='question_id[]' id='qID_24' value='465763' \/><input type='hidden' id='answerType465763' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465763[]' id='answer-id-1800327' class='answer   answerof-465763 ' value='1800327'   \/><label for='answer-id-1800327' id='answer-label-1800327' class=' answer'><span>Option A<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465763[]' id='answer-id-1800328' class='answer   answerof-465763 ' value='1800328'   \/><label for='answer-id-1800328' id='answer-label-1800328' class=' answer'><span>Option B<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465763[]' id='answer-id-1800329' class='answer   answerof-465763 ' value='1800329'   \/><label for='answer-id-1800329' 
id='answer-label-1800329' class=' answer'><span>Option C<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465763[]' id='answer-id-1800330' class='answer   answerof-465763 ' value='1800330'   \/><label for='answer-id-1800330' id='answer-label-1800330' class=' answer'><span>Option D<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465763[]' id='answer-id-1800331' class='answer   answerof-465763 ' value='1800331'   \/><label for='answer-id-1800331' id='answer-label-1800331' class=' answer'><span>Option E<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-25' style=';'><div id='questionWrap-25'  class='   watupro-question-id-465764'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>25. <\/span>You are managing a cluster of GPU servers for deep learning. You observe that one server consistently exhibits high GPU temperature during training, causing thermal throttling and reduced performance. You\u2019ve already ensured adequate airflow. 
<br \/>\r<br>Which of the following actions would be MOST effective in addressing this issue?<\/div><input type='hidden' name='question_id[]' id='qID_25' value='465764' \/><input type='hidden' id='answerType465764' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465764[]' id='answer-id-1800332' class='answer   answerof-465764 ' value='1800332'   \/><label for='answer-id-1800332' id='answer-label-1800332' class=' answer'><span>Reduce the ambient temperature of the data center.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465764[]' id='answer-id-1800333' class='answer   answerof-465764 ' value='1800333'   \/><label for='answer-id-1800333' id='answer-label-1800333' class=' answer'><span>Lower the GPU power limit using \u2018nvidia-smi --power-limit\u2019.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465764[]' id='answer-id-1800334' class='answer   answerof-465764 ' value='1800334'   \/><label for='answer-id-1800334' id='answer-label-1800334' class=' answer'><span>Update the NVIDIA drivers to the latest version.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465764[]' id='answer-id-1800335' class='answer   answerof-465764 ' value='1800335'   \/><label for='answer-id-1800335' id='answer-label-1800335' class=' answer'><span>Re-seat the GPU in its PCIe slot to ensure proper contact and heat dissipation.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465764[]' id='answer-id-1800336' class='answer   answerof-465764 ' value='1800336'   \/><label for='answer-id-1800336' id='answer-label-1800336' class=' answer'><span>Increase the fan speed of the GPU cooler using \u2018nvidia-smi 
--fan\u2019.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-26' style=';'><div id='questionWrap-26'  class='   watupro-question-id-465765'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>26. <\/span>Your deep learning training job that utilizes NCCL (NVIDIA Collective Communications Library) for multi-GPU communication is failing with &quot;NCCL internal error, unhandled system error&quot; after a recent CUDA update. The error occurs during the \u2018all_reduce\u2019 operation. <br \/>\r<br>What is the most likely root cause and how would you address it?<\/div><input type='hidden' name='question_id[]' id='qID_26' value='465765' \/><input type='hidden' id='answerType465765' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465765[]' id='answer-id-1800337' class='answer   answerof-465765 ' value='1800337'   \/><label for='answer-id-1800337' id='answer-label-1800337' class=' answer'><span>Incompatible NCCL version with the new CUDA version. Update NCCL to a version compatible with the installed CUDA version.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465765[]' id='answer-id-1800338' class='answer   answerof-465765 ' value='1800338'   \/><label for='answer-id-1800338' id='answer-label-1800338' class=' answer'><span>Insufficient shared memory allocated to the CUDA context. 
Increase the shared memory limit using \u2018cudaDeviceSetLimit(cudaLimitSharedMemory, new_limit)\u2019.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465765[]' id='answer-id-1800339' class='answer   answerof-465765 ' value='1800339'   \/><label for='answer-id-1800339' id='answer-label-1800339' class=' answer'><span>Firewall rules blocking inter-GPU communication. Configure the firewall to allow communication on the NCCL-defined ports (typically 8000-8010).<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465765[]' id='answer-id-1800340' class='answer   answerof-465765 ' value='1800340'   \/><label for='answer-id-1800340' id='answer-label-1800340' class=' answer'><span>Faulty network cables used for inter-node communication (if the training job spans multiple servers). Replace the network cables with certified high-speed cables.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465765[]' id='answer-id-1800341' class='answer   answerof-465765 ' value='1800341'   \/><label for='answer-id-1800341' id='answer-label-1800341' class=' answer'><span>GPU Direct RDMA is not properly configured. Check \u2018dmesg\u2019 for errors and ensure RDMA is enabled.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-27' style=';'><div id='questionWrap-27'  class='   watupro-question-id-465766'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>27. <\/span>You observe high latency and low bandwidth between two GPUs connected via an NVLink switch. You suspect a problem with the NVLink link itself. 
<br \/>\r<br>Which of the following methods would be the most effective in diagnosing the physical NVLink link health?<\/div><input type='hidden' name='question_id[]' id='qID_27' value='465766' \/><input type='hidden' id='answerType465766' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465766[]' id='answer-id-1800342' class='answer   answerof-465766 ' value='1800342'   \/><label for='answer-id-1800342' id='answer-label-1800342' class=' answer'><span>Using \u2018iperf3\u2019 to measure network throughput between the servers.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465766[]' id='answer-id-1800343' class='answer   answerof-465766 ' value='1800343'   \/><label for='answer-id-1800343' id='answer-label-1800343' class=' answer'><span>Running a CUDA-aware memory bandwidth test specifically designed for NVLink.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465766[]' id='answer-id-1800344' class='answer   answerof-465766 ' value='1800344'   \/><label for='answer-id-1800344' id='answer-label-1800344' class=' answer'><span>Examining system logs for NVLink-related error messages.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465766[]' id='answer-id-1800345' class='answer   answerof-465766 ' value='1800345'   \/><label for='answer-id-1800345' id='answer-label-1800345' class=' answer'><span>Using \u2018ping\u2019 to check network connectivity between the servers.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465766[]' id='answer-id-1800346' class='answer   answerof-465766 ' value='1800346'   \/><label for='answer-id-1800346' id='answer-label-1800346' class=' answer'><span>Physically inspecting 
the NVLink cables for damage.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-28' style=';'><div id='questionWrap-28'  class='   watupro-question-id-465767'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>28. <\/span>You have a large dataset stored on a network file system (NFS) and are training a deep learning model on an AMD EPYC server with NVIDIA GPUs. Data loading is very slow. <br \/>\r<br>What steps can you take to improve the data loading performance in this scenario? Select all that apply.<\/div><input type='hidden' name='question_id[]' id='qID_28' value='465767' \/><input type='hidden' id='answerType465767' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465767[]' id='answer-id-1800347' class='answer   answerof-465767 ' value='1800347'   \/><label for='answer-id-1800347' id='answer-label-1800347' class=' answer'><span>Increase the number of NFS client threads on the AMD EPYC server.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465767[]' id='answer-id-1800348' class='answer   answerof-465767 ' value='1800348'   \/><label for='answer-id-1800348' id='answer-label-1800348' class=' answer'><span>Use a local SSD or NVMe drive to cache frequently accessed data.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465767[]' id='answer-id-1800349' class='answer   answerof-465767 ' value='1800349'   \/><label for='answer-id-1800349' id='answer-label-1800349' class=' answer'><span>Mount the NFS share with the \u2018nolock\u2019 option.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465767[]' id='answer-id-1800350' class='answer   
answerof-465767 ' value='1800350'   \/><label for='answer-id-1800350' id='answer-label-1800350' class=' answer'><span>Switch to a parallel file system like Lustre or BeeGFS.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465767[]' id='answer-id-1800351' class='answer   answerof-465767 ' value='1800351'   \/><label for='answer-id-1800351' id='answer-label-1800351' class=' answer'><span>Reduce the batch size to decrease the amount of data loaded per iteration.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-29' style=';'><div id='questionWrap-29'  class='   watupro-question-id-465768'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>29. <\/span>You are configuring network fabric ports for NVIDIA GPUs in a server. The GPUs are connected to the network via PCIe. <br \/>\r<br>What is the primary factor that determines the maximum achievable bandwidth between the GPUs and the network?<\/div><input type='hidden' name='question_id[]' id='qID_29' value='465768' \/><input type='hidden' id='answerType465768' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465768[]' id='answer-id-1800352' class='answer   answerof-465768 ' value='1800352'   \/><label for='answer-id-1800352' id='answer-label-1800352' class=' answer'><span>The clock speed of the CPU.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465768[]' id='answer-id-1800353' class='answer   answerof-465768 ' value='1800353'   \/><label for='answer-id-1800353' id='answer-label-1800353' class=' answer'><span>The amount of system RAM.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465768[]' 
id='answer-id-1800354' class='answer   answerof-465768 ' value='1800354'   \/><label for='answer-id-1800354' id='answer-label-1800354' class=' answer'><span>The PCIe generation and number of lanes connecting the GPUs to the network adapter (e.g., PCIe 4.0 x16).<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465768[]' id='answer-id-1800355' class='answer   answerof-465768 ' value='1800355'   \/><label for='answer-id-1800355' id='answer-label-1800355' class=' answer'><span>The speed of the system\u2019s hard drives or SSDs.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465768[]' id='answer-id-1800356' class='answer   answerof-465768 ' value='1800356'   \/><label for='answer-id-1800356' id='answer-label-1800356' class=' answer'><span>The color of the Ethernet cables.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-30' style=';'><div id='questionWrap-30'  class='   watupro-question-id-465769'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>30. <\/span>An InfiniBand fabric is experiencing intermittent packet loss between two high-performance compute nodes. You suspect a faulty cable or connector. 
<br \/>\r<br>Besides physically inspecting the cables, what software-based tools or techniques can you employ to diagnose potential link errors contributing to this packet loss?<\/div><input type='hidden' name='question_id[]' id='qID_30' value='465769' \/><input type='hidden' id='answerType465769' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465769[]' id='answer-id-1800357' class='answer   answerof-465769 ' value='1800357'   \/><label for='answer-id-1800357' id='answer-label-1800357' class=' answer'><span>Use \u2018ibdiagnet\u2019 to perform a comprehensive fabric analysis, including link integrity checks and error detection.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465769[]' id='answer-id-1800358' class='answer   answerof-465769 ' value='1800358'   \/><label for='answer-id-1800358' id='answer-label-1800358' class=' answer'><span>Monitor the port counters on the InfiniBand switches connected to the compute nodes. Look for excessive CRC errors, symbol errors, or other link-related error counts.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465769[]' id='answer-id-1800359' class='answer   answerof-465769 ' value='1800359'   \/><label for='answer-id-1800359' id='answer-label-1800359' class=' answer'><span>Run \u2018iperf\u2019 or \u2018ibperf\u2019 between the two compute nodes and analyze the reported packet loss rate. 
Correlate this with the error counters on the switches.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465769[]' id='answer-id-1800360' class='answer   answerof-465769 ' value='1800360'   \/><label for='answer-id-1800360' id='answer-label-1800360' class=' answer'><span>All of the above<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465769[]' id='answer-id-1800361' class='answer   answerof-465769 ' value='1800361'   \/><label for='answer-id-1800361' id='answer-label-1800361' class=' answer'><span>Disable port mirroring.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-31' style=';'><div id='questionWrap-31'  class='   watupro-question-id-465770'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>31. <\/span>You notice that one of the fans in your GPU server is running at a significantly higher RPM than the others, even under minimal load. \u2018ipmitool sensor\u2019 output shows a normal temperature for that GPU. 
<br \/>\r<br>What could be the potential causes?<\/div><input type='hidden' name='question_id[]' id='qID_31' value='465770' \/><input type='hidden' id='answerType465770' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465770[]' id='answer-id-1800362' class='answer   answerof-465770 ' value='1800362'   \/><label for='answer-id-1800362' id='answer-label-1800362' class=' answer'><span>The fan\u2019s PWM control signal is malfunctioning, causing it to run at full speed.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465770[]' id='answer-id-1800363' class='answer   answerof-465770 ' value='1800363'   \/><label for='answer-id-1800363' id='answer-label-1800363' class=' answer'><span>The fan bearing is wearing out, causing increased friction and requiring higher RPM to maintain airflow.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465770[]' id='answer-id-1800364' class='answer   answerof-465770 ' value='1800364'   \/><label for='answer-id-1800364' id='answer-label-1800364' class=' answer'><span>The fan is attempting to compensate for restricted airflow due to dust buildup.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465770[]' id='answer-id-1800365' class='answer   answerof-465770 ' value='1800365'   \/><label for='answer-id-1800365' id='answer-label-1800365' class=' answer'><span>The server\u2019s BMC (Baseboard Management Controller) has a faulty temperature sensor reading, causing it to overcompensate.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465770[]' id='answer-id-1800366' class='answer   answerof-465770 ' value='1800366'   \/><label for='answer-id-1800366' 
id='answer-label-1800366' class=' answer'><span>A network connectivity issue is causing higher CPU utilization, leading to increased system-wide heat.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-32' style=';'><div id='questionWrap-32'  class='   watupro-question-id-465771'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>32. <\/span>You are tasked with ensuring optimal power efficiency for a GPU server running machine learning workloads. You want to dynamically adjust the GPU\u2019s power consumption based on its utilization. <br \/>\r<br>Which of the following methods is the MOST suitable for achieving this, assuming the server\u2019s BIOS and the NVIDIA drivers support it?<\/div><input type='hidden' name='question_id[]' id='qID_32' value='465771' \/><input type='hidden' id='answerType465771' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465771[]' id='answer-id-1800367' class='answer   answerof-465771 ' value='1800367'   \/><label for='answer-id-1800367' id='answer-label-1800367' class=' answer'><span>Manually set the GPU\u2019s power limit using \u2018nvidia-smi -pl\u2019 and create a script to monitor utilization and adjust the power limit periodically.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465771[]' id='answer-id-1800368' class='answer   answerof-465771 ' value='1800368'   \/><label for='answer-id-1800368' id='answer-label-1800368' class=' answer'><span>Configure the server\u2019s BIOS\/UEFI to use a power-saving profile, which will automatically reduce the GPU\u2019s power consumption when idle.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465771[]' id='answer-id-1800369' class='answer   
answerof-465771 ' value='1800369'   \/><label for='answer-id-1800369' id='answer-label-1800369' class=' answer'><span>Enable Dynamic Boost in the NVIDIA Control Panel (if available), which will automatically allocate power between the CPU and GPU based on their current needs.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465771[]' id='answer-id-1800370' class='answer   answerof-465771 ' value='1800370'   \/><label for='answer-id-1800370' id='answer-label-1800370' class=' answer'><span>Use NVIDIA\u2019s Data Center GPU Manager (DCGM) to monitor GPU utilization and dynamically adjust the power limit based on a predefined policy.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465771[]' id='answer-id-1800371' class='answer   answerof-465771 ' value='1800371'   \/><label for='answer-id-1800371' id='answer-label-1800371' class=' answer'><span>Disable ECC (Error Correcting Code) on the GPU to reduce power consumption.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-33' style=';'><div id='questionWrap-33'  class='   watupro-question-id-465772'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>33. <\/span>You are tasked with troubleshooting a performance bottleneck in a multi-node, multi-GPU deep learning training job utilizing Horovod. <br \/>\r<br>The training loss is decreasing, but the overall training time is significantly longer than expected. 
<br \/>\r<br>Which of the following monitoring approaches would provide the most insight into the cause of the bottleneck?<\/div><input type='hidden' name='question_id[]' id='qID_33' value='465772' \/><input type='hidden' id='answerType465772' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465772[]' id='answer-id-1800372' class='answer   answerof-465772 ' value='1800372'   \/><label for='answer-id-1800372' id='answer-label-1800372' class=' answer'><span>Using \u2018nvidia-smi\u2019 on each node to monitor GPU utilization and memory usage.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465772[]' id='answer-id-1800373' class='answer   answerof-465772 ' value='1800373'   \/><label for='answer-id-1800373' id='answer-label-1800373' class=' answer'><span>Enabling Horovod\u2019s timeline and profiling features to visualize the communication patterns and identify synchronization bottlenecks.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465772[]' id='answer-id-1800374' class='answer   answerof-465772 ' value='1800374'   \/><label for='answer-id-1800374' id='answer-label-1800374' class=' answer'><span>Monitoring network bandwidth utilization on each node using \u2018iftop\u2019 or \u2018iperf3\u2019<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465772[]' id='answer-id-1800375' class='answer   answerof-465772 ' value='1800375'   \/><label for='answer-id-1800375' id='answer-label-1800375' class=' answer'><span>Analyzing the training loss curve to identify potential issues with the model architecture or hyperparameters.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465772[]' id='answer-id-1800376' 
class='answer   answerof-465772 ' value='1800376'   \/><label for='answer-id-1800376' id='answer-label-1800376' class=' answer'><span>Using \u2018htop\u2019 to monitor CPU utilization on each node.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-34' style=';'><div id='questionWrap-34'  class='   watupro-question-id-465773'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>34. <\/span>You are replacing a faulty NVIDIA Tesla V100 GPU in a server. After physically installing the new GPU, the system fails to recognize it. You\u2019ve verified the power connections and seating of the card. <br \/>\r<br>Which of the following steps should you take next to troubleshoot the issue?<\/div><input type='hidden' name='question_id[]' id='qID_34' value='465773' \/><input type='hidden' id='answerType465773' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465773[]' id='answer-id-1800377' class='answer   answerof-465773 ' value='1800377'   \/><label for='answer-id-1800377' id='answer-label-1800377' class=' answer'><span>Immediately RMA the new GPU as it is likely defective.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465773[]' id='answer-id-1800378' class='answer   answerof-465773 ' value='1800378'   \/><label for='answer-id-1800378' id='answer-label-1800378' class=' answer'><span>Update the system BIOS and BMC firmware to the latest versions.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465773[]' id='answer-id-1800379' class='answer   answerof-465773 ' value='1800379'   \/><label for='answer-id-1800379' id='answer-label-1800379' class=' answer'><span>Reinstall the operating system to ensure proper driver 
installation.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465773[]' id='answer-id-1800380' class='answer   answerof-465773 ' value='1800380'   \/><label for='answer-id-1800380' id='answer-label-1800380' class=' answer'><span>Check if the new GPU requires a different driver version than the currently installed one and update if needed.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465773[]' id='answer-id-1800381' class='answer   answerof-465773 ' value='1800381'   \/><label for='answer-id-1800381' id='answer-label-1800381' class=' answer'><span>Disable and re-enable the GPU slot in the system BIOS.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-35' style=';'><div id='questionWrap-35'  class='   watupro-question-id-465774'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>35. <\/span>You\u2019re troubleshooting a DGX-1 server exhibiting performance degradation during a large-scale distributed training job. \u2018nvidia-smi\u2019 shows all GPUs are detected, but one GPU consistently reports significantly lower utilization than the others. Attempts to reschedule workloads to that GPU frequently result in CUDA errors. 
<br \/>\r<br>Which of the following is the MOST likely cause and the BEST initial troubleshooting step?<\/div><input type='hidden' name='question_id[]' id='qID_35' value='465774' \/><input type='hidden' id='answerType465774' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465774[]' id='answer-id-1800382' class='answer   answerof-465774 ' value='1800382'   \/><label for='answer-id-1800382' id='answer-label-1800382' class=' answer'><span>A driver issue affecting only one GPU; reinstall NVIDIA drivers completely.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465774[]' id='answer-id-1800383' class='answer   answerof-465774 ' value='1800383'   \/><label for='answer-id-1800383' id='answer-label-1800383' class=' answer'><span>A software bug in the training script utilizing that specific GPU\u2019s resources inefficiently; debug the training script.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465774[]' id='answer-id-1800384' class='answer   answerof-465774 ' value='1800384'   \/><label for='answer-id-1800384' id='answer-label-1800384' class=' answer'><span>A hardware fault with the GPU, potentially thermal throttling or memory issues; run \u2018nvidia-smi -i -q\u2019 to check temperatures, power limits, and error counts.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465774[]' id='answer-id-1800385' class='answer   answerof-465774 ' value='1800385'   \/><label for='answer-id-1800385' id='answer-label-1800385' class=' answer'><span>Insufficient cooling in the server rack; verify adequate airflow and cooling capacity for the rack.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465774[]' 
id='answer-id-1800386' class='answer   answerof-465774 ' value='1800386'   \/><label for='answer-id-1800386' id='answer-label-1800386' class=' answer'><span>Power supply unit (PSU) overload, causing reduced power delivery to that GPU; monitor PSU load and check PSU specifications.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-36' style=';'><div id='questionWrap-36'  class='   watupro-question-id-465775'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>36. <\/span>You are tasked with installing a DGX A100 server. After racking and connecting power and network cables, you power it on, but the BMC (Baseboard Management Controller) is not accessible via the network. You have verified the network cable is connected and the switch port is active. <br \/>\r<br>What are the MOST likely causes and initial troubleshooting steps you should take?<\/div><input type='hidden' name='question_id[]' id='qID_36' value='465775' \/><input type='hidden' id='answerType465775' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465775[]' id='answer-id-1800387' class='answer   answerof-465775 ' value='1800387'   \/><label for='answer-id-1800387' id='answer-label-1800387' class=' answer'><span>The BMC IP address is not configured or is on a different subnet. Check the BMC\u2019s network configuration using the DGX\u2019s front panel or via serial console. 
Verify DHCP is enabled and functioning or manually configure a static IP address.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465775[]' id='answer-id-1800388' class='answer   answerof-465775 ' value='1800388'   \/><label for='answer-id-1800388' id='answer-label-1800388' class=' answer'><span>The BMC firmware is corrupted and needs to be reflashed using a USB drive. Check the DGX support site for the latest BMC firmware.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465775[]' id='answer-id-1800389' class='answer   answerof-465775 ' value='1800389'   \/><label for='answer-id-1800389' id='answer-label-1800389' class=' answer'><span>The BMC is not powered on because the main power supply is faulty. Verify the power supply LEDs are lit and providing power to the system.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465775[]' id='answer-id-1800390' class='answer   answerof-465775 ' value='1800390'   \/><label for='answer-id-1800390' id='answer-label-1800390' class=' answer'><span>The network switch port is not configured for the correct VLAN.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465775[]' id='answer-id-1800391' class='answer   answerof-465775 ' value='1800391'   \/><label for='answer-id-1800391' id='answer-label-1800391' class=' answer'><span>Verify the switch port configuration to ensure it is on the same VLAN as the BMC.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465775[]' id='answer-id-1800392' class='answer   answerof-465775 ' value='1800392'   \/><label for='answer-id-1800392' id='answer-label-1800392' class=' answer'><span>The BMC is faulty and needs to be replaced. 
Contact NVIDIA support for RMA.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-37' style=';'><div id='questionWrap-37'  class='   watupro-question-id-465776'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>37. <\/span>A user reports that their GPU-accelerated application is crashing with a CUDA error related to \u2018out of memory\u2019. You have confirmed that the GPU has sufficient physical memory. <br \/>\r<br>What are the likely causes and troubleshooting steps?<\/div><input type='hidden' name='question_id[]' id='qID_37' value='465776' \/><input type='hidden' id='answerType465776' value='checkbox'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465776[]' id='answer-id-1800393' class='answer   answerof-465776 ' value='1800393'   \/><label for='answer-id-1800393' id='answer-label-1800393' class=' answer'><span>The application is leaking GPU memory. Use a memory profiling tool like \u2018cuda-memcheck\u2019 to identify the source of the leak.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465776[]' id='answer-id-1800394' class='answer   answerof-465776 ' value='1800394'   \/><label for='answer-id-1800394' id='answer-label-1800394' class=' answer'><span>The application is requesting a larger block of memory than is available in a single allocation. 
Try breaking the allocation into smaller chunks or using managed memory.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465776[]' id='answer-id-1800395' class='answer   answerof-465776 ' value='1800395'   \/><label for='answer-id-1800395' id='answer-label-1800395' class=' answer'><span>The CUDA driver version is incompatible with the CUDA runtime version used by the application. Update the CUDA driver to match the runtime version.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465776[]' id='answer-id-1800396' class='answer   answerof-465776 ' value='1800396'   \/><label for='answer-id-1800396' id='answer-label-1800396' class=' answer'><span>The process has exceeded the maximum number of GPU contexts allowed. Reduce the number of concurrent CUDA applications running on the GPU.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='checkbox' name='answer-465776[]' id='answer-id-1800397' class='answer   answerof-465776 ' value='1800397'   \/><label for='answer-id-1800397' id='answer-label-1800397' class=' answer'><span>The system\u2019s virtual memory is exhausted. Increase the swap space.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-38' style=';'><div id='questionWrap-38'  class='   watupro-question-id-465777'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>38. <\/span>You are troubleshooting a network performance issue in your NVIDIA Spectrum-X based AI cluster. You suspect that the Equal-Cost Multi-Path (ECMP) hashing algorithm is not distributing traffic evenly across available paths, leading to congestion on some links. 
<br \/>\r<br>Which of the following methods would be MOST effective for verifying and addressing this issue?<\/div><input type='hidden' name='question_id[]' id='qID_38' value='465777' \/><input type='hidden' id='answerType465777' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465777[]' id='answer-id-1800398' class='answer   answerof-465777 ' value='1800398'   \/><label for='answer-id-1800398' id='answer-label-1800398' class=' answer'><span>Use \u2018ping\u2019 or \u2018traceroute\u2019 to analyze the paths taken by packets between the affected nodes. If they always take the same path, ECMP is likely not working correctly.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465777[]' id='answer-id-1800399' class='answer   answerof-465777 ' value='1800399'   \/><label for='answer-id-1800399' id='answer-label-1800399' class=' answer'><span>Use switch telemetry tools (e.g., NVIDIA What\u2019s Up Gold, Mellanox NEO, or similar) to monitor link utilization across all available paths between the nodes. 
Look for significant imbalances in traffic volume.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465777[]' id='answer-id-1800400' class='answer   answerof-465777 ' value='1800400'   \/><label for='answer-id-1800400' id='answer-label-1800400' class=' answer'><span>Restart the switches to force the ECMP hashing algorithm to recalculate paths.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465777[]' id='answer-id-1800401' class='answer   answerof-465777 ' value='1800401'   \/><label for='answer-id-1800401' id='answer-label-1800401' class=' answer'><span>Disable ECMP entirely and rely solely on static routing.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465777[]' id='answer-id-1800402' class='answer   answerof-465777 ' value='1800402'   \/><label for='answer-id-1800402' id='answer-label-1800402' class=' answer'><span>Reduce the TCP window size.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-39' style=';'><div id='questionWrap-39'  class='   watupro-question-id-465778'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>39. <\/span>You are tasked with optimizing storage performance for a deep learning training job on an NVIDIA DGX server. The training data consists of millions of small image files. 
<br \/>\r<br>Which of the following storage optimization techniques would be MOST effective in reducing I\/O bottlenecks?<\/div><input type='hidden' name='question_id[]' id='qID_39' value='465778' \/><input type='hidden' id='answerType465778' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465778[]' id='answer-id-1800403' class='answer   answerof-465778 ' value='1800403'   \/><label for='answer-id-1800403' id='answer-label-1800403' class=' answer'><span>Implementing RAID 0 across all storage devices.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465778[]' id='answer-id-1800404' class='answer   answerof-465778 ' value='1800404'   \/><label for='answer-id-1800404' id='answer-label-1800404' class=' answer'><span>Using a distributed file system with data striping across multiple storage nodes.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465778[]' id='answer-id-1800405' class='answer   answerof-465778 ' value='1800405'   \/><label for='answer-id-1800405' id='answer-label-1800405' class=' answer'><span>Enabling data compression on the storage volume.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465778[]' id='answer-id-1800406' class='answer   answerof-465778 ' value='1800406'   \/><label for='answer-id-1800406' id='answer-label-1800406' class=' answer'><span>Increasing the block size of the file system to the maximum supported value.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465778[]' id='answer-id-1800407' class='answer   answerof-465778 ' value='1800407'   \/><label for='answer-id-1800407' id='answer-label-1800407' class=' answer'><span>Implementing a tiered storage system with NVMe 
drives for frequently accessed data and HDDs for less frequently accessed data.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div class='watu-question ' id='question-40' style=';'><div id='questionWrap-40'  class='   watupro-question-id-465779'>\n\t\t\t<div class='question-content'><div><span class='watupro_num'>40. <\/span>Consider the following \u2018ibroute\u2019 command used on an InfiniBand host: \u2018ibroute add dest 0x1a dev ib0\u2019. <br \/>\r<br>What is the MOST likely purpose of this command?<\/div><input type='hidden' name='question_id[]' id='qID_40' value='465779' \/><input type='hidden' id='answerType465779' value='radio'><!-- end question-content--><\/div><div class='question-choices watupro-choices-columns '><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465779[]' id='answer-id-1800408' class='answer   answerof-465779 ' value='1800408'   \/><label for='answer-id-1800408' id='answer-label-1800408' class=' answer'><span>To add a default route for all traffic destined outside the InfiniBand subnet.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465779[]' id='answer-id-1800409' class='answer   answerof-465779 ' value='1800409'   \/><label for='answer-id-1800409' id='answer-label-1800409' class=' answer'><span>To create a static route for traffic destined to LID 0x1a, using the InfiniBand interface ib0.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465779[]' id='answer-id-1800410' class='answer   answerof-465779 ' value='1800410'   \/><label for='answer-id-1800410' id='answer-label-1800410' class=' answer'><span>To configure the MTU size on the ib0 interface to 0x1a bytes.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465779[]' id='answer-id-1800411' class='answer   answerof-465779 ' 
value='1800411'   \/><label for='answer-id-1800411' id='answer-label-1800411' class=' answer'><span>To disable routing on the ib0 interface.<\/span><\/label><\/div><div class='watupro-question-choice  ' dir='auto' ><input type='radio' name='answer-465779[]' id='answer-id-1800412' class='answer   answerof-465779 ' value='1800412'   \/><label for='answer-id-1800412' id='answer-label-1800412' class=' answer'><span>To configure a static route for traffic destined to IP address 0x1a, using the InfiniBand interface ib0.<\/span><\/label><\/div><!-- end question-choices--><\/div><!-- end questionWrap--><\/div><\/div><div style='display:none' id='question-41'>\n\t<div class='question-content'>\n\t\t<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.dumpsbase.com\/freedumps\/wp-content\/plugins\/watupro\/img\/loading.gif\" width=\"16\" height=\"16\" alt=\"Loading...\" title=\"Loading...\" \/>&nbsp;Loading...\t<\/div>\n<\/div>\n\n<br \/>\n\t\n\t\t\t<div class=\"watupro_buttons flex \" id=\"watuPROButtons11887\" >\n\t\t  <div id=\"prev-question\" style=\"display:none;\"><input type=\"button\" value=\"&lt; Previous\" onclick=\"WatuPRO.nextQuestion(event, 'previous');\"\/><\/div>\t\t  \t\t  \t\t   \n\t\t   \t  \t\t<div><input type=\"button\" name=\"action\" class=\"watupro-submit-button\" onclick=\"WatuPRO.submitResult(event)\" id=\"action-button\" value=\"View Results\"  \/>\n\t\t<\/div>\n\t\t<\/div>\n\t\t\n\t<input type=\"hidden\" name=\"quiz_id\" value=\"11887\" id=\"watuPROExamID\"\/>\n\t<input type=\"hidden\" name=\"start_time\" id=\"startTime\" value=\"2026-05-16 11:31:11\" \/>\n\t<input type=\"hidden\" name=\"start_timestamp\" id=\"startTimeStamp\" value=\"1778931071\" \/>\n\t<input type=\"hidden\" name=\"question_ids\" value=\"\" \/>\n\t<input type=\"hidden\" name=\"watupro_questions\" value=\"465740:1800212,1800213,1800214,1800215,1800216 | 465741:1800217,1800218,1800219,1800220,1800221 | 465742:1800222,1800223,1800224,1800225,1800226 | 
465743:1800227,1800228,1800229,1800230,1800231 | 465744:1800232,1800233,1800234,1800235,1800236 | 465745:1800237,1800238,1800239,1800240,1800241 | 465746:1800242,1800243,1800244,1800245,1800246 | 465747:1800247,1800248,1800249,1800250,1800251 | 465748:1800252,1800253,1800254,1800255,1800256 | 465749:1800257,1800258,1800259,1800260,1800261 | 465750:1800262,1800263,1800264,1800265,1800266 | 465751:1800267,1800268,1800269,1800270,1800271 | 465752:1800272,1800273,1800274,1800275,1800276 | 465753:1800277,1800278,1800279,1800280,1800281 | 465754:1800282,1800283,1800284,1800285,1800286 | 465755:1800287,1800288,1800289,1800290,1800291 | 465756:1800292,1800293,1800294,1800295,1800296 | 465757:1800297,1800298,1800299,1800300,1800301 | 465758:1800302,1800303,1800304,1800305,1800306 | 465759:1800307,1800308,1800309,1800310,1800311 | 465760:1800312,1800313,1800314,1800315,1800316 | 465761:1800317,1800318,1800319,1800320,1800321 | 465762:1800322,1800323,1800324,1800325,1800326 | 465763:1800327,1800328,1800329,1800330,1800331 | 465764:1800332,1800333,1800334,1800335,1800336 | 465765:1800337,1800338,1800339,1800340,1800341 | 465766:1800342,1800343,1800344,1800345,1800346 | 465767:1800347,1800348,1800349,1800350,1800351 | 465768:1800352,1800353,1800354,1800355,1800356 | 465769:1800357,1800358,1800359,1800360,1800361 | 465770:1800362,1800363,1800364,1800365,1800366 | 465771:1800367,1800368,1800369,1800370,1800371 | 465772:1800372,1800373,1800374,1800375,1800376 | 465773:1800377,1800378,1800379,1800380,1800381 | 465774:1800382,1800383,1800384,1800385,1800386 | 465775:1800387,1800388,1800389,1800390,1800391,1800392 | 465776:1800393,1800394,1800395,1800396,1800397 | 465777:1800398,1800399,1800400,1800401,1800402 | 465778:1800403,1800404,1800405,1800406,1800407 | 465779:1800408,1800409,1800410,1800411,1800412\" \/>\n\t<input type=\"hidden\" name=\"no_ajax\" value=\"0\">\t\t\t<\/form>\n\t<p>&nbsp;<\/p>\n<\/div>\n\n<script 
type=\"text\/javascript\">\n\/\/jQuery(document).ready(function(){\ndocument.addEventListener(\"DOMContentLoaded\", function(event) { \t\nvar question_ids = \"465740,465741,465742,465743,465744,465745,465746,465747,465748,465749,465750,465751,465752,465753,465754,465755,465756,465757,465758,465759,465760,465761,465762,465763,465764,465765,465766,465767,465768,465769,465770,465771,465772,465773,465774,465775,465776,465777,465778,465779\";\nWatuPROSettings[11887] = {};\nWatuPRO.qArr = question_ids.split(',');\nWatuPRO.exam_id = 11887;\t    \nWatuPRO.post_id = 122484;\nWatuPRO.store_progress = 0;\nWatuPRO.curCatPage = 1;\nWatuPRO.requiredIDs=\"0\".split(\",\");\nWatuPRO.hAppID = \"0.33078200 1778931071\";\nvar url = \"https:\/\/www.dumpsbase.com\/freedumps\/wp-content\/plugins\/watupro\/show_exam.php\";\nWatuPRO.examMode = 1;\nWatuPRO.siteURL=\"https:\/\/www.dumpsbase.com\/freedumps\/wp-admin\/admin-ajax.php\";\nWatuPRO.emailIsNotRequired = 0;\nWatuPROIntel.init(11887);\nWatuPRO.inCategoryPages=1;});    \t \n<\/script>\n","protected":false},"excerpt":{"rendered":"<p>It has been verified that the NCP-AII dumps (V10.03) with practice questions and answers are valid for passing the NVIDIA Certified Professional AI Infrastructure certification exam. And we have shared the NCP-AII free dumps (Part 1, Q1-Q39) of V10.03 online to help you check the quality. 
From the free demo questions, you can believe that [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[18718,18913],"tags":[20996],"class_list":["post-122484","post","type-post","status-publish","format-standard","hentry","category-nvidia","category-nvidia-certified-professional","tag-ncp-aii"],"_links":{"self":[{"href":"https:\/\/www.dumpsbase.com\/freedumps\/wp-json\/wp\/v2\/posts\/122484","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.dumpsbase.com\/freedumps\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.dumpsbase.com\/freedumps\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.dumpsbase.com\/freedumps\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.dumpsbase.com\/freedumps\/wp-json\/wp\/v2\/comments?post=122484"}],"version-history":[{"count":2,"href":"https:\/\/www.dumpsbase.com\/freedumps\/wp-json\/wp\/v2\/posts\/122484\/revisions"}],"predecessor-version":[{"id":122486,"href":"https:\/\/www.dumpsbase.com\/freedumps\/wp-json\/wp\/v2\/posts\/122484\/revisions\/122486"}],"wp:attachment":[{"href":"https:\/\/www.dumpsbase.com\/freedumps\/wp-json\/wp\/v2\/media?parent=122484"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.dumpsbase.com\/freedumps\/wp-json\/wp\/v2\/categories?post=122484"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.dumpsbase.com\/freedumps\/wp-json\/wp\/v2\/tags?post=122484"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}