Azure Data Engineer Exam DP-200 Questions and Answers

How do you complete the Microsoft Certified: Azure Data Engineer Associate certification? You must pass two exams: DP-200 and DP-201. In addition to offering valid DP-201 exam dumps online, we have also updated the Azure Data Engineer Exam DP-200 Questions and Answers to help you pass the Microsoft Certified: Azure Data Engineer Associate exams. The new DP-200 exam questions come in PDF and software formats to help you prepare for the DP-200 exam.

Free DP-200 Microsoft Azure Data Engineer Exam Dumps

1. You are a data engineer implementing a lambda architecture on Microsoft Azure. You use an open-source big data solution to collect, process, and maintain data. The analytical data store performs poorly.

You must implement a solution that meets the following requirements:

– Provide data warehousing

– Reduce ongoing management activities

– Deliver SQL query responses in less than one second

You need to create an HDInsight cluster to meet the requirements.

Which type of cluster should you create?

 
 
 
 

2. DRAG DROP

You develop data engineering solutions for a company. You must migrate data from Microsoft Azure Blob storage to an Azure SQL Data Warehouse for further transformation. You need to implement the solution.

Which four actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

3. You develop data engineering solutions for a company. The company has on-premises Microsoft SQL Server databases at multiple locations.

The company must integrate data with Microsoft Power BI and Microsoft Azure Logic Apps. The solution must avoid single points of failure during connection and transfer to the cloud. The solution must also minimize latency. You need to secure the transfer of data between on-premises databases and Microsoft Azure.

What should you do?

 
 
 
 

4. You are a data architect. The data engineering team needs to configure a synchronization of data between an on-premises Microsoft SQL Server database and Azure SQL Database.

Ad-hoc and reporting queries are overutilizing the on-premises production instance.

The synchronization process must:

– Perform an initial data synchronization to Azure SQL Database with minimal downtime

– Perform bi-directional data synchronization after initial synchronization

You need to implement this synchronization solution.

Which synchronization method should you use?

 
 
 
 
 

5. An application will use Microsoft Azure Cosmos DB as its data solution. The application will use the Cassandra API to support a column-based database type that uses containers to store items.

You need to provision Azure Cosmos DB.

Which container name and item name should you use? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

 
 
 
 
 

6. A company has a SaaS solution that uses Azure SQL Database with elastic pools. The solution contains a dedicated database for each customer organization. Customer organizations have peak usage at different periods during the year. You need to implement the Azure SQL Database elastic pool to minimize cost.

Which option or options should you configure?

 
 
 
 
 

7. HOTSPOT

You are a data engineer. You are designing a Hadoop Distributed File System (HDFS) architecture. You plan to use Microsoft Azure Data Lake as a data storage repository. You must provision the repository with a resilient data schema. You need to ensure the resiliency of the Azure Data Lake Storage.

What should you use? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

8. DRAG DROP

You are developing the data platform for a global retail company. The company operates during normal working hours in each region. The analytical database is used once a week for building sales projections. Each region maintains its own private virtual network.

Building the sales projections is very resource intensive and generates upwards of 20 terabytes (TB) of data. Microsoft Azure SQL Databases must be provisioned.

– Database provisioning must maximize performance and minimize cost

– The daily sales for each region must be stored in an Azure SQL Database instance

– Once a day, the data for all regions must be loaded into an analytical Azure SQL Database instance

You need to provision Azure SQL database instances.

How should you provision the database instances? To answer, drag the appropriate Azure SQL products to the correct databases. Each Azure SQL product may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content. NOTE: Each correct selection is worth one point.

9. A company manages several on-premises Microsoft SQL Server databases. You need to migrate the databases to Microsoft Azure by using a backup process of Microsoft SQL Server.

Which data technology should you use?

 
 
 
 

10. The data engineering team manages Azure HDInsight clusters. The team spends a large amount of time creating and destroying clusters daily because most of the data pipeline process runs in minutes. You need to implement a solution that deploys multiple HDInsight clusters with minimal effort.

What should you implement?

 
 
 
 

11. You are the data engineer for your company. An application uses a NoSQL database to store data. The database uses the key-value and wide-column NoSQL database type. Developers need to access data in the database using an API. You need to determine which API to use for the database model and type.

Which two APIs should you use? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point.

 
 
 
 
 

12. A company is designing a hybrid solution to synchronize data from an on-premises Microsoft SQL Server database to Azure SQL Database. You must perform an assessment of databases to determine whether data will move without compatibility issues. You need to perform the assessment.

Which tool should you use?

 
 
 
 
 

13. DRAG DROP

You manage a financial computation data analysis process. Microsoft Azure virtual machines (VMs) run the process in daily jobs, and store the results in virtual hard drives (VHDs). The VMs produce results using data from the previous day and store the results in a snapshot of the VHD. When a new month begins, a process creates a new VHD.

You must implement the following data retention requirements:

– Daily results must be kept for 90 days

– Data for the current year must be available for weekly reports

– Data from the previous 10 years must be stored for auditing purposes

– Data required for an audit must be produced within 10 days of a request.

You need to enforce the data retention requirements while minimizing cost.

How should you configure the lifecycle policy? To answer, drag the appropriate JSON segments to the correct locations. Each JSON segment may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content. NOTE: Each correct selection is worth one point.

14. A company plans to use Azure SQL Database to support a mission-critical application. The application must be highly available without performance degradation during maintenance windows. You need to implement the solution.

Which three technologies should you implement? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

 
 
 
 
 
 

15. A company plans to use Azure Storage for file storage purposes.

Compliance rules require:

– A single storage account to store all operations, including reads, writes, and deletes

– Retention of an on-premises copy of historical operations

You need to configure the storage account.

Which two actions should you perform? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

 
 
 
 
 

16. DRAG DROP

You are developing a solution to visualize multiple terabytes of geospatial data.

The solution has the following requirements:

– Data must be encrypted.

– Data must be accessible by multiple resources on Microsoft Azure.

You need to provision storage for the solution.

Which four actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

17. You are developing a data engineering solution for a company. The solution will store a large set of key-value pair data by using Microsoft Azure Cosmos DB.

The solution has the following requirements:

– Data must be partitioned into multiple containers.

– Data containers must be configured separately.

– Data must be accessible from applications hosted around the world.

– The solution must minimize latency.

You need to provision Azure Cosmos DB.

 
 
 
 
 
 
 

18. A company has a SaaS solution that uses Azure SQL Database with elastic pools. The solution will have a dedicated database for each customer organization. Customer organizations have peak usage at different periods during the year.

Which two factors affect your costs when sizing the Azure SQL Database elastic pools? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point.

 
 
 
 
 

19. HOTSPOT

You are developing a solution using a Lambda architecture on Microsoft Azure.

The data at rest layer must meet the following requirements:

* Data storage:

– Serve as a repository for high volumes of large files in various formats.

– Implement optimized storage for big data analytics workloads.

– Ensure that data can be organized using a hierarchical structure.

* Batch processing:

– Use a managed solution for in-memory computation processing.

– Natively support Scala, Python, and R programming languages.

– Provide the ability to resize and terminate the cluster automatically.

* Analytical data store:

– Support parallel processing.

– Use columnar storage.

– Support SQL-based languages.

You need to identify the correct technologies to build the Lambda architecture.

Which technologies should you use? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

20. DRAG DROP

Your company has an on-premises Microsoft SQL Server instance.

The data engineering team plans to implement a process that copies data from the SQL Server instance to Azure Blob storage. The process must orchestrate and manage the data lifecycle.

You need to configure Azure Data Factory to connect to the SQL Server instance.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

21. A company runs Microsoft SQL Server in an on-premises virtual machine (VM).

You must migrate the database to Azure SQL Database. You synchronize users from Active Directory to Azure Active Directory (Azure AD).

You need to configure Azure SQL Database to use an Azure AD user as administrator.

What should you configure?

 
 
 
 

22. Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this scenario, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure SQL database named DB1 that contains a table named Table1. Table1 has a field named Customer_ID that is varchar(22).

You need to implement masking for the Customer_ID field to meet the following requirements:

– The first two prefix characters must be exposed.

– The last four suffix characters must be exposed.

– All other characters must be masked.

Solution: You implement data masking and use a credit card function mask.

Does this meet the goal?

 
 

23. Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this scenario, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure SQL database named DB1 that contains a table named Table1. Table1 has a field named Customer_ID that is varchar(22).

You need to implement masking for the Customer_ID field to meet the following requirements:

The first two prefix characters must be exposed.

The last four suffix characters must be exposed.

All other characters must be masked.

Solution: You implement data masking and use an email function mask.

Does this meet the goal?

 
 

24. Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this scenario, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure SQL database named DB1 that contains a table named Table1. Table1 has a field named Customer_ID that is varchar(22).

You need to implement masking for the Customer_ID field to meet the following requirements:

The first two prefix characters must be exposed.

The last four suffix characters must be exposed.

All other characters must be masked.

Solution: You implement data masking and use a random number function mask.

Does this meet the goal?

 
 

25. DRAG DROP

You are responsible for providing access to an Azure Data Lake Storage Gen2 account. Your user account has contributor access to the storage account, and you have the application ID and access key. You plan to use PolyBase to load data into Azure SQL Data Warehouse. You need to configure PolyBase to connect the data warehouse to the storage account.

Which three components should you create in sequence? To answer, move the appropriate components from the list of components to the answer area and arrange them in the correct order.
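For orientation, a minimal T-SQL sketch of the kinds of objects PolyBase needs to reach Azure Data Lake Storage Gen2; all object names, endpoints, and secrets below are placeholders:

-- prerequisite, if the database does not already have one
CREATE MASTER KEY;

-- 1) a credential holding the identity and secret
CREATE DATABASE SCOPED CREDENTIAL AdlsCredential
WITH IDENTITY = '<identity>', SECRET = '<access-key>';

-- 2) an external data source pointing at the storage account
CREATE EXTERNAL DATA SOURCE AdlsGen2
WITH (
TYPE = HADOOP,
LOCATION = 'abfss://<container>@<account>.dfs.core.windows.net',
CREDENTIAL = AdlsCredential
);

-- 3) an external file format describing the files
CREATE EXTERNAL FILE FORMAT ParquetFormat
WITH (FORMAT_TYPE = PARQUET);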

26. You plan to create a dimension table in Azure SQL Data Warehouse that will be less than 1 GB.

You need to create the table to meet the following requirements:

– Provide the fastest query time.

– Minimize data movement.

Which type of table should you use?
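As context for the options, a small dimension table that must avoid data movement is often declared as a replicated table, which keeps a full copy on every compute node. A minimal sketch; the table definition is hypothetical:

CREATE TABLE dbo.DimProduct
(
ProductKey INT NOT NULL,
ProductName NVARCHAR(100)
)
WITH (DISTRIBUTION = REPLICATE, CLUSTERED COLUMNSTORE INDEX);
-- REPLICATE copies the table to every node, so joins against it require no data movement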

 
 
 
 

27. You have an Azure SQL data warehouse. Using PolyBase, you create a table named [Ext].[Items] to query Parquet files stored in Azure Data Lake Storage Gen2 without importing the data to the data warehouse. The external table has three columns. You discover that the Parquet files have a fourth column named ItemID.

Which command should you run to add the ItemID column to the external table?
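Note that external tables cannot be altered in place to add a column; the usual pattern is to drop and recreate the definition. A hedged T-SQL sketch; the three existing column names and types, the data source, and the file format are assumptions:

DROP EXTERNAL TABLE [Ext].[Items];

CREATE EXTERNAL TABLE [Ext].[Items]
(
[ItemName] NVARCHAR(100),
[ItemCategory] NVARCHAR(50),
[ItemPrice] DECIMAL(10, 2),
[ItemID] INT -- the newly discovered fourth column
)
WITH (LOCATION = '/items/', DATA_SOURCE = AdlsGen2, FILE_FORMAT = ParquetFormat);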

 
 
 
 

28. DRAG DROP

You have a table named SalesFact in an Azure SQL data warehouse.

SalesFact contains sales data from the past 36 months and has the following characteristics:

– Is partitioned by month

– Contains one billion rows

– Has clustered columnstore indexes

At the beginning of each month, you need to remove data from SalesFact that is older than 36 months as quickly as possible.

Which three actions should you perform in sequence in a stored procedure? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
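For background, removing a whole partition from a columnstore fact table is fastest as a metadata-only partition switch rather than a row-by-row DELETE. A minimal sketch, assuming a hypothetical work table with an identical schema and partitioning:

-- move the oldest partition out of SalesFact, then discard its rows
ALTER TABLE dbo.SalesFact SWITCH PARTITION 1 TO dbo.SalesFact_Work PARTITION 1;
TRUNCATE TABLE dbo.SalesFact_Work;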

29. You plan to implement an Azure Cosmos DB database that will write 100,000 JSON documents every 24 hours. The database will be replicated to three regions. Only one region will be writable.

You need to select a consistency level for the database to meet the following requirements:

– Guarantee monotonic reads and writes within a session.

– Provide the fastest throughput.

– Provide the lowest latency.

Which consistency level should you select?

 
 
 
 
 

30. Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this scenario, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure SQL database named DB1 that contains a table named Table1. Table1 has a field named Customer_ID that is varchar(22).

You need to implement masking for the Customer_ID field to meet the following requirements:

The first two prefix characters must be exposed.

The last four suffix characters must be exposed.

All other characters must be masked.

Solution: You implement data masking and use a credit card function mask.

Does this meet the goal?

 
 

31. Testlet 2

Background

Proseware, Inc., develops and manages a product named Poll Taker. The product is used for delivering public opinion polling and analysis.

Polling data comes from a variety of sources, including online surveys, house-to-house interviews, and booths at public events.

Polling data

Polling data is stored in one of two locations:

– An on-premises Microsoft SQL Server 2019 database named PollingData

– Azure Data Lake Storage Gen2. Data in Data Lake is queried by using PolyBase

Poll metadata

Each poll has associated metadata with information about the poll including the date and number of respondents. The data is stored as JSON.

Phone-based polling

Security

– Phone-based poll data must only be uploaded by authorized users from authorized devices

– Contractors must not have access to any polling data other than their own

– Access to polling data must be set on a per-Active Directory user basis

Data migration and loading

– All data migration processes must use Azure Data Factory

– All data migrations must run automatically during non-business hours

– Data migrations must be reliable and retry when needed

Performance

After six months, raw polling data should be moved to a lower-cost storage solution.

Deployments

– All deployments must be performed by using Azure DevOps. Deployments must use templates that can be used in multiple environments

– No credentials or secrets should be used during deployments

Reliability

All services and processes must be resilient to a regional Azure outage.

Monitoring

All Azure services must be monitored by using Azure Monitor. On-premises SQL Server performance must be monitored.

DRAG DROP

You need to ensure that phone-based polling data can be analyzed in the PollingData database.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

32. DRAG DROP

You need to provision the polling data storage account.

How should you configure the storage account? To answer, drag the appropriate Configuration Value to the correct Setting. Each Configuration Value may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content. NOTE: Each correct selection is worth one point.

33. Testlet 3

Case Study

This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to complete each case.

However, there may be additional case studies and sections on this exam. You must manage your time to ensure that you are able to complete all questions included on this exam in the time provided.

To answer the questions included in a case study, you will need to reference information that is provided in the case study. Case studies might contain exhibits and other resources that provide more information about the scenario that is described in the case study. Each question is independent of the other questions in this case study.

At the end of this case study, a review screen will appear. This screen allows you to review your answers and to make changes before you move to the next section of the exam. After you begin a new section, you cannot return to this section.

To start the case study

To display the first question on this case study, click the Next button. Use the buttons in the left pane to explore the content of the case study before you answer the questions. Clicking these buttons displays information such as business requirements, existing environment, and problem statements. If the case study has an All Information tab, note that the information displayed is identical to the information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button to return to the question.

Overview

General Overview

Litware, Inc., is an international car racing and manufacturing company that has 1,000 employees. Most employees are located in Europe. The company supports racing teams that compete in a worldwide racing series.

Physical Locations

Litware has two main locations: a main office in London, England, and a manufacturing plant in Berlin, Germany.

During each race weekend, 100 engineers set up a remote portable office by using a VPN to connect to the datacentre in the London office. The portable office is set up and torn down in approximately 20 different countries each year.

Existing environment

Race Central

During race weekends, Litware uses a primary application named Race Central. Each car has several sensors that send real-time telemetry data to the London datacentre. The data is used for real-time tracking of the cars. Race Central also sends batch updates to an application named Mechanical Workflow by using Microsoft SQL Server Integration Services (SSIS).

The telemetry data is sent to a MongoDB database. A custom application then moves the data to databases in SQL Server 2017. The telemetry data in MongoDB has more than 500 attributes. The application changes the attribute names when the data is moved to SQL Server 2017.

The database structure contains both OLAP and OLTP databases.

Mechanical Workflow

Mechanical Workflow is used to track changes and improvements made to the cars during their lifetime. Currently, Mechanical Workflow runs on SQL Server 2017 as an OLAP system. Mechanical Workflow has a table named Table1 that is 1 TB. Large aggregations are performed on a single column of Table1.

Requirements

Planned Changes

Litware is in the process of rearchitecting its data estate to be hosted in Azure. The company plans to decommission the London datacentre and move all its applications to an Azure datacentre.

Technical Requirements

Litware identifies the following technical requirements:

Data collection for Race Central must be moved to Azure Cosmos DB and Azure SQL Database. The data must be written to the Azure datacentre closest to each race and must converge in the least amount of time.

The query performance of Race Central must be stable, and the administrative time it takes to perform optimizations must be minimized.

The data store for Mechanical Workflow must be moved to Azure SQL Data Warehouse.

Transparent data encryption (TDE) must be enabled on all data stores, whenever possible.

An Azure Data Factory pipeline must be used to move data from Cosmos DB to SQL Database for Race Central. If the data load takes longer than 20 minutes, configuration changes must be made to Data Factory.

The telemetry data must migrate toward a solution that is native to Azure.

The telemetry data must be monitored for performance issues. You must adjust the Cosmos DB Request Units per second (RU/s) to maintain a performance SLA while minimizing the cost of the RU/s.

Data Masking Requirements

During race weekends, visitors will be able to enter the remote portable offices. Litware is concerned that some proprietary information might be exposed. The company identifies the following data masking requirements for the Race Central data that will be stored in SQL Database:

Only show the last four digits of the values in a column named SuspensionSprings.

Only show a zero value for the values in a column named ShockOilWeight.

You need to build a solution to collect the telemetry data for Race Central.

What should you use? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

34. On which data store should you configure TDE to meet the technical requirements?
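For reference, once the target data store is identified, TDE is enabled with a single T-SQL statement (the database name is a placeholder):

ALTER DATABASE [YourDatabase] SET ENCRYPTION ON;
-- transparent data encryption protects data at rest without application changes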

 
 
 

35. HOTSPOT

You are building the data store solution for Mechanical Workflow.

How should you configure Table1? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

36. HOTSPOT

Which masking functions should you implement for each column to meet the data masking requirements? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
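As a hedged T-SQL sketch of what such rules look like (the table name is an assumption): exposing only the last four characters of SuspensionSprings maps to a partial() mask with no exposed prefix, and showing a zero for the numeric ShockOilWeight maps to the default() mask, which returns 0 for numeric types:

ALTER TABLE dbo.RaceCentral
ALTER COLUMN SuspensionSprings ADD MASKED WITH (FUNCTION = 'partial(0, "xxxx", 4)');

ALTER TABLE dbo.RaceCentral
ALTER COLUMN ShockOilWeight ADD MASKED WITH (FUNCTION = 'default()');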

37. Solution: Load the data using the INSERT…SELECT statement.

Does the solution meet the goal?

 
 

38. Solution: Load the data using the INSERT…SELECT statement.

Does the solution meet the goal?

 
 

39. Solution: Load the data using the INSERT…SELECT statement.

Does the solution meet the goal?

 
 

40. You develop data engineering solutions for a company.

You must integrate the company’s on-premises Microsoft SQL Server data with Microsoft Azure SQL Database. Data must be transformed incrementally.

You need to implement the data integration solution.

Which tool should you use to configure a pipeline to copy data?

 
 
 
 

41. HOTSPOT

A company runs Microsoft Dynamics CRM with Microsoft SQL Server on-premises. SQL Server Integration Services (SSIS) packages extract data from Dynamics CRM APIs, and load the data into a SQL Server data warehouse.

The datacenter is running out of capacity. Because of the network configuration, you must extract on-premises data to the cloud over HTTPS. You cannot open any additional ports. The solution must require the least amount of effort.

You need to create the pipeline system.

Which component should you use? To answer, select the appropriate technology in the dialog box in the answer area. NOTE: Each correct selection is worth one point.

42. DRAG DROP

You develop data engineering solutions for a company.

A project requires analysis of real-time Twitter feeds. Posts that contain specific keywords must be stored and processed on Microsoft Azure and then displayed by using Microsoft Power BI. You need to implement the solution.

Which five actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
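For orientation, the stream processing step in a pipeline like this is commonly an Azure Stream Analytics job, whose SQL-based query language filters incoming events into a Power BI output. A minimal sketch; the input, output, and field names are hypothetical:

SELECT CreatedAt, UserName, Text
INTO [powerbi-output]
FROM [twitter-input]
WHERE Text LIKE '%keyword%'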

43. DRAG DROP

Your company manages on-premises Microsoft SQL Server pipelines by using a custom solution. The data engineering team must implement a process to pull data from SQL Server and migrate it to Azure Blob storage. The process must orchestrate and manage the data lifecycle. You need to configure Azure Data Factory to connect to the on-premises SQL Server database.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

44. HOTSPOT

You are designing a new Lambda architecture on Microsoft Azure.

The real-time processing layer must meet the following requirements:

Ingestion:

– Receive millions of events per second

– Act as a fully managed Platform-as-a-Service (PaaS) solution

– Integrate with Azure Functions

Stream processing:

– Process on a per-job basis

– Provide seamless connectivity with Azure services

– Use a SQL-based query language

Analytical data store:

– Act as a managed service

– Use a document store

– Provide data encryption at rest

You need to identify the correct technologies to build the Lambda architecture using minimal effort.

Which technologies should you use? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

45. You develop data engineering solutions for a company. You need to ingest and visualize real-time Twitter data by using Microsoft Azure.

Which three technologies should you use? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

 
 
 
 
 
 

46. Each day, a company plans to store hundreds of files in Azure Blob Storage and Azure Data Lake Storage. The company uses the Parquet format.

You must develop a pipeline that meets the following requirements:

– Process data every six hours

– Offer interactive data analysis capabilities

– Offer the ability to process data using solid-state drive (SSD) caching

– Use Directed Acyclic Graph (DAG) processing mechanisms

– Provide support for REST API calls to monitor processes

– Provide native support for Python

– Integrate with Microsoft Power BI

You need to select the appropriate data technology to implement the pipeline.

Which data technology should you implement?

 
 
 
 
 

47. HOTSPOT

A company is deploying a service-based data environment. You are developing a solution to process this data.

The solution must meet the following requirements:

– Use an Azure HDInsight cluster for data ingestion from a relational database in a different cloud service

– Use an Azure Data Lake Storage account to store processed data

– Allow users to download processed data

You need to recommend technologies for the solution.

Which technologies should you use? To answer, select the appropriate options in the answer area.

48. A company uses Azure SQL Database to store sales transaction data. Field sales employees need an offline copy of the database that includes last year’s sales on their laptops when there is no internet connection available. You need to create the offline export copy.

Which three options can you use? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point.

 
 
 
 
 

49. Solution: Load the data using the CREATE TABLE AS SELECT statement.

Does the solution meet the goal?
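For context, a hedged sketch of the CTAS loading pattern in Azure SQL Data Warehouse: CREATE TABLE AS SELECT is a fully parallelized operation, typically reading from a PolyBase external table into a new internal table (all names here are hypothetical):

CREATE TABLE dbo.FactSales
WITH (DISTRIBUTION = HASH(CustomerKey), CLUSTERED COLUMNSTORE INDEX)
AS
SELECT * FROM ext.StageSales;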

 
 
