ATTENTION PLEASE!!! THE 70-775 EXAM UPDATED RECENTLY (Sep/2018) WITH MANY NEW QUESTIONS!!!

And, PassLeader has updated its 70-775 dumps recently, all new questions available now!!!

You can get the newest PassLeader 70-775 exam questions in the
#5 of this topic!!!


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The new 70-775 dumps (Dec/2017 Updated) are now available; here is a selection of the 70-775 exam questions (FYI):


[Get the download link at the end of this post]

NEW QUESTION 1
You are implementing a batch processing solution by using Azure HDInsight. You have a table that contains sales data. You plan to implement a query that will return the number of orders by zip code. You need to minimize the execution time of the queries and to maximize the compression level of the resulting data. What should you do?

A. Use a shuffle join in an Apache Hive query that stores the data in a JSON format.
B. Use a broadcast join in an Apache Hive query that stores the data in an ORC format.
C. Increase the number of spark.executor.cores in an Apache Spark job that stores the data in a text format.
D. Increase the number of spark.executor.instances in an Apache Spark job that stores the data in a text format.
E. Decrease the level of parallelism in an Apache Spark job that stores the data in a text format.
F. Use an action in an Apache Oozie workflow that stores the data in a text format.
G. Use an Azure Data Factory linked service that stores the data in Azure Data Lake.
H. Use an Azure Data Factory linked service that stores the data in an Azure DocumentDB database.

Answer: B
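
For context on why option B fits both goals: ORC is a columnar format with built-in compression, and a broadcast join ships the small table to every node instead of shuffling the large one, which minimizes execution time. A minimal sketch of the same idea expressed in Spark SQL (the exam option itself refers to Hive, where the analogous hint is `/*+ MAPJOIN */`; the table names `sales` and `zip_codes` and the output path are hypothetical):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object OrcBroadcastSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("orders-by-zip").enableHiveSupport().getOrCreate()

    val sales = spark.table("sales")       // assumed: large fact table of orders
    val zips  = spark.table("zip_codes")   // assumed: small lookup table

    val ordersByZip = sales
      .join(broadcast(zips), "zip_code")   // broadcast join: small side copied to every executor, no shuffle
      .groupBy("zip_code")
      .count()

    // ORC: columnar storage with a high compression level
    ordersByZip.write.mode("overwrite").orc("/data/orders_by_zip")
    spark.stop()
  }
}
```
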

NEW QUESTION 2
You are configuring the Hive views on an Azure HDInsight cluster that is configured to use Kerberos. You plan to use the YARN logs to troubleshoot a query that runs against Apache Hadoop. You need to view the method, the service, and the authenticated account used to run the query. Which method call should you view in the YARN logs?

A. HQL
B. WebHDFS
C. HDFS C API
D. Ambari REST API

Answer: D

NEW QUESTION 3
You are building a security tracking solution in Apache Kafka to parse security logs. The security logs record an entry each time a user attempts to access an application. Each log entry contains the IP address used to make the attempt and the country from which the attempt originated. You need to receive notifications when an IP address from outside of the United States is used to access the application.
Solution: Create two new consumers. Create a file import process to send messages. Start the producer.
Does this meet the goal?

A. Yes
B. No

Answer: B

NEW QUESTION 4
You are implementing a batch processing solution by using Azure HDInsight. You need to integrate Apache Sqoop data and to chain complex jobs. The data and the jobs will implement MapReduce. What should you do?

A. Use a shuffle join in an Apache Hive query that stores the data in a JSON format.
B. Use a broadcast join in an Apache Hive query that stores the data in an ORC format.
C. Increase the number of spark.executor.cores in an Apache Spark job that stores the data in a text format.
D. Increase the number of spark.executor.instances in an Apache Spark job that stores the data in a text format.
E. Decrease the level of parallelism in an Apache Spark job that stores the data in a text format.
F. Use an action in an Apache Oozie workflow that stores the data in a text format.
G. Use an Azure Data Factory linked service that stores the data in Azure Data Lake.
H. Use an Azure Data Factory linked service that stores the data in an Azure DocumentDB database.

Answer: F

NEW QUESTION 5
You have an Azure HDInsight cluster. You need to store data in a file format that maximizes compression and increases read performance. Which type of file format should you use?

A. ORC
B. Apache Parquet
C. Apache Avro
D. Apache Sequence

Answer: A

NEW QUESTION 6
You are implementing a batch processing solution by using Azure HDInsight. You have data stored in Azure. You need to ensure that you can access the data by using Azure Active Directory (Azure AD) identities. What should you do?

A. Use a shuffle join in an Apache Hive query that stores the data in a JSON format.
B. Use a broadcast join in an Apache Hive query that stores the data in an ORC format.
C. Increase the number of spark.executor.cores in an Apache Spark job that stores the data in a text format.
D. Increase the number of spark.executor.instances in an Apache Spark job that stores the data in a text format.
E. Decrease the level of parallelism in an Apache Spark job that stores the data in a text format.
F. Use an action in an Apache Oozie workflow that stores the data in a text format.
G. Use an Azure Data Factory linked service that stores the data in Azure Data Lake.
H. Use an Azure Data Factory linked service that stores the data in an Azure DocumentDB database.

Answer: G

NEW QUESTION 7
You need to deploy a NoSQL database to an HDInsight cluster. You will manage the server that hosts the database by using Remote Desktop. The database must use the key/value pair format in a columnar model. What should you do?

A. Use an Azure PowerShell script to create and configure a premium HDInsight cluster.
Specify Apache Hadoop as the cluster type and use Linux as the operating system.
B. Use the Azure portal to create a standard HDInsight cluster.
Specify Apache Spark as the cluster type and use Linux as the operating system.
C. Use an Azure PowerShell script to create a standard HDInsight cluster.
Specify Apache HBase as the cluster type and use Windows as the operating system.
D. Use an Azure PowerShell script to create a standard HDInsight cluster.
Specify Apache Storm as the cluster type and use Windows as the operating system.
E. Use an Azure PowerShell script to create a premium HDInsight cluster.
Specify Apache HBase as the cluster type and use Linux as the operating system.
F. Use an Azure portal to create a standard HDInsight cluster.
Specify Apache Interactive Hive as the cluster type and use Linux as the operating system.
G. Use an Azure portal to create a standard HDInsight cluster.
Specify Apache HBase as the cluster type and use Linux as the operating system.

Answer: G

NEW QUESTION 8
You have an initial dataset that contains crime data from major cities. You plan to build training models from the training data, to automate the process of adding more data to the models, and to constantly tune the models by using the additional data, including data collected in near real time. The system will analyze event data gathered from many different sources, such as Internet of Things (IoT) devices, live video surveillance, and traffic activity, and will generate predictions of an increased crime risk at a particular time and place.

You have an incoming data stream from Twitter and an incoming data stream from Facebook, both of which are event-based rather than time-based. You also have a time interval stream every 10 seconds. The data is in a key/value pair format; the value field is a number that defines how many times a hashtag occurs within a Facebook post, or how many times a Tweet that contains a specific hashtag is retweeted. You must use the appropriate data storage, stream analytics techniques, and Azure HDInsight cluster types for the various tasks in the processing pipeline.

You are designing the real-time portion of the input stream processing. The input will be a continuous stream of data, and each record will be processed one at a time. The data will come from an Apache Kafka producer. You need to identify which HDInsight cluster type to use for the final processing of the input data, which will be used to generate continuous statistics and real-time analytics. The latency to process each record must be less than one millisecond, and tasks must be performed in parallel. Which type of cluster should you identify?

A. Apache Storm
B. Apache Hadoop
C. Apache HBase
D. Apache Spark

Answer: A

NEW QUESTION 9
You have an Apache Hive table that contains one billion rows. You plan to use queries that will filter the data by using the WHERE clause. The values of the columns will be known only while the data loads into a Hive table. You need to decrease the query runtime. What should you configure?

A. static partitioning
B. bucket sampling
C. parallel execution
D. dynamic partitioning

Answer: C

NEW QUESTION 10
You have an Azure HDInsight cluster. You need to build a solution to ingest real-time streaming data into a nonrelational distributed database. What should you use to build the solution?

A. Apache Hive and Apache Kafka
B. Spark and Phoenix
C. Apache Storm and Apache HBase
D. Apache Pig and Apache HCatalog

Answer: C

NEW QUESTION 11
......

Get the newest PassLeader 70-775 VCE dumps here: https://www.passleader.com/70-775.html

OR

Download more NEW PassLeader 70-775 PDF dumps from Google Drive here:

https://drive.google.com/open?id=16Ko6acR3bnZwYnl--najNi9hP740P3Xt

OR

Read the newest PassLeader 70-775 exam questions from this Blog:

http://www.microsoftbraindumps.com/?s=70-775

Good Luck!!!
 

Scot Wills

Member
I passed the 70-775 exam recently and I am very thankful to braindumpspdf. I scored 92%. I prepared using the latest 70-775 practice exam questions. This rehearsal is the best way to evaluate your preparation. I am sure you will pass your exam on the first attempt. Best of luck.
 
The new 2018 version (Sep/2018 Updated) 70-775 dumps are now available; here is a selection of the 70-775 exam questions (FYI):

[Get the VCE dumps and PDF dumps download link at the end of this post]

NEW QUESTION 41
You have an Apache Pig table named Sales in Apache HCatalog. You need to make the data in the table accessible from Apache Pig.
Solution: You use the following script:
A = STORE 'Sales' USING org.apache.hive.hcatalog.pig.HCatLoader();
Does this meet the goal?

A. Yes
B. No

Answer: B

NEW QUESTION 42
You have an Apache Pig table named Sales in Apache HCatalog. You need to make the data in the table accessible from Apache Pig.
Solution: You use the following script:
A = LOAD 'Sales' USING org.apache.hive.hcatalog.pig.HCatLoader();
Does this meet the goal?

A. Yes
B. No

Answer: A
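
Why Question 41 fails and Question 42 succeeds: in Pig Latin, `LOAD` reads a dataset into a relation, while `STORE` writes a relation out, so only the `LOAD 'Sales' USING ... HCatLoader()` form makes the HCatalog table readable. For comparison, a hedged sketch of reading the same table from Spark, under the assumption that the cluster's Hive metastore backs HCatalog:

```scala
import org.apache.spark.sql.SparkSession

object ReadSalesSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("read-sales")
      .enableHiveSupport()          // use the Hive metastore that backs HCatalog
      .getOrCreate()

    val sales = spark.table("Sales") // table registered in HCatalog/the metastore
    sales.show(10)
    spark.stop()
  }
}
```
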

NEW QUESTION 43
You are implementing a batch processing solution by using Azure HDInsight. You have a workflow that retrieves data by using a U-SQL query. You need to provide the ability to query and combine data from multiple data sources. What should you do?

A. Use a shuffle join in an Apache Hive query that stores the data in a JSON format.
B. Use a broadcast join in an Apache Hive query that stores the data in an ORC format.
C. Increase the number of spark.executor.cores in an Apache Spark job that stores the data in a text format.
D. Increase the number of spark.executor.instances in an Apache Spark job that stores the data in a text format.
E. Decrease the level of parallelism in an Apache Spark job that stores the data in a text format.
F. Use an action in an Apache Oozie workflow that stores the data in a text format.
G. Use an Azure Data Factory linked service that stores the data in Azure Data Lake.
H. Use an Azure Data Factory linked service that stores the data in an Azure DocumentDB database.

Answer: G

NEW QUESTION 44
You are implementing a batch processing solution by using Azure HDInsight. You have two tables. Each table is larger than 250 TB. Both tables have approximately the same number of rows and columns. You need to match the tables based on a key column. You must minimize the size of the data table that is produced. What should you do?

A. Use a shuffle join in an Apache Hive query that stores the data in a JSON format.
B. Use a broadcast join in an Apache Hive query that stores the data in an ORC format.
C. Increase the number of spark.executor.cores in an Apache Spark job that stores the data in a text format.
D. Increase the number of spark.executor.instances in an Apache Spark job that stores the data in a text format.
E. Decrease the level of parallelism in an Apache Spark job that stores the data in a text format.
F. Use an action in an Apache Oozie workflow that stores the data in a text format.
G. Use an Azure Data Factory linked service that stores the data in Azure Data Lake.
H. Use an Azure Data Factory linked service that stores the data in an Azure DocumentDB database.

Answer: A

NEW QUESTION 45
You deploy Apache Kafka to an Azure HDInsight cluster. You plan to load data into a topic that has a specific schema. You need to load the data while maintaining the existing schema. Which file format should you use to receive the data?

A. JSON
B. Kudu
C. Apache Sequence
D. CSV

Answer: A

NEW QUESTION 46
You have an Apache Interactive Hive cluster in Azure HDInsight. The cluster has 12 processors and 96 GB of RAM. The YARN container size is set to 2 GB and the Tez container size is 3 GB. You configure one Tez container per processor. You are performing map joins between a 2 GB dimension table and a 96 GB fact table. You experience slow performance due to inadequate utilization of the available resources. You need to ensure that the map joins are used. Which two settings should you configure? (Each correct answer presents part of the solution. Choose two.)

A. SET hive.tez.container.size=98304MB
B. SET hive.auto.convert.join.noconditionaltask.size=2048MB
C. SET yarn.scheduler.minimum-allocation-mb=6144MB
D. SET hive.auto.convert.join.noconditionaltask.size=3072MB
E. SET hive.tez.container.size=6144MB

Answer: AC

NEW QUESTION 47
You have an array of integers in Apache Spark. You need to save the data to an Apache Parquet file. Which methods should you use?

A. take and .toDF
B. makeRDD and sqlContext createDataSet
C. sqlContext load and makeRDD
D. makeRDD and sqlContext createDataFrame

Answer: D
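
A hedged Scala sketch of option D, assuming the integers start as a local array: `makeRDD` distributes the array across the cluster, `createDataFrame` attaches a schema, and the result is written as Parquet. The column name and output path are illustrative only:

```scala
import org.apache.spark.sql.SparkSession

object ArrayToParquetSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("array-to-parquet").getOrCreate()
    val sc = spark.sparkContext

    val rdd = sc.makeRDD(Array(1, 2, 3, 4, 5))               // makeRDD: parallelize the local array
    val df  = spark.createDataFrame(rdd.map(Tuple1.apply))   // createDataFrame expects Product rows
                   .toDF("value")

    df.write.mode("overwrite").parquet("/tmp/ints.parquet")  // illustrative path
    spark.stop()
  }
}
```
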

NEW QUESTION 48
You have an Apache Spark cluster in Azure HDInsight. Users report that Spark jobs take longer than expected to complete. You need to reduce the amount of time it takes for the Spark jobs to complete. What should you do?

A. From HDFS, modify the maximum thread setting.
B. From Spark, modify the spark_thrift_cmd_opts parameter.
C. From YARN, modify the container size setting.
D. From Spark, modify the spark.executor.cores parameter.

Answer: D
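
A hedged sketch of option D: `spark.executor.cores` controls how many tasks each executor runs concurrently, so raising it increases parallelism per executor. In HDInsight this is typically changed through Ambari or at submit time (`spark-submit --conf spark.executor.cores=4`); the values below are illustrative, not recommendations:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object ExecutorCoresSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("tuned-job")
      .set("spark.executor.cores", "4")      // more concurrent tasks per executor
      .set("spark.executor.instances", "8")  // illustrative executor count

    val spark = SparkSession.builder.config(conf).getOrCreate()
    // ... job logic here ...
    spark.stop()
  }
}
```
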

NEW QUESTION 49
You are configuring an Apache Phoenix operation on top of an Apache HBase server. The operation executes a statement that joins an Apache Hive table and a Phoenix table. You need to ensure that when the table is dropped, the table files are retained, but the table metadata is removed from the Apache HCatalog. Which type of table should you use?

A. internal
B. external
C. temp
D. Azure Table Storage

Answer: B

NEW QUESTION 50
You have an Apache Hive cluster in Azure HDInsight. You plan to ingest on-premises data into Azure Storage. You need to automate the copying of the data to Azure Storage. Which tool should you use?

A. Microsoft Azure Storage Explorer
B. Azure Import/Export Service
C. Azure Backup
D. AzCopy

Answer: D

NEW QUESTION 51
You have an Apache HBase cluster in Azure HDInsight. You plan to use Apache Pig, Apache Hive, and HBase to access the cluster simultaneously and to process data stored in a single platform. You need to deliver consistent operations, security, and data governance. What should you use?

A. Apache Ambari
B. MapReduce
C. Apache Oozie
D. YARN

Answer: D

NEW QUESTION 52
You have several Linux-based and Windows-based Azure HDInsight clusters. The clusters are in different Active Directory domains. You need to consolidate system logging for all of the clusters into a single location. The solution must provide near real-time analytics of the log data. What should you use?

A. Apache Ambari
B. YARN
C. Microsoft System Center Operations Manager
D. Microsoft Operations Management Suite (OMS)

Answer: A

NEW QUESTION 53
You have an Apache Spark job. The performance of the job deteriorates over time. You plan to debug the job. You need to gather information that you can use to debug the job. Which tool should you use?

A. YARN
B. Spark History Server
C. HDInsight Cluster Dashboard
D. Jupyter Notebook

Answer: A

NEW QUESTION 54
......

Get the newest PassLeader 70-775 VCE dumps here: https://www.passleader.com/70-775.html

OR

Download more NEW PassLeader 70-775 PDF dumps from Google Drive here:

https://drive.google.com/open?id=16Ko6acR3bnZwYnl--najNi9hP740P3Xt

OR

Read the newest PassLeader 70-775 exam questions from this Blog:

http://www.microsoftbraindumps.com/?s=70-775

Good Luck!!!
 

Greek Smi

Member
I have heard about a lot of websites which offer dumps and study material, but I was not satisfied with their services. BraindumpsPDF.com offers a wide range of benefits like regular updates, good presentation, and 24/7 support. I passed my 70-775 exam after purchasing the 70-775 dumps from them. This website is superb.
 


John Butler

Member
Thanks Greek Smi for sharing this valid study material.
 

Zaixkingg

Member
Thanks "Greek Smith", this dump is valid. I trained with all these dumps. They are great!


 

markalan037

Banned
Thanks Greek Smi for providing valid study material in your comment.
 