Member since
04-18-2017
2
Posts
1
Kudos Received
1
Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
6066 | 05-30-2017 08:00 AM |
05-30-2017
08:00 AM
1 Kudo
Hi @Kevin Feasel and @Pedro Faro, here is a bit of background on my setup and how I got it working.
1) Create an HDP 2.5 Sandbox VM in Azure and make its IP static.
2) Create a SQL Server 2016 VM in Azure and make its IP static.
3) Install PuTTY on the SQL Server 2016 VM.
4) Create a PuTTY session to the sandbox IP with tunnels for the following ports: 10000, 10015, 10500, 11000, 19888, 2222, 4200, 50010, 50070, 50075, 8020, 8050, 8080, 8088, 8886, 8888, 9995, 9996.
5) Add an entry to the hosts file mapping 127.0.0.1 to sandbox.hortonworks.com.
6) Connect to the sandbox using PuTTY.
7) Open Ambari, download copies of the core-site, hdfs-site, hive-site, mapred-site and yarn-site configs from the sandbox, and copy them to
C:\Program Files\Microsoft SQL Server\MSSQL13.SQL_2016\MSSQL\Binn\Polybase\Hadoop\conf\
on the SQL Server 2016 VM.
8) Restart the PolyBase services.
9) Follow the instructions at https://hortonworks.com/hadoop-tutorial/opening-sandbox-ports-azure/ to add port 50010.
10) Use Ambari to add a Test.csv file to HDFS under the folder /user/maria_dev/data.
11) Create the external data source:
CREATE EXTERNAL DATA SOURCE [Data-Source-Name] WITH (
TYPE = HADOOP,
LOCATION ='hdfs://sandbox.hortonworks.com:8020',
RESOURCE_MANAGER_LOCATION = 'sandbox.hortonworks.com:8050'
);
12) Create the external file format:
CREATE EXTERNAL FILE FORMAT TextFileFormat WITH (
FORMAT_TYPE = DELIMITEDTEXT,
FORMAT_OPTIONS (FIELD_TERMINATOR =',',
USE_TYPE_DEFAULT = TRUE));
13) Create the external table:
CREATE EXTERNAL TABLE [dbo].[Test] (
CustomerID INT, CustomerName NCHAR(10), NumberOfOrders INT
) WITH (LOCATION='/user/maria_dev/data/Test.csv', DATA_SOURCE = [Data-Source-Name],
FILE_FORMAT = TextFileFormat
);
14) Query the external table:
SELECT * FROM [dbo].[Test];
I am logged on to the SQL Server 2016 VM from step 3 onwards. This is just for testing the functionality.
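For anyone who needs a test file that lines up with the external table, here is a minimal sketch that builds one. It assumes Python 3 is available on whatever machine you upload from, which is not part of the setup described above. The sample rows and the helper names (`build_csv`, `write_csv`) are made up for illustration. The file is written without a header row, on the assumption that the file format above does not skip one, and names are limited to 10 characters to fit the NCHAR(10) column.

```python
import csv
import io

# Hypothetical sample rows: (CustomerID, CustomerName, NumberOfOrders).
SAMPLE_ROWS = [
    (1, "Alice", 3),
    (2, "Bob", 0),
    (3, "Carol", 12),
]

NAME_WIDTH = 10  # CustomerName is declared as NCHAR(10) in the external table


def build_csv(rows):
    """Return comma-delimited text, one line per row, with no header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=",", lineterminator="\n")
    for customer_id, name, orders in rows:
        if len(name) > NAME_WIDTH:
            raise ValueError(f"{name!r} is longer than {NAME_WIDTH} characters")
        # int() rejects values that would not load into the INT columns.
        writer.writerow([int(customer_id), name, int(orders)])
    return buffer.getvalue()


def write_csv(path, rows):
    """Write the rows to path so the file can be uploaded to HDFS."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        handle.write(build_csv(rows))
```

Calling `write_csv("Test.csv", SAMPLE_ROWS)` produces a three-line file that can then be uploaded through Ambari to /user/maria_dev/data.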
Hope this helps
04-18-2017
05:57 PM
Hi @Kevin Feasel, I managed to fix this error on HDP 2.5:
Could not obtain block: BP-1464254149-172.17.0.2-1477381671113:blk_1073742700_1894 file=/user/maria_dev/data/Test.csv
Microsoft.SqlServer.DataWarehouse.Common.ErrorHandling.MppSqlException: Could not obtain block: BP-1464254149-172.17.0.2-1477381671113:blk_1073742700_1894 file=/user/maria_dev/data/Test.csv
at Microsoft.SqlServer.DataWarehouse.DataMovement.Common.ExternalAccess.HdfsBridgeReadAccess.Read(MemoryBuffer buffer, Boolean& isDone)
at Microsoft.SqlServer.DataWarehouse.DataMovement.Workers.DataReader.ExternalMoveBufferReader.Read()
at Microsoft.SqlServer.DataWarehouse.DataMovement.Workers.ExternalMoveReaderWorker.ReadAndSendData()
at Microsoft.SqlServer.DataWarehouse.DataMovement.Workers.ExternalMoveReaderWorker.Execute(Object status)
The issue is that the VM is not listening on port 50010, which is strange, I know, as it's fundamental. This link shows how to add the port: https://hortonworks.com/hadoop-tutorial/opening-sandbox-ports-azure/
If you have any questions, I'd be happy to help.
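A quick way to confirm whether a port such as 50010 is reachable before and after opening it is a small TCP probe. The following is a minimal sketch, assuming Python 3 is available on the machine you test from; the helper names `port_is_open` and `check_ports` are illustrative only. Note that it only shows that something accepts a connection on the port, not that HDFS block reads will succeed.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port to True (reachable) or False (not reachable)."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

For example, `check_ports("sandbox.hortonworks.com", [8020, 8050, 50010])` returns a dictionary keyed by port. If 8020 and 8050 come back `True` while 50010 is `False`, that matches the situation behind the "Could not obtain block" error.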