
Big Data Wrangling on HDP with Trifacta - How to Get Started

Data Preparation is a constant challenge for any enterprise, and the speed, diversity, and volume of data introduced by Big Data amplify the problem substantially. Trifacta with HDP introduces a new approach to organizing, cleansing, enriching, and structuring your data: Data Wrangling, where business users connect and engage with the data directly to produce high-quality data sets for analytics.

Step 1: Download VM

Trifacta would like to provide Hortonworks partners and SIs with an opportunity to test-drive Data Wrangling on HDP. The link below points to a pre-configured virtual machine containing Trifacta Enterprise and HDP 2.3. Feel free to download it and try Wrangling today:

ftp://download.trifacta.com/Hortonworks/Trifacta_3.0_HDP_2.3.2_sandbox.ova
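
For readers who prefer the command line, the download and import could be scripted roughly as follows. This is only a sketch: the URL is the one given above, while the function names are illustrative, and the import step assumes VirtualBox's VBoxManage tool is installed and on your PATH (VMware users would import the .ova through their own product instead).

```shell
#!/usr/bin/env bash
# Sketch: download the sandbox image and import it into VirtualBox.
# OVA_URL is copied from the article; the function names are illustrative.
OVA_URL="ftp://download.trifacta.com/Hortonworks/Trifacta_3.0_HDP_2.3.2_sandbox.ova"

# Derive the local file name from the URL (everything after the last slash).
ova_file() {
  basename "$1"
}

# Download the image, resuming a partial download if one already exists.
download_ova() {
  curl --continue-at - --output "$(ova_file "$1")" "$1"
}

# Import the appliance into VirtualBox (assumes VBoxManage is on PATH).
import_ova() {
  VBoxManage import "$(ova_file "$1")"
}
```

After sourcing the script, run download_ova "$OVA_URL" followed by import_ova "$OVA_URL". The image is large, so resuming a partial download is worth having.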

user: hortonworks

pw: wrangler

Step 2: Start VM and access consoles

The Trifacta Enterprise Wrangler on HDP is built on HDP 2.3 and the demo/sandbox instance of CentOS. To access the instance:

When using VMware desktop/Fusion, the VM is configured to share networking with your host (NAT), so the IP issued to your VM should be something like the address below, and the Ambari login is:

http://172.16.238.133:8080

Log in to Ambari (admin/admin) to ensure your HDP services are running.

NOTE: If using VirtualBox, port forwarding will allow you to access these services on the same ports, but through localhost:

http://127.0.0.1:8080
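
Service status can also be checked without the browser, through Ambari's REST API. The sketch below assumes the admin/admin credentials given above and the example NAT address; the helper names are illustrative, and the cluster name is passed as an argument because it is not stated here (list_clusters will show it).

```shell
#!/usr/bin/env bash
# Sketch: query Ambari's REST API for the state of each HDP service.
# AMBARI_HOST defaults to the example NAT address from the article;
# set it to 127.0.0.1 when going through VirtualBox port forwarding.
AMBARI_HOST="${AMBARI_HOST:-172.16.238.133}"
AMBARI_PORT="${AMBARI_PORT:-8080}"

# Build the base URL of the Ambari REST API from a host and a port.
ambari_api() {
  echo "http://${1}:${2}/api/v1"
}

# List the clusters known to this Ambari server.
list_clusters() {
  curl --silent --user admin:admin \
    "$(ambari_api "$AMBARI_HOST" "$AMBARI_PORT")/clusters"
}

# Show the state of every service in the cluster named by the first argument.
service_states() {
  curl --silent --user admin:admin \
    "$(ambari_api "$AMBARI_HOST" "$AMBARI_PORT")/clusters/${1}/services?fields=ServiceInfo/state"
}
```

Run list_clusters first to find the cluster name, then pass that name to service_states. A healthy service should report a state of STARTED in the returned JSON.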

To access the VM via SSH:

ssh root@172.16.238.133

pw: Wrangler!123

To start the Trifacta service, type:

service trifacta start

To access the Trifacta UI, go to:

http://172.16.238.133:3005

user: admin@trifacta.local

pw: admin
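
The UI may take a little while to come up after the service starts. A small polling helper such as the one below can tell you when the port begins answering. It is a sketch only: wait_for_url is an illustrative name, not something Trifacta ships, and the attempt count and delay are arbitrary defaults.

```shell
#!/usr/bin/env bash
# Sketch: poll a URL until it answers, giving up after a number of attempts.
# wait_for_url is an illustrative helper, not part of Trifacta.
wait_for_url() {
  local url="$1" attempts="${2:-30}" delay="${3:-5}"
  local i
  for ((i = 1; i <= attempts; i++)); do
    # --head avoids downloading the page body; any completed HTTP
    # exchange, whatever its status code, counts as "up".
    if curl --silent --head --output /dev/null --max-time 5 "$url"; then
      echo "up after ${i} attempt(s): ${url}"
      return 0
    fi
    sleep "$delay"
  done
  echo "gave up after ${attempts} attempt(s): ${url}" >&2
  return 1
}
```

For example, after running service trifacta start on the VM, calling wait_for_url http://172.16.238.133:3005 from the host blocks until the UI responds or the attempts run out.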

Step 3: Try out the demos

The demo instance comes configured with 11 canned Trifacta demos; the datasets for these are available for use immediately:

  • CPG_CrossSell
  • IoT_CityBike
  • CPG_InventoryPlanning
  • Pharmacovigilance_DrugSafety
  • ClickStream_WeblogAnalytics
  • SIEM_CyberSecurity
  • DemoContentOverview.pdf
  • SalesDashboard_For_Executives
  • FinServ_TraderFraud
  • TelcoChurn_4MinuteDemo
  • TelcoChurn_Customer360
  • Insurance_CrossSell

Data Wrangling allows a business user to discover, register, transform, structure, and publish high-quality analytic data sets in a matter of minutes.

Register on the Trifacta Partner Portal for more information on these demos.

https://trifacta.channeltivity.com/

For more, visit http://www.trifacta.com.

Comments

Is there an update for HDP 2.4?