Posts: 1973
Kudos Received: 1225
Solutions: 124
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1841 | 04-03-2024 06:39 AM |
| | 2857 | 01-12-2024 08:19 AM |
| | 1578 | 12-07-2023 01:49 PM |
| | 2340 | 08-02-2023 07:30 AM |
| | 3224 | 03-29-2023 01:22 PM |
06-18-2016
01:39 AM
1 Kudo
Does HDP 2.4 support Pig on Spark?
Labels: Apache Pig
06-17-2016
03:23 PM
1 Kudo
What are the various ways to integrate Apache Pig, NiFi and Spark? I know I can connect some of them with Kafka or via files.
Labels: Apache NiFi, Apache Pig, Apache Spark
06-17-2016
03:03 PM
Also, the Spark History UI has a lot of useful information.
06-17-2016
02:22 PM
1 Kudo
Define parallel processing. You have many nodes for ingest.
https://nifi.apache.org/docs/nifi-docs/html/user-guide.html
From the NiFi User Guide, on the event-driven scheduling strategy:
"Event driven: When this mode is selected, the Processor will be triggered to run by an event, and that event occurs when FlowFiles enter Connections feeding this Processor. This mode is currently considered experimental and is not supported by all Processors. When this mode is selected, the ‘Run schedule’ option is not configurable, as the Processor is not triggered to run periodically but as the result of an event. Additionally, this is the only mode for which the ‘Concurrent tasks’ option can be set to 0. In this case, the number of threads is limited only by the size of the Event-Driven Thread Pool that the administrator has configured."
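To make the event-driven idea concrete, here is a rough sketch in plain Python. It is not NiFi code and does not use any NiFi API; `run_event_driven`, `SENTINEL` and the sample file names are made up for illustration. Work starts when an item arrives on a queue (standing in for a Connection) rather than on a timer, and concurrency is capped only by the size of a shared thread pool.

```python
import queue
from concurrent.futures import ThreadPoolExecutor

# Hypothetical marker that tells the dispatcher no more items will arrive.
SENTINEL = object()


def run_event_driven(connection, process, pool_size=4):
    """Submit one task per arriving item; the shared pool caps concurrency."""
    futures = []
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        while True:
            item = connection.get()  # blocks until an item arrives, no timer
            if item is SENTINEL:
                break
            futures.append(pool.submit(process, item))
    # Leaving the with-block waits for every submitted task to finish.
    return [f.result() for f in futures]


# A queue stands in for the Connection feeding the Processor.
connection = queue.Queue()
for name in ["a.csv", "b.csv", "c.csv"]:
    connection.put(name)
connection.put(SENTINEL)

results = run_event_driven(connection, str.upper, pool_size=2)
print(results)
```

With `pool_size=2`, at most two items are processed at once no matter how many are waiting, which is roughly the role the guide gives to the Event-Driven Thread Pool.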
06-17-2016
12:42 PM
Lipstick Installation
Resources:
http://www.graphviz.org/Download_linux_rhel.php
https://github.com/Netflix/Lipstick/wiki/Getting-Started
Commands:
sudo yum list available 'graphviz*'
sudo yum -y install 'graphviz*'
./gradlew assemble
(I always like to rename gradlew to avengers.) Then:
./gradlew run-app
Point your browser at http://localhost:9292/ to view it. Make sure that port is reachable (open it in the firewall, etc.).
2016-06-17 02:36:44,558 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
| HadoopVersion | PigVersion | UserId | StartedAt | FinishedAt | Features |
|---|---|---|---|---|---|
| 2.4.0 | | root | 2016-06-17 02:36:40 | 2016-06-17 02:36:44 | HASH_JOIN,FILTER,LIMIT |
Success!
Job Stats (time in seconds):
| JobId | Maps | Reduces | MaxMapTime | MinMapTIme | AvgMapTime | MedianMapTime | MaxReduceTime | MinReduceTime | AvgReduceTime | MedianReducetime | Alias | Feature | Outputs |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| job_local2036219587_0001 | 2 | 1 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | fruit_names_join,fruits,limited,names | HASH_JOIN | |
| job_local406327028_0002 | 1 | 1 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | fruit_names | | file:/tmp/temp195796189/tmp-2027262369, |
Input(s):
Successfully read 3 records from: "file:///opt/demo/certification/pig/Lipstick/quickstart/1.dat"
Successfully read 3 records from: "file:///opt/demo/certification/pig/Lipstick/quickstart/2.dat"
Output(s):
Successfully stored 1 records in: "file:/tmp/temp195796189/tmp-2027262369"
Counters:
Total records written : 1
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local2036219587_0001->job_local406327028_0002,
job_local406327028_0002
2016-06-17 02:36:44,568 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2016-06-17 02:36:44,571 [main] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2016-06-17 02:36:44,582 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2016-06-17 02:36:44,583 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(orange,ORANGE)
It's a very nice-looking visualization.
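If the browser cannot reach the UI, a quick way to tell a firewall or port problem from an application problem is to test the TCP port directly. The following is a small sketch using only the Python standard library; `port_is_open` is a helper name I made up, and port 9292 is taken from the URL above.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 9292 is the port the Lipstick server was reached on in this post.
    print(port_is_open("localhost", 9292))
```

Run it once on the server itself and once from your workstation (with the server's hostname). True locally but False remotely usually points at the firewall rather than at Lipstick.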
06-17-2016
03:43 AM
Yes, ascending is usually the default in most languages, Pig included.
https://pig.apache.org/docs/r0.14.0/basic.html#order-by
Usage Note: ORDER BY is NOT stable; if multiple records have the same ORDER BY key, the order in which these records are returned is not defined and is not guaranteed to be the same from one run to the next.
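The practical consequence of that usage note is that records with equal keys can come back in any order. ORDER BY accepts more than one field, so the usual fix is to add a unique secondary field as a tie-breaker. Below is a small Python illustration of the idea, not Pig itself; the data and the `order_by` helper are made up. Python's own sort is stable, so the "different runs" are simulated by feeding the same records in two different arrival orders.

```python
# Made-up sample data: (name, count) pairs with ties on count.
records = [("apple", 3), ("orange", 1), ("banana", 3), ("cherry", 1)]


def order_by(rows, key_cols):
    """Sort rows ascending by the given column indexes, in priority order."""
    return sorted(rows, key=lambda r: tuple(r[i] for i in key_cols))


# Simulate two runs in which the same records arrive in a different order.
run_a = list(records)
run_b = list(reversed(records))

# Sorting on count alone: tied records keep whatever order they arrived in.
by_count_a = order_by(run_a, [1])
by_count_b = order_by(run_b, [1])
print(by_count_a == by_count_b)

# Adding name as a tie-breaker makes the result independent of arrival order.
tie_a = order_by(run_a, [1, 0])
tie_b = order_by(run_b, [1, 0])
print(tie_a == tie_b)
```

The first comparison comes out different between the two simulated runs, the second does not, which is the behavior you want if downstream steps (LIMIT, for example) depend on the order.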
06-17-2016
02:22 AM
You can leave that old Spark in place, but it's best to install via Ambari and use the supported version on YARN. Shut down the old one or use it for standalone purposes.
06-17-2016
02:21 AM
1 Kudo
Resources:
http://www.graphviz.org/Download_linux_rhel.php
https://github.com/Netflix/Lipstick/wiki/Getting-Started
yum list available 'graphviz*'
yum install 'graphviz*'
./gradlew assemble
(I always like to rename gradlew to avengers.) Then:
./gradlew run-app
http://localhost:9292/ (Make sure that port is reachable: open it in the firewall, etc.)
That worked!
2016-06-17 02:36:44,558 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
| HadoopVersion | PigVersion | UserId | StartedAt | FinishedAt | Features |
|---|---|---|---|---|---|
| 2.4.0 | | root | 2016-06-17 02:36:40 | 2016-06-17 02:36:44 | HASH_JOIN,FILTER,LIMIT |
Success!
Job Stats (time in seconds):
| JobId | Maps | Reduces | MaxMapTime | MinMapTIme | AvgMapTime | MedianMapTime | MaxReduceTime | MinReduceTime | AvgReduceTime | MedianReducetime | Alias | Feature | Outputs |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| job_local2036219587_0001 | 2 | 1 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | fruit_names_join,fruits,limited,names | HASH_JOIN | |
| job_local406327028_0002 | 1 | 1 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | fruit_names | | file:/tmp/temp195796189/tmp-2027262369, |
Input(s):
Successfully read 3 records from: "file:///opt/demo/certification/pig/Lipstick/quickstart/1.dat"
Successfully read 3 records from: "file:///opt/demo/certification/pig/Lipstick/quickstart/2.dat"
Output(s):
Successfully stored 1 records in: "file:/tmp/temp195796189/tmp-2027262369"
Counters:
Total records written : 1
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local2036219587_0001->job_local406327028_0002,
job_local406327028_0002
2016-06-17 02:36:44,568 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2016-06-17 02:36:44,571 [main] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2016-06-17 02:36:44,582 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2016-06-17 02:36:44,583 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(orange,ORANGE)