Link Analysis using Spark Python

Hi, I need to build some graphs with PySpark for link analysis research. I have already seen this link: http://kukuruku.co/hub/algorithms/social-network-analysis-spark-graphx but that algorithm is implemented in Scala, which is much harder for me to follow. Does anyone know of a white paper or tutorial that does link analysis with PySpark? Thanks!

1 ACCEPTED SOLUTION

Hi Pedro,

A Python API for Spark GraphX is still missing. However, there is a Git project called GraphFrames that provides a higher-level API on top of Spark GraphX. The project claims: "GraphX is to RDDs as GraphFrames are to DataFrames."

I haven't worked with it, but a quick test of their samples with Spark 1.6.2 worked.

Use pyspark like this:

pyspark --packages graphframes:graphframes:0.2.0-spark1.6-s_2.10

Or use Zeppelin and add the dependency to the interpreter configuration.

Maybe this library has what you need.
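To give an idea of what link analysis could look like once the package is loaded, here is a minimal sketch using PageRank. The vertex and edge data and the helper names (`build_graph`, `rank_pages`) are made up for illustration. It assumes the `sqlContext` that the pyspark shell provides and the `pageRank` parameters as shown in the GraphFrames user guide, which may differ between versions.

```python
# Hypothetical toy link data: vertices need an "id" column,
# edges need "src" and "dst" columns (GraphFrames convention).
VERTICES = [("a", "Page A"), ("b", "Page B"), ("c", "Page C"), ("d", "Page D")]
EDGES = [
    ("a", "b", "links_to"),
    ("b", "c", "links_to"),
    ("c", "a", "links_to"),
    ("d", "a", "links_to"),
]


def build_graph(sql_context):
    """Build a GraphFrame from the toy data above."""
    # Imported here because the module is only available when pyspark
    # is started with the graphframes package on the classpath.
    from graphframes import GraphFrame

    vertices = sql_context.createDataFrame(VERTICES, ["id", "name"])
    edges = sql_context.createDataFrame(EDGES, ["src", "dst", "relationship"])
    return GraphFrame(vertices, edges)


def rank_pages(graph, max_iter=10):
    """Run PageRank and return the vertices sorted by score, highest first."""
    ranked = graph.pageRank(resetProbability=0.15, maxIter=max_iter)
    return ranked.vertices.select("id", "pagerank").orderBy(
        "pagerank", ascending=False
    )
```

Inside the pyspark shell started as above, something like `g = build_graph(sqlContext)` followed by `rank_pages(g).show()` should print the ranking, and `g.inDegrees.show()` gives the number of incoming links per vertex.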
