Fuzzy Algorithm in Apache Spark

Good Morning, everyone.

Before I get to my question, I'll explain why I need a fuzzy algorithm. In my project, I use an Arduino to collect data such as temperature. Once I have collected a lot of data, I want to use a fuzzy algorithm to predict rainfall in an area.

But I can't find a fuzzy algorithm in Apache Spark's MLlib. What should I do? How can I use a fuzzy algorithm in HDF, or is there another way?

Or can I use a Python library in Apache Spark? I have heard there are some fuzzy libraries in Python, so maybe I can use one of them in Apache Spark.

Thanks a lot.

1 ACCEPTED SOLUTION

Hello @Rendiyono Wahyu Saputro

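Yes, you can import Python libraries and use them in Spark, which supports a full Python API via the pyspark shell. For instance, to load and use the Python scikit-fuzzy library to run fuzzy logic:

1) Install the Python library on the driver and on every worker node, either with pip or directly from the GitHub source. It is a Python package, not a JAR, so it goes on the Python path rather than the Spark classpath.

2) Kick off the job with the pyspark shell (Example: $ pyspark ). If the package is not installed on the workers, a zipped copy of a pure-Python package can be shipped with the job instead (Example: $ pyspark --py-files /path/to/scikit-fuzzy.zip ). Its compiled dependencies, NumPy and SciPy, still have to be installed on each node.

3) Import the Python library in your code (Example: "import skfuzzy as fuzz")

4) Use the library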

More information about scikit-fuzzy library here:

https://pypi.python.org/pypi/scikit-fuzzy

Hints about dependencies and install:

Scikit-Fuzzy depends on

  • NumPy >= 1.6
  • SciPy >= 0.9
  • NetworkX >= 1.9

and is available on PyPI! The latest stable release can always be obtained and installed simply by running

$ pip install -U scikit-fuzzy
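After installing, it can help to confirm that the library and its dependencies are actually present on a node. The snippet below is a small sketch using only the Python standard library (Python 3.8 or newer is assumed). The helper name installed_versions is hypothetical, and the package names are the PyPI distribution names.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as published on PyPI.
REQUIRED = ["numpy", "scipy", "networkx", "scikit-fuzzy"]


def installed_versions(packages=REQUIRED):
    """Return a dict mapping each package to its version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Run it with the same Python interpreter that Spark uses on that node, since a package installed for a different interpreter will not be visible to pyspark.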
