How to load a Phoenix table and read the data into a K-Means Python script?

I have a Phoenix table in which I have combined the CDR (Call Detail Records) and CRM (Customer Relationship Management) data.

My table has 18 columns: number, operator_a, operator_b, call_direction, call_type, call_category, number_calls, duration, longitude, latitude, age, day, month, year, zip_code, type_offer, offer, gender.

I am a beginner in machine learning and I am having trouble loading the table into my Python script for clustering.

This is what I have so far. I am not sure how to proceed, or whether I am doing it wrong:

from matplotlib import pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
import numpy as np
import pandas as pd  # needed for pd.DataFrame below
import phoenixdb
from sklearn import preprocessing
from sklearn.cluster import KMeans
database_url = 'http://localhost:8765/'
conn = phoenixdb.connect(database_url, autocommit=True)
cursor = conn.cursor()
cursor.execute("SELECT * FROM X LIMIT 45000")
df = pd.DataFrame(cursor.fetchall())
df.columns = [i[0] for i in cursor.description]
label_encoder = preprocessing.LabelEncoder()
df["ANUMBER"] = label_encoder.fit_transform(df["ANUMBER"])
df["AOPERATOR"] = label_encoder.fit_transform(df["AOPERATOR"])
df["BOPERATOR"] = label_encoder.fit_transform(df["BOPERATOR"])
df["DIRECTION"] = label_encoder.fit_transform(df["DIRECTION"])
df["TYPE"] = label_encoder.fit_transform(df["TYPE"])
df["CAT"] = label_encoder.fit_transform(df["CAT"])
df["NBR"] = label_encoder.fit_transform(df["NBR"])
df["DUREE"] = label_encoder.fit_transform(df["DUREE"])
df["LON"] = label_encoder.fit_transform(df["LON"])
df["LAT"] = label_encoder.fit_transform(df["LAT"])
df["AGE"] = label_encoder.fit_transform(df["AGE"])
df["DAY"] = label_encoder.fit_transform(df["DAY"])
df["MONTH"] = label_encoder.fit_transform(df["MONTH"])
df["YEAR"] = label_encoder.fit_transform(df["YEAR"])
df["ZIP"] = label_encoder.fit_transform(df["ZIP"])
df["TOFFER"] = label_encoder.fit_transform(df["TOFFER"])
df["OFFER"] = label_encoder.fit_transform(df["OFFER"])
df["GENDER"] = label_encoder.fit_transform(df["GENDER"])
df = df.dropna()
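
For comparison, here is a minimal sketch of how the step after loading might look. It is only one possible approach, not a definitive answer. It assumes that the operator, direction, type, category, offer and gender columns are categorical and that NBR, DUREE, LON, LAT and AGE are numeric; that split is a guess from the column names. The function name cluster_calls, the column lists and n_clusters=4 are hypothetical choices for illustration. Unlike the script above, it drops missing rows before encoding, one-hot encodes only the categorical columns instead of label-encoding everything, and scales the numeric columns so that no single feature dominates the distance computation.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Assumed split of the columns used in the script above (a guess from the names).
CATEGORICAL = ["AOPERATOR", "BOPERATOR", "DIRECTION", "TYPE",
               "CAT", "TOFFER", "OFFER", "GENDER"]
NUMERIC = ["NBR", "DUREE", "LON", "LAT", "AGE"]


def cluster_calls(df, n_clusters=4, random_state=0):
    """Hypothetical helper: preprocess the frame and fit K-Means on it."""
    # Drop incomplete rows first, so the encoders never see missing values.
    clean = df.dropna(subset=CATEGORICAL + NUMERIC)

    preprocess = ColumnTransformer([
        # One-hot encoding avoids implying an order between categories.
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
        # Standardising keeps large-valued columns from dominating distances.
        ("num", StandardScaler(), NUMERIC),
    ])
    model = Pipeline([
        ("preprocess", preprocess),
        ("kmeans", KMeans(n_clusters=n_clusters, n_init=10,
                          random_state=random_state)),
    ])

    # fit_predict returns one cluster index per remaining row.
    labels = model.fit_predict(clean)
    return clean.assign(CLUSTER=labels), model
```

With the DataFrame built as in the script above (after setting df.columns from cursor.description), a call such as clustered, model = cluster_calls(df) would return a copy of the data with an extra CLUSTER column. The number of clusters is a placeholder and would need to be chosen for the actual data, for example by comparing inertia or silhouette scores across several values.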