The K-Nearest Neighbors (KNN) algorithm is a supervised machine learning method used for classification and regression problems.
The KNN algorithm assumes that similar cases lie close together: it compares a new case with the available cases and assigns it to the category of the cases it most resembles.
The KNN algorithm stores all the available data and classifies a new data point based on its similarity to the stored points. This means that when new data appears, it can be readily assigned to the best-matching category using the KNN algorithm.
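The idea described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not how scikit-learn implements it: the function name `knn_predict`, the use of Euclidean distance as the similarity measure, and the tiny two-feature data set are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Sort by distance only and keep the k nearest neighbours.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbours' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up (petal length, petal width) values, for illustration only.
points = [(1.4, 0.2), (1.5, 0.2), (1.3, 0.3), (4.7, 1.4), (4.5, 1.5), (4.9, 1.5)]
labels = ["setosa", "setosa", "setosa", "versicolor", "versicolor", "versicolor"]

print(knn_predict(points, labels, (1.6, 0.25), k=3))  # setosa
```

Note that there is no training step beyond keeping the data: all the work happens when a prediction is requested.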
The KNN algorithm can be used for regression as well as classification, but it is mostly used for classification problems.
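For regression, one common variant replaces the majority vote with the average of the neighbours' target values. The sketch below assumes that unweighted averaging and Euclidean distance; the name `knn_regress` and the one-feature data are hypothetical and chosen only for illustration.

```python
import math

def knn_regress(train_points, train_targets, query, k=3):
    """Predict a number as the mean target of the k nearest training points."""
    # Pair each training point's distance to the query with its target value.
    distances = [
        (math.dist(point, query), target)
        for point, target in zip(train_points, train_targets)
    ]
    # Keep the k closest neighbours.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Unweighted average of the neighbours' targets.
    return sum(target for _, target in nearest) / len(nearest)

# Made-up one-feature data where the target is ten times the feature.
points = [(1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]
targets = [10.0, 20.0, 30.0, 40.0, 50.0]

print(knn_regress(points, targets, (2.1,), k=3))  # 20.0
```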
In this program, we use the iris dataset, which is split into a training set (70%) and a test set (30%).
The iris dataset contains the following features:
- sepal length (cm)
- sepal width (cm)
- petal length (cm)
- petal width (cm)
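To make the 70%/30% split concrete, here is a hand-rolled sketch of what such a split does. It is a simplified stand-in, not scikit-learn's `train_test_split`: the helper name `split_train_test`, the rounding rule, and the stand-in data are assumptions, and the shuffled order will not match scikit-learn's even with the same seed.

```python
import random

def split_train_test(samples, labels, test_size=0.3, seed=1):
    """Shuffle the sample indices, then cut them into train and test parts."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_test = round(len(samples) * test_size)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    # Use the same indices for samples and labels so the pairs stay aligned.
    X_train = [samples[i] for i in train_idx]
    X_test = [samples[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return X_train, X_test, y_train, y_test

# Stand-in data: 150 sample ids in 3 classes of 50, mirroring the iris layout.
samples = list(range(150))
labels = [i // 50 for i in samples]

X_train, X_test, y_train, y_test = split_train_test(samples, labels)
print(len(X_train), len(X_test))  # 105 45
```

Shuffling before cutting matters here because the iris samples are stored grouped by class; an unshuffled cut would leave one class almost entirely in the test set.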
To install the required library, open Command Prompt or Terminal and run the following command:
$ pip install scikit-learn
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
import random

# Load the iris dataset and keep the class names for readable output
data_iris = load_iris()
label_target = data_iris.target_names

print()
print("Sample Data from Iris Dataset")
print("*"*30)
# Show 10 randomly chosen samples with their class names
for i in range(10):
    rn = random.randrange(len(data_iris.data))
    print(data_iris.data[rn],"===>",label_target[data_iris.target[rn]])

# Split the data into training (70%) and test (30%) sets
X = data_iris.data
y = data_iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3,random_state=1)
print("The Training dataset length: ",len(X_train))
print("The Testing dataset length: ",len(X_test))

try:
    nn = int(input("Enter number of neighbors :"))
    knn = KNeighborsClassifier(n_neighbors=nn)
    knn.fit(X_train, y_train)
    # Display the mean accuracy on the test set
    print("The Score is :",knn.score(X_test, y_test))
    # Read one sample from the user as four comma-separated values
    test_data = [float(value) for value in input("Enter Test Data :").split(",")]
    print()
    v = knn.predict([test_data])
    print("Predicted output is :",label_target[v])
except ValueError:
    # Raised for non-numeric input or an invalid neighbor or feature count
    print("Please supply valid input......")
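The program expects the test data as four comma-separated numbers, one per iris feature. A small helper can check this before the values reach the classifier. The sketch below is one possible approach; the name `parse_features` and its `expected_count` parameter are hypothetical and not part of the program above.

```python
def parse_features(text, expected_count=4):
    """Turn a comma-separated string into a list of floats, checking the count."""
    parts = [part.strip() for part in text.split(",")]
    if len(parts) != expected_count:
        raise ValueError(
            f"expected {expected_count} comma-separated values, got {len(parts)}"
        )
    # float() raises ValueError itself if a part is not a number.
    return [float(part) for part in parts]

print(parse_features("5.3,2.2,3.2,1.5"))  # [5.3, 2.2, 3.2, 1.5]
```

Because both failure cases raise `ValueError`, a caller can report either kind of bad input with a single handler.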
Sample Run:
--------------
$ python3 Knn.py
Sample Data from Iris Dataset
******************************
[6.1 2.8 4. 1.3] ===> versicolor
[4.6 3.1 1.5 0.2] ===> setosa
[5.8 2.6 4. 1.2] ===> versicolor
[6.9 3.2 5.7 2.3] ===> virginica
[6.7 3.1 4.7 1.5] ===> versicolor
[4.7 3.2 1.3 0.2] ===> setosa
[5. 3. 1.6 0.2] ===> setosa
[6.4 2.9 4.3 1.3] ===> versicolor
[5.5 2.4 3.7 1. ] ===> versicolor
[5.8 2.6 4. 1.2] ===> versicolor
The Training dataset length: 105
The Testing dataset length: 45
Enter number of neighbors :9
The Score is : 0.9777777777777777
Enter Test Data :5.3,2.2,3.2,1.5
Predicted output is : ['versicolor']
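The score printed in the sample run is the mean accuracy, that is, the fraction of test samples predicted correctly. With 45 test samples, a score of about 0.978 corresponds to 44 correct predictions. The sketch below reproduces that arithmetic; the `accuracy` helper and the label lists are made up for illustration and are not the actual predictions from the run.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of positions where the predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# Hypothetical labels: 45 test samples with exactly one misclassification.
y_true = [0] * 15 + [1] * 15 + [2] * 15
y_pred = list(y_true)
y_pred[20] = 2  # one class-1 sample predicted as class 2

print(accuracy(y_true, y_pred))  # 44 / 45, roughly 0.978
```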