Logistic regression is one of the most popular machine learning algorithms and belongs to the family of supervised learning techniques. It predicts a categorical dependent variable from a given set of independent variables, so the outcome must be a categorical or discrete value such as Yes or No, 0 or 1, True or False. Instead of returning exactly 0 or 1, however, the model produces a probability between 0 and 1, which is then converted into a class label (by default, using a threshold of 0.5).
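To make the idea of a probabilistic output concrete, here is a minimal sketch of the logistic (sigmoid) function, which squashes any real-valued score into the range 0 to 1. The weight `w`, the intercept `b`, and the helper names are illustrative choices for this example, not values taken from a trained model.

```python
import math

def sigmoid(z):
    """Map any real number z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(x, w, b):
    """Probability of class 1 for a single feature value x."""
    return sigmoid(w * x + b)

# Hypothetical parameters, chosen only for illustration
w, b = 1.0, -4.5

for hours in (2, 5, 8):
    p = predict_probability(hours, w, b)
    label = 1 if p >= 0.5 else 0  # threshold the probability to get a class
    print(f"hours={hours}: probability={p:.3f}, predicted class={label}")
```

Larger scores give probabilities closer to 1 and smaller scores give probabilities closer to 0, which is why the output can be read as the model's confidence in class 1.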
In this implementation, we use "Study Hours" as the independent variable and "Exam Result" (0 or 1) as the dependent variable.
• pd.read_csv() reads the CSV file into a Pandas DataFrame.
• "Study Hours" and "Exam Result" are the columns used as the feature (X) and the target (y).
• train_test_split() is used to split the data into training (80%) and testing (20%) sets.
• random_state=42 ensures the split is reproducible.
• LogisticRegression() initializes the logistic regression model.
• model.fit(X_train, y_train) trains the model on the training data.
• model.predict(X_test) uses the trained model to predict the exam results for the test set.
• accuracy_score() computes the fraction of test predictions that match the true labels.
To install the required libraries, open Command Prompt or Terminal and execute the following commands:
$ pip install scikit-learn
$ pip install numpy
$ pip install pandas
$ pip install matplotlib
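After installation, a quick check along the following lines can confirm that each library is importable. Note that the scikit-learn package is imported under the name `sklearn`; the helper name `check_packages` is made up for this sketch.

```python
import importlib.util

# pip package name -> name used in the import statement
PACKAGES = {
    "scikit-learn": "sklearn",
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
}

def check_packages(packages):
    """Return {pip_name: True/False} depending on whether the module can be found."""
    return {
        pip_name: importlib.util.find_spec(module_name) is not None
        for pip_name, module_name in packages.items()
    }

status = check_packages(PACKAGES)
for pip_name, found in status.items():
    print(f"{pip_name}: {'installed' if found else 'MISSING'}")
```

Any package reported as MISSING can be installed with the corresponding pip command above.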
Study Hours,Exam Result
1,0
2,0
3,0
4,0
5,1
6,1
7,1
8,1
9,1
10,1
Save the data above as study_hours.csv in the same directory as the program.
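If the file is not at hand, the ten rows listed above can also be written out with a short script. This is a sketch using only the standard library; the file name study_hours.csv matches the one the program reads, while the function name `write_dataset` is an assumption of this example.

```python
import csv

# The ten rows from the listing above: (study hours, exam result)
ROWS = [(1, 0), (2, 0), (3, 0), (4, 0), (5, 1),
        (6, 1), (7, 1), (8, 1), (9, 1), (10, 1)]

def write_dataset(path="study_hours.csv"):
    """Write the study-hours dataset to a CSV file with a header row."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Study Hours", "Exam Result"])
        writer.writerows(ROWS)
    return path

csv_path = write_dataset()
print(f"Wrote {len(ROWS)} rows to {csv_path}")
```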
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
# Step 1: Load the dataset
data = pd.read_csv("study_hours.csv") # Ensure this file exists in the same directory
X = data[['Study Hours']].values
y = data['Exam Result'].values
# Step 2: Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
# Step 3: Train logistic regression model
model = LogisticRegression()
model.fit(X_train, y_train)
# Step 4: Predict and evaluate
y_pred = model.predict(X_test)
# Step 5: Print results
print(f"Accuracy: {accuracy_score(y_test, y_pred):.1f}")
print("\nConfusion Matrix:")
print(confusion_matrix(y_test, y_pred))
print("\nClassification Report:")
print(classification_report(y_test, y_pred))
# Step 6: Plot the predicted probability curve and the 0.5 decision threshold
X_range = np.linspace(X.min() - 1, X.max() + 1, 300).reshape(-1, 1)
y_prob = model.predict_proba(X_range)[:, 1]  # probability of class 1
plt.figure(figsize=(8, 6))
plt.scatter(X_train, y_train, color='blue', label='Training Data')
plt.scatter(X_test, y_test, color='green', marker='x', s=100, label='Testing Data')
plt.plot(X_range, y_prob, color='red', linewidth=2, label='Predicted Probability')
plt.axhline(0.5, color='gray', linestyle='--', label='Decision Threshold (0.5)')
plt.xlabel("Study Hours")
plt.ylabel("Exam Result")
plt.title("Logistic Regression - Study Hours vs Exam Result")
plt.legend()
plt.grid(True)
plt.show()
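As a side sketch separate from the program above, the fitted weight and intercept can be inspected to find the number of study hours at which the predicted probability crosses 0.5. The assumptions here are that the ten rows are typed inline rather than read from the CSV, and that the model is fitted on all ten rows without a train/test split, so the numbers may differ slightly from those of the main program.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Same ten rows as study_hours.csv, written inline so the snippet is self-contained
X = np.arange(1, 11).reshape(-1, 1)           # study hours 1..10
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])  # exam results

# Assumption: fit on all ten rows (no train/test split) to keep the sketch short
model = LogisticRegression()
model.fit(X, y)

w = model.coef_[0][0]    # weight for "Study Hours"
b = model.intercept_[0]  # intercept term

# The predicted probability equals 0.5 where w * x + b = 0
boundary = -b / w
print(f"weight={w:.3f}, intercept={b:.3f}")
print(f"probability crosses 0.5 at about {boundary:.2f} study hours")

# Probability of class 1 for a few hypothetical study-hour values
for hours in (3, 6):
    p = model.predict_proba([[hours]])[0, 1]
    print(f"hours={hours}: P(result=1)={p:.3f}")
```

For a single feature, the decision boundary is a single point on the x-axis: inputs below it are classified as 0 and inputs above it as 1.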
Sample Run:
--------------
$ python3 Logistic_Regression.py
Accuracy: 1.0

Confusion Matrix:
[[1 0]
 [0 1]]

Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00         1
           1       1.00      1.00      1.00         1

    accuracy                           1.00         2
   macro avg       1.00      1.00      1.00         2
weighted avg       1.00      1.00      1.00         2
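The figures in the sample run can be checked by hand from the confusion matrix. The sketch below recomputes accuracy, precision, recall, and F1 for class 1 in plain Python, assuming scikit-learn's layout in which rows are true classes and columns are predicted classes.

```python
# Confusion matrix from the sample run: rows = true class, columns = predicted class
confusion = [[1, 0],
             [0, 1]]

tn, fp = confusion[0]
fn, tp = confusion[1]
total = tn + fp + fn + tp

accuracy = (tp + tn) / total
precision = tp / (tp + fp)  # of the samples predicted as 1, how many are truly 1
recall = tp / (tp + fn)     # of the samples that are truly 1, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

Because the test set holds only two samples, a single wrong prediction would drop the accuracy to 0.50, so the perfect score here says little about how well the model would generalize.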
1) Write a Python program to compute
• Central Tendency Measures: Mean, Median, Mode
• Measure of Dispersion: Variance, Standard Deviation
2) Study of Python Basic Libraries such as Statistics, Math, NumPy and SciPy
3) Study of Python Libraries for ML application such as Pandas and Matplotlib
4) Write a Python program to implement Simple Linear Regression
5) Implementation of Multiple Linear Regression for House Price Prediction using sklearn
6) Implementation of Decision tree using sklearn and its parameter tuning
7) Implementation of KNN using sklearn
8) Implementation of Logistic Regression using sklearn
9) Implementation of K-Means Clustering
10) Performance analysis of Classification Algorithms on a specific dataset (Mini Project)