Praveen Kumar K C: ML - Find S Algorithm

Introduction:

Find-S is a basic concept learning algorithm in machine learning. It finds the most specific hypothesis that is consistent with all the positive training examples. Note that the algorithm considers only the positive training examples; negative examples are ignored. Find-S starts with the most specific hypothesis and generalizes it each time it fails to cover an observed positive training example. The hypothesis therefore moves from the most specific toward the more general, but only as far as the positive examples require.

Important Representation:

? indicates that any value is acceptable for the attribute.
A single required value (e.g., Cold) indicates that only that value is acceptable for the attribute.
ϕ indicates that no value is acceptable.
The most general hypothesis is represented by: {?, ?, ?, ?, ?, ?}
The most specific hypothesis is represented by: {ϕ, ϕ, ϕ, ϕ, ϕ, ϕ}
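
To make the notation concrete, here is a minimal sketch of how a hypothesis can be checked against a single instance. The function name `covers` is made up for this illustration, and the string "0" is assumed to stand in for ϕ (the code later in this post prints the most specific hypothesis the same way).

```python
def covers(hypothesis, instance):
    """Return True if every attribute constraint accepts the instance's value."""
    for constraint, value in zip(hypothesis, instance):
        if constraint == "0":
            return False      # phi: no value is acceptable
        if constraint != "?" and constraint != value:
            return False      # a required value that does not match
    return True


instance = ["Sunny", "Warm", "Normal", "Strong", "Warm", "Same"]

print(covers(["?"] * 6, instance))                          # True
print(covers(["0"] * 6, instance))                          # False
print(covers(["Sunny", "?", "?", "?", "?", "?"], instance))  # True
print(covers(["Rainy", "?", "?", "?", "?", "?"], instance))  # False
```

The most general hypothesis accepts every instance and the most specific one accepts none, which is why Find-S has to generalize as soon as it sees a positive example.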

Steps Involved In Find-S:

Start with the most specific hypothesis.
h = {ϕ, ϕ, ϕ, ϕ, ϕ, ϕ}
Take the next example. If it is negative, make no change to the hypothesis.
If the example is positive and the current hypothesis is too specific to cover it, generalize the hypothesis just enough to cover the example: a constraint of ϕ becomes the example's value, and a constraint whose required value differs from the example's value becomes ?.
Keep repeating the above steps until all the training examples have been processed.
After processing all the training examples, we have the final hypothesis, which we can use to classify new examples.
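
The steps above can be sketched as a single function that works on an in-memory list instead of a file. This is only an illustrative sketch: the name `find_s`, the `positive_label` parameter, and the use of "0" for ϕ are assumptions, and the four training rows are the ones shown in the output further down.

```python
def find_s(examples, positive_label="Yes"):
    """Return the maximally specific hypothesis covering all positive examples.

    Each example is a list of attribute values followed by its class label.
    """
    num_attributes = len(examples[0]) - 1
    hypothesis = ["0"] * num_attributes       # "0" stands in for phi

    for *attributes, label in examples:
        if label != positive_label:
            continue                          # negative examples are ignored
        for j, value in enumerate(attributes):
            if hypothesis[j] == "0":
                hypothesis[j] = value         # nothing accepted yet: adopt the value
            elif hypothesis[j] != value:
                hypothesis[j] = "?"           # conflicting values: generalize
    return hypothesis


training_data = [
    ["Sunny", "Warm", "Normal", "Strong", "Warm", "Same", "Yes"],
    ["Sunny", "Warm", "High", "Strong", "Warm", "Same", "Yes"],
    ["Rainy", "Cold", "High", "Strong", "Warm", "Change", "No"],
    ["Sunny", "Warm", "High", "Strong", "Cool", "Change", "Yes"],
]

print(find_s(training_data))
# ['Sunny', 'Warm', '?', 'Strong', '?', '?']
```

Starting from the all-ϕ hypothesis means the result does not depend on whether the first training example happens to be positive.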

Code Snippet:

# coding: utf-8

# Find-S Algorithm
#
# 1. Initialize h to the most specific hypothesis in H
# 2. For each positive training instance x
#      For each attribute constraint a_i in h:
#        a. If the constraint a_i is satisfied by x, do nothing
#        b. Else replace a_i in h by the next more general constraint that is satisfied by x
# 3. Output hypothesis h

import csv

# Read file:
# load the CSV file and append each row to the list a.
# Each row is also printed to show the dataset (optional).

a = []
with open('finds.csv', newline='') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        a.append(row)
        print(row)

# The last column holds the class label, so it is not counted as an attribute
num_attributes = len(a[0]) - 1

# The most general hypothesis is represented by:
#   ['?', '?', '?', '?', '?', '?']
# The most specific hypothesis is represented by:
#   ['0', '0', '0', '0', '0', '0']   ('0' stands in for ϕ)

print("The most general hypothesis:", ["?"] * num_attributes)
print("The most specific hypothesis:", ["0"] * num_attributes)

# Algorithm implementation:
# update the hypothesis after each training example and print the final hypothesis.

# Start from the most specific hypothesis ('0' stands in for ϕ), so the code
# does not depend on the first row being a positive example
hypothesis = ["0"] * num_attributes

print("\n Find S: Finding a maximally specific hypothesis")

for i in range(len(a)):
    if a[i][num_attributes] == "Yes":
        for j in range(num_attributes):
            if hypothesis[j] == "0":
                # No value accepted yet: adopt the value of this positive example
                hypothesis[j] = a[i][j]
            elif a[i][j] != hypothesis[j]:
                # Values disagree: generalize the attribute to '?'
                hypothesis[j] = '?'
    print("The training example no:", i + 1, " the hypothesis is:", hypothesis)

print("\n The maximally specific hypothesis for training set is")
print(hypothesis)

Output:

['Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same', 'Yes']
['Sunny', 'Warm', 'High', 'Strong', 'Warm', 'Same', 'Yes']
['Rainy', 'Cold', 'High', 'Strong', 'Warm', 'Change', 'No']
['Sunny', 'Warm', 'High', 'Strong', 'Cool', 'Change', 'Yes']
The most general hypothesis: ['?', '?', '?', '?', '?', '?']
The most specific hypothesis: ['0', '0', '0', '0', '0', '0']

 Find S: Finding a maximally specific hypothesis
The training example no: 1  the hypothesis is: ['Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same']
The training example no: 2  the hypothesis is: ['Sunny', 'Warm', '?', 'Strong', 'Warm', 'Same']
The training example no: 3  the hypothesis is: ['Sunny', 'Warm', '?', 'Strong', 'Warm', 'Same']
The training example no: 4  the hypothesis is: ['Sunny', 'Warm', '?', 'Strong', '?', '?']

 The maximally specific hypothesis for training set is
['Sunny', 'Warm', '?', 'Strong', '?', '?']
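
Once the final hypothesis is known, it can be used to label unseen examples. The sketch below is one possible way to do that; the helper name `classify` and the two new instances are hypothetical and are not part of the original dataset.

```python
def classify(hypothesis, instance):
    """Label an instance 'Yes' if it satisfies every constraint, else 'No'."""
    matches = all(c == "?" or c == v for c, v in zip(hypothesis, instance))
    return "Yes" if matches else "No"


final_hypothesis = ["Sunny", "Warm", "?", "Strong", "?", "?"]

# Hypothetical unseen instances, made up for this illustration
new_instances = [
    ["Sunny", "Warm", "Normal", "Strong", "Cool", "Change"],
    ["Rainy", "Warm", "High", "Strong", "Warm", "Same"],
]

for instance in new_instances:
    print(instance, "->", classify(final_hypothesis, instance))
# The first instance is labeled Yes; the second is labeled No because
# 'Rainy' does not satisfy the required value 'Sunny'.
```

Because Find-S returns the most specific consistent hypothesis, any instance that differs from the positive examples on a non-? attribute is labeled No, even if it was never seen as a negative example.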

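Input data Set:

The input file finds.csv holds the four training examples printed at the top of the output, one per line, with the class label in the last column. Reconstructed from those printed rows, its contents would be:

Sunny,Warm,Normal,Strong,Warm,Same,Yes
Sunny,Warm,High,Strong,Warm,Same,Yes
Rainy,Cold,High,Strong,Warm,Change,No
Sunny,Warm,High,Strong,Cool,Change,Yes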

Praveen Kumar K C

Tuesday, November 7, 2023
