Introduction:
The Find-S algorithm is a basic concept learning algorithm in machine learning. It finds the most specific hypothesis that fits all the positive examples. Note that the algorithm considers only the positive training examples and ignores the negative ones. Find-S starts with the most specific hypothesis and generalizes it each time it fails to cover an observed positive training example. Hence, the hypothesis moves in the direction from most specific toward most general, generalizing only as far as the positive examples require.
Important Representation :
- ? indicates that any value is acceptable for the attribute.
- A specific value (e.g., Cold) indicates that only this single value is acceptable for the attribute.
- ϕ indicates that no value is acceptable.
- The most general hypothesis is represented by: {?, ?, ?, ?, ?, ?}
- The most specific hypothesis is represented by: {ϕ, ϕ, ϕ, ϕ, ϕ, ϕ}
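As a small illustration of this notation, the sketch below checks whether a hypothesis accepts a given instance. The helper name `covers` and the constants `ANY` and `NONE` are made up for this example, and the sample day uses attribute values from the data set in this post.

```python
ANY = "?"    # any value is acceptable for the attribute
NONE = "ϕ"   # no value is acceptable for the attribute


def covers(hypothesis, instance):
    """Return True if every attribute constraint accepts the instance's value."""
    for constraint, value in zip(hypothesis, instance):
        if constraint == NONE:
            return False  # ϕ rejects every value
        if constraint != ANY and constraint != value:
            return False  # a specific required value that does not match
    return True


most_general = [ANY] * 6
most_specific = [NONE] * 6
mixed = ["Sunny", "Warm", ANY, "Strong", ANY, ANY]

day = ["Sunny", "Warm", "High", "Strong", "Cool", "Change"]

print(covers(most_general, day))   # True
print(covers(most_specific, day))  # False
print(covers(mixed, day))          # True
```

The most general hypothesis accepts every instance, the most specific one accepts none, and a mixed hypothesis accepts only instances that agree with it on every attribute that is not "?".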
Steps Involved In Find-S :
- Start with the most specific hypothesis: h = {ϕ, ϕ, ϕ, ϕ, ϕ, ϕ}
- Take the next example. If it is negative, the hypothesis is left unchanged.
- If the example is positive and the current hypothesis is too specific to cover it, generalize the hypothesis just enough to cover the example: each attribute constraint that the example violates is replaced by the next more general constraint.
- Repeat the above steps until all the training examples have been processed.
- Once all the training examples have been processed, we have the final hypothesis, which we can use to classify new examples.
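The steps above can be sketched as a small function working on an in-memory list instead of a CSV file. This is only an illustrative sketch: the names `find_s` and `predict` are chosen for this example, the training rows are the ones shown in the output of this post, and at least one training example is assumed.

```python
def find_s(examples):
    """Run Find-S over (attributes, label) pairs, where label is "Yes" or "No"."""
    num_attributes = len(examples[0][0])
    h = ["ϕ"] * num_attributes          # start with the most specific hypothesis
    for attributes, label in examples:
        if label != "Yes":
            continue                    # negative examples leave h unchanged
        for j, value in enumerate(attributes):
            if h[j] == "ϕ":
                h[j] = value            # first positive example: adopt its value
            elif h[j] != value:
                h[j] = "?"              # conflicting value: generalize the attribute
    return h


def predict(h, instance):
    """Classify a new instance: True if the hypothesis accepts it."""
    return all(c == "?" or c == v for c, v in zip(h, instance))


training = [
    (["Sunny", "Warm", "Normal", "Strong", "Warm", "Same"], "Yes"),
    (["Sunny", "Warm", "High", "Strong", "Warm", "Same"], "Yes"),
    (["Rainy", "Cold", "High", "Strong", "Warm", "Change"], "No"),
    (["Sunny", "Warm", "High", "Strong", "Cool", "Change"], "Yes"),
]

h = find_s(training)
print(h)  # ['Sunny', 'Warm', '?', 'Strong', '?', '?']
print(predict(h, ["Sunny", "Warm", "Normal", "Strong", "Cool", "Change"]))  # True
print(predict(h, ["Rainy", "Cold", "High", "Strong", "Warm", "Change"]))    # False
```

Because the hypothesis starts as all ϕ, the first positive example simply copies its values into the hypothesis, and later positive examples can only turn individual attributes into "?".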
Code Snippet:
# coding: utf-8
# # Find-S Algorithm:
# ## Algorithm:
# 1. Initialize h to the most specific hypothesis in H
# 2. For each positive training instance x
#    i. For each attribute constraint a_i in h:
#       a. If the constraint a_i in h is satisfied by x, then do nothing
#       b. Else replace a_i in h by the next more general constraint that is satisfied by x
# 3. Output hypothesis h
#
# In[1]:
import csv
# ### Read File:
# Load the CSV file and append each row to a list
# Also print the row to see the dataset (optional)
# In[ ]:
a = []
with open('finds.csv') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        a.append(row)
        print(row)
# The last column is the class label, so it is not counted as an attribute
num_attributes = len(a[0]) - 1
# 1. The most general hypothesis is represented by:
# ```['?', '?', '?', '?', '?', '?']```
# 2. The most specific hypothesis is represented by:
# ```['0', '0', '0', '0', '0', '0']```
# In[ ]:
print("The most general hypothesis:",["?"]*num_attributes)
print("The most specific hypothesis:",["0"]*num_attributes)
# ### Algorithm Implementation:
# Implementation of the above algorithm by updating the hypothesis at each iteration and output the final hypothesis.
# In[ ]:
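# Start from the most specific hypothesis instead of assuming the first row is positive
hypothesis = ["0"] * num_attributes
print("\n Find S: Finding a maximally specific hypothesis")
for i in range(len(a)):
    if a[i][num_attributes] == "Yes":
        for j in range(num_attributes):
            if hypothesis[j] == "0":
                # First positive example seen: adopt its attribute value
                hypothesis[j] = a[i][j]
            elif a[i][j] != hypothesis[j]:
                # Conflicting values: generalize this attribute
                hypothesis[j] = '?'
    print("The training example no:", i + 1, " the hypothesis is:", hypothesis)
print("\n The maximally specific hypothesis for training set is")
print(hypothesis)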
Output:
['Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same', 'Yes']
['Sunny', 'Warm', 'High', 'Strong', 'Warm', 'Same', 'Yes']
['Rainy', 'Cold', 'High', 'Strong', 'Warm', 'Change', 'No']
['Sunny', 'Warm', 'High', 'Strong', 'Cool', 'Change', 'Yes']
The most general hypothesis: ['?', '?', '?', '?', '?', '?']
The most specific hypothesis: ['0', '0', '0', '0', '0', '0']

Find S: Finding a maximally specific hypothesis
The training example no: 1 the hypothesis is: ['Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same']
The training example no: 2 the hypothesis is: ['Sunny', 'Warm', '?', 'Strong', 'Warm', 'Same']
The training example no: 3 the hypothesis is: ['Sunny', 'Warm', '?', 'Strong', 'Warm', 'Same']
The training example no: 4 the hypothesis is: ['Sunny', 'Warm', '?', 'Strong', '?', '?']

The maximally specific hypothesis for training set is
['Sunny', 'Warm', '?', 'Strong', '?', '?']
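Input data Set:
The rows of finds.csv, as echoed in the output above (no header row; the last column is the class label):
Sunny,Warm,Normal,Strong,Warm,Same,Yes
Sunny,Warm,High,Strong,Warm,Same,Yes
Rainy,Cold,High,Strong,Warm,Change,No
Sunny,Warm,High,Strong,Cool,Change,Yes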