I’ve been playing around with text files for some time now, and I would like to share with you what I’ve learnt.
Today, I wrote a Python script to determine the reading grade of a text file using the Flesch index. The Flesch index was proposed by Dr. Rudolf Flesch in 1949. Flesch index scores usually range from 0 to 100.
|Flesch index||Text file reading grade|
|0-30||College|
|50-60||High school|
|90-100||Fourth grade|
Enough with the facts, let us go into the coding aspect.
WHAT DOES MY PYTHON SCRIPT DO?
My program prompts the user for the name of an existing text file in the program’s current directory. It then reads the file, counts its words, sentences and syllables, and uses the formula below to print the number of words and the reading grade of the text.
The Flesch-Kincaid Grade Level Formula is used to compute the Equivalent Grade Level G:
G = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
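To make the arithmetic concrete, here is a minimal sketch of the formula as a Python 3 function. The function names and the sample counts are my own illustration, not part of the original script. One thing worth noting: the 0 to 100 scale mentioned earlier is normally associated with the Flesch Reading Ease score, which uses a different formula, so I have included it alongside for comparison.

```python
def grade_level(words, sentences, syllables):
    """Flesch-Kincaid Grade Level G, using the formula quoted above."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def reading_ease(words, sentences, syllables):
    """Flesch Reading Ease score, the one usually quoted on a 0 to 100 scale."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


# Hypothetical counts for a short passage: 100 words, 5 sentences, 150 syllables.
print(round(grade_level(100, 5, 150), 2))   # about 9.91
print(round(reading_ease(100, 5, 150), 1))  # about 59.6
```

A higher grade level means harder text, while a higher reading ease score means easier text, so the two numbers move in opposite directions.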
So before I show you the Python code, let us see what these words mean.
|Word||Any sequence of characters that contains no whitespace.|
|Sentence||Any sequence of words ending in a punctuation mark (full stop, question mark, exclamation mark, colon, or semicolon).|
|Syllable||Any word of three characters or less; or any vowel (a, e, i, o, u) or pair of consecutive vowels, except for a final -es, -ed, or -e that is not -le.|
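The three definitions can be sketched as small Python 3 functions. This is one possible reading of the rules, not the script's exact counting: the function names are hypothetical, punctuation is stripped from each word before counting, and I assume any run of consecutive vowels counts as a single syllable.

```python
SENTENCE_MARKS = ".?!:;"
VOWELS = "aeiou"


def count_words(text):
    """A word is any run of non-whitespace characters."""
    return len(text.split())


def count_sentences(text):
    """Count the punctuation marks that end a sentence."""
    return sum(text.count(mark) for mark in SENTENCE_MARKS)


def count_syllables(word):
    """Rough syllable estimate for one word, following the rule above."""
    word = word.lower().strip(".,?!:;\"'()")
    if len(word) <= 3:
        return 1
    # Drop a final -es, -ed, or -e, but keep a final -le.
    if word.endswith(("es", "ed")):
        word = word[:-2]
    elif word.endswith("e") and not word.endswith("le"):
        word = word[:-1]
    # Count each vowel or run of consecutive vowels as one syllable.
    count = 0
    previous_was_vowel = False
    for ch in word:
        is_vowel = ch in VOWELS
        if is_vowel and not previous_was_vowel:
            count += 1
        previous_was_vowel = is_vowel
    return max(count, 1)


sample = "Hello there! How are you? Fine."
print(count_words(sample), count_sentences(sample))
print([count_syllables(w) for w in ("the", "readable", "jumped", "table")])
```

The rule is only an approximation. A word like "python" comes out as one syllable because y is not in the vowel list.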
VERSION OF PYTHON I AM USING.
Algorithm of Flesch Index.
- Get the user’s text file
- Count words
- Count sentences
- Count syllables
- Compute the Flesch Index
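The five steps above can be sketched end to end as a single Python 3 function. This is an illustration under a few assumptions: the name flesch_grade is hypothetical, the syllable count follows the script's approach (every vowel, minus one for a final -es, -ed or -e, plus one for a final -le), and an empty file returns None instead of dividing by zero.

```python
import os
import tempfile


def flesch_grade(path):
    # Step 1: get the text from the user's file.
    with open(path) as handle:
        text = handle.read()
    # Step 2: count words (whitespace-separated tokens).
    tokens = text.split()
    words = len(tokens)
    # Step 3: count sentences by their closing punctuation.
    sentences = sum(text.count(mark) for mark in ".?!:;")
    # Step 4: count syllables: every vowel, minus silent endings, plus a final -le.
    syllables = 0
    for word in tokens:
        word = word.lower()
        syllables += sum(word.count(vowel) for vowel in "aeiou")
        syllables -= sum(1 for ending in ("es", "ed", "e") if word.endswith(ending))
        if word.endswith("le"):
            syllables += 1
    # Step 5: compute the score, guarding against empty input.
    if words == 0 or sentences == 0:
        return None
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Try it on two throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")
    with open(sample, "w") as handle:
        handle.write("The cat sat. The dog ran.")
    empty = os.path.join(tmp, "empty.txt")
    open(empty, "w").close()
    score = flesch_grade(sample)
    empty_score = flesch_grade(empty)

print(score, empty_score)
```

Very short, simple sentences can give a negative grade level, which is expected with this formula.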
Now to business!!!…
FLESCH INDEX PYTHON SCRIPT
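import os
dire = os.getcwd()              # current working directory
listOfdir = os.listdir(dire)    # names of the files in that directory
UserFileName = raw_input("Enter file name:")   # Python 2; use input() on Python 3
if (UserFileName in listOfdir) and (UserFileName.endswith('.txt')):
    text = open(UserFileName).read()
    words = len(text.split())
    sentence = text.count(".") + text.count('!') + \
               text.count(";") + text.count(":") + \
               text.count("?")
    syllable = 0
    for word in text.split():
        for vowel in ['a', 'e', 'i', 'o', 'u']:
            syllable += word.count(vowel)
        for ending in ['es', 'ed', 'e']:
            if word.endswith(ending):
                syllable -= 1
        if word.endswith('le'):
            syllable += 1
    # G from the formula given earlier; float() avoids integer division in Python 2
    G = 0.39 * (float(words) / sentence) + 11.8 * (float(syllable) / words) - 15.59
    if G >= 0 and G <= 30:
        print('The Readability level is College')
    elif G >= 50 and G <= 60:
        print('The Readability level is High School')
    elif G >= 90 and G <= 100:
        print('The Readability level is fourth grade')
    print('This text has %d words' % (words))
elif UserFileName not in listOfdir:
    print("This text file does not exist in current directory")
elif not UserFileName.endswith('.txt'):
    print("This is not a text file.")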
If you look closely at the script above, you will notice some additions that are not in the algorithm. I added them to prevent errors in the program, for example when the file name the user enters does not exist. Remember that indentation matters in Python, so indent each block correctly when you type the script out.
Explanation of the FLESCH INDEX PYTHON SCRIPT.
The Python os module provides functions that help with file handling, such as listing the files in the current directory.
getcwd() returns the current working directory, that is, the directory the program is run from. Our text file has to be saved in that directory, otherwise the script will not find it.
dire holds the path of our current working directory.
listdir(dire) returns a list of the names of the files and folders in the directory dire.
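As a quick illustration, the two calls can be tried on their own. The names dire and listOfdir are the ones the script uses; text_files is an extra variable I have added here to show how the list can be filtered.

```python
import os

dire = os.getcwd()            # absolute path of the current working directory
listOfdir = os.listdir(dire)  # names (not full paths) of the entries inside it

# Keep only the names that end in .txt, as the script expects.
text_files = [name for name in listOfdir if name.endswith(".txt")]
print(dire)
print(text_files)
```

Note that listdir returns bare names such as notes.txt, which is why the script can compare the user's input against the list directly.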
Our next block of code is the if statement, which avoids an error when the input file is not a text file or does not exist in our current working directory.
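The same check can be pulled out into a small helper, which makes it easier to try on its own. This is only a sketch in Python 3: check_file and its directory parameter are hypothetical names, and the messages are the ones the script prints.

```python
import os
import tempfile


def check_file(name, directory="."):
    """Return None if name is a usable .txt file, else an error message."""
    if name not in os.listdir(directory):
        return "This text file does not exist in current directory"
    if not name.endswith(".txt"):
        return "This is not a text file."
    return None


# Try it against a temporary directory holding one .txt and one .csv file.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "notes.txt"), "w").close()
    open(os.path.join(tmp, "data.csv"), "w").close()
    results = [check_file(n, tmp) for n in ("notes.txt", "data.csv", "missing.txt")]

print(results)
```

Returning a message instead of printing it keeps the check separate from the output, so the caller decides what to do with it.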
Our next block of code executes the algorithm of our FLESCH INDEX PYTHON SCRIPT.
We are almost through, but we still have to write the error messages to complete our program. The first elif statement tells the user that the text file does not exist in the current directory, while the second elif statement tells the user that the input file is not a text file.
Finally, we are through!
And I believe you can now write your own Flesch index Python script to determine the readability of a text file.
If you are a Python developer, view this project on my GitHub page.
If you need further explanation, please feel free to comment, and do share this post.