Project for CSE – Simple Speech interpreter AI

 

Artificial intelligence and automation are spreading across every sector much faster than most of us expected. Automating daily activities and complex tasks saves a lot of time. Voice targets one of the most common tasks for a student: collecting important information. The task may sound easy, but it becomes tedious when you have to browse many websites for a single search, and finding the best content or the best website for the data you need takes a lot of time. Voice is a Python automation script built on several third-party Python libraries.

Voice works in the same way as Amazon’s Alexa or Google’s “OK Google”, or, in a cooler way, like a tiny version of Jarvis from Iron Man’s suit. Voice is a starting point, a foundation on which a fuller AI assistant can be built.

 

Let’s start building the project.

First of all, install Python on your system. You can download Python from the link provided here.

After installing Python, make sure pip, the package installer for Python, is available. It will help you install all the libraries required for the project. pip is usually installed automatically along with Python. If you want more information on installing pip, you can find it here.

 

Create a folder named Voice, or any project name you like. The project mainly consists of two Python files, “play_me.py” and “extension.py”; create both in the root folder. You can refer to the following project hierarchy.



Once the files are created, the next step is to install the libraries for the project.

Note: Details on these libraries and why they are imported are provided at the end of the blog.

First, run these commands separately in a terminal to install the libraries for your project.
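The exact install commands depend on your environment, so as a rough aid, the sketch below checks which of the modules imported later in this post cannot be found and prints a pip command for each one. The mapping from module name to PyPI package name is an assumption and should be checked against each library's own installation notes. PyAudio is listed because the Microphone class of speech_recognition relies on it. The function name missing_packages is made up for this example.

```python
import importlib.util

# Assumed mapping from import name to PyPI package name; verify each one
# against the library's own installation notes before relying on it.
REQUIRED = {
    "speech_recognition": "SpeechRecognition",
    "pyaudio": "PyAudio",  # used by sr.Microphone
    "googlesearch": "google",
    "gtts": "gTTS",
    "playsound": "playsound",
    "bs4": "beautifulsoup4",
}


def missing_packages(required=REQUIRED):
    """Return the PyPI names of packages whose module cannot be found."""
    return [
        package
        for module, package in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    # Print one install command per missing package.
    for package in missing_packages():
        print("pip install " + package)
```

Running this script before the project prints nothing when every module is already importable.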

 

Now, let’s start with coding.

 

import os
import speech_recognition as sr

Type the above code into the play_me.py file to import the libraries.

 

 

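r = sr.Recognizer()

with sr.Microphone() as source:
    # record 5 seconds of audio from the default microphone
    audio_data = r.record(source, duration=5)

print("Recognizing...")
try:
    # convert speech to text
    text = r.recognize_google(audio_data)
except sr.UnknownValueError:
    # the recognizer could not make out any words
    raise SystemExit("Could not understand the audio, please try again.")
print(text)

# save the recognized text so that extension.py can read it
with open("output.txt", "w") as file:
    file.write(text)

os.system("python extension.py")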

 

 

The above code creates a file named output.txt that holds the text of what the user said. Once the microphone starts, the user is supposed to ask a question; it can be about anything that is available on the internet or, more specifically, on Wikipedia.

The user gets 5 seconds to record the question. After capturing the audio, the recognizer converts it into text and the code writes the recognized text into output.txt. Finally, the last line runs the next Python file.
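As a possible alternative to os.system for that last step, the sketch below starts the second script with the same Python interpreter that is running the first one and returns its exit code. This is only one way to do it; the helper name run_script is hypothetical and not part of the original project.

```python
import subprocess
import sys


def run_script(path):
    """Run another Python file with the current interpreter.

    Returns the exit code of the child process (0 usually means success).
    """
    completed = subprocess.run([sys.executable, path])
    return completed.returncode
```

In play_me.py this could be called as run_script("extension.py"). Using sys.executable avoids depending on whichever "python" happens to be first on the PATH.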

 

 

try:
    from googlesearch import search
except ImportError:
    # the script cannot continue without the search function
    raise SystemExit("No module named googlesearch")

import os
import urllib.request

from bs4 import BeautifulSoup as soup
from gtts import gTTS
from playsound import playsound

# read the question that play_me.py saved
with open("output.txt", "r") as file:
    myText = file.read()

language = 'en'

for link in search(myText + " wikipedia", tld="co.in", num=5, stop=5, pause=2):
    print(link)

    # only process results that point to Wikipedia
    if "wikipedia" in link:
        with urllib.request.urlopen(link) as data:
            urlData = data.read()

        page = soup(urlData, "html.parser")
        temp = page.find('div', {"class": "mw-body"})
        if temp is None or temp.p is None:
            # page layout not as expected, skip this result
            continue

        # text of the first paragraph of the article body
        printData = temp.p.text
        print(printData)

        if len(printData) > 1:
            print(len(printData))
            # convert the paragraph to speech, play it, then delete the file
            obj = gTTS(text=printData, lang=language, slow=False)
            obj.save('welcome.mp3')
            playsound("welcome.mp3", True)
            os.remove('welcome.mp3')

 

 

The above script reads the recognized text that was stored earlier. The text is then searched on the web, and a condition in the loop restricts processing to specific websites; in the code above, only links containing "wikipedia" are handled. The output is therefore content parsed from Wikipedia pages.
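The parsing step takes the first paragraph of the page, which on some pages may turn out to be empty. As an illustration of picking the first paragraph that actually contains text, here is a minimal sketch that uses the standard library's html.parser in place of BeautifulSoup, purely so the example has no extra dependencies. Unlike the script, it looks at every paragraph in the document rather than only those inside the mw-body element. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class FirstParagraph(HTMLParser):
    """Collect the text of the first <p> element that is not blank."""

    def __init__(self):
        super().__init__()
        self.in_paragraph = False  # True while inside a <p> element
        self.parts = []            # text pieces of the current paragraph
        self.result = None         # first non-blank paragraph found

    def handle_starttag(self, tag, attrs):
        # Start collecting only while no paragraph has been accepted yet.
        if tag == "p" and self.result is None:
            self.in_paragraph = True
            self.parts = []

    def handle_endtag(self, tag):
        if tag == "p" and self.in_paragraph:
            self.in_paragraph = False
            text = "".join(self.parts).strip()
            if text:
                self.result = text

    def handle_data(self, data):
        if self.in_paragraph:
            self.parts.append(data)


def first_paragraph(html):
    """Return the first non-blank paragraph text, or None if there is none."""
    parser = FirstParagraph()
    parser.feed(html)
    return parser.result
```

The same idea can be written with BeautifulSoup by looping over the paragraphs of the page and stopping at the first one whose stripped text is not empty.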

You can add more checks to make the search more versatile, but this may slow the whole process down, since more websites have to be downloaded and parsed.
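One way such extra checks could look is an allow-list of domains that is compared against the host name of each link, instead of a substring test on the whole URL. The sketch below is an assumption about how you might structure it; the domain list and the name is_allowed are examples only.

```python
from urllib.parse import urlparse

# Hypothetical allow-list; extend it with whichever sites you trust.
ALLOWED_DOMAINS = ("wikipedia.org", "britannica.com")


def is_allowed(link, domains=ALLOWED_DOMAINS):
    """Return True if the link's host is one of the domains or a subdomain."""
    host = urlparse(link).hostname or ""
    return any(host == d or host.endswith("." + d) for d in domains)
```

Checking the host name means a link such as https://example.com/wikipedia is rejected, while https://en.wikipedia.org/wiki/Python is accepted.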

After fetching a page and extracting the required text (which part is extracted is configurable), the script converts that text into an mp3 audio file. This is done with the gtts library, which converts text to speech in the chosen language, at normal or slow speed.

The saved audio is then played with the playsound function, which plays mp3 audio files. With the settings above, the script narrates up to 5 times, once for each Wikipedia result returned by the googlesearch library’s search function. The number of results, and therefore of narrations, is controlled by the arguments passed to search, so you can experiment with them.
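If you want to cap the number of narrations independently of the search arguments, one option is to stop after a fixed number of matching links. The following is a small sketch under that assumption; first_matches and its parameters are hypothetical names, and links can be any iterable of URLs, such as the result of the search call.

```python
from itertools import islice


def first_matches(links, keep, limit):
    """Return at most `limit` links for which keep(link) is true.

    The links are consumed lazily, so iteration stops as soon as
    enough matching links have been found.
    """
    matching = (link for link in links if keep(link))
    return list(islice(matching, limit))
```

The script could then loop over first_matches(results, lambda link: "wikipedia" in link, 2) to narrate at most two articles.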

 

To run the whole project, you only have to run the play_me.py script, which automatically calls the extension script.

Save both files and run the command below to start the project.

 

python play_me.py

 

 
