Commit 3b4f29a

Update README.md
1 parent 7e961dc commit 3b4f29a

1 file changed: +5 -5 lines changed

README.md

Lines changed: 5 additions & 5 deletions
@@ -17,26 +17,26 @@ This is a stripped down version of the [**Open Speech Recording**](https://githu
python -m flask run
```

-3. Then open the link provided in the terminal in a web browser to run the application. Make sure to **run the application in a private or incognito window** to avoid caching issues. We've also found that the app works best when **using Chrome**. Once the app opens you'll need to give it access to your microphone, and then you can click ```Record```. Once you finish recording all of the words, a popup will appear and ask if you'd like to download the data. Simply click ```OK``` and the files will be downloaded into the folder from which you are running the Flask app (which should be the open-speech-recording folder). You can then close the browser window and kill the app with ```Ctrl+C```.
+3. Then open the link provided in the terminal in a web browser to run the application. Make sure to **run the application in a private or incognito window** to avoid caching issues. We've also found that the app works best when **using Chrome**. Once the app opens you'll need to give it access to your microphone, and then you can click ```Record```. Once you finish recording all of the words, a popup will appear and ask if you'd like to download the data. Simply click ```OK``` and the files will be downloaded into the folder from which you are running the Flask app (which should be the open-speech-recording folder).

-Note: if you want to change the words or counts recorded, make sure to kill and restart the app and open the link in a new incognito window!
+Note: if you want to change the words or counts, make sure to kill and restart the app and open the link in a new incognito window to avoid caching issues at the server or browser level!

### You can use the scripts to manipulate the data as follows:

-1. You now have a large collection of ```.ogg``` files in your directory. We now need to convert them to ```.wav``` files using ```ffmpeg```:
+1. You can convert the ```.ogg``` files to ```.wav``` files using ```ffmpeg```:
```
sudo apt-get install ffmpeg
mkdir wavs
find *.ogg -print0 | xargs -0 basename -s .ogg | xargs -I {} ffmpeg -i {}.ogg -ar 16000 wavs/{}.wav
```
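For readers who would rather script the conversion, the same step can be sketched in Python. This is a minimal illustration, not part of the repository: the function names `build_ffmpeg_command` and `convert_all` are hypothetical, and the sketch assumes `ffmpeg` is already installed and on the `PATH`. It uses the same `-i` and `-ar 16000` options as the shell pipeline above.

```python
from pathlib import Path
from typing import List
import subprocess


def build_ffmpeg_command(ogg_path: Path, out_dir: Path, sample_rate: int = 16000) -> List[str]:
    """Build the ffmpeg argument list that resamples one .ogg file to a .wav in out_dir."""
    wav_path = out_dir / (ogg_path.stem + ".wav")
    return ["ffmpeg", "-i", str(ogg_path), "-ar", str(sample_rate), str(wav_path)]


def convert_all(src_dir: Path = Path("."), out_dir: Path = Path("wavs")) -> None:
    """Convert every .ogg file in src_dir (hypothetical helper; needs ffmpeg installed)."""
    out_dir.mkdir(exist_ok=True)  # same effect as `mkdir wavs`, but safe to re-run
    for ogg_path in sorted(src_dir.glob("*.ogg")):
        # check=True raises CalledProcessError if ffmpeg exits with an error
        subprocess.run(build_ffmpeg_command(ogg_path, out_dir), check=True)
```

Calling `convert_all()` from the folder that holds the downloaded recordings should write one 16 kHz `.wav` per `.ogg` into `wavs/`. Passing the arguments as a list avoids shell quoting problems with unusual file names.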

-2. Then we need to trim them using Pete's tool. Note: background-noise recordings do not need to be trimmed and do not need to be recorded with the tool; you simply need to convert them to ```.wav``` files. In fact, they are expected to be longer. See the [original dataset](https://storage.googleapis.com/download.tensorflow.org/data/speech_commands_v0.02.tar.gz) for examples.
+2. You can then trim the ```.wav``` files with Pete's tool:
```
mkdir trimmed_wavs
make -C extract_loudest_section/
/tmp/extract_loudest_section/gen/bin/extract_loudest_section 'wavs/*.wav' trimmed_wavs/
```
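To give a feel for what "extracting the loudest section" means, here is a simplified Python sketch. It is an assumption-laden illustration and not the algorithm used by `extract_loudest_section`: it slides a fixed-length window over a list of samples and keeps the window with the highest total energy (sum of squared samples). The function names are hypothetical. For a 1 second clip at the 16000 Hz rate used above, `window` would be 16000.

```python
from typing import List, Sequence


def loudest_window_start(samples: Sequence[float], window: int) -> int:
    """Return the start index of the contiguous window with the highest energy."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(samples) <= window:
        return 0  # clip is already short enough; keep it from the beginning
    energy = sum(s * s for s in samples[:window])
    best_energy, best_start = energy, 0
    for start in range(1, len(samples) - window + 1):
        # Slide by one sample: add the newest sample, drop the oldest one.
        energy += samples[start + window - 1] ** 2 - samples[start - 1] ** 2
        if energy > best_energy:
            best_energy, best_start = energy, start
    return best_start


def trim_to_loudest(samples: Sequence[float], window: int) -> List[float]:
    """Return the loudest window of the clip (illustrative, not Pete's tool)."""
    start = loudest_window_start(samples, window)
    return list(samples[start:start + window])
```

The running-sum update keeps the scan linear in the number of samples, which matters for clips that are many seconds long.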
-3. Finally, we need to create the directory structure that the training script expects: one folder for each of the ```wanted_words```, filled with ```.wav``` files of that word (as 1 second clips), and an additional folder called ```_background_noise_``` containing longer files of background noise. We will then compress this into a zip file so it can be easily uploaded to Colab.
+3. Finally, we can create the directory structure expected by the TensorFlow training script by running another of Pete's scripts, and then compress it into a zip file so it can be easily uploaded to Colab.
```
python organize_wavs.py
cd output_wavs
