A tool written in Python to generate Frequency Distribution Tables in .xlsx file format.
I made it to make my life in Statistics class a little easier, and as an excuse to actually learn more Python.
In a terminal of your choice, run:
git clone https://github.com/cobbdzon/fdt.py.git
cd fdt.py
pip install -r requirements.txt
Always remember to configure the dataset in src\config.json before generating a file.
Simply run the batch file src\run.bat and type in the parameters.
After you enter the parameters, the program automatically opens the output directory with the .xlsx file selected for you. You can edit the code to remove the auto-open if you don't like a new window popping up every time you generate a file.
The output name must be a valid file name and the number of classes must be a valid number; otherwise the program will not output anything.
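As a rough illustration of what "a valid file name" can mean on Windows, here is a minimal sketch. The helper `is_valid_output_name` is hypothetical and not part of fdt.py; it only applies the general Windows naming rules (forbidden characters, reserved device names, no trailing space or dot), and the program's own check may differ.

```python
import re

# Device names that Windows reserves, with or without an extension.
_RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)
# Characters Windows does not allow in a file name, plus control characters.
_FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def is_valid_output_name(name):
    """Hypothetical helper: True if `name` is usable as a Windows file name."""
    # Reject empty names and names that end in a space or a dot.
    if not name or name != name.rstrip(" ."):
        return False
    if _FORBIDDEN.search(name):
        return False
    # "CON.xlsx" is as reserved as "CON", so compare the part before the first dot.
    return name.split(".")[0].upper() not in _RESERVED


print(is_valid_output_name("FDT"))       # True
print(is_valid_output_name("week1/FDT")) # False
```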
The old and original way
py "C:\...\fdt.py\src\init.py" [OUTPUT_NAME] [NUMBER_OF_CLASSES]
Replace the ... with the path to the fdt.py repository on your machine.
If either [OUTPUT_NAME] or [NUMBER_OF_CLASSES] is not specified, its default value in config.json will be used instead.
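The fallback behaviour can be sketched roughly as follows. `resolve_parameters` is a hypothetical function written for illustration, not the actual code in init.py; it assumes the arguments arrive in the order shown above and that the config keys are `outputName` and `numberOfClasses`.

```python
def resolve_parameters(argv, config):
    """Hypothetical sketch: pick CLI values, falling back to config defaults.

    `argv` is the argument list without the script name (sys.argv[1:]).
    """
    output_name = argv[0] if len(argv) > 0 else config["outputName"]
    raw_classes = argv[1] if len(argv) > 1 else config["numberOfClasses"]
    # int() raises ValueError for input such as "six".
    number_of_classes = int(raw_classes)
    if number_of_classes < 1:
        raise ValueError("number of classes must be at least 1")
    return output_name, number_of_classes


defaults = {"outputName": "FDT", "numberOfClasses": 6}
print(resolve_parameters([], defaults))               # ('FDT', 6)
print(resolve_parameters(["grades"], defaults))       # ('grades', 6)
print(resolve_parameters(["grades", "8"], defaults))  # ('grades', 8)
```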
You can find the outputted .xlsx files in the out folder inside src.
An example of a proper config.json from src\templates is shown below:
{
"outputName": "FDT",
"numberOfClasses": 6,
"data": [
69, 97, 76, 60, 35, 83, 63,
67, 40, 85, 75, 49, 58, 55,
59, 73, 43, 93, 38, 78, 71,
55, 51, 70, 89, 61, 65, 65,
72, 65, 75, 32, 64, 60, 75,
89, 75, 65, 85, 87, 45, 75
]
}
Config | Type | Description |
---|---|---|
outputName | string | The default file name of the .xlsx file that will be outputted by the script. |
numberOfClasses | number | The default number of classes in the Frequency Distribution Table. |
data | array<number> | The data sample for the Frequency Distribution Table. |
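If you want to check a config before running the tool, something like the sketch below could work. `validate_config` is a hypothetical helper, not part of fdt.py, and it assumes that `numberOfClasses` should be a positive whole number and that `data` must not be empty.

```python
import json


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_config(config):
    """Hypothetical helper: return a list of problems (empty if none found)."""
    problems = []
    name = config.get("outputName")
    if not isinstance(name, str) or not name:
        problems.append("outputName must be a non-empty string")
    classes = config.get("numberOfClasses")
    if isinstance(classes, bool) or not isinstance(classes, int) or classes < 1:
        problems.append("numberOfClasses must be a positive whole number")
    data = config.get("data")
    if not isinstance(data, list) or not data:
        problems.append("data must be a non-empty array")
    elif not all(_is_number(x) for x in data):
        problems.append("data must contain only numbers")
    return problems


config = json.loads('{"outputName": "FDT", "numberOfClasses": 6, "data": [69, 97, 76]}')
print(validate_config(config))  # []
```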
These are the formulas used in the code, kept here mainly for the author's reference.
Notations from statistics class.
Variable | Name |
---|---|
Class Width | |
Class Interval | |
Class Interval | |
Class Mark | |
Class Boundary | |
Frequency | |
Total Frequency | |
Relative Frequency | |
Cumulative Frequency | |
Less Than Cumulative Frequency | |
Greater Than Cumulative Frequency |
This is my own notation for certain unannotated values.
Variable | Name |
---|---|
Dataset Lowest Value | |
Dataset Highest Value | |
Desired Classes | |
Class Interval Lower Limit | |
Class Interval Upper Limit | |
Class Boundary Lower Limit | |
Class Boundary Upper Limit |
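To show how the quantities named above fit together, here is a minimal sketch that builds the table rows for the sample dataset. It is not the code from this repository: `build_fdt` is a hypothetical function, and it assumes integer data, a class width of ceil((highest - lowest + 1) / classes), and class boundaries 0.5 below and above the class limits. Textbooks (and this tool) may use slightly different rules.

```python
import math


def build_fdt(data, number_of_classes):
    """Hypothetical sketch: one dict per class, using common textbook rules."""
    lowest, highest = min(data), max(data)
    # Assumed width rule for integer data; guarantees the top class reaches `highest`.
    class_width = math.ceil((highest - lowest + 1) / number_of_classes)
    total = len(data)
    rows = []
    less_than_cf = 0
    for k in range(number_of_classes):
        lower = lowest + k * class_width
        upper = lower + class_width - 1
        frequency = sum(1 for x in data if lower <= x <= upper)
        less_than_cf += frequency
        rows.append({
            "interval": (lower, upper),
            "mark": (lower + upper) / 2,             # midpoint of the class limits
            "boundary": (lower - 0.5, upper + 0.5),
            "frequency": frequency,
            "relative_frequency": frequency / total,
            "less_than_cf": less_than_cf,            # running total from the first class
            "greater_than_cf": total - less_than_cf + frequency,  # this class and all above
        })
    return rows


sample = [
    69, 97, 76, 60, 35, 83, 63,
    67, 40, 85, 75, 49, 58, 55,
    59, 73, 43, 93, 38, 78, 71,
    55, 51, 70, 89, 61, 65, 65,
    72, 65, 75, 32, 64, 60, 75,
    89, 75, 65, 85, 87, 45, 75,
]
for row in build_fdt(sample, 6):
    print(row["interval"], row["frequency"], row["less_than_cf"], row["greater_than_cf"])
```

With these assumptions the sample yields the intervals 32-42, 43-53, 54-64, 65-75, 76-86, and 87-97, with frequencies 4, 4, 9, 15, 5, and 5.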