Dataset statistics
Number of variables | 6 |
---|---|
Number of observations | 865 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 190.2 KiB |
Average record size in memory | 225.2 B |
Variable types
CAT | 3 |
---|---|
NUM | 3 |
Reproduction
Analysis started | 2020-02-13 23:57:46.007107 |
---|---|
Analysis finished | 2020-02-13 23:57:49.053188 |
Version | pandas-profiling v2.5.0 |
Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
Download configuration | config.yaml |
Code
Distinct count | 865 |
---|---|
Unique (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 6.9 KiB |
wild_watermelon | 1 |
---|---|
ube | 1 |
fuchsia | 1 |
camouflage_green | 1 |
violet_ryb | 1 |
Other values (860) | 860 |
Value | Count | Frequency (%) | |
wild_watermelon | 1 | 0.1% | |
ube | 1 | 0.1% | |
fuchsia | 1 | 0.1% | |
camouflage_green | 1 | 0.1% | |
violet_ryb | 1 | 0.1% | |
fluorescent_yellow | 1 | 0.1% | |
air_force_blue_usaf | 1 | 0.1% | |
medium_aquamarine | 1 | 0.1% | |
pink_pearl | 1 | 0.1% | |
electric_cyan | 1 | 0.1% | |
Other values (855) | 855 | 98.8% |
Length
Max length | 39 |
---|---|
Mean length | 11.37572254 |
Min length | 3 |
Value | Count | Frequency (%) | |
Lowercase_Letter | 26 | 83.9% | |
Decimal_Number | 4 | 12.9% | |
Connector_Punctuation | 1 | 3.2% |
Value | Count | Frequency (%) | |
Latin | 26 | 83.9% | |
Common | 5 | 16.1% |
Value | Count | Frequency (%) | |
ASCII | 31 | 100.0% |
Name
Distinct count | 865 |
---|---|
Unique (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 6.9 KiB |
Light Yellow | 1 |
---|---|
Light Red Ochre | 1 |
Pastel Pink | 1 |
Resolution Blue | 1 |
Office Green | 1 |
Other values (860) | 860 |
Value | Count | Frequency (%) | |
Light Yellow | 1 | 0.1% | |
Light Red Ochre | 1 | 0.1% | |
Pastel Pink | 1 | 0.1% | |
Resolution Blue | 1 | 0.1% | |
Office Green | 1 | 0.1% | |
Electric Lavender | 1 | 0.1% | |
Dark Slate Blue | 1 | 0.1% | |
Copper Rose | 1 | 0.1% | |
School Bus Yellow | 1 | 0.1% | |
Flavescent | 1 | 0.1% | |
Other values (855) | 855 | 98.8% |
Length
Max length | 41 |
---|---|
Mean length | 11.59190751 |
Min length | 3 |
Value | Count | Frequency (%) | |
Lowercase_Letter | 29 | 42.0% | |
Uppercase_Letter | 26 | 37.7% | |
Other_Punctuation | 5 | 7.2% | |
Decimal_Number | 4 | 5.8% | |
Open_Punctuation | 1 | 1.4% | |
Close_Punctuation | 1 | 1.4% | |
Final_Punctuation | 1 | 1.4% | |
Space_Separator | 1 | 1.4% | |
Dash_Punctuation | 1 | 1.4% |
Value | Count | Frequency (%) | |
Latin | 55 | 79.7% | |
Common | 14 | 20.3% |
Value | Count | Frequency (%) | |
ASCII | 65 | 98.5% | |
Punctuation | 1 | 1.5% |
Hex
Distinct count | 765 |
---|---|
Unique (%) | 88.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 6.9 KiB |
#c19a6b | 5 |
---|---|
#fada5e | 4 |
#967117 | 4 |
#fad6a5 | 3 |
#d2691e | 3 |
Other values (760) | 846 |
Value | Count | Frequency (%) | |
#c19a6b | 5 | 0.6% | |
#fada5e | 4 | 0.5% | |
#967117 | 4 | 0.5% | |
#fad6a5 | 3 | 0.3% | |
#d2691e | 3 | 0.3% | |
#a52a2a | 3 | 0.3% | |
#008000 | 3 | 0.3% | |
#0ff | 3 | 0.3% | |
#808080 | 3 | 0.3% | |
#0f0 | 3 | 0.3% | |
Other values (755) | 831 | 96.1% |
Length
Max length | 7 |
---|---|
Mean length | 6.798843931 |
Min length | 4 |
Value | Count | Frequency (%) | |
Decimal_Number | 10 | 58.8% | |
Lowercase_Letter | 6 | 35.3% | |
Other_Punctuation | 1 | 5.9% |
Value | Count | Frequency (%) | |
Common | 11 | 64.7% | |
Latin | 6 | 35.3% |
Value | Count | Frequency (%) | |
ASCII | 17 | 100.0% |
R
Distinct count | 221 |
---|---|
Unique (%) | 25.5% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 158.59884393063584 |
---|---|
Minimum | 0 |
Maximum | 255 |
Zeros | 81 |
Zeros (%) | 9.4% |
Memory size | 6.9 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 101 |
median | 178 |
Q3 | 236 |
95-th percentile | 255 |
Maximum | 255 |
Range | 255 |
Interquartile range (IQR) | 135 |
Descriptive statistics
Standard deviation | 85.33843164 |
---|---|
Coefficient of variation (CV) | 0.5380772617 |
Kurtosis | -0.9264508707 |
Mean | 158.5988439 |
Median Absolute Deviation (MAD) | 72.69125464 |
Skewness | -0.5936792074 |
Sum | 137188 |
Variance | 7282.647915 |
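Several of the figures above are derived from others in the same tables. As a rough check, the sketch below recomputes them from the reported sum, observation count, standard deviation, and quantiles. The variable names are illustrative, and the inputs are the rounded values printed in this report, so the last digits may differ slightly.

```python
# Values copied from the tables above (rounded as printed in the report).
n_observations = 865
total = 137188
std_dev = 85.33843164
q1, q3 = 101, 236
minimum, maximum = 0, 255

mean = total / n_observations          # Mean = Sum / number of observations
cv = std_dev / mean                    # Coefficient of variation = std / mean
variance = std_dev ** 2                # Variance = std squared
iqr = q3 - q1                          # Interquartile range = Q3 - Q1
value_range = maximum - minimum        # Range = Maximum - Minimum

print(f"mean={mean:.7f} cv={cv:.10f} variance={variance:.6f}")
print(f"iqr={iqr} range={value_range}")
```

The results can be compared with the reported Mean (158.5988439), CV (0.5380772617), Variance (7282.647915), IQR (135), and Range (255).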
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 0.5 100.5 203.5 205.5 214.5 249.5 254.5 255. ], "bayesian blocks" binning strategy used)
Value | Count | Frequency (%) | |
255 | 110 | 12.7% | |
0 | 81 | 9.4% | |
250 | 15 | 1.7% | |
204 | 13 | 1.5% | |
128 | 11 | 1.3% | |
150 | 11 | 1.3% | |
227 | 10 | 1.2% | |
153 | 10 | 1.2% | |
244 | 10 | 1.2% | |
240 | 9 | 1.0% | |
Other values (211) | 585 | 67.6% |
Value | Count | Frequency (%) | |
0 | 81 | 9.4% | |
1 | 4 | 0.5% | |
2 | 1 | 0.1% | |
3 | 2 | 0.2% | |
5 | 1 | 0.1% |
Value | Count | Frequency (%) | |
255 | 110 | 12.7% | |
254 | 7 | 0.8% | |
253 | 8 | 0.9% | |
252 | 6 | 0.7% | |
251 | 9 | 1.0% |
G
Distinct count | 234 |
---|---|
Unique (%) | 27.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 124.68323699421966 |
---|---|
Minimum | 0 |
Maximum | 255 |
Zeros | 58 |
Zeros (%) | 6.7% |
Memory size | 6.9 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 64 |
median | 123 |
Q3 | 190 |
95-th percentile | 250 |
Maximum | 255 |
Range | 255 |
Interquartile range (IQR) | 126 |
Descriptive statistics
Standard deviation | 76.27022506 |
---|---|
Coefficient of variation (CV) | 0.6117119422 |
Kurtosis | -1.097846721 |
Mean | 124.683237 |
Median Absolute Deviation (MAD) | 64.8274944 |
Skewness | 0.0522334723 |
Sum | 107851 |
Variance | 5817.14723 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 0.5 27.5 126.5 132.5 254.5 255. ], "bayesian blocks" binning strategy used)
Value | Count | Frequency (%) | |
0 | 58 | 6.7% | |
255 | 35 | 4.0% | |
128 | 13 | 1.5% | |
105 | 12 | 1.4% | |
51 | 11 | 1.3% | |
204 | 11 | 1.3% | |
66 | 9 | 1.0% | |
102 | 9 | 1.0% | |
218 | 9 | 1.0% | |
160 | 9 | 1.0% | |
Other values (224) | 689 | 79.7% |
Value | Count | Frequency (%) | |
0 | 58 | 6.7% | |
1 | 2 | 0.2% | |
2 | 2 | 0.2% | |
3 | 2 | 0.2% | |
6 | 2 | 0.2% |
Value | Count | Frequency (%) | |
255 | 35 | 4.0% | |
254 | 3 | 0.3% | |
253 | 2 | 0.2% | |
252 | 2 | 0.2% | |
251 | 1 | 0.1% |
B
Distinct count | 230 |
---|---|
Unique (%) | 26.6% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 119.0878612716763 |
---|---|
Minimum | 0 |
Maximum | 255 |
Zeros | 80 |
Zeros (%) | 9.2% |
Memory size | 6.9 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 53 |
median | 119 |
Q3 | 186 |
95-th percentile | 253.6 |
Maximum | 255 |
Range | 255 |
Interquartile range (IQR) | 133 |
Descriptive statistics
Standard deviation | 78.34386249 |
---|---|
Coefficient of variation (CV) | 0.6578660634 |
Kurtosis | -1.13796004 |
Mean | 119.0878613 |
Median Absolute Deviation (MAD) | 67.11706773 |
Skewness | 0.1072876893 |
Sum | 103011 |
Variance | 6137.76079 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 1. 29.5 106.5 107.5 126.5 128.5 240.5 254.5 255. ], "bayesian blocks" binning strategy used)
Value | Count | Frequency (%) | |
0 | 80 | 9.2% | |
255 | 41 | 4.7% | |
107 | 15 | 1.7% | |
128 | 14 | 1.6% | |
204 | 10 | 1.2% | |
120 | 9 | 1.0% | |
94 | 9 | 1.0% | |
51 | 8 | 0.9% | |
33 | 8 | 0.9% | |
59 | 8 | 0.9% | |
Other values (220) | 663 | 76.6% |
Value | Count | Frequency (%) | |
0 | 80 | 9.2% | |
2 | 3 | 0.3% | |
3 | 1 | 0.1% | |
5 | 2 | 0.2% | |
7 | 2 | 0.2% |
Value | Count | Frequency (%) | |
255 | 41 | 4.7% | |
254 | 3 | 0.3% | |
252 | 1 | 0.1% | |
251 | 1 | 0.1% | |
250 | 7 | 0.8% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that, for a linear function, the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
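The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code the report itself uses; the function name `pearson_r` is hypothetical, and the sample data are the R and G values of the ten rows listed under "First rows".

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of the covariance and the two variances cancel,
    # so plain sums of products and squares are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Raises ZeroDivisionError if either input is constant (r is undefined then).
    return cov / math.sqrt(var_x * var_y)

# R and G values of the ten "First rows" of this report.
r_channel = [93, 0, 114, 163, 240, 227, 196, 239, 229, 255]
g_channel = [138, 48, 160, 38, 248, 38, 98, 222, 43, 191]
print(pearson_r(r_channel, g_channel))

# Shifting and positively rescaling one variable leaves r unchanged.
print(pearson_r(r_channel, [10 * g + 5 for g in g_channel]))
```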
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
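As a sketch of that recipe, the code below ranks each variable and then applies the covariance-over-standard-deviations formula to the ranks. It assumes tied values receive the average of their rank positions, which is a common convention but is not stated in this report; the helper names are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# A monotonic but nonlinear relationship still gives a perfect rank correlation.
xs = [1, 2, 3, 4, 5]
print(spearman_rho(xs, [v ** 3 for v in xs]))  # 1.0
```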
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
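The pair-counting definition above translates directly into code. The sketch below follows that definition literally, so tied pairs count as neither concordant nor discordant; libraries often apply a tie correction, so their results may differ when ties are present. The function name `kendall_tau` is hypothetical.

```python
from itertools import combinations

def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:      # both variables move in the same direction
            concordant += 1
        elif direction < 0:    # the variables move in opposite directions
            discordant += 1
        # direction == 0 means a tie in x or y: counted as neither
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Three observations: two concordant pairs and one discordant pair.
print(kendall_tau([1, 2, 3], [1, 3, 2]))
```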
First rows
Index | Code | Name | Hex | R | G | B |
---|---|---|---|---|---|---|
0 | air_force_blue_raf | Air Force Blue (Raf) | #5d8aa8 | 93 | 138 | 168 |
1 | air_force_blue_usaf | Air Force Blue (Usaf) | #00308f | 0 | 48 | 143 |
2 | air_superiority_blue | Air Superiority Blue | #72a0c1 | 114 | 160 | 193 |
3 | alabama_crimson | Alabama Crimson | #a32638 | 163 | 38 | 56 |
4 | alice_blue | Alice Blue | #f0f8ff | 240 | 248 | 255 |
5 | alizarin_crimson | Alizarin Crimson | #e32636 | 227 | 38 | 54 |
6 | alloy_orange | Alloy Orange | #c46210 | 196 | 98 | 16 |
7 | almond | Almond | #efdecd | 239 | 222 | 205 |
8 | amaranth | Amaranth | #e52b50 | 229 | 43 | 80 |
9 | amber | Amber | #ffbf00 | 255 | 191 | 0 |
Last rows
Index | Code | Name | Hex | R | G | B |
---|---|---|---|---|---|---|
855 | yale_blue | Yale Blue | #0f4d92 | 15 | 77 | 146 |
856 | yellow | Yellow | #ff0 | 255 | 255 | 0 |
857 | yellow_green | Yellow-Green | #9acd32 | 154 | 205 | 50 |
858 | yellow_munsell | Yellow (Munsell) | #efcc00 | 239 | 204 | 0 |
859 | yellow_ncs | Yellow (Ncs) | #ffd300 | 255 | 211 | 0 |
860 | yellow_orange | Yellow Orange | #ffae42 | 255 | 174 | 66 |
861 | yellow_process | Yellow (Process) | #ffef00 | 255 | 239 | 0 |
862 | yellow_ryb | Yellow (Ryb) | #fefe33 | 254 | 254 | 51 |
863 | zaffre | Zaffre | #0014a8 | 0 | 20 | 168 |
864 | zinnwaldite_brown | Zinnwaldite Brown | #2c1608 | 44 | 22 | 8 |
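In the sample rows, the Hex column appears to encode the same values as the R, G, and B columns, in either the six-digit form (`#5d8aa8`) or the three-digit shorthand (`#ff0`). A minimal sketch of that conversion follows; the function name `hex_to_rgb` is hypothetical and is not part of the dataset or of the profiling tool.

```python
def hex_to_rgb(hex_code):
    """Convert '#rrggbb' or shorthand '#rgb' to an (R, G, B) tuple of ints."""
    digits = hex_code.lstrip("#")
    if len(digits) == 3:
        # Shorthand: each digit is doubled, so 'ff0' becomes 'ffff00'.
        digits = "".join(ch * 2 for ch in digits)
    if len(digits) != 6:
        raise ValueError(f"unexpected hex colour: {hex_code!r}")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))

# Rows taken from the tables above.
print(hex_to_rgb("#5d8aa8"))  # (93, 138, 168)  air_force_blue_raf
print(hex_to_rgb("#ff0"))     # (255, 255, 0)   yellow
```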