Dataset statistics
Number of variables | 14 |
---|---|
Number of observations | 229308 |
Missing cells | 35613 |
Missing cells (%) | 1.1% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 61.7 MiB |
Average record size in memory | 282.3 B |
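The percentages in this overview follow directly from the counts. The short check below is a sketch that re-derives them from the figures in this report; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table of this report.
n_variables = 14
n_observations = 229308
missing_cells = 35613

# Share of missing cells over the whole table (rows x columns).
total_cells = n_variables * n_observations
missing_cells_pct = 100 * missing_cells / total_cells

# All missing cells sit in one column (GEMIDDELDE_VERKOOPPRIJS),
# so the column-level share divides by the number of rows only.
missing_column_pct = 100 * missing_cells / n_observations

# Average record size from the rounded total of 61.7 MiB. The result is
# close to, but not exactly, the reported 282.3 B because the total is rounded.
avg_record_bytes = 61.7 * 1024 ** 2 / n_observations

print(f"missing cells:  {missing_cells_pct:.1f}%")
print(f"missing column: {missing_column_pct:.1f}%")
print(f"record size:    {avg_record_bytes:.0f} B (approx.)")
```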
Variable types
NUM | 9 |
---|---|
CAT | 3 |
BOOL | 1 |
DATE | 1 |
Reproduction
Analysis started | 2020-02-13 23:59:05.001814 |
---|---|
Analysis finished | 2020-02-14 00:00:47.945248 |
Version | pandas-profiling v2.5.0 |
Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
Warnings
TYPERENDE_DIAGNOSE_CD has a high cardinality: 1766 distinct values | High cardinality |
AANTAL_SUBTRAJECT_PER_ZPD is highly correlated with AANTAL_PAT_PER_ZPD | High Correlation |
AANTAL_PAT_PER_ZPD is highly correlated with AANTAL_SUBTRAJECT_PER_ZPD | High Correlation |
AANTAL_SUBTRAJECT_PER_DIAG is highly correlated with AANTAL_PAT_PER_DIAG | High Correlation |
AANTAL_PAT_PER_DIAG is highly correlated with AANTAL_SUBTRAJECT_PER_DIAG | High Correlation |
AANTAL_SUBTRAJECT_PER_SPC is highly correlated with AANTAL_PAT_PER_SPC | High Correlation |
AANTAL_PAT_PER_SPC is highly correlated with AANTAL_SUBTRAJECT_PER_SPC | High Correlation |
GEMIDDELDE_VERKOOPPRIJS has 35613 (15.5%) missing values | Missing |
AANTAL_SUBTRAJECT_PER_ZPD is highly skewed (γ1 = 21.2206314) | Skewed |
DATUM_BESTAND only contains datetime values, but is categorical. Consider applying pd.to_datetime() | Type |
PEILDATUM only contains datetime values, but is categorical. Consider applying pd.to_datetime() | Type |
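The two "Type" warnings suggest `pd.to_datetime()`. A minimal sketch of that conversion follows. It assumes the columns hold ISO `YYYY-MM-DD` strings, as the sample values in this report suggest. The two-row frame is a hypothetical stand-in for the real file.

```python
import pandas as pd

# Hypothetical stand-in for the real data, using values shown in this report.
df = pd.DataFrame(
    {
        "DATUM_BESTAND": ["2019-12-11", "2019-12-11"],
        "PEILDATUM": ["2019-12-01", "2019-12-01"],
    }
)

# Parse the string columns into datetime columns.
# The explicit format is an assumption based on the 10-character values above.
for column in ["DATUM_BESTAND", "PEILDATUM"]:
    df[column] = pd.to_datetime(df[column], format="%Y-%m-%d")

print(df.dtypes)
```

After this conversion a re-run of the profiler would be expected to treat both columns as dates rather than categoricals.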
VERSIE
Boolean
Distinct count | 1 |
---|---|
Unique (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.7 MiB |
Common values
Value | Count | Frequency (%) | |
1 | 229308 | 100.0% |
DATUM_BESTAND
Categorical
Distinct count | 1 |
---|---|
Unique (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.7 MiB |
Common values
Value | Count | Frequency (%) | |
2019-12-11 | 229308 | 100.0% |
Length
Max length | 10 |
---|---|
Mean length | 10 |
Min length | 10 |
Value | Count | Frequency (%) | |
Decimal_Number | 4 | 80.0% | |
Dash_Punctuation | 1 | 20.0% |
Value | Count | Frequency (%) | |
Common | 5 | 100.0% |
Value | Count | Frequency (%) | |
ASCII | 5 | 100.0% |
PEILDATUM
Categorical
Distinct count | 1 |
---|---|
Unique (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.7 MiB |
Common values
Value | Count | Frequency (%) | |
2019-12-01 | 229308 | 100.0% |
Length
Max length | 10 |
---|---|
Mean length | 10 |
Min length | 10 |
Value | Count | Frequency (%) | |
Decimal_Number | 4 | 80.0% | |
Dash_Punctuation | 1 | 20.0% |
Value | Count | Frequency (%) | |
Common | 5 | 100.0% |
Value | Count | Frequency (%) | |
ASCII | 5 | 100.0% |
JAAR
Date
Distinct count | 8 |
---|---|
Unique (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.7 MiB |
Minimum | 2012-01-01 00:00:00 |
---|---|
Maximum | 2019-01-01 00:00:00 |
Histogram
BEHANDELEND_SPECIALISME_CD
Real number (ℝ≥0)
Distinct count | 27 |
---|---|
Unique (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 420.101309156244 |
---|---|
Minimum | 301 |
Maximum | 8418 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 1.7 MiB |
Quantile statistics
Minimum | 301 |
---|---|
5-th percentile | 302 |
Q1 | 305 |
median | 313 |
Q3 | 322 |
95-th percentile | 361 |
Maximum | 8418 |
Range | 8117 |
Interquartile range (IQR) | 17 |
Descriptive statistics
Standard deviation | 913.4366995 |
---|---|
Coefficient of variation (CV) | 2.174324811 |
Kurtosis | 72.51679024 |
Mean | 420.1013092 |
Mean Absolute Deviation (MAD) | 207.4526335 |
Skewness | 8.625272146 |
Sum | 96332591 |
Variance | 834366.6039 |
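Several of the descriptive statistics are simple functions of the others. The sketch below re-derives them for BEHANDELEND_SPECIALISME_CD from the figures reported above. It is a consistency check only, and the variable names are illustrative.

```python
# Figures copied from the BEHANDELEND_SPECIALISME_CD section of this report.
n_observations = 229308
mean = 420.101309156244
std = 913.4366995
minimum, maximum = 301, 8418
q1, q3 = 305, 322

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

# Variance is the square of the standard deviation.
variance = std ** 2

# Range and interquartile range are differences of order statistics.
value_range = maximum - minimum
iqr = q3 - q1

# The sum is the mean times the number of (non-missing) observations.
total = mean * n_observations

print(f"CV={cv:.6f} variance={variance:.2f} "
      f"range={value_range} IQR={iqr} sum={total:.0f}")
```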
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 301. 301.5 302.5 303.5 304.5 ... 375.5 389.5 1145. 5159. 8418. ], "bayesian blocks" binning strategy used)
Common values
Value | Count | Frequency (%) | |
305 | 32575 | 14.2% | |
313 | 29685 | 12.9% | |
303 | 26374 | 11.5% | |
330 | 18482 | 8.1% | |
316 | 15621 | 6.8% | |
308 | 11407 | 5.0% | |
324 | 9638 | 4.2% | |
306 | 9434 | 4.1% | |
301 | 9302 | 4.1% | |
304 | 7465 | 3.3% | |
Other values (17) | 59325 | 25.9% |
Minimum 5 values
Value | Count | Frequency (%) | |
301 | 9302 | 4.1% | |
302 | 4940 | 2.2% | |
303 | 26374 | 11.5% | |
304 | 7465 | 3.3% | |
305 | 32575 | 14.2% |
Maximum 5 values
Value | Count | Frequency (%) | |
8418 | 2946 | 1.3% | |
1900 | 151 | 0.1% | |
390 | 566 | 0.2% | |
389 | 2511 | 1.1% | |
362 | 3730 | 1.6% |
TYPERENDE_DIAGNOSE_CD
Categorical
Distinct count | 1766 |
---|---|
Unique (%) | 0.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.7 MiB |
Common values
Value | Count | Frequency (%) | |
101 | 953 | 0.4% | |
402 | 944 | 0.4% | |
301 | 912 | 0.4% | |
403 | 909 | 0.4% | |
203 | 860 | 0.4% | |
201 | 853 | 0.4% | |
401 | 769 | 0.3% | |
404 | 754 | 0.3% | |
409 | 747 | 0.3% | |
802 | 743 | 0.3% | |
Other values (1756) | 220864 | 96.3% |
Length
Max length | 4 |
---|---|
Mean length | 3.34916357 |
Min length | 2 |
Value | Count | Frequency (%) | |
Uppercase_Letter | 15 | 60.0% | |
Decimal_Number | 10 | 40.0% |
Value | Count | Frequency (%) | |
Latin | 15 | 60.0% | |
Common | 10 | 40.0% |
Value | Count | Frequency (%) | |
ASCII | 25 | 100.0% |
ZORGPRODUCT_CD
Real number (ℝ≥0)
Distinct count | 5877 |
---|---|
Unique (%) | 2.6% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 439993873.8149825 |
---|---|
Minimum | 10501002 |
Maximum | 998418081 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 1.7 MiB |
Quantile statistics
Minimum | 10501002 |
---|---|
5-th percentile | 28999036 |
Q1 | 99799031 |
median | 149599019 |
Q3 | 990004006 |
95-th percentile | 990416016 |
Maximum | 998418081 |
Range | 987917079 |
Interquartile range (IQR) | 890204975 |
Descriptive statistics
Standard deviation | 428909436.9 |
---|---|
Coefficient of variation (CV) | 0.9748077471 |
Kurtosis | -1.73352344 |
Mean | 439993873.8 |
Mean Absolute Deviation (MAD) | 413702639.5 |
Skewness | 0.4715791357 |
Sum | 1.008941152e+14 |
Variance | 1.83963305e+17 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1.05010020e+07 1.05010105e+07 1.11010025e+07 1.11010105e+07 1.13010025e+07 ... 9.98418074e+08 9.98418078e+08 9.98418080e+08 9.98418080e+08 9.98418081e+08], "bayesian blocks" binning strategy used)
Common values
Value | Count | Frequency (%) | |
990004009 | 1680 | 0.7% | |
990004007 | 1653 | 0.7% | |
990003004 | 1640 | 0.7% | |
990004006 | 1330 | 0.6% | |
990356076 | 1156 | 0.5% | |
990356073 | 1064 | 0.5% | |
990003007 | 1054 | 0.5% | |
131999228 | 1008 | 0.4% | |
131999164 | 996 | 0.4% | |
199299013 | 956 | 0.4% | |
Other values (5867) | 216771 | 94.5% |
Minimum 5 values
Value | Count | Frequency (%) | |
10501002 | 5 | < 0.1% | |
10501003 | 8 | < 0.1% | |
10501004 | 8 | < 0.1% | |
10501005 | 8 | < 0.1% | |
10501007 | 3 | < 0.1% |
Maximum 5 values
Value | Count | Frequency (%) | |
998418081 | 104 | < 0.1% | |
998418080 | 95 | < 0.1% | |
998418079 | 24 | < 0.1% | |
998418077 | 5 | < 0.1% | |
998418076 | 5 | < 0.1% |
AANTAL_PAT_PER_ZPD
Real number (ℝ≥0)
Distinct count | 8305 |
---|---|
Unique (%) | 3.6% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 494.5914839429937 |
---|---|
Minimum | 1 |
Maximum | 152464 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 1.7 MiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 3 |
median | 13 |
Q3 | 100 |
95-th percentile | 1670 |
Maximum | 152464 |
Range | 152463 |
Interquartile range (IQR) | 97 |
Descriptive statistics
Standard deviation | 3052.939543 |
---|---|
Coefficient of variation (CV) | 6.172648826 |
Kurtosis | 384.6693955 |
Mean | 494.5914839 |
Mean Absolute Deviation (MAD) | 788.5709634 |
Skewness | 16.41870919 |
Sum | 113413784 |
Variance | 9320439.851 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1.000000e+00 1.500000e+00 2.500000e+00 3.500000e+00 4.500000e+00 ... 4.793800e+04 6.489850e+04 8.755750e+04 1.089555e+05 1.524640e+05], "bayesian blocks" binning strategy used)
Common values
Value | Count | Frequency (%) | |
1 | 38110 | 16.6% | |
2 | 18585 | 8.1% | |
3 | 12109 | 5.3% | |
4 | 8976 | 3.9% | |
5 | 6994 | 3.1% | |
6 | 5847 | 2.5% | |
7 | 4864 | 2.1% | |
8 | 4093 | 1.8% | |
9 | 3817 | 1.7% | |
10 | 3360 | 1.5% | |
Other values (8295) | 122553 | 53.4% |
Minimum 5 values
Value | Count | Frequency (%) | |
1 | 38110 | 16.6% | |
2 | 18585 | 8.1% | |
3 | 12109 | 5.3% | |
4 | 8976 | 3.9% | |
5 | 6994 | 3.1% |
Maximum 5 values
Value | Count | Frequency (%) | |
152464 | 1 | < 0.1% | |
144806 | 1 | < 0.1% | |
144494 | 1 | < 0.1% | |
108968 | 1 | < 0.1% | |
108943 | 1 | < 0.1% |
AANTAL_SUBTRAJECT_PER_ZPD
Real number (ℝ≥0)
Distinct count | 8816 |
---|---|
Unique (%) | 3.8% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 572.9722425733075 |
---|---|
Minimum | 1 |
Maximum | 239637 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 1.7 MiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 3 |
median | 14 |
Q3 | 109 |
95-th percentile | 1876.65 |
Maximum | 239637 |
Range | 239636 |
Interquartile range (IQR) | 106 |
Descriptive statistics
Standard deviation | 3826.365529 |
---|---|
Coefficient of variation (CV) | 6.678099295 |
Kurtosis | 731.276005 |
Mean | 572.9722426 |
Mean Absolute Deviation (MAD) | 919.3577293 |
Skewness | 21.2206314 |
Sum | 131387119 |
Variance | 14641073.16 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1.000000e+00 1.500000e+00 2.500000e+00 3.500000e+00 4.500000e+00 ... 5.004450e+04 6.940800e+04 1.038255e+05 1.511055e+05 2.396370e+05], "bayesian blocks" binning strategy used)
Common values
Value | Count | Frequency (%) | |
1 | 36785 | 16.0% | |
2 | 18263 | 8.0% | |
3 | 12019 | 5.2% | |
4 | 8834 | 3.9% | |
5 | 6925 | 3.0% | |
6 | 5852 | 2.6% | |
7 | 4848 | 2.1% | |
8 | 4069 | 1.8% | |
9 | 3715 | 1.6% | |
10 | 3392 | 1.5% | |
Other values (8806) | 124606 | 54.3% |
Minimum 5 values
Value | Count | Frequency (%) | |
1 | 36785 | 16.0% | |
2 | 18263 | 8.0% | |
3 | 12019 | 5.2% | |
4 | 8834 | 3.9% | |
5 | 6925 | 3.0% |
Maximum 5 values
Value | Count | Frequency (%) | |
239637 | 1 | < 0.1% | |
231932 | 1 | < 0.1% | |
229642 | 1 | < 0.1% | |
226570 | 1 | < 0.1% | |
218436 | 1 | < 0.1% |
AANTAL_PAT_PER_DIAG
Real number (ℝ≥0)
Distinct count | 7177 |
---|---|
Unique (%) | 3.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 7503.579853297748 |
---|---|
Minimum | 1 |
Maximum | 208422 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 1.7 MiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 43 |
Q1 | 403 |
median | 1668 |
Q3 | 6243 |
95-th percentile | 36249 |
Maximum | 208422 |
Range | 208421 |
Interquartile range (IQR) | 5840 |
Descriptive statistics
Standard deviation | 17334.73061 |
---|---|
Coefficient of variation (CV) | 2.310194727 |
Kurtosis | 32.28435376 |
Mean | 7503.579853 |
Mean Absolute Deviation (MAD) | 9071.070806 |
Skewness | 4.971721621 |
Sum | 1720630889 |
Variance | 300492885.4 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1.000000e+00 1.500000e+00 1.550000e+01 1.650000e+01 2.150000e+01 ... 1.589495e+05 1.624000e+05 1.991720e+05 2.017170e+05 2.084220e+05], "bayesian blocks" binning strategy used)
Common values
Value | Count | Frequency (%) | |
19 | 374 | 0.2% | |
21 | 365 | 0.2% | |
9 | 359 | 0.2% | |
17 | 355 | 0.2% | |
37 | 351 | 0.2% | |
25 | 349 | 0.2% | |
20 | 347 | 0.2% | |
33 | 341 | 0.1% | |
32 | 326 | 0.1% | |
6 | 320 | 0.1% | |
Other values (7167) | 225821 | 98.5% |
Minimum 5 values
Value | Count | Frequency (%) | |
1 | 216 | 0.1% | |
2 | 256 | 0.1% | |
3 | 266 | 0.1% | |
4 | 270 | 0.1% | |
5 | 257 | 0.1% |
Maximum 5 values
Value | Count | Frequency (%) | |
208422 | 19 | < 0.1% | |
202446 | 17 | < 0.1% | |
200988 | 25 | < 0.1% | |
200165 | 16 | < 0.1% | |
198179 | 20 | < 0.1% |
AANTAL_SUBTRAJECT_PER_DIAG
Real number (ℝ≥0)
Distinct count | 7929 |
---|---|
Unique (%) | 3.5% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 10399.268333420552 |
---|---|
Minimum | 1 |
Maximum | 336702 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 1.7 MiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 52 |
Q1 | 519 |
median | 2268 |
Q3 | 8557 |
95-th percentile | 49217 |
Maximum | 336702 |
Range | 336701 |
Interquartile range (IQR) | 8038 |
Descriptive statistics
Standard deviation | 24826.12898 |
---|---|
Coefficient of variation (CV) | 2.387295739 |
Kurtosis | 36.9830588 |
Mean | 10399.26833 |
Mean Absolute Deviation (MAD) | 12651.40944 |
Skewness | 5.276229893 |
Sum | 2384635423 |
Variance | 616336680.1 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1.000000e+00 1.500000e+00 3.550000e+01 3.750000e+01 4.750000e+01 ... 2.412975e+05 2.438805e+05 2.467420e+05 2.634200e+05 3.367020e+05], "bayesian blocks" binning strategy used)
Common values
Value | Count | Frequency (%) | |
20 | 289 | 0.1% | |
19 | 279 | 0.1% | |
13 | 273 | 0.1% | |
25 | 272 | 0.1% | |
22 | 269 | 0.1% | |
46 | 263 | 0.1% | |
18 | 263 | 0.1% | |
24 | 260 | 0.1% | |
38 | 260 | 0.1% | |
57 | 256 | 0.1% | |
Other values (7919) | 226624 | 98.8% |
Minimum 5 values
Value | Count | Frequency (%) | |
1 | 179 | 0.1% | |
2 | 191 | 0.1% | |
3 | 241 | 0.1% | |
4 | 228 | 0.1% | |
5 | 237 | 0.1% |
Maximum 5 values
Value | Count | Frequency (%) | |
336702 | 19 | < 0.1% | |
323151 | 20 | < 0.1% | |
320162 | 25 | < 0.1% | |
293720 | 17 | < 0.1% | |
288363 | 16 | < 0.1% |
AANTAL_PAT_PER_SPC
Real number (ℝ≥0)
Distinct count | 215 |
---|---|
Unique (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 656595.7786165333 |
---|---|
Minimum | 444 |
Maximum | 1489537 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 1.7 MiB |
Quantile statistics
Minimum | 444 |
---|---|
5-th percentile | 43687 |
Q1 | 275922 |
median | 742178 |
Q3 | 995565 |
95-th percentile | 1334705 |
Maximum | 1489537 |
Range | 1489093 |
Interquartile range (IQR) | 719643 |
Descriptive statistics
Standard deviation | 415846.3639 |
---|---|
Coefficient of variation (CV) | 0.6333369441 |
Kurtosis | -1.082885006 |
Mean | 656595.7786 |
Mean Absolute Deviation (MAD) | 365133.4779 |
Skewness | 0.1145470942 |
Sum | 1.505626648e+11 |
Variance | 1.729281984e+11 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[4.4400000e+02 4.6915000e+03 7.0550000e+03 8.2715000e+03 9.3430000e+03 ... 1.3330250e+06 1.3783225e+06 1.4362880e+06 1.4700865e+06 1.4895370e+06], "bayesian blocks" binning strategy used)
Common values
Value | Count | Frequency (%) | |
881023 | 5102 | 2.2% | |
873277 | 4354 | 1.9% | |
843735 | 4348 | 1.9% | |
886468 | 4325 | 1.9% | |
817645 | 4171 | 1.8% | |
1077748 | 3884 | 1.7% | |
1064083 | 3850 | 1.7% | |
1040410 | 3811 | 1.7% | |
1014947 | 3799 | 1.7% | |
980751 | 3757 | 1.6% | |
Other values (205) | 187907 | 81.9% |
Minimum 5 values
Value | Count | Frequency (%) | |
444 | 41 | < 0.1% | |
1742 | 102 | < 0.1% | |
2572 | 172 | 0.1% | |
6811 | 380 | 0.2% | |
7299 | 71 | < 0.1% |
Maximum 5 values
Value | Count | Frequency (%) | |
1489537 | 2976 | 1.3% | |
1450636 | 3054 | 1.3% | |
1421940 | 3564 | 1.6% | |
1334705 | 3545 | 1.5% | |
1331345 | 3549 | 1.5% |
AANTAL_SUBTRAJECT_PER_SPC
Real number (ℝ≥0)
Distinct count | 215 |
---|---|
Unique (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1022648.7517749054 |
---|---|
Minimum | 465 |
Maximum | 2537889 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 1.7 MiB |
Quantile statistics
Minimum | 465 |
---|---|
5-th percentile | 47358 |
Q1 | 370187 |
median | 970136 |
Q3 | 1681610 |
95-th percentile | 2379520 |
Maximum | 2537889 |
Range | 2537424 |
Interquartile range (IQR) | 1311423 |
Descriptive statistics
Standard deviation | 706770.0206 |
---|---|
Coefficient of variation (CV) | 0.6911170814 |
Kurtosis | -0.9382607257 |
Mean | 1022648.752 |
Mean Absolute Deviation (MAD) | 598361.3463 |
Skewness | 0.3477123344 |
Sum | 2.3450154e+11 |
Variance | 4.99523862e+11 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[4.6500000e+02 5.1505000e+03 7.4795000e+03 9.7315000e+03 1.1798000e+04 ... 2.0229920e+06 2.1266420e+06 2.4345950e+06 2.5137795e+06 2.5378890e+06], "bayesian blocks" binning strategy used)
Common values
Value | Count | Frequency (%) | |
1211622 | 5102 | 2.2% | |
1279773 | 4354 | 1.9% | |
1215825 | 4348 | 1.9% | |
1300234 | 4325 | 1.9% | |
1185851 | 4171 | 1.8% | |
2537889 | 3884 | 1.7% | |
2489670 | 3850 | 1.7% | |
2067726 | 3811 | 1.7% | |
2379520 | 3799 | 1.7% | |
2185558 | 3757 | 1.6% | |
Other values (205) | 187907 | 81.9% |
Minimum 5 values
Value | Count | Frequency (%) | |
465 | 41 | < 0.1% | |
1991 | 102 | < 0.1% | |
2911 | 172 | 0.1% | |
7390 | 380 | 0.2% | |
7569 | 71 | < 0.1% |
Maximum 5 values
Value | Count | Frequency (%) | |
2537889 | 3884 | 1.7% | |
2489670 | 3850 | 1.7% | |
2379520 | 3799 | 1.7% | |
2185558 | 3757 | 1.6% | |
2067726 | 3811 | 1.7% |
GEMIDDELDE_VERKOOPPRIJS
Real number (ℝ≥0)
Distinct count | 2985 |
---|---|
Unique (%) | 1.5% |
Missing | 35613 |
Missing (%) | 15.5% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 3445.532409200031 |
---|---|
Minimum | 70.0 |
Maximum | 287220.0 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 1.7 MiB |
Quantile statistics
Minimum | 70 |
---|---|
5-th percentile | 140 |
Q1 | 455 |
median | 1210 |
Q3 | 3935 |
95-th percentile | 12970 |
Maximum | 287220 |
Range | 287150 |
Interquartile range (IQR) | 3480 |
Descriptive statistics
Standard deviation | 6594.982718 |
---|---|
Coefficient of variation (CV) | 1.914067823 |
Kurtosis | 189.2625833 |
Mean | 3445.532409 |
Mean Absolute Deviation (MAD) | 3535.567694 |
Skewness | 8.39162592 |
Sum | 667382400 |
Variance | 43493797.05 |
Histogram with fixed size bins (bins=10)
Common values
Value | Count | Frequency (%) | |
160 | 1745 | 0.8% | |
105 | 1689 | 0.7% | |
180 | 1476 | 0.6% | |
110 | 1415 | 0.6% | |
300 | 1207 | 0.5% | |
140 | 1169 | 0.5% | |
295 | 1003 | 0.4% | |
145 | 996 | 0.4% | |
165 | 969 | 0.4% | |
500 | 950 | 0.4% | |
Other values (2975) | 181076 | 79.0% | |
(Missing) | 35613 | 15.5% |
Minimum 5 values
Value | Count | Frequency (%) | |
70 | 226 | 0.1% | |
75 | 75 | < 0.1% | |
80 | 358 | 0.2% | |
85 | 845 | 0.4% | |
90 | 416 | 0.2% |
Maximum 5 values
Value | Count | Frequency (%) | |
287220 | 8 | < 0.1% | |
147535 | 3 | < 0.1% | |
143015 | 4 | < 0.1% | |
122155 | 4 | < 0.1% | |
116910 | 3 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect the magnitude of r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
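As an illustration of the definition above, here is a minimal pure-Python sketch. The function name `pearson_r` is illustrative. The sample values are the AANTAL_PAT_PER_ZPD and AANTAL_SUBTRAJECT_PER_ZPD columns of the first ten rows shown in this report, so the result describes only that small sample, not the full dataset.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of covariance and variances cancel, so sums are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# First ten rows of the report (AANTAL_PAT_PER_ZPD, AANTAL_SUBTRAJECT_PER_ZPD).
patients = [69, 1, 1, 87, 90, 42, 14, 1, 1, 9]
subtrajects = [69, 1, 1, 89, 94, 45, 14, 1, 1, 9]

r = pearson_r(patients, subtrajects)

# Shifting and rescaling one variable (positive factor) leaves r unchanged.
r_rescaled = pearson_r(patients, [2 * v + 5 for v in subtrajects])

print(r, r_rescaled)
```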
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
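The rank-based definition can be sketched as follows. The helper names are illustrative, and ties are given their average rank, which is a common convention assumed here rather than something stated in this report.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship: rho is 1 while r stays below 1.
x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```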
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
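The pair-counting formula above corresponds to the variant usually called tau-a. The sketch below implements exactly that formula. The function name is illustrative, and tied pairs count as neither concordant nor discordant. Library implementations often apply a tie correction (tau-b), so their results can differ from this sketch when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, without tie correction."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Three observations: two concordant pairs and one discordant pair.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```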
First rows
Index | VERSIE | DATUM_BESTAND | PEILDATUM | JAAR | BEHANDELEND_SPECIALISME_CD | TYPERENDE_DIAGNOSE_CD | ZORGPRODUCT_CD | AANTAL_PAT_PER_ZPD | AANTAL_SUBTRAJECT_PER_ZPD | AANTAL_PAT_PER_DIAG | AANTAL_SUBTRAJECT_PER_DIAG | AANTAL_PAT_PER_SPC | AANTAL_SUBTRAJECT_PER_SPC | GEMIDDELDE_VERKOOPPRIJS |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1.0 | 2019-12-11 | 2019-12-01 | 2012-01-01 | 308 | 2705 | 972802057 | 69 | 69 | 286 | 315 | 74662 | 107802 | NaN |
1 | 1.0 | 2019-12-11 | 2019-12-01 | 2012-01-01 | 308 | 2705 | 972802050 | 1 | 1 | 286 | 315 | 74662 | 107802 | NaN |
2 | 1.0 | 2019-12-11 | 2019-12-01 | 2012-01-01 | 308 | 2705 | 972802034 | 1 | 1 | 286 | 315 | 74662 | 107802 | NaN |
3 | 1.0 | 2019-12-11 | 2019-12-01 | 2012-01-01 | 308 | 2705 | 972802015 | 87 | 89 | 286 | 315 | 74662 | 107802 | NaN |
4 | 1.0 | 2019-12-11 | 2019-12-01 | 2012-01-01 | 308 | 2705 | 972802006 | 90 | 94 | 286 | 315 | 74662 | 107802 | 22265.0 |
5 | 1.0 | 2019-12-11 | 2019-12-01 | 2012-01-01 | 308 | 2705 | 972802039 | 42 | 45 | 286 | 315 | 74662 | 107802 | NaN |
6 | 1.0 | 2019-12-11 | 2019-12-01 | 2012-01-01 | 308 | 2705 | 972802038 | 14 | 14 | 286 | 315 | 74662 | 107802 | NaN |
7 | 1.0 | 2019-12-11 | 2019-12-01 | 2012-01-01 | 308 | 2705 | 972802043 | 1 | 1 | 286 | 315 | 74662 | 107802 | 21105.0 |
8 | 1.0 | 2019-12-11 | 2019-12-01 | 2012-01-01 | 308 | 2705 | 972802053 | 1 | 1 | 286 | 315 | 74662 | 107802 | NaN |
9 | 1.0 | 2019-12-11 | 2019-12-01 | 2012-01-01 | 308 | 2555 | 131999204 | 9 | 9 | 3208 | 3305 | 74662 | 107802 | 2785.0 |
Last rows
Index | VERSIE | DATUM_BESTAND | PEILDATUM | JAAR | BEHANDELEND_SPECIALISME_CD | TYPERENDE_DIAGNOSE_CD | ZORGPRODUCT_CD | AANTAL_PAT_PER_ZPD | AANTAL_SUBTRAJECT_PER_ZPD | AANTAL_PAT_PER_DIAG | AANTAL_SUBTRAJECT_PER_DIAG | AANTAL_PAT_PER_SPC | AANTAL_SUBTRAJECT_PER_SPC | GEMIDDELDE_VERKOOPPRIJS |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
229298 | 1.0 | 2019-12-11 | 2019-12-01 | 2018-01-01 | 327 | 0312 | 990027160 | 2866 | 4544 | 6606 | 13642 | 183097 | 329711 | 3165.0 |
229299 | 1.0 | 2019-12-11 | 2019-12-01 | 2018-01-01 | 327 | 0312 | 990027186 | 1 | 1 | 6606 | 13642 | 183097 | 329711 | 3350.0 |
229300 | 1.0 | 2019-12-11 | 2019-12-01 | 2018-01-01 | 327 | 0312 | 990027154 | 2 | 2 | 6606 | 13642 | 183097 | 329711 | 28980.0 |
229301 | 1.0 | 2019-12-11 | 2019-12-01 | 2018-01-01 | 327 | 0312 | 990027153 | 2 | 2 | 6606 | 13642 | 183097 | 329711 | 51280.0 |
229302 | 1.0 | 2019-12-11 | 2019-12-01 | 2018-01-01 | 327 | 0312 | 990027159 | 557 | 820 | 6606 | 13642 | 183097 | 329711 | 9515.0 |
229303 | 1.0 | 2019-12-11 | 2019-12-01 | 2018-01-01 | 327 | 0312 | 990027199 | 1281 | 1498 | 6606 | 13642 | 183097 | 329711 | 850.0 |
229304 | 1.0 | 2019-12-11 | 2019-12-01 | 2018-01-01 | 327 | 0312 | 990027195 | 1 | 1 | 6606 | 13642 | 183097 | 329711 | 3085.0 |
229305 | 1.0 | 2019-12-11 | 2019-12-01 | 2018-01-01 | 327 | 0312 | 990027155 | 2 | 2 | 6606 | 13642 | 183097 | 329711 | 15750.0 |
229306 | 1.0 | 2019-12-11 | 2019-12-01 | 2018-01-01 | 327 | 0312 | 990027198 | 4541 | 6728 | 6606 | 13642 | 183097 | 329711 | 220.0 |
229307 | 1.0 | 2019-12-11 | 2019-12-01 | 2018-01-01 | 327 | 0312 | 990027158 | 35 | 44 | 6606 | 13642 | 183097 | 329711 | 20920.0 |