Overview

Dataset info

Number of variables: 14
Number of observations: 220056
Missing cells: 41688 (1.4%)
Duplicate rows: 0 (0.0%)
Total size in memory: 23.5 MiB
Average record size in memory: 112.0 B
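
The dataset-level figures can be cross-checked with a little arithmetic. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that the total size is roughly observations × average record size. Neither formula is stated in the report, so treat both as assumptions.

```python
# Values copied from the "Dataset info" figures above.
n_variables = 14
n_observations = 220_056
missing_cells = 41_688
avg_record_bytes = 112.0

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: total size = number of rows * average record size (1 MiB = 2**20 bytes).
total_mib = n_observations * avg_record_bytes / 2**20

print(f"missing cells: {missing_pct:.1f}%")
print(f"total size: {total_mib:.1f} MiB")
```

Both results round to the reported 1.4% and 23.5 MiB, which suggests the assumed formulas match how the figures were produced.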

Variable types

Numeric: 5
Categorical: 2
Boolean: 0
Date: 1
URL: 0
Text (Unique): 0
Rejected: 6
Unsupported: 0

Warnings

[Rejected] AANTAL_SUBTRAJECT_PER_DIAG is highly correlated with AANTAL_PAT_PER_DIAG (ρ = 0.9538974473)
[Rejected] AANTAL_SUBTRAJECT_PER_SPC is highly correlated with AANTAL_PAT_PER_SPC (ρ = 0.937317187)
[Rejected] AANTAL_SUBTRAJECT_PER_ZPD is highly correlated with AANTAL_PAT_PER_ZPD (ρ = 0.9295724875)
[Rejected] DATUM_BESTAND has constant value "2019-07-10"
[Missing] GEMIDDELDE_VERKOOPPRIJS has 39256 (17.8%) missing values
[Rejected] PEILDATUM has constant value "2019-07-01"
[Warning] TYPERENDE_DIAGNOSE_CD has a high cardinality: 1772 distinct values
[Rejected] VERSIE has constant value "1.0"
[Warning] ZORGPRODUCT_CD has a high cardinality: 5872 distinct values
[Missing] ZORGPRODUCT_CD has 2432 (1.1%) missing values
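
The rejections listed above follow two patterns: constant columns, and columns that are highly correlated with another column. The following is a minimal pandas sketch of that kind of screening. It assumes a Pearson correlation and a cutoff of 0.9, which the report does not state; `rejected_columns` and the toy data are hypothetical and only illustrate the idea, not the profiler's own implementation.

```python
import numpy as np
import pandas as pd


def rejected_columns(df, corr_threshold=0.9):
    """Return constant columns plus numeric columns highly correlated with an earlier one."""
    # A column with at most one distinct value (NaN included) is constant.
    rejected = [c for c in df.columns if df[c].nunique(dropna=False) <= 1]

    # Absolute Pearson correlation between the remaining numeric columns.
    numeric = df.drop(columns=rejected).select_dtypes(include="number")
    corr = numeric.corr().abs()

    kept = []
    for col in corr.columns:
        # Reject a column if it is too close to any column already kept.
        if any(corr.loc[col, k] > corr_threshold for k in kept):
            rejected.append(col)
        else:
            kept.append(col)
    return rejected


# Toy data (hypothetical values): subtraject counts track patient counts closely.
rng = np.random.default_rng(0)
pat = rng.integers(1, 1000, size=200)
toy = pd.DataFrame({
    "AANTAL_PAT_PER_DIAG": pat,
    "AANTAL_SUBTRAJECT_PER_DIAG": pat * 2 + rng.integers(0, 5, size=200),
    "VERSIE": "1.0",
})

to_drop = rejected_columns(toy)
print(to_drop)
```

Because columns are scanned in order, the first column of a correlated pair is kept and the later one is rejected, which mirrors how the report keeps the AANTAL_PAT_* columns and rejects the AANTAL_SUBTRAJECT_* ones.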

Variables

AANTAL_PAT_PER_DIAG
Numeric

Distinct count: 6901
Unique (%): 3.1%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 7389.779034
Minimum: 1
Maximum: 205513
Zeros (%): 0.0%

Quantile statistics

Minimum: 1
5-th percentile: 32
Q1: 354
Median: 1570
Q3: 6061
95-th percentile: 35984
Maximum: 205513
Range: 205512
Interquartile range: 5707

Descriptive statistics

Standard deviation: 17269.46408
Coef of variation: 2.336939169
Kurtosis: 31.30153635
Mean: 7389.779034
MAD: 9038.062863
Skewness: 4.920357863
Sum: 1626165215
Variance: 298234389.4
Memory size: 1.7 MiB

Histogram with fixed size bins (bins=50)
Histogram with variable size bins (bins=[1.000000e+00 1.500000e+00 2.350000e+01 3.450000e+01 3.550000e+01 ... 1.383325e+05 1.520900e+05 1.572990e+05 1.738185e+05 2.055130e+05], "bayesian blocks" binning strategy used)

Value                Count    Frequency (%)
6                    420      0.2%
32                   398      0.2%
4                    391      0.2%
12                   388      0.2%
21                   386      0.2%
19                   386      0.2%
8                    385      0.2%
23                   385      0.2%
5                    379      0.2%
17                   378      0.2%
Other values (6891)  216160   98.2%

Minimum 5 values

Value                Count    Frequency (%)
1                    340      0.2%
2                    374      0.2%
3                    355      0.2%
4                    391      0.2%
5                    379      0.2%

Maximum 5 values

Value                Count    Frequency (%)
205513               19       < 0.1%
200182               17       < 0.1%
199981               16       < 0.1%
197742               20       < 0.1%
189114               19       < 0.1%
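
Several of the AANTAL_PAT_PER_DIAG statistics are derived from others, so they can be verified directly. The sketch below assumes the usual definitions (range = maximum − minimum, interquartile range = Q3 − Q1, coefficient of variation = standard deviation / mean, variance = standard deviation squared); the report itself does not spell these out.

```python
import math

# Values copied from the AANTAL_PAT_PER_DIAG statistics above.
minimum, maximum = 1, 205513
q1, q3 = 354, 6061
mean, std = 7389.779034, 17269.46408

value_range = maximum - minimum      # reported: 205512
iqr = q3 - q1                        # reported: 5707
coef_of_variation = std / mean       # reported: 2.336939169
variance = std ** 2                  # reported: 298234389.4

# The inputs are rounded, so compare with a small relative tolerance.
print(value_range, iqr)
print(math.isclose(coef_of_variation, 2.336939169, rel_tol=1e-6))
print(math.isclose(variance, 298234389.4, rel_tol=1e-6))
```

A standard deviation more than twice the mean, together with a skewness near 4.9, is consistent with a long right tail: the median (1570) sits far below the mean (7389.78).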
 

AANTAL_PAT_PER_SPC
Numeric

Distinct count: 217
Unique (%): 0.1%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 642319.0376
Minimum: 83
Maximum: 1489568
Zeros (%): 0.0%

Quantile statistics

Minimum: 83
5-th percentile: 30777
Q1: 242846
Median: 713937
Q3: 977237
95-th percentile: 1328494
Maximum: 1489568
Range: 1489485
Interquartile range: 734391

Descriptive statistics

Standard deviation: 426698.7108
Coef of variation: 0.6643096122
Kurtosis: -1.108280086
Mean: 642319.0376
MAD: 373675.5045
Skewness: 0.06739837272
Sum: 1.413461581e+11
Variance: 1.820717898e+11
Memory size: 1.7 MiB

Histogram with fixed size bins (bins=50)
Histogram with variable size bins (bins=[8.300000e+01 9.250000e+01 1.170000e+02 2.470000e+02 4.725000e+02 ... 1.302159e+06 1.318038e+06 1.436341e+06 1.470131e+06 1.489568e+06], "bayesian blocks" binning strategy used)

Value                Count    Frequency (%)
881250               5107     2.3%
870026               4401     2.0%
871428               4372     2.0%
841375               4367     2.0%
1061055              3974     1.8%
1058528              3972     1.8%
693207               3970     1.8%
977237               3872     1.8%
1040393              3858     1.8%
995598               3724     1.7%
Other values (207)   178439   81.1%

Minimum 5 values

Value                Count    Frequency (%)
83                   13       < 0.1%
102                  6        < 0.1%
132                  3        < 0.1%
362                  57       < 0.1%
583                  102      < 0.1%

Maximum 5 values

Value                Count    Frequency (%)
1489568              2981     1.4%
1450694              3057     1.4%
1421988              3588     1.6%
1328494              3616     1.6%
1307582              3590     1.6%
 

AANTAL_PAT_PER_ZPD
Numeric

Distinct count: 8019
Unique (%): 3.6%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 484.0502236
Minimum: 1
Maximum: 150256
Zeros (%): 0.0%

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 12
Q3: 92
95-th percentile: 1592
Maximum: 150256
Range: 150255
Interquartile range: 90

Descriptive statistics

Standard deviation: 3048.750513
Coef of variation: 6.298417736
Kurtosis: 369.6471736
Mean: 484.0502236
MAD: 776.5827494
Skewness: 16.23883553
Sum: 106518156
Variance: 9294879.691
Memory size: 1.7 MiB

Histogram with fixed size bins (bins=50)
Histogram with variable size bins (bins=[1.000000e+00 1.500000e+00 2.500000e+00 3.500000e+00 4.500000e+00 ... 5.373450e+04 6.827850e+04 8.744250e+04 1.084875e+05 1.502560e+05], "bayesian blocks" binning strategy used)

Value                Count    Frequency (%)
1                    38238    17.4%
2                    18353    8.3%
3                    11855    5.4%
4                    8822     4.0%
5                    6850     3.1%
6                    5699     2.6%
7                    4726     2.1%
8                    3958     1.8%
9                    3623     1.6%
10                   3190     1.4%
Other values (8009)  114742   52.1%

Minimum 5 values

Value                Count    Frequency (%)
1                    38238    17.4%
2                    18353    8.3%
3                    11855    5.4%
4                    8822     4.0%
5                    6850     3.1%

Maximum 5 values

Value                Count    Frequency (%)
150256               1        < 0.1%
144234               1        < 0.1%
122173               1        < 0.1%
108889               1        < 0.1%
108086               1        < 0.1%
 

AANTAL_SUBTRAJECT_PER_DIAG
Highly correlated

This variable is highly correlated with AANTAL_PAT_PER_DIAG and should be ignored for analysis

Correlation: 0.9538974473

AANTAL_SUBTRAJECT_PER_SPC
Highly correlated

This variable is highly correlated with AANTAL_PAT_PER_SPC and should be ignored for analysis

Correlation: 0.937317187

AANTAL_SUBTRAJECT_PER_ZPD
Highly correlated

This variable is highly correlated with AANTAL_PAT_PER_ZPD and should be ignored for analysis

Correlation: 0.9295724875

BEHANDELEND_SPECIALISME_CD
Numeric

Distinct count: 28
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Mean: 421.6114898
Minimum: 100
Maximum: 8418
Zeros (%): 0.0%

Quantile statistics

Minimum: 100
5-th percentile: 302
Q1: 305
Median: 313
Q3: 322
95-th percentile: 361
Maximum: 8418
Range: 8318
Interquartile range: 17

Descriptive statistics

Standard deviation: 919.7531312
Coef of variation: 2.181518183
Kurtosis: 71.44324618
Mean: 421.6114898
MAD: 210.3102137
Skewness: 8.562981878
Sum: 92778138
Variance: 845945.8223
Memory size: 1.7 MiB

Histogram with fixed size bins (bins=28)
Histogram with variable size bins (bins=[ 100. 200.5 301.5 302.5 303.5 ... 375.5 389.5 1145. 5159. 8418. ], "bayesian blocks" binning strategy used)

Value                Count    Frequency (%)
305                  30805    14.0%
313                  28755    13.1%
303                  25331    11.5%
330                  17841    8.1%
316                  15076    6.9%
308                  10523    4.8%
324                  9070     4.1%
301                  9007     4.1%
306                  8937     4.1%
304                  7145     3.2%
Other values (18)    57566    26.2%

Minimum 5 values

Value                Count    Frequency (%)
100                  9        < 0.1%
301                  9007     4.1%
302                  4799     2.2%
303                  25331    11.5%
304                  7145     3.2%

Maximum 5 values

Value                Count    Frequency (%)
8418                 2867     1.3%
1900                 145      0.1%
390                  534      0.2%
389                  2439     1.1%
362                  3700     1.7%
 

DATUM_BESTAND
Constant

This variable is constant and should be ignored for analysis

Constant value: 2019-07-10

GEMIDDELDE_VERKOOPPRIJS
Numeric

Distinct count: 2872
Unique (%): 1.3%
Missing (%): 17.8%
Missing (n): 39256
Infinite (%): 0.0%
Infinite (n): 0
Mean: 3403.963993
Minimum: 70
Maximum: 287220
Zeros (%): 0.0%

Quantile statistics

Minimum: 70
5-th percentile: 135
Q1: 445
Median: 1185
Q3: 3860
95-th percentile: 12880.25
Maximum: 287220
Range: 287150
Interquartile range: 3415

Descriptive statistics

Standard deviation: 6577.433188
Coef of variation: 1.932286358
Kurtosis: 197.0488928
Mean: 3403.963993
MAD: 3505.52706
Skewness: 8.509069144
Sum: 615436690
Variance: 43262627.34
Memory size: 1.7 MiB

Histogram with fixed size bins (bins=50)

Value                Count    Frequency (%)
160                  1701     0.8%
105                  1634     0.7%
180                  1542     0.7%
110                  1194     0.5%
300                  1171     0.5%
140                  1130     0.5%
295                  959      0.4%
235                  945      0.4%
115                  944      0.4%
145                  944      0.4%
Other values (2861)  168636   76.6%
(Missing)            39256    17.8%

Minimum 5 values

Value                Count    Frequency (%)
70                   226      0.1%
75                   75       < 0.1%
80                   428      0.2%
85                   714      0.3%
90                   448      0.2%

Maximum 5 values

Value                Count    Frequency (%)
287220               8        < 0.1%
147535               3        < 0.1%
122155               4        < 0.1%
116910               3        < 0.1%
108570               7        < 0.1%
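
The missing-value figures for GEMIDDELDE_VERKOOPPRIJS tie in with its mean and sum. The check below assumes the mean is computed over non-missing rows only, which is the usual behaviour for this kind of summary but is not stated in the report.

```python
# Values copied from the report.
n_observations = 220_056
n_missing = 39_256
reported_mean = 3403.963993
reported_sum = 615_436_690

n_present = n_observations - n_missing
missing_pct = 100 * n_missing / n_observations

# Assumption: mean = sum / number of non-missing rows.
implied_sum = reported_mean * n_present

print(f"non-missing rows: {n_present}")
print(f"missing: {missing_pct:.1f}%")
print(f"difference from reported sum: {abs(implied_sum - reported_sum):.2f}")
```

The implied sum matches the reported one to within rounding, so the 17.8% of rows without a price are excluded from the mean rather than counted as zero.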
 

JAAR
Date

Distinct count: 8
Unique (%): < 0.1%
Missing (%): 0.0%
Missing (n): 0
Infinite (%): 0.0%
Infinite (n): 0
Minimum: 2012-01-01 00:00:00
Maximum: 2019-01-01 00:00:00

Histogram of 'JAAR' (bins=N)

PEILDATUM
Constant

This variable is constant and should be ignored for analysis

Constant value: 2019-07-01

TYPERENDE_DIAGNOSE_CD
Categorical

Distinct count: 1772
Unique (%): 0.8%
Missing (%): 0.0%
Missing (n): 0

Top values (mini bar chart): 101 (929), 402 (925), 403 (883), Other values (1769): 217319

Value                Count    Frequency (%)
101                  929      0.4%
402                  925      0.4%
403                  883      0.4%
301                  883      0.4%
203                  841      0.4%
201                  834      0.4%
401                  765      0.3%
404                  741      0.3%
409                  727      0.3%
802                  707      0.3%
Other values (1762)  211821   96.3%

Max length: 4
Mean length: 3.344621369
Min length: 1
Contains chars: True
Contains digits: True
Contains spaces: False
Contains non-words: False

VERSIE
Constant

This variable is constant and should be ignored for analysis

Constant value: 1.0

ZORGPRODUCT_CD
Categorical

Distinct count: 5872
Unique (%): 2.7%
Missing (%): 1.1%
Missing (n): 2432

Top values (mini bar chart): 990004009 (1656), 990004007 (1598), 990003004 (1555), Other values (5868): 212815, (Missing): 2432

Value                Count    Frequency (%)
990004009            1656     0.8%
990004007            1598     0.7%
990003004            1555     0.7%
990004006            1236     0.6%
990356076            1059     0.5%
990003007            990      0.4%
131999228            986      0.4%
131999164            973      0.4%
990356073            970      0.4%
199299013            915      0.4%
Other values (5861)  205686   93.5%
(Missing)            2432     1.1%

Max length: 9
Mean length: 8.933689606
Min length: 3
Contains chars: True
Contains digits: True
Contains spaces: True
Contains non-words: True
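
The length and "Contains ..." figures for the categorical variables can be approximated with a small helper. This is a sketch under stated assumptions: the report does not define the character classes, so the regular expressions below (letters, digits, whitespace, non-word characters) are guesses, and `string_profile` and its sample codes are hypothetical.

```python
import re


def string_profile(values):
    """Summarise string lengths and character classes, skipping missing (None) values."""
    present = [v for v in values if v is not None]
    lengths = [len(v) for v in present]

    def any_match(pattern):
        # True if at least one non-missing value contains the pattern.
        return any(re.search(pattern, v) for v in present)

    return {
        "max_length": max(lengths),
        "mean_length": sum(lengths) / len(lengths),
        "min_length": min(lengths),
        "contains_chars": any_match(r"[A-Za-z]"),   # assumed definition
        "contains_digits": any_match(r"[0-9]"),     # assumed definition
        "contains_spaces": any_match(r"\s"),        # assumed definition
        "contains_non_words": any_match(r"\W"),     # assumed definition
    }


# Hypothetical codes, not taken from the dataset.
profile = string_profile(["990004009", "99 0004", "A12", None])
print(profile)
```

A minimum length of 3 against a maximum of 9, together with spaces and non-word characters, suggests that some ZORGPRODUCT_CD values are not clean nine-digit codes and may be worth inspecting before analysis.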

Correlations

Missing values

Sample

First rows

(index) | AANTAL_PAT_PER_DIAG | AANTAL_PAT_PER_SPC | AANTAL_PAT_PER_ZPD | AANTAL_SUBTRAJECT_PER_DIAG | AANTAL_SUBTRAJECT_PER_SPC | AANTAL_SUBTRAJECT_PER_ZPD | BEHANDELEND_SPECIALISME_CD | DATUM_BESTAND | GEMIDDELDE_VERKOOPPRIJS | JAAR | PEILDATUM | TYPERENDE_DIAGNOSE_CD | VERSIE | ZORGPRODUCT_CD
0 | 2456 | 187330 | 478 | 3798 | 304742 | 588 | 327 | 2019-07-10 | 405.0 | 2013-01-01 | 2019-07-01 | 0415 | 1.0 | 990027133
1 | 2456 | 187330 | 18 | 3798 | 304742 | 18 | 327 | 2019-07-10 | 24960.0 | 2013-01-01 | 2019-07-01 | 0415 | 1.0 | 990027166
2 | 2456 | 187330 | 13 | 3798 | 304742 | 13 | 327 | 2019-07-10 | 33390.0 | 2013-01-01 | 2019-07-01 | 0415 | 1.0 | 990027163
3 | 2456 | 187330 | 2 | 3798 | 304742 | 2 | 327 | 2019-07-10 | 2410.0 | 2013-01-01 | 2019-07-01 | 0415 | 1.0 | 990027160
4 | 2456 | 187330 | 6 | 3798 | 304742 | 6 | 327 | 2019-07-10 | NaN | 2013-01-01 | 2019-07-01 | 0415 | 1.0 | 990027161
5 | 2456 | 187330 | 1 | 3798 | 304742 | 1 | 327 | 2019-07-10 | 2225.0 | 2013-01-01 | 2019-07-01 | 0415 | 1.0 | 990027142
6 | 2456 | 187330 | 2 | 3798 | 304742 | 2 | 327 | 2019-07-10 | NaN | 2013-01-01 | 2019-07-01 | 0415 | 1.0 | 990027165
7 | 2456 | 187330 | 483 | 3798 | 304742 | 533 | 327 | 2019-07-10 | 2650.0 | 2013-01-01 | 2019-07-01 | 0415 | 1.0 | 990027168
8 | 2456 | 187330 | 11 | 3798 | 304742 | 11 | 327 | 2019-07-10 | 56195.0 | 2013-01-01 | 2019-07-01 | 0415 | 1.0 | 990027162
9 | 2456 | 187330 | 1 | 3798 | 304742 | 1 | 327 | 2019-07-10 | 51765.0 | 2013-01-01 | 2019-07-01 | 0415 | 1.0 | 990027153

Last rows

(index) | AANTAL_PAT_PER_DIAG | AANTAL_PAT_PER_SPC | AANTAL_PAT_PER_ZPD | AANTAL_SUBTRAJECT_PER_DIAG | AANTAL_SUBTRAJECT_PER_SPC | AANTAL_SUBTRAJECT_PER_ZPD | BEHANDELEND_SPECIALISME_CD | DATUM_BESTAND | GEMIDDELDE_VERKOOPPRIJS | JAAR | PEILDATUM | TYPERENDE_DIAGNOSE_CD | VERSIE | ZORGPRODUCT_CD
220046 | 116 | 354080 | 9 | 202 | 543156 | 12 | 316 | 2019-07-10 | 3040.0 | 2018-01-01 | 2019-07-01 | 6119 | 1.0 | 990116004
220047 | 116 | 354080 | 35 | 202 | 543156 | 38 | 316 | 2019-07-10 | 1080.0 | 2018-01-01 | 2019-07-01 | 6119 | 1.0 | 990116011
220048 | 116 | 354080 | 1 | 202 | 543156 | 1 | 316 | 2019-07-10 | NaN | 2018-01-01 | 2019-07-01 | 6119 | 1.0 | 990116007
220049 | 116 | 354080 | 5 | 202 | 543156 | 5 | 316 | 2019-07-10 | NaN | 2018-01-01 | 2019-07-01 | 6119 | 1.0 | 990116048
220050 | 116 | 354080 | 1 | 202 | 543156 | 1 | 316 | 2019-07-10 | NaN | 2018-01-01 | 2019-07-01 | 6119 | 1.0 | 990116055
220051 | 116 | 354080 | 48 | 202 | 543156 | 54 | 316 | 2019-07-10 | 385.0 | 2018-01-01 | 2019-07-01 | 6119 | 1.0 | 990116027
220052 | 116 | 354080 | 16 | 202 | 543156 | 16 | 316 | 2019-07-10 | 310.0 | 2018-01-01 | 2019-07-01 | 6119 | 1.0 | 990116018
220053 | 116 | 354080 | 5 | 202 | 543156 | 6 | 316 | 2019-07-10 | 14365.0 | 2018-01-01 | 2019-07-01 | 6119 | 1.0 | 990116008
220054 | 116 | 354080 | 1 | 202 | 543156 | 1 | 316 | 2019-07-10 | NaN | 2018-01-01 | 2019-07-01 | 6119 | 1.0 | 990116054
220055 | 116 | 354080 | 9 | 202 | 543156 | 10 | 316 | 2019-07-10 | NaN | 2018-01-01 | 2019-07-01 | 6119 | 1.0 | 990116049