Dataset info
Number of variables | 12 |
---|---|
Number of observations | 891 |
Missing cells | 866 (8.1%) |
Duplicate rows | 0 (0.0%) |
Total size in memory | 83.6 KiB |
Average record size in memory | 96.1 B |
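The dataset-level figures above can be cross-checked from numbers stated elsewhere in this report. The sketch below is only an arithmetic check. The variable names are illustrative, and the per-column missing counts are taken from the Age, Cabin and Embarked sections.

```python
# Figures copied from this report; names are illustrative, not part of any tool.
n_rows, n_cols = 891, 12
missing_per_column = {"Age": 177, "Cabin": 687, "Embarked": 2}

# Missing cells are the sum of per-column missing counts.
missing_cells = sum(missing_per_column.values())
missing_share = missing_cells / (n_rows * n_cols)

# Average record size is total memory divided by the number of rows.
total_kib = 83.6
avg_record_bytes = total_kib * 1024 / n_rows

print(missing_cells)                 # 866
print(f"{missing_share:.1%}")        # 8.1%
print(round(avg_record_bytes, 1))    # 96.1
```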
Variable types
Numeric | 5 |
---|---|
Categorical | 5 |
Boolean | 1 |
Date | 0 |
URL | 0 |
Text (Unique) | 1 |
Rejected | 0 |
Unsupported | 0 |
Warnings
Age has 177 (19.9%) missing values | Missing |
Cabin has a high cardinality: 148 distinct values | Warning |
Cabin has 687 (77.1%) missing values | Missing |
Fare has 15 (1.7%) zeros | Zeros |
Parch has 678 (76.1%) zeros | Zeros |
SibSp has 608 (68.2%) zeros | Zeros |
Ticket has a high cardinality: 681 distinct values | Warning |
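The percentages in the missing-value and zero warnings are each count divided by the 891 observations. The following sketch rebuilds those warning lines from the counts. It does not reproduce whatever thresholds the profiling tool uses to decide when to warn, since the report does not state them.

```python
n_rows = 891

# (column, count, kind) triples copied from the Warnings list above.
flagged = [
    ("Age", 177, "missing values"),
    ("Cabin", 687, "missing values"),
    ("Fare", 15, "zeros"),
    ("Parch", 678, "zeros"),
    ("SibSp", 608, "zeros"),
]

# Each share is count / n_rows, formatted to one decimal place.
lines = [
    f"{col} has {count} ({count / n_rows:.1%}) {kind}"
    for col, count, kind in flagged
]
for line in lines:
    print(line)
```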
Age
Numeric
Distinct count | 89 |
---|---|
Unique (%) | 10.0% |
Missing (%) | 19.9% |
Missing (n) | 177 |
Infinite (%) | 0.0% |
Infinite (n) | 0 |
Mean | 29.699 |
---|---|
Minimum | 0.42 |
Maximum | 80 |
Zeros (%) | 0.0% |
Quantile statistics
Minimum | 0.42 |
---|---|
5-th percentile | 4 |
Q1 | 20.125 |
Median | 28 |
Q3 | 38 |
95-th percentile | 56 |
Maximum | 80 |
Range | 79.58 |
Interquartile range | 17.875 |
Descriptive statistics
Standard deviation | 14.526 |
---|---|
Coef of variation | 0.48912 |
Kurtosis | 0.17827 |
Mean | 29.699 |
MAD | 11.323 |
Skewness | 0.38911 |
Sum | 21205 |
Variance | 211.02 |
Memory size | 7.0 KiB |
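Several of the Age statistics are derived from others in the same tables. The sketch below recomputes them from the rounded figures printed above, so small differences in the last digits are expected. It also shows that the mean is consistent with dividing the sum by the 714 non-missing values rather than by all 891 rows.

```python
# Rounded figures copied from the Age section.
n_rows, n_missing = 891, 177
total, mean, std = 21205, 29.699, 14.526
minimum, q1, q3, maximum = 0.42, 20.125, 38, 80

n_valid = n_rows - n_missing        # rows with a recorded age
value_range = maximum - minimum     # Range
iqr = q3 - q1                       # Interquartile range
cv = std / mean                     # Coefficient of variation
variance = std ** 2                 # Variance (from the rounded std)

print(n_valid)                      # 714
print(round(total / n_valid, 3))    # 29.699
print(round(value_range, 2))        # 79.58
print(iqr)                          # 17.875
print(round(cv, 4))                 # 0.4891
```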
Histogram with fixed size bins (bins=10)
Common values
Value | Count | Frequency (%) | |
24 | 30 | 3.4% | |
22 | 27 | 3.0% | |
18 | 26 | 2.9% | |
28 | 25 | 2.8% | |
19 | 25 | 2.8% | |
30 | 25 | 2.8% | |
21 | 24 | 2.7% | |
25 | 23 | 2.6% | |
36 | 22 | 2.5% | |
29 | 20 | 2.2% | |
Other values (78) | 467 | 52.4% | |
(Missing) | 177 | 19.9% |
Minimum 5 values
Value | Count | Frequency (%) | |
0.42 | 1 | 0.1% | |
0.67 | 1 | 0.1% | |
0.75 | 2 | 0.2% | |
0.83 | 2 | 0.2% | |
0.92 | 1 | 0.1% |
Maximum 5 values
Value | Count | Frequency (%) | |
80 | 1 | 0.1% | |
74 | 1 | 0.1% | |
71 | 2 | 0.2% | |
70.5 | 1 | 0.1% | |
70 | 2 | 0.2% |
Cabin
Categorical
Distinct count | 148 |
---|---|
Unique (%) | 16.6% |
Missing (%) | 77.1% |
Missing (n) | 687 |
Common values
Value | Count | Frequency (%) | |
B96 B98 | 4 | 0.4% | |
G6 | 4 | 0.4% | |
C23 C25 C27 | 4 | 0.4% | |
C22 C26 | 3 | 0.3% | |
F33 | 3 | 0.3% | |
E101 | 3 | 0.3% | |
D | 3 | 0.3% | |
F2 | 3 | 0.3% | |
E44 | 2 | 0.2% | |
F G73 | 2 | 0.2% | |
Other values (137) | 173 | 19.4% | |
(Missing) | 687 | 77.1% |
Max length | 15 |
---|---|
Mean length | 3.1347 |
Min length | 1 |
Contains chars | True |
Contains digits | True |
Contains spaces | True |
Contains non-words | True |
Embarked
Categorical
Distinct count | 4 |
---|---|
Unique (%) | 0.4% |
Missing (%) | 0.2% |
Missing (n) | 2 |
Common values
Value | Count | Frequency (%) | |
S | 644 | 72.3% | |
C | 168 | 18.9% | |
Q | 77 | 8.6% | |
(Missing) | 2 | 0.2% |
Max length | 3 |
---|---|
Mean length | 1.0045 |
Min length | 1 |
Contains chars | True |
Contains digits | False |
Contains spaces | False |
Contains non-words | False |
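The Embarked figures look odd at first: three ports but a distinct count of 4, and single-letter codes but a maximum length of 3. Both are consistent with the two missing entries being counted as the three-character string "nan". That is an inference from the numbers, not something the report states. A minimal check under that assumption:

```python
# Counts copied from the Embarked frequency table.
counts = {"S": 644, "C": 168, "Q": 77}
n_missing = 2
placeholder = "nan"  # ASSUMPTION: missing values are stringified like this

n_rows = sum(counts.values()) + n_missing
total_length = sum(len(value) * c for value, c in counts.items())
total_length += len(placeholder) * n_missing

distinct = len(counts) + 1                                    # ports plus the placeholder
max_length = max(len(placeholder), *(len(v) for v in counts))
mean_length = total_length / n_rows

print(distinct)                # 4
print(max_length)              # 3
print(round(mean_length, 4))   # 1.0045
```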
Fare
Numeric
Distinct count | 248 |
---|---|
Unique (%) | 27.8% |
Missing (%) | 0.0% |
Missing (n) | 0 |
Infinite (%) | 0.0% |
Infinite (n) | 0 |
Mean | 32.204 |
---|---|
Minimum | 0 |
Maximum | 512.33 |
Zeros (%) | 1.7% |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 7.225 |
Q1 | 7.9104 |
Median | 14.454 |
Q3 | 31 |
95-th percentile | 112.08 |
Maximum | 512.33 |
Range | 512.33 |
Interquartile range | 23.09 |
Descriptive statistics
Standard deviation | 49.693 |
---|---|
Coef of variation | 1.5431 |
Kurtosis | 33.398 |
Mean | 32.204 |
MAD | 28.164 |
Skewness | 4.7873 |
Sum | 28694 |
Variance | 2469.4 |
Memory size | 7.0 KiB |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 2.00625 6.3375 7.0479 7.0521 ... 57.4896 92.2896 159.1646 262.6875 512.3292 ], "bayesian blocks" binning strategy used)
Common values
Value | Count | Frequency (%) | |
8.05 | 43 | 4.8% | |
13 | 42 | 4.7% | |
7.8958 | 38 | 4.3% | |
7.75 | 34 | 3.8% | |
26 | 31 | 3.5% | |
10.5 | 24 | 2.7% | |
7.925 | 18 | 2.0% | |
7.775 | 16 | 1.8% | |
26.55 | 15 | 1.7% | |
0 | 15 | 1.7% | |
Other values (238) | 615 | 69.0% |
Minimum 5 values
Value | Count | Frequency (%) | |
0 | 15 | 1.7% | |
4.0125 | 1 | 0.1% | |
5 | 1 | 0.1% | |
6.2375 | 1 | 0.1% | |
6.4375 | 1 | 0.1% |
Maximum 5 values
Value | Count | Frequency (%) | |
512.33 | 3 | 0.3% | |
263 | 4 | 0.4% | |
262.38 | 2 | 0.2% | |
247.52 | 2 | 0.2% | |
227.53 | 4 | 0.4% |
Name
Categorical, Unique
First 5 values |
---|
Abbing, Mr. Anthony |
Abbott, Mr. Rossmore Edward |
Abbott, Mrs. Stanton (Rosa Hunt) |
Abelson, Mr. Samuel |
Abelson, Mrs. Samuel (Hannah Wizosky) |
Last 5 values |
---|
de Mulder, Mr. Theodore |
de Pelsmaeker, Mr. Alfons |
del Carlo, Mr. Sebastiano |
van Billiard, Mr. Austin Blyler |
van Melkebeke, Mr. Philemon |
First 5 values
Value | Count | Frequency (%) | |
Abbing, Mr. Anthony | 1 | 0.1% | |
Abbott, Mr. Rossmore Edward | 1 | 0.1% | |
Abbott, Mrs. Stanton (Rosa Hunt) | 1 | 0.1% | |
Abelson, Mr. Samuel | 1 | 0.1% | |
Abelson, Mrs. Samuel (Hannah Wizosky) | 1 | 0.1% |
Last 5 values
Value | Count | Frequency (%) | |
van Melkebeke, Mr. Philemon | 1 | 0.1% | |
van Billiard, Mr. Austin Blyler | 1 | 0.1% | |
del Carlo, Mr. Sebastiano | 1 | 0.1% | |
de Pelsmaeker, Mr. Alfons | 1 | 0.1% | |
de Mulder, Mr. Theodore | 1 | 0.1% |
Parch
Numeric
Distinct count | 7 |
---|---|
Unique (%) | 0.8% |
Missing (%) | 0.0% |
Missing (n) | 0 |
Infinite (%) | 0.0% |
Infinite (n) | 0 |
Mean | 0.38159 |
---|---|
Minimum | 0 |
Maximum | 6 |
Zeros (%) | 76.1% |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
Median | 0 |
Q3 | 0 |
95-th percentile | 2 |
Maximum | 6 |
Range | 6 |
Interquartile range | 0 |
Descriptive statistics
Standard deviation | 0.80606 |
---|---|
Coef of variation | 2.1123 |
Kurtosis | 9.7781 |
Mean | 0.38159 |
MAD | 0.58074 |
Skewness | 2.7491 |
Sum | 340 |
Variance | 0.64973 |
Memory size | 7.0 KiB |
Histogram with fixed size bins (bins=7)
Histogram with variable size bins (bins=[0. 0.5 1.5 2.5 6. ], "bayesian blocks" binning strategy used)
Common values
Value | Count | Frequency (%) | |
0 | 678 | 76.1% | |
1 | 118 | 13.2% | |
2 | 80 | 9.0% | |
5 | 5 | 0.6% | |
3 | 5 | 0.6% | |
4 | 4 | 0.4% | |
6 | 1 | 0.1% |
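Because Parch takes only seven values, its descriptive statistics can be recomputed from the frequency table alone. The sketch below does so with the standard library. The results agree with the report if the variance uses the sample (n - 1) denominator and "MAD" means the mean absolute deviation around the mean. Both readings are inferred from the numbers rather than stated in the report.

```python
from math import sqrt

# value -> count, copied from the Parch frequency table.
freq = {0: 678, 1: 118, 2: 80, 5: 5, 3: 5, 4: 4, 6: 1}

n = sum(freq.values())
total = sum(value * count for value, count in freq.items())
mean = total / n

# Sample variance (n - 1 denominator) and its square root.
variance = sum(count * (value - mean) ** 2 for value, count in freq.items()) / (n - 1)
std = sqrt(variance)

# Mean absolute deviation around the mean.
mad = sum(count * abs(value - mean) for value, count in freq.items()) / n

print(n, total)             # 891 340
print(round(mean, 5))       # 0.38159
print(round(variance, 5))   # 0.64973
print(round(std, 5))        # 0.80606
print(round(mad, 5))        # 0.58074
```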
Minimum 5 values
Value | Count | Frequency (%) | |
0 | 678 | 76.1% | |
1 | 118 | 13.2% | |
2 | 80 | 9.0% | |
3 | 5 | 0.6% | |
4 | 4 | 0.4% |
Maximum 5 values
Value | Count | Frequency (%) | |
6 | 1 | 0.1% | |
5 | 5 | 0.6% | |
4 | 4 | 0.4% | |
3 | 5 | 0.6% | |
2 | 80 | 9.0% |
PassengerId
Numeric
Distinct count | 891 |
---|---|
Unique (%) | 100.0% |
Missing (%) | 0.0% |
Missing (n) | 0 |
Infinite (%) | 0.0% |
Infinite (n) | 0 |
Mean | 446 |
---|---|
Minimum | 1 |
Maximum | 891 |
Zeros (%) | 0.0% |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 45.5 |
Q1 | 223.5 |
Median | 446 |
Q3 | 668.5 |
95-th percentile | 846.5 |
Maximum | 891 |
Range | 890 |
Interquartile range | 445 |
Descriptive statistics
Standard deviation | 257.35 |
---|---|
Coef of variation | 0.57703 |
Kurtosis | -1.2 |
Mean | 446 |
MAD | 222.75 |
Skewness | 0 |
Sum | 3.9739e+05 |
Variance | 66231 |
Memory size | 7.0 KiB |
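The PassengerId statistics are what one would expect for the consecutive integers 1 to 891, which suggests the column is a plain row identifier. The sketch below regenerates that sequence and recomputes the figures, again assuming a sample variance and a mean absolute deviation for "MAD".

```python
n = 891
ids = range(1, n + 1)   # consecutive identifiers 1..891

total = sum(ids)                                          # n(n+1)/2
mean = total / n                                          # (n+1)/2
variance = sum((i - mean) ** 2 for i in ids) / (n - 1)    # n(n+1)/12
mad = sum(abs(i - mean) for i in ids) / n

print(total)             # 397386
print(mean)              # 446.0
print(variance)          # 66231.0
print(round(mad, 2))     # 222.75
```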
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 891.], "bayesian blocks" binning strategy used)
Common values
Value | Count | Frequency (%) | |
891 | 1 | 0.1% | |
293 | 1 | 0.1% | |
304 | 1 | 0.1% | |
303 | 1 | 0.1% | |
302 | 1 | 0.1% | |
301 | 1 | 0.1% | |
300 | 1 | 0.1% | |
299 | 1 | 0.1% | |
298 | 1 | 0.1% | |
297 | 1 | 0.1% | |
Other values (881) | 881 | 98.9% |
Minimum 5 values
Value | Count | Frequency (%) | |
1 | 1 | 0.1% | |
2 | 1 | 0.1% | |
3 | 1 | 0.1% | |
4 | 1 | 0.1% | |
5 | 1 | 0.1% |
Maximum 5 values
Value | Count | Frequency (%) | |
891 | 1 | 0.1% | |
890 | 1 | 0.1% | |
889 | 1 | 0.1% | |
888 | 1 | 0.1% | |
887 | 1 | 0.1% |
Pclass
Categorical
Distinct count | 3 |
---|---|
Unique (%) | 0.3% |
Missing (%) | 0.0% |
Missing (n) | 0 |
Common values
Value | Count | Frequency (%) | |
3 | 491 | 55.1% | |
1 | 216 | 24.2% | |
2 | 184 | 20.7% |
Max length | 1 |
---|---|
Mean length | 1 |
Min length | 1 |
Contains chars | False |
Contains digits | True |
Contains spaces | False |
Contains non-words | False |
Sex
Categorical
Distinct count | 2 |
---|---|
Unique (%) | 0.2% |
Missing (%) | 0.0% |
Missing (n) | 0 |
Common values
Value | Count | Frequency (%) | |
male | 577 | 64.8% | |
female | 314 | 35.2% |
Max length | 6 |
---|---|
Mean length | 4.7048 |
Min length | 4 |
Contains chars | True |
Contains digits | False |
Contains spaces | False |
Contains non-words | False |
SibSp
Numeric
Distinct count | 7 |
---|---|
Unique (%) | 0.8% |
Missing (%) | 0.0% |
Missing (n) | 0 |
Infinite (%) | 0.0% |
Infinite (n) | 0 |
Mean | 0.52301 |
---|---|
Minimum | 0 |
Maximum | 8 |
Zeros (%) | 68.2% |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
Median | 0 |
Q3 | 1 |
95-th percentile | 3 |
Maximum | 8 |
Range | 8 |
Interquartile range | 1 |
Descriptive statistics
Standard deviation | 1.1027 |
---|---|
Coef of variation | 2.1085 |
Kurtosis | 17.88 |
Mean | 0.52301 |
MAD | 0.71378 |
Skewness | 3.6954 |
Sum | 466 |
Variance | 1.216 |
Memory size | 7.0 KiB |
Histogram with fixed size bins (bins=7)
Histogram with variable size bins (bins=[0. 0.5 1.5 4.5 8. ], "bayesian blocks" binning strategy used)
Common values
Value | Count | Frequency (%) | |
0 | 608 | 68.2% | |
1 | 209 | 23.5% | |
2 | 28 | 3.1% | |
4 | 18 | 2.0% | |
3 | 16 | 1.8% | |
8 | 7 | 0.8% | |
5 | 5 | 0.6% |
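The SibSp quantiles can be recovered from the frequency table by expanding it into a sorted list and interpolating linearly between order statistics. Linear interpolation is the default in NumPy and pandas. That the report uses it is an assumption, although the fractional PassengerId percentiles (for example 45.5) point the same way. The helper name `quantile` is illustrative.

```python
def quantile(sorted_values, q):
    """Quantile with linear interpolation between neighbouring order statistics."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

# value -> count, copied from the SibSp frequency table.
freq = {0: 608, 1: 209, 2: 28, 4: 18, 3: 16, 8: 7, 5: 5}

# Expand the table into one sorted list of 891 observations.
values = sorted(value for value, count in freq.items() for _ in range(count))

median = quantile(values, 0.50)
q3 = quantile(values, 0.75)
p95 = quantile(values, 0.95)

print(len(values))       # 891
print(median, q3, p95)   # 0.0 1.0 3.0
```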
Minimum 5 values
Value | Count | Frequency (%) | |
0 | 608 | 68.2% | |
1 | 209 | 23.5% | |
2 | 28 | 3.1% | |
3 | 16 | 1.8% | |
4 | 18 | 2.0% |
Maximum 5 values
Value | Count | Frequency (%) | |
8 | 7 | 0.8% | |
5 | 5 | 0.6% | |
4 | 18 | 2.0% | |
3 | 16 | 1.8% | |
2 | 28 | 3.1% |
Survived
Boolean
Distinct count | 2 |
---|---|
Unique (%) | 0.2% |
Missing (%) | 0.0% |
Missing (n) | 0 |
Common values
Value | Count | Frequency (%) | |
0 | 549 | 61.6% | |
1 | 342 | 38.4% |
Ticket
Categorical
Distinct count | 681 |
---|---|
Unique (%) | 76.4% |
Missing (%) | 0.0% |
Missing (n) | 0 |
Common values
Value | Count | Frequency (%) | |
CA. 2343 | 7 | 0.8% | |
1601 | 7 | 0.8% | |
347082 | 7 | 0.8% | |
347088 | 6 | 0.7% | |
CA 2144 | 6 | 0.7% | |
3101295 | 6 | 0.7% | |
382652 | 5 | 0.6% | |
S.O.C. 14879 | 5 | 0.6% | |
LINE | 4 | 0.4% | |
17421 | 4 | 0.4% | |
Other values (671) | 834 | 93.6% |
Max length | 18 |
---|---|
Mean length | 6.7508 |
Min length | 3 |
Contains chars | True |
Contains digits | True |
Contains spaces | True |
Contains non-words | True |
First rows
Index | Age | Cabin | Embarked | Fare | Name | Parch | PassengerId | Pclass | Sex | SibSp | Survived | Ticket |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 22.0 | NaN | S | 7.2500 | Braund, Mr. Owen Harris | 0 | 1 | 3 | male | 1 | 0 | A/5 21171 |
1 | 38.0 | C85 | C | 71.2833 | Cumings, Mrs. John Bradley (Florence Briggs Th... | 0 | 2 | 1 | female | 1 | 1 | PC 17599 |
2 | 26.0 | NaN | S | 7.9250 | Heikkinen, Miss. Laina | 0 | 3 | 3 | female | 0 | 1 | STON/O2. 3101282 |
3 | 35.0 | C123 | S | 53.1000 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | 0 | 4 | 1 | female | 1 | 1 | 113803 |
4 | 35.0 | NaN | S | 8.0500 | Allen, Mr. William Henry | 0 | 5 | 3 | male | 0 | 0 | 373450 |
5 | NaN | NaN | Q | 8.4583 | Moran, Mr. James | 0 | 6 | 3 | male | 0 | 0 | 330877 |
6 | 54.0 | E46 | S | 51.8625 | McCarthy, Mr. Timothy J | 0 | 7 | 1 | male | 0 | 0 | 17463 |
7 | 2.0 | NaN | S | 21.0750 | Palsson, Master. Gosta Leonard | 1 | 8 | 3 | male | 3 | 0 | 349909 |
8 | 27.0 | NaN | S | 11.1333 | Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) | 2 | 9 | 3 | female | 0 | 1 | 347742 |
9 | 14.0 | NaN | C | 30.0708 | Nasser, Mrs. Nicholas (Adele Achem) | 0 | 10 | 2 | female | 1 | 1 | 237736 |
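Each row in these sample tables starts with a zero-based row index, followed by the twelve variables in alphabetical order. The sketch below shows one way to map a flattened row back onto the column names. The helper `parse_row` is hypothetical, values are kept as strings, and "NaN" is treated as missing.

```python
# Column order as shown in the table header (after the leading row index).
columns = ["Age", "Cabin", "Embarked", "Fare", "Name", "Parch",
           "PassengerId", "Pclass", "Sex", "SibSp", "Survived", "Ticket"]

def parse_row(line):
    """Split one pipe-delimited row into (row index, {column: value or None})."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    index, values = int(cells[0]), cells[1:]
    record = {name: (None if value == "NaN" else value)
              for name, value in zip(columns, values)}
    return index, record

row = "0 | 22.0 | NaN | S | 7.2500 | Braund, Mr. Owen Harris | 0 | 1 | 3 | male | 1 | 0 | A/5 21171 |"
index, record = parse_row(row)

print(index)              # 0
print(record["Age"])      # 22.0
print(record["Cabin"])    # None
print(record["Ticket"])   # A/5 21171
```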
Last rows
Index | Age | Cabin | Embarked | Fare | Name | Parch | PassengerId | Pclass | Sex | SibSp | Survived | Ticket |
---|---|---|---|---|---|---|---|---|---|---|---|---|
881 | 33.0 | NaN | S | 7.8958 | Markun, Mr. Johann | 0 | 882 | 3 | male | 0 | 0 | 349257 |
882 | 22.0 | NaN | S | 10.5167 | Dahlberg, Miss. Gerda Ulrika | 0 | 883 | 3 | female | 0 | 0 | 7552 |
883 | 28.0 | NaN | S | 10.5000 | Banfield, Mr. Frederick James | 0 | 884 | 2 | male | 0 | 0 | C.A./SOTON 34068 |
884 | 25.0 | NaN | S | 7.0500 | Sutehall, Mr. Henry Jr | 0 | 885 | 3 | male | 0 | 0 | SOTON/OQ 392076 |
885 | 39.0 | NaN | Q | 29.1250 | Rice, Mrs. William (Margaret Norton) | 5 | 886 | 3 | female | 0 | 0 | 382652 |
886 | 27.0 | NaN | S | 13.0000 | Montvila, Rev. Juozas | 0 | 887 | 2 | male | 0 | 0 | 211536 |
887 | 19.0 | B42 | S | 30.0000 | Graham, Miss. Margaret Edith | 0 | 888 | 1 | female | 0 | 1 | 112053 |
888 | NaN | NaN | S | 23.4500 | Johnston, Miss. Catherine Helen "Carrie" | 2 | 889 | 3 | female | 1 | 0 | W./C. 6607 |
889 | 26.0 | C148 | C | 30.0000 | Behr, Mr. Karl Howell | 0 | 890 | 1 | male | 0 | 1 | 111369 |
890 | 32.0 | NaN | Q | 7.7500 | Dooley, Mr. Patrick | 0 | 891 | 3 | male | 0 | 0 | 370376 |