Three year summary Lake Geneva

Notebook 4

Purpose: Present an analysis method for survey results from beach litter inventories on Lake Geneva.

Background: This analysis is part of the global movement to reduce plastic debris in the marine environment. Riverine inputs are major contributors of plastic debris (and all types of refuse) to the oceans. The data was collected on the shores of Lake Geneva over a three-year period. The maritime protocol was modified in very specific ways to adjust for the local geography and population density.

Research question: Is this a representative sample?

From notebook one we can assume the following:
  1. The data was collected at different locations
  2. The data was collected by different groups of people
  3. For each year there is one group that collected 50% or more of the samples
  4. The average total number of objects found is greater than the median for all groups
From notebook two:
  1. The cumulative average of pcs/m is greater than the median
  2. For each year the average of pcs/m is greater than the median
  3. The median value x for each year lies in the range 4.24 ≤ x ≤ 6.84
  4. There may be a negative correlation between number of samples and standard deviation
  5. Different groups report similar results
From notebook three:
  1. The seven most frequently identified objects are relatively constant

If sampling all the trash is not possible, what if we sample as much as possible and see what that looks like? The following questions could be answered:

  1. What does the distribution of survey results look like?
  2. Do different groups of people produce different survey results?
  3. How different are the survey results from one location to another?
  4. What are the most abundant objects?
  5. How different are the survey results year over year?

From Notebook 1 IMPORTANT!

In notebook one the conclusion was that the data has different geographic centers and those centers reflect different land use patterns.

The data is now stored locally, so it is always best to run notebook one first.

This notebook and all subsequent notebooks use the directory established in notebook one (see above).

Directory already in place
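The message above suggests a check of roughly the following kind. This is only a minimal sketch: the function name `ensure_directory` and the "Directory created" message are illustrative assumptions, not the code from notebook one.

```python
from pathlib import Path


def ensure_directory(path):
    """Create the folder if it is missing and report what happened."""
    folder = Path(path)
    if folder.is_dir():
        return "Directory already in place"
    # parents=True also creates any missing parent folders.
    folder.mkdir(parents=True)
    return "Directory created"
```

Calling `ensure_directory` a second time on the same path returns the "already in place" message, so the cell can be re-run safely.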

Read in the json data
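A minimal sketch of this step, assuming the survey data was saved by notebook one as a JSON array of records. The function name `read_survey_records` and the file layout are assumptions made for illustration.

```python
import json
from pathlib import Path


def read_survey_records(json_path):
    """Load a JSON file that holds a list of survey records."""
    with Path(json_path).open(encoding="utf-8") as handle:
        records = json.load(handle)
    # Fail early if the file is not the list of records we assume.
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of survey records")
    return records
```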

Calculate pieces per meter (pcs/m) and make the grouping levels
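One way this step could look is sketched below: sum the object counts of each survey, divide by the length of shoreline surveyed, then collect the results by a grouping level. The field names (`location`, `date`, `group`, `year`, `length`, `quantity`) and the example records are hypothetical, not the notebook's actual data.

```python
from collections import defaultdict


def pieces_per_meter(records):
    """Return one pcs/m result per survey (one location on one date)."""
    totals = defaultdict(int)
    info = {}
    for rec in records:
        key = (rec["location"], rec["date"])
        totals[key] += rec["quantity"]
        info[key] = rec
    surveys = []
    for key, total in totals.items():
        rec = info[key]
        surveys.append({
            "location": key[0],
            "date": key[1],
            "group": rec["group"],
            "year": rec["year"],
            "pcs_m": total / rec["length"],
        })
    return surveys


def group_by(surveys, level):
    """Collect the pcs/m values under each value of a grouping level."""
    grouped = defaultdict(list)
    for survey in surveys:
        grouped[survey[level]].append(survey["pcs_m"])
    return dict(grouped)


# Made-up records: two object types at site-a, one at site-b.
records = [
    {"location": "site-a", "date": "d1", "group": "G1", "year": "Year one",
     "length": 20, "quantity": 30},
    {"location": "site-a", "date": "d1", "group": "G1", "year": "Year one",
     "length": 20, "quantity": 10},
    {"location": "site-b", "date": "d2", "group": "G2", "year": "Year two",
     "length": 50, "quantity": 25},
]
surveys = pieces_per_meter(records)
by_year = group_by(surveys, "year")
by_group = group_by(surveys, "group")
```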

Results: Total pieces of trash per meter

Cumulative results

Reported as the total number of objects found per meter (pcs/m), the cumulative and year over year results are given in the table below:

Statistic          Year one   Year two   Year three   All
Samples            83         41         24           148
Mean               8.77       9.99       9.01         9.16
Median             4.83       6.84       4.24         5.52
Std dev            9.92       8.52       16.14        10.75
25th percentile    3.113      4.41       2.36         3.18
75th percentile    10.405     12.35      7.79         11.24
Minimum            0.68       0.57       0.11         0.11
Maximum            50.075     39.54      77.05        77.05
MCBP samples       80         22         5            107
SLR samples        0          18         15           33
EPFL samples       2          2          2            6
PC samples         0          0          2            2
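The statistics in the table can be computed for any list of pcs/m values with a helper like the one sketched here. The source does not say which standard deviation or percentile method was used, so the sample standard deviation and the "inclusive" quantile method are assumptions, and the example values are made up.

```python
import statistics


def summarize(values):
    """Return the summary statistics used in the table for one set of results."""
    # n=4 gives the three quartile cut points; only the outer two are needed.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "Samples": len(values),
        "Mean": statistics.mean(values),
        "Median": statistics.median(values),
        "Std dev": statistics.stdev(values),
        "25th percentile": q1,
        "75th percentile": q3,
        "Minimum": min(values),
        "Maximum": max(values),
    }


# Made-up values with one large result, so the mean exceeds the median.
example = summarize([1.0, 2.0, 3.0, 4.0, 20.0])
```

Applying `summarize` to the values of each year, and to all values together, would give one column of the table per call.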

Year two has the highest mean, the highest median and the greatest interquartile range of the three years. The lowest and the highest daily values were reported in year three. In each year there is one group that collected at least 50% of all the samples.
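These statements can be checked directly against the table. The sketch below uses only values copied from the table; the variable names are illustrative.

```python
# 25th and 75th percentiles per year, copied from the table.
quartiles = {
    "Year one": (3.113, 10.405),
    "Year two": (4.41, 12.35),
    "Year three": (2.36, 7.79),
}
# Interquartile range = 75th percentile minus 25th percentile.
iqr = {year: round(q3 - q1, 3) for year, (q1, q3) in quartiles.items()}
widest = max(iqr, key=iqr.get)

# Samples per year and samples from the largest group in that year
# (MCBP in years one and two, SLR in year three).
samples = {"Year one": 83, "Year two": 41, "Year three": 24}
largest_group = {"Year one": 80, "Year two": 22, "Year three": 15}
share = {year: largest_group[year] / samples[year] for year in samples}

print(iqr, widest)
print({year: f"{value:.1%}" for year, value in share.items()})
```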

For each year the mean is greater than the median, suggesting a right-skewed distribution. The mean and median are closest in year two.
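The gap between mean and median can be computed from the table values as a quick check. This is a small sketch with illustrative variable names; a positive gap is only an indication of right skew, not a formal test.

```python
# (mean, median) per column, copied from the table.
mean_median = {
    "Year one": (8.77, 4.83),
    "Year two": (9.99, 6.84),
    "Year three": (9.01, 4.24),
    "All": (9.16, 5.52),
}
gap = {label: round(mean - median, 2)
       for label, (mean, median) in mean_median.items()}

# Compare the three years only; "All" is the pooled result.
years = ["Year one", "Year two", "Year three"]
closest = min(years, key=lambda year: gap[year])

print(gap, closest)
```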

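Figure: histoGramAllData (histogram, all data)
Figure: logHistoGramAllData (log histogram, all data)
Figure: YoYdistLog (year over year distribution, log)
Figure: YoYdistNormed (year over year distribution, normed)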
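The names logHistoGramAllData and YoYdistLog suggest that the results were also binned on a log scale, which spreads out a right-skewed distribution. The notebook's plotting code is not shown, so the following is only a sketch of log-spaced binning with a hypothetical function name and made-up values. It assumes all values are positive and not all equal.

```python
import math


def log_bin_counts(values, n_bins=5):
    """Count values in bins of equal width on a log10 scale."""
    low = math.log10(min(values))
    high = math.log10(max(values))
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for value in values:
        index = int((math.log10(value) - low) / width)
        # The maximum lands exactly on the last edge; keep it in the last bin.
        counts[min(index, n_bins - 1)] += 1
    edges = [10 ** (low + i * width) for i in range(n_bins + 1)]
    return edges, counts


# Made-up pcs/m values for illustration.
edges, counts = log_bin_counts([0.2, 0.5, 3.0, 4.0, 50.0], n_bins=3)
```

The returned `edges` and `counts` could be passed to any plotting tool to draw the bars.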