Assessing the accuracy of the MarineAware platform - Riskaware

Oil spills at sea can have profound impacts on the environment, public health, the economy, and communities.

Although oil spills can occur naturally, the most catastrophic come from human activity, such as the 1989 Exxon Valdez incident, one of the largest oil tanker spills in history. In that incident, more than 11 million gallons of crude oil were spilt across 1,300 miles of shoreline in Alaska and Canada.

The results were devastating, decimating local wildlife and killing more than a quarter of a million seabirds alone. To the fishing industry, the Valdez incident caused more than $300 million in economic harm, with the tourist trade also taking a significant hit. Clean-up costs were estimated to be more than $2.5 billion, and legal settlements and penalties more than doubled this figure.

Oil spill modelling tools allow response resources to be deployed more effectively, reducing the potential environmental and economic damage of a spill, among other benefits.

Recently, we approached the Science and Technology Facilities Council (STFC) to validate and assess our MarineAware platform. In doing so, we hoped to understand how accurate our oil spill modelling capabilities are, and how they can be used effectively by response teams worldwide.


“Through the Innovate UK A4I programme, we reached out to the Hartree Centre to help us validate our model in a way that could produce quantifiable results. We wanted to objectively assess the model’s ability to forecast different historic incidents to answer the question, ‘how accurate is the model?’ in a meaningful way.” – Tim Culmer, MarineAware Product Manager.

About MarineAware

MarineAware – our oil spill modelling platform – is a fresh look at oil spill preparedness and response. At its core is our advanced oil spill model, capable of predicting the trajectory and fate of an oil spill, as well as identifying its possible source.

Key to the platform is its flexibility. The modelling capability can be used to produce bespoke reports for planning or can be combined with capabilities like automatic satellite detection for monitoring and scheduling to aid preparedness for potential scenarios.

Organisations across the world rely on MarineAware for vital modelling support. This includes a high-profile study conducted on behalf of the UK Government’s Foreign, Commonwealth & Development Office (FCDO) to assess the risks posed by the FSO Safer, a deteriorating fuel storage vessel in the Red Sea.

Validating the MarineAware Platform

Validation of a model involves comparing its simulations of real scenarios to the actual outcomes. A range of historic oil spill incidents was used in the validation carried out by STFC, including the 2018 incident off Corsica featured in this article.

Corsica: setting the scene

On the 7th of October 2018, the CSL Virginia and the Ulysse collided in the Mediterranean Sea, just north of the French island of Corsica. As a result, more than 1,000 tonnes of oil were released into the sea over a 3-mile spill, with oil continuing to leak from a damaged hull during this time.

Because oil spills can cover large areas, satellite imagery can be a valuable resource for understanding the true extent of a spill. For the incident in Corsica, several high-quality images of the spill were captured by the European Space Agency’s Sentinel-1 satellites.

Thanks to the amount of information captured by the satellites, we were able to use this scenario to validate our simulations of the incident. This was made possible by the capabilities, facilities and expertise of the Hartree Centre, part of the STFC, which performed an unbiased and thorough validation of our modelling accuracy.

Understanding the validation metrics

To assess how accurately the MarineAware model simulated the outcome of the Corsica incident and other oil spills, the Hartree Centre used three metrics. These were based on established approaches developed in other domains and adapted for oil spill model validation at the Hartree Centre:

A 2-D Measure of Effectiveness (MOE): A 2-D MOE measures performance based on the extent to which the location of the predicted oil spill overlaps with the actual extent of the oil at a given time. Both false negatives (areas where oil is found but not predicted) and false positives (areas where oil is predicted but not found) are considered.

A Centroid Skill Score: A centroid skill score measures the proximity of a predicted region to the observed region, within a pre-set tolerance threshold. A centroid skill score of 1 would mean the centroids of the observed oil and predicted oil spill would perfectly align.

An Area Skill Score: This measures the size of the predicted oil spill region relative to the observed region. A score of 1 would indicate that the predicted and known spill areas are the same, whereas a score of 0 would indicate a large margin of error.
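As a rough illustration of the two skill scores, the sketch below computes them for regions represented as sets of equal-area grid cells. The exact formulas are given in the Hartree Centre's paper and are not reproduced here: the linear fall-off to zero, the function names and the tolerance parameters are all assumptions made for this example.

```python
import math

def centroid(cells):
    """Mean (x, y) position of a region given as a set of grid cells."""
    n = len(cells)
    return (sum(x for x, _ in cells) / n, sum(y for _, y in cells) / n)

def centroid_skill_score(predicted, observed, tolerance):
    """Assumed form: 1 when the centroids coincide, falling linearly
    to 0 once they are `tolerance` cells apart or more."""
    (px, py), (ox, oy) = centroid(predicted), centroid(observed)
    distance = math.hypot(px - ox, py - oy)
    return max(0.0, 1.0 - distance / tolerance)

def area_skill_score(predicted, observed, tolerance=1.0):
    """Assumed form: 1 when the areas match, falling linearly to 0 once
    the relative area error reaches `tolerance`."""
    relative_error = abs(len(predicted) - len(observed)) / len(observed)
    return max(0.0, 1.0 - relative_error / tolerance)

# Hypothetical regions: a 16-cell observed slick and a 12-cell prediction
# that is slightly displaced from it.
observed = {(x, y) for x in range(0, 4) for y in range(0, 4)}
predicted = {(x, y) for x in range(1, 5) for y in range(0, 3)}

css = centroid_skill_score(predicted, observed, tolerance=5.0)
ass = area_skill_score(predicted, observed)
print(round(css, 3), ass)
```

In this made-up case the prediction is four cells smaller than the 16-cell observation, so the area score is 0.75, while the centroid score depends on the tolerance chosen.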

For more information on the metrics developed by the Hartree Centre, you can download their paper published in the IEEE Journal of Oceanic Engineering.

Our two model modes explained

We can run the MarineAware model in different modes depending on the requirements of a situation. Most commonly our model is run in deterministic mode for short-range predictions, with a stochastic mode used to take into account the uncertainty involved in longer-term predictions.

These two modes produce different outputs. By developing different metrics, the Hartree Centre were able to select those most suitable for assessing each one:

Deterministic (Short-range)

When an oil spill is initially detected, responders need to act quickly. In these instances, the MarineAware model is typically used in a deterministic mode. This involves running a single simulation based on the exact metocean (meteorological and oceanographic) forecasts over that period. The simulation can give an idea of the different concentrations of oil likely to be seen, and because it involves just a single run it can supply results to responders quickly.

Stochastic (Long-range)

Over longer time periods, uncertainty in the metocean forecasts and the modelling itself increases. To take this uncertainty into account, we run the MarineAware model in a stochastic mode. This involves performing many (sometimes hundreds of) simulations and varying inputs, such as the metocean conditions, based on the uncertainty associated with them. The outputs from these different simulations are then combined to identify the parts of the sea and land most likely to be affected given all the possible conditions.

For example, in a stochastic run of 100 simulations where oil reached a specific location in fifty of them, we could infer that there was a fifty per cent chance of oil reaching that location.
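The arithmetic in this example can be sketched as a simple count across the simulations. The code below is an illustration only: representing each run as a set of oiled grid cells, and the function name `hit_probability`, are assumptions made here rather than a description of MarineAware's implementation.

```python
from collections import Counter

def hit_probability(runs):
    """Fraction of simulations in which oil reached each grid cell.

    `runs` is a list of sets; each set holds the cells oiled in one run.
    """
    counts = Counter(cell for run in runs for cell in run)
    return {cell: n / len(runs) for cell, n in counts.items()}

# Hypothetical ensemble of 100 runs: every run oils cell (0, 0) near the
# source, and half of them also reach cell (5, 2).
runs = [{(0, 0), (5, 2)} if i % 2 == 0 else {(0, 0)} for i in range(100)]

probability = hit_probability(runs)
print(probability[(0, 0)])  # 1.0
print(probability[(5, 2)])  # 0.5
```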

Investigating the results

Deterministic

Comparison of the MarineAware model’s deterministic simulation with the actual extent of the oil 1 day 24 minutes after the initial release. The model’s ability to simulate the trajectory and spread of the oil is shown by the centroid and area skill scores, which are plotted on the y and x axes of the chart, respectively.

Comparison of the MarineAware model’s deterministic simulation with the actual extent of the oil 1 day 12 hours 19 minutes after the initial release. The model’s ability to simulate the trajectory and spread of the oil is shown by the centroid and area skill scores, which are plotted on the y and x axes of the chart, respectively.

For the deterministic outputs, the Hartree Centre combined the centroid and area skill scores to produce an assessment of the model’s ability to predict the trajectory and spread of the oil. The figures above show the MarineAware model’s predictions (in purple) overlaid on the outline of the real oil spill as seen from the satellite imagery (in grey) at 1 day 24 minutes and 1 day 12 hours 19 minutes after the spill started.

For deterministic modelling, the skill score metrics are preferred over the 2-D MOE metric because they are more tolerant of inaccuracies in the model. For example, if the model had been only a few degrees off in the angle of its trajectory, there would be hardly any overlap, which would produce as low a 2-D MOE score as if the model had predicted that the oil would go in completely the wrong direction.
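A small numerical illustration of this point, using made-up geometry: a narrow observed slick, a prediction displaced two cells sideways (standing in for a small angular error), and a prediction heading the opposite way. The grid coordinates and helper names are hypothetical. Overlap is zero in both cases, so an overlap-based score cannot tell them apart, whereas the centroid distance can.

```python
import math

def centroid(cells):
    """Mean (x, y) position of a set of grid cells."""
    n = len(cells)
    return (sum(x for x, _ in cells) / n, sum(y for _, y in cells) / n)

def centroid_distance(a, b):
    """Straight-line distance between the centroids of two regions."""
    (ax, ay), (bx, by) = centroid(a), centroid(b)
    return math.hypot(ax - bx, ay - by)

observed = {(x, 0) for x in range(1, 21)}      # narrow slick running east
slightly_off = {(x, 2) for x in range(1, 21)}  # same shape, two cells north
wrong_way = {(-x, 0) for x in range(1, 21)}    # heads west instead of east

results = {}
for name, predicted in [("slightly off", slightly_off), ("wrong way", wrong_way)]:
    overlap = len(predicted & observed)  # cells shared with the observation
    results[name] = (overlap, centroid_distance(predicted, observed))
    print(name, results[name])
```

Here the near miss has a centroid distance of 2 cells and the wrong-direction prediction 21 cells, despite both sharing no cells with the observed slick.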

From the assessment carried out by the Hartree Centre, we can see that the model has high centroid skill scores at both times because it has predicted the trajectory of the oil very well. The area skill score is lower at the first time but is greatly improved at the second. One explanation for this is that the release rate of oil used in the model simulation was constant, whereas it may have varied in the real scenario. When there are uncertainties like this, it can be useful to use stochastic modelling.

Stochastic

Comparison of the MarineAware model’s stochastic simulation with the actual extent of the oil 2 days 12 hours and 11 minutes after the initial release. The different coloured areas show the model’s calculated probability of the oil reaching different locations. The 2-D MOE score for each probability is shown on the chart, with its ability to reduce false negatives shown on the x-axis and its ability to reduce false positives on the y-axis.

Because stochastic modelling considers all possible outcomes rather than only the most likely one, there is a greater spread in the output and therefore a much higher chance of overlap with the actual oil spill. For this reason, the Hartree Centre recommend the 2-D MOE metric for stochastic outputs.

The figure above shows the MarineAware model’s predictions (in purple to yellow) overlaid on the outline of the real oil spill as seen from the satellite imagery (in yellow and brown) at 2 days 12 hours and 11 minutes after the spill started. For this longer-range simulation, we have chosen to compare the stochastic results.

The brighter colours of the stochastic output show the areas where the model is most certain that there will be oil, with the darker colours showing the lower probabilities. As expected, certainty is highest near the source, and uncertainty increases the further the oil travels and the longer it is in the water.

The 2-D MOE metric assesses the model’s ability to reduce false negatives and false positives, with the scores plotted on the x and y axes of the chart, respectively. The less of the actual oil that lies outside the simulated area, the fewer the false negatives and therefore the higher the x-axis score. On the y-axis, a higher score is awarded for fewer false positives, in other words when the simulation shows fewer areas with a possibility of oil where there was no actual oil.
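The two axes can be written as simple area ratios. The sketch below follows the usual 2-D MOE definition (overlap divided by observed area on the x-axis, overlap divided by predicted area on the y-axis), with regions represented as sets of equal-area grid cells. The function name and the example regions are illustrative assumptions, not the Hartree Centre's code.

```python
def moe_2d(predicted, observed):
    """Return (x, y) for two regions given as sets of grid cells.

    x = 1 - (false-negative area / observed area)
    y = 1 - (false-positive area / predicted area)
    """
    overlap = len(predicted & observed)
    x = overlap / len(observed)   # share of the observed oil that was predicted
    y = overlap / len(predicted)  # share of the predicted oil that was observed
    return x, y

# Hypothetical regions on a small grid.
observed = {(x, y) for x in range(4) for y in range(4)}      # 16 cells
shifted = {(x, y) for x in range(2, 6) for y in range(4)}    # 16 cells, offset
broad = {(x, y) for x in range(8) for y in range(4)}         # 32 cells, covers all

print(moe_2d(shifted, observed))  # (0.5, 0.5)
print(moe_2d(broad, observed))    # (1.0, 0.5)
```

The broad region behaves like a low-probability stochastic contour: it covers all of the observed oil, so x is 1, at the cost of false positives that hold y down.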

Again, the MarineAware model performed well, with the lowest probability output (>1%) covering virtually all of the observed oil spill area, meaning hardly any false negatives, as we would expect for this type of output.

MarineAware’s validation: a summary

The validation work completed on the MarineAware model by the STFC is vital in ensuring that its predictions can be relied upon by its users. The Corsica incident is just one of the many historical incidents used in the validation. Using these real-world cases helps us ensure that the model is accurate when it matters. The STFC concluded that the MarineAware model can provide important actionable intelligence in the event of an actual oil spill, or for preparedness planning of a potential incident.

The metrics developed by the STFC have formed a vital part of our development process for MarineAware. As well as identifying the strengths of the model, they also help identify areas where we can enhance it further. We continue to test our model with the most recent oil spill incidents as they occur. This is important as the nature of spills is changing with the introduction of new low-sulphur fuels for vessels.

Our work with the STFC has been extremely positive, highlighting the advantages of collaborating with organisations that have different areas of expertise to strengthen the capabilities of potentially life-saving tools.
