Sean C. Crosby

Sampling for Maximum Dissimilarity

Creating a subsample of data that covers your N-dimensional space

Sampling

Sampling data can be done in many ways, depending on what is desired. Most often, uniform random sampling is used to collect a small subset for preliminary analysis. This can reduce computation and provide rapid insight. There are, however, other reasons to sample data, such as creating a representative sample set that covers the data range. A particular example arises in earth sciences when the goal is to model weather or ocean conditions across the range of possible forcing.
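As a rough illustration of sampling for coverage rather than at random, here is a minimal sketch of one common greedy approach (farthest-point selection): start from a random point, then repeatedly add the point farthest from everything already chosen. The function name `max_dissimilarity_sample`, the min-max scaling, and the Euclidean distance are assumptions made for this example, not necessarily what the full post uses.

```python
import numpy as np


def max_dissimilarity_sample(data, n_samples, seed=0):
    """Greedy farthest-point selection (illustrative sketch).

    data: array of shape (n_points, n_dims)
    Returns the indices of the selected rows.
    """
    data = np.asarray(data, dtype=float)
    n_points = data.shape[0]
    if not 0 < n_samples <= n_points:
        raise ValueError("n_samples must be between 1 and the number of points")

    # Assumption: scale each dimension to [0, 1] so no variable dominates.
    lo = data.min(axis=0)
    span = data.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid dividing by zero for constant columns
    scaled = (data - lo) / span

    # Seed the subset with one random point.
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(n_points))]

    # Distance from every point to its nearest selected point.
    min_dist = np.linalg.norm(scaled - scaled[selected[0]], axis=1)

    for _ in range(n_samples - 1):
        # Pick the point that is farthest from the current subset.
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        dist_to_new = np.linalg.norm(scaled - scaled[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)

    return np.array(selected)
```

Unlike a uniform random draw, this tends to pick up rare, extreme conditions early, which is the point when the subset has to span the full range of forcing.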

Posted by "Sean C. Crosby" Sunday, October 8, 2023

Making wind forecast corrections with ML

Using multi-layer perceptrons to create non-linear forecast corrections

The idea

Recent skill improvements in meteorological and ocean forecasts have come from incorporating observations into model predictions. Numerical model development, by means of improved physical representations, is in many cases suffering from diminishing returns. While ensemble approaches (many models) are useful and provide uncertainty estimates, the assimilation of ever-increasing data into models is needed. The folks over at Google’s DeepMind recently showed that deep neural networks could be used to accurately predict short-term (up to 90 minutes) precipitation events using prior radar observations (Nature article).

Posted by "Sean C. Crosby" Sunday, January 2, 2022

Can we visualize decision space for different classifiers?

Visualizing decision space from simple to complex classifiers

Support Vector Machines (SVM) are seemingly derived from an intuitive concept: drawing a decision boundary with the widest margin (aka gutter, street, etc.). While this only really applies to the linear problem in which the data are indeed separable, I find it particularly helpful in visualizing the decision space of the model. Unlike with a random forest or multi-layer neural network, it is easy to picture the model space. While not a novel idea in the slightest, this provoked me to consider decision space for several classification algorithms, to hopefully gain insight into other techniques and, with this insight, select appropriate methods for future questions.

Posted by "Sean C. Crosby" Tuesday, December 28, 2021

Dominant wind field patterns from complex empirical orthogonal functions

Using complex EOFs (aka PCA) to find common wind patterns

Today I was wondering if I could make some sense of a rather large 60-year hindcast of winds in Washington around the Salish Sea. Could I see what the typical wind patterns in the region are? How can I manage this with a rather hefty 1+ GB set of spatial wind predictions? This is a good opportunity to explore empirical orthogonal functions (EOFs), also known as principal component analysis (PCA).

Posted by "Sean C. Crosby" Tuesday, November 23, 2021

Ocean wave transformation with ray-path tracing

Given ocean bathymetry, we can calculate wave speed and then rapidly estimate ocean wave transformation

Ray tracing

Ray tracing is a common tool in many fields, from magnetic resonance imaging (MRI) to ocean acoustics. In fact, ocean acoustics are frequently employed to study whale and porpoise populations as well as to detect boats or submarines. The concept is relatively simple: a wave traveling through a medium will bend, or refract, as its speed in the medium changes. This is also true for ocean waves, the ones you commonly see arriving at the shoreline.
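To make the refraction idea concrete, here is a small sketch using Snell's law together with the shallow-water wave speed, c = sqrt(g h). It assumes straight, parallel depth contours and water that is shallow relative to the wavelength. The function names are hypothetical and chosen for this example, and the full post may well use the complete dispersion relation instead.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def shallow_water_speed(depth_m):
    """Phase speed of a shallow-water wave, c = sqrt(g * h)."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return math.sqrt(G * depth_m)


def refracted_angle(theta_deg, depth_from_m, depth_to_m):
    """Wave angle after moving between two depths, via Snell's law.

    Angles are measured from the normal to the depth contours, and
    sin(theta) / c is conserved along the ray.
    """
    c1 = shallow_water_speed(depth_from_m)
    c2 = shallow_water_speed(depth_to_m)
    sin_theta2 = math.sin(math.radians(theta_deg)) * c2 / c1
    if abs(sin_theta2) > 1.0:
        # The ray turns back before reaching the deeper (faster) water.
        raise ValueError("no transmitted ray for this angle and depth change")
    return math.degrees(math.asin(sin_theta2))


# A wave approaching at 30 degrees in 10 m of water slows down in 2.5 m
# of water and turns toward the contour normal (more shore-normal).
print(refracted_angle(30.0, 10.0, 2.5))
```

Because the speed drops as the water shoals, the angle shrinks, which is why wave crests tend to line up with the beach as they arrive.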

Posted by "Sean C. Crosby" Sunday, October 17, 2021

FEATURED TAGS

data-science oceanography

ABOUT ME

Father, Partner, Cyclist, Runner, Wing-foiler, Data Scientist