baroclinic_instability, on 19 February 2012 - 01:03 PM, said:
Totally curious, you mentioned representativeness, but I was also wondering if issues arise regarding uneven distribution of data (e.g., a lone dropsonde into a Pacific cyclone)? What types of schemes are in place now to deal with extreme differences in data distribution within a given domain?
I don't really think this is as much of an issue, per se, but it does yield an analysis/IC that is of variable quality for different parts of the globe.
How the observational information is used to correct the guess (the short-term model forecast) is highly dependent on the background error covariance information. This is one of the reasons why ensemble methods have become so popular within the DA community: they allow you to estimate, through Monte Carlo-type sampling, where you need to spread observational information out over larger spatial distances or with larger amplitude (in addition to the inherent model-driven multivariate aspects of the correlations). In the case of 4DVAR, this can be somewhat accounted for implicitly through the linearized model that is used as part of the update. There are ways to account for this within 3DVAR, but it's more difficult.
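To make the point about the background error covariance concrete, here is a toy sketch (not any operational scheme) of how a single observation gets spread. It assumes a 1D grid, a direct observation of one grid point, and a Gaussian-shaped covariance. The function names, the length scales and the error variances are all made up for illustration.

```python
import math

def gaussian_covariance(n, length_scale, variance=1.0):
    """Build an n x n covariance with Gaussian-shaped correlations (illustrative only)."""
    return [[variance * math.exp(-0.5 * ((i - j) / length_scale) ** 2)
             for j in range(n)] for i in range(n)]

def single_obs_increment(B, background, obs_index, obs_value, obs_error_var):
    """Analysis increment for one direct observation of grid point obs_index.

    When H picks out a single state element, the update
    x_a = x_b + B H^T (H B H^T + R)^-1 (y - H x_b)
    reduces to one column of B scaled by the innovation.
    """
    innovation = obs_value - background[obs_index]
    denom = B[obs_index][obs_index] + obs_error_var
    return [B[i][obs_index] * innovation / denom for i in range(len(background))]

n = 21
background = [0.0] * n  # flat first guess
# Same observation, two assumed covariance length scales (in grid points).
narrow = single_obs_increment(gaussian_covariance(n, 1.0), background, 10, 1.0, 0.25)
broad = single_obs_increment(gaussian_covariance(n, 4.0), background, 10, 1.0, 0.25)

# Four grid points away from the observation, the broad covariance still
# produces a sizeable correction while the narrow one produces almost none.
print(narrow[14], broad[14])
```

Both cases give the same correction at the observed point. The only thing that changes how far the lone observation reaches is the covariance, which is the quantity an ensemble tries to estimate flow-dependently.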
Ensemble methods are imperfect, however, due to the inherent issues related to sampling a huge space (the background error covariance for a modern-day NWP model is huge: the state space is ~10^7 or greater, and the covariance is that squared, yet we're trying to sample it with only ~50-100 ensemble members). This is why various centers (UKMet, NCEP, CMC, others) are investing in hybrid-type technologies where you combine ensemble-based information with the variational framework (with a full-rank, static background error estimate).
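The sampling problem and the hybrid idea can be seen in a tiny sketch. It assumes three made-up ensemble members in a three-variable state, an identity matrix standing in for the static covariance, and an arbitrary 50/50 weighting. Real hybrid systems apply the blend inside the variational solver rather than by forming matrices like this, so treat it purely as an illustration.

```python
def ensemble_covariance(members):
    """Sample covariance from a list of state vectors (one list per member)."""
    n_members = len(members)
    n_state = len(members[0])
    mean = [sum(m[i] for m in members) / n_members for i in range(n_state)]
    perts = [[m[i] - mean[i] for i in range(n_state)] for m in members]
    return [[sum(p[i] * p[j] for p in perts) / (n_members - 1)
             for j in range(n_state)] for i in range(n_state)]

def blend(static_cov, ens_cov, ens_weight):
    """Weighted sum of a static and an ensemble covariance (weights sum to 1)."""
    n = len(static_cov)
    return [[(1.0 - ens_weight) * static_cov[i][j] + ens_weight * ens_cov[i][j]
             for j in range(n)] for i in range(n)]

def det3(a):
    """Determinant of a 3 x 3 matrix, used here as a quick rank check."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

# Hypothetical 3-member ensemble of a 3-variable state.
members = [[1.0, 2.0, 0.5], [0.0, 1.0, 1.5], [2.0, 0.0, 1.0]]
P_ens = ensemble_covariance(members)

# Assumed static covariance: identity (full rank, no flow dependence).
B_static = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
B_hybrid = blend(B_static, P_ens, ens_weight=0.5)

# N members give at most N-1 independent perturbations, so the ensemble
# covariance is singular here, while the blend is invertible.
print(det3(P_ens), det3(B_hybrid))
```

With 50-100 members against a state of 10^7 or more, the same rank deficiency is far more severe. The static term gives the blend some variance in the directions the ensemble cannot see.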
As an example, there have been projects that utilized ensemble-based methods to construct historical reanalyses from a limited subset of surface pressure data only (and I'm talking full 3D analyses). That's about as extreme as it gets.