To predict the behaviour of a dynamical system using a mathematical model, an accurate estimate of the current state of the system is needed in order to initialize the model. Complete information on the current state is, however, seldom available. The aim of optimal state estimation, known in the geophysical sciences as ‘data assimilation’, is to determine a best estimate of the current state using measured observations of the real system over time, together with the model equations. The problem is commonly formulated in variational terms as a very large nonlinear least-squares optimization problem. The lack of complete data, coupled with errors in the observations and in the model, leads to a highly ill-conditioned inverse problem that is difficult to solve.

To understand the nature of the inverse problem, we examine how different components of the assimilation system influence the conditioning of the optimization problem. First we consider the case where the dynamical equations are assumed to model the real system exactly. We show, against intuition, that with increasingly dense and precise observations, the problem becomes harder to solve accurately. We then extend these results to a 'weak-constraint' form of the problem, where the model equations are assumed not to be exact, but to contain random errors. Two different, but mathematically equivalent, forms of the problem are derived. We investigate the conditioning of these two forms and find, surprisingly, that these have quite different behaviour.