wiki

The orbit determination and identification problems

The orbit determination problem

Is it possible to reconstruct the orbit of a celestial body from its observations? Answering this question is the goal of orbit determination. As shown in [Milani and Gronchi, 2010], one method to solve this problem goes back to Carl Friedrich Gauss, and consists in the determination of a preliminary orbit and to the successive application of the least squares method. To compute a solution, we need at least three optical observations of the celestial body.

To obtain the preliminary orbit, the geometry of the observations and the two-body dynamics are used. Let $\boldsymbol{r}$ and $\boldsymbol{\rho}$ be respectively the heliocentric and topocentric position of the observed body. Let $\boldsymbol{q}$ be the heliocentric position of the observer. Because of the cosine rule, we have

\[r^2=\rho^2+q^2+2\rho q\cos{\epsilon},\]

being $\epsilon$ the angle between $\boldsymbol{\rho}$ and $\boldsymbol{q}$. Moreover, because of the two-body dynamics it is possible to write

\[\mathcal{C} \frac{\rho}{q}=\gamma - \frac{q^3}{r^3},\]

where $\mathcal{C} $ and $\gamma$ are real parameters depending on the observations. They are obtained by exploiting the coplanarity condition of the heliocentric position vectors $\boldsymbol{r}_1$, $\boldsymbol{r}_2$ and $\boldsymbol{r}_3$ associated to three observations and by expanding $\boldsymbol{r}_1-\boldsymbol{r}_2$ and $\boldsymbol{r}_3-\boldsymbol{r}_2$ in powers of the time variable $t$ around the central time $t_2$. By combining the two equations, an eight degree polynomial is obtained:

\[P(\rho)=\mathcal{C}^2r^8-q^2(\mathcal{C}^2+2\mathcal{C}\gamma\cos{\epsilon}+\gamma^2)r^6+2q^5(\mathcal{C}\cos{\epsilon}+\gamma)r^3-q^8\]

By solving it and by applying the Gibbs formula, it is possible to obtain the position vector $\boldsymbol{r}$ and the velocity $\dot{\boldsymbol{r}}$, defining a preliminary orbit. Te latter is taken as the initial guess to solve iteratively the orbit determination problem. We describe the procedure below. Let us consider the Cauchy problem:

\[\frac{d\boldsymbol{y}}{dt}=f(\boldsymbol{y},t,\boldsymbol{\mu}), \boldsymbol{y}(t_0)= \boldsymbol{y}_0\]

where $\boldsymbol{y} \in \mathbb{R}^p$ is the orbital state vector. The components of the vector $\boldsymbol{\mu} \in \mathbb{R}^{p’}$ are called dynamical parameters. Let $\boldsymbol{\Phi}_{t_0}^t (\boldsymbol{y}_0,\boldsymbol{\mu})$ be the integral flow, that is the general solution of the differential equation. Two functions can be introduced: the observation function $R(\boldsymbol{y},t,\boldsymbol{\nu})$ and the prediction function $\tilde{r}(t)=R(\boldsymbol{\Phi}_{t_0}^t (\boldsymbol{y}_0,\boldsymbol{\mu}),t,\boldsymbol{\nu})$. The components of the vector $\boldsymbol{\nu} \in \mathbb{R}^{p’’}$ are called kinetic parameters. At each epoch, the prediction and the observation functions define selected orbital parameters, for example the right ascension and the declination of the body. In particular, the prediction function can be exploited to predict the orbital elements at the observations epochs to compare them with the measured data $r_i$. In this way, it is possible to compute the so-called residuals for each of the m observations:

$\boldsymbol{\xi}=(\xi_1,\ldots,\xi_m)$, where $\,\xi_i=r_i-\tilde{r}(t_i)$, $\,i=1,\ldots,m$

Even if $\boldsymbol{y}_0$, $\boldsymbol{\mu}$ and $\boldsymbol{\nu}$ were known with a perfect accuracy and all the computations were perfectly accurate, the residuals would not be null: they are random variables. As a consequence of the least-squares principle, it is possible to approximate the celestial body orbit by finding the set of parameters $\boldsymbol{x} \in \mathbb{R}^N$, $N\leq p+p’+p’’$, subset of $(\boldsymbol{y}_0,\boldsymbol{\mu},\boldsymbol{\nu})$, minimizing the following function, known as target function:

\[Q(\boldsymbol{x})=Q(\boldsymbol{\xi}(\boldsymbol{x}))=\frac{1}{m}\boldsymbol{\xi}^T\boldsymbol{\xi}.\]

To solve this minimization problem, Newton’s method could be applied. Indeed, to search for the minimum points of $Q(\boldsymbol{x})$, we can search for the zeros of the following equation:

\[\frac{\partial Q}{\partial t}=\frac{2}{m}\boldsymbol{\xi}^TB=0\]

with $B=\cfrac{\partial\boldsymbol{\xi}}{\partial \boldsymbol{x}}$.

It is evident that applying Newton’s method implies to compute the Hessian matrix:

\[\frac{\partial^2 Q}{\partial t^2}=\frac{2}{m}(B^TB+\boldsymbol{\xi}^TH),\]

with $H=\cfrac{\partial^2 \boldsymbol{\xi}}{\partial \boldsymbol{x}^2}$. Indeed the Netwon iteration step is

\[\boldsymbol{x}_{k+1}=\boldsymbol{x}_k-(B^TB+\boldsymbol{\xi}^TH)^{-1}B^T\boldsymbol{\xi}.\]

The computational cost is important. This is the reason why, if the residuals can be assumed small, the differential corrections method as an alternative. The iteration becomes

\[\boldsymbol{x}_{k+1}=\boldsymbol{x}_k-(B^TB)^{-1}B^T\boldsymbol{\xi}.\]

The matrix $C=B^TB$ is called normal matrix and it is the inverse of the covariance matrix $\Gamma=C^{-1}$: the value of $\Gamma$ computed at the solution $\boldsymbol{x}^{\ast}$ is a measure of the uncertainty of the solution. Indeed, it has to be underlined that the solution is just a nominal solution, not the real one. Also, a set of parameters $\boldsymbol{x}$ imposing a value of $Q(\boldsymbol{x})$ immediately above its minimum $Q^{\ast}$ is acceptable. We can define a confidence region as follows:

\[Z(\sigma)=\{ \boldsymbol{x} \in \mathbb{R}^N | Q(\boldsymbol{x})\leq Q^{\ast}+\frac{\sigma^2}{m} \}\]

with $\sigma >0$. In particular, it is possible to define a penalty $\Delta Q(\boldsymbol{x})$, such that in the neighborhood of $\boldsymbol{x}^{\ast}$, we have

\[Q(\boldsymbol{x})=Q^{\ast}+\Delta Q(\boldsymbol{x}).\]

By a Taylor expansion, neglecting the third order terms,we obtain

\[\Delta Q = \frac{1}{m} (\boldsymbol{x}-\boldsymbol{x}^{\ast})^TC(\boldsymbol{x}-\boldsymbol{x}^{\ast}).\]

In this approzimation, the confidence region is an ellipsoid

\[Z(\sigma)=\{ \boldsymbol{x} \in \mathbb{R}^N | (\boldsymbol{x}-\boldsymbol{x}^{\ast})^TC(\boldsymbol{x}-\boldsymbol{x}^{\ast}) \leq \sigma^2 \}\]

An alternative to the Gauss method is the Laplace method: the main difference is the way the parameters $\mathcal{C}$ and $\mathcal{\gamma}$ of the dynamical equation are determined. Both these methods have an important limitation: if the sets of observations are too close in time, neither Newton’s method nor the differential corrections converge to a solution. Indeed, it is not possible to get information about the geodetic curvature of the orbits: thus, the orbit itself cannot be computed. It is the case of the so-called Too Short Arcs (TSAs).

A TSA can be described by means of an attributable $A=(\alpha(\bar{t}),\delta(\bar{t}),\dot \alpha(\bar{t}),\dot \delta(\bar{t}))$, where $\alpha(\bar{t})$ and $\delta(\bar{t})$ are the right ascension and the declination corresponding to the mean epoch $\bar{t}=\frac{1}{m}\sum_{i=1}^m t_i$; they are determined by a linear or quadratic fit. The inability to compute a least-squares orbit can be seen as the impossibility to determine $\ddot \alpha$ and $\ddot \delta$ and, thus, to get the information about the topocentric distance $\rho$ and the topocentric velocity $\dot \rho$.

For more details about the Gauss and Laplance methods, refer to [Milani and Gronchi, 2010].

The identification problem

The identification problem consists in determining which detections correspond to the same celestial body. According to the observation database different kind of identification problems arise. They are presented below.

Orbit identification

In the case of two sets of data allow to determine two least-squares orbits we have the classical orbit identification problem. The two orbits are represented by the vectors $\boldsymbol{x}_1$ and $\boldsymbol{x}_2$ and have normal matrices $C_1$ and $C_2$ and covariance matrices $\Gamma_1$ and $\Gamma_2$, at the same epoch. If the two sets of observations belong to the same orbit, at the considered epoch the real orbit will not be characterized neither by $\boldsymbol{x}^{\ast}=\boldsymbol{x}_1$ nor by $\boldsymbol{x}^{\ast}=\boldsymbol{x}_2$. To determine $\boldsymbol{x}^{\ast}$, supposing that $\boldsymbol{x}_1$ and $\boldsymbol{x}_2$ are near, it is possible to exploit a joint target function, $Q(\boldsymbol{x})$, given by the linear combination of the target functions associated to the two original sets of data:

\[m Q(\boldsymbol{x}) = m_1 Q_1(\boldsymbol{x}) + m_2 Q_2(\boldsymbol{x}) = m Q^* + m\Delta Q(\boldsymbol{x}),\]

where

\[\begin{aligned} &m Q^* = m_1 Q_1(\boldsymbol{x}_1) + m_2 Q(\boldsymbol{x}_2),\\[1.75ex] &\begin{aligned} m\Delta Q(\boldsymbol{x}) &= m_1 \Delta Q_1(\boldsymbol{x}) + m_2 \Delta Q_2(\boldsymbol{x})\\ &\simeq \boldsymbol{x}^T (C_1+C_2)\boldsymbol{x}-2\boldsymbol{x}^T(C_1\boldsymbol{x}_1+C_2\boldsymbol{x}_2)+\boldsymbol{x}_1^TC_1\boldsymbol{x}_1+\boldsymbol{x}_2^TC_2\boldsymbol{x}_2. \end{aligned} \end{aligned}\]

Thus, introducing the following notations

\[\begin{gather} C_0 = C_1+C_2,\\[1.5ex] \boldsymbol{x}_0 = C_0^{-1}(C_1\boldsymbol{x}_1+C_2\boldsymbol{x}_2),\\[1.5ex] K = \boldsymbol{x}_1^TC_1\boldsymbol{x}_1+\boldsymbol{x}_2^TC_2\boldsymbol{x}_2-\boldsymbol{x}_0^TC_0\boldsymbol{x}_0, \end{gather}\]

we can write

\[\Delta Q(\boldsymbol{x}) = \frac{1}{m}(\boldsymbol{x}-\boldsymbol{x}_0)^TC_0(\boldsymbol{x}-\boldsymbol{x}_0)+\frac{K}{m}.\]

The minimum point of $\Delta Q(\boldsymbol{x})$ is $\boldsymbol{x}_0$ and its minimum value is the identification penalty $\cfrac{K}{m}$. It can be shown that

$\frac{K}{m}=\frac{1}{m}\Delta \boldsymbol{x}^TC \Delta \boldsymbol{x},$ where $\begin{gather} C = C_2-C_2C_0^{-1}C_2=C_1-C_1C_0^{-1}C_1,\\[1.5ex] \Delta \boldsymbol{x} = \boldsymbol{x}_1-\boldsymbol{x}_2. \end{gather}$

Thus, we have

\[Q(\boldsymbol{x}) \simeq Q^{\ast} + \frac{1}{m}\Delta \boldsymbol{x}^T C \Delta \boldsymbol{x} + \frac{1}{m}(\boldsymbol{x}-\boldsymbol{x}_0)^TC_0(\boldsymbol{x}-\boldsymbol{x}_0).\]

In conclusion, the nominal solution is $\boldsymbol{x}_0 = C_0^{-1}(C_1\boldsymbol{x}_1+C_2\boldsymbol{x}_2)$ and it is characterized by the covariance matrix $\Gamma_0=C_0^{-1}$, which assesses its uncertainty.

Attribution

As already introduced, it could be not possible to determine a least-squares orbit for a given set of collected data. It’s the case of a TSA, to which we can associate just an attributable. If the attributable is compared to a least-squares orbit, the identification problem is known as attribution.

Also, for an attributable $\mathcal{A}$, it is possible to compute a normal matrix and a covariance matrix. To this aim, the attributable can be seen as a map $\boldsymbol{G}$:

\[\mathcal{A} = \boldsymbol{y}(\bar{t}) = (\alpha(\bar{t}),\delta(\bar{t}),\dot \alpha(\bar{t}), \dot \delta(\bar{t})) = \boldsymbol{G}(\boldsymbol{x}).\]

Given the initial condition $\boldsymbol{x}$ at a time $t_0$ with covariance $\Gamma_x$, a prediction function can be defined as:

\[\boldsymbol{F}(\boldsymbol{x},t_0,t)=\boldsymbol{G}\circ \boldsymbol{\Phi}_{t_0}.\]

Then, the covariance and the normal matrices can be determined as follows:

\[\begin{gather} \Gamma_A = \left[\frac{\partial \boldsymbol{F}}{\partial \boldsymbol{x}}\right] \Gamma_x \left[\frac{\partial \boldsymbol{F}}{\partial \boldsymbol{x}}\right]^T,\\[1.5ex] C_A = \Gamma_A^{-1}. \end{gather}\]

Let $\mathcal{A}_p$ be the attributable, predicted at the time $\bar{t}$ and computed from the least-squares orbit. It is characterized by a covariance matrix $\Gamma_p$ and a normal matrix $C_p$. The formulae for a linear attribution are the same as in the orbit identification case:

\[\begin{gather} C_0 = C_A+ C_p,\\[1.5ex] \Gamma_0 = C_0^{-1},\\[1.5ex] x_0 = \Gamma_0[C_A \mathcal{A} + C_p \mathcal{A}_p],\\[1.5ex] K_4 = (\mathcal{A}_p-\mathcal{A})[C_A-C_A\Gamma_0C_A](\mathcal{A}_p-\mathcal{A}). \end{gather}\]

The attribution penalty $\cfrac{K_4}{m}$ is used to filter out the pairs orbit-attributable which cannot belong to the same object.

Linkage

The linkage problem consists in trying to join two TSAs to determine an orbit. Is can be easily understood that this is the most difficult identification problem since it is not possible to compare quantities of the same nature: it is not possible to compute an orbit from each single TSA; the two available TSAs correspond to different times, so the two observations cannot be directly compared. As a consequence, an orbit produced by means of the linkage always needs a confirmation by attributing additional data.

The state of the art concerning the linkage problem is treated in the following sections.