Generation of multistate release–recapture models using a graphic user interface (GUI)

Algorithm development

The purpose of Program Branch is to translate a schematic of a multistate release–recapture study into a maximum likelihood model [27: 38–81] in order to extract survival and movement probabilities of tagged animals. Multiple steps are required to produce an algorithm that is user-friendly, flexible, and capable of generating meaningful results. In general, multistate models are written as a function of detection/capture probabilities and transition parameters that describe the joint probability of an animal making a migration choice and surviving. Afterward, these transition probabilities must be recombined in meaningful ways to estimate survival and route selection. Standard errors need to be calculated for both directly estimated and derived parameter values.

Program Branch begins with a graphical user interface (GUI) that supports point, click, and drag operations on a series of icons placed on a canvas. The GUI allows users to construct schematics of the multistate release–recapture model by indicating paths of movement and detection locations. The user constructs the schematic sequentially, one step at a time, and can erase steps made in error. The software then translates the icons and their adjacency relationships into a format suitable for machine processing. Once the diagram has been interpreted, every possible path a tagged animal can take through the study is enumerated and stored, along with the model parameters associated with survival, movement, and detection. Using the enumerated pathways, the software codes a product-multinomial likelihood equation from the parameters and the spatial and temporal relationships specified in the diagram. The program also identifies which model parameters are estimable as specified and which are not separable and must be combined. This check is performed both when the user first creates the model and again after the user has input the data, because sparse data may require additional re-parameterization of the model. Numerical optimization is then used to calculate the maximum likelihood estimates of the model parameters and the associated variance–covariance matrix. Finally, after estimating the user-specified model parameters, the program can summarize survival or movement processes across the entire system being modeled.

The diagram of the multistate model is translated, using digraph theory [28], into the information needed to code the maximum likelihood model. A digraph, or directed graph, consists of a set of vertices and a set of arcs. In release–recapture studies, the vertices are the survival, capture, and redirection events (“forks” or “branches”) depicted in the study schematic. Arcs are the sequential transitions from one vertex to another. An adjacency matrix is a useful way of conveying the model structure.

In the adjacency matrix, a value of 1 indicates that two vertices (events) are adjacent in the direction of movement (Fig. 1). The order of the rows and columns is unimportant. In multistate models, the vertices are named for the types of model parameters that represent survival (S), branching (\(\gamma\)), and detection (P) events. The pathways through the multistate model can then be traced through the set of arcs in the matrix (Fig. 1). The illustrated branching model consists of a release of tagged animals (R), 9 survival parameters (S), 2 branching parameters (\(\gamma_{1}\), \(\gamma_{2}\)), and 7 detection probabilities (P) (Fig. 1). The corresponding 19 × 19 adjacency matrix records which events are sequential and, through them, the three pathways through the study. Note that column R of the matrix has no entry of 1. This is because R is the release point: arcs lead out of it, but none lead into it.

https://static-content.springer.com/image/art%3A10.1186%2Fs40317-016-0115-6/MediaObjects/40317_2016_115_Fig1_HTML.gif
Fig. 1

Digraph representation of a multistate model with associated table that depicts the adjacency matrix representation of the digraph. R denotes release location of tagged animals; S survival; \(\gamma\) choice of movement direction; and P detection probability
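The adjacency-matrix representation and the tracing of pathways through it can be illustrated with a minimal Python sketch. It is not the code used in Program Branch. The small study layout, the vertex names (`S1`, `P1`, `G1` for a fork, and so on), and the function names `adjacency_matrix` and `enumerate_paths` are all hypothetical, and the digraph is assumed to contain no cycles.

```python
def adjacency_matrix(vertices, arcs):
    """Return A with A[r][c] == 1 when an arc leads from vertices[r] to vertices[c]."""
    index = {name: k for k, name in enumerate(vertices)}
    matrix = [[0] * len(vertices) for _ in vertices]
    for tail, head in arcs:
        matrix[index[tail]][index[head]] = 1
    return matrix


def enumerate_paths(vertices, matrix, start):
    """List every path from `start` to a vertex with no outgoing arc (acyclic graphs only)."""
    index = {name: k for k, name in enumerate(vertices)}
    paths = []

    def walk(path):
        row = matrix[index[path[-1]]]
        successors = [vertices[c] for c, flag in enumerate(row) if flag]
        if not successors:          # end of a branch: store the completed pathway
            paths.append(path)
        for nxt in successors:
            walk(path + [nxt])

    walk([start])
    return paths


# Hypothetical study: one reach and detection site, then a fork (G1) into two branches.
vertices = ["R", "S1", "P1", "G1", "S2", "P2", "S3", "P3"]
arcs = [("R", "S1"), ("S1", "P1"), ("P1", "G1"),
        ("G1", "S2"), ("S2", "P2"), ("G1", "S3"), ("S3", "P3")]

A = adjacency_matrix(vertices, arcs)
paths = enumerate_paths(vertices, A, "R")

# No arc leads into the release point, so its column contains only zeros.
release_column = [row[vertices.index("R")] for row in A]
```

Rows are taken here as the tail of each arc and columns as the head, which matches the observation that column R has no entry of 1.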

One approach is to construct a multinomial likelihood based on all possible migration pathways and recapture opportunities. This process creates a detection history for each fish, which can be written as an n-digit binary number, where n is the number of recapture opportunities in the model, and each digit \(d_{i}\) (\(i = 1, \ldots, n\)) equals 1 if an animal is recaptured at detection array \(i\) and 0 otherwise. A multinomial likelihood equation is then created based on all possible histories.
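As an illustration of the full-history approach, the following Python sketch enumerates every n-digit detection history and its multinomial cell probability. It assumes the simplest possible case, a single migration path with no forks, which is a simplification of the branching models discussed here. The function name `history_probability` and the parameter values are hypothetical.

```python
from itertools import product


def history_probability(history, S, P):
    """Probability of one detection history on a single path without forks.

    S[k] is survival through reach k + 1 and P[k] is detection at site k + 1
    (0-based lists; the release point is site 0).
    """
    n = len(history)
    # chi[k]: probability of never being detected after site k, given alive at site k.
    chi = [1.0] * (n + 1)
    for k in range(n - 1, -1, -1):
        chi[k] = (1.0 - S[k]) + S[k] * (1.0 - P[k]) * chi[k + 1]

    # Site number (1-based) of the last detection; 0 means never detected.
    last = max((k + 1 for k, d in enumerate(history) if d), default=0)

    prob = 1.0
    for k in range(last):
        # The animal must survive each reach up to its last detection.
        prob *= S[k] * (P[k] if history[k] else 1.0 - P[k])
    return prob * chi[last]


S = [0.9, 0.8, 0.7]   # hypothetical reach survivals
P = [0.5, 0.6, 0.4]   # hypothetical detection probabilities

# All 2**n binary histories are the cells of the multinomial distribution.
cells = {h: history_probability(h, S, P) for h in product((0, 1), repeat=len(P))}
total = sum(cells.values())
```

Because the histories are mutually exclusive and exhaustive, the cell probabilities sum to 1, which is a convenient check on any implementation.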

Alternatively, when developing an algorithm that writes a likelihood equation from the adjacency matrix, it is simpler to break the detection histories into single steps, each describing the next detection conditional on the previous one. The product of these conditional likelihoods is equivalent to the unconditional multinomial model based on full capture histories.

Writing a conditional product-multinomial likelihood equation can be accomplished by following a simple set of rules. The adjacency matrix provides a list of parameters along each possible path through the model. The probability of detection at site \(j\) given previous detection at site \(i\) (where \(j > i\) and \(i = 0\) is the release point) is the sum of the probabilities for all possible paths from site \(i\) to site \(j\). Each of these sub-paths is essentially atomic; that is, the probability of reaching \(j\) from \(i\) via each sub-path can be expressed simply as the product of the parameters along the sub-path, with \(\left(1 - P\right)\) substituted for each detection parameter \(P\) to indicate non-detection at intermediate sites, and \(\left(1 - \gamma\right)\) substituted for \(\gamma\) to indicate selection of the alternate branch at a fork. The total probability of detection at \(j\), given detection at \(i\), is the sum of these sub-path products. The conditional multinomial likelihood function is then based on all possible next sites, given detection at site \(i\). The joint likelihood is the product over all sites \(i\).
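These rules can be sketched in Python as a recursive sum over sub-paths. This is an illustration under stated assumptions rather than the Program Branch implementation. The study layout, the parameter values, the dictionary-based encoding of arcs, the convention that vertex names begin with `S`, `P`, or `G`, and the function name `next_detection_probability` are all hypothetical.

```python
def next_detection_probability(arcs, value, start, target):
    """Sum, over all sub-paths from `start` to `target`, of the product of parameters.

    arcs[v] lists (head, branch_factor) pairs. branch_factor is 1.0 except on
    arcs leaving a fork, where it is gamma or (1 - gamma).
    """
    def walk(vertex):
        total = 0.0
        for head, branch_factor in arcs.get(vertex, []):
            if head == target:
                step = value[head]            # detected at the target site
            elif head.startswith("P"):
                step = 1.0 - value[head]      # passed an intermediate site undetected
            elif head.startswith("S"):
                step = value[head]            # survived the reach
            else:
                step = 1.0                    # fork: its factor sits on the outgoing arcs
            onward = 1.0 if head == target else walk(head)
            total += branch_factor * step * onward
        return total

    return walk(start)


gamma = 0.3  # hypothetical probability of taking the first branch at fork G1
arcs = {
    "R": [("S1", 1.0)],
    "S1": [("P1", 1.0)],
    "P1": [("G1", 1.0)],
    "G1": [("S2", gamma), ("S3", 1.0 - gamma)],
    "S2": [("P2", 1.0)],
    "S3": [("P3", 1.0)],
}
value = {"S1": 0.9, "P1": 0.5, "S2": 0.8, "P2": 0.6, "S3": 0.7, "P3": 0.4}

# Conditional multinomial cells given release at R: next detection at P1, P2, or P3.
cells_from_release = {site: next_detection_probability(arcs, value, "R", site)
                      for site in ("P1", "P2", "P3")}
never_detected = 1.0 - sum(cells_from_release.values())
```

Given release at R, the conditional multinomial in this sketch has one cell per possible next detection site plus a final cell for animals never detected again. Repeating the calculation from each detection site and multiplying the resulting multinomials gives the joint likelihood described above.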

In release–recapture studies in general and multistate models in particular, as depicted in Fig. 1 and its associated adjacency matrix, not all parameters are estimable. In the last reach of any branch, the probability of detection (P) is not separable from the probability of survival (S). In reaches with a fork, the probability of survival (S) is not separable from the probability of an animal selecting a direction of movement (\(\gamma\)) (or lack of movement). The computational algorithm must therefore re-parameterize the multistate model parameters based on estimability before values can be calculated. The types of parameters estimated directly by Program Branch are the joint probability of route selection and survival (“transition” probability, \(\phi = \gamma S\)), the probability of detection (\(p\)), and the joint probability of route selection, survival through the final reach, and detection at the final site (\(\lambda = \phi P\)). It is also possible that additional model re-parameterizations may be necessary based on features of the data. Sparse data may require elimination or the combining of parameters for the model to be estimable.
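The non-separability of survival and detection in a final reach can be demonstrated numerically. The Python sketch below assumes a final reach without a fork, so that \(\lambda\) reduces to the product of S and P, and uses a binomial log-likelihood with the constant binomial coefficient omitted. The counts, the parameter values, and the function name `final_reach_loglik` are hypothetical.

```python
import math


def final_reach_loglik(S, P, released, detected):
    """Binomial log-likelihood (up to a constant) for detections at a final site.

    The data inform only the product lam = S * P, so S and P are not separable.
    """
    lam = S * P
    return detected * math.log(lam) + (released - detected) * math.log(1.0 - lam)


released, detected = 100, 45   # hypothetical counts entering the final reach

# Three different (S, P) pairs sharing the same product 0.45.
candidates = [(0.9, 0.5), (0.5, 0.9), (0.6, 0.75)]
logliks = [final_reach_loglik(S, P, released, detected) for S, P in candidates]

# Only the combined parameter has a unique maximum likelihood estimate.
lambda_hat = detected / released
```

All three candidate pairs yield the same log-likelihood, so the data cannot distinguish between them. Only the combined parameter is estimable, which is why a re-parameterization of this kind is needed before optimization.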