A mixed-integer linear programming approach to the reduction of genome-scale metabolic networks

In computational systems biology, genome-scale metabolic network reconstructions are used to build in silico models of cellular metabolism [1]. To analyze these models, a large variety of constraint-based methods has been developed over the years [2].

Typically, the metabolic network is assumed to be in steady-state, i.e., the production and consumption of the internal metabolites has to be balanced. This leads to a flux space of the form (C = { vin mathbb {R}^{text {Rxn}} mid Sv = 0, ; l leq v leq u}). Here (Sin mathbb {R}^{text {Met} times text {Rxn} }) denotes the stoichiometric matrix, given a set of (internal) metabolites Met and a set of reactions Rxn. The vectors v?C are called (feasible) flux vectors and can be interpreted as steady-state flux distributions of the metabolic network. The vectors (l,u in mathbb {R}^{text {Rxn}}_{pm infty }) define lower and upper bounds on the fluxes, where (mathbb {R}_{pm infty } := mathbb {R}cup {pm infty }). By Irrev?Rxn we denote the set of irreversible reactions, which can carry flux in only one direction, i.e., v
i
?0, for all i?Irrev. For simplicity, we assume l
i
?0, for all i?Irrev. Reactions in Rev=Rxn?Irrev are called reversible.

Some constraint-based analysis methods can be applied to genome-scale network reconstructions with several thousands of reactions. Others are limited to small or medium-sized models, like the computation of elementary flux modes [3] or minimal cut sets [4]. In such situations, a natural question is whether it is possible to reduce the given large network to a meaningful smaller one of practical size.

In 2015, Erdrich et al. [5] introduced a method called NetworkReducer, which reduces large metabolic networks to smaller subnetworks, while preserving relevant biological properties of interest. The algorithm in [5] is divided into two parts: network pruning and network compressing. In the compressing step, reactions belonging to the same enzyme subset [6] are lumped together. In the pruning step removable and non-removable reactions are identified such that the reduced network consisting of the non-removable reactions fulfills four requirements, which can be specified by the user:

  1. 1.

    Set of protected metabolites
    P
    Met: all metabolites in P
    Met must be retained in the reduced network.

  2. 2.

    Set of protected reactions
    P
    Rxns: all reactions in P
    Rxns must be retained in the reduced network.

  3. 3.

    Set (mathcal {F}) of protected functionalities (or phenotypes) for the reduced network. We assume that any protected functionality (f in mathcal {F}) can be described by a corresponding system of linear inequalities: D
    f
    v?d
    f
    .

  4. 4.

    Minimum degrees of freedom: dof?dof
    min. Here, the degrees of freedom dof correspond to the dimension of the null space of the stoichiometric matrix S, i.e., dof=|Rxn|?rank(S).

The overall goal of NetworkReducer is to find a smaller subnetwork such that all requirements (1) – (4) can be satisfied by a suitable flux vector. An example is given in Fig. 1.

https://static-content.springer.com/image/art%3A10.1186%2Fs12859-016-1412-z/MediaObjects/12859_2016_1412_Fig1_HTML.gif
Fig. 1

Solid arcs correspond to active reactions, dotted arcs to inactive reactions. In a, the flux vector satisfies the functionality of carrying flux through the biomass reaction while having oxygen uptake. In b, the functionality is carrying flux through the biomass reaction while there is no oxygen uptake. Combining the two flux vectors leads to the network in c, which contains 7 active reactions. A minimum subnetwork enabling both functionalities with only 6 reactions is given in (d). The corresponding binary variables for 1d would have the following values: a
1=1,a
2=1,a
3=1,a
4=1,a
5=1,a
6=0,a
7=0,a
8=1, where a
i
corresponds to reaction r
i

The method of Erdrich et al. [5] searches for a suitable subnetwork by iterating over the reactions. In every iteration, the flux rate through one particular reaction is set to zero and a linear program (LP) is solved to check if the remaining reactions still form a feasible subnetwork. Feasibility means that there exists non-zero flux vectors satisfying the steady-state condition and the other requirements. To identify the reaction to be eliminated a flux variability analysis (FVA) [7] is done and a reaction with smallest overall flux range is selected. Thus in every iteration, an LP is solved and an FVA is performed. Each FVA involves solving up to 2n LPs, where n is the number of reactions.

An important aspect of the method in [5] is that it does not necessarily compute a minimum subnetwork (with respect to the number of active reactions), see Fig. 2 for an example. The method that we develop here will always find a feasible subnetwork with a minimum number of active reactions. A subnetwork satisfying the requirements (1) – (3) can be obtained by solving only one mixed-integer linear program (MILP). If this subnetwork does not fulfill the dof-requirement (4), we exclude this subnetwork and compute a new subnetwork by solving the MILP again. This method turns out to be much faster than the algorithm introduced in [5]. More importantly, we are guaranteed to obtain a minimum subnetwork regarding the number of active reactions, which is not the case for NetworkReducer. However, due to the minimality condition, our method cannot preserve flux variability in the same way as NetworkReducer does.

https://static-content.springer.com/image/art%3A10.1186%2Fs12859-016-1412-z/MediaObjects/12859_2016_1412_Fig2_HTML.gif
Fig. 2

If in the first step of the pruning procedure the flux through reaction 1 is set to zero, reaction 1 is removable and reactions 2 and 3 are non-removable. If in the first step reaction 2 or 3 is set to zero, both of them would be removable and reaction 1 would be non-removable. The resulting subnetwork is then smaller than the first one

A second related work is the FASTCORE algorithm of N. Vlassis et al. [8]. This method is also based on solving several LPs but without performing an FVA in between. Thus it is a very fast approach. However, the resulting subnetworks are not minimum and only protected reactions can be specified, but no protected metabolites, functionalities, or degrees of freedom.

An early approach for network reduction was introduced by Burgard et al. [9] already in 2001 and later improved in 2014 by Jonnalagadda and Srinivasan [10]. This method also allows computing minimum subnetworks using an MILP approach. However, only one functionality can be formulated and not several ones like in NetworkReducer. Other related work can be found in [1118].

Altogether, our method can be seen as a network reduction algorithm that merges features from NetworkReducer and the method in [9], such that we can specify biological requirements like in [5] and compute all minimum subnetworks using an MILP, similar to [9].

The organization of this paper is as follows. In the Methods section we develop the underlying MILP methods. We start with the basic algorithm and then describe several improvements. In the Results and discussion section we compare our MILP approach with the existing methods NetworkReducer and FASTCORE. Furthermore, we apply it to a collection of genome-scale network reconstructions and discuss the results. The last section is Conclusion.

A software tool implementing the algorithms described in this paper is available at https://sourceforge.net/projects/minimalnetwork/.