Identification of functional metabolic biomarkers from lung cancer patient serum using PEP technology

Lung cancer is one of the most common malignancies and the leading cause of cancer-related death. In the US alone, more than 210,000 new lung cancer cases are diagnosed and more than 170,000 deaths result from the disease each year. Lung cancer is the 5th leading cause of death worldwide; there were 1.5 million lung cancer deaths in 2010, an increase of 48 % over the preceding 20 years [32]. Symptoms of lung cancer usually do not appear until the disease is already at an advanced, non-curable stage. Most patients present with advanced disease, and 5-year survival rates are poor, ranging from less than 10 % in China to 13–16 % in Europe and the US [32]. Even when symptoms do appear, many people mistake them for other problems, such as an infection or the long-term effects of smoking, which delays diagnosis. Current diagnostic practice for common cancers relies heavily on imaging technologies such as CT scans for lung cancer, mammograms for breast cancer and pelvic ultrasounds for ovarian cancer. While advances in imaging technology have allowed more sensitive detection of small lesions, they have also led to an increase in false positive findings and in the invasive procedures needed to make a definitive diagnosis. In lung cancer, for example, CT scanners have led to the detection of large numbers of small pulmonary nodules. The frequency of detecting noncalcified nodules on a single CT varies from 5 to 60 % in a lung cancer screening population [12]. Given the high probability of false positive findings associated with CT screening, there is a substantial need for additional noninvasive modalities to discriminate between benign and malignant nodules. There are similar challenges in imaging-based screening for other malignancies and a consequent need for complementary diagnostic tests.

Blood-based biomarkers have potential in cancer screening, and their role could extend from general population risk assessment to treatment response evaluation and recurrence monitoring. The rich content of diverse cellular and molecular elements in blood, which provide information about the health status of an individual, makes it an ideal compartment in which to develop noninvasive diagnostics for cancer [42]. However, despite a large literature on biomarkers for common cancers, blood-based diagnostic tests that indicate the presence of cancer at an early stage and predict treatment response have been difficult to develop [24, 47]. Protein markers currently in clinical use, which include CA125 (cancer antigen 125) for ovarian cancer, CA19-9 (carbohydrate antigen 19-9) for pancreatic cancer, CEA (carcinoembryonic antigen) for colon cancer and PSA (prostate specific antigen) for prostate cancer, have limited value for screening owing to low sensitivity and specificity in early stages and an inability to distinguish aggressive from indolent tumors [11]. Other common cancers, notably breast and lung cancer, lack established biomarkers with demonstrated clinical utility in a screening setting. Thus, there is a need for biomarkers with the sensitivity and specificity required for the detection of frequently occurring cancer types [24].

Over the past decade, systems biology, especially proteomics, has been used for the discovery of potential biomarkers in human fluids, including serum [1, 15, 20, 22, 40, 42, 46]. So far, most efforts in proteomics have sought to identify and sequence-annotate the proteome by mass spectrometry analysis of peptides derived through proteolytic processing of the parent proteome [25, 32, 33, 35]. In this manner, thousands of proteins have been identified from human serum (www.serumproteome.org). However, no validated protein biomarker currently exists for use in routine clinical practice for the early detection of lung cancer, its prognosis, or the prediction of treatment response. Proteomic profiling could potentially provide such markers.

One of the challenges in mass spectrometry-based proteomics is to overcome the analytical bias towards the most abundant serum proteins and the complexity of reducing the data to a manageable number of candidate biomarker proteins that can be analyzed in more depth. Currently, proteomics research pays little attention to systematic functional annotation, yet functional annotation is crucial because many proteins have variants and each variant form may contribute its own unique functional activity. It is generally recognized that sequence annotation alone cannot capture this vital information, so new strategies are necessary. To date, reconciling protein identifications with actual enzyme activities or functions has been constrained by limitations in proteome separation and assay technologies. To overcome these inefficiencies in functional annotation, a top-down approach was developed that starts with function and ends with sequence and structural annotation and functional validation. The PEP technology uses a modified two-dimensional gel electrophoresis to separate the proteome without substantially compromising function [19]. The isolated proteins are then electro-eluted from the PEP plate and refolded, and enzyme activities are measured systematically in hundreds to thousands of fractions, depending on the complexity of the proteome. This method thus provides a new functional dimension in which to explore the human serum proteome.

Human serum contains thousands of proteins with abundances spanning eight orders of magnitude [36]. In the past decade, thousands of proteins have been identified from human serum, mainly by mass spectrometry, and many of them are enzymes that support catalytic functions within cells (www.serumproteome.org). However, the majority of proteomic applications use antibody-based detection, 2-D gel electrophoresis followed by protein staining, or quantitative labeled and label-free LC-MS to measure and compare protein abundances. Differential gel electrophoresis (DIGE) is a common method that pairs a disease sample with a control, each stained with a different fluorescent dye, to study relative up- or down-regulation of proteins, and this approach has been applied to the identification of potential drug targets or biomarkers [1, 42]. However, the functional differences between disease and normal proteomes cannot be analyzed with approaches that rely solely on protein abundances, because conformational variations give rise to functions that are not strictly proportional to the abundances of the gene-derived polypeptide products. As a result, there is a distinct advantage in characterizing the functions of a proteome.

Many functional assays are very sensitive, especially those based on fluorescence detection, and can measure enzyme activities at the picogram level [14, 23, 41, 43, 45]. This is significantly more sensitive than the most sensitive gel-based detection methods, such as silver or fluorescence staining, or general mass spectrometry methods. In addition, many proteins are known to have post-translational modifications (PTMs) and splice variants [2]; these different forms of the same gene product may have different functions or enzyme activities and play very different roles in biology. However, important information on the impact of PTMs and splice variation is lost in antibody-, LC-MS- or gel-based analyses because these platforms cannot directly measure the functional features of the proteome. New methods to monitor and compare functional proteomes are therefore desirable.

It is hypothesized that the levels and distributions of certain enzyme functions in serum could produce proteomic features and collective profiles that reflect physiological changes in an individual and can serve as possible biomarkers or diagnostic parameters. To capture such features, a modified 2-D gel electrophoresis (2-DE) system can be used; 2-DE is a powerful tool for separating proteomes based on two orthogonal parameters, isoelectric point (pI) and molecular weight. Thousands of protein spots can be detected on a large-format 2-D gel. However, typical 2-D gel electrophoresis uses a reducing reagent to disrupt protein disulfide bonds and a high concentration of SDS to denature and negatively charge the proteins for the second-dimension separation. In the current PEP technology, no reducing reagent is used, so the disulfide bonds remain intact. Furthermore, a much reduced SDS concentration (0.1 %) is used to charge the proteins before the second dimension, a 20-fold reduction relative to the typical SDS treatment. These modified conditions allow the proteins to become negatively charged but are not harsh enough to destroy the protein tertiary structure. This, in combination with protein refolding and protein protectants in the PEP system, allows the efficient recovery of enzyme activities from the serum proteome after 2-D gel electrophoresis and protein elution [19]. Our initial studies using serum from colon and breast cancer patients and normal serum provided strong evidence for a clear disease signature when the enzyme activities were compared (data not shown). Since most functional proteins or enzymes exist at relatively low levels in human serum and the 2-DE gel has a limited loading capacity, it is important to enrich the low-abundance proteins before 2-DE and PEP analysis.
AlbuVoid™ (Biotech Support Group, Monmouth Junction NJ) has been shown to effectively enrich low-abundance serum proteins while depleting albumin. It was used previously for pre-treatment of human serum in the PEP technology, where more functional features were observed with AlbuVoid™ than without (data not shown). Consequently, we adopted AlbuVoid™ in the workflows of this investigation.
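The idea of comparing enzyme activity profiles between disease and normal serum can be illustrated with a minimal sketch. It assumes that the eluted fractions are arranged as a grid (rows as pI bins, columns as molecular weight bins) and that a log2 ratio with a fixed cutoff is used to flag differing fractions. The function names, the cutoff, and all activity values are hypothetical and chosen only for illustration; none of them is taken from this study.

```python
import math


def log2_fold_changes(disease, control, floor=1e-9):
    """Return per-fraction log2(disease / control) for two equally shaped grids.

    `floor` (an assumed value) guards against division by zero and log of zero.
    """
    if len(disease) != len(control):
        raise ValueError("grids must have the same number of rows")
    result = []
    for d_row, c_row in zip(disease, control):
        if len(d_row) != len(c_row):
            raise ValueError("grids must have the same number of columns")
        result.append(
            [math.log2(max(d, floor) / max(c, floor)) for d, c in zip(d_row, c_row)]
        )
    return result


def differential_fractions(disease, control, threshold=1.0):
    """List (row, column, log2 ratio) for fractions with |log2 ratio| >= threshold."""
    hits = []
    for r, row in enumerate(log2_fold_changes(disease, control)):
        for c, value in enumerate(row):
            if abs(value) >= threshold:
                hits.append((r, c, value))
    return hits


# Hypothetical activity readings (arbitrary units), 2 pI bins x 3 MW bins.
control = [[1.0, 2.0, 0.5], [4.0, 1.0, 1.0]]
disease = [[1.0, 8.0, 0.5], [1.0, 1.0, 1.1]]

hits = differential_fractions(disease, control)
for row, col, value in hits:
    print(f"fraction ({row}, {col}): log2 ratio = {value:+.2f}")
```

With these made-up values, two fractions are flagged: one with fourfold higher activity and one with fourfold lower activity in the disease grid. A real comparison would also need replicate measurements and a statistical test rather than a fixed cutoff.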

The selection of the activity to be monitored is an important choice in any functional proteomic investigation. Previous studies have demonstrated that general redox enzymes can be detected in beef liver extract or mouse cochlear tissue; both NADH- and NADPH-dependent oxidases were detected in a large number of fractions after 2-D gel separation and PEP elution [19]. However, the relative enzyme activities were low because, instead of specific substrates, general enzyme substrates containing all 20 amino acids and several sugars were used at relatively low levels; the purpose of this approach was to maximize the detection of enzyme species that use NADH or NADPH as a cofactor. In the current study, a more specific assay, for hexokinase activity, was used, with only two substrates (glucose and ATP) introduced under optimized conditions [13]. Hexokinase activity was selected for several reasons. Foremost, the product of the hexokinase reaction is the first intermediate of the glycolytic pathway, a pathway often implicated in cancer development [13, 21, 37]. Another reason is that a large number of functional proteins within, and cross-regulating, the glycolytic pathway could potentially be monitored by a broad-spectrum assay such as the one employed, which already contains a low level of endogenous hexokinase activity. The introduction of exogenous protein(s) from the PEP samples could potentially overcome any rate-limiting protein function and enhance the hexokinase activity. As such, this assay may also detect the effect of proteins from other pathways that cross-interact with the glycolytic pathway.
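Because the assay already contains a low level of endogenous hexokinase activity, the effect of a PEP fraction can be expressed relative to that baseline. The sketch below assumes a kinetic readout (signal versus time), takes the least-squares slope as the reaction rate, and reports the ratio of the fraction rate to the blank rate. The function names, time points, and signal values are hypothetical, and this is not presented as the calculation used in this study.

```python
def initial_rate(times, signals):
    """Least-squares slope of signal versus time, used here as the reaction rate."""
    n = len(times)
    if n != len(signals) or n < 2:
        raise ValueError("need at least two matching time/signal points")
    mean_t = sum(times) / n
    mean_s = sum(signals) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, signals))
    return sxy / sxx


def fold_enhancement(times, fraction_signals, blank_signals):
    """Rate with a PEP fraction added, divided by the endogenous (blank) rate."""
    baseline = initial_rate(times, blank_signals)
    if baseline <= 0:
        raise ValueError("blank must show a positive endogenous rate")
    return initial_rate(times, fraction_signals) / baseline


# Hypothetical readings (arbitrary signal units) at 0, 1, 2 and 3 minutes.
times = [0, 1, 2, 3]
blank = [10, 12, 14, 16]      # assay mixture alone (endogenous activity)
fraction = [10, 15, 20, 25]   # assay mixture plus one eluted fraction

enhancement = fold_enhancement(times, fraction, blank)
print(f"fold enhancement over endogenous rate: {enhancement:.2f}")
```

A ratio above 1 would indicate that the added fraction raised the observed rate above the endogenous level; a ratio near 1 would indicate no detectable effect under these assumptions.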