Vertebrate development from an egg to a complex multicellular organism is accompanied by multiple phases of genome-scale changes in the repertoire of proteins and their post-translational modifications. While much has been learned at the RNA level, we know less about changes at the protein level. In this paper, we present a deep analysis of changes of ∼15,000 proteins and ∼11,500 phospho-sites at 11 developmental time points in Xenopus laevis embryos ranging from the stage VI oocyte to the juvenile tadpole. We find that the most dramatic changes to the proteome occur during the transition to functional organ systems, which occurs as the embryo becomes a tadpole. At that time, the absolute amount of non-yolk protein increases two-fold, and there is a shift in the balance of expression from proteins regulating gene expression to receptors, ligands, and proteins involved in cell-cell and cell-environment interactions. Between the early and late tadpole, the median increase for membrane and secreted proteins is substantially higher than that of nuclear proteins. To begin to appreciate changes at the post-translational level, we have measured quantitative phospho-proteomic data across the same developmental stages. In contrast to the significant protein changes that are concentrated at the end of the time series, the most significant phosphorylation changes are concentrated in the very early stages of development. A clear exception is the phosphorylation of proteins involved in gene expression: these phosphorylations increase just after fertilization, with patterns that are highly correlated with the underlying protein changes. To facilitate the interpretation of this unique phospho-proteome data set, we created a pipeline for identifying homologous human phosphorylations from the measured Xenopus phospho-proteome.
Collectively, our data reveal multiple coordinated transitions in protein and phosphorylation profiles, reflecting distinct developmental strategies and providing an extensive resource to further explore developmental biology at the proteomic and phospho-proteomic levels.
Fig. 1. Temporal Profiling of the Xenopus laevis Proteome and Phospho-proteome Across Development. A. The eleven Xenopus laevis developmental stages collected for proteomic and phospho-proteomic measurement spaced according to Nieuwkoop and Faber times for temperatures between 22 and 24 °C and mapped to broad developmental classifications. (The Stage 41 illustration is actually Stage 40 for convenience. See Methods for Illustration information.) B. (i) 14,892 proteins were quantified; 61.4 % measured in all three replicates. (ii) 11,422 phospho-forms were quantified; 14.3 % were measured in all three replicates. C. Representative Examples: (i) SF3B1 protein and (ii) SF3B1 phospho-form T-429 are measured in all three replicates. SF3B1 replicate Pearson correlation coefficients (0.85 for protein and 0.87 for phospho-form) align with the median values for all measurements (Median Pearson of 0.90 for 14.9k proteins and 0.87 for 11.4k phospho-forms respectively; Fig. 2C).
Fig. 2. Distinct Protein Expression Clusters Correspond to Major Developmental Transitions. A. Twelve k-means clusters of relative protein trends were merged into eight summary trends ordered by the time that the first dynamic event in the trend occurs. The median trend of the cluster is the colored line; the ten to ninety percentile region is shown with shading. The pie chart shows the fractions of total proteins in each cluster. B. Over-representation analysis of biological process GO gene sets shows that, as development progresses, dynamic clusters are associated with proliferation, patterning and differentiation, organogenesis, and tissue function. For each GO protein list shown in black, the p-value for the over-representation of those proteins in the gene set relative to all proteins is shown with a circle, whose area is proportional to the -log10 p-value (hypergeometric) and has the appropriate cluster color. All GO protein lists are overrepresented in at least one cluster with FDR≤5 %. (See Spreadsheet S4 for the entire set of over-represented sets). C. (i) Median trend relative protein levels consistent with the transitions in the mitotic cell cycle across development. Securin/PTTG1 is highest in the metaphase arrested egg and decreases following fertilization. CDK4 levels increase following Stage 9 as the embryo enters the period where entry into S phase is under regulation. (See Fig. S4 for individual replicate trends). (ii,iii) Expression of proteins connected to tissue differentiation and organogenesis: for oligodendrocytes (ii) and gut (iii) (See Fig. S4 for individual replicate trends). The plots illustrate the temporal progression through these phases, as supported by the GO overrepresentation analysis. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
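The Fig. 2B legend describes over-representation p-values as hypergeometric and maps them to circle areas proportional to -log10 p. The following is a minimal Python sketch of that kind of calculation, assuming a one-sided (upper-tail) test. The function names, the `scale` parameter, and the example counts are hypothetical and are not taken from the paper's pipeline.

```python
from math import comb, log10


def hypergeom_upper_tail(k, population, annotated, cluster_size):
    """P(X >= k) when drawing `cluster_size` proteins without replacement
    from `population` proteins, of which `annotated` belong to the GO set."""
    upper = min(annotated, cluster_size)
    total = comb(population, cluster_size)
    favorable = sum(
        comb(annotated, i) * comb(population - annotated, cluster_size - i)
        for i in range(k, upper + 1)
    )
    return favorable / total


def circle_area(p_value, scale=1.0):
    """Area proportional to -log10(p); `scale` is an arbitrary plotting factor."""
    return scale * -log10(p_value)


# Hypothetical counts: 20 proteins measured, 5 in the GO set,
# a cluster of 4 proteins containing 2 members of the set.
p = hypergeom_upper_tail(k=2, population=20, annotated=5, cluster_size=4)
area = circle_area(p)
```

In a real analysis the resulting p-values would still need a multiple-testing correction (the legend reports FDR≤5 %), which this sketch does not perform.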
Fig. 3. Gene Expression Regulators Show Pre-Stage 30 Dynamics While Cell Communication Proteins Exhibit Post-Stage 30 Changes. A. Introducing the fraction fold change metric (FFC). The measured time period is split at Stage 30 into “initial development” (blue) and “later development” (red). In order to determine how much of the maximum fold change across the entirety of development occurs during “initial development,” we divide the maximum fold change during initial development by the maximum fold change across development. We illustrate the metric with the two examples from the data – HNRNPU and CD55 – that have very different fraction fold changes (For further visualization see Fig. S5.). B. (i) Violin plots of the median trend FFCs for the six different classes and all proteins (center black line = median; top and bottom black lines = max and min, respectively). The medians of transcription factors (TFs), E3 ubiquitin ligases (E3s), and ligands differ significantly from all proteins for the median trend and all three replicates (p < 0.001; Kruskal Wallis test, Bonferroni critical value for multiple comparisons with post-hoc Dunn's test). Violin plots for the individual replicates are shown in Fig. S6. (ii) Examples of transcription factor relative trends with FFCs equal to one. The data shown is the median trend of the replicates for which the protein is measured. The trend lines are colored according to the clusters in Fig. 2A. Individual replicate trends are shown in Fig. S7. C. (i) Gene Set Enrichment Analysis (GSEA) results for the median trend FFC metric using Biological Process GO sets shows that processes related to gene expression and cell fate differentiation are enriched for FFCs closer to 1 compared to the background of all measured proteins. In contrast, processes related to cell-environment signaling and the immune response are depleted for FFCs closer to 1. 
(ii) Median protein trends for some well-known members of the (left) “adaptation of signaling pathway” and (right) “interleukin-12 production” gene sets. Individual replicates are shown in Fig. S8. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
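The fraction fold change (FFC) defined in Fig. 3A can be illustrated with a short Python sketch. It assumes that "maximum fold change" means the largest value divided by the smallest value within a time window, and that the Stage 30 time point is included in "initial development". Both are assumptions, as are the function names, the `split_index` parameter, and the toy trends.

```python
def max_fold_change(values):
    """Largest abundance divided by smallest abundance in the window."""
    lo, hi = min(values), max(values)
    if lo <= 0:
        raise ValueError("abundances must be positive")
    return hi / lo


def fraction_fold_change(trend, split_index):
    """Maximum fold change up to and including `split_index` (Stage 30),
    divided by the maximum fold change across the whole time series."""
    initial = trend[: split_index + 1]
    return max_fold_change(initial) / max_fold_change(trend)


# Toy trends with six time points; index 3 stands in for Stage 30.
early_changer = [1.0, 2.0, 4.0, 4.0, 4.0, 4.0]  # all change before the split
late_changer = [1.0, 1.0, 1.0, 1.0, 2.0, 4.0]   # all change after the split

ffc_early = fraction_fold_change(early_changer, split_index=3)
ffc_late = fraction_fold_change(late_changer, split_index=3)
```

Under these assumptions an FFC of 1 means the full dynamic range is reached before the split, while values near the lower bound indicate that most of the change happens in later development.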
Fig. 4. Two-Fold Proteome Mass Increase from Late Tailbud to Tadpole is Predominantly Driven by Tissue-Specific Proteins. A. Amount of non-yolk protein per egg/embryo (average of ten) at ten developmental stages. The mean of three technical replicates is shown with 95 % confidence intervals. B. The total amount of protein, in picomoles, at the measured developmental stages. (See Methods and Fig. S10 for more details about estimation of moles per protein per timepoint). The dots are the mean of three replicates and the error bars are the standard deviation. Red lines show the developmental stages of embryo drawings. C. Median trend (bars) absolute change of proteins by annotated localization for Stages 30–41 (lighter shade) and Stages 41–48 (darker shade). Open circles are the values for each replicate for each subset. Statistical significance was determined using the Kruskal Wallis test with Bonferroni correction for 20 comparisons (α = 0.0025). D. Absolute protein amount median trends for three different extracellular matrix proteins. Individual replicates are shown in Supplemental Fig. 11A. E. Median trend (bars) absolute change of proteins with the localization classifications additionally stratified by tissue specificity (TS) (not tissue specific: no black lines, tissue specific: black lines; left and right, respectively) for Stages 30–41 (lighter shade) and Stages 41–48 (darker shade). Open circles are the values for each replicate for each subset. Asterisks show the comparisons where the median trend and all three replicates are statistically significantly different, where statistical significance for a comparison is assessed with the Kruskal Wallis test, Bonferroni correction for eighty comparisons (α = 0.000625). F. The absolute amount of protein across development for three tissue specific membrane proteins and tissue specific secreted proteins. Individual replicates are shown in Supplemental Fig. 11B,C.
(For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
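The two Bonferroni thresholds quoted in the Fig. 4 legend (α = 0.0025 for 20 comparisons and α = 0.000625 for eighty comparisons) are both consistent with a family-wise significance level of 0.05, which the legend does not state explicitly and is inferred here. A small Python check of that arithmetic, with a hypothetical function name:

```python
def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison threshold under Bonferroni correction."""
    return family_alpha / n_comparisons


# Family-wise level of 0.05 is an inference from the quoted thresholds.
alpha_panel_c = bonferroni_alpha(0.05, 20)  # legend quotes 0.0025
alpha_panel_e = bonferroni_alpha(0.05, 80)  # legend quotes 0.000625
```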
Fig. 5. Mitotic cell cycle and gene expression related phosphorylation dominate the phospho-proteome. A. Phospho-form dynamics are investigated together with their underlying protein dynamics and without normalization to the protein trend in order to distinguish (i) phospho-forms that are similar to their protein trend but have significant changes in magnitude from (ii) those that are also similar but do not change very much during development. (iii) Other phospho-forms have very different trends than the protein they are measured on.
B. Over 50 % of phospho-forms are relatively flat and the majority of dynamic phospho-forms show most of their increase before the tadpole stages. (i) The protein and phospho-form trends were clustered together as a vector of 22 timepoints. Twelve original (Supplemental Fig. 14) k-means clusters were merged into eight summary clusters. Cluster medians (dark colored lines) and 10 to 90 percentile regions (lighter colored area) are plotted separately for proteins and phospho-forms to aid interpretation. (ii) The fraction of the total phospho-forms that are in each cluster is represented by a pie chart where the colors are the same as the cluster medians. C. Cluster median Pearson correlation between phospho-form and protein (PP Corr) and phospho-form maximum fold change across development (Phos FC) are shown for all clusters in the same co-cluster colors, with error bars representing the 25–75 % interquartile range. The maximum allowable phospho-form max. fold change across development is 10. GO Biological Process protein sets that are overrepresented for a given cluster with FDR≤5 % are shown in the same cluster color. See Spreadsheet S5 for the complete set of overrepresented sets. D. Cluster B phospho-forms are statistically significantly overrepresented for multiple acidic motifs, and cluster D phospho-forms are statistically significantly overrepresented for multiple proline directed motifs. See Supplemental Table 16 for the complete lists of statistically significantly overrepresented motifs. E. Examples of phosphorylations from three different co-clusters that have a homologous human phosphorylation. The median trend phospho-form (dashed) and protein (solid) lines are shown where the color matches the phospho-form's co-cluster. Individual replicate data is shown in Supplemental Fig. 17.
(For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
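The two summary quantities in Fig. 5C, the Pearson correlation between a phospho-form and its protein (PP Corr) and the phospho-form maximum fold change (Phos FC), can be sketched in Python as follows. The sketch assumes that fold change is the largest value divided by the smallest, and it interprets the "maximum allowable" fold change of 10 as a cap. Both points are assumptions, and the function names and toy trends are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def capped_max_fold_change(values, cap=10.0):
    """Max/min fold change, limited to `cap` (assumed reading of the legend)."""
    return min(max(values) / min(values), cap)


# Toy trends: the phospho-form tracks its protein exactly.
protein = [1.0, 2.0, 3.0, 4.0]
phospho = [2.0, 4.0, 6.0, 8.0]

pp_corr = pearson(phospho, protein)
phos_fc = capped_max_fold_change(phospho)
```

A phospho-form with a high PP Corr and a large Phos FC corresponds to case (i) in Fig. 5A, whereas a high PP Corr with a Phos FC near 1 corresponds to case (ii).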
Fig. 6. Acidic Motif Phosphorylations Correlate Strongly with Protein Trends, Unlike Threonine Phosphorylations. A. Definitions of the motif classifications. B. Fraction of phospho-forms by (left) motif classification and (right) phospho-acceptor residue. 50 % of phospho-forms have proline-directed motifs. The vast majority of phospho-forms have serine phospho-acceptor residues. C. Phospho-forms with acidic motifs are more correlated with their associated protein trend than the set of all phospho-forms, and threonine phospho-acceptor phospho-forms are the least correlated (∗∗ = p < 0.001; Kruskal Wallis test, Bonferroni critical value for multiple comparisons including comparisons of the median values and the three replicates). See Supplemental Fig. 18 for the same visualization for the three replicates separately. D. Examples of phosphorylations (dashed lines) and associated protein (solid lines) with high PP Corr values (acidic motifs, left) and lower PP Corr values (threonine phosphorylations, center and right). The threonine phosphorylations examples increase in phosphorylation level after the midblastula transition (just before Dev. Stage 9). Individual replicate data is shown in Supplemental Fig. 20.