In addition to oxygen and carbon dioxide, our blood transports many other chemicals. Some of these molecules provide useful indicators of our health; indeed, measuring such biomarkers is a routine part of clinical blood tests. Other molecules, such as hormones and drugs, directly affect health through processes such as the regulation of metabolism and the immune response. Writing in Nature, Bar et al. clarify the factors that shape the chemical composition of human blood.
The origin of most blood-borne molecules, and the reasons their concentrations vary between individuals, are still unclear. The list of possible modifiers is long: for any given molecule, diet, medication, medical condition and history, genetic variation and gut microbes might all play a part. In addition, these factors can interact, as is the case for trimethylamine N-oxide. This molecule promotes atherosclerosis, a narrowing of the arteries, and is produced when microorganisms and their host metabolize certain dietary compounds that are abundant in red meat. For molecules such as this that directly affect health, understanding how their metabolism is regulated might help to generate new clinical treatments.
Bar et al. describe their efforts to determine which factors control the molecules present in blood. This work requires not only the measurement of the many variables that might be involved, but also analytical methods that can capture complexity (such as interactions between variables) while still ensuring that valid predictions can be made for individuals outside the study population.
The authors first characterized in detail blood samples from 491 healthy individuals. They quantified the molecules in serum, the liquid component of blood that remains once the proteins needed for clotting have been removed. Study participants provided detailed health information and answered questionnaires about diet and lifestyle. They also provided stool samples for DNA sequencing, to determine the genetic characteristics of the gut microbes (also known as the microbiome) present.
As the authors acknowledge, this is a small cohort by the standards of genome-wide association studies, which aim to find links between genes and diseases. Nor is the work of Bar et al. the first to link serum molecules with genetic variation or the microbiome [3,4]. However, the authors' analysis of this group of individuals is unique in the number of types of data that were collected systematically to investigate serum composition.
Next, Bar et al. used machine-learning methods to connect factors such as human genetics and microbiome information with the molecules in blood. Through many analyses in which different subsets of the data were omitted, the authors found that diet, the microbiome and clinical variables (such as prescription drug use and blood pressure) were most strongly associated with serum molecules. Although the authors found some genetic associations and confirmed previously reported links between 46 genes and metabolites, they concluded that the effects associated with genetic factors are smaller than those of diet, clinical variables and the microbiome. These different data types are not completely comparable, but the authors' estimates of genetic effects are consistent with previous research, which supports their conclusion that diet and the microbiome have a greater and more widespread influence on serum composition than genetic factors do.
Diet can affect the composition of the microbiome, so it is perhaps unsurprising that diet and microbiome data predict certain molecules with similar accuracy. But Bar and colleagues show that these data types also provide non-overlapping information. For example, dietary information uniquely predicts specific metabolites related to the consumption of citrus fruits, whereas the presence of a microorganism belonging to the Streptomyces family strongly predicts the presence of indoxyl sulfate, a product of the bacterial breakdown of the amino acid tryptophan that has previously been linked to diseases of the kidney and vasculature [5].
To make predictions about the concentrations of molecules present in blood samples, Bar et al. used a machine-learning method called gradient-boosted decision trees, which can capture complex interactions. A decision tree learns simple "if-then" rules to make predictions (Figure 1). The method stacks individual decision trees, gradually improving the overall model by training each new tree specifically to reduce the prediction error of the trees that came before it.
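The boosting idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data values, the function names (`fit_stump`, `fit_boosted`, `predict_boosted`), the number of trees and the learning rate are all hypothetical, and each "tree" is limited to a single split on a single factor. Trees that small cannot capture interactions by themselves; real implementations use deeper trees over many factors.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Fit a one-split 'if-then' rule: pick the threshold with the lowest squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # the largest value would leave the right side empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:  # all xs identical: no split is possible, so predict the mean residual
        average = mean(residuals)
        return xs[0], average, average
    return best[1:]  # (threshold, left_mean, right_mean)


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosted(xs, ys, n_trees=20, learning_rate=0.5):
    """Start from the mean, then repeatedly fit a new stump to the current errors."""
    base = mean(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]  # what the model still gets wrong
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, stumps, learning_rate


def predict_boosted(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mean_squared_error(model, xs, ys):
    return mean((y - predict_boosted(model, x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data: relative abundance of a gut bacterium and serum concentration (a.u.).
abundance = [0.02, 0.05, 0.08, 0.12, 0.20, 0.30]
concentration = [0.1, 0.2, 0.1, 1.8, 2.1, 2.4]

one_tree = fit_boosted(abundance, concentration, n_trees=1)
many_trees = fit_boosted(abundance, concentration, n_trees=20)
print(mean_squared_error(one_tree, abundance, concentration))
print(mean_squared_error(many_trees, abundance, concentration))
```

Each round fits a new rule to the residuals of the current model, so the training error shrinks as trees are added. The learning rate deliberately damps each correction, which is one common guard against fitting noise.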
Figure 1: A method to predict the molecular composition of blood. Bar et al. [1] obtained human blood samples and identified many of the molecules present. The authors also collected information on a range of factors that might affect the molecules found, such as diet and gut microbes. Bar and colleagues used a computational method called gradient-boosted decision trees to predict the molecular composition of an individual's blood. a. In this hypothetical example, the data points show, for each individual, the concentration of a molecule X (in arbitrary units, a.u.) and the relative abundance of a gut bacterium Y. b. The model uses an "if-then" rule to predict (black horizontal lines) the relationship between bacterial abundance and the concentration of X. The prediction in this case is that if the bacterial abundance is greater than 0.1, then the concentration of X is 2, and if the abundance is less than 0.1, then the concentration of X is 0.1. Dotted lines represent the prediction errors. c. The model is then refined by considering other factors, such as whether the person eats red meat (red) or does not (blue). d. After another "if-then" rule that includes this dietary factor, the model generates more accurate predictions with lower errors (red and blue horizontal lines), linking the predicted concentration of X to both dietary and bacterial factors.
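The hypothetical example in the figure can be written out as code. This is only a sketch: the bacteria-only rule uses the values given in the caption (a threshold of 0.1, predicted concentrations of 2 and 0.1), whereas the individual data points and the diet-specific predictions of 2.5 and 1.5 are invented here for illustration.

```python
def predict_from_bacteria(abundance):
    """Panel b: a single 'if-then' rule based on bacterial abundance."""
    return 2.0 if abundance > 0.1 else 0.1


def predict_from_bacteria_and_diet(abundance, eats_red_meat):
    """Panel d: a second 'if-then' rule refines the high-abundance group by diet.

    The values 2.5 and 1.5 are assumptions made for this sketch.
    """
    if abundance > 0.1:
        return 2.5 if eats_red_meat else 1.5
    return 0.1


# Hypothetical individuals: (bacterial abundance, eats red meat?, measured X in a.u.)
people = [
    (0.02, False, 0.1),
    (0.05, True, 0.2),
    (0.08, False, 0.0),
    (0.15, True, 2.4),
    (0.20, False, 1.6),
    (0.30, True, 2.6),
    (0.25, False, 1.4),
]

# Mean absolute prediction error (the dotted lines in the figure) for each model.
error_bacteria_only = sum(
    abs(x - predict_from_bacteria(a)) for a, _, x in people
) / len(people)
error_with_diet = sum(
    abs(x - predict_from_bacteria_and_diet(a, meat)) for a, meat, x in people
) / len(people)

print(error_bacteria_only, error_with_diet)
```

In this made-up data set the second rule lowers the average error because the diet factor explains the spread among individuals with high bacterial abundance.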
Bar and colleagues used a method called feature attribution analysis to interpret these models. This generates specific hypotheses about how individual factors (such as microbes, foods and genetic variants) affect specific predictions (here, the molecular composition of blood). More complex models can be prone to "overfitting", that is, to making incorrect predictions based on noise or irrelevant details. The authors therefore fitted and evaluated their models conservatively but, more importantly, they confirmed many of the predicted links between microorganisms and metabolites in two large independent cohorts. Finally, Bar et al. tested one set of predictions in a smaller study: they identified molecules (cytosine and betaine) associated with the consumption of whole wheat bread, and then showed that individuals randomly assigned to eat this bread had the expected changes in these metabolites.
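To give a feel for what feature attribution means, the sketch below uses permutation importance, a simple technique chosen here for illustration and not necessarily the method the authors used. The idea is to scramble one factor at a time and see how much the model's error grows. The model, the data and the `irrelevant_factor` column are all hypothetical, and a fixed cyclic shift stands in for random shuffling so that the example is deterministic.

```python
def model(abundance, eats_red_meat, irrelevant_factor):
    """Hypothetical fitted model; it never looks at irrelevant_factor."""
    if abundance > 0.1:
        return 2.5 if eats_red_meat else 1.5
    return 0.1


def mean_abs_error(rows, targets):
    return sum(abs(t - model(*row)) for row, t in zip(rows, targets)) / len(rows)


def permutation_importance(rows, targets, column):
    """Increase in error after one column is scrambled across individuals."""
    baseline = mean_abs_error(rows, targets)
    values = [row[column] for row in rows]
    shifted = values[1:] + values[:1]  # cyclic shift: a deterministic stand-in for shuffling
    scrambled = [row[:column] + (v,) + row[column + 1:] for row, v in zip(rows, shifted)]
    return mean_abs_error(scrambled, targets) - baseline


# Hypothetical individuals: (bacterial abundance, eats red meat?, irrelevant factor)
rows = [
    (0.02, False, 38),
    (0.15, True, 42),
    (0.05, True, 40),
    (0.20, False, 44),
    (0.08, False, 39),
    (0.30, True, 41),
]
targets = [0.1, 2.5, 0.1, 1.5, 0.1, 2.5]  # measured concentration of X (a.u.)

importance = {
    name: permutation_importance(rows, targets, column)
    for column, name in enumerate(["bacteria", "diet", "irrelevant"])
}
print(importance)
```

A factor that the model relies on shows a large increase in error when scrambled, whereas a factor it ignores shows none. Scores of this kind are hypotheses about which factors matter, which is why confirmation in independent cohorts and in a randomized intervention carries more weight.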
This research is comprehensive, but there is still plenty of room for future exploration. The authors used the well-validated and standardized Metabolon platform to measure serum metabolites, but this metabolomics approach cannot cover every compound found in blood. Certain types of molecule (such as lipids) might therefore be undersampled compared with others. This might explain why the authors mostly detected associations between metabolites and only one of the two most abundant lineages of gut bacteria [6,7]. Metabolomics can also detect molecules about which nothing is known except their molecular mass. Indeed, the authors report multiple associations with such unidentified metabolites. Although these might point to previously unknown aspects of biology (intriguingly, some of these associations are with the age of the participants), only limited conclusions can be drawn without identifying the metabolites.
The authors' microbiome data provide DNA information on all the genomes present in the stool extracts. However, Bar et al. reduced these data to the level of abundant bacterial species, excluding non-bacterial organisms such as yeasts or protozoa. Limiting the analysis to the species level also obscures the fact that strains of the same bacterial species can differ in their gene content. For example, metabolism of the drug digoxin in the body by the bacterium Eggerthella lenta requires genes that are present in only some strains of E. lenta. Finally, the authors were unable to associate serum metabolites with the specific bacterial enzymes responsible for their production, which would help to connect the observed associations to underlying molecular mechanisms.
These limitations should not detract from the most useful aspects of this article. By making the complete data set available to the research community, Bar and colleagues enable the development of future computational methods that might address some of these limitations, and that might even provide ways to answer new questions. For scientists interested in the mechanisms by which diet, the microbiome and genetics affect our biochemistry and physiology, these data could be a rich and valuable resource.
Koppel, N., Bisanz, JE, Pandelia, M.-E., Turnbaugh, PJ & Balskus, EP eLife 7, e33953 (2018).
Post time: Dec-28-2020