Knowledge Graph Databases and the Cypher Query Language
June 30, 2024

1. When we submit the example query on the Enrichr-KG website, which are the two genes that are linked to the three terms listed below? (GO, MGI Mammalian Phenotype, and KEGG Pathways are the selected libraries)
    Tyrosine metabolism
    Abnormal hepatocyte morphology MP:0000607
    Aromatic amino acid family catabolic process (GO:0009074)
- FAH, GSTZ1
- GYS2, GBE1
- GSTZ1, PGM2
- FAH, GBE1

2. You performed a single gene search for SDF2L1 using Enrichr-KG term and gene search and limited the output size to 10. Which node is not a part of the output network?
- MIR
- SKI II
- Cyclosporin-A
- VEGFA-VEGFR2 Signaling Pathway WP3888

3. When you search for the shortest paths between APOE (gene) and GFAP (gene) using Enrichr-KG term and gene search, there is one node from the GO Biological Process 2021 library. Which one is it?
- regulation of cellular component organization (GO:0051128)
- acetyl-CoA biosynthetic process (GO:0006085)
- calcium ion homeostasis (GO:0055074)
- regulation of wound healing (GO:0061041)

4. To look for birth defects related to the drug terconazole, a researcher performed a single-term search using Reprotox-KG. Based on the search results, which birth defect is related to terconazole? Hint: The relation is coming from DrugEnrichr.
- Trisomy 21
- Spina bifida
- Hypoplastic left heart
- Dandy-Walker malformation

5. You want to study the potential relationship between the drug "ketotifen" and the birth defect "Trisomy 18" with Reprotox-KG. Using only the Drugshot library for finding connections between these entities, which candidate small molecule appears as a possible connection?
- Stridor
- Sugar
- Tea
- Caffeine

6. The table below is the result of node serialization. What is a correct statement about it?
- The ID should be the same as in the original data, such as the ontology ID
- Visualization of the table will produce an error because it includes special characters
- It's OK to have only two columns: IDs and labels
- The table should include source and target columns

7. Which is not true about Neo4j?
- You can use the Cypher query language to search it
- The WHERE statement can't be used because it is not a relational database
- You can search both a label and a property at the same time
- It is not a relational database

8. Which statement best explains why researchers use Neo4j for biological data?
- It helps to see how different entity types are interconnected
- It is less prone to viral infections
- We can use SQL to query the networks it efficiently stores
- It is great for storing large tables of patients and their lab tests

9. What is an advantage of graph databases?
- It stores information in tables
- We can use SQL to query the graph data
- It is a better data lake
- It captures the natural relationships between entities

10. Select the incorrect description of the elements in this example Cypher query.
- (A) specifies the node type
- (B) assigns a property to the node type
- (C) assigns a relation between two nodes
- (D) assigns a positive regulation effect of the gene on the process