
Common Fund Data Ecosystem Centers

The NIH Common Fund (CF) programs have produced transformative datasets, databases, methods, bioinformatics tools, and workflows that are significantly advancing biomedical research in the United States and worldwide. Currently, CF programs are mostly isolated from one another, yet integrating data across CF programs has the potential for synergistic discoveries. In addition, since CF programs have a time limit of 10 years, sustaining the widely used CF digital resources after the programs expire is critical. To address these challenges, the NIH established the Common Fund Data Ecosystem (CFDE) program, which was recently approved to continue into its second phase. For the second phase of the CFDE, five centers were established.

Map of CFDE Centers

CWIC

The CFDE Cloud Workspace is designed to provide researchers with an accessible and collaborative environment for data analysis. It allows users to import and integrate data with Common Fund datasets while utilizing a wide range of analysis tools, workflows, and pipelines. The Cloud Workspace Implementation Center (CWIC) streamlines deployment by leveraging key partnerships: TACC’s high-performance computing resources, Galaxy’s open-source interface for data analysis, and CloudBank’s tools for simplified cloud access and billing.

This workspace supports both novice and expert users by offering training, outreach, and cost-management tools to optimize resource usage. Users will have access to a variety of tools developed by CFDE, Galaxy, and other partners, with the flexibility to incorporate custom or third-party tools. By providing free storage and compute resources, the Cloud Workspace lowers barriers to entry, enabling researchers to work with large datasets and complex analyses.

By fostering data sharing, collaboration, and ease of access, CWIC accelerates biomedical research and supports a broader scientific community in tackling high-priority challenges. This initiative represents a significant step toward expanding access to advanced computational resources and empowering researchers to drive innovation in the field.

ICC

The CONNECT Integration and Coordination Center (ICC) is dedicated to advancing biomedical research within the Common Fund Data Ecosystem (CFDE) by enhancing efficiency, transparency, and innovation. Led by Prof. Jake Chen at UAB, alongside Profs. Casey Greene, Sean Davis, Peipei Ping, and Wei Wang, the ICC integrates three key cores (Administrative, Evaluation, and Sustainability) to drive its mission forward.

The Administrative Core, led by Prof. Chen, ensures seamless coordination and project management across CFDE entities. Utilizing Agile methodologies and collaboration tools like U-BRITE, it optimizes communication and accelerates scientific progress. The Evaluation Core, led by Profs. Greene and Davis, focuses on continuous quality improvement, developing evaluation metrics and feedback mechanisms to assess and enhance CFDE initiatives. Meanwhile, the Sustainability Core, led by Prof. Ping with support from Prof. Wang, ensures the long-term accessibility and reusability of CF program data through strategic data management practices and repository planning.

By integrating these efforts, the CONNECT ICC fosters collaboration, accelerates biomedical discoveries, and strengthens CFDE’s long-term impact. Through innovative methodologies and expert leadership, it is positioned to drive transformative advancements in biomedical research.

TC

In coordination with other CFDE Centers and funded programs and projects, the Training Center (TC) acts as a central hub that provides a comprehensive approach to supporting current and potential CFDE users on their learning journey. Through community building and engagement activities, it aims to expand the CFDE data user base and to help users work with datasets more confidently and in more complex ways.

The TC will provide training in basic and advanced computational and data analytic skills so that data science learners and users can engage meaningfully with CFDE data and tools in research. It will also increase awareness and attract new users from the bioinformatics, data science, and research communities through a variety of initiatives internal and external to the CFDE, including attendance at conferences and other activities and opportunities.

DRC

The CFDE Data Resource Center (DRC) was tasked with developing two web-based portals: an **Information Portal** to serve information about the CFDE and a **Data Portal** to host harmonized metadata and processed data contributed by participating CF Data Coordination Centers (DCCs) and other sources. By combining the data and information portals, the **CFDE Workbench** is a comprehensive resource where users can collect both information and data from CFDE and CF resources, as well as query disease, gene, drug, and other biological entities across standardized data formats from each CF DCC. The CFDE Workbench consolidates efforts toward making CF program-funded resources harmonized, FAIR, and AI-ready.

To achieve these goals, the DRC team works collaboratively with the other newly established CFDE centers, the participating CFDE DCCs, the CFDE NIH team, and relevant external entities and potential consumers of these three software products. These interactions will take place via face-to-face meetings, virtual working group meetings, one-on-one meetings, Slack, GitHub, project management software, and e-mail exchange. Via these interactions, we will establish standards, workstreams, feedback, and mini projects toward accomplishing the goal of developing a lively and productive Common Fund Data Ecosystem.

The **Data Portal** of the CFDE Workbench catalogs several types of uniformly processed data and metadata files and other digital objects from each participating DCC. The **Information Portal** provides relevant information about each DCC and about overarching consortium activities, including training and outreach events, brief descriptions of CFDE partnership projects, and detailed community-established protocols.

KC

Making NIH Common Fund (CF) datasets FAIR is just the first step in unlocking their potential in the era of big data. Scientific progress depends on accessible knowledge, yet non-computational researchers often struggle to interpret knowledge graphs (KGs) because their logic-based reasoning can overlook scientific context and uncertainty, leading to invalid inferences.

Our CFDE Knowledge Center (KC) will focus on presenting scientifically valid knowledge from CF projects in a KG format aligned with CFDE and external curation efforts. To ensure accuracy, we will emphasize two things: careful knowledge extraction, in which each KG edge is based on primary experimental findings or expert analysis, and thoughtful knowledge presentation, which uses tailored visualizations instead of general graph traversal.

Leveraging our experience from four large-scale NIH-funded projects, we will develop a user-friendly portal that enhances data accessibility and scientific validity, empowering a diverse range of researchers to engage with CF-generated knowledge.