Difference between revisions of "EarthCube"

From SCECpedia

Latest revision as of 00:48, 22 December 2011

EarthCube Updated Combined Capability List – December 16, 2011

Note: Reference numbers for the capabilities refer to the original list of 99 from the Charrette and from the virtual sessions, both available at http://earthcube.ning.com/page/capabilities


A. Dataset and Workflow Discovery

  • A1: Registration of datasets or workflows (with input parameters) into shared collections or global catalogs
  • A2: Upload & publish workflows and data sets
  • A3: Search across multiple granularity levels & disciplines for workflows and datasets (using semantic metadata and provenance information when appropriate)
  • A4: Data subscription services

Summarized from 6, 63, 67, 76, 80, 89, 93, 98, V3, V27, V28

B. Metadata for Workflow and Data Sets

  • B1: Automated provenance tracking and tracking of data updates (versioning)
  • B2: Commenting, annotating, rating, and categorization of workflows, data sets, or models (both automatic and manual)
  • B3: Computational environment provenance stored including code, configurations, input and metadata, with a goal toward reproducibility

Summarized from 11, 14, 21, 22, 27, 43, 47, 68, 69, 70, 73, 74, 82, 86, 88, 95, 97, V1, V6, V9, V26

C. Data Security and Trust

  • C1: Provide single sign-on environment for shared collections spanning administrative domains
  • C2: Flexible access & sharing controls (licensing) of data, models & workflows
  • C3: Issue tracker for problems with data or workflows
  • C4: Protect individual property rights
  • C5: Data trust: ability for users to track data that needs further explanation or is suspect, fault tolerances, automated tools for validation

Summarized from 16, 18, 57, 64, 75, 76, 84, 92, 96, V7, V20

Closely related to items in F. Data Management within Workflows and M. Policy Enforcement Processes

D. Data Access Services

  • D1: Reusable/shared/standard software interfaces for disparate data types
  • D2: Brokering interfaces to manage access to data using disparate storage and all protocols
  • D3: Deliver data in user-requested format and translation between standards
  • D4: Real-time access to data and facilities even in low bandwidth settings
  • D5: Networking and linkage of existing data sets
  • D6: Data curation: Long-term preservation, integrity, authenticity & chain of custody

From capabilities list: 1, 2, 3, 5, 7, 13, 17, 62, 63, 65, 66, 72, V1, V2, V20, V30, V31

Closely related to items in I. Numerical Methods and Software Engineering

E. Workflow Execution Management

  • E1: Manage workflow execution in distributed environment – conditional execution, integration with server-side and compute side, interactive
  • E2: Combine components from multiple existing workflow systems or models

Summarized from 29, 33, 47, 77, 78, 79, 83, 85, 90, 94, V10, V24, V30

F. Data Management within Workflows

  • F1: Caching of intermediate workflow results
  • F2: Automatically propagate uncertainty through workflows
  • F3: Move workflows to data when more sensible
  • F4: Automation of QC/QA procedures where feasible

Summarized from 81, 87, 91, V25

See also C. Data Security and Trust

G. Modeling Standards and Frameworks

  • G1: Joint 4D framework for interdisciplinary models, data & information
  • G2: Community-based/policed repositories, standards, and governance structures for EC-compliant tools and applications that are promulgated
  • G3: A means to discover, publish and reuse computational models within the EarthCube framework
  • G4: The ability to compose and integrate models or to extend models into scientific workflows

Summarized from capabilities: 25, 26, 28, 34, 35, 37, 42

H. Modeling Capabilities within Cloud, Grid, HPC, and Science Portals

  • H1: Web service creation and publishing
  • H2: Intelligent data query, retrieval, download and interaction
  • H3: Interactive gridding/regridding, visualization, and other manipulation and analysis tools
  • H4: Real-time simulation capabilities and flexible access to high performance computing

Summarized from capabilities: 30, 31, 36, 39, 40, 41, 45, 46

I. Numerical Methods and Software Engineering

  • I1: Experimental facilities and data for model validation
  • I2: Automated software building and validation to ensure stable software releases (e.g. the NMI build-test lab)
  • I3: Standard APIs and model description standards that enable the creation of better and more reliable tools
  • I4: Fault tolerance and reliability built into archival and other systems

Summarized from capabilities: 23, 38, V13, V14, V15, V16

Closely related to items in L. Best Practices and N. Continuity, Sustainability, & Evolution

J. Tools to Probe, Validate, Verify, and Visualize Data

  • J1: 4D integration of high-resolution topography scans & geodetic data
  • J2: Integration of geologic data in deep time
  • J3: Fusion tools that support integration, assimilation & regridding
  • J4: Data mining tools and techniques

Summarized from capabilities: 4, 8, 9, 10, 12, 48, 71, 72

K. Broad Participation: Enable Collaboration and Participation from International, Industry, Academic, NGO and other Domain Partners

  • K1: Linkage with NEON, LTER, state geological surveys, and other communities
  • K2: Low barrier to participation and mechanisms to ensure individual/small voices are heard
  • K3: Outreach to encourage new collaborations

Summarized from capabilities: 32, 52, 54, 59, 60, 61, V18, V19, V29, V31

L. Best Practices & Governance Models for the Development of Definitions & Standards

  • L1: Community identification & commission of programs of work that are deemed important
  • L2: Standards for gridding in models/datasets
  • L3: Standards and best practices for formal data publication and citation

Summarized from capabilities: 55, 58, 99, V4, V5, V8

Closely related to items in I. Numerical Methods and Software Engineering

M. Policy Enforcement Processes

  • M1: Archival policies for integrity, authenticity, versioning, provenance
  • M2: Quality assurance policies
  • M3: Role of publishing houses of scientific literature and engagement to drive compliance with community agreed-upon standards
  • M4: Role of funding agencies in driving compliance with community agreed-upon standards
  • M5: Ensure community consensus
  • M6: Consensus-driven decision making

Summarized from 5, 15, 19, 20, 22, 43, 50, 53, 56, V22

Closely related to items in C. Data Security and Trust

N. Continuity, Sustainability, & Evolution

  • N1: Social networking sites to support knowledge sharing between disparate teams of computer scientists and geoscientists
  • N2: Global catalogs/directories for data, software, models, workflows, etc.
  • N3: Linkage with NEON, LTER, state geological surveys, and other communities
  • N4: Fault tolerance and reliability built into archival and other systems
  • N5: Usage tracking for tools and datasets
  • N6: User support services
  • N7: Reward systems for all project roles and contributions
  • N8: Long-term financial sustainability planning

Summarized from capabilities: 6, 23, 44, 49, 50, 51, 54, V11, V12, V17, V21, V23