Arabidopsis bioinformatics resources: The current state, challenges, and priorities for the future
Effective research, education, and outreach efforts by the Arabidopsis thaliana community, as well as by other scientific communities that rely on Arabidopsis resources, depend vitally on easily available and publicly‐shared resources. These resources include reference genome sequence data and an ever‐increasing number of diverse data sets and data types. TAIR (The Arabidopsis Information Resource) and Araport (originally named the Arabidopsis Information Portal) are community informatics resources that provide tools, data, and applications to the more than 30,000 researchers worldwide who use either Arabidopsis as a primary system of study or data derived from Arabidopsis in their work. Four years after Araport's establishment, the International Arabidopsis Informatics Consortium (IAIC) held another workshop to evaluate the current status of Arabidopsis informatics and chart a course for future research and development. The workshop focused on several challenges, including the need for reliable and current annotation, community‐defined common standards for data and metadata, and accessible and user‐friendly repositories, tools, and methods for data integration and visualization. Solutions envisioned included: (a) a centralized annotation authority to coalesce annotation from new groups, establish a consistent naming scheme, distribute this format regularly and frequently, and encourage and enforce its adoption; (b) standards for data and metadata formats, which are essential but challenging to apply when comparing across diverse genotypes and in areas with less‐established standards (e.g., phenomics, metabolomics), so community‐established guidelines need to be developed; and (c) a searchable, central repository for analysis and visualization tools, where improved versioning and user access would make tools more accessible. Workshop participants proposed a “one‐stop shop” website, an Arabidopsis “Super‐Portal”, to link tools, data resources, programmatic standards, and best practice descriptions for each data type. Such a portal must have community buy‐in and participation in its establishment and development to encourage adoption.