This includes the discovery of vaccine targets, assessment of variability, and in-depth analysis of immune epitopes. FluKB represents a new generation of databases that integrate data, analytical tools, and analytical workflows, enabling comprehensive analysis and automatic generation of analysis reports.

== 1. Introduction ==

An estimated 250,000–500,000 people die from seasonal influenza infection each year. The economic impact of influenza is immense due to the large number of lost working hours, hospitalizations, further medical complications, and treatment costs. Although vaccines against influenza exist, the rapid mutation of the influenza virus calls for constant surveillance and annual vaccine reformulation [1].

A huge body of sequence data, annotations, and knowledge is available in the literature, online resources, and biological databases such as GenBank [2], UniProt [3], Protein Data Bank [4], EpiFlu Database [5], OpenFlu Database [6], Influenza Research Database (IRD) [7], and the Immune Epitope Database (IEDB) [8]. However, the underlying mechanisms of host/pathogen interaction are still not completely understood. The lack of a universal or broadly neutralizing influenza vaccine can be attributed to, among other factors, the combinatorial complexity of the host immune system and the highly variable nature of viral antigens, which leads to immune escape by emerging influenza variants [9,10]. One approach to overcoming the challenge of immune escape is to raise a T-cell response against class I or class II epitopes that are conserved among viral strains [11,12]. Public databases are a valuable resource for the study and development of broadly protective T-cell vaccines, but our ability to analyze these data lags behind the pace of data accumulation. Numerous computational analysis tools that are useful for vaccine target discovery are available.
They include keyword and text search tools, sequence comparison tools such as the BLAST algorithm [13], multiple sequence alignment tools such as MAFFT [14], MUSCLE [15], and Clustal [16], 3D structure visualization tools [17,18], HLA binding prediction algorithms [19–21], and conservation analysis tools [22,23], among others. The application of these tools in discrete steps can yield valuable information; however, the extraction of higher-level knowledge requires integrating data from multiple databases and employing various analytical tools to answer specific questions. For example, when a new infectious influenza strain emerges (such as H7N9 avian flu [24] or a new seasonal flu), it is desirable to rapidly investigate its similarities and dissimilarities with known sequences, its epidemic or pandemic potential in humans, and how different it is from past vaccine strains, to compare its T- and B-cell epitopes with those of previously circulating strains, and to estimate its immune escape potential. Additionally, for new pandemic strains (such as the 2009 swine flu [25]), it is desirable to establish their origin and identify strains that are useful vaccine candidates. Well-defined workflows enable rapid extraction of such knowledge and automated generation of reports that contain it, a task for which knowledge-based systems have previously been used [26,27].

The need for integration and advanced analysis of available data is rapidly increasing. The integration of multistep analysis of multidimensional data for vaccine analysis and discovery requires the automation of analytical workflows [28]. FluKB is a knowledge-based system that integrates multiple types of influenza data and analytical tools into such workflows to support vaccine target discovery. The datasets in FluKB consist of curated, enriched, and standardized protein sequence data, immunological data from multiple data sources, and a set of modular analysis tools.
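
The kind of multistep question described above (how similar is a new strain to a past vaccine strain, and which known epitopes does it retain?) can be sketched as a small chained workflow. This is an illustrative sketch only and not FluKB's implementation: the sequences, epitope peptides, and function names are hypothetical, the sequences are assumed to be pre-aligned and of equal length, and epitope conservation is reduced to an exact substring match.

```python
from typing import Dict, List, Union

# Toy data: every sequence and peptide here is invented for illustration.
VACCINE_STRAIN = "MKTIIALSYIFCLVFA"
NEW_STRAIN = "MKTIIALSHIFCLVFA"
KNOWN_EPITOPES = ["KTIIALSYI", "IFCLVFA"]


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def conserved_epitopes(sequence: str, epitopes: List[str]) -> List[str]:
    """Return the epitopes that occur unchanged in the given sequence."""
    return [e for e in epitopes if e in sequence]


def run_workflow(
    new_strain: str, vaccine_strain: str, epitopes: List[str]
) -> Dict[str, Union[float, List[str]]]:
    """Chain the individual tools and collect their results in one report."""
    kept = conserved_epitopes(new_strain, epitopes)
    return {
        "identity_to_vaccine": percent_identity(new_strain, vaccine_strain),
        "conserved": kept,
        # Epitopes absent from the new strain hint at possible immune escape.
        "lost": [e for e in epitopes if e not in kept],
    }


if __name__ == "__main__":
    for key, value in run_workflow(NEW_STRAIN, VACCINE_STRAIN, KNOWN_EPITOPES).items():
        print(f"{key}: {value}")
```

Keeping each step as a separate function mirrors the idea of modular tools that can be used on their own or combined into a predefined workflow that emits a single report.
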
The analysis tools infrastructure comprises a library of individual tools along with standard (applicable to multiple pathogens) and specific influenza vaccine target discovery workflows. Furthermore, we developed a standardized nomenclature to enable and speed up data mining using automated workflows. FluKB has a user-friendly web-based interface to access the data, tools, predefined workflows, and workflow reports. The overall architecture of FluKB is shown in Figure 1.

== Figure 1. ==

Overview of the architecture of FluKB. (a) Users can access FluKB through an interactive user interface where they can select specific data and tools or deploy a predefined workflow. (b) Top to bottom: data are collected, cleaned, and enriched. Higher-level knowledge extraction is enabled by.