Tutorials to Help You Get Started
To help KBase users get started with the system,
we have developed a series of tutorials ranging from the basic
software installation how-to to detailed analysis of exemplars for
You may browse the tutorials from the menu on the right-hand side of
the page. They are arranged roughly in order of complexity. Below is
a short outline of what is available; you may want to scan it
first.
- IRIS: a Browser-based Framework for Interacting with KBase
The KBase IRIS interface lets you run all of the KBase
command-line scripts and some useful Unix tools from a web-based
interface without downloading any software. The interface keeps track
of who you are and stores your results and history on our KBase IRIS
server. The beta version of the IRIS interface is at the Interactive KBase.
- Command Line Scripts
- Accessing CS Data
- Some Basic Exercises Using the DOE KBase (ERDB work)
- The DOE KBase includes an integrated database of genomic data called the Central Store (CS). Data can be extracted from the CS through a defined API. The CS can also be accessed via a set of command-line scripts that provide a fairly simple and consistent access mechanism once a few general principles are understood.
- Getting What You Need from the CS Using Command-Line Scripts
- The Central Store (CS) is the KBase integration of the data needed to support the creation and validation of metabolic and regulatory models. It will certainly be used for many other purposes as well, but its creation is being driven by the needs of the modelling community.
- Extending the CS-Commands with Operators
- The CS-API provides a fairly rich set of commands for accessing data from the CS. We believe that, if we supplement this API with just a few routines and then construct the corresponding commands, we will produce an extremely effective technology for serious work.
- Processing Identical Genomes
- The KBase is intended as an environment in which identical genomes, annotated by distinct groups, can reside and be compared. We anticipate that there will soon be thousands of "essentially identical" genomes, due to the rapid increase in sequencing capacity. We will cover "very similar" genomes in other tutorials. This short document comments only on the use of identical genomes.
- Translation to the Abstract Function Vocabulary
- Getting Started
- Getting Started with the KBase (Installing from CPAN)
- To use the KBase command-line scripts discussed in these tutorials, you will need to install the KBase client software on your computer.
- KBase Development in the SEED Environment
- The KBase development environment is still under construction. To accelerate progress, we have created a framework within the SEED environment that allows us to develop and test software, as well as create and maintain tutorials and documentation. At this stage, we have employed this environment to support work on the ID server, the CS, and the documentation. We intend to move to a native KBase development environment once it is well defined. Until then, this document is intended to help KBase developers with access to the SEED environment get started.
- Extracting Data from the CS Using the CS-API: Some Typical Examples
- We will cover the basic tools for accessing data in the CS via the CS-API (as opposed to the more common use of the command-line tools). We will give small test programs along with displayed output, in the hope that users can easily generate what they need by making minor modifications.
- Looking for Features with Identical Sequence that have Different Annotations
- In this short example, we write code that reads in a file in which each line is supposed to contain the ID of a feature of type 'peg' (protein-encoding gene). For each of these input pegs, we gather all features that have exactly the same translation (i.e., protein sequence), gather their annotations, and check whether they are all identical. If not, the set of features and their inconsistent annotations is displayed.
- Comparing the Functional Roles in Two Genomes
- As the KBase begins to be used to support development and maintenance of metabolic models, it becomes important that we be able to rapidly compare the functional roles that are implemented by protein-encoding genes in a pair of genomes. In the most common case, we will be looking for errors in annotations between very close genomes. In that case, most of the discrepancies will reflect errors in gene calling and annotations. In the cases of more distant genomes, it becomes possible to infer metabolic differences from the discrepancies.
- The Issue of Retrieving Functional Coupling Data
- As we proceed to improve annotations of gene function, precision of metabolic models, and estimates of regulatory architecture, we will need access to the clues that support estimates of "functional coupling". In this short example, we allow the user to designate a genome within the set stored in the CS (presumably a prokaryotic genome), and we retrieve data relating to co-occurrence (in close proximity on multiple genomes) and co-expression (when we have data available).
- Mapping Functions Between Sources of Annotations: Take 2
- In another tutorial we outlined an approach to establishing exchangeable annotations based on exemplars. The approach we illustrate here involves a set of Perl programs that invoke KBase command-line tools (rather than using the KBase API directly). In our view this is a convenient paradigm for problems in which performance is not the main concern.
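
The check described under "Looking for Features with Identical Sequence that have Different Annotations" can be illustrated with a minimal sketch. It is written in Python for brevity and is not the tutorial's own code: the records in `FEATURES`, the shortened feature IDs, and the function name `inconsistent_groups` are all hypothetical, and a small in-memory list stands in for data that would really be retrieved from the CS.

```python
from collections import defaultdict

# Hypothetical records standing in for data fetched from the CS.
# Each tuple is (feature ID, protein translation, annotation); all values are made up.
FEATURES = [
    ("peg.1", "MKTAYIAK", "DNA polymerase I"),
    ("peg.2", "MKTAYIAK", "DNA polymerase I"),
    ("peg.3", "MSLLNQVR", "Alanine racemase"),
    ("peg.4", "MSLLNQVR", "hypothetical protein"),
    ("peg.5", "MTTQAPLE", "Thioredoxin"),
]


def inconsistent_groups(input_ids, features):
    """For each input peg, collect all features with the same translation
    and report the group only if its annotations are not all identical."""
    by_id = {fid: (seq, func) for fid, seq, func in features}

    # Index every feature by its exact protein sequence.
    by_seq = defaultdict(list)
    for fid, seq, func in features:
        by_seq[seq].append((fid, func))

    results = {}
    for fid in input_ids:
        if fid not in by_id:
            continue  # skip IDs that are not known features
        seq, _ = by_id[fid]
        group = by_seq[seq]
        # More than one distinct annotation means the group is inconsistent.
        if len({func for _, func in group}) > 1:
            results[fid] = sorted(group)
    return results


if __name__ == "__main__":
    found = inconsistent_groups(["peg.1", "peg.3", "peg.5"], FEATURES)
    for peg, group in found.items():
        print(peg)
        for fid, func in group:
            print(f"  {fid}\t{func}")
```

Indexing by the full translation string keeps the comparison exact, which matches the "exactly the same translation" requirement; in practice the input IDs would be read from a file, one per line, rather than passed as a literal list.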