Basic Philosophy

Underlying Premises

The way in which the programs that make up the eXintegrator system handle and display microarray data is based on a small number of premises regarding what is useful and what is perhaps somewhat less useful.

  1. The value of looking at the expression of hundreds or thousands of probe sets at a time is generally overestimated. As such these programs do their best to let the user search and select relevant probe sets either by expression patterns or by associated annotation.
  2. Biological relevance is more important than statistical correctness. Most microarray experiments are essentially fishing expeditions in which the researchers look for genes that show some differential expression across some samples. Good statistics allow the researcher to limit the number of alternatives to study further, but it is very rare that the statistics are considered proof of the fact. Since the stats themselves generally do not end up as part of the final proof, they don't actually matter that much as long as there are quick ways to screen interesting from not so interesting genes, a screen of which the limited expression pattern of the typical experiment is only one part, hence....
  3. Context is king. These programs aim to provide the user with context by integrating the display of probe set annotation with probe set expression data as well as providing an 'expression' context by incorporating data from an extended data set.
  4. Raw data is best. It should always be easy to view the raw data from which expression values are calculated.
  5. The more people view the data the more useful it is. Hence this is a client-server system which does its best to make casual analyses of the data easy, painless and as fast as possible. I've always wanted to make people look at the data with fewer preconceptions about what they are looking for. As such I have not made a big effort to provide statistics which ask very well defined questions, but rather ones which provide means of dealing with fuzzier approximations (which doesn't mean there's any fuzzy logic built in; I don't even really know what that means).
  6. Simplicity is a wonderful thing. Ok, in the beginning it was all kind of simple, but things change...
This is not to denigrate other means of doing things, merely to give some idea as to why the programs behave in the manner in which they do.

Basic Operating Mode

The programs allow the user to select and/or order probe sets either by statistical methods or by database lookups. Selecting a set of probe sets only results in an index of these probe sets being loaded by the client application. The expression data for a selected probe set are only loaded from the server when requested by the client. When this happens, the server sends not only the expression data for that probe set, but also the annotation that is available for it. This allows the user to balance the clarity of the expression data against how biologically interesting or useful the gene represented is. The programs could be said to actively promote decisions on the basis of 'gut-feeling' rather than statistical correctness.
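
The load-on-request behaviour described above can be sketched roughly as follows. This is only an illustration, not the actual eXintegrator code: the class names, method names, probe set ids and data values are all made up, and the real client and server talk over a network rather than through direct method calls.

```python
from dataclasses import dataclass


@dataclass
class ProbeSetRecord:
    """Expression values and annotation, delivered together for one probe set."""
    expression: list
    annotation: str


class FakeServer:
    """Hypothetical stand-in for the server; the data are invented for illustration."""

    def __init__(self):
        self._data = {
            101: ProbeSetRecord([1.2, 3.4, 2.2], "hypothetical kinase"),
            102: ProbeSetRecord([0.4, 0.5, 0.3], "hypothetical transcription factor"),
        }
        self.requests = 0  # counts how often expression data were actually sent

    def fetch(self, probe_set_id):
        self.requests += 1
        return self._data[probe_set_id]


class Client:
    """Hypothetical client: a selection is only an index of probe set ids."""

    def __init__(self, server):
        self.server = server
        self.index = []

    def select(self, probe_set_ids):
        # Only the ids are stored; no expression data are loaded here.
        self.index = list(probe_set_ids)

    def view(self, probe_set_id):
        # Expression data and annotation arrive together, and only on request.
        return self.server.fetch(probe_set_id)


server = FakeServer()
client = Client(server)
client.select([101, 102])          # the server has sent no expression data yet
record = client.view(101)          # now one probe set is loaded, with annotation
print(record.annotation, record.expression)
```

The point of the split is that a selection of thousands of probe sets stays cheap, since the data for a probe set only travel when the user actually looks at it.
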
Almost all statistical queries work on the currently selected probe sets rather than on the whole data set available in the database. This makes it easy to select a set of probe sets on the basis of some annotation, and then to order or select a subset of these by some statistical means. In addition it is possible to combine past selections using boolean logic; the combined selection can then serve as the source of the next analysis.
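
As a rough illustration of this workflow, the sketch below treats each selection as a set of probe set ids, combines two selections with boolean operations, and then orders the result by a simple statistic. The ids, the expression values and the choice of variance as the ordering statistic are all assumptions made for the example; they are not taken from the programs themselves.

```python
from statistics import pvariance

# Made-up expression values, keyed by probe set id.
expression = {
    101: [2.0, 2.0, 5.0],
    102: [1.0, 1.0, 1.0],
    103: [0.0, 2.0, 4.0],
    104: [3.0, 3.5, 3.0],
}

# Two past selections, e.g. one from an annotation lookup and one from
# an earlier statistical query (both hypothetical).
annotation_hits = {101, 102, 103}
previous_query = {102, 103, 104}

# Boolean combinations of past selections.
in_both = annotation_hits & previous_query          # AND
in_either = annotation_hits | previous_query        # OR
only_annotation = annotation_hits - previous_query  # AND NOT


def order_by_variance(selection, expression):
    """Order the probe sets of a selection, most variable first.

    Only the probe sets in `selection` are considered, not the whole data set.
    """
    return sorted(selection, key=lambda ps: pvariance(expression[ps]), reverse=True)


# The combined selection serves as the source of the next analysis.
ranked = order_by_variance(in_both, expression)
print(ranked)
```

Because the statistic is only ever computed over the current selection, narrowing by annotation first keeps the following statistical step both fast and focused.
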

To use the programs the users have to be registered with the database, and must log in using their password and username before gaining access to the main window of the client application.