IgDiscover itself does not (yet) come with all imaginable analysis facilities built into it.
However, it creates many files (mostly with tables) that can be used for custom analysis.
For example, all .tsv files (in particular candidates.tsv) can be opened and inspected
in a spreadsheet application such as LibreOffice. From there, you can do basic tasks
such as sorting from the menu of that application.
Often, however, these facilities are not enough, and some basic understanding of the command line is helpful. This is clearly not as convenient as working in a graphical user interface (GUI), but we do not currently have the resources to provide one for IgDiscover. To alleviate this somewhat, we provide here instructions for a few things that you may want to do with the IgDiscover result files.
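As a middle ground between a spreadsheet and the shell, the same kind of sorting can be scripted. The following is a minimal sketch using only the Python standard library, not part of IgDiscover itself. The function name sort_tsv_rows is made up for this illustration, and the sketch assumes that every row has a numeric value in the chosen column.

```python
import csv


def sort_tsv_rows(path, column, descending=True):
    """Return the rows of a tab-separated file sorted by a numeric column.

    Each row is returned as a dict that maps column names to text values.
    """
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        rows = list(reader)
    # Cells are read as text, so convert to float to get a numeric sort.
    return sorted(rows, key=lambda row: float(row[column]), reverse=descending)


# Example call (the path and the column name "exact" are assumptions;
# check the header line of your own file for the real column names):
#
#     top = sort_tsv_rows("iteration-01/candidates.tsv", "exact")[:10]
```

Returning plain dictionaries keeps the sketch free of dependencies. For larger tables, a data-frame library is likely to be more comfortable.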
Extract all sequences that match any database gene exactly
The candidates.tsv file tells you for each discovered sequence how often an exact match
of that sequence was found in your input reads. A high number of exact matches is a good
indication that the candidate is actually a new gene or allele. To find the original
reads that correspond to those matches, run the following command in the analysis directory,
replacing iteration-01 with the directory in which the filtered.tsv.gz file is located:
igdiscover run iteration-01/exact.tsv
This command will extract all rows from
iteration-01/filtered.tsv.gz for which the
V_errors column is zero.
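If you want to apply the same filter outside of the IgDiscover pipeline, for example with a different column, the step can be approximated in a few lines of Python. This is only a sketch of the described behavior (keep rows whose V_errors value is zero), not IgDiscover's own implementation. The function name extract_exact_rows is hypothetical, and the sketch assumes the column holds integers or empty cells.

```python
import csv
import gzip


def extract_exact_rows(filtered_path, output_path, column="V_errors"):
    """Copy rows with a zero in `column` from a gzipped TSV to a new TSV.

    Returns the number of data rows written (the header is not counted).
    """
    written = 0
    with gzip.open(filtered_path, "rt", newline="") as src, \
            open(output_path, "w", newline="") as dst:
        reader = csv.DictReader(src, delimiter="\t")
        writer = csv.DictWriter(
            dst,
            fieldnames=reader.fieldnames,
            delimiter="\t",
            lineterminator="\n",
        )
        writer.writeheader()
        for row in reader:
            # Cells are text; an empty cell is not counted as an exact match.
            if row[column] != "" and int(row[column]) == 0:
                writer.writerow(row)
                written += 1
    return written


# Example call (paths follow the text above; adjust the directory name):
#
#     extract_exact_rows("iteration-01/filtered.tsv.gz", "exact-copy.tsv")
```

The rows are streamed one at a time, so memory use stays low even when the filtered table is large.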
Extra configuration settings
Some configuration settings are not documented in the default igdiscover.yaml
configuration file since they rarely need to be changed.
# Leave empty or choose a species name supported by IgBLAST:
# human, mouse, rabbit, rat, rhesus_monkey
# This setting is not used anywhere except that it is passed
# to IgBLAST using the -organism option. Since we provide IgBLAST
# with our own gene databases, it seems this has no effect.
species:

# Which program to use for computing multiple alignments. This is used for
# computing consensus sequences.
# Choose 'mafft', 'clustalo', 'muscle' or 'muscle-fast'.
# 'muscle-fast' runs muscle with parameters "-maxiters 1 -diags".
#
#multialign_program: muscle-fast