Freqgen¶

Freqgen is a tool to generate coding DNA sequences with specified amino acid usage frequencies or sequence, GC content, codon usage bias, and/or $k$-mer usage bias. To accomplish this, Freqgen uses genetic algorithms to efficiently search the solution space of possible DNA sequences to find ones that most closely match the desired parameters.

Features¶

Supports both CLI and Python module usage
Thoroughly documented with examples
Leverages NumPy for C-optimized number crunching
Can simultaneously match multiple DNA statistics

Installation¶

Simply run:

$ pip install freqgen

Or, to get the latest (but not necessarily stable) development version:

$ pip install git+https://github.com/Lab41/freqgen.git

Five-second CLI tutorial¶

The basic flow of Freqgen can be summarized in three steps:

Generate a new amino acid sequence based on the amino acid usage profile of reference sequences. If you already have a specific amino acid sequence in mind (i.e. for synthetic biology uses), skip this step:
```
$ freqgen aa reference_sequences.fna -o new_sequence.faa -l LENGTH
```
Create a YAML file containing $k$-mer frequencies for the amino acid sequence’s DNA to have:
```
$ freqgen featurize reference_sequences.fna -k INT -o reference_freqs.yaml
```

Generate the DNA sequence coding for the amino acid sequence:

$ freqgen -t reference_freqs.yaml -s new_sequence.faa -v -o optimized.fna

Visualize the results of the optimization (optional):

$ freqgen visualize --target reference_freqs.yaml --optimized optimized.fna

Citation¶

To be determined.

Freqgen

Navigation

Related Topics

Freqgen¶

Features¶

Installation¶

Five-second CLI tutorial¶

Citation¶

Table of Contents¶

Indices and tables¶