ngs_tools.fastq
Submodules
Package Contents
Functions
fastq_to_bam – Convert a FASTQ to an unmapped BAM.
fastqs_to_bam – Convert FASTQs to an unmapped BAM according to an arbitrary function.
fastqs_to_bam_with_chemistry – Convert FASTQs to an unmapped BAM according to the provided ngs_tools.chemistry.Chemistry instance.
- ngs_tools.fastq.fastq_to_bam(fastq_path: str, bam_path: str, name: Optional[str] = None, n_threads: int = 1, show_progress: bool = False) str
Convert a FASTQ to an unmapped BAM.
- Parameters:
fastq_path – Path to the input FASTQ
bam_path – Path to the output BAM
name – Name for this set of reads. Defaults to None. If not provided, a random string is generated by calling shortuuid.uuid(). This value is added as the read group (RG tag) for all the reads in the BAM.
n_threads – Number of threads to use. Defaults to 1.
show_progress – Whether to display a progress bar. Defaults to False.
- Returns:
Path to BAM
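
As an illustration of the call above, here is a minimal sketch. The helper names write_fastq and convert, the file names, and the read group "sample1" are hypothetical and exist only for this example. The library import is kept inside convert so that the FASTQ-writing part runs even where ngs_tools is not installed.

```python
import os
import tempfile


def write_fastq(path, records):
    """Write (name, sequence, qualities) tuples as 4-line FASTQ records."""
    with open(path, "w") as f:
        for name, sequence, qualities in records:
            f.write(f"@{name}\n{sequence}\n+\n{qualities}\n")
    return path


def convert(fastq_path, bam_path):
    """Convert one FASTQ to an unmapped BAM using a fixed read group name."""
    # Imported lazily so the rest of this sketch does not require ngs_tools.
    from ngs_tools.fastq import fastq_to_bam

    # `name` is written as the RG tag of every read; leaving it as None
    # makes the library generate a random name instead.
    return fastq_to_bam(fastq_path, bam_path, name="sample1", n_threads=1)


# Build a tiny two-read FASTQ in a temporary directory.
workdir = tempfile.mkdtemp()
fastq_path = write_fastq(
    os.path.join(workdir, "reads.fastq"),
    [("read1", "ACGT", "IIII"), ("read2", "TTGA", "IIII")],
)
```

With ngs_tools installed, convert(fastq_path, os.path.join(workdir, "reads.bam")) would be expected to return the path of the BAM it wrote.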
- ngs_tools.fastq.fastqs_to_bam(fastq_paths: List[str], parse_func: Callable[[Tuple[Read.Read, Ellipsis], pysam.AlignmentHeader], pysam.AlignedSegment], bam_path: str, name: Optional[str] = None, n_threads: int = 1, show_progress: bool = False) str
Convert FASTQs to an unmapped BAM according to an arbitrary function.
- Parameters:
fastq_paths – List of FASTQ paths.
parse_func – Function that accepts a tuple of ngs_tools.fastq.Read objects (one from each FASTQ) and a pysam.AlignmentHeader object as the second argument, and returns a new pysam.AlignedSegment object to write into the BAM. Note that the second argument must be used for the header argument when initializing the new pysam.AlignedSegment. Whenever this function returns None, the read will not be written to the BAM.
name – Name for this set of reads. Defaults to None. If not provided, a random string is generated by calling shortuuid.uuid(). This value is added as the read group (RG tag) for all the reads in the BAM.
bam_path – Path to the output BAM
n_threads – Number of threads to use. Defaults to 1.
show_progress – Whether to display a progress bar. Defaults to False.
- Returns:
Path to BAM
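
The parse_func contract described above might be sketched as follows. Several things here are assumptions for illustration rather than documented behavior: the two-FASTQ layout (a barcode read followed by a cDNA read), the 16 + 12 base barcode/UMI lengths, the CR/UR tag names, and the name, sequence and qualities attributes on the Read objects. The helper split_barcode_umi is hypothetical.

```python
def split_barcode_umi(sequence, barcode_len=16, umi_len=12):
    """Return (barcode, umi) from the start of a read, or None if too short."""
    if len(sequence) < barcode_len + umi_len:
        return None
    return sequence[:barcode_len], sequence[barcode_len:barcode_len + umi_len]


def parse_func(reads, header):
    """Build one unmapped segment from a (barcode read, cDNA read) pair."""
    # Imported lazily so the pure-Python helper above works without pysam.
    import pysam

    barcode_read, cdna_read = reads
    parts = split_barcode_umi(barcode_read.sequence)
    if parts is None:
        # Returning None means this read is not written to the BAM.
        return None
    barcode, umi = parts

    # The header passed in as the second argument must be used here.
    segment = pysam.AlignedSegment(header)
    segment.query_name = cdna_read.name
    segment.query_sequence = cdna_read.sequence
    segment.query_qualities = pysam.qualitystring_to_array(cdna_read.qualities)
    segment.flag = 4  # 0x4 marks the segment as unmapped
    segment.set_tag("CR", barcode, "Z")
    segment.set_tag("UR", umi, "Z")
    return segment
```

A function of this shape would then be passed as the parse_func argument, with fastq_paths listing the barcode FASTQ first and the cDNA FASTQ second.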
- ngs_tools.fastq.fastqs_to_bam_with_chemistry(fastq_paths: List[str], chemistry: ngs_tools.chemistry.Chemistry, tag_map: Dict[str, Tuple[str, str]], bam_path: str, name: Optional[str] = None, sequence_key: str = 'cdna', n_threads: int = 1, show_progress: bool = False) str
Convert FASTQs to an unmapped BAM according to the provided ngs_tools.chemistry.Chemistry instance. Note that any split features (for example, a barcode that occupies multiple positions) are concatenated.
- Parameters:
fastq_paths – List of FASTQ paths. The order must match that of the chemistry.
chemistry – ngs_tools.chemistry.Chemistry instance to use to parse the reads.
tag_map – Mapping of parser names to their corresponding BAM tags. The keys are the parser names, and the values must be a tuple of (sequence BAM tag, quality BAM tag), where the former is the tag that will be used for the nucleotide sequence, and the latter is the tag that will be used for the quality scores.
bam_path – Path to the output BAM
name – Name for this set of reads. Defaults to None. If not provided, a random string is generated by calling shortuuid.uuid(). This value is added as the read group (RG tag) for all the reads in the BAM.
sequence_key – Parser key to use as the actual alignment sequence. Defaults to cdna.
n_threads – Number of threads to use. Defaults to 1.
show_progress – Whether to display a progress bar. Defaults to False.
- Returns:
Path to BAM
- Raises:
FastqError – If the number of FASTQs provided does not meet the number required for the specified chemistry, if the tag map provides keys that do not exist for the chemistry, or if the tag map contains multiple BAM tags.
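
To make the tag_map shape concrete, the sketch below builds one and checks it before conversion. The helper tag_map_problems is hypothetical and is not part of ngs_tools. The parser names ("cell_barcode", "umi", "cdna") and the CR/CY/UR/UY tags are assumptions for illustration. Treating "multiple BAM tags" as the same tag being used more than once is an interpretation of the error condition above, not something the documentation states.

```python
def tag_map_problems(tag_map, parser_names):
    """List problems with a tag map before handing it to the library."""
    problems = []
    seen = set()
    for key, tags in tag_map.items():
        if key not in parser_names:
            problems.append(f"unknown parser name: {key}")
        if not isinstance(tags, tuple) or len(tags) != 2:
            problems.append(f"{key}: expected (sequence tag, quality tag)")
            continue
        for tag in tags:
            # Assumed reading of the documented error: a tag used twice.
            if tag in seen:
                problems.append(f"duplicate BAM tag: {tag}")
            seen.add(tag)
    return problems


def convert(fastq_paths, chemistry, bam_path, tag_map):
    """Convert FASTQs using an existing Chemistry instance and a tag map."""
    # Imported lazily so the check above runs without ngs_tools installed.
    from ngs_tools.fastq import fastqs_to_bam_with_chemistry

    return fastqs_to_bam_with_chemistry(
        fastq_paths, chemistry, tag_map, bam_path, sequence_key="cdna"
    )


# Assumed parser names for a single-cell chemistry; check the actual instance.
parser_names = {"cell_barcode", "umi", "cdna"}
# Each value is (sequence BAM tag, quality BAM tag).
tag_map = {"cell_barcode": ("CR", "CY"), "umi": ("UR", "UY")}
```

An empty list from tag_map_problems(tag_map, parser_names) only means this local check found nothing; the library performs its own validation and may still raise FastqError.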