It is often useful to know the length distribution of your sequences, whether they are newly assembled scaffolds, the chromosomes of a genome whose sizes you want to check, or any other multi-FASTA file. A simple way to get these lengths is to use Biopython.
For example, save this script as seq_length.py:
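#!/usr/bin/python
# Print the ID and length of every sequence in a FASTA file.
import sys
from Bio import SeqIO

# The first command-line argument is the path to the input FASTA file.
fasta_path = sys.argv[1]
for seq_record in SeqIO.parse(fasta_path, "fasta"):
    # One tab-separated line per record: sequence ID, then length.
    output_line = '%s\t%i' % (seq_record.id, len(seq_record))
    print(output_line)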
Then make it executable and run it on your FASTA file:
chmod +x seq_length.py
./seq_length.py input_file.fasta
This prints the ID and length of every sequence in that file, one record per line, separated by a tab.
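If Biopython is not installed, roughly the same table can be produced with plain Python. The following is only a minimal sketch and not part of the script above: the function name `fasta_lengths` and the in-memory example records are made up for illustration, and it assumes a well-formed FASTA input in which every record starts with a `>` header line.

```python
def fasta_lengths(lines):
    """Yield (sequence_id, length) for each record in FASTA-formatted lines."""
    seq_id, length = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if seq_id is not None:
                yield seq_id, length
            # Use the first whitespace-delimited word after '>' as the ID.
            parts = line[1:].split()
            seq_id, length = (parts[0] if parts else ""), 0
        elif seq_id is not None:
            # Sequence data may span several lines; add them up.
            length += len(line)
    # Emit the final record once the input is exhausted.
    if seq_id is not None:
        yield seq_id, length


# Hypothetical example records, held in memory instead of read from a file.
example = [">scaffold_1 demo", "ACGTAC", "GT", ">scaffold_2", "ACG"]
for seq_id, length in fasta_lengths(example):
    print("%s\t%i" % (seq_id, length))
```

To run it on a real file, pass an open file handle instead of the list, for example `fasta_lengths(open("input_file.fasta"))`.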