Conversation
There was a problem hiding this comment.
Thought this file would be alright to add to Git since it is tiny. But I can also just have the user curl it themselves.
Preview at https://biojulia.github.io/BioTutorials/28
| All of the nucleotides in all of the reads have a quality score of `$`, which corresponds to a probability of error of 0.50119. | ||
| More information about how to convert ASCII values to quality scores is available [here](https://people.duke.edu/~ccc14/duke-hts-2018/bioinformatics/quality_scores.html). | ||
| This would be quite poor if we were looking at Illumina data. | ||
| However, because of how PacBio chemistry works, |
There was a problem hiding this comment.
Want to confirm this information/explanation with you
There was a problem hiding this comment.
I've never used PacBio data before!
There was a problem hiding this comment.
Yeah, I think this is right, though maybe just grab an Illumina dataset instead (or something from FormatSpecimens) so as not to need this particular bit. No need to overcomplicate things.
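For reference, the 0.50119 figure can be checked with a few lines of Julia. This is only a minimal sketch, assuming the file uses the Phred+33 encoding; `phred_score` and `error_probability` are names made up for illustration and do not come from any BioJulia package.

```julia
# Hypothetical helpers, assuming Phred+33 encoding (ASCII code minus 33).
phred_score(c::Char; offset::Integer = 33) = Int(c) - offset

# Phred definition: Q = -10 * log10(P), so P = 10^(-Q / 10).
error_probability(q::Real) = 10.0^(-q / 10)

q = phred_score('$')        # '$' is ASCII 36, so Q = 36 - 33 = 3
p = error_probability(q)    # 10^(-0.3)

println("Q = $q, P(error) ≈ $(round(p; digits = 5))")
```

A score of Q = 3 means roughly one call in two is expected to be wrong, which is why the value looks alarming by Illumina standards.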
| The SRR (sample run accession number) is the unique identifier within SRA | ||
| and corresponds to the specific sequencing run. | ||
| In a later tutorial, we will discuss how to download this file in Julia using the SRR. |
There was a problem hiding this comment.
Is there any example code that can be shared on how to do this? Or I can show how this package can be used in a one-liner on the terminal here.
There was a problem hiding this comment.
https://github.com/BioJulia/BioServices.jl is the canonical way.
But also, another useful addition to the cookbook would be showing how to call shell commands from Julia.
| This cookbook will provide a series of "recipes" that will help you get started quickly with BioJulia so you can start doing some bioinformatics! | ||
| We have tutorials for reading in files, performing alignments, and using tools such as BLAST, |
There was a problem hiding this comment.
Though another option would be to bring in FormatSpecimens.jl... maybe not for the very first one.
| ``` | ||
| curl -L --retry 5 --retry-delay 2 \ | ||
| "https://trace.ncbi.nlm.nih.gov/Traces/sra-reads-be/fastq?acc=SRR12147540" \ | ||
| | gzip -c > SRR12147540.fastq.gz |
There was a problem hiding this comment.
Re: command line - this can be

    run(pipeline(
        `curl -L --retry 5 --retry-delay 2 "https://trace.ncbi.nlm.nih.gov/Traces/sra-reads-be/fastq?acc=SRR12147540"`,
        `gzip -c`,
        "SRR12147540.fastq.gz"
    ))

or

    run(pipeline(
        `curl -L --retry 5 --retry-delay 2 "https://trace.ncbi.nlm.nih.gov/Traces/sra-reads-be/fastq?acc=SRR12147540"`;
        stdout=pipeline(`gzip -c`; stdout="SRR12147540.fastq.gz")
    ))
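If the tutorial wants the accession as a variable rather than hard-coded, something along these lines might work. It is a sketch only: `sra_fastq_pipeline` is a hypothetical helper name, and it assumes the same NCBI URL pattern as above continues to work for other accessions. Julia interpolates `$url` into the command as a single argument, so the `?` in the URL needs no shell quoting.

```julia
# Hypothetical helper: builds (but does not run) the download pipeline.
# Assumes the URL pattern from the curl example applies to the given accession.
function sra_fastq_pipeline(accession::AbstractString)
    url = "https://trace.ncbi.nlm.nih.gov/Traces/sra-reads-be/fastq?acc=$(accession)"
    outfile = "$(accession).fastq.gz"
    # $url is passed to curl as one argument; no shell is involved.
    download = `curl -L --retry 5 --retry-delay 2 $url`
    # curl's stdout goes to gzip, and gzip's stdout goes to the output file.
    return pipeline(download, `gzip -c`, outfile)
end

p = sra_fastq_pipeline("SRR12147540")
# run(p)  # uncomment to actually perform the download
```

Building the pipeline separately from `run` keeps the example cheap to inspect, since nothing touches the network or the filesystem until `run(p)` is called.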
First cookbook tutorial, which explains how to read in different bioinformatics file types