<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Computing | Weecology Wiki</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/</link><atom:link href="https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/index.xml" rel="self" type="application/rss+xml"/><description>Computing</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><image><url>https://deploy-preview-83--weecology-wiki.netlify.app/media/icon_hu2654a0fcc87c65a864822ac27b001d3b_700_512x512_fill_lanczos_center_3.png</url><title>Computing</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/</link></image><item><title/><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/open-drone-map/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/open-drone-map/</guid><description>&lt;h1 id="running-open-drone-map-on-hipergator">Running Open Drone Map on HiPerGator&lt;/h1>
&lt;p>Following the instructions in &lt;a href="containers">containers&lt;/a>:&lt;/p>
&lt;h2 id="pull-the-container">Pull the container&lt;/h2>
&lt;p>Here, we&amp;rsquo;ve created a folder (on blue) to store the container image:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">module load aptainer
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">srun -t 6:00:00 apptainer pull /blue/ewhite/your_user_id/odm/odm.sif docker://opendronemap/odm
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="setup-directories">Setup directories&lt;/h2>
&lt;p>ODM needs the working folder &lt;code>code&lt;/code> to exist, with a subfolder called &lt;code>images&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>your_user_id@login12 odm&lt;span class="o">]&lt;/span>$ tree
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">├── code
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ ├── images
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ │ ├── DSC00001.JPG
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">│ │ └── DSC00002.JPG
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">└── run.slurm
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>We&amp;rsquo;ll create &lt;code>run.slurm&lt;/code> next:&lt;/p>
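This layout can be created in one step (a sketch, run from inside the odm folder; the slurm_logs folder is included because the SLURM script below writes its logs there):

```shell
# Create the working folder ODM expects, the images subfolder,
# and the slurm_logs folder the SBATCH output/error paths point at.
mkdir -p code/images slurm_logs
```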
&lt;h2 id="setup-slurm">Setup SLURM&lt;/h2>
&lt;p>We need to bind a working directory where the outputs + images will go. You&amp;rsquo;d think you could just bind to &lt;code>/code&lt;/code>, which is what ODM would like, but that doesn&amp;rsquo;t work because the EntryPoint of the container is &lt;code>/code/run.py&lt;/code>. So we need to set our working directory somewhere else. This is analogous to &lt;code>docker -v&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">&lt;span class="cp">#!/bin/bash
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="cp">&lt;/span>&lt;span class="c1">#SBATCH --job-name=odm-node&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --nodes=1&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --partition=hpg-turin&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --cpus-per-task=16&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --mem=64GB&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --time=12:00:00&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --gpus=1&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --output=./slurm_logs/%A.out&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --error=./slurm_logs/%A.err&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">printenv &lt;span class="p">|&lt;/span> grep -i slurm &lt;span class="p">|&lt;/span> sort
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">srun apptainer run --bind &lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="si">${&lt;/span>&lt;span class="nv">PWD&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">:/project&amp;#34;&lt;/span> /blue/ewhite/your_user_id/odm/odm.sif --project-path /project --max-concurrency &lt;span class="m">16&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Save this as &lt;code>run.slurm&lt;/code> and make sure you&amp;rsquo;ve added any other &lt;code>#SBATCH&lt;/code> arguments, such as the correct account and QOS for your group.&lt;/p>
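On HiPerGator that usually means setting the account and QOS explicitly; a sketch of the extra directives (the group name here is a placeholder, substitute your own):

```bash
#SBATCH --account=ewhite   # your group's allocation (placeholder)
#SBATCH --qos=ewhite       # matching QOS (placeholder)
```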
&lt;h2 id="copy-data">Copy data&lt;/h2>
&lt;p>Copy some drone images and go for a walk while rsync runs:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">rsync -avz --progress /data/everglades/Flight_1/DCIM/ hpg:/home/your_user_id/code/odm/code/images
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>For larger jobs, you should use some scratch storage space on blue. You can delete the images from the working directory afterwards.&lt;/p>
&lt;h2 id="run">Run&lt;/h2>
&lt;p>Launch and check the job was submitted:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">sbatch run.slurm
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">squeuemine
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="check-logs">Check logs&lt;/h2>
&lt;p>Then tail the log while it runs to confirm it&amp;rsquo;s doing something sensible:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>your_user_id@login12 odm&lt;span class="o">]&lt;/span>$ tail -f ./slurm_logs/20526676_4294967294.out
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2025-12-08 21:50:58,774 DEBUG: Found &lt;span class="m">10000&lt;/span> points in 4.388810873031616s
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2025-12-08 21:50:59,008 INFO: Extracting ROOT_DSPSIFT features &lt;span class="k">for&lt;/span> image DSC00077.JPG
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2025-12-08 21:50:59,219 INFO: Extracting ROOT_DSPSIFT features &lt;span class="k">for&lt;/span> image DSC00098.JPG
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2025-12-08 21:51:04,036 DEBUG: Found &lt;span class="m">10000&lt;/span> points in 4.76168966293335s
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2025-12-08 21:51:04,185 DEBUG: Found &lt;span class="m">9406&lt;/span> points in 5.121237516403198s
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2025-12-08 21:51:04,482 INFO: Extracting ROOT_DSPSIFT features &lt;span class="k">for&lt;/span> image DSC00108.JPG
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2025-12-08 21:51:04,600 INFO: Extracting ROOT_DSPSIFT features &lt;span class="k">for&lt;/span> image DSC00003.JPG
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2025-12-08 21:51:08,902 DEBUG: Found &lt;span class="m">10000&lt;/span> points in 4.365730285644531s
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2025-12-08 21:51:09,338 INFO: Extracting ROOT_DSPSIFT features &lt;span class="k">for&lt;/span> image DSC00092.JPG
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2025-12-08 21:51:09,627 DEBUG: Found &lt;span class="m">9079&lt;/span> points in 4.97150993347168s
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">2025-12-08 21:51:10,010 INFO: Extracting ROOT_DSPSIFT features &lt;span class="k">for&lt;/span> image DSC00141.JPG
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">....
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> No more stages to run
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMMMMMNNNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMNNNMMMMMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMdo:..---../sNMMMMMMMMMMMMMMMMMMMMMMMMMMNs/..---..:odMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMy-.odNMMMMMNy/&lt;span class="sb">`&lt;/span>/mMMMMMMMMMMMMMMMMMMMMMMm/&lt;span class="sb">`&lt;/span>/hNMMMMMNdo.-yMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMN/&lt;span class="sb">`&lt;/span>sMMMMMMMMMNNMm/&lt;span class="sb">`&lt;/span>yMMMMMMMMMMMMMMMMMMMMy&lt;span class="sb">`&lt;/span>/mMNNMMMMMMMMNs&lt;span class="sb">`&lt;/span>/MMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MM/ hMMMMMMMMNs.+MMM/ dMMMMMMMMMMMMMMMMMMh +MMM+.sNMMMMMMMMh +MM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MN /MMMMMMNo/./mMMMMN :MMMMMMMMMMMMMMMMMM: NMMMMm/./oNMMMMMM: NM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Mm +MMMMMN+ &lt;span class="sb">`&lt;/span>/MMMMMMM&lt;span class="sb">`&lt;/span>-MMMMMMMMMMMMMMMMMM-&lt;span class="sb">`&lt;/span>MMMMMMM:&lt;span class="sb">`&lt;/span> oNMMMMM+ mM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MM..NMMNs./mNMMMMMMMy sMMMMMMMMMMMMMMMMMMo hMMMMMMMNm/.sNMMN&lt;span class="sb">`&lt;/span>-MM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMd&lt;span class="sb">`&lt;/span>:mMNomMMMMMMMMMy&lt;span class="sb">`&lt;/span>:MMMMMMMNmmmmNMMMMMMN:&lt;span class="sb">`&lt;/span>hMMMMMMMMMdoNMm-&lt;span class="sb">`&lt;/span>dMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMm:.omMMMMMMMMNh/ sdmmho/.&lt;span class="sb">`&lt;/span>..&lt;span class="sb">`&lt;/span>-&lt;span class="sb">``&lt;/span>-/sddh+ /hNMMMMMMMMdo.:mMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMd+--/osss+:-:/&lt;span class="sb">`&lt;/span> &lt;span class="sb">```&lt;/span>:- .ym+ hmo&lt;span class="sb">``&lt;/span>:-&lt;span class="sb">`&lt;/span> &lt;span class="sb">`&lt;/span>+:-:ossso/-:+dMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMNmhysosydmNMo /ds&lt;span class="sb">`&lt;/span>/NMM+ hMMd..dh. sMNmdysosyhmNMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMMMMMMMMMMMMMs .:-:&lt;span class="sb">``&lt;/span>hmmN+ yNmds -:.:&lt;span class="sb">`&lt;/span>-NMMMMMMMMMMMMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMMMMMMMMMMMMN.-mNm- //:::. -:://: +mMd&lt;span class="sb">`&lt;/span>-NMMMMMMMMMMMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMMMMMMMMMMMM+ dMMN -MMNNN+ yNNNMN :MMMs sMMMMMMMMMMMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMMMMMMMMMMMM&lt;span class="sb">`&lt;/span>.mmmy /mmmmm/ smmmmm&lt;span class="sb">``&lt;/span>mmmh :MMMMMMMMMMMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMMMMMMMMMMMM&lt;span class="sb">``&lt;/span>:::- ./////. -:::::&lt;span class="sb">`&lt;/span> :::: -MMMMMMMMMMMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMMMMMMMMMMMM:&lt;span class="sb">`&lt;/span>mNNd /NNNNN+ hNNNNN .NNNy +MMMMMMMMMMMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMMMMMMMMMMMMd&lt;span class="sb">`&lt;/span>/MMM.&lt;span class="sb">`&lt;/span>ys+//. -/+oso +MMN.&lt;span class="sb">`&lt;/span>mMMMMMMMMMMMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMMMMMMMMMMMMMy /o:- &lt;span class="sb">`&lt;/span>oyhd/ shys+ &lt;span class="sb">`&lt;/span>-:s-&lt;span class="sb">`&lt;/span>hMMMMMMMMMMMMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMMNmdhhhdmNMMM&lt;span class="sb">`&lt;/span> +d+ sMMM+ hMMN:&lt;span class="sb">`&lt;/span>hh- sMMNmdhhhdmNMMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMms:::/++//::+ho .+- /dM+ hNh- +/&lt;span class="sb">`&lt;/span> -h+:://++/::/smMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMN+./hmMMMMMMNds- ./oso:.&lt;span class="sb">``&lt;/span>:. :-&lt;span class="sb">``&lt;/span>.:os+- -sdNMMMMMMmy:.oNMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMm-.hMNhNMMMMMMMMNo&lt;span class="sb">`&lt;/span>/MMMMMNdhyyyyhhdNMMMM+&lt;span class="sb">`&lt;/span>oNMMMMMMMMNhNMh.-mMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MM:&lt;span class="sb">`&lt;/span>mMMN/-sNNMMMMMMMo yMMMMMMMMMMMMMMMMMMy sMMMMMMMNNs-/NMMm&lt;span class="sb">`&lt;/span>:MM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Mm /MMMMMd/.-oMMMMMMN :MMMMMMMMMMMMMMMMMM-&lt;span class="sb">`&lt;/span>MMMMMMMo-./dMMMMM/ NM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> Mm /MMMMMMm:-&lt;span class="sb">`&lt;/span>sNMMMMN :MMMMMMMMMMMMMMMMMM-&lt;span class="sb">`&lt;/span>MMMMMNs&lt;span class="sb">`&lt;/span>-/NMMMMMM/ NM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MM:&lt;span class="sb">`&lt;/span>mMMMMMMMMd/-sMMMo yMMMMMMMMMMMMMMMMMMy sMMMs-/dMMMMMMMMd&lt;span class="sb">`&lt;/span>:MM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMm-.hMMMMMMMMMdhMNo&lt;span class="sb">`&lt;/span>+MMMMMMMMMMMMMMMMMMMM+&lt;span class="sb">`&lt;/span>oNMhdMMMMMMMMMh.-mMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMNo./hmNMMMMMNms--yMMMMMMMMMMMMMMMMMMMMMMy--smNMMMMMNmy/.oNMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMms:-:/+++/:-+hMMMMMMMMMMMMMMMMMMMMMMMMMNh+-:/+++/:-:smMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMMNdhhyhdmMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMmdhyhhmNMMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMMMMMMMMMNNNNNMMMMMMNNNNNNMMMMMMMMNNMMMMMMMNNMMMMMMMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMMMMMMMh/-...-+dMMMm......:+hMMMMs../MMMMMo..sMMMMMMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMMMMMM/ /yhy- sMMm -hhy/ :NMM+ oMMMy /MMMMMMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMMMMMy /MMMMN&lt;span class="sb">`&lt;/span> NMm /MMMMo +MM: .&lt;span class="sb">`&lt;/span> yMd&lt;span class="sb">```&lt;/span> :MMMMMMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMMMMM+ sMMMMM: hMm /MMMMd -MM- /s &lt;span class="sb">`&lt;/span>h.&lt;span class="sb">`&lt;/span>d- -MMMMMMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMMMMMs +MMMMM. mMm /MMMMy /MM. +M/ yM: &lt;span class="sb">`&lt;/span>MMMMMMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMMMMMN- smNm/ +MMm :NNdo&lt;span class="sb">`&lt;/span> .mMM&lt;span class="sb">`&lt;/span> oMM+/yMM/ MMMMMMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMMMMMMNo- &lt;span class="sb">`&lt;/span>:yMMMm &lt;span class="sb">`&lt;/span>:sNMMM&lt;span class="sb">`&lt;/span> sMMMMMMM+ NMMMMMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> MMMMMMMMMMMMMMMNmmNMMMMMMMNmmmmNMMMMMMMNNMMMMMMMMMNNMMMMMMMMMMMM
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">[&lt;/span>INFO&lt;span class="o">]&lt;/span> ODM app finished - Mon Dec &lt;span class="m">08&lt;/span> 22:13:26 &lt;span class="m">2025&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">.100 - &lt;span class="k">done&lt;/span>.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">(&lt;/span>ASCII art is fun&lt;span class="o">)&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The output will be in &lt;code>odm_orthophoto/odm_orthophoto.tif&lt;/code>&lt;/p>
&lt;h2 id="useful-flags-and-other-run-options">Useful flags and other run options&lt;/h2>
&lt;ul>
&lt;li>Store high-precision geotags from PPK in &lt;code>geo.txt&lt;/code> in the project folder and ODM will pick them up automatically (see &lt;a href="https://docs.opendronemap.org/geo/" target="_blank" rel="noopener">here&lt;/a>)&lt;/li>
&lt;/ul>
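For reference, a minimal geo.txt sketch (filenames and coordinates are made up): the first line is the CRS, then one image per line in the format filename lon lat alt yaw pitch roll h_acc v_acc:

```text
EPSG:4326
DSC00001.JPG -80.1234 25.5678 42.0 90.0 -89.9 0.0 0.02 0.05
DSC00002.JPG -80.1236 25.5679 42.1 90.0 -89.9 0.0 0.02 0.05
```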
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="s2">&amp;#34;&amp;#34;&amp;#34;Convert WISPR PPK coords to OpenDroneMap&amp;#34;&amp;#34;&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="kn">import&lt;/span> &lt;span class="nn">argparse&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="kn">import&lt;/span> &lt;span class="nn">csv&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="kn">from&lt;/span> &lt;span class="nn">pathlib&lt;/span> &lt;span class="kn">import&lt;/span> &lt;span class="n">Path&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">parser&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">argparse&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">ArgumentParser&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">description&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="s2">&amp;#34;Convert WISPR CSV to ODM geo.txt&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">parser&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">add_argument&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;root_dir&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">help&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="s2">&amp;#34;Root directory to search for exif_image_list.csv&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">parser&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">add_argument&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;output_txt&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">help&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="s2">&amp;#34;Output geo.txt file for ODM&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">args&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">parser&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">parse_args&lt;/span>&lt;span class="p">()&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">csv_files&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="nb">list&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Path&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">args&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">root_dir&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">rglob&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;exif_image_list.csv&amp;#34;&lt;/span>&lt;span class="p">))&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">if&lt;/span> &lt;span class="nb">len&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">csv_files&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">==&lt;/span> &lt;span class="mi">0&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">raise&lt;/span> &lt;span class="ne">SystemExit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;Error: No exif_image_list.csv found&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">if&lt;/span> &lt;span class="nb">len&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">csv_files&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">&amp;gt;&lt;/span> &lt;span class="mi">1&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">raise&lt;/span> &lt;span class="ne">SystemExit&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="sa">f&lt;/span>&lt;span class="s2">&amp;#34;Error: Found multiple exif_image_list.csv files:&lt;/span>&lt;span class="se">\n&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span> &lt;span class="o">+&lt;/span> &lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="se">\n&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">join&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nb">str&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">f&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="k">for&lt;/span> &lt;span class="n">f&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="n">csv_files&lt;/span>&lt;span class="p">))&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">with&lt;/span> &lt;span class="nb">open&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">csv_files&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">])&lt;/span> &lt;span class="k">as&lt;/span> &lt;span class="n">infile&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nb">open&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">args&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">output_txt&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;w&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="k">as&lt;/span> &lt;span class="n">outfile&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">reader&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">csv&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">DictReader&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">infile&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">outfile&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">write&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;EPSG:4326&lt;/span>&lt;span class="se">\n&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">for&lt;/span> &lt;span class="n">row&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="n">reader&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1"># ODM format: filename lon lat alt yaw pitch roll [h_acc v_acc]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">outfile&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">write&lt;/span>&lt;span class="p">(&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="sa">f&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">row&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;image_name&amp;#39;&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2"> &lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">row&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;longitude&amp;#39;&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2"> &lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">row&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;latitude&amp;#39;&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2"> &amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="sa">f&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">row&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;altitude&amp;#39;&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2"> &lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">row&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;gimbal_yaw&amp;#39;&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2"> &lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">row&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;gimbal_pitch&amp;#39;&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2"> &amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="sa">f&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">row&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;gimbal_roll&amp;#39;&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2"> &lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">row&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;x_accuracy&amp;#39;&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2"> &lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">row&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;z_accuracy&amp;#39;&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="se">\n&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">print&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="sa">f&lt;/span>&lt;span class="s2">&amp;#34;Created &lt;/span>&lt;span class="si">{&lt;/span>&lt;span class="n">args&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">output_txt&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;ul>
&lt;li>Set &lt;code>--max-concurrency&lt;/code> to the number of CPUs assigned to the job&lt;/li>
&lt;li>&lt;code>--optimize-disk-space&lt;/code> to clean up intermediate files at the cost of needing to re-run the job if it crashes (no resume)&lt;/li>
&lt;li>&lt;code>--orthophoto-resolution 2&lt;/code> sets the ortho resolution in centimeters per pixel. The lower this value, the larger the output ortho and the slower the processing. The default is &lt;code>5&lt;/code>, which runs fairly quickly and is a good sanity check.&lt;/li>
&lt;li>Use &lt;a href="https://docs.opendronemap.org/gcp/" target="_blank" rel="noopener">GCPs&lt;/a>&lt;/li>
&lt;/ul>
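To get a feel for how the resolution flag affects output size, note that pixel count grows with the inverse square of the resolution. A quick sketch for a hypothetical 500 m × 500 m survey:

```python
# Orthophoto pixel dimensions for a hypothetical 500 m x 500 m survey
# at different --orthophoto-resolution values (cm per pixel).
extent_m = 500
for res_cm in (5, 2):
    px = extent_m * 100 // res_cm  # pixels per side
    print(f"{res_cm} cm/px -> {px} x {px} ({px * px / 1e6:.0f} MP)")
```

Going from the default 5 cm to 2 cm multiplies the pixel count by 6.25.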
&lt;h2 id="array-processing">Array processing.&lt;/h2>
&lt;p>Here is an example job file that can run on an array of surveys. The list of input folders is provided in &lt;code>folder_list.txt&lt;/code> (newline-delimited).&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">&lt;span class="cp">#!/bin/bash
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="cp">&lt;/span>&lt;span class="c1">#SBATCH --job-name=odm-array&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --nodes=1&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --partition=hpg-turin&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --cpus-per-task=16&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --mem=64GB&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --time=12:00:00&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --gpus=1&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --array=1-N%4&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --output=./slurm_logs/%A_%a.out&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --error=./slurm_logs/%A_%a.err&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Configuration&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nv">INPUT_FILE&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="s2">&amp;#34;folder_list.txt&amp;#34;&lt;/span> &lt;span class="c1"># Text file with one absolute path per line&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nv">WORKING_DIR&lt;/span>&lt;span class="o">=&lt;/span> &lt;span class="c1"># Where you want to create the output folders&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nv">ODM_SIF&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="c1"># Path to the SIF file&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Get the source folder for this array task&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nv">SOURCE_FOLDER&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="k">$(&lt;/span>sed -n &lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="si">${&lt;/span>&lt;span class="nv">SLURM_ARRAY_TASK_ID&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">p&amp;#34;&lt;/span> &lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="nv">$INPUT_FILE&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="k">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Get the basename&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nv">BASENAME&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="k">$(&lt;/span>basename &lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="nv">$SOURCE_FOLDER&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="k">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Create target directory structure&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nv">TARGET_DIR&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="si">${&lt;/span>&lt;span class="nv">WORKING_DIR&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">/&lt;/span>&lt;span class="si">${&lt;/span>&lt;span class="nv">BASENAME&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Print environment for debugging&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">printenv &lt;span class="p">|&lt;/span> grep -i slurm &lt;span class="p">|&lt;/span> sort
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">mkdir -p &lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="si">${&lt;/span>&lt;span class="nv">TARGET_DIR&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">/code/images&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Copy JPG files&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">echo&lt;/span> &lt;span class="s2">&amp;#34;Copying images from &lt;/span>&lt;span class="si">${&lt;/span>&lt;span class="nv">SOURCE_FOLDER&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2"> to &lt;/span>&lt;span class="si">${&lt;/span>&lt;span class="nv">TARGET_DIR&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">/code/images&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">rsync -av --include&lt;span class="o">=&lt;/span>&lt;span class="s1">&amp;#39;*.JPG&amp;#39;&lt;/span> --include&lt;span class="o">=&lt;/span>&lt;span class="s1">&amp;#39;*.jpg&amp;#39;&lt;/span> --exclude&lt;span class="o">=&lt;/span>&lt;span class="s1">&amp;#39;*&amp;#39;&lt;/span> &lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="si">${&lt;/span>&lt;span class="nv">SOURCE_FOLDER&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">/&amp;#34;&lt;/span> &lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="si">${&lt;/span>&lt;span class="nv">TARGET_DIR&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">/code/images/&amp;#34;&lt;/span> &lt;span class="o">||&lt;/span> &lt;span class="se">\
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="se">&lt;/span>&lt;span class="nb">echo&lt;/span> &lt;span class="s2">&amp;#34;Warning: No JPG files found in &lt;/span>&lt;span class="si">${&lt;/span>&lt;span class="nv">SOURCE_FOLDER&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Perform PPK geotagging&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">python3 wispr_to_odm_ppk.py &lt;span class="si">${&lt;/span>&lt;span class="nv">SOURCE_FOLDER&lt;/span>&lt;span class="si">}&lt;/span> &lt;span class="si">${&lt;/span>&lt;span class="nv">TARGET_DIR&lt;/span>&lt;span class="si">}&lt;/span>/geo.txt &lt;span class="o">||&lt;/span> &lt;span class="se">\
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="se">&lt;/span>&lt;span class="nb">echo&lt;/span> &lt;span class="s2">&amp;#34;Failed to find a PPK coordinate file. Processing will use EXIF GPS data only.&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">module load cuda
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Run ODM with the target directory as project path&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">echo&lt;/span> &lt;span class="s2">&amp;#34;Running ODM on &lt;/span>&lt;span class="si">${&lt;/span>&lt;span class="nv">TARGET_DIR&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">srun apptainer run --nv --bind &lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="si">${&lt;/span>&lt;span class="nv">TARGET_DIR&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">:/project&amp;#34;&lt;/span> &lt;span class="se">\
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="se">&lt;/span> &lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="nv">$ODM_SIF&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span> &lt;span class="se">\
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="se">&lt;/span> --project-path /project &lt;span class="se">\
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="se">&lt;/span> --max-concurrency &lt;span class="m">16&lt;/span> &lt;span class="se">\
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="se">&lt;/span> --orthophoto-resolution &lt;span class="m">2&lt;/span> &lt;span class="se">\
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="se">&lt;/span> --optimize-disk-space
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Clean up the images in the target folder&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">echo&lt;/span> &lt;span class="s2">&amp;#34;Removing image folder &lt;/span>&lt;span class="si">${&lt;/span>&lt;span class="nv">TARGET_DIR&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">/code/images&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">rm -rf &lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="si">${&lt;/span>&lt;span class="nv">TARGET_DIR&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">/code/images&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">echo&lt;/span> &lt;span class="s2">&amp;#34;Setting permissions on target foler&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">bash group-permissions-update.sh &lt;span class="si">${&lt;/span>&lt;span class="nv">TARGET_DIR&lt;/span>&lt;span class="si">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">echo&lt;/span> &lt;span class="s2">&amp;#34;Completed processing &lt;/span>&lt;span class="si">${&lt;/span>&lt;span class="nv">BASENAME&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;ul>
&lt;li>The &lt;code>#SBATCH --array=1-N%4&lt;/code> directive starts an array job of N tasks with at most 4 running concurrently.&lt;/li>
&lt;li>Note that &lt;code>module load cuda&lt;/code> is needed to make the GPU available to the job&lt;/li>
&lt;/ul>
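&lt;p>To see how each array task picks its input, here is the indexing logic from the job file in isolation, with made-up paths; &lt;code>SLURM_ARRAY_TASK_ID&lt;/code> is set by Slurm in a real job:&lt;/p>

```shell
# Two fake survey folders, one absolute path per line (paths are made up)
printf '%s\n' /blue/ewhite/demo/site_a /blue/ewhite/demo/site_b > folder_list.txt
SLURM_ARRAY_TASK_ID=2             # set automatically by Slurm for each task
SOURCE_FOLDER=$(sed -n "${SLURM_ARRAY_TASK_ID}p" folder_list.txt)
BASENAME=$(basename "$SOURCE_FOLDER")
echo "$BASENAME"                  # site_b
```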
&lt;p>Submission script:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">&lt;span class="cp">#!/bin/bash
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="cp">&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nv">INPUT_FILE&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="s2">&amp;#34;folder_list.txt&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nv">N&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="k">$(&lt;/span>wc -l &amp;lt; &lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="nv">$INPUT_FILE&lt;/span>&lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="k">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">sbatch --array&lt;span class="o">=&lt;/span>1-&lt;span class="nv">$N&lt;/span> job_array.slurm
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div></description></item><item><title>Accessing the T-drive</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/t-drive/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/t-drive/</guid><description>&lt;ul>
&lt;li>If off campus, turn on the virtual private network (VPN) before connecting using the instructions here: &lt;a href="https://vpn.ufl.edu/" target="_blank" rel="noopener">https://vpn.ufl.edu/&lt;/a>&lt;/li>
&lt;li>Follow the instructions for: 1) &lt;a href="https://wec.ifas.ufl.edu/resources/it--computer-support/mapping-your-t-and-u-drives/" target="_blank" rel="noopener">Windows&lt;/a> or 2) &lt;a href="https://wec.ifas.ufl.edu/resources/it--computer-support/mapping-your-t-and-u-drives-mac/" target="_blank" rel="noopener">macOS&lt;/a>&lt;/li>
&lt;li>On Linux use the same basic approach described in the Windows/macOS instructions, but enter &lt;code>UFAD&lt;/code> for the &lt;code>WORKGROUP&lt;/code>&lt;/li>
&lt;li>Lab materials are in the &lt;code>lab-white-ernest&lt;/code> folder&lt;/li>
&lt;li>Our public file serving folder is &lt;code>Weecology&lt;/code>&lt;/li>
&lt;/ul>
&lt;h2 id="linux">Linux&lt;/h2>
&lt;p>Assuming you&amp;rsquo;re running Ubuntu: install the &lt;code>cifs-utils&lt;/code> package, create a folder called &lt;code>/media/T&lt;/code> to use as a mount point, then run the mount command, replacing &lt;code>&amp;lt;your-gatorlink&amp;gt;&lt;/code> with your UF gatorlink (the part of your UF email address before the &lt;code>@&lt;/code>) and &lt;code>&amp;lt;your-local-username&amp;gt;&lt;/code> with your username on the Linux machine you are mounting from (you can check this with &lt;code>whoami&lt;/code>). If you haven&amp;rsquo;t run &lt;code>sudo&lt;/code> recently you will be prompted for your local password, and you will then be prompted for your UF password.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl"># 1) Install CIFS support
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">sudo apt-get install -y cifs-utils
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"># 2) Create the mount point (if it doesn&amp;#39;t exist)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">sudo mkdir -p /media/T
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"># 3) Unmount if already mounted (lazy to avoid &amp;#34;busy&amp;#34; errors)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">sudo umount -l /media/T 2&amp;gt;/dev/null || true
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"># 4) Mount with your user/group mapping and permissions. Customize permissions to your need
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"># - file_mode=0660 -&amp;gt; files: rw-rw----
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"># - dir_mode=0770 -&amp;gt; folders: rwxrwx---
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"># - Replace &amp;lt;your-gatorlink&amp;gt; and &amp;lt;preferred-local-group&amp;gt;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">sudo mount -t cifs LINK_FROM_INSTRUCTIONS_ABOVE_WITHOUT_SMB_PART /media/T \
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> -o username=&amp;lt;your-gatorlink&amp;gt;,uid=$(id -u),gid=$(getent group &amp;lt;preferred-local-group&amp;gt; | cut -d: -f3),file_mode=XXXX,dir_mode=XXXX
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The &lt;code>,gid=&amp;lt;preferred-local-group&amp;gt;&lt;/code> can be left out if no group is needed, but on shared systems (like Serenity) we typically want to include a group so everyone can access the share.
The full command, including the &lt;code>LINK_FROM_INSTRUCTIONS_ABOVE_WITHOUT_SMB_PART&lt;/code>, is available for weecology folks as a pinned post on the Serenity and HiPerGator Slack channels.&lt;/p></description></item><item><title>Lab Style Guide for Code</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/lab-code-style-guide/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/lab-code-style-guide/</guid><description>&lt;h2 id="guiding-principles">Guiding Principles&lt;/h2>
&lt;p>This document provides a guide to code structure and formatting across languages used within the Weecology projects. Links to language-specific guides are provided below.&lt;/p>
&lt;p>Generally, this guide follows the principles outlined &lt;a href="http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1001745#s2" target="_blank" rel="noopener">here&lt;/a>. In particular:&lt;/p>
&lt;ol>
&lt;li>Write and style your code for human readers&lt;/li>
&lt;li>Minimize the number of facts a reader is expected to hold at one time&lt;/li>
&lt;li>Use consistent, distinct, and meaningful names&lt;/li>
&lt;li>Employ consistent style and formatting&lt;/li>
&lt;/ol>
&lt;h2 id="structure">Structure&lt;/h2>
&lt;h3 id="modularize">Modularize&lt;/h3>
&lt;ul>
&lt;li>Break code into chunks corresponding to contained tasks&lt;/li>
&lt;li>Whenever possible write code into functions, even if the function isn&amp;rsquo;t called repeatedly&lt;/li>
&lt;/ul>
&lt;h3 id="loops">Loops&lt;/h3>
&lt;ul>
&lt;li>Loops should be used for repeated tasks; iterate over the items themselves unless you actually need the indices&lt;/li>
&lt;li>If the language allows, use vectorized functions in place of loops to speed computation and reduce code volume&lt;/li>
&lt;/ul>
&lt;h2 id="style">Style&lt;/h2>
&lt;h3 id="naming">Naming&lt;/h3>
&lt;ul>
&lt;li>Be concise and meaningful; when the two conflict, conveying meaning matters more than brevity
&lt;ul>
&lt;li>Document abbreviations if they are not common or immediately intuitive&lt;/li>
&lt;li>Functions are verbs, variables are nouns&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Use snake_case for variables and functions (e.g., &lt;code>portal_data&lt;/code>)
&lt;ul>
&lt;li>Exceptions:
&lt;ul>
&lt;li>established prefixes (e.g., &lt;code>n&lt;/code> in &lt;code>nobs&lt;/code> to indicate the number of observations)&lt;/li>
&lt;li>established suffixes (e.g., &lt;code>i&lt;/code> in &lt;code>obsi&lt;/code> to indicate the specific observation in a for loop)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Use UpperCamelCase for class names for object oriented programming (primarily in Python)&lt;/li>
&lt;li>Do not use &lt;code>.&lt;/code> in names (&lt;a href="http://adv-r.had.co.nz/Style.html" target="_blank" rel="noopener">particularly in R&lt;/a>)&lt;/li>
&lt;li>Do not use single-letter names
&lt;ul>
&lt;li>Exceptions:
&lt;ul>
&lt;li>representing a term in an equation (e.g., &lt;code>y&lt;/code> in &lt;code>y = m * x + b&lt;/code>)&lt;/li>
&lt;li>using an established name in a language (e.g., &lt;code>n&lt;/code> references the number of draws from a random variable in R)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Constants, and only constants, should be in all caps&lt;/li>
&lt;/ul>
&lt;h3 id="white-space">White space&lt;/h3>
&lt;ul>
&lt;li>spaces after commas&lt;/li>
&lt;li>spaces around operators (unless inside the argument definitions in Python)&lt;/li>
&lt;li>no spaces around parentheses&lt;/li>
&lt;/ul>
&lt;h3 id="line-length">Line length&lt;/h3>
&lt;ul>
&lt;li>Lines &amp;lt;= 80 characters
&lt;ul>
&lt;li>But a few extra characters can be better than confusing contortions just to stay under the limit&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Parentheses/brackets/braces with breaks after commas are typically better than line break characters (but not always)&lt;/li>
&lt;/ul>
&lt;h3 id="indentation">Indentation&lt;/h3>
&lt;ul>
&lt;li>Always indent to indicate that code is inside a function, loop, etc. (Python makes you do this. Thanks Python!)&lt;/li>
&lt;li>Use spaces, not tabs (but it&amp;rsquo;s fine for your IDE to turn the tab key into spaces)&lt;/li>
&lt;li>Follow language convention for number of spaces
&lt;ul>
&lt;li>R: 2&lt;/li>
&lt;li>Python: 4&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>When breaking lines with parentheses (mostly in function calls/definitions) align with the leading character after the opening parenthesis
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">(stuff, things,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> more_things)
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;/li>
&lt;/ul>
&lt;h3 id="references">References&lt;/h3>
&lt;ul>
&lt;li>Magic numbers (numeric references to elements, columns, rows, etc.) should be avoided.&lt;/li>
&lt;li>References should be made by name or, in the case of loops, by position.&lt;/li>
&lt;/ul>
&lt;h2 id="documentation">Documentation&lt;/h2>
&lt;h3 id="in-line-commenting">In-line commenting&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1001745#s8" target="_blank" rel="noopener">Document what the code does and how to use it, not how it does it&lt;/a>&lt;/li>
&lt;/ul>
&lt;h3 id="external-documentation">External documentation&lt;/h3>
&lt;ul>
&lt;li>Use standard documentation comment styles
&lt;ul>
&lt;li>R: &lt;a href="http://r-pkgs.had.co.nz/man.html" target="_blank" rel="noopener">roxygen&lt;/a>&lt;/li>
&lt;li>Python: &lt;a href="https://www.python.org/dev/peps/pep-0257/" target="_blank" rel="noopener">docstrings&lt;/a>&lt;/li>
&lt;li>These can create formatted documentation, but they are useful visual indicators even if you don&amp;rsquo;t do this&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="language-specific-style-guides">Language specific style guides&lt;/h2>
&lt;ul>
&lt;li>Follow official language style guides (within reason). This helps make your code broadly readable and makes external contributions more likely.
&lt;ul>
&lt;li>Python: &lt;a href="https://www.python.org/dev/peps/pep-0008/" target="_blank" rel="noopener">Official Python Style Guide (PEP8)&lt;/a>&lt;/li>
&lt;li>Julia: &lt;a href="https://docs.julialang.org/en/v1/manual/style-guide/" target="_blank" rel="noopener">Official Julia Style Guide&lt;/a>&lt;/li>
&lt;li>R: &lt;a href="http://adv-r.had.co.nz/Style.html" target="_blank" rel="noopener">Hadley Wickham&amp;rsquo;s style guide&lt;/a>. This isn&amp;rsquo;t official, or broadly agreed on, but it serves as the base (or at least justification) for a lot of what we do&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul></description></item><item><title>Computer Setup - Mac</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/computer-setup-mac/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/computer-setup-mac/</guid><description>&lt;h2 id="terminal">Terminal&lt;/h2>
&lt;p>This is an application on Macs. It is opened by double-clicking on the icon, which should bring up a small window with a black background. The Terminal is used to move around in the file directory, rearrange files, and use git.&lt;/p>
&lt;h2 id="python">Python&lt;/h2>
&lt;p>&lt;em>IDE&lt;/em>&lt;/p>
&lt;p>&lt;em>Packages&lt;/em>&lt;/p>
&lt;p>&lt;em>Projects&lt;/em>&lt;/p>
&lt;p>&lt;em>Links to get started coding in Python&lt;/em>&lt;/p>
&lt;p>&lt;em>Using git/GitHub&lt;/em>&lt;/p>
&lt;h2 id="r">R&lt;/h2>
&lt;p>&lt;em>Installing R&lt;/em>&lt;/p>
&lt;p>&lt;em>Installing RStudio&lt;/em>&lt;/p>
&lt;p>&lt;em>Packages&lt;/em>&lt;/p>
&lt;p>&lt;em>Projects&lt;/em>&lt;/p>
&lt;p>&lt;em>Using git/GitHub&lt;/em>&lt;/p>
&lt;h2 id="dependencies">Dependencies&lt;/h2>
&lt;h2 id="gitgithub">Git/GitHub&lt;/h2>
&lt;p>Resources: &lt;a href="http://swcarpentry.github.io/git-novice/" target="_blank" rel="noopener">Software Carpentry&lt;/a>, &lt;a href="https://happygitwithr.com/" target="_blank" rel="noopener">Happy Git and GitHub for the useR&lt;/a>, &lt;a href="http://rogerdudler.github.io/git-guide/" target="_blank" rel="noopener">Roger Dudler&amp;rsquo;s git cheatsheet&lt;/a>&lt;/p>
&lt;h3 id="installing-git">Installing git&lt;/h3>
&lt;ol>
&lt;li>
&lt;p>Open up the Terminal, type in &amp;ldquo;git&amp;rdquo; and press enter.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>This should cause a pop-up window to appear. It will have several options; click on &amp;ldquo;Install&amp;rdquo; (not &amp;ldquo;Get Xcode&amp;rdquo;, see &amp;ldquo;Installing Xcode&amp;rdquo; for that).&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Click &amp;ldquo;Agree&amp;rdquo;.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>When the install is finished, click &amp;ldquo;Done&amp;rdquo;.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>To make sure this worked, type in &amp;ldquo;git&amp;rdquo; in the Terminal and press enter. Some information will come up, including a list of common commands.&lt;/p>
&lt;/li>
&lt;/ol>
&lt;h3 id="configuring-git">Configuring Git&lt;/h3>
&lt;p>There&amp;rsquo;s some basic info on Git setup from &lt;a href="https://swcarpentry.github.io/git-novice/#installing-git" target="_blank" rel="noopener">software carpentry&lt;/a>. If you are also setting up a GitHub account, be sure to use the same email address, so that when you use Git on your computer and &lt;em>push&lt;/em> the changes to GitHub, it identifies you correctly.&lt;/p>
&lt;p>On a mac, browsing folders in finder also tends to generate &lt;code>.DS_Store&lt;/code> files. You generally don&amp;rsquo;t want to include those in your repositories, so here are &lt;a href="https://www.jeffgeerling.com/blogs/jeff-geerling/stop-letting-dsstore-slow-you" target="_blank" rel="noopener">some instructions&lt;/a> to ignore such files globally.&lt;/p>
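&lt;p>The gist of those instructions is a two-line setup. A sketch (the &lt;code>~/.gitignore_global&lt;/code> filename is a common convention, not a requirement):&lt;/p>

```shell
# Add .DS_Store to a machine-wide excludes file and point git at it.
# The filename ~/.gitignore_global is a convention, not a requirement.
echo ".DS_Store" >> "$HOME/.gitignore_global"
git config --global core.excludesfile "$HOME/.gitignore_global"
git config --global core.excludesfile
```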
&lt;h3 id="installing-xcode">Installing Xcode&lt;/h3>
&lt;h3 id="github">GitHub&lt;/h3>
&lt;p>Create a GitHub account by going to &lt;a href="https://github.com/" target="_blank" rel="noopener">https://github.com/&lt;/a>&lt;/p>
&lt;p>This is a service that allows you to store your code, project management materials, etc., online, and lets other people look at your work (if the repo is public). It also saves every version of your code as you change it, which is referred to as version control. This eliminates the need to create multiple copies of code as you change it (e.g., a folder with files called &amp;ldquo;data_analysis&amp;rdquo;, &amp;ldquo;data_analysis_3&amp;rdquo;, &amp;ldquo;data_analysis_45&amp;rdquo;, &amp;ldquo;data_analysis_final&amp;rdquo;, and &amp;ldquo;data_analysis_final_really&amp;rdquo;), and you can use the online GitHub interface to easily look back at previous versions of your code and see what was changed.&lt;/p>
&lt;h3 id="repositories">Repositories&lt;/h3>
&lt;p>A repository is where you put all of the materials related to a single project. One repository per project.&lt;/p>
&lt;p>Creating a GitHub repository:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Open up a browser, go to the GitHub website, and sign into your GitHub account. Navigate to your profile page.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Click on the &amp;ldquo;Repositories&amp;rdquo; tab in the top middle of the page.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>In the upper right hand corner, click on the green &amp;ldquo;New&amp;rdquo; button.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>On this page, name the repo, preferably something short and succinct that uniquely describes the project. There should be no spaces in repository names; if the name consists of multiple words, separate them with underscores, dashes, or camel case, e.g., mammal_community_dynamics, mammal-community-dynamics, MammalCommunityDynamics.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Select whether the repo will be public or private. Most of your repos will likely be public; having your research publicly available also makes for better science.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Check &amp;ldquo;Initialize this repository with a README&amp;rdquo;.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Leave both &amp;ldquo;Add .gitignore&amp;rdquo; and &amp;ldquo;Add a license&amp;rdquo; as &amp;ldquo;None&amp;rdquo;.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Click green &amp;ldquo;Create repository&amp;rdquo; button. You&amp;rsquo;ve created your new repository, congrats!&lt;/p>
&lt;/li>
&lt;/ol>
&lt;p>Cloning GitHub repository:&lt;/p>
&lt;p>The repository that was created above is the remote repository. This can be accessed from any computer using the browser. You also need to create a copy of the remote repository called the local repository. This copy of the repo can only be accessed from the computer it is created on. You will do work (e.g., changing code) on the local repo.&lt;/p>
&lt;ol>
&lt;li>
&lt;p>In the browser, navigate to the main page of the repository you want to clone.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Click the green &amp;ldquo;Code&amp;rdquo; button above the list of files. A dropdown will appear containing a URL. There are two possible options for this URL, either HTTPS or SSH; you can switch between them using the tabs above the URL.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Select the HTTPS tab. Either HTTPS or SSH can be used, but it is easier to start with HTTPS. The difference between them is how the local repo authenticates to the remote, but that difference is not important right now.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Copy the HTTPS URL.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Open up the Terminal and navigate to the location on your computer where you want the local repo to be located. You can navigate around using the commands &amp;ldquo;ls&amp;rdquo; (this displays all the folders and files in the current directory) and &amp;ldquo;cd&amp;rdquo;. This latter command changes the directory, so you will type in the path for the directory you want to go to. For example, if I want to put the repo in the folder Projects, which is within the folder Documents, I would type the following into the Terminal: &amp;ldquo;cd Documents/Projects&amp;rdquo;. Then hit enter.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Once you&amp;rsquo;re in the directory where you want the repo to be located, type in the command &amp;ldquo;git clone&amp;rdquo;, a space, and then paste in the HTTPS URL.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Hit enter. This should create a new folder in the chosen directory that has the same name as the remote repo. This is your local repository.&lt;/p>
&lt;/li>
&lt;/ol>
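&lt;p>The clone steps look like this in the Terminal. This sketch uses a local path in place of the copied HTTPS URL so it can run offline; with a real repo you would paste the GitHub URL instead:&lt;/p>

```shell
# Create a stand-in "remote" repo locally (a GitHub HTTPS URL works the same)
git init -q --bare /tmp/demo_remote.git
cd /tmp
# "git clone <url>" creates a new folder named after the repo (the trailing
# .git is stripped), here /tmp/demo_remote -- this is your local repository
git clone -q /tmp/demo_remote.git
```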
&lt;p>Adding files and commits to local repository:&lt;/p>
&lt;p>An important aspect of git/GitHub is version control. As you change scripts, you can use git to save all the different versions of scripts and then use GitHub later on to easily look at each of these versions. Then, if you mess up your code or don&amp;rsquo;t like the direction it&amp;rsquo;s heading in, you can access and use a previous version of your script very easily.&lt;/p>
&lt;p>The way that you save these versions using git is by doing something called a commit. Each commit represents a different version of a script. Because you choose when to do a commit, you get to choose how different all of the versions of the script are. You should definitely make a commit for every major change in the code that you make, but you can never commit too often. When in doubt, commit.&lt;/p>
&lt;p>Another important, and confusing, step in this process is adding the script. Before you can commit the newest version of a script, you have to add that script to the stage. This means that if you&amp;rsquo;ve changed several of the scripts within one local repository, you can add all of them to the stage and commit them together, or you can add and commit them one at a time.&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Create a new script (in Python, R, or whatever language) and save the script in the folder for the local repository you&amp;rsquo;ve just created. Similar to the names of repositories, there should be no spaces in script names. See the fourth bullet point in the &amp;ldquo;Creating a GitHub Repository&amp;rdquo; section for naming conventions.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>If you are not already there, open up the Terminal and navigate to the local repository folder using the &amp;ldquo;cd&amp;rdquo; command.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>To add the script, type in &amp;ldquo;git add &amp;quot; and then the name of the script, and then hit enter. (There should be a space between the add command and the script name). The script is now on the stage. Optional: You can repeat this multiple times in a row with different script names if you&amp;rsquo;re adding multiple scripts to the stage at the same time.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>To make sure that the script has been added, type in &amp;ldquo;git status&amp;rdquo; and hit Enter. This will bring up information about the repo that you are currently looking at. If you&amp;rsquo;ve correctly added the script, under the &amp;ldquo;Changes to be committed:&amp;rdquo; header there should be an indented line formatted as &amp;ldquo;new file: &amp;rdquo; followed by the name of the script you&amp;rsquo;ve added (for a previously committed script that you changed, it will say &amp;ldquo;modified: &amp;rdquo; instead).&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Now you want to commit this version of the script. In the Terminal, type in &amp;ldquo;git commit -m &amp;ldquo;&lt;em>message&lt;/em>&amp;rdquo;&amp;rdquo; and hit Enter. The &lt;em>message&lt;/em> is where you insert a succinct, informative description of what changed between the last version and this newest version of the script. Writing good commit messages is a bit of an art, but there is some good advice on commit messages &lt;a href="http://chris.beams.io/posts/git-commit/" target="_blank" rel="noopener">here&lt;/a>.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>You can use git status again to check that the commit worked. Type in &amp;ldquo;git status&amp;rdquo; and hit Enter. Now the entire &amp;ldquo;Changes to be committed:&amp;rdquo; section should be gone, because there should no longer be any changes that haven&amp;rsquo;t been committed in this repo.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>You can look at this commit, and all previous commits, by typing in &amp;ldquo;git log&amp;rdquo; and hitting Enter. This will bring up a list of the commits, with the most recent commit at the top. The information about each commit includes the author, date, and message of the commit. At the top of each commit, there is a long string of letters and numbers. This is the hash, or unique identifier, for each commit.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>You will do this add and commit workflow (steps 2-7) each time you make a substantial change to this script, or when you want to add another script.&lt;/p>
&lt;/li>
&lt;/ol>
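&lt;p>The add/commit workflow above can be sketched as a terminal session. The repository and script names here are just examples; substitute your own:&lt;/p>

```shell
# Set up a local repository with one new script (names are examples)
mkdir my-local-repo && cd my-local-repo
git init

# Tell git who you are (once per machine, or per repo as here)
git config user.name "Your Name"
git config user.email "you@example.com"

# Create a script, then stage it (repeat "git add" for more files)
echo 'print("hello")' > analysis_script.py
git add analysis_script.py

# Confirm it appears under "Changes to be committed:"
git status

# Commit the staged change with a succinct message
git commit -m "Add initial analysis script"

# Review the history: hash, author, date, and message for each commit
git log
```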
&lt;p>Pushing and pulling:&lt;/p>
&lt;p>Commits save versions of your work locally. To get those changes up to the GitHub repository you &lt;em>push&lt;/em> them, and to bring down changes that collaborators have made you &lt;em>pull&lt;/em>.&lt;/p></description></item><item><title>Containers</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/containers/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/containers/</guid><description>&lt;p>Containers are a tool for running code in self-contained environments.&lt;/p>
&lt;h2 id="debugging-r-devel-using-docker">Debugging r-devel using Docker&lt;/h2>
&lt;p>First &lt;a href="https://www.docker.com/get-started" target="_blank" rel="noopener">install docker&lt;/a> for your operating system.&lt;/p>
&lt;p>Then get an r-devel container:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">sudo docker pull rocker/drd
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Then run docker interactively while mounting your working directory:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">sudo docker run -v WORKING/DIRECTORY:/mnt/WORKDIR -it rocker/drd:latest
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Replace &lt;code>WORKING/DIRECTORY&lt;/code> with the path to the directory whose files you want to access and &lt;code>WORKDIR&lt;/code> with whatever you want it to be named inside the &lt;code>/mnt/&lt;/code> directory in the container.&lt;/p>
&lt;p>This will open an interactive R console running r-devel.&lt;/p>
&lt;h2 id="using-containers-in-vs-code">Using containers in VS Code&lt;/h2>
&lt;p>If you use VS Code as your IDE you can develop inside a container.
To set this up, follow &lt;a href="https://code.visualstudio.com/docs/devcontainers/containers" target="_blank" rel="noopener">the official instructions&lt;/a>, which can be summarized as:&lt;/p>
&lt;ol>
&lt;li>Install docker for your OS with non-sudo access (on Linux add your user to the &lt;code>docker&lt;/code> group and log out)&lt;/li>
&lt;li>Install the Dev Containers extension&lt;/li>
&lt;li>Create a &lt;code>.devcontainer/devcontainer.json&lt;/code> file in your project directory indicating which container to use, e.g.,&lt;/li>
&lt;/ol>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-json" data-lang="json">&lt;span class="line">&lt;span class="cl">&lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="nt">&amp;#34;name&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="s2">&amp;#34;r-devel&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="nt">&amp;#34;image&amp;#34;&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="s2">&amp;#34;rocker/drd:latest&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This can be a little tricky to set up (don&amp;rsquo;t be afraid to ask for help), but when you open the project in VS Code you&amp;rsquo;ll be automatically working in the designated container.&lt;/p>
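&lt;p>For reference, a slightly fuller &lt;code>devcontainer.json&lt;/code> can also preinstall editor extensions inside the container via the &lt;code>customizations&lt;/code> field. The extension ID below is illustrative (the R extension), not required:&lt;/p>

```json
{
  "name": "r-devel",
  "image": "rocker/drd:latest",
  "customizations": {
    "vscode": {
      "extensions": ["REditorSupport.r"]
    }
  }
}
```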
&lt;h2 id="using-containers-on-hipergator">Using containers on HiPerGator&lt;/h2>
&lt;p>While many developers are familiar with Docker, HiPerGator uses a slightly different container system called &lt;a href="https://docs.rc.ufl.edu/software/apps/apptainer/" target="_blank" rel="noopener">apptainer&lt;/a>. Apptainer can run Docker containers by first converting them to a Singularity image (.sif), which is similar to an ISO if you&amp;rsquo;ve ever burned a disc or created a bootable USB drive. One convenience is that the container image is a single portable file on disk, while with Docker it&amp;rsquo;s not always obvious where images are stored. Interacting with this image file is very similar to working with Docker. You can read more about Apptainer &lt;a href="https://apptainer.org/docs/user/latest/quick_start.html#interacting-with-images" target="_blank" rel="noopener">here&lt;/a>.&lt;/p>
&lt;h2 id="example-open-drone-map">Example: Open Drone Map&lt;/h2>
&lt;p>In this example we&amp;rsquo;ll use Open Drone Map (&lt;a href="https://github.com/OpenDroneMap/ODM" target="_blank" rel="noopener">ODM&lt;/a>). ODM can be run via a docker container to avoid a fairly complex installation process.&lt;/p>
&lt;p>First, we need to &lt;a href="https://apptainer.org/docs/user/latest/docker_and_oci.html" target="_blank" rel="noopener">pull an image&lt;/a>. On the cluster, we need to load the &lt;code>apptainer&lt;/code> module first. Then we have to &lt;code>pull&lt;/code> the image. This may take a while, so use &lt;code>srun&lt;/code> or &lt;code>sbatch&lt;/code> with a job file. The first argument is the (optional) path to the created image, the second argument is the docker repo ID:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-gdscript3" data-lang="gdscript3">&lt;span class="line">&lt;span class="cl">&lt;span class="n">module&lt;/span> &lt;span class="nb">load&lt;/span> &lt;span class="n">aptainer&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">srun&lt;/span> &lt;span class="o">-&lt;/span>&lt;span class="n">t&lt;/span> &lt;span class="mi">1&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">00&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">00&lt;/span> &lt;span class="n">apptainer&lt;/span> &lt;span class="n">pull&lt;/span> &lt;span class="o">/&lt;/span>&lt;span class="n">path&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">to&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">odm&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">sif&lt;/span> &lt;span class="n">docker&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="o">//&lt;/span>&lt;span class="n">opendronemap&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">odm&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="launchihng-the-container">Launchihng the container&lt;/h3>
&lt;p>To run a container, call &lt;code>srun apptainer run&lt;/code> which should run the &lt;a href="https://docs.docker.com/reference/dockerfile/#entrypoint" target="_blank" rel="noopener">entrypoint&lt;/a> in the container. Like Docker, you can also launch a shell or execute any program that&amp;rsquo;s installed in the container (&lt;code>docker exec&lt;/code> -&amp;gt; &lt;code>apptainer exec&lt;/code>).&lt;/p>
&lt;h3 id="mounting-local-directories">Mounting local directories&lt;/h3>
&lt;p>The most common argument you&amp;rsquo;ll want is &lt;code>--bind&lt;/code>, which is similar to &lt;code>-v&lt;/code> in Docker and lets you mount a local filesystem in the container. In our example, ODM will look for input files in the &lt;code>/project&lt;/code> folder. If we create a folder called &lt;code>data/working_dir&lt;/code>, we can then &lt;em>bind&lt;/em> this directory to &lt;code>/project&lt;/code> with &lt;code>--bind &amp;quot;data/working_dir:/project&amp;quot;&lt;/code> (the pattern is &lt;code>--bind &amp;quot;host_path:container_path&amp;quot;&lt;/code>).&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">&lt;span class="cp">#!/bin/bash
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="cp">&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">... &lt;span class="c1"># Other SBATCH options&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --job-name=odm-node&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --nodes=1&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --partition=hpg-turin&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --cpus-per-task=16&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --mem=64GB&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --time=12:00:00&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --gpus=1&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">module load apptainer
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">srun apptainer run --bind &lt;span class="s2">&amp;#34;&lt;/span>&lt;span class="si">${&lt;/span>&lt;span class="nv">PROJECT_FOLDER&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="s2">:/project&amp;#34;&lt;/span> odm.sif --project-path /project --max-concurrency &lt;span class="m">16&lt;/span> --fast-orthophoto
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Don&amp;rsquo;t forget to add any other &lt;code>SBATCH&lt;/code> flags you need.&lt;/p>
&lt;h3 id="nvidia-inside-containers">NVIDIA inside containers&lt;/h3>
&lt;p>Use the &lt;code>--nv&lt;/code> flag to &lt;a href="https://apptainer.org/docs/user/latest/gpu.html" target="_blank" rel="noopener">pass NVIDIA GPUs&lt;/a> to the container. This requires CUDA to be installed on the host system (i.e., &lt;code>module load cuda&lt;/code> first).&lt;/p></description></item><item><title>Create an orthomosaic in Agisoft</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/create-orthomosaic-agisoft/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/create-orthomosaic-agisoft/</guid><description>&lt;h1 id="orthomosiac-guide">Orthomosaic Guide&lt;/h1>
&lt;p>Written by Ben Weinstein, September 13, 2022&lt;/p>
&lt;p>The goal of this wiki is to document the steps to create an orthomosaic from a set of raw UAV images.
This relates to the Everglades project, which uses both the Inspire quadcopter and Wingtra drones, but it can serve as a general guide to the steps for creating a georeferenced image.&lt;/p>
&lt;h1 id="agisoft">Agisoft&lt;/h1>
&lt;p>This tutorial uses Agisoft Metashape Pro. Future work needs to determine whether the standard version will suffice.
We are following the manual here: &lt;a href="https://www.agisoft.com/pdf/metashape-pro_1_7_en.pdf" target="_blank" rel="noopener">https://www.agisoft.com/pdf/metashape-pro_1_7_en.pdf&lt;/a>
This tutorial is quick and dirty in the sense that we are choosing low quality outputs to speed up the workflow.&lt;/p>
&lt;h2 id="load-images">Load Images&lt;/h2>
&lt;p>Raw images look like:
&lt;img width="1702" alt="Screen Shot 2022-09-13 at 11 19 35 AM" src="https://user-images.githubusercontent.com/1208492/189980044-8d168d05-5207-4880-9c4c-0773f91b6d70.png">&lt;/p>
&lt;p>Workflow -&amp;gt; Add Folder
&lt;img width="1630" alt="Screen Shot 2022-09-13 at 11 21 12 AM" src="https://user-images.githubusercontent.com/1208492/189980448-df8fabaa-e583-4b49-ac24-aa560fb6c8a8.png">&lt;/p>
&lt;h2 id="align-photos">Align Photos&lt;/h2>
&lt;p>Align Photos places the images from a single physical camera into a common reference system. It creates a sparse point cloud from the overlap among images to estimate roughly where the images are in space.&lt;/p>
&lt;img width="1765" alt="Screen Shot 2022-09-13 at 11 27 44 AM" src="https://user-images.githubusercontent.com/1208492/189981585-9d22162a-3090-46f6-8312-1aa294a1d4cc.png">
&lt;p>To view the coordinate system and alignment, view the &amp;lsquo;reference&amp;rsquo; tab in the bottom left.
&lt;img width="1486" alt="Screen Shot 2022-09-13 at 11 29 45 AM" src="https://user-images.githubusercontent.com/1208492/189982143-23d88091-a64b-4eed-a4df-1cd94040a3f9.png">&lt;/p>
&lt;h3 id="set-coordinate-reference-system">Set Coordinate Reference System&lt;/h3>
&lt;p>We are confident that the Inspire knows its geospatial position to within about 10 m, and its pitch to within about 1 degree.
&lt;img width="1476" alt="Screen Shot 2022-09-13 at 11 38 42 AM" src="https://user-images.githubusercontent.com/1208492/189983775-5c3d4efa-d6a2-4c92-b3fc-4492810a44f7.png">&lt;/p>
&lt;h2 id="set-marker-points">Set Marker Points&lt;/h2>
&lt;p>If we don&amp;rsquo;t have ground control points we can skip this step.&lt;/p>
&lt;h2 id="build-dense-cloud">Build Dense Cloud&lt;/h2>
&lt;p>Workflow -&amp;gt; Build Dense Cloud&lt;/p>
&lt;p>The dense point cloud is required to build an elevation model to convert the 2D images into a 3D surface.&lt;/p>
&lt;h2 id="build-digital-elevation-model">Build Digital Elevation Model&lt;/h2>
&lt;p>The elevation model projects the images into 3D space and is saved as a separate .tif file.&lt;/p>
&lt;p>Workflow -&amp;gt; Build DEM&lt;/p>
&lt;h2 id="build-orthomosaic">Build Orthomosaic&lt;/h2>
&lt;p>Use the digital elevation model and the dense point cloud to create a single stitched model of the entire colony.&lt;/p>
&lt;img width="1651" alt="image" src="https://user-images.githubusercontent.com/1208492/189990313-2fd7e259-e314-46b6-b811-e14a8e7795b2.png"></description></item><item><title>File Compression Notes</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/file-compression/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/file-compression/</guid><description>&lt;h2 id="efficient-large-volume-decompression">Efficient large volume (de)compression&lt;/h2>
&lt;p>When archiving large volumes of data, using parallel and highly efficient compression algorithms can be useful.
We most commonly do this when archiving old projects on the HPC.&lt;/p>
&lt;p>On Linux (and our HPC) one of the easy ways to do this is with tar with zstd compression.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">tar --use-compress-program&lt;span class="o">=&lt;/span>zstd -cvf my_archive.tar.zst /path/to/archive
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>If you need to pass arguments to zstd, they can be included in quotes, e.g., &lt;code>--use-compress-program='zstd -v'&lt;/code>.&lt;/p>
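&lt;p>As a quick sanity check of the &lt;code>--use-compress-program&lt;/code> pattern, here is a round trip using gzip as a stand-in compressor (the same flag works with zstd or any compressor on the PATH; the file and directory names are examples):&lt;/p>

```shell
# Create a small directory to archive
mkdir -p demo_dir
echo "hello" > demo_dir/file.txt

# Archive with an external compressor (gzip standing in for zstd)
tar --use-compress-program=gzip -cf demo.tar.gz demo_dir

# Extract it again; tar invokes the program with -d to decompress
mkdir -p extracted
tar --use-compress-program=gzip -xf demo.tar.gz -C extracted

cat extracted/demo_dir/file.txt
```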
&lt;p>To uncompress these archives:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">tar --use-compress-program&lt;span class="o">=&lt;/span>unzstd -xvf my_archive.tar.zst
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="ignoring-failed-reads-using-tar">Ignoring failed reads using tar&lt;/h2>
&lt;p>When archiving files with tar, the archive will fail if any file cannot be read by the account doing the archiving.
This is a common occurrence when archiving on the HPC, and the unreadable files are often (but not always) hidden files that don&amp;rsquo;t need to be archived (but definitely check to make sure).
You can ignore these failed reads using the &lt;code>--ignore-failed-read&lt;/code> flag.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">tar --ignore-failed-read -cvf my_archive.tar.zst /path/to/archive
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="fixing-a-corrupted-zip-file">Fixing a corrupted zip file&lt;/h2>
&lt;h3 id="using-zip">Using zip&lt;/h3>
&lt;p>If you try to open a zip file and it won&amp;rsquo;t unzip you can often fix it by rezipping the file (&lt;a href="https://superuser.com/questions/23290/terminal-tool-linux-for-repair-corrupted-zip-files" target="_blank" rel="noopener">source&lt;/a>).&lt;/p>
&lt;p>First, try:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">zip -F corrupted.zip --out fixed.zip
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>If that doesn&amp;rsquo;t work try:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">zip -FF corrupted.zip --out fixed.zip
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>If you receive an error message like:&lt;/p>
&lt;blockquote>
&lt;p>zip error: Entry too big to split, read, or write (Poor compression resulted in unexpectedly large entry - try -fz)&lt;/p>
&lt;/blockquote>
&lt;p>then:&lt;/p>
&lt;ol>
&lt;li>Make sure you have at least version 3.0 of &lt;code>zip&lt;/code>&lt;/li>
&lt;li>Try adding &lt;code>-fz&lt;/code> to the command&lt;/li>
&lt;/ol>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">zip -FF -fz corrupted.zip --out fixed.zip
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="using-p7zip">Using p7zip&lt;/h3>
&lt;p>If none of this works try &lt;a href="https://7-zip.org/" target="_blank" rel="noopener">p7zip&lt;/a>, which can be installed using conda.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">conda create -n p7zip &lt;span class="nv">python&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="m">3&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">conda activate p7zip
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">conda install -c bioconda p7zip
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This version is fairly out of date, but much less so than the one in the HiPerGator module system.
The one in the HiPerGator module is too old to solve the problems we&amp;rsquo;ve seen.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">7za x corrupted.zip
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Note that this will decompress into the current working directory, not into &lt;code>./corrupted/&lt;/code>.&lt;/p>
&lt;h2 id="increasing-compression">Increasing compression&lt;/h2>
&lt;p>There is a tradeoff between how long it takes to compress something and how much smaller it gets.
When using &lt;code>zip&lt;/code> this is controlled by a numeric argument ranging from 1 (faster) to 9 (smaller).
So, if you&amp;rsquo;re archiving large objects, try using &lt;code>zip -9&lt;/code>.&lt;/p>
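&lt;p>The recompression loop below leans on bash parameter expansion to strip file extensions. As a minimal illustration of the two forms it uses (the filename is an example):&lt;/p>

```shell
f="results.zip"

# ${f%.zip} removes a trailing ".zip"
echo "${f%.zip}"   # results

# ${f%.*} removes the last extension, whatever it is
echo "${f%.*}"     # results
```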
&lt;p>If you have a bunch of already zipped files you can recompress them using the following bash loop:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">&lt;span class="k">for&lt;/span> f in *.zip
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">do&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> mkdir &lt;span class="si">${&lt;/span>&lt;span class="nv">f&lt;/span>&lt;span class="p">%.*&lt;/span>&lt;span class="si">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> unzip -d &lt;span class="si">${&lt;/span>&lt;span class="nv">f&lt;/span>&lt;span class="p">%.zip&lt;/span>&lt;span class="si">}&lt;/span> &lt;span class="nv">$f&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> rm &lt;span class="nv">$f&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> rm &lt;span class="si">${&lt;/span>&lt;span class="nv">f&lt;/span>&lt;span class="p">%.*&lt;/span>&lt;span class="si">}&lt;/span>/&lt;span class="si">${&lt;/span>&lt;span class="nv">f&lt;/span>&lt;span class="p">%.*&lt;/span>&lt;span class="si">}&lt;/span>.csv
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> zip -r -9 &lt;span class="nv">$f&lt;/span> &lt;span class="si">${&lt;/span>&lt;span class="nv">f&lt;/span>&lt;span class="p">%.zip&lt;/span>&lt;span class="si">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">done&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div></description></item><item><title>Geospatial Computing from the Command Line</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/geospatial-computing/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/geospatial-computing/</guid><description>&lt;h2 id="installation">Installation&lt;/h2>
&lt;p>The commands on this page use &lt;code>gdal&lt;/code> and &lt;code>jq&lt;/code>.&lt;/p>
&lt;h3 id="conda">Conda&lt;/h3>
&lt;p>You can install these packages on any operating system using conda/mamba.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">mamba create -n my-gdal-env &lt;span class="nv">python&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="m">3&lt;/span> gdal jq
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Then activate the environment every time you want to work with gdal.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">mamba activate my-gdal-env
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="linux">Linux&lt;/h3>
&lt;p>On Ubuntu these can be installed using:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">sudo apt install gdal-bin jq
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="get-information-about-a-raster">Get information about a raster&lt;/h2>
&lt;p>&lt;code>gdalinfo&lt;/code> provides information about raster files.&lt;/p>
&lt;p>&lt;code>gdalinfo myraster.tif&lt;/code> will produce a basic readable output to the screen.&lt;/p>
&lt;p>This output can also be written to JSON&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">gdalinfo -json myraster.tif
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Writing to JSON makes it easy to use individual pieces, e.g., to look up the dimensions of the raster (you&amp;rsquo;ll need to install &lt;code>jq&lt;/code> to do this).&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">gdalinfo -json myraster.tif &lt;span class="p">|&lt;/span> jq -r .size
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>or just the width of the raster&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">gdalinfo -json myraster.tif &lt;span class="p">|&lt;/span> jq -r .size&lt;span class="o">[&lt;/span>0&lt;span class="o">]&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="splitting-rasters-using-gdal">Splitting rasters using gdal&lt;/h2>
&lt;p>One way to split a raster into pieces is to use the &lt;code>gdal_retile.py&lt;/code> Python script bundled with &lt;code>gdal&lt;/code>.&lt;/p>
&lt;p>The following command will split &lt;code>myraster.tif&lt;/code> into 1500x1500 pixel rasters stored in &lt;code>outputdir&lt;/code>.
The first number is the width (in pixels) and the second is the height (in pixels) of each chunk.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">gdal_retile.py -ps &lt;span class="m">1500&lt;/span> &lt;span class="m">1500&lt;/span> -targetDir outputdir myraster.tif
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Files will be labeled with &lt;code>_row_col&lt;/code> and so if the original image was 4500x1500 then the above command would produce three output files:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">myraster_1_1.tif
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">myraster_2_1.tif
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">myraster_3_1.tif
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Representing the top of the original raster (&lt;code>_1_1&lt;/code>), the middle of the original raster (&lt;code>_2_1&lt;/code>), and the bottom of the original raster (&lt;code>_3_1&lt;/code>).&lt;/p>
&lt;h3 id="split-raster-into-horizontal-strips">Split raster into horizontal strips&lt;/h3>
&lt;p>Our most common usage is to split large rasters into horizontal strips with manageable file sizes (&amp;lt; 3 GB). This can be automated by changing &lt;code>myraster.tif&lt;/code> to the location of your raster and &lt;code>outputdir&lt;/code> to the directory you want the split raster pieces stored in and running the code below:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">&lt;span class="nv">RASTER&lt;/span>&lt;span class="o">=&lt;/span>myraster.tif
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nv">OUTPUTDIR&lt;/span>&lt;span class="o">=&lt;/span>outputdir
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nv">WIDTH&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="k">$(&lt;/span>gdalinfo -json &lt;span class="nv">$RASTER&lt;/span> &lt;span class="p">|&lt;/span> jq -r .size&lt;span class="o">[&lt;/span>0&lt;span class="o">]&lt;/span>&lt;span class="k">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nv">WIDTHPAD&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="k">$((&lt;/span>WIDTH &lt;span class="o">+&lt;/span> &lt;span class="m">10&lt;/span>&lt;span class="k">))&lt;/span> &lt;span class="c1"># Padding prevents periodic inclusion of single pixel strip &lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nv">HEIGHT&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="k">$(&lt;/span>expr &lt;span class="m">1000000000&lt;/span> / &lt;span class="nv">$WIDTH&lt;/span>&lt;span class="k">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">gdal_retile.py -ps &lt;span class="nv">$WIDTHPAD&lt;/span> &lt;span class="nv">$HEIGHT&lt;/span> -targetDir &lt;span class="nv">$OUTPUTDIR&lt;/span> &lt;span class="nv">$RASTER&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="combiningmerging-rasters-using-gdal">Combining/merging rasters using gdal&lt;/h2>
&lt;h3 id="merging-rasters">Merging rasters&lt;/h3>
&lt;p>One way to combine rasters is to use the &lt;code>gdal_merge.py&lt;/code> Python script bundled with &lt;code>gdal&lt;/code>.
We use LZW compression to reduce file sizes while ensuring that the resulting GeoTIFF can be used in all geospatial computing systems.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">gdal_merge.py -o output_file.tif input_file_1.tif input_file_2.tif input_file_3.tif
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">gdal_translate -co &lt;span class="nv">COMPRESS&lt;/span>&lt;span class="o">=&lt;/span>LZW -co &lt;span class="nv">PREDICTOR&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="m">2&lt;/span> -co &lt;span class="nv">BIGTIFF&lt;/span>&lt;span class="o">=&lt;/span>YES output_file.tif compressed_file.tif
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>We use this two-command approach, instead of including compression in the merge command, because &lt;code>gdal_merge.py&lt;/code> doesn&amp;rsquo;t currently support BigTIFF creation correctly and many of our combined files are greater than the 4GB maximum for regular TIFFs.
&lt;code>PREDICTOR=2&lt;/code> produces more efficient compression in the presence of spatial autocorrelation, which we often have.&lt;/p>
&lt;h3 id="virtually-combining-rasters">Virtually combining rasters&lt;/h3>
&lt;p>Instead of actually merging the rasters you can create a virtual raster in a vrt file.
This file includes metadata on the positions of all of the rasters, which can be loaded into a GIS and viewed like a single raster.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">gdalbuildvrt virtual_combined_raster.vrt *.tif
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="removing-alpha-channels">Removing alpha channels&lt;/h2>
&lt;p>All of our code works with 3 band RGB rasters.
Occasionally we accidentally produce a raster that contains a 4th alpha channel.
This can be removed using GDAL.&lt;/p>
&lt;p>First check to make sure the bands you want are the first three bands (they pretty much always are):&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">gdalinfo four_band_ortho.tif
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This should show something like the following info about channels:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">Band &lt;span class="m">1&lt;/span> &lt;span class="nv">Block&lt;/span>&lt;span class="o">=&lt;/span>1501x1 &lt;span class="nv">Type&lt;/span>&lt;span class="o">=&lt;/span>Byte, &lt;span class="nv">ColorInterp&lt;/span>&lt;span class="o">=&lt;/span>Red
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Band &lt;span class="m">2&lt;/span> &lt;span class="nv">Block&lt;/span>&lt;span class="o">=&lt;/span>1501x1 &lt;span class="nv">Type&lt;/span>&lt;span class="o">=&lt;/span>Byte, &lt;span class="nv">ColorInterp&lt;/span>&lt;span class="o">=&lt;/span>Green
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Band &lt;span class="m">3&lt;/span> &lt;span class="nv">Block&lt;/span>&lt;span class="o">=&lt;/span>1501x1 &lt;span class="nv">Type&lt;/span>&lt;span class="o">=&lt;/span>Byte, &lt;span class="nv">ColorInterp&lt;/span>&lt;span class="o">=&lt;/span>Blue
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Band &lt;span class="m">4&lt;/span> &lt;span class="nv">Block&lt;/span>&lt;span class="o">=&lt;/span>1501x1 &lt;span class="nv">Type&lt;/span>&lt;span class="o">=&lt;/span>Byte, &lt;span class="nv">ColorInterp&lt;/span>&lt;span class="o">=&lt;/span>Alpha
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Then use &lt;code>gdal_translate&lt;/code> to keep just the first three bands:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">gdal_translate -b &lt;span class="m">1&lt;/span> -b &lt;span class="m">2&lt;/span> -b &lt;span class="m">3&lt;/span> four_band_ortho.tif three_band_ortho.tif
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;a href="https://support.skycatch.com/hc/en-us/articles/219178537-Removing-the-Alpha-Channel-from-Your-Orthotiff" target="_blank" rel="noopener">Original source&lt;/a>&lt;/p></description></item><item><title>Collaborating Using Git &amp; GitHub</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/git-collaboration/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/git-collaboration/</guid><description>&lt;h2 id="intro">Intro&lt;/h2>
&lt;p>This is intended to be a default set of procedures for Weecologists to collaborate using a Git/GitHub repository. For projects that are primarily being worked on by one person, this is probably unnecessary, but you may want to follow it anyway to ingrain the workflow practices.&lt;/p>
&lt;h2 id="setup">Setup&lt;/h2>
&lt;p>&lt;em>If you haven&amp;rsquo;t done so already, please check out the onboarding &lt;a href="https://deploy-preview-83--weecology-wiki.netlify.app/docs/getting-started/new-member-onboarding/#git-and-github">section with links to Git and GitHub resources&lt;/a>.&lt;/em>&lt;/p>
&lt;p>In this guide, we presume that there is a single repo on GitHub and multiple users, who work on clones of that repo (on their local machines), and interface through GitHub.&lt;/p>
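The setup above can be sketched from the command line. This sketch uses a throwaway local bare repository to stand in for GitHub (the <code>hub.git</code> and <code>portalr</code> names here are made up for illustration); for a real project you would clone the repository's GitHub URL instead:

```shell
# Create a scratch directory with a bare repo standing in for GitHub
tmp=$(mktemp -d)
cd "$tmp"
git init --bare hub.git

# Each collaborator makes a local clone to work in
git clone hub.git portalr
cd portalr

# The clone automatically gets a remote named "origin" pointing at the source
git remote -v
```

Cloning a real GitHub repo works the same way: `git clone https://github.com/weecology/portalr.git` creates the local copy and names the GitHub repo `origin`.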
&lt;h2 id="branching">Branching&lt;/h2>
&lt;p>One way of thinking about git branches is that each branch represents a &amp;ldquo;lineage&amp;rdquo; of commits in a repo. By default, git repos have a &lt;code>master&lt;/code> branch (newer versions of git default to &lt;code>main&lt;/code>), and adding commits to a new repo will create iterative versions of the project, all considered to be part of the &lt;code>master&lt;/code> branch.&lt;/p>
&lt;p>You can see the branches in your project using &lt;code>git branch&lt;/code> from the command line while in the folder with a git repo. This will list the branches in the repo:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">~/projects/portalr &amp;gt; git branch
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl"> biomass-function
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> hao-data-vignette
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">* master
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Here, &lt;code>master&lt;/code> is marked with an asterisk (and possibly a different color) to indicate that it is the &amp;ldquo;active&amp;rdquo; branch. What this means is that new commits added to the repo will be derived from the end of the master branch and included as part of that branch.&lt;/p>
&lt;h3 id="making-new-branches">Making New Branches&lt;/h3>
&lt;p>We can create new branches by specifying a new branch name when using the &lt;code>git branch&lt;/code> command. This allows us to start a new &amp;ldquo;lineage&amp;rdquo; of commits from the current state of the repo.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">~/projects/portalr &amp;gt; git branch hao-test-branch
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>When we look at the branches, we now see:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">~/projects/portalr &amp;gt; git branch
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl"> biomass-function
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> hao-data-vignette
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> hao-test-branch
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">* master
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Notice that the active branch is still &amp;ldquo;master&amp;rdquo;.&lt;/p>
&lt;h3 id="switching-branches">Switching Branches&lt;/h3>
&lt;p>To change the active branch, we use the &lt;code>git checkout&lt;/code> command:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">~/projects/portalr &amp;gt; git checkout hao-test-branch
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">Switched to branch &amp;#39;hao-test-branch&amp;#39;
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This is what it looks like when we run &lt;code>git branch&lt;/code> afterward:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">~/projects/portalr &amp;gt; git branch
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl"> biomass-function
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> hao-data-vignette
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">* hao-test-branch
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> master
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="pushing-to-github">Pushing to GitHub&lt;/h3>
&lt;p>After we have created a branch on our local clone of the repo, and made some commits, we might want to push those commits to GitHub. The first time we do so, however, we encounter an error:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">~/projects/portalr &amp;gt; git push
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">fatal: The current branch hao-test-branch has no upstream branch.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">To push the current branch and set the remote as upstream, use
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> git push --set-upstream origin hao-test-branch
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The reason for this error is that the repo on GitHub does not have the branch &lt;code>hao-test-branch&lt;/code>, and commits have to be assigned to a branch. The suggested command does several things at once:&lt;/p>
&lt;ol>
&lt;li>create a branch called &lt;code>hao-test-branch&lt;/code> on the GitHub repo (which has the remote name &lt;code>origin&lt;/code>)&lt;/li>
&lt;li>establish a link between the local branch called &lt;code>hao-test-branch&lt;/code> and the GitHub branch called &lt;code>hao-test-branch&lt;/code>&lt;/li>
&lt;li>push the local commits on &lt;code>hao-test-branch&lt;/code> to GitHub.&lt;/li>
&lt;/ol>
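Here is a self-contained sketch of that first push, again using a throwaway local bare repo in place of GitHub (names like <code>hub.git</code> and <code>hao-test-branch</code> are illustrative):

```shell
# Set up a throwaway "GitHub" remote and a working clone
tmp=$(mktemp -d)
cd "$tmp"
git init --bare hub.git
git clone hub.git work
cd work
git config user.email "you@example.com"
git config user.name "Your Name"
git commit --allow-empty -m "initial commit"

# Create and switch to a new branch; the remote does not know about it yet
git branch hao-test-branch
git checkout hao-test-branch

# First push: create the branch on origin and link it to the local branch
git push --set-upstream origin hao-test-branch

# The local branch now tracks origin/hao-test-branch
git branch -vv
```

After this, a plain `git push` or `git pull` on that branch knows where to go.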
&lt;h3 id="pulling-from-github">Pulling from GitHub&lt;/h3>
&lt;p>Suppose someone has started making an update, pushed it to GitHub, and wants your help before merging it into the master branch. How do you download that new branch?&lt;/p>
&lt;p>First, make sure we get all the information from the GitHub repo. This assumes that the GitHub repo is named as the &amp;ldquo;origin&amp;rdquo; remote (which is the default).&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">~/projects/portalr &amp;gt; git fetch origin
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>We can then view the possible branches using&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">~/projects/portalr &amp;gt; git branch -r
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-gdscript3" data-lang="gdscript3">&lt;span class="line">&lt;span class="cl"> &lt;span class="n">origin&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">biomass&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">function&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">origin&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">fix&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">test&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">origin&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">hao&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">data&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">vignette&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">origin&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">hao&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="k">export&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">obs&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="k">func&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">origin&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">hao&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">loadData&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">update&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">origin&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">hao&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">reorder&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">args&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">remove&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">incomplete&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">censuses&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">origin&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">master&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">origin&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">namespace_issue&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">origin&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">namespaceissues&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">origin&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">standardize_column_names&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>We want to create a local branch to mirror the &amp;ldquo;fix-test&amp;rdquo; branch:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">~/projects/portalr &amp;gt; git checkout -b fix-test origin/fix-test
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">Branch fix-test set up to track remote branch fix-test from origin.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Switched to a new branch &amp;#39;fix-test&amp;#39;
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This has done several things: it retrieved the branch from GitHub to our local machine, set up tracking, and changed the current active branch. Now, if we make new commits to the local copy of the branch, we are able to push directly to that corresponding branch on GitHub.&lt;/p>
&lt;h2 id="pull-requests">Pull Requests&lt;/h2>
&lt;p>The preference is to use GitHub to merge the updates on a new branch back into &lt;code>master&lt;/code>. We can do this by going to the &amp;ldquo;Pull requests&amp;rdquo; tab on the GitHub repo page and creating a &amp;ldquo;New pull request&amp;rdquo;.
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/github_PR_tab.png" alt="" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>Suppose we want to merge from &lt;code>hao-test-branch&lt;/code> into &lt;code>master&lt;/code>. Then we select &lt;code>master&lt;/code> as the &amp;ldquo;base:&amp;rdquo; branch, and &lt;code>hao-test-branch&lt;/code> as the &amp;ldquo;compare:&amp;rdquo; branch. We can then write some comments for our new pull request before clicking on &amp;ldquo;Create new pull request&amp;rdquo;.&lt;/p>
&lt;p>&lt;em>If the pull request fixes an issue, you can include keywords to &lt;a href="https://help.github.com/articles/closing-issues-using-keywords/" target="_blank" rel="noopener">automagically close&lt;/a> the issue when the pull request is merged.&lt;/em>&lt;/p>
&lt;h3 id="updating-pull-requests">Updating Pull Requests&lt;/h3>
&lt;p>At this point, other people can comment on the pull request itself in GitHub, if discussion regarding the changes needs to occur.&lt;/p>
&lt;p>Additionally, assuming that the pull request has not yet been merged, further commits &lt;em>to that branch on GitHub&lt;/em> are automatically included with the pull request. Thus, if you later find a bug, you can make further changes and not have to submit a new pull request.&lt;/p>
&lt;h3 id="merging-pull-requests">Merging Pull Requests&lt;/h3>
&lt;p>In general, check with one of the repo maintainers about merging pull requests. This ensures that the &lt;code>master&lt;/code> branch doesn&amp;rsquo;t break (too often) and that everyone is informed about changes.&lt;/p>
&lt;h2 id="summary-example">Summary Example&lt;/h2>
&lt;p>Objective: I want to fix issue #1 in the &lt;a href="https://github.com/weecology/portalr" target="_blank" rel="noopener">https://github.com/weecology/portalr&lt;/a> repo.&lt;/p>
&lt;ol>
&lt;li>Download the repo from GitHub and onto my local machine. [&lt;code>git clone&lt;/code>]&lt;/li>
&lt;li>In my local machine, create a new branch (e.g. &lt;code>hao-add-biomass-function&lt;/code>; prefacing the branch name with your name helps prevent branch name collisions). [&lt;code>git branch&lt;/code>]&lt;/li>
&lt;li>Switch to the new branch. [&lt;code>git checkout&lt;/code>]&lt;/li>
&lt;li>Make the updates on my local machine. [&lt;code>git commit&lt;/code>]&lt;/li>
&lt;li>Push the updates to GitHub. [&lt;code>git push&lt;/code>]&lt;/li>
&lt;li>Create the pull request on GitHub. [GitHub web interface]&lt;/li>
&lt;li>Merge the pull request on GitHub. [GitHub web interface]&lt;/li>
&lt;li>On my local machine, switch back to the master branch. [&lt;code>git checkout&lt;/code>]&lt;/li>
&lt;li>Get the updates to the master branch. [&lt;code>git pull&lt;/code>]&lt;/li>
&lt;li>(optionally) Delete the branch on GitHub. [GitHub web interface, &amp;ldquo;Code&amp;rdquo; tab, &amp;ldquo;## branches&amp;rdquo;]&lt;/li>
&lt;li>(optionally) Delete the branch on my local machine. [&lt;code>git branch -d hao-add-biomass-function&lt;/code>]&lt;/li>
&lt;/ol></description></item><item><title>Git Tips</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/git-tips/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/git-tips/</guid><description>&lt;h2 id="deleting-all-merged-branches-locally">Deleting all merged branches (locally)&lt;/h2>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">git switch main
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">git branch --merged &lt;span class="p">|&lt;/span> egrep -v &lt;span class="s2">&amp;#34;(^\*|master|main|dev)&amp;#34;&lt;/span> &lt;span class="p">|&lt;/span> xargs git branch -d
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Source: &lt;a href="https://stackoverflow.com/a/6127884/133513" target="_blank" rel="noopener">https://stackoverflow.com/a/6127884/133513&lt;/a>&lt;/p>
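To see what the `egrep` filter in the pipeline above actually keeps, you can feed it some fake `git branch --merged` output (the branch names here are made up):

```shell
# Lines matching the pattern (the current branch, master, main, dev) are dropped;
# everything that survives would be passed to `git branch -d`
printf '  biomass-function\n* main\n  master\n  old-feature\n' \
  | egrep -v "(^\*|master|main|dev)"
# prints:
#   biomass-function
#   old-feature
```

Note that the pattern matches substrings, so a merged branch named, say, `remaining-fixes` would also be excluded from deletion because it contains `main`; that errs on the safe side (nothing extra gets deleted).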
&lt;h2 id="pretty-git-log">Pretty git Log&lt;/h2>
&lt;p>(Hao) I have this in my &lt;code>~/.gitconfig&lt;/code> file to enable the &lt;code>git lg&lt;/code> alias on the command line.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">[alias]
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">lg1 = log --graph --abbrev-commit --decorate --format=format:&amp;#39;%C(bold blue)%h%C(reset) - %C(bold green)(%ar)%C(reset) %C(white)%&amp;lt;(40,trunc)%s%C(reset) %C(reverse white)- %an%C(reset)%C(bold yellow)%d%C(reset)&amp;#39; --all
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">lg2 = log --graph --abbrev-commit --decorate --format=format:&amp;#39;%C(bold blue)%h%C(reset) - %C(bold cyan)%aD%C(reset) %C(bold green)(%ar)%C(reset)%C(bold yellow)%d%C(reset)%n&amp;#39;&amp;#39; %C(white)%s%C(reset) %C(dim white)- %an%C(reset)&amp;#39; --all
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">lg = !&amp;#34;git lg1&amp;#34;
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div></description></item><item><title>GitHub Actions</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/github-actions/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/github-actions/</guid><description>&lt;h2 id="generic-tutorial">Generic Tutorial&lt;/h2>
&lt;p>This is an example tutorial on how to set up a GitHub Actions workflow.
If the project already has a Travis CI file, most of the work is copying and pasting the commands into the different stages of the online template provided by GitHub.&lt;/p>
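For reference, workflows live as YAML files under `.github/workflows/` in the repo. This is a minimal, hypothetical example (the file name, triggers, and test command are placeholders to adapt):

```yaml
# .github/workflows/tests.yml (hypothetical example)
name: tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run tests
        run: echo "replace with your test command"
```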
&lt;p>&lt;a href="https://www.youtube.com/watch?v=F3wZTDmHCFA" target="_blank" rel="noopener">https://www.youtube.com/watch?v=F3wZTDmHCFA&lt;/a>&lt;/p></description></item><item><title>Globus for file transfer</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/globus/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/globus/</guid><description>&lt;p>&lt;a href="https://app.globus.org" target="_blank" rel="noopener">Globus&lt;/a> is a useful file manager for transferring files between your computer and the HiPerGator, or from place to place on the HiPerGator.&lt;/p>
&lt;p>&lt;a href="https://help.rc.ufl.edu/doc/Globus" target="_blank" rel="noopener">Here is the HiPerGator guide to Globus.&lt;/a> It may be more up to date.&lt;/p>
&lt;p>Briefly, to use Globus, you need to create an account with your UF login. You may need to request authorization for UF Research Computing access. Once you have access, you can use the online Globus interface to set up file transfers between different locations (called &amp;ldquo;endpoints&amp;rdquo;) on the HiPerGator.&lt;/p>
&lt;p>To transfer files to and from your computer, you need to install the Globus Connect Personal client on your computer and set it up as an &amp;ldquo;endpoint&amp;rdquo;. Then you can set up transfers to and from your computer as well.&lt;/p></description></item><item><title>HiPerGator Intro Guide</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/hipergator-intro-guide/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/hipergator-intro-guide/</guid><description>&lt;h1 id="so-you-want-to-run-your-r-or-python-script-on-the-hipergator">So you want to run your R or python script on the HiPerGator&lt;/h1>
&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;p>This guide gives a high level overview of how one goes about running R or python scripts on a high performance cluster (HPC). There are no coding examples here; instead it is designed to give you a frame of reference for how to approach things, and where other more detailed tutorials fit in the larger picture. The expected user is someone who is comfortable doing analysis and writing scripts in R using RStudio, or python with various IDEs, and has no HPC experience.&lt;/p>
&lt;p>This is written for users of the &lt;a href="https://www.rc.ufl.edu/get-started/hipergator/" target="_blank" rel="noopener">UFL HiPerGator&lt;/a> who code in R or python, but most of the information will apply to any HPC system and any scripting language.&lt;/p>
&lt;h2 id="hpc-use-cases">HPC Use cases&lt;/h2>
&lt;p>There are two scenarios where you may want to run your analysis script on the HPC.&lt;/p>
&lt;ol>
&lt;li>
&lt;p>&lt;strong>Your code takes a very long time to run&lt;/strong>&lt;br>
If your code takes several hours, days, or more to run on your personal computer, then using an HPC is likely a good option for two reasons. First, the servers on an HPC have more powerful processors than desktops and laptops, so with minimal changes your code will run significantly faster. Second, scripts on an HPC run independently of your personal computer, so you can shut down your personal computer while the script runs overnight or over the weekend. HPC systems can have a time limit of several weeks to a month for any single job.&lt;br>
If your script takes so long that it seems like it will never finish then an HPC can be especially beneficial. Say you do a test run on 1% of your data and it takes 2 days to run. Theoretically it will then take 200 days to run on your full dataset. In this case the benefit of the HPC is its parallel processing power.&lt;br>
By default, R and python scripts run on a single processor. Most computers today have 4-8 processors, though, and HPC servers have upwards of 64. If you spread the work out to multiple processors you can significantly decrease the run time. For example: a script that takes 1 hour to run can potentially take 30 minutes with 2 processors, 15 minutes with 4 processors, 7.5 minutes with 8 processors, and so on. There is computational overhead with parallel processing, though, so halving the time by doubling the number of processors is only a rough rule. Making your code parallel is not a straightforward change and it will take time. See below about making scripts parallel and whether it’s even worth it.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Your script fills up your computer&amp;rsquo;s memory and crashes when it runs.&lt;/strong>&lt;br>
When you have large datasets, such as raster files, it’s easy to use up all the memory and freeze your computer. As opposed to just waiting on long running scripts, in this case it makes it nearly impossible to do analysis. Just like HPC servers have powerful processors, they also have extremely large amounts of memory. Usually greater than 100GB. There is a good chance that they can handle whatever large datasets you throw at them.&lt;/p>
&lt;/li>
&lt;/ol>
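The parallel speedup described above can be illustrated even at the shell level: the same idea (split the work into chunks and hand them to several processors at once) is what R's `parallel` package or python's `multiprocessing` do internally. A toy sketch with `xargs` (the chunk numbers and echo command are stand-ins for real work):

```shell
# Run 8 independent "chunks" of work, at most 4 at a time (-P 4)
seq 1 8 | xargs -P 4 -I{} sh -c 'echo "processed chunk {}"'
```

In real use, each chunk would be a call to your script with different inputs (e.g. a hypothetical `Rscript analysis.R chunk_3`), and the output lines may appear in any order because the chunks run concurrently.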
&lt;h2 id="should-i-bother-with-an-hpc">Should I bother with an HPC?&lt;/h2>
&lt;p>Your analysis and data can be of &lt;em>any&lt;/em> size. There is no minimum computational requirement to use an HPC. But understand there is a time cost involved with learning how to interact with an HPC and also optimizing your code so it runs most efficiently. Therefore, in some cases it isn’t worth porting scripts to an HPC system.&lt;/p>
&lt;p>Consider an example where you have a script that takes 1 hour on your laptop, and you must run it once a month. It’s likely reasonable to just keep that workflow. But if a script takes 10 hours and you must run it once a week, then it’s worth considering doing it on an HPC. Especially since that will decrease the wear and tear on your laptop and free up that 10 hours for other uses.&lt;/p>
&lt;p>Whether it’s worth it or not is unique to every situation. Also remember that once you learn all the HPC basics the first time, then that time cost isn&amp;rsquo;t needed for your next project.&lt;/p>
&lt;p>Also consider that the two use cases described above might also be solvable by code optimization. If you can find a section of code which is slow and make it run fast enough to meet your needs, that is preferable over running the code on an HPC. There is no one solution to this, but a good starting point is Hadley Wickham’s Advanced R tutorial on &lt;a href="http://adv-r.had.co.nz/Performance.html" target="_blank" rel="noopener">Performance and Profiling&lt;/a>. This &lt;a href="https://youtu.be/K_90QGUPYCA" target="_blank" rel="noopener">45 minute video&lt;/a> also gives a great overview of profiling, optimization, parallel processing, and the implications in R.&lt;/p>
&lt;h2 id="what-exactly-is-an-hpc">What exactly is an HPC?&lt;/h2>
&lt;p>A high performance cluster (HPC) is primarily two things.&lt;/p>
&lt;ol>
&lt;li>
&lt;p>It’s hundreds of individual servers in a data center. Each server is a computer just like your personal computer, but has more powerful components, and does not have a graphical user interface or even a monitor. You interact with the servers via the command line. If you’ve never used the command line, imagine that the RStudio console or a python prompt was your &lt;em>only&lt;/em> way to interact with a computer. More on this below.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>It’s a system for scheduling, prioritizing, and running scripts from hundreds of users. This is how the hundreds of servers can be used as “one”. Access to them is controlled by scheduling programs, which put your scripts in a queue to be run when resources are free. Slurm is probably the most popular scheduler (and the one used on the HiPerGator) but some HPC systems may use other ones like &lt;a href="https://en.wikipedia.org/wiki/Job_scheduler#Batch_queuing_for_HPC_clusters" target="_blank" rel="noopener">PBS or MOAB&lt;/a>.&lt;/p>
&lt;/li>
&lt;/ol>
&lt;h2 id="primary-steps-to-running-your-code-on-an-hpc">Primary Steps to running your code on an HPC.&lt;/h2>
&lt;ol>
&lt;li>
&lt;p>&lt;strong>You need an account&lt;/strong>.&lt;br>
Sign up for a HiPerGator account &lt;a href="https://www.rc.ufl.edu/access/account-request/" target="_blank" rel="noopener">here&lt;/a>.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>You must be able to log in to the HPC via SSH and use the command line.&lt;/strong>&lt;br>
The command line can also be referred to as the “unix shell”. With this you use text commands (just like the RStudio console) to copy files, edit text files, interact with the scheduler, view job status, etc. See the &lt;a href="https://help.rc.ufl.edu/doc/Getting_Started#Connecting_to_HiPerGator" target="_blank" rel="noopener">HiPerGator connection guide&lt;/a>.&lt;/p>
&lt;p>Some unix tutorials:&lt;/p>
&lt;ul>
&lt;li>SCINet &lt;a href="https://geospatial.101workbook.org/IntroductionToCommandLine/Unix/unix-basics-1.html" target="_blank" rel="noopener">Geospatial Unix Intro&lt;/a>&lt;/li>
&lt;li>Data Carpentry &lt;a href="https://swcarpentry.github.io/shell-novice/" target="_blank" rel="noopener">tutorial on the unix shell&lt;/a>&lt;/li>
&lt;/ul>
&lt;p>If you work on a Mac computer you have a full unix shell available already called the Terminal. For Windows users there are several options available. See the bottom of the Setup page for the Data Carpentry unix shell tutorial.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>You must optimize your code to run on the HPC.&lt;/strong>&lt;br>
This is potentially the trickiest part. At a minimum your code must be able to run independently without any interaction from you. Do you have one large (or even several) analysis script where you highlight different parts to run in the correct order? Or to check output before moving on? That will not work on an HPC. A single R or python file (aka a script) must run from start to finish and write some results to a file to be able to be useful on the HPC.&lt;/p>
&lt;p>For R, a good test for this is the Jobs tab in RStudio (next to the Console and Terminal tabs; not available in older versions of RStudio). This is &lt;em>very&lt;/em> analogous to running a script in an HPC environment. If your script can run as an RStudio Job &lt;em>without&lt;/em> copying the local environment or copying the job results anywhere (your script should write results to some file) then it should be able to run on the HPC.&lt;/p>
&lt;p>For python a good test is being able to run the script via the command line (ie. &lt;code>python my_analysis.py&lt;/code>). If you are using an IDE (like spyder or pycharm) then running a full script from start to finish using the “Run” option should also be sufficient.&lt;/p>
&lt;p>Having a script run without any interaction does not necessarily mean it needs to have parallel processing. See Should I make my code run parallel? below.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>You have to get your code and data onto the HPC.&lt;/strong>&lt;br>
You’ll need to use special programs to transfer files (both data and scripts) from your local computer to the HPC. For Windows this will be the WinSCP program, which uses the same username and password as logging into the command line. For Mac or Linux users you can use the Terminal to transfer files via the command line using the &lt;code>scp&lt;/code> command. Read more about the &lt;a href="https://scinet.usda.gov/guides/data/datatransfer#small-data-transfer-using-scp-and-rsync" target="_blank" rel="noopener">scp command here&lt;/a>. More on data transfer for the HiPerGator can be found in its documentation. Something you’ll see mentioned a lot is Globus, which is a useful (but not strictly required) tool when you need to transfer 100GB+ of data.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>You need to ensure you have the correct packages.&lt;/strong>&lt;br>
Most HPC systems will have common packages installed and ready to use. If not you’ll have to install them yourself. If you do this then the latest versions will be installed on the HPC, so it’s good practice to make sure all packages on your personal computer are up to date to so they match (in RStudio use Tools-&amp;gt; Check for package updates, in python use conda or pip to update all package to the latest version).
For python packages for your projects you’ll want to use environments with either conda or python virtual environments.&lt;/p>
&lt;p>If you run into errors installing R or python packages you&amp;rsquo;ll likely need to contact HPC support for help, especially if the errors involve missing system libraries. If you successfully install your own packages they will only be available to you, and not to anyone else using the HPC.&lt;/p>
&lt;p>Also take note of the &lt;a href="https://help.rc.ufl.edu/doc/Modules_Basic_Usage" target="_blank" rel="noopener">module&lt;/a> command on HiPerGator. This is used to load pre-installed software, including R and python themselves. It is covered in most tutorials about batch scripts (see next section).&lt;/p>
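&lt;p>One quick way to check that the two machines match is to print the versions of your key packages on both and compare. A minimal python sketch (the package list here is hypothetical; substitute your own dependencies):&lt;/p>

```python
import importlib.metadata

def report_versions(packages):
    """Return {package: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            versions[name] = None  # needs installing on this machine
    return versions

# Hypothetical dependency list; run the same check locally and on the HPC.
print(report_versions(["pip", "numpy"]))
```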
&lt;/li>
&lt;li>
&lt;p>&lt;strong>You can now submit your scripts to the scheduler.&lt;/strong>&lt;br>
Once your data and scripts are on the HPC system you can submit them to the scheduler to run. This involves defining “jobs” where you tell the scheduler what you need. Specifically: the location of your script, the resources needed (cpus and memory), the time needed to run your script, the location for putting script output and logs, etc.&lt;/p>
&lt;p>Jobs are defined via batch scripts which have a line for each piece of information.&lt;/p>
&lt;p>Some examples:&lt;/p>
&lt;ul>
&lt;li>A &lt;a href="https://geospatial.101workbook.org/Workshops/2-Session2-intro-to-ceres.html#batch-computing-on-ceres" target="_blank" rel="noopener">USDA tutorial on batch scripts&lt;/a>.&lt;/li>
&lt;li>A &lt;a href="https://help.rc.ufl.edu/doc/Sample_SLURM_Scripts" target="_blank" rel="noopener">sample of Hipergator batch scripts&lt;/a>.&lt;/li>
&lt;li>An &lt;a href="https://help.rc.ufl.edu/doc/Annotated_SLURM_Script" target="_blank" rel="noopener">annotated Hipergator batch script&lt;/a>.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>You might need to debug your script if it doesn’t run correctly.&lt;/strong>&lt;br>
It’s very common for scripts not to run the first time because they were written on a personal computer, where things like directory paths and package versions may differ. In this case it’s useful to debug your script on the HPC directly.
A good place to do this is an “interactive node” or “interactive session”. Here, instead of submitting a job to the queue, you request a new Unix shell with a small amount of resources attached. You can then run your scripts via the Rscript or python command, see the output directly, and make adjustments until they run successfully.&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://help.rc.ufl.edu/doc/Development_and_Testing" target="_blank" rel="noopener">HiperGator guide on interactive sessions&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://help.rc.ufl.edu/doc/GPU_Access#Interactive_Access" target="_blank" rel="noopener">HiperGator guide on interactive sessions when using GPUs&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>You can now get your results back.&lt;/strong>&lt;br>
This is the same process as putting scripts and data onto the HPC but in reverse.&lt;/p>
&lt;/li>
&lt;/ol>
&lt;h2 id="should-i-make-my-code-run-on-parallel-processes">Should I make my code run on parallel processes?&lt;/h2>
&lt;p>Before you dive into making your script parallel, do a quick cost/benefit analysis. It may take a full day or more to redo your code to take advantage of parallel processing, but the benefits can be extremely large. On the other hand, if your code already runs in a relatively short time (say, a few hours on your laptop and less than an hour on the HPC without any modification) and you&amp;rsquo;re happy with that, then making it parallel might not be worth it.&lt;/p>
&lt;p>If you do not use parallel processing then your jobs will always request just a single processor. This is perfectly fine as there is no minimum requirement for using an HPC.&lt;/p>
&lt;h2 id="make-your-scripts-use-parallel-processing">Make your scripts use parallel processing&lt;/h2>
&lt;p>By default R and python run on a single processor, while most computers today have 4-8. If you spread the work across multiple processors you can significantly decrease the time it takes to run.
For example: a script that takes 1 hour to run can potentially take 30 minutes with 2 processors, or 15 minutes with 4. To make your scripts run across multiple processors, you&amp;rsquo;ll have to make some adjustments to your code.&lt;/p>
&lt;p>For R users, if your code uses lapply to apply your main function to many items (e.g. fitting the same model to many species), you can swap it for mclapply from the parallel package without making any substantial changes. For more details and advanced uses, here are some short tutorials:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://beckmw.wordpress.com/2014/01/21/a-brief-foray-into-parallel-processing-with-r/" target="_blank" rel="noopener">A brief foray into parallel processing with R&lt;/a>&lt;/li>
&lt;li>&lt;a href="http://resbaz.github.io/r-intermediate-gapminder/19-foreach.html" target="_blank" rel="noopener">Software Carpentry Parallel Processing in R&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://swcarpentry.github.io/python-intermediate-mosquitoes/04-multiprocessing.html" target="_blank" rel="noopener">Software Carpentry Parallel Processing in python&lt;/a>&lt;/li>
&lt;/ul>
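&lt;p>The same swap-a-serial-map-for-a-parallel-one pattern is available in python through the standard library&amp;rsquo;s &lt;code>multiprocessing.Pool&lt;/code>. A minimal sketch (the &lt;code>slow_square&lt;/code> function is a hypothetical stand-in for your real per-item analysis):&lt;/p>

```python
from multiprocessing import Pool

def slow_square(x):
    # Hypothetical stand-in for an expensive per-item task,
    # e.g. fitting the same model to one species.
    return x * x

if __name__ == "__main__":
    items = list(range(10))

    # Serial version.
    serial = [slow_square(x) for x in items]

    # Parallel version: same call shape, spread over 2 worker processes.
    with Pool(processes=2) as pool:
        parallel = pool.map(slow_square, items)

    assert serial == parallel
```

On the HPC you would raise &lt;code>processes&lt;/code> to match the number of CPUs requested in your batch script.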
&lt;p>Some notes:&lt;/p>
&lt;ul>
&lt;li>If your code already uses functions and for loops, it should be straightforward to make it parallel, unless each pass through the loop depends on the outcome from previous passes.&lt;/li>
&lt;li>On your own computer, never set the number of processors used to the maximum available. This would take away the processing power needed to run the operating system, browser, and other programs, and could potentially crash your computer. To test out parallel code on my computer I set the number of processors to 2 (out of 8 available), then increase it once the scripts are moved to the HPC.&lt;/li>
&lt;/ul>
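&lt;p>One way to follow the second note above is to compute the worker count from the environment, so the same script behaves sensibly on a laptop and under SLURM. A sketch (the reserve of 2 cores mirrors the convention described above):&lt;/p>

```python
import os

def safe_worker_count(reserve=2):
    """Pick a worker-process count that won't starve the rest of the system."""
    # Under SLURM, respect the job's allocation rather than the physical cores.
    slurm = os.environ.get("SLURM_CPUS_PER_TASK")
    if slurm is not None:
        return int(slurm)
    # On a laptop, leave `reserve` cores free for the OS and other programs.
    available = os.cpu_count() or 1
    return max(1, available - reserve)

print(safe_worker_count())
```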
&lt;h2 id="what-about-distributed-computing">What about distributed computing?&lt;/h2>
&lt;p>The links and examples for parallel computing above show you how to use the multiple processors in a single system. In the case of the HPC this means up to (usually) 64-128 processors on a single server. But what if you still need more processing power? In that case it&amp;rsquo;s possible to write parallel code which takes advantage of the processors on &lt;em>numerous&lt;/em> individual servers. This is how one uses hundreds or even thousands of processors.&lt;/p>
&lt;p>Going from single-system parallel processing to distributed computing is possible but will likely take even more work on your part. Here you might come across tutorials using MPI. MPI (Message-Passing Interface) is a protocol that lets analysis scripts communicate between servers in an HPC environment, enabling distributed computing. Packages for MPI are available in all common languages, including R, python, Julia, and MATLAB.&lt;/p>
&lt;p>Newer packages are available which either handle MPI in the background for you or implement newer protocols. The R package &lt;a href="https://mllg.github.io/batchtools/index.html" target="_blank" rel="noopener">batchtools&lt;/a> has many high-level functions for distributed computing. The python package &lt;a href="https://docs.dask.org/" target="_blank" rel="noopener">dask&lt;/a> is a state-of-the-art package for distributed computing, and the accompanying &lt;a href="https://jobqueue.dask.org" target="_blank" rel="noopener">jobqueue&lt;/a> package integrates it with SLURM and other HPC schedulers.&lt;/p>
&lt;h2 id="other-considerations-and-important-points">Other considerations and important points&lt;/h2>
&lt;p>The Research Computing group has a &lt;a href="https://help.rc.ufl.edu/doc/UFRC_Help_and_Documentation" target="_blank" rel="noopener">wiki on HiperGator usage&lt;/a>.&lt;/p>
&lt;p>Some common tasks and scripts are outlined in the &lt;a href="https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/hipergator-reference/">HiPerGator Reference Guide&lt;/a>.&lt;/p>
&lt;p>&lt;strong>Login Node&lt;/strong>: When you sign into the HPC there is a single landing server which you’ll start on. It’s important to never run actual scripts on this initial server. It should be used to submit jobs, request development nodes or interactive sessions, or transfer data in or out.&lt;/p>
&lt;p>&lt;strong>Partitions&lt;/strong>: HiperGator resources are divided into partitions, where each partition has a specific set of hardware and its own resource and time limits. Whenever you request resources you’ll specify which partition you want to use. See the &lt;a href="https://help.rc.ufl.edu/doc/Available_Node_Features" target="_blank" rel="noopener">partitions wiki page&lt;/a>.&lt;/p>
&lt;p>&lt;strong>Account limits&lt;/strong>: The resources you request (e.g. number of processors and amount of memory) are limited by how many credits your group has purchased. The number of jobs which can run concurrently is also determined by this. See more on the &lt;a href="https://help.rc.ufl.edu/doc/Account_and_QOS_limits_under_SLURM" target="_blank" rel="noopener">account limits wiki page&lt;/a>. This is also referred to as QOS (quality of service), a term coined in the &lt;a href="https://en.wikipedia.org/wiki/Quality_of_service" target="_blank" rel="noopener">early internet days&lt;/a>.&lt;/p>
&lt;p>&lt;strong>Processors/CPU/Cores/Sockets/Threads&lt;/strong>: Each of these things is technically different and has a distinct definition. For most users of an HPC system they can be thought of interchangeably though. When you request resources via a batch script, or other method, you’ll usually ask for multiple CPUs to implement parallel processing and leave it at that. Advanced users can read about different terminology &lt;a href="https://login.scg.stanford.edu/faqs/cores/" target="_blank" rel="noopener">here&lt;/a> or &lt;a href="https://slurm.schedmd.com/mc_support.html" target="_blank" rel="noopener">here&lt;/a>.&lt;/p></description></item><item><title>HiPerGator Reference</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/hipergator-reference/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/hipergator-reference/</guid><description>&lt;h2 id="what-is-hipergator">What is HiperGator?&lt;/h2>
&lt;p>A University of Florida supercomputing cluster.&lt;/p>
&lt;h2 id="why-should-i-use-it">Why should I use it?&lt;/h2>
&lt;p>HiperGator gives you access to large amounts of processing power, memory, and storage. This is useful for projects which can&amp;rsquo;t be run on your local laptop.&lt;/p>
&lt;h2 id="how-do-i-access-it">How do I access it?&lt;/h2>
&lt;ol start="0">
&lt;li>
&lt;p>&lt;a href="https://gravity.rc.ufl.edu/access/request-account/" target="_blank" rel="noopener">Request an account&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Connect with &lt;code>ssh &amp;lt;YOUR_USERNAME&amp;gt;@hpg2.rc.ufl.edu&lt;/code> from the Unix terminal or a Windows SSH client (&lt;a href="https://help.rc.ufl.edu/doc/Getting_Started" target="_blank" rel="noopener">more info here&lt;/a>). Enter your password when prompted.&lt;/p>
&lt;/li>
&lt;/ol>
&lt;p>Need help with command line? A good tutorial is available at &lt;a href="http://swcarpentry.github.io/shell-novice/" target="_blank" rel="noopener">Software Carpentry&lt;/a>.&lt;/p>
&lt;img src="https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/hipergator-login.png" height="400">
&lt;h2 id="how-do-i-run-a-job">How do I run a job?&lt;/h2>
&lt;p>For large analyses, you should submit a &lt;em>batch script&lt;/em> that tells HiperGator how to run your code. Let&amp;rsquo;s look at an example and walk through it.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">&lt;span class="cp">#!/bin/bash
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="cp">&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Job name and who to send updates to&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --job-name=&amp;lt;JOBNAME&amp;gt;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --mail-user=&amp;lt;EMAIL&amp;gt;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --mail-type=FAIL,END&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --account=ewhite&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --partition=hpg2-compute&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --qos=ewhite-b # Remove the `-b` if the script will take more than 4 days; see &amp;#34;bursting&amp;#34; below&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Where to put the outputs: %j expands into the job number (a unique identifier for this job)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --output my_job%j.out&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --error my_job%j.err&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Number of nodes to use&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --nodes=1&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Number of tasks (usually translate to processor cores) to use: important! this means the number of mpi ranks used, useless if you are not using Rmpi)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --ntasks=1 &lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#number of cores to parallelize with:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --cpus-per-task=15&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --mem=16000&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Memory per cpu core. Default is megabytes, but units can be specified with M &lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># or G for megabytes or Gigabytes.&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --mem-per-cpu=2G&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Job run time in [DAYS]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># HOURS:MINUTES:SECONDS&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># [DAYS] are optional, use when it is convenient&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --time=72:00:00&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Save some useful information to the &amp;#34;output&amp;#34; file&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">date&lt;span class="p">;&lt;/span>hostname&lt;span class="p">;&lt;/span>&lt;span class="nb">pwd&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Load R and run a script named my_R_script.R&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Rscript my_R_script.R
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>If you are successful, you&amp;rsquo;ll get a small message stating your job ID. Once your batch job is running, you can freely log out (or even turn off your local machine) and wait for an email telling you that it finished. You can log back in to see the results later.&lt;/p>
&lt;h2 id="interactive-work">Interactive work&lt;/h2>
&lt;h3 id="cpu">CPU&lt;/h3>
&lt;p>If you are running into errors, need to install a package in your local directory, or want to download some files, you should use a development server. This is good practice and courteous to the other people who are logged into the main head node.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-gdscript3" data-lang="gdscript3">&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#load module made by hipergator admin&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">ml&lt;/span> &lt;span class="n">ufrc&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#request a server for 3 hours with 2GB of memory&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">srundev&lt;/span> &lt;span class="o">--&lt;/span>&lt;span class="n">time&lt;/span> &lt;span class="mi">3&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">00&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">00&lt;/span> &lt;span class="o">--&lt;/span>&lt;span class="n">mem&lt;/span> &lt;span class="mi">2&lt;/span>&lt;span class="n">GB&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="gpu">GPU&lt;/h3>
&lt;p>To test out work involving a GPU you need to explicitly request a development node associated with a GPU. For many GPU tasks you may want a meaningful amount of memory.&lt;/p>
&lt;p>In most cases you&amp;rsquo;ll use the default GPUs:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">srun --nodes&lt;span class="o">=&lt;/span>&lt;span class="m">1&lt;/span> --gpus&lt;span class="o">=&lt;/span>&lt;span class="m">1&lt;/span> --mem 20GB --cpus-per-task&lt;span class="o">=&lt;/span>&lt;span class="m">1&lt;/span> --pty -u bash -i
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>But if you need a lot of VRAM (&amp;gt;24 GB/GPU) you can use the B200 nodes:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">srun -p hpg-b200 --nodes&lt;span class="o">=&lt;/span>&lt;span class="m">1&lt;/span> --gpus&lt;span class="o">=&lt;/span>&lt;span class="m">1&lt;/span> --mem 50GB --cpus-per-task&lt;span class="o">=&lt;/span>&lt;span class="m">1&lt;/span> --pty -u bash -i
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>To increase the number of GPUs increase the value of &lt;code>--gpus&lt;/code>, but you typically shouldn&amp;rsquo;t need more than 2 for interactive work and then only if you&amp;rsquo;re setting up multi-GPU testing.&lt;/p>
&lt;p>To increase the number of CPUs increase the value for &lt;code>--cpus-per-task&lt;/code>.&lt;/p>
&lt;h2 id="how-do-i-know-if-its-running">How do I know if its running?&lt;/h2>
&lt;p>Use &lt;code>squeue -u &amp;lt;username&amp;gt;&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">[b.weinstein@login3 ~]$ squeue -u b.weinstein
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> 25666905 gpu DeepFore b.weinst R 22:29:49 1 c36a-s7
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> 25672257 gpu DeepFore b.weinst R 21:07:19 1 c37a-s36
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The column labeled &amp;ldquo;ST&amp;rdquo; is your job status. You want this to be &lt;code>R&lt;/code> for &amp;ldquo;running&amp;rdquo;, but a job can spend a while as &lt;code>PD&lt;/code> (pending in the queue) before starting, especially if you request many cores. Sometimes (but not always) the NODELIST(REASON) column will explain why you&amp;rsquo;re still in the queue (e.g. QOSMEMLIMIT if you&amp;rsquo;re requesting too much memory).&lt;/p>
&lt;h2 id="how-do-i-get-my-data-on-to-hipergator">How do I get my data on to HiperGator?&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Often, the easiest way to transfer files to the server is using &lt;code>git clone&lt;/code>.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>If your files aren&amp;rsquo;t in a git repository, you can use SFTP or &lt;code>scp&lt;/code>. SFTP clients have graphical user interfaces that allow you to drag and drop files to the server.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>If you use &lt;code>scp&lt;/code>, the syntax for copying one file from your user folder on the server to your local folder is &lt;code>scp MY_USER_NAME@gator.hpc.ufl.edu:/home/MY_USER_NAME/PATH_TO_MY_FILE MY_LOCAL_FILENAME&lt;/code>. Note the space between the remote path and your local filename. If you want to send a file in the other direction, switch the order of the local file and the remote location. You can copy whole folders with the &lt;code>-r&lt;/code> flag.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>If your files are large you should use Globus. See &lt;a href="https://wiki.weecology.org/docs/computers-and-programming/globus/" target="_blank" rel="noopener">the wiki page on Globus&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="https://help.rc.ufl.edu/doc/Storage" target="_blank" rel="noopener">More information about storage&lt;/a>&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="storage">Storage&lt;/h2>
&lt;p>There are a few locations to store files on HiperGator:&lt;/p>
&lt;p>&lt;code>/blue/ewhite/&lt;/code>&lt;br>
This is the primary space for storing large files; any large amounts of data generated by your programs should be written here.&lt;/p>
&lt;p>&lt;code>/orange/ewhite/&lt;/code>&lt;br>
This is another space to store large files. The total allocation here is much bigger than &lt;code>/blue&lt;/code>, but this storage is slower. If you have hundreds of GB of data that you want to keep but are not currently using, &lt;code>/orange&lt;/code> is where they should go.&lt;/p>
&lt;p>&lt;code>/home/your_username/&lt;/code>&lt;br>
Your home directory has 20GB storage space for your scripts, logs, etc. You should not be storing large amounts of data here.&lt;/p>
&lt;h3 id="local-scratch-storage">Local scratch storage&lt;/h3>
&lt;p>&lt;code>$TMPDIR&lt;/code>&lt;br>
&lt;code>/blue&lt;/code> may be a bad place to store temporary cache files, especially if your program generates hundreds of small (&amp;lt;1 MB) files. An alternative is a temporary directory set up by SLURM every time you run a job. This can be referenced with the environment variable &lt;code>$TMPDIR&lt;/code> or &lt;code>$SLURM_TMPDIR&lt;/code>. Read more about this here: &lt;a href="https://help.rc.ufl.edu/doc/Temporary_Directories" target="_blank" rel="noopener">Temporary Directories
&lt;/a>.&lt;/p>
&lt;p>This storage is available on each worker node (but not on login nodes) and does not persist after the job ends. For example, it can be referenced from python:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">import os
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">os.env[&amp;#34;TMPDIR&amp;#34;]
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>To access this storage during a run, you can interactively ssh into the node and check out what&amp;rsquo;s there. The folder is named /scratch/local/{job_pid}.&lt;/p>
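&lt;p>In practice, a script can write its temporary cache files to this directory, with a fallback for runs on your own machine. A minimal python sketch (the cache filename is hypothetical):&lt;/p>

```python
import os
import tempfile

# SLURM sets $TMPDIR for each job; fall back to the system temp dir when
# running locally. The cache filename below is just a hypothetical example.
scratch = os.environ.get("TMPDIR", tempfile.gettempdir())
cache_file = os.path.join(scratch, "intermediate_results.csv")
print(cache_file)
```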
&lt;p>Note that for &lt;code>/blue&lt;/code> and &lt;code>/orange&lt;/code> if you are working on individual projects that are part of a larger effort you should work in a subdirectory &lt;code>/blue/ewhite/&amp;lt;your_username&amp;gt;/&lt;/code> and &lt;code>/orange/ewhite/&amp;lt;your_username&amp;gt;/&lt;/code>.&lt;/p>
&lt;p>Our current allocations as of July 2019&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Storage Type&lt;/th>
&lt;th>Location&lt;/th>
&lt;th>Quota&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>orange&lt;/td>
&lt;td>&lt;code>/orange&lt;/code>&lt;/td>
&lt;td>48 TB&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>blue&lt;/td>
&lt;td>&lt;code>/blue&lt;/code>&lt;/td>
&lt;td>25 TB&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>home&lt;/td>
&lt;td>&lt;code>/home/&amp;lt;your_username&amp;gt;&lt;/code>&lt;/td>
&lt;td>20 GB&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h2 id="status">Status&lt;/h2>
&lt;p>You can check the status of HiPerGator using the &lt;a href="https://metrics.rc.ufl.edu/" target="_blank" rel="noopener">Metrics Dashboard&lt;/a>&lt;/p>
&lt;h2 id="best-practices">Best Practices&lt;/h2>
&lt;p>Below is a collection of best practices from past Weecology users. These are not the only way to do things, just some useful approaches that worked for us.&lt;/p>
&lt;h3 id="r">R&lt;/h3>
&lt;h4 id="installing-packages">Installing packages&lt;/h4>
&lt;p>HiPerGator has a lot of packages installed already, but you might need to install your own, or you might want an updated version of an existing package.&lt;/p>
&lt;p>You can tell R to prefer your personal library of R packages over the ones maintained for Hipergator by adding &lt;code>.libPaths(c(&amp;quot;/home/YOUR_USER_NAME/R_libs&amp;quot;, .libPaths()))&lt;/code> to your &lt;code>.Rprofile&lt;/code>. If you don&amp;rsquo;t have one yet, you can create a new file with that name and put it in your home directory (e.g. in &lt;code>/home/harris.d/.Rprofile&lt;/code>).&lt;/p>
&lt;p>The end result will look like this.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-gdscript3" data-lang="gdscript3">&lt;span class="line">&lt;span class="cl">&lt;span class="p">[&lt;/span>&lt;span class="n">b&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">weinstein&lt;/span>&lt;span class="err">@&lt;/span>&lt;span class="n">dev1&lt;/span> &lt;span class="o">~&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="o">$&lt;/span> &lt;span class="n">cat&lt;/span> &lt;span class="o">~/.&lt;/span>&lt;span class="n">Rprofile&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">.&lt;/span>&lt;span class="n">libPaths&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">c&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;/home/b.weinstein/R_libs&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="o">.&lt;/span>&lt;span class="n">libPaths&lt;/span>&lt;span class="p">()))&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">print&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;.Rprofile loaded&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>You will need to create the &lt;code>R_libs&lt;/code> directory using &lt;code>mkdir R_libs&lt;/code>.&lt;/p>
&lt;p>When you load R, you should see&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-gdscript3" data-lang="gdscript3">&lt;span class="line">&lt;span class="cl">&lt;span class="p">[&lt;/span>&lt;span class="n">b&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">weinstein&lt;/span>&lt;span class="err">@&lt;/span>&lt;span class="n">dev1&lt;/span> &lt;span class="o">~&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="o">$&lt;/span> &lt;span class="n">ml&lt;/span> &lt;span class="n">R&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">[&lt;/span>&lt;span class="n">b&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">weinstein&lt;/span>&lt;span class="err">@&lt;/span>&lt;span class="n">dev1&lt;/span> &lt;span class="o">~&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="o">$&lt;/span> &lt;span class="n">R&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">R&lt;/span> &lt;span class="n">version&lt;/span> &lt;span class="mf">3.5&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="mi">1&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="mi">2018&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="mi">07&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="mi">02&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">--&lt;/span> &lt;span class="s2">&amp;#34;Feather Spray&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">Copyright&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="n">C&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="mi">2018&lt;/span> &lt;span class="n">The&lt;/span> &lt;span class="n">R&lt;/span> &lt;span class="n">Foundation&lt;/span> &lt;span class="k">for&lt;/span> &lt;span class="n">Statistical&lt;/span> &lt;span class="n">Computing&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">Platform&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="n">x86_64&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">pc&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">linux&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">gnu&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="mi">64&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">bit&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">R&lt;/span> &lt;span class="n">is&lt;/span> &lt;span class="n">free&lt;/span> &lt;span class="n">software&lt;/span> &lt;span class="ow">and&lt;/span> &lt;span class="n">comes&lt;/span> &lt;span class="n">with&lt;/span> &lt;span class="n">ABSOLUTELY&lt;/span> &lt;span class="n">NO&lt;/span> &lt;span class="n">WARRANTY&lt;/span>&lt;span class="o">.&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">You&lt;/span> &lt;span class="n">are&lt;/span> &lt;span class="n">welcome&lt;/span> &lt;span class="n">to&lt;/span> &lt;span class="n">redistribute&lt;/span> &lt;span class="n">it&lt;/span> &lt;span class="n">under&lt;/span> &lt;span class="n">certain&lt;/span> &lt;span class="n">conditions&lt;/span>&lt;span class="o">.&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">Type&lt;/span> &lt;span class="s1">&amp;#39;license()&amp;#39;&lt;/span> &lt;span class="ow">or&lt;/span> &lt;span class="s1">&amp;#39;licence()&amp;#39;&lt;/span> &lt;span class="k">for&lt;/span> &lt;span class="n">distribution&lt;/span> &lt;span class="n">details&lt;/span>&lt;span class="o">.&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">Natural&lt;/span> &lt;span class="n">language&lt;/span> &lt;span class="n">support&lt;/span> &lt;span class="n">but&lt;/span> &lt;span class="n">running&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="n">an&lt;/span> &lt;span class="n">English&lt;/span> &lt;span class="n">locale&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">R&lt;/span> &lt;span class="n">is&lt;/span> &lt;span class="n">a&lt;/span> &lt;span class="n">collaborative&lt;/span> &lt;span class="n">project&lt;/span> &lt;span class="n">with&lt;/span> &lt;span class="n">many&lt;/span> &lt;span class="n">contributors&lt;/span>&lt;span class="o">.&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">Type&lt;/span> &lt;span class="s1">&amp;#39;contributors()&amp;#39;&lt;/span> &lt;span class="k">for&lt;/span> &lt;span class="n">more&lt;/span> &lt;span class="n">information&lt;/span> &lt;span class="ow">and&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="s1">&amp;#39;citation()&amp;#39;&lt;/span> &lt;span class="n">on&lt;/span> &lt;span class="n">how&lt;/span> &lt;span class="n">to&lt;/span> &lt;span class="n">cite&lt;/span> &lt;span class="n">R&lt;/span> &lt;span class="ow">or&lt;/span> &lt;span class="n">R&lt;/span> &lt;span class="n">packages&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="n">publications&lt;/span>&lt;span class="o">.&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">Type&lt;/span> &lt;span class="s1">&amp;#39;demo()&amp;#39;&lt;/span> &lt;span class="k">for&lt;/span> &lt;span class="n">some&lt;/span> &lt;span class="n">demos&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s1">&amp;#39;help()&amp;#39;&lt;/span> &lt;span class="k">for&lt;/span> &lt;span class="n">on&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">line&lt;/span> &lt;span class="n">help&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="ow">or&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="s1">&amp;#39;help.start()&amp;#39;&lt;/span> &lt;span class="k">for&lt;/span> &lt;span class="n">an&lt;/span> &lt;span class="n">HTML&lt;/span> &lt;span class="n">browser&lt;/span> &lt;span class="n">interface&lt;/span> &lt;span class="n">to&lt;/span> &lt;span class="n">help&lt;/span>&lt;span class="o">.&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">Type&lt;/span> &lt;span class="s1">&amp;#39;q()&amp;#39;&lt;/span> &lt;span class="n">to&lt;/span> &lt;span class="n">quit&lt;/span> &lt;span class="n">R&lt;/span>&lt;span class="o">.&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">[&lt;/span>&lt;span class="mi">1&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="s2">&amp;#34;.Rprofile loaded&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">&amp;gt;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Once this is set up, you can install or update packages the usual way (e.g. with &lt;code>install.packages&lt;/code> or &lt;code>devtools::install_github&lt;/code>).&lt;/p>
&lt;h2 id="re-writing-your-code-to-take-advantage-of-multiple-cores">Re-writing your code to take advantage of multiple cores.&lt;/h2>
&lt;p>By default R runs on a single processor, but most computers today have 4&amp;ndash;8. If you spread the work across multiple processors you can decrease run time significantly: a script that takes 1 hour can potentially finish in 30 minutes with 2 processors, or 15 minutes with 4. To make your scripts run across multiple processors, you&amp;rsquo;ll have to make some slight adjustments to your code.&lt;/p>
&lt;p>If your code uses &lt;code>lapply&lt;/code> to apply your main function to many items (e.g. fitting a model to each species), you can swap it for &lt;code>mclapply&lt;/code> from the &lt;code>parallel&lt;/code> package without making any substantial changes. For more details and advanced uses, here are two short tutorials that cover this:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;a href="https://beckmw.wordpress.com/2014/01/21/a-brief-foray-into-parallel-processing-with-r/" target="_blank" rel="noopener">A brief foray into parallel processing with R&lt;/a>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;a href="http://resbaz.github.io/r-intermediate-gapminder/19-foreach.html" target="_blank" rel="noopener">Software Carpentry Parallel Processing in R&lt;/a>&lt;/p>
&lt;/li>
&lt;/ul>
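As a minimal sketch of the swap described above (the &lt;code>fit_model&lt;/code> function and species codes here are hypothetical placeholders, not part of any real analysis):

```r
# Sketch: replacing lapply with parallel::mclapply.
library(parallel)

# Hypothetical per-species model-fitting function.
fit_model <- function(species) {
  mean(rnorm(100))  # stand-in for real model-fitting code
}

species_list <- c("DM", "DO", "PP")

# Serial version:
# results <- lapply(species_list, fit_model)

# Parallel version: same call, plus mc.cores.
# Use only a few of the available cores (see notes below).
results <- mclapply(species_list, fit_model, mc.cores = 2)
```

Note that `mclapply` relies on forking, so on Windows it silently falls back to serial execution unless `mc.cores = 1`.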
&lt;p>Some quick notes:&lt;/p>
&lt;ul>
&lt;li>If your code already uses functions and for loops, it should be very easy to make it parallel, unless each pass through the loop depends on the outcome from previous passes.&lt;/li>
&lt;li>On your own computer, never set the number of processors to the maximum available. That takes away the processing power needed to run the operating system, browser, and other programs, and could potentially crash your computer. To test parallel code on my computer I set the number of processors to 2 (out of 8 available).&lt;/li>
&lt;/ul>
&lt;h3 id="batchtools">Batchtools&lt;/h3>
&lt;p>The R package &lt;code>batchtools&lt;/code> makes simple parallel job submission in R much easier: no more bash scripting, just submit a set of jobs by mapping a function over a list of inputs. Here is an example.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-gdscript3" data-lang="gdscript3">&lt;span class="line">&lt;span class="cl">&lt;span class="n">library&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">batchtools&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#Batchtools tmp registry&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">reg&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">makeRegistry&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">file&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">dir&lt;/span> &lt;span class="o">=&lt;/span>&lt;span class="s2">&amp;#34;.&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">print&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">reg&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">print&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;registry created&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">print&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">reg&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Toy function that just sleeps&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">fun&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">function&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">n&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">Sys&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">sleep&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="mi">10&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">t&lt;/span>&lt;span class="o">&amp;lt;-&lt;/span>&lt;span class="n">Sys&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">time&lt;/span>&lt;span class="p">()&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="nb">print&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">paste&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;worker&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">n&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;time is&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="n">t&lt;/span>&lt;span class="p">))&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="k">return&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">t&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#batchtools submission&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">reg&lt;/span>&lt;span class="o">$&lt;/span>&lt;span class="n">cluster&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">functions&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="n">makeClusterFunctionsSlurm&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">template&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;detection_template.tmpl&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">array&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">jobs&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">TRUE&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="n">nodename&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;localhost&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">scheduler&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">latency&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="mi">1&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">fs&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">latency&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="mi">65&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">ids&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">batchMap&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">fun&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="n">args&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="n">list&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">n&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="n">seq&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="mi">1&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">10&lt;/span>&lt;span class="p">)),&lt;/span> &lt;span class="n">reg&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">reg&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">testJob&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">id&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="n">ids&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="mi">1&lt;/span>&lt;span class="p">,],&lt;/span>&lt;span class="n">reg&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="n">reg&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Set resources: enable memory measurement&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">res&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">list&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">walltime&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;2:00:00&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">memory&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;4GB&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># Submit jobs using the currently configured cluster functions&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">submitJobs&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">ids&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">resources&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">res&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">reg&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">reg&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">waitForJobs&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">ids&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">reg&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">reg&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">getStatus&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">reg&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">reg&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">print&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">getJobTable&lt;/span>&lt;span class="p">())&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>with a SLURM template in the same directory.&lt;/p>
&lt;p>detection_template.tmpl&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">&lt;span class="cp">#!/bin/bash
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="cp">&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## Modified from https://github.com/mllg/batchtools/blob/master/inst/templates/&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## Job Resource Interface Definition&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">##&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## ntasks [integer(1)]: Number of required tasks,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## Set larger than 1 if you want to further parallelize&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## with MPI within your job.&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## ncpus [integer(1)]: Number of required cpus per task,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## Set larger than 1 if you want to further parallelize&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## with multicore/parallel within each task.&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## walltime [integer(1)]: Walltime for this job, in seconds.&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## Must be at least 60 seconds for Slurm to work properly.&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## memory [integer(1)]: Memory in megabytes for each cpu.&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## Must be at least 100 (when I tried lower values my&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## jobs did not start at all).&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">##&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## Default resources can be set in your .batchtools.conf.R by defining the variable&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## &amp;#39;default.resources&amp;#39; as a named list.&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&amp;lt;%
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1"># relative paths are not handled well by Slurm&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">log.file &lt;span class="o">=&lt;/span> fs::path_expand&lt;span class="o">(&lt;/span>log.file&lt;span class="o">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">if&lt;/span> &lt;span class="o">(&lt;/span>!&lt;span class="s2">&amp;#34;ncpus&amp;#34;&lt;/span> %in% names&lt;span class="o">(&lt;/span>resources&lt;span class="o">))&lt;/span> &lt;span class="o">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> resources&lt;span class="nv">$ncpus&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="m">1&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">if&lt;/span> &lt;span class="o">(&lt;/span>!&lt;span class="s2">&amp;#34;walltime&amp;#34;&lt;/span> %in% names&lt;span class="o">(&lt;/span>resources&lt;span class="o">))&lt;/span> &lt;span class="o">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> resources&lt;span class="nv">$walltime&lt;/span>&amp;lt;-&lt;span class="s2">&amp;#34;1:00:00&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="k">if&lt;/span> &lt;span class="o">(&lt;/span>!&lt;span class="s2">&amp;#34;memory&amp;#34;&lt;/span> %in% names&lt;span class="o">(&lt;/span>resources&lt;span class="o">))&lt;/span> &lt;span class="o">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> resources&lt;span class="nv">$memory&lt;/span> &amp;lt;- &lt;span class="s2">&amp;#34;5GB&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">-%&amp;gt;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1"># Job name and who to send updates to&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1">#SBATCH --mail-user=benweinstein2010@gmail.com&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1">#SBATCH --mail-type=FAIL,END&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1">#SBATCH --account=ewhite&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1">#SBATCH --partition=hpg2-compute&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1">#SBATCH --qos=ewhite-b # Remove the `-b` if the script will take more than 4 days; see &amp;#34;bursting&amp;#34; below&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1">#SBATCH --job-name=&amp;lt;%= job.name %&amp;gt;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1">#SBATCH --output=&amp;lt;%= log.file %&amp;gt;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1">#SBATCH --error=&amp;lt;%= log.file %&amp;gt;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1">#SBATCH --time=&amp;lt;%= resources$walltime %&amp;gt;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1">#SBATCH --ntasks=1&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1">#SBATCH --cpus-per-task=&amp;lt;%= resources$ncpus %&amp;gt;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="c1">#SBATCH --mem-per-cpu=&amp;lt;%= resources$memory %&amp;gt;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &amp;lt;%&lt;span class="o">=&lt;/span> &lt;span class="k">if&lt;/span> &lt;span class="o">(&lt;/span>!is.null&lt;span class="o">(&lt;/span>resources&lt;span class="nv">$partition&lt;/span>&lt;span class="o">))&lt;/span> sprintf&lt;span class="o">(&lt;/span>paste0&lt;span class="o">(&lt;/span>&lt;span class="s2">&amp;#34;#SBATCH --partition=&amp;#39;&amp;#34;&lt;/span>, resources&lt;span class="nv">$partition&lt;/span>, &lt;span class="s2">&amp;#34;&amp;#39;&amp;#34;&lt;/span>&lt;span class="o">))&lt;/span> %&amp;gt;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &amp;lt;%&lt;span class="o">=&lt;/span> &lt;span class="k">if&lt;/span> &lt;span class="o">(&lt;/span>array.jobs&lt;span class="o">)&lt;/span> sprintf&lt;span class="o">(&lt;/span>&lt;span class="s2">&amp;#34;#SBATCH --array=1-%i&amp;#34;&lt;/span>, nrow&lt;span class="o">(&lt;/span>&lt;span class="nb">jobs&lt;/span>&lt;span class="o">))&lt;/span> &lt;span class="k">else&lt;/span> &lt;span class="s2">&amp;#34;&amp;#34;&lt;/span> %&amp;gt;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## Initialize work environment like&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## source /etc/profile&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## module add ...&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">source&lt;/span> /etc/profile
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## Export value of DEBUGME environment variable to the worker&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">export&lt;/span> &lt;span class="nv">DEBUGME&lt;/span>&lt;span class="o">=&lt;/span>&amp;lt;%&lt;span class="o">=&lt;/span> Sys.getenv&lt;span class="o">(&lt;/span>&lt;span class="s2">&amp;#34;DEBUGME&amp;#34;&lt;/span>&lt;span class="o">)&lt;/span> %&amp;gt;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&amp;lt;%&lt;span class="o">=&lt;/span> sprintf&lt;span class="o">(&lt;/span>&lt;span class="s2">&amp;#34;export OMP_NUM_THREADS=%i&amp;#34;&lt;/span>, resources&lt;span class="nv">$omp&lt;/span>.threads&lt;span class="o">)&lt;/span> -%&amp;gt;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&amp;lt;%&lt;span class="o">=&lt;/span> sprintf&lt;span class="o">(&lt;/span>&lt;span class="s2">&amp;#34;export OPENBLAS_NUM_THREADS=%i&amp;#34;&lt;/span>, resources&lt;span class="nv">$blas&lt;/span>.threads&lt;span class="o">)&lt;/span> -%&amp;gt;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&amp;lt;%&lt;span class="o">=&lt;/span> sprintf&lt;span class="o">(&lt;/span>&lt;span class="s2">&amp;#34;export MKL_NUM_THREADS=%i&amp;#34;&lt;/span>, resources&lt;span class="nv">$blas&lt;/span>.threads&lt;span class="o">)&lt;/span> -%&amp;gt;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## Run R:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">## we merge R output with stdout from SLURM, which gets then logged via --output option&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">echo&lt;/span> &lt;span class="s2">&amp;#34;submitting job&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">module load gcc/6.3.0 R gdal/2.2.1
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#add to path&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Rscript -e &lt;span class="s1">&amp;#39;batchtools::doJobCollection(&amp;#34;&amp;lt;%= uri %&amp;gt;&amp;#34;)&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>yields&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">Submitting 10 jobs in 10 chunks using cluster functions &amp;#39;Slurm&amp;#39; ...
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">[1] TRUE
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">Status for 10 jobs at 2019-05-21 14:43:03:
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> Submitted : 10 (100.0%)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> -- Queued : 0 ( 0.0%)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> -- Started : 10 (100.0%)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ---- Running : 0 ( 0.0%)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ---- Done : 10 (100.0%)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ---- Error : 0 ( 0.0%)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ---- Expired : 0 ( 0.0%)
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="python">Python&lt;/h2>
&lt;h3 id="installing-python-packages">Installing Python Packages&lt;/h3>
&lt;ul>
&lt;li>ssh onto HiPerGator&lt;/li>
&lt;li>Download the conda installer: &lt;code>wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh&lt;/code>&lt;/li>
&lt;li>Run the installer: &lt;code>bash Miniconda3-latest-Linux-x86_64.sh&lt;/code>&lt;/li>
&lt;li>Answer &amp;lsquo;Yes&amp;rsquo; at the end of the install to have conda added to your &lt;code>.bashrc&lt;/code>&lt;/li>
&lt;li>Install packages using &lt;code>conda install package_name&lt;/code>&lt;/li>
&lt;li>Run &lt;code>conda activate&lt;/code> as the first step in your SLURM script&lt;/li>
&lt;/ul>
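The last step above can be sketched as a SLURM job script (a non-runnable config fragment; the job name, resource values, and &lt;code>my_script.py&lt;/code> are hypothetical placeholders):

```shell
#!/bin/bash
#SBATCH --job-name=conda_example
#SBATCH --time=01:00:00
#SBATCH --mem=2gb

# Activate the conda environment installed above
# before running any Python code.
conda activate

python my_script.py
```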
&lt;h3 id="dask-parallelization">Dask Parallelization&lt;/h3>
&lt;p>Dask jobs can be submitted to SLURM through the &lt;code>dask-jobqueue&lt;/code> package.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">#################
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> # Setup dask cluster
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> #################
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> from dask_jobqueue import SLURMCluster
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> from dask.distributed import Client, wait
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> num_workers = 10
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> #job args
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> extra_args=[
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &amp;#34;--error=/home/b.weinstein/logs/dask-worker-%j.err&amp;#34;,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &amp;#34;--account=ewhite&amp;#34;,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &amp;#34;--output=/home/b.weinstein/logs/dask-worker-%j.out&amp;#34;
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> ]
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> cluster = SLURMCluster(
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> processes=1,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> queue=&amp;#39;hpg2-compute&amp;#39;,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> cores=1,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> memory=&amp;#39;13GB&amp;#39;,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> walltime=&amp;#39;24:00:00&amp;#39;,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> job_extra=extra_args,
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> local_directory=&amp;#34;/home/b.weinstein/logs/&amp;#34;, death_timeout=300)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> print(cluster.job_script())
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> cluster.adapt(minimum=num_workers, maximum=num_workers)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> dask_client = Client(cluster)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> #Start dask
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> dask_client.run_on_scheduler(start_tunnel)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> futures = dask_client.map(&amp;lt;function you want to parallelize&amp;gt;, &amp;lt;list of objects to run&amp;gt;, &amp;lt;additional args here&amp;gt;)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> wait(futures)
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="connecting-through-jupyter-notebooks">Connecting through Jupyter notebooks.&lt;/h3>
&lt;p>It&amp;rsquo;s useful to be able to interact with HiPerGator without relying solely on the terminal. Especially when dealing with large datasets, instead of prototyping locally and then pushing to the cluster, we can connect directly using a Jupyter notebook.&lt;/p>
&lt;ul>
&lt;li>Log on to HiPerGator and request an interactive session.&lt;/li>
&lt;/ul>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">srun --ntasks=1 --cpus-per-task=2 --mem=2gb -t 90 --pty bash -i
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Now we have 90 minutes to work directly on this development node.&lt;/p>
&lt;ul>
&lt;li>Create a Jupyter notebook&lt;/li>
&lt;/ul>
&lt;p>Load the python module&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-gdscript3" data-lang="gdscript3">&lt;span class="line">&lt;span class="cl">&lt;span class="n">module&lt;/span> &lt;span class="nb">load&lt;/span> &lt;span class="n">python&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Start the notebook and get your ssh tunnel&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">import socket
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">import subprocess
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">host = socket.gethostname()
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">proc = subprocess.Popen([&amp;#39;jupyter&amp;#39;, &amp;#39;lab&amp;#39;, &amp;#39;--ip&amp;#39;, host, &amp;#39;--no-browser&amp;#39;])
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">print(&amp;#34;ssh -N -L 8888:%s:8888 -l b.weinstein hpg2.rc.ufl.edu&amp;#34; % (host))
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>If all went well it should look something like:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-gdscript3" data-lang="gdscript3">&lt;span class="line">&lt;span class="cl">&lt;span class="p">[&lt;/span>&lt;span class="n">b&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">weinstein&lt;/span>&lt;span class="err">@&lt;/span>&lt;span class="n">c27b&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">s2&lt;/span> &lt;span class="n">dask&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">jobqueue&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="o">$&lt;/span> &lt;span class="n">python&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">Python&lt;/span> &lt;span class="mf">3.6&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="mi">4&lt;/span> &lt;span class="o">|&lt;/span>&lt;span class="n">Anaconda&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">Inc&lt;/span>&lt;span class="o">.|&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="n">default&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">Jan&lt;/span> &lt;span class="mi">16&lt;/span> &lt;span class="mi">2018&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="mi">18&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">10&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">19&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">[&lt;/span>&lt;span class="n">GCC&lt;/span> &lt;span class="mf">7.2&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="n">on&lt;/span> &lt;span class="n">linux&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">Type&lt;/span> &lt;span class="s2">&amp;#34;help&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;copyright&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;credits&amp;#34;&lt;/span> &lt;span class="ow">or&lt;/span> &lt;span class="s2">&amp;#34;license&amp;#34;&lt;/span> &lt;span class="k">for&lt;/span> &lt;span class="n">more&lt;/span> &lt;span class="n">information&lt;/span>&lt;span class="o">.&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">&amp;gt;&amp;gt;&amp;gt;&lt;/span> &lt;span class="n">import&lt;/span> &lt;span class="n">socket&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">&amp;gt;&amp;gt;&amp;gt;&lt;/span> &lt;span class="n">import&lt;/span> &lt;span class="n">subprocess&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">&amp;gt;&amp;gt;&amp;gt;&lt;/span> &lt;span class="n">host&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">socket&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">gethostname&lt;/span>&lt;span class="p">()&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">&amp;gt;&amp;gt;&amp;gt;&lt;/span> &lt;span class="n">proc&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">subprocess&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">Popen&lt;/span>&lt;span class="p">([&lt;/span>&lt;span class="s1">&amp;#39;jupyter&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s1">&amp;#39;lab&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s1">&amp;#39;--ip&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">host&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s1">&amp;#39;--no-browser&amp;#39;&lt;/span>&lt;span class="p">])&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">&amp;gt;&amp;gt;&amp;gt;&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">&amp;gt;&amp;gt;&amp;gt;&lt;/span> &lt;span class="nb">print&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;ssh -N -L 8888:&lt;/span>&lt;span class="si">%s&lt;/span>&lt;span class="s2">:8888 -l b.weinstein hpg2.rc.ufl.edu&amp;#34;&lt;/span> &lt;span class="o">%&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="n">host&lt;/span>&lt;span class="p">))&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">ssh&lt;/span> &lt;span class="o">-&lt;/span>&lt;span class="n">N&lt;/span> &lt;span class="o">-&lt;/span>&lt;span class="n">L&lt;/span> &lt;span class="mi">8888&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="n">c27b&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">s2&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">ufhpc&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">8888&lt;/span> &lt;span class="o">-&lt;/span>&lt;span class="n">l&lt;/span> &lt;span class="n">b&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">weinstein&lt;/span> &lt;span class="n">hpg2&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">rc&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">ufl&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">edu&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="o">&amp;gt;&amp;gt;&amp;gt;&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="n">I&lt;/span> &lt;span class="mi">17&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">11&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mf">29.776&lt;/span> &lt;span class="n">LabApp&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="n">The&lt;/span> &lt;span class="n">port&lt;/span> &lt;span class="mi">8888&lt;/span> &lt;span class="n">is&lt;/span> &lt;span class="n">already&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="n">use&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">trying&lt;/span> &lt;span class="n">another&lt;/span> &lt;span class="n">port&lt;/span>&lt;span class="o">.&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">[&lt;/span>&lt;span class="n">I&lt;/span> &lt;span class="mi">17&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">11&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mf">29.799&lt;/span> &lt;span class="n">LabApp&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="n">JupyterLab&lt;/span> &lt;span class="n">beta&lt;/span> &lt;span class="n">preview&lt;/span> &lt;span class="n">extension&lt;/span> &lt;span class="n">loaded&lt;/span> &lt;span class="n">from&lt;/span> &lt;span class="o">/&lt;/span>&lt;span class="n">home&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">b&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">weinstein&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">miniconda3&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">envs&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">pangeo&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">lib&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">python3&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="mi">6&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">site&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">packages&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">jupyterlab&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">[&lt;/span>&lt;span class="n">I&lt;/span> &lt;span class="mi">17&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">11&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mf">29.799&lt;/span> &lt;span class="n">LabApp&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="n">JupyterLab&lt;/span> &lt;span class="n">application&lt;/span> &lt;span class="n">directory&lt;/span> &lt;span class="n">is&lt;/span> &lt;span class="o">/&lt;/span>&lt;span class="n">home&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">b&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">weinstein&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">miniconda3&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">envs&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">pangeo&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">share&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">jupyter&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">lab&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">[&lt;/span>&lt;span class="n">I&lt;/span> &lt;span class="mi">17&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">11&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mf">29.809&lt;/span> &lt;span class="n">LabApp&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="n">Serving&lt;/span> &lt;span class="n">notebooks&lt;/span> &lt;span class="n">from&lt;/span> &lt;span class="n">local&lt;/span> &lt;span class="n">directory&lt;/span>&lt;span class="p">:&lt;/span> &lt;span class="o">/&lt;/span>&lt;span class="n">home&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">b&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">weinstein&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">dask&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">jobqueue&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">[&lt;/span>&lt;span class="n">I&lt;/span> &lt;span class="mi">17&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">11&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mf">29.809&lt;/span> &lt;span class="n">LabApp&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="mi">0&lt;/span> &lt;span class="n">active&lt;/span> &lt;span class="n">kernels&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">[&lt;/span>&lt;span class="n">I&lt;/span> &lt;span class="mi">17&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">11&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mf">29.809&lt;/span> &lt;span class="n">LabApp&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="n">The&lt;/span> &lt;span class="n">Jupyter&lt;/span> &lt;span class="n">Notebook&lt;/span> &lt;span class="n">is&lt;/span> &lt;span class="n">running&lt;/span> &lt;span class="n">at&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">[&lt;/span>&lt;span class="n">I&lt;/span> &lt;span class="mi">17&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">11&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mf">29.809&lt;/span> &lt;span class="n">LabApp&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="n">http&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="o">//&lt;/span>&lt;span class="n">c27b&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">s2&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">ufhpc&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">8889&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="err">?&lt;/span>&lt;span class="n">token&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="n">c9c992a219e1e35ddd4cbe782d7f1f56c6680118b13053c&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">[&lt;/span>&lt;span class="n">I&lt;/span> &lt;span class="mi">17&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">11&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mf">29.809&lt;/span> &lt;span class="n">LabApp&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="n">Use&lt;/span> &lt;span class="ne">Control&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">C&lt;/span> &lt;span class="n">to&lt;/span> &lt;span class="n">stop&lt;/span> &lt;span class="n">this&lt;/span> &lt;span class="n">server&lt;/span> &lt;span class="ow">and&lt;/span> &lt;span class="n">shut&lt;/span> &lt;span class="n">down&lt;/span> &lt;span class="n">all&lt;/span> &lt;span class="n">kernels&lt;/span> &lt;span class="p">(&lt;/span>&lt;span class="n">twice&lt;/span> &lt;span class="n">to&lt;/span> &lt;span class="n">skip&lt;/span> &lt;span class="n">confirmation&lt;/span>&lt;span class="p">)&lt;/span>&lt;span class="o">.&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">[&lt;/span>&lt;span class="n">C&lt;/span> &lt;span class="mi">17&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">11&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mf">29.811&lt;/span> &lt;span class="n">LabApp&lt;/span>&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">Copy&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="n">paste&lt;/span> &lt;span class="n">this&lt;/span> &lt;span class="n">URL&lt;/span> &lt;span class="n">into&lt;/span> &lt;span class="n">your&lt;/span> &lt;span class="n">browser&lt;/span> &lt;span class="n">when&lt;/span> &lt;span class="n">you&lt;/span> &lt;span class="n">connect&lt;/span> &lt;span class="k">for&lt;/span> &lt;span class="n">the&lt;/span> &lt;span class="n">first&lt;/span> &lt;span class="n">time&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">to&lt;/span> &lt;span class="n">login&lt;/span> &lt;span class="n">with&lt;/span> &lt;span class="n">a&lt;/span> &lt;span class="n">token&lt;/span>&lt;span class="p">:&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">http&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="o">//&lt;/span>&lt;span class="n">c27b&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">s2&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">ufhpc&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="mi">8889&lt;/span>&lt;span class="o">/&lt;/span>&lt;span class="err">?&lt;/span>&lt;span class="n">token&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="mi">0&lt;/span>&lt;span class="n">c9c992a219e1e35ddd4cbe782d7f1f56c6680118b13053c&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>See the line beginning with &lt;code>ssh&lt;/code>&amp;hellip; that is what we need to enter on our local laptop. It will ask for your login password:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">MacBook-Pro:~ ben$ ssh -N -L 8888:c27b-s2.ufhpc:8888 -l b.weinstein hpg2.rc.ufl.edu
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">b.weinstein@hpg2.rc.ufl.edu&amp;#39;s password:
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Don&amp;rsquo;t worry if it looks like it hangs, the tunnel is open! Go check it out.&lt;/p>
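&lt;p>For reference, here is what each piece of that tunnel command does (using the example node name from above):&lt;/p>

```shell
# -N : do not run a remote command; just forward ports
# -L 8888:c27b-s2.ufhpc:8888 : forward local port 8888 to port 8888 on the compute node
# -l b.weinstein : the HiPerGator username to log in as
ssh -N -L 8888:c27b-s2.ufhpc:8888 -l b.weinstein hpg2.rc.ufl.edu
```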
&lt;p>Open your browser and go to &lt;code>localhost:8888&lt;/code>.&lt;/p>
&lt;p>And voil&amp;agrave;, we are navigating HiPerGator from the comfort of our own laptop.&lt;/p>
&lt;h2 id="support">Support&lt;/h2>
&lt;p>&lt;a href="https://support.rc.ufl.edu/enter_bug.cgi" target="_blank" rel="noopener">Request Support&lt;/a>.&lt;/p>
&lt;p>HiPerGator staff are here to support you. Our grant money pays their salary. They are friendly and eager to help. When in doubt, just ask.&lt;/p>
&lt;p>For more information, see the &lt;a href="https://wiki.rc.ufl.edu/doc/Annotated_SLURM_Script" target="_blank" rel="noopener">annotated SLURM script&lt;/a>.&lt;/p>
&lt;h2 id="priority">Priority&lt;/h2>
&lt;p>The supercomputer is a shared resource, and the SLURM scheduler has to decide how to divvy it up. The method used to decide when it&amp;rsquo;s your turn to use a machine is based on a metric called &amp;ldquo;FairShare.&amp;rdquo; You can see your FairShare number by typing &lt;code>sshare -U&lt;/code> in your HiPerGator terminal. A FairShare of 0.5 means you&amp;rsquo;ve been using exactly your share. Larger numbers mean you can use more, while smaller numbers mean you&amp;rsquo;re using more than your share and will be given lower priority.&lt;/p>
&lt;p>Your &amp;ldquo;usage&amp;rdquo; number is an exponentially-weighted moving average of the resources you&amp;rsquo;ve consumed, with a half-life of two weeks. So if you&amp;rsquo;ve &amp;ldquo;bursted&amp;rdquo; at 10x for a while, it might take a few weeks before you&amp;rsquo;re given decent priority again.&lt;/p>
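&lt;p>As a toy illustration of that half-life (our own sketch, not RC&amp;rsquo;s exact accounting), the weight of usage from &lt;code>d&lt;/code> days ago is &lt;code>0.5^(d/14)&lt;/code>:&lt;/p>

```shell
# Toy illustration: weight of past usage after d days,
# given a two-week (14-day) half-life.
usage_weight() {
  awk -v d="$1" 'BEGIN { printf "%.3f\n", 0.5 ^ (d / 14) }'
}

usage_weight 0    # 1.000 -> usage from today counts fully
usage_weight 14   # 0.500 -> usage from one half-life ago counts half
usage_weight 28   # 0.250 -> usage from two half-lives ago counts a quarter
```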
&lt;p>A more comprehensive description of FairShare is available &lt;a href="https://slurm.schedmd.com/priority_multifactor.html#fairshare" target="_blank" rel="noopener">here&lt;/a>.&lt;/p>
&lt;h2 id="bursting">Bursting&lt;/h2>
&lt;p>If your jobs will take less than 4 days, you can use &amp;ldquo;burst&amp;rdquo; mode, which provides &lt;em>ten times&lt;/em> as many cores and &lt;em>ten times&lt;/em> as much memory as the default mode. If you cannot burst, just remove the &lt;code>-b&lt;/code> from the line above about &lt;code>qos&lt;/code>. Note that if you are using burst your jobs will automatically be killed after 96 hours if they haven&amp;rsquo;t already finished.&lt;/p>
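&lt;p>A minimal sketch of the relevant &lt;code>#SBATCH&lt;/code> lines, assuming our group&amp;rsquo;s qos names (&lt;code>ewhite&lt;/code> and &lt;code>ewhite-b&lt;/code>):&lt;/p>

```shell
#SBATCH --account=ewhite
#SBATCH --qos=ewhite-b       # burst qos: 10x cores and memory, 96-hour hard limit
#SBATCH --time=96:00:00
# To use the normal queue instead, drop the "-b":
# #SBATCH --qos=ewhite
```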
&lt;h2 id="current-usage">Current usage&lt;/h2>
&lt;p>To see the current usage by our group, as well as overall HiPerGator usage, use the command&lt;/p>
&lt;p>&lt;code>slurmInfo -pu&lt;/code>&lt;/p>
&lt;p>To see the total available resources use:&lt;/p>
&lt;p>&lt;code>sacctmgr show qos ewhite format=&amp;quot;Name%-16,GrpSubmit,MaxWall,GrpTres%-45&amp;quot;&lt;/code>&lt;br>
for the normal queue, and &lt;br>
&lt;code>sacctmgr show qos ewhite-b format=&amp;quot;Name%-16,GrpSubmit,MaxWall,GrpTres%-45&amp;quot;&lt;/code>&lt;br>
for the &amp;ldquo;burst&amp;rdquo; queue.&lt;/p>
&lt;p>If you want to look at the active resource use by a current job (i.e., how much of the requested resources are actually being used by the code):&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">ml ufrc
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">jobhtop JOBID &lt;span class="c1"># Displays CPU and RAM resource usage for running jobs&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">jobnvtop JOBID &lt;span class="c1">#Displays GPU resource usage for running GPU jobs&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Both of these take a while to start (up to ~2 min).&lt;/p>
&lt;h2 id="partitions">Partitions&lt;/h2>
&lt;p>The HiPerGator consists of hundreds of servers. These are split into several &amp;ldquo;partitions&amp;rdquo; for various reasons.&lt;/p>
&lt;p>Most of the time you can just use the defaults, but there are a few special partitions available that you might need to request:&lt;/p>
&lt;ul>
&lt;li>&lt;code>hpg-b200&lt;/code> - This is the partition to use for GPU jobs with large VRAM needs (&amp;gt;24 GB/GPU). You need to have paid for GPU access, which our lab has.&lt;/li>
&lt;li>&lt;code>bigmem&lt;/code> - This partition consists of several servers with up to 1TB of memory. This is useful if you need a &lt;em>lot&lt;/em> of memory but still want to keep a script on a single server.&lt;/li>
&lt;li>&lt;code>hpg2-dev&lt;/code> - These are several servers for development purposes. When you use &lt;code>srundev&lt;/code> the jobs get sent here.&lt;/li>
&lt;/ul>
&lt;h3 id="selecting-a-partition">Selecting a partition&lt;/h3>
&lt;p>By default you&amp;rsquo;ll run jobs on the &lt;code>hpg2-compute&lt;/code> partition. If you want to change it, edit the &lt;code>--partition&lt;/code> line in your job script, or use the &lt;code>-p&lt;/code> flag with &lt;code>srun&lt;/code>.&lt;/p>
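&lt;p>For example (a sketch, using the partition names listed above):&lt;/p>

```shell
# In a job script, request the big-memory partition:
#SBATCH --partition=bigmem

# Or interactively, send a session to the development partition:
srun -p hpg2-dev --ntasks=1 --mem=4gb -t 60 --pty bash -i
```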
&lt;h2 id="cron-jobs---how-to-run-regularly-scheduled-jobs">Cron jobs - how to run regularly scheduled jobs&lt;/h2>
&lt;p>If you&amp;rsquo;re unfamiliar with cron jobs read &lt;a href="https://ostechnix.com/a-beginners-guide-to-cron-jobs/" target="_blank" rel="noopener">A Beginners Guide To Cron Jobs&lt;/a>.&lt;/p>
&lt;h3 id="ssh-to-daemon">SSH to daemon&lt;/h3>
&lt;p>Cron jobs on the HPC need to be set up on a special machine called &lt;code>daemon&lt;/code>.
You can ssh there from the HPC using &lt;code>ssh daemon&lt;/code>.
After that you can use the usual &lt;code>crontab -e&lt;/code> to set up your cron job.&lt;/p>
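&lt;p>For example, a crontab line (the script and log paths here are hypothetical) that runs a job every Monday at 06:00 and logs its output:&lt;/p>

```shell
# min  hour  day-of-month  month  day-of-week  command
0 6 * * 1  /bin/bash /home/USERNAME/scripts/weekly_job.sh >> /home/USERNAME/logs/weekly_job.log 2>&1
```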
&lt;h3 id="setting-path-for-cron-jobs">Setting PATH for cron jobs&lt;/h3>
&lt;p>For some reason the PATH isn&amp;rsquo;t properly set when running cron jobs, so you need to set it at the top of the crontab.
Add a line like the following, including any additional paths you need (e.g., the location of your conda environments).&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">&lt;span class="nv">PATH&lt;/span>&lt;span class="o">=&lt;/span>/opt/slurm/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/bin:/home/USERNAME/bin:/blue/ewhite/USERNAME/miniconda3/bin/
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="check-if-running-on-hipergator">Check if running on HiPerGator&lt;/h2>
&lt;p>Sometimes it&amp;rsquo;s useful to have code execute one way on your local computer and another way on the HPC. For HiPerGator you can do this by checking the environment variable &lt;code>HOSTNAME&lt;/code> and looking to see if it contains &lt;code>ufhpc&lt;/code>. For example, in R:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">if (grepl(&amp;#34;ufhpc&amp;#34;, Sys.getenv(&amp;#34;HOSTNAME&amp;#34;))){
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> hipergator_run()
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">} else {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> local_run()
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">}
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>If you are submitting via SLURM, the hostname will not contain &amp;ldquo;ufhpc&amp;rdquo; but the nodename will. So use this logic:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">nodename &amp;lt;- Sys.info()[&amp;#34;nodename&amp;#34;]
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">if(grepl(&amp;#34;ufhpc&amp;#34;, nodename)) {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> print(&amp;#34;I know I am on SLURM!&amp;#34;)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">}
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="using-rstudio-on-the-hipergator">Using RStudio on HiPerGator&lt;/h2>
&lt;p>See the main RC wiki page on running GUI programs: &lt;a href="https://help.rc.ufl.edu/doc/GUI_Programs#Start_a_GUI_Session_on_HiPerGator" target="_blank" rel="noopener">https://help.rc.ufl.edu/doc/GUI_Programs#Start_a_GUI_Session_on_HiPerGator&lt;/a>&lt;/p>
&lt;h2 id="using-vscode-on-hipergator">Using VSCODE on hipergator&lt;/h2>
&lt;p>VS Code is a great development environment for many languages (Python, Java, Bash), and allows powerful integration with GitHub Copilot and other debugging tools. The HiPerGator docs &lt;a href="https://help.rc.ufl.edu/doc/SSH_Using_VS_Code" target="_blank" rel="noopener">hint&lt;/a> at how to do this, but don&amp;rsquo;t make it clear how to check out a node and develop with those resources. We can use &lt;a href="https://code.visualstudio.com/docs/remote/tunnels" target="_blank" rel="noopener">VS Code tunnels&lt;/a> to do this easily.&lt;/p>
&lt;p>Start by creating a SLURM script to get a development node. In this case, I want a GPU node.&lt;/p>
&lt;ol>
&lt;li>Write the job script:&lt;/li>
&lt;/ol>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">&lt;span class="o">(&lt;/span>base&lt;span class="o">)&lt;/span> &lt;span class="o">[&lt;/span>b.weinstein@login11 ~&lt;span class="o">]&lt;/span>$ cat tunnel.sh
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#!/bin/bash&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --job-name=tunnel # Job name&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --mail-type=END # Mail events&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --mail-user=benweinstein2010@gmail.com # Where to send mail&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --account=ewhite&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --nodes=1 # Number of MPI ran&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --cpus-per-task=10&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --mem=70GB&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --time=12:00:00 #Time limit hrs:min:sec&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --output=/home/b.weinstein/logs/tunnel.out # Standard output and error log&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --error=/home/b.weinstein/logs/tunnel.err&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="c1">#SBATCH --gpus=1&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">module load vscode
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">export&lt;/span> &lt;span class="nv">XDG_RUNTIME_DIR&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="si">${&lt;/span>&lt;span class="nv">SLURM_TMPDIR&lt;/span>&lt;span class="si">}&lt;/span>&lt;span class="p">;&lt;/span> code tunnel
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;ol start="2">
&lt;li>Submit the job and view the logs&lt;/li>
&lt;/ol>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">(base) [b.weinstein@login11 ~]$ sbatch tunnel.sh
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">(base) [b.weinstein@login11 ~]$ cat /home/b.weinstein/logs/tunnel.out
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">*
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">* Visual Studio Code Server
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">*
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">* By using the software, you agree to
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">* the Visual Studio Code Server License Terms (https://aka.ms/vscode-server-license) and
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">* the Microsoft Privacy Statement (https://privacy.microsoft.com/en-US/privacystatement).
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">*
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">[2024-05-23 11:56:57] info Using Github for authentication, run `code tunnel user login --provider &amp;lt;provider&amp;gt;` option to change this.
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">To grant access to the server, please log into https://github.com/login/device and use code 3390-CCD9
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;ol start="3">
&lt;li>
&lt;p>Go to &lt;a href="https://github.com/login/device" target="_blank" rel="noopener">https://github.com/login/device&lt;/a> and authenticate with the code. You will see the hipergator logs change and successfully connect.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Go to your local VS Code instance, activate the &amp;lsquo;Remote Explorer&amp;rsquo; extension, and click on &amp;lsquo;Tunnels&amp;rsquo;. You will see the HiPerGator tunnel listed.&lt;/p>
&lt;/li>
&lt;/ol>
&lt;img width="972" alt="image" src="https://github.com/weecology/wiki/assets/1208492/6fc30817-5e84-4350-bebf-be6318ebbc69">
&lt;p>Success! Now you are on the GPU node and can debug and run with those resources!&lt;/p></description></item><item><title>Lab Coding Guidelines</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/lab-coding-guidelines/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/lab-coding-guidelines/</guid><description>&lt;p>This is a summary of the 2018-10-10 lab meeting where we discussed coding practices. The &lt;a href="https://hackmd.io/K4ARCohDQN2YTj6Reb3Vtw" target="_blank" rel="noopener">full notes&lt;/a> are available online.&lt;/p>
&lt;h2 id="desired-qualities">Desired Qualities:&lt;/h2>
&lt;ul>
&lt;li>completeness
&lt;ul>
&lt;li>code for all the figures and analyses&lt;/li>
&lt;li>external dependencies documented&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>readable
&lt;ul>
&lt;li>uses good standards for code style&lt;/li>
&lt;li>comments help guide navigation to different parts&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>(re)usable
&lt;ul>
&lt;li>description of how to run everything, few changes needed to run it all&lt;/li>
&lt;li>examples for functions&lt;/li>
&lt;li>functions written to be flexible (e.g. less dependence on &amp;ldquo;magic numbers&amp;rdquo; and hard-coded parameter values)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="practices">Practices:&lt;/h2>
&lt;ul>
&lt;li>linter (for style)&lt;/li>
&lt;li>tests (to check functions, could also provide simple examples)&lt;/li>
&lt;li>pair programming checks for readability&lt;/li>
&lt;li>documentation&lt;/li>
&lt;li>refactoring core code into reusable packages&lt;/li>
&lt;li>containers&lt;/li>
&lt;/ul>
&lt;h2 id="action-items">Action Items:&lt;/h2>
&lt;ul>
&lt;li>training (and organizing it)&lt;/li>
&lt;li>toolchain (linter, tests, development tools)&lt;/li>
&lt;li>workflow and organization&lt;/li>
&lt;li>regular practices&lt;/li>
&lt;/ul></description></item><item><title>Lab Server - Serenity</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/serenity/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/serenity/</guid><description>&lt;h2 id="overview">Overview&lt;/h2>
&lt;p>Some useful commands to help navigate and use Serenity&lt;/p>
&lt;p>To log into Serenity, you will need to be under the university network or use a &lt;a href="https://it.ufl.edu/ict/documentation/network-infrastructure/vpn/" target="_blank" rel="noopener">VPN&lt;/a> provided by the university.&lt;/p>
&lt;p>Log in:&lt;/p>
&lt;p>&lt;code>ssh username@serenity.ifas.ufl.edu&lt;/code>&lt;/p>
&lt;p>If you see the warning below&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>edit &lt;code>~/.ssh/known_hosts&lt;/code> and remove the line for serenity&lt;/p>
&lt;p>To change your password, use&lt;/p>
&lt;p>&lt;code>passwd username&lt;/code>&lt;/p>
&lt;p>OR&lt;/p>
&lt;p>&lt;code>sudo passwd username&lt;/code>&lt;/p>
&lt;p>&lt;a href="https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/ssh/#setup-instructions-serenity--hipergator--generic">To set up ssh keys&lt;/a>&lt;/p>
&lt;p>&lt;a href="https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/rstudio-on-serenity/">RStudio on serenity&lt;/a>&lt;/p></description></item><item><title>Editing the Lab Website</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/lab-website/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/lab-website/</guid><description>&lt;h2 id="cloning-the-website-repository">Cloning the Website Repository&lt;/h2>
&lt;p>First, fork the &lt;a href="https://github.com/weecology/website" target="_blank" rel="noopener">website repository&lt;/a> to your own GitHub account. Then, clone your fork to your local machine and create a new branch. It&amp;rsquo;s recommended to work on a branch rather than the default &lt;code>main&lt;/code> branch to keep your changes organized.&lt;/p>
&lt;h2 id="one-time-setup">One-Time Setup&lt;/h2>
&lt;p>To get started, follow these steps for the initial setup:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>&lt;strong>Install the &lt;code>blogdown&lt;/code> R package:&lt;/strong>&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-r" data-lang="r">&lt;span class="line">&lt;span class="cl">&lt;span class="nf">install.packages&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s">&amp;#34;blogdown&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;/li>
&lt;li>
&lt;p>&lt;strong>Install Hugo:&lt;/strong>&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-r" data-lang="r">&lt;span class="line">&lt;span class="cl">&lt;span class="n">blogdown&lt;/span>&lt;span class="o">::&lt;/span>&lt;span class="nf">install_hugo&lt;/span>&lt;span class="p">()&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;/li>
&lt;li>
&lt;p>&lt;strong>Create a New R Project:&lt;/strong>
Open RStudio, navigate to your local copy of the website repository, and create a new R project in that directory.&lt;/p>
&lt;/li>
&lt;/ol>
&lt;h2 id="adding-new-members">Adding New Members&lt;/h2>
&lt;p>To add a new member to the website, follow these steps:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>&lt;strong>Create a Directory:&lt;/strong>
Create a directory named &lt;code>content/authors/first-lastname/&lt;/code>.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Add an Avatar Image:&lt;/strong>
Place an avatar image file named &lt;code>avatar.jpg&lt;/code> in the directory: &lt;code>content/authors/first-lastname/avatar.jpg&lt;/code>.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Create and Customize an &lt;code>_index.md&lt;/code> File:&lt;/strong>
Copy an existing &lt;code>_index.md&lt;/code> file from &lt;code>content/authors/&lt;/code> into the new directory and customize it with the new member&amp;rsquo;s details:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">cp content/authors/existing-author/_index.md content/authors/first-lastname/_index.md
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;/li>
&lt;/ol>
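&lt;p>For orientation, an author&amp;rsquo;s &lt;code>_index.md&lt;/code> front matter typically looks something like the sketch below. The exact field names are set by the theme, so treat these as illustrative and mirror the existing file you copied rather than this sketch:&lt;/p>

```yaml
---
# Illustrative front matter for content/authors/first-lastname/_index.md.
# Field names here are examples only -- copy them from an existing author file.
title: First Lastname
role: PhD Student
user_groups:
  - Grad Students
---
```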
&lt;h2 id="viewing-your-changes-locally">Viewing Your Changes Locally&lt;/h2>
&lt;p>To see your changes in real-time on your local computer, follow these steps:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>&lt;strong>Open the R Project:&lt;/strong>
Open your R project in RStudio.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Serve the Site:&lt;/strong>
Run the following command in the R console:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-r" data-lang="r">&lt;span class="line">&lt;span class="cl">&lt;span class="n">blogdown&lt;/span>&lt;span class="o">::&lt;/span>&lt;span class="nf">serve_site&lt;/span>&lt;span class="p">()&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;/li>
&lt;li>
&lt;p>&lt;strong>View in Browser:&lt;/strong>
The site will appear in a pane within RStudio. Click the &lt;code>Show in new window&lt;/code> button to open it in your web browser. The site should update automatically as you save files, though it may take a few seconds to reflect changes.&lt;/p>
&lt;/li>
&lt;/ol>
&lt;h2 id="submitting-your-changes">Submitting Your Changes&lt;/h2>
&lt;p>When you&amp;rsquo;re ready to submit your changes, follow these steps:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>&lt;strong>Create a Pull Request (PR):&lt;/strong>
Push your branch to your GitHub fork and create a pull request to the original repository. This will allow others to review your changes.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Netlify Deployment Preview:&lt;/strong>
After creating the PR, Netlify will automatically provide a deploy preview for the website. Check this preview to ensure everything looks correct.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Merge the PR:&lt;/strong>
If everything works fine, the PR will be merged.&lt;/p>
&lt;/li>
&lt;/ol>
&lt;h2 id="after-merging">After Merging&lt;/h2>
&lt;p>Once the PR is merged, the website will rebuild automatically. This process may take a few minutes, so please be patient.
If you run into any issues, don&amp;rsquo;t hesitate to ask for help.&lt;/p></description></item><item><title>Parallelizing Code in R</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/parallelization-r/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/parallelization-r/</guid><description>&lt;h2 id="overview">Overview&lt;/h2>
&lt;p>The basic idea of parallelization is the running of computational tasks simultaneously, as opposed to sequentially (or &amp;ldquo;in parallel&amp;rdquo; as opposed to &amp;ldquo;in sequence&amp;rdquo;). To do this, the computer needs to be able to split code into pieces that can run independently and then be joined back together as if they had been run sequentially. The parts of the computer that run the pieces of code are processing units and are typically called &amp;ldquo;cores&amp;rdquo;.&lt;/p>
&lt;p>The &lt;code>doParallel&lt;/code> package is (as far as I&amp;rsquo;m currently aware) the most platform-general and robust parallel package available in R. There is more functionality for Unix-alikes in the &lt;code>parallel&lt;/code> package (see link below), but that doesn&amp;rsquo;t transfer to Windows machines.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">library(doParallel)
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="cores-and-clusters">Cores and Clusters&lt;/h2>
&lt;p>At the outset, it&amp;rsquo;s important to know how many cores you have available, which the &lt;code>detectCores()&lt;/code> function returns. However, the value returned includes &amp;ldquo;hyperthreaded cores&amp;rdquo; (aka &amp;ldquo;logical cores&amp;rdquo;), which are additional logical subdivisions of the physical cores.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">detectCores()
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">detectCores(logical = FALSE)
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Although hyperthreaded cores are available, they often do not show speed gains in R. However, as with everything parallel in R, it&amp;rsquo;s best to actually test that on the problem you&amp;rsquo;re working on, as there will be gains in certain situations.&lt;/p>
&lt;p>At the user-interface level, we only interact at a single point, yet the code needs to access multiple cores at once. To achieve this, we have the concept of a &amp;ldquo;cluster&amp;rdquo;, which represents a parallel set of copies of R running across multiple cores (including across physical sockets). We create a cluster using the &lt;code>makeCluster&lt;/code> function and register the backend using the &lt;code>registerDoParallel&lt;/code> function:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">ncores &amp;lt;- detectCores(logical = FALSE)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">cl &amp;lt;- makeCluster(floor(0.75 * ncores))
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">registerDoParallel(cl)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">stopCluster(cl)
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Here, I&amp;rsquo;ve stopped the cluster explicitly using &lt;code>stopCluster&lt;/code>, which frees up the computational resources. The cluster will be automatically stopped when you end your R instance, but it&amp;rsquo;s a good habit to stop any clusters you make (even if you don&amp;rsquo;t register the backend).&lt;/p>
&lt;h2 id="foreach-and-dopar">foreach and %dopar%&lt;/h2>
&lt;p>Parallelization in &lt;code>doParallel&lt;/code> happens via the combination of the &lt;code>foreach&lt;/code> and &lt;code>%dopar%&lt;/code> operators in a fashion similar to &lt;code>for&lt;/code> loops. Rather than &lt;code>for(variable in values) {expression}&lt;/code>, we have &lt;code>foreach(variable = values, options) %dopar% {expression}&lt;/code>. Thus, the basic code block of a &lt;code>foreach&lt;/code> parallel loop is&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-gdscript3" data-lang="gdscript3">&lt;span class="line">&lt;span class="cl">&lt;span class="n">cl&lt;/span> &lt;span class="o">&amp;lt;-&lt;/span> &lt;span class="n">makeCluster&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nb">floor&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="mf">0.75&lt;/span> &lt;span class="o">*&lt;/span> &lt;span class="n">ncores&lt;/span>&lt;span class="p">))&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">registerDoParallel&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">cl&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">foreach&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">variable&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">values&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">options&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="o">%&lt;/span>&lt;span class="n">dopar&lt;/span>&lt;span class="o">%&lt;/span> &lt;span class="p">{&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="n">expression&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="p">}&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">stopCluster&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">cl&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Note that there can be multiple variable arguments (&lt;em>e.g.&lt;/em>, &lt;code>foreach(i = 1:10, j = 11:20) %dopar% {}&lt;/code>), but the variables are not recycled, so iteration runs only over the length of the shortest one.&lt;/p>
&lt;p>Each of the instances of a &lt;code>foreach&lt;/code> is referred to as a &amp;ldquo;task&amp;rdquo; (a key word, especially in terms of error checking, see link below).&lt;/p>
&lt;p>An important distinction between &lt;code>foreach&lt;/code> and &lt;code>for&lt;/code> is that &lt;code>foreach&lt;/code> returns a value (by default, a list), whereas &lt;code>for&lt;/code> causes side effects. Compare the following:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">cl &amp;lt;- makeCluster(floor(0.75 * ncores))
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">registerDoParallel(cl)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">out &amp;lt;- foreach(i = 1:10) %dopar% {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> i + rnorm(1, i, i)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">}
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">stopCluster(cl)
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>vs.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">out &amp;lt;- vector(&amp;#34;list&amp;#34;, 10)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">for(i in 1:10){
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> out[[i]] &amp;lt;- i + rnorm(1, i, i)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">}
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>foreach&lt;/code> creates &lt;code>out&lt;/code>; &lt;code>for&lt;/code> modifies it.&lt;/p>
&lt;p>There are a handful of option arguments in &lt;code>foreach&lt;/code> that are really critical for anything beyond trivial computation:&lt;/p>
&lt;ul>
&lt;li>&lt;code>.packages&lt;/code> passes the libraries down to the cluster of Rs. If you don&amp;rsquo;t include package names and you use a package-derived function, it won&amp;rsquo;t work.&lt;/li>
&lt;li>&lt;code>.combine&lt;/code> dictates how the output is combined, defaulting to a list, but allowing lots of flexibility including mathematical operations. If needed, the &lt;code>.init&lt;/code> option allows you to initialize the value (for example with 0 or 1).&lt;/li>
&lt;li>&lt;code>.inorder&lt;/code> sets whether the combination happens based on the order of inputs. If the order isn&amp;rsquo;t important, setting this to &lt;code>FALSE&lt;/code> can give performance gains.&lt;/li>
&lt;li>&lt;code>.errorhandling&lt;/code> determines what happens when a task fails (see link below for using &lt;code>tryCatch&lt;/code> within tasks).&lt;/li>
&lt;/ul>
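&lt;p>As a minimal sketch of &lt;code>.combine&lt;/code> and &lt;code>.init&lt;/code>, the following sums the task results into a single number instead of collecting them in a list:&lt;/p>

```r
library(doParallel)

cl = makeCluster(2)
registerDoParallel(cl)

# .combine = "+" folds the ten task results together as they return;
# .init = 0 seeds the running total
total = foreach(i = 1:10, .combine = "+", .init = 0) %dopar% {
  i^2
}

stopCluster(cl)
total  # 385
```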
&lt;p>In addition to &lt;code>%dopar%&lt;/code>, there is a sequential operator for use with &lt;code>foreach&lt;/code>, and it is simply &lt;code>%do%&lt;/code>. Replacing &lt;code>%dopar%&lt;/code> with &lt;code>%do%&lt;/code> will cause the code to run in-order, as if you did not initiate a cluster. Similarly, you can run &lt;code>foreach&lt;/code> and &lt;code>%dopar%&lt;/code> without having made a cluster, but the code will be run sequentially.&lt;/p>
&lt;h2 id="nesting-foreach-loops">Nesting foreach loops&lt;/h2>
&lt;p>Nested loops can often be really powerful for computation. &lt;code>doParallel&lt;/code> has a special operator that combines two &lt;code>foreach&lt;/code> objects in a nested manner: &lt;code>%:%&lt;/code>. This operator causes the outermost &lt;code>foreach&lt;/code> to be evaluated over its variables&amp;rsquo; values, which are then passed down into the next innermost &lt;code>foreach&lt;/code>. That &lt;code>foreach&lt;/code> iterates over its variables&amp;rsquo; values for each value of the outer &lt;code>foreach&lt;/code> variables.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">cl &amp;lt;- makeCluster(floor(0.75 * ncores))
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">registerDoParallel(cl)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">out &amp;lt;- foreach(i = 1:10) %:%
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> foreach(j = 1:100) %dopar% {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> i * j^3
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> }
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">out
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">stopCluster(cl)
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The &lt;code>.combine&lt;/code> option is really important to pay attention to with nested &lt;code>foreach&lt;/code> loops, as it will allow you to flexibly structure the data. The above version produces a list (length 10) of lists (length 100).&lt;/p>
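&lt;p>For instance (a sketch), giving each loop its own &lt;code>.combine&lt;/code> collapses the nested result into a matrix rather than nested lists:&lt;/p>

```r
library(doParallel)

cl = makeCluster(2)
registerDoParallel(cl)

# Inner .combine = c flattens each inner loop into a length-100 vector;
# outer .combine = rbind stacks those vectors into a 10 x 100 matrix
out = foreach(i = 1:10, .combine = rbind) %:%
  foreach(j = 1:100, .combine = c) %dopar% {
    i * j^3
  }

stopCluster(cl)
dim(out)  # 10 100
```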
&lt;h2 id="seeds-and-rngs">Seeds and RNGs&lt;/h2>
&lt;p>One of the downfalls of &lt;code>foreach&lt;/code> and &lt;code>%dopar%&lt;/code> is that the parallel runs aren&amp;rsquo;t reproducible in a simple way. There are ways to code up seed setting yourself, but it&amp;rsquo;s a little obtuse. Thankfully, the &lt;code>doRNG&lt;/code> package has you covered. There are a few ways to code up a reproducible parallel loop.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">library(doRNG)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">cl &amp;lt;- makeCluster(floor(0.75 * ncores))
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">registerDoParallel(cl)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"># 1. .options.RNG
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">out1 &amp;lt;- foreach(i = 1:10, .options.RNG = 1234) %dorng% {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> rnorm(1, i, i^2)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">}
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"># 2. set.seed
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">set.seed(1234)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">out2 &amp;lt;- foreach(i = 1:10) %dorng% {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> rnorm(1, i, i^2)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">}
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"># 3. registerDoRNG (note that this doesn&amp;#39;t replace the registerDoParallel!)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">registerDoRNG(1234)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">out3 &amp;lt;- foreach(i = 1:10) %dorng% {
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> rnorm(1, i, i^2)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">}
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">stopCluster(cl)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">identical(out1, out2)
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">identical(out1, out3)
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="speed-gains">Speed Gains&lt;/h2>
&lt;p>The degree to which your code speeds up when you parallelize it depends on a few factors: how many cores you have, whether the code can take advantage of hyperthreading, and the run time of the code itself. There is some computational and time overhead involved in setting up a cluster, distributing the work, and combining the results, so short tasks (seconds to half a minute) are usually faster to run in sequence. Once tasks creep up to the half-minute mark on a dual-core machine (or less time on a machine with more cores), the parallel runs will start to be faster. As the runtime of the computation increases, the relative gain from parallelizing increases, although it never reaches the full factor of the core count (&lt;em>i.e.&lt;/em>, 6 cores won&amp;rsquo;t ever get you to 1/6 the runtime&amp;hellip;probably more like 1/4 to 1/5) due to overhead. The other thing to keep in mind is that parallelizing often takes additional coding time, and the extra troubleshooting it can require will quickly eliminate the time gains.&lt;/p>
&lt;p>See references below for speed tests.&lt;/p>
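&lt;p>A quick way to feel this tradeoff yourself (a sketch; exact timings vary by machine) is to compare &lt;code>%do%&lt;/code> and &lt;code>%dopar%&lt;/code> on artificially slow tasks:&lt;/p>

```r
library(doParallel)

cl = makeCluster(2)
registerDoParallel(cl)

# Four half-second tasks: roughly 2 s in sequence, roughly 1 s
# (plus cluster overhead) spread across 2 workers
seq_time = system.time(foreach(i = 1:4) %do% Sys.sleep(0.5))["elapsed"]
par_time = system.time(foreach(i = 1:4) %dopar% Sys.sleep(0.5))["elapsed"]

stopCluster(cl)
c(sequential = seq_time, parallel = par_time)
```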
&lt;h2 id="references">References&lt;/h2>
&lt;ul>
&lt;li>&lt;a href="https://beckmw.wordpress.com/2014/01/21/a-brief-foray-into-parallel-processing-with-r/" target="_blank" rel="noopener">example of foreach on Windows, including performance metrics&lt;/a>&lt;/li>
&lt;li>&lt;a href="http://resbaz.github.io/r-intermediate-gapminder/19-foreach.html" target="_blank" rel="noopener">software carpentry tutorial&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://github.com/tobigithub/R-parallel/wiki/R-parallel-Errors" target="_blank" rel="noopener">parallel error compendium&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://stackoverflow.com/questions/39262612/r-show-error-and-warning-messages-in-foreach-dopar" target="_blank" rel="noopener">setting up tryCatch inside foreach&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://www3.nd.edu/~steve/computing_with_data/22_parallel/parallel_foreach.html" target="_blank" rel="noopener">real world example of parallelization utility: likelihood profiling&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://stochasticcoder.com/2016/01/12/r-using-doparallel-to-significantly-speedup-database-retrieval/" target="_blank" rel="noopener">real world example of parallelization utility: database querying&lt;/a>&lt;/li>
&lt;li>&lt;a href="http://www2.stat.duke.edu/~cr173/Sta523_Fa14/parallelization.html" target="_blank" rel="noopener">details of mc&amp;mdash; functions, useful for Unix-alikes&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://github.com/benporter/parallel-speed-test-R" target="_blank" rel="noopener">example parallel speed test&lt;/a>&lt;/li>
&lt;/ul></description></item><item><title>Python - Package Documentation</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/python-package-documentation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/python-package-documentation/</guid><description>&lt;h2 id="add-docs">Add docs&lt;/h2>
&lt;h3 id="install-sphinx-and-the-markdown-parser">Install sphinx and the markdown parser&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">pip install sphinx myst-parser
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="make-a-docs-directory-and-run-the-quick-start">Make a docs directory and run the quick-start&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">mkdir docs
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="nb">cd&lt;/span> docs
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">sphinx-quickstart
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="add-markdown-and-autodoc-extensions-to-confpy">Add markdown and autodoc extensions to &lt;code>conf.py&lt;/code>&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="n">extensions&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;myst_parser&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s1">&amp;#39;sphinx.ext.autodoc&amp;#39;&lt;/span>&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="make-docs-files-and-edit-indexrst">Make docs files and edit index.rst&lt;/h3>
&lt;ul>
&lt;li>Edit &lt;code>index.rst&lt;/code> to include the information you want on the docs landing page&lt;/li>
&lt;li>Add markdown files for each separate page of docs&lt;/li>
&lt;li>In &lt;code>index.rst&lt;/code> add the names of the markdown files (without extensions) to the &lt;code>toctree&lt;/code> block. E.g., if we want to include the docs in &lt;code>installation.md&lt;/code> and &lt;code>getting-started.md&lt;/code>:&lt;/li>
&lt;/ul>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">.. toctree::
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> :maxdepth: 2
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> :caption: Contents:
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> installation
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> getting-started
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="setup-automatic-function-documentation">Setup automatic function documentation&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">sphinx-apidoc -f -o &lt;span class="nb">source&lt;/span> ../&amp;lt;package-name&amp;gt;
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="change-the-theme">Change the theme&lt;/h3>
&lt;ul>
&lt;li>Pick a theme (we currently use either sphinx-rtd-theme or furo)&lt;/li>
&lt;li>Install it&lt;/li>
&lt;/ul>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">pip install sphinx_rtd_theme
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;ul>
&lt;li>Change the theme value in &lt;code>conf.py&lt;/code>&lt;/li>
&lt;/ul>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="n">html_theme&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;sphinx_rtd_theme&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;ul>
&lt;li>If using the &lt;code>sphinx_rtd_theme&lt;/code> also add it to &lt;code>extensions&lt;/code>&lt;/li>
&lt;/ul>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="n">extensions&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;myst_parser&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s1">&amp;#39;sphinx.ext.autodoc&amp;#39;&lt;/span>&lt;span class="p">,&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl"> &lt;span class="s1">&amp;#39;sphinx_rtd_theme&amp;#39;&lt;/span>&lt;span class="p">]&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="build-the-docs">Build the docs&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-sh" data-lang="sh">&lt;span class="line">&lt;span class="cl">make html
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="build-docs-automatically-on-readthedocs">Build docs automatically on readthedocs&lt;/h2>
&lt;p>Your project should be in an online repository (e.g., GitHub or GitLab).&lt;/p>
&lt;h3 id="add-a-docs-requirementstxt-file">Add a docs requirements.txt file&lt;/h3>
&lt;p>In the docs directory add a &lt;code>requirements.txt&lt;/code> file that includes the extra packages required for building the docs.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">myst_parser
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">sphinx_rtd_theme
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="add-readthedocsyaml">Add .readthedocs.yaml&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-yaml" data-lang="yaml">&lt;span class="line">&lt;span class="cl">&lt;span class="nt">version&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s2">&amp;#34;2&amp;#34;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="nt">build&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">os&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s2">&amp;#34;ubuntu-22.04&amp;#34;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">tools&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">python&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s2">&amp;#34;3.10&amp;#34;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="nt">python&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">install&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">method&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">pip&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">path&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">.&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>- &lt;span class="nt">requirements&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">docs/requirements.txt&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w">&lt;/span>&lt;span class="nt">sphinx&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="w"> &lt;/span>&lt;span class="nt">configuration&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="l">docs/conf.py&lt;/span>&lt;span class="w">
&lt;/span>&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="connect-your-githubgitlab-account-to-readthedocs">Connect your GitHub/GitLab account to readthedocs&lt;/h3>
&lt;ul>
&lt;li>Go to &lt;a href="https://readthedocs.com/" target="_blank" rel="noopener">https://readthedocs.com/&lt;/a>&lt;/li>
&lt;li>Click &amp;lsquo;Sign up&amp;rsquo;&lt;/li>
&lt;li>Choose &amp;lsquo;Read the Docs Community&amp;rsquo;&lt;/li>
&lt;li>Click &amp;lsquo;Sign up with GitHub&amp;rsquo; (or GitLab)&lt;/li>
&lt;li>Follow the instructions&lt;/li>
&lt;/ul>
&lt;h3 id="connect-your-project">Connect your project&lt;/h3>
&lt;ul>
&lt;li>Go to &lt;a href="https://readthedocs.org/dashboard/" target="_blank" rel="noopener">https://readthedocs.org/dashboard/&lt;/a>&lt;/li>
&lt;li>Click &amp;lsquo;Import a Project&amp;rsquo;&lt;/li>
&lt;li>If the project is listed, select it&lt;/li>
&lt;li>If it is not listed, click &amp;lsquo;Import Manually&amp;rsquo; and provide the requested information&lt;/li>
&lt;/ul>
&lt;h3 id="enable-builds-for-prs">Enable builds for PRs&lt;/h3>
&lt;p>If you want to check the doc builds from your PRs, enable this by:&lt;/p>
&lt;ol>
&lt;li>Go to the project dashboard on readthedocs&lt;/li>
&lt;li>Select &amp;lsquo;Admin&amp;rsquo;&lt;/li>
&lt;li>Select &amp;lsquo;Advanced Settings&amp;rsquo;&lt;/li>
&lt;li>Click &amp;lsquo;Build pull requests for this project&amp;rsquo;&lt;/li>
&lt;/ol></description></item><item><title>R Resources</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/r-resources/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/r-resources/</guid><description>&lt;p>R is a powerful tool that allows you to do a lot of things such as doing simple arithmetic calculations, organizing and analyzing your data, and even developing your own software. There are several, &lt;strong>FREE&lt;/strong> resources that you can use to learn R. Below is a &lt;em>non-exhaustive&lt;/em> list of tutorials, books, and videos that you can use (most of which are useful for data analysis), whether you&amp;rsquo;re just starting out or are more advanced in programming.&lt;/p>
&lt;h2 id="r-at-an-introductory-level">R at an Introductory Level&lt;/h2>
&lt;ul>
&lt;li>&lt;a href="https://swirlstats.com/students.html" target="_blank" rel="noopener">swirlR&lt;/a> is an R package that allows you to learn R in the console at your own pace&lt;/li>
&lt;li>&lt;a href="https://tinystats.github.io/teacups-giraffes-and-statistics/index.html" target="_blank" rel="noopener">Teacups, Giraffes, and Statistics&lt;/a> is an interactive website that contains modules useful for learning statistics and R&lt;/li>
&lt;li>&lt;a href="https://rforcats.net/" target="_blank" rel="noopener">R for cats&lt;/a> is an introductory guide to the use of R with cat photos (this is a bonus!)&lt;/li>
&lt;li>&lt;a href="https://paulvanderlaken.files.wordpress.com/2017/08/r_in_a_nutshell.pdf" target="_blank" rel="noopener">R in a Nutshell&lt;/a> is a book that gives you a concise overview of the different things you can do in R&lt;/li>
&lt;li>&lt;a href="https://www.infoworld.com/article/3411819/do-more-with-r-video-tutorials.html" target="_blank" rel="noopener">Do More with R&lt;/a> is a website listing video tutorials on specific topics in R (most videos are said to be &amp;lt; 10 minutes in length)&lt;/li>
&lt;li>&lt;a href="https://datacarpentry.org/semester-biology/readings/R-intro/" target="_blank" rel="noopener">Data Carpentry for Biologists&lt;/a> is derived from the semester-long course taught By Dr. Ethan White that covers basic functions/use of R&lt;/li>
&lt;/ul>
&lt;h2 id="intermediate-or-advanced-use-of-r">Intermediate or Advanced use of R&lt;/h2>
&lt;ul>
&lt;li>&lt;a href="https://rstats.wtf/" target="_blank" rel="noopener">What They Forgot to Teach You about R&lt;/a> is a set of tips on doing effective, reproducible data analysis in R (and less about actual programming)&lt;/li>
&lt;li>&lt;a href="https://happygitwithr.com/" target="_blank" rel="noopener">Happy Git and GitHub for the UseR&lt;/a> is an instructional guide to using Git, GitHub, and R, which are all tools we use in the lab&lt;/li>
&lt;li>&lt;a href="https://mathstat.slu.edu/~speegle/_book/RData.html" target="_blank" rel="noopener">Foundations of Statistics with R&lt;/a> is a course book for learning probability and statistics using R&lt;/li>
&lt;li>&lt;a href="https://csgillespie.github.io/efficientR/introduction.html" target="_blank" rel="noopener">Efficient R Programming&lt;/a> is another book that would help you increase your algorithmic and programming efficiency when using R&lt;/li>
&lt;li>&lt;a href="https://adv-r.hadley.nz/preface.html" target="_blank" rel="noopener">Advanced R&lt;/a> is the second edition of a book that teaches you more advanced programming skills in R (it makes use of a new package called &lt;a href="https://rlang.r-lib.org/" target="_blank" rel="noopener">rlang&lt;/a>, which is an interface to low-level data structures and operations&lt;/li>
&lt;li>&lt;a href="https://ms.mcmaster.ca/~bolker/emdbook/book.pdf" target="_blank" rel="noopener">Ecological Models and Data in R&lt;/a> is a book about building models implemented in a frequentist or Bayesian framework to answer ecological questions&lt;/li>
&lt;/ul>
&lt;h2 id="getting-help">Getting Help&lt;/h2>
&lt;p>There are times when your code might not seem to work. Apart from the brilliant lab members who you can easily reach out to through Slack for help, you can post questions/concerns in different online communities such as:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://community.rstudio.com/" target="_blank" rel="noopener">RStudio Community&lt;/a> for R-specific questions&lt;/li>
&lt;li>&lt;a href="https://stackoverflow.com/" target="_blank" rel="noopener">Stack Overflow&lt;/a> for programming questions&lt;/li>
&lt;/ul></description></item><item><title>Using Git From RStudio</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/rstudio-git-integration/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/rstudio-git-integration/</guid><description>&lt;p>Just a brief introduction on how to easily set up a Git/GitHub repo in RStudio:&lt;/p>
&lt;ul>
&lt;li>Make sure Git is set up on your computer and in R (ask Ethan for help on this one)&lt;/li>
&lt;li>Sign into your GitHub account&lt;/li>
&lt;li>Click on the plus sign in the top right-hand corner (near your profile picture) and select &amp;ldquo;New repository&amp;rdquo;. You can choose to make your new repo through your account or the Weecology one on the next page.&lt;/li>
&lt;li>Name your repository, add a quick description if desired, select Public or Private (most often Public) and &lt;em>be sure to check the box to initialize a README document&lt;/em>. Click &lt;em>Create Repository.&lt;/em>&lt;/li>
&lt;li>You&amp;rsquo;ll now be on the page for your new repo. Find the green &lt;strong>Code&lt;/strong> button, which reveals either an HTTPS address or something that looks like an email address (SSH). Make sure the selector next to the text box says HTTPS rather than SSH. Then click the clipboard icon to the right of the text box to copy the address to your clipboard.&lt;/li>
&lt;li>Now you&amp;rsquo;re done with GitHub. Open up RStudio. Go to &lt;em>File&lt;/em> and select &lt;em>New Project&lt;/em>. If Git is properly set up in RStudio, you should have an option called &lt;em>Version Control&lt;/em>. Click on that option.&lt;/li>
&lt;li>Select the &lt;em>Git&lt;/em> option from the next menu.&lt;/li>
&lt;li>Paste the HTTPS address into the &lt;em>Repository URL&lt;/em> box. It will auto-fill the &lt;em>Project directory name&lt;/em>. Make sure the project is being created in the appropriate folder.&lt;/li>
&lt;li>Click &lt;em>Create Project&lt;/em> and voila!&lt;/li>
&lt;li>Now you should see a &lt;em>Git&lt;/em> tab next to the &lt;em>Environment&lt;/em> and &lt;em>History&lt;/em> tabs. From there, you can make commits, push (green up arrow), and pull (blue down arrow).&lt;/li>
&lt;/ul></description></item><item><title>RStudio on Serenity</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/rstudio-on-serenity/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/rstudio-on-serenity/</guid><description>&lt;p>We have a lab server which is fairly powerful. It has 32GB of RAM and a 2 core Intel Xeon processor. This is faster and has more ram than any of our laptops. It&amp;rsquo;s good for doing things that may take too long on a desktop/laptop or as a test bed for running on the hipergator.&lt;/p>
&lt;p>Note you must be on campus to use serenity, or logged in via the VPN.&lt;/p>
&lt;h2 id="using-rstudio">Using RStudio&lt;/h2>
&lt;p>Open a browser, go to &lt;a href="http://serenity.ifas.ufl.edu:8787" target="_blank" rel="noopener">http://serenity.ifas.ufl.edu:8787&lt;/a>, and log in with your serenity username and password. Note that this only works on campus; for off-campus access, see below.&lt;br>
This runs exactly like RStudio on your desktop. You&amp;rsquo;ll have to re-install any packages that you need.
This works great for code that takes a long time to run. You can start something, then close the browser, and RStudio will keep running on serenity.&lt;/p>
&lt;h2 id="logging-in-from-off-campus">Logging in from off campus&lt;/h2>
&lt;p>There are two options to login from off campus. The first is to use the UFL VPN. Once connected you can go to the address above. The second is to access it via the hipergator login node using the steps below.&lt;/p>
&lt;h3 id="on-windows">On Windows&lt;/h3>
&lt;p>Download the putty ssh client &lt;a href="https://the.earth.li/~sgtatham/putty/latest/w64/putty.exe" target="_blank" rel="noopener">here&lt;/a>.&lt;/p>
&lt;p>Open putty and make a connection to the hipergator login node. Put &lt;code>hpg.rc.ufl.edu&lt;/code> into the Host Name box. Put &lt;code>hipergator&lt;/code> into the Saved Sessions box and click Save to save this setup. Then in the menu go to Connection -&amp;gt; SSH -&amp;gt; Auth -&amp;gt; Tunnels. Put &lt;code>8787&lt;/code> in the source port and &lt;code>serenity.ifas.ufl.edu:8787&lt;/code> in the Destination and click add.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://i.imgur.com/gEtmuCn.png" alt="" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>Then, on the left side, go back to the Session tab and click Save again. Click Open to connect and enter your username and password. Once connected, open a browser and go to http://localhost:8787.&lt;/p>
&lt;h3 id="on-mac-or-linux">On mac or linux&lt;/h3>
&lt;p>SSH to the hipergator using the following command:&lt;/p>
&lt;p>&lt;code>ssh &amp;lt;username&amp;gt;@hpg.rc.ufl.edu -L 8787:serenity.ifas.ufl.edu:8787&lt;/code>&lt;/p>
&lt;p>Once logged in, open a browser and go to http://localhost:8787.&lt;/p>
&lt;h2 id="getting-a-login">Getting a login&lt;/h2>
&lt;p>Ask Shawn or Henry for a username and initial password&lt;/p></description></item><item><title>Computer Security &amp; Privacy</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/computer-security-privacy/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/computer-security-privacy/</guid><description>&lt;h2 id="password-managers">Password Managers&lt;/h2>
&lt;p>A password manager keeps track of the various passwords you use for different websites. (And the reason to use a different password for each account is that if your password at one place is compromised, it does not affect your other accounts.)&lt;/p>
&lt;ul>
&lt;li>Commonly recommended password managers are &lt;a href="https://www.lastpass.com" target="_blank" rel="noopener">Lastpass&lt;/a> and &lt;a href="https://1password.com" target="_blank" rel="noopener">1password&lt;/a>.&lt;/li>
&lt;li>Lastpass has a free option, but it only syncs passwords across similar device types (e.g. between computers, or between smartphones), not between a computer and a smartphone. Paid plans for Lastpass start at $2/month.&lt;/li>
&lt;li>1password seems to have a nicer interface, but plans start at $2.99/month.&lt;/li>
&lt;li>Password managers are most commonly used as browser plug-ins that can remember passwords, generate new random passwords, and autofill them.&lt;/li>
&lt;li>Both Lastpass and 1password have secure notes, which let you keep track of answers to security questions, useful if you want to give random responses rather than real answers (like the middle school you went to) that anyone can look up.&lt;/li>
&lt;li>Apple&amp;rsquo;s iCloud Keychain is another option if you have all Apple devices. Here&amp;rsquo;s a &lt;a href="https://www.macworld.com/article/3060630/ios/why-not-pick-keychain-instead-of-1password-or-lastpass.html" target="_blank" rel="noopener">discussion&lt;/a> on the differences with third-party password managers.&lt;/li>
&lt;li>Since your email usually allows you to reset passwords, it can be preferable to NOT put your email password into your password manager, and instead to secure your email with another strong passphrase.&lt;/li>
&lt;/ul>
&lt;h2 id="passphrases">Passphrases&lt;/h2>
&lt;p>Because password managers are secured by a single master passphrase (that unlocks all your passwords), it is recommended that you use a strong passphrase.&lt;/p>
&lt;ul>
&lt;li>One method is to string together random words, possibly with capitalization and special characters added.&lt;/li>
&lt;li>Rather than use a website for this (who knows if it is truly random or recording the output it gives you), you can use &lt;a href="https://www.eff.org/dice" target="_blank" rel="noopener">dice and a word list&lt;/a>.&lt;/li>
&lt;li>While trying to memorize the passphrase, you might consider having a written copy that you keep somewhere safe.&lt;/li>
&lt;/ul>
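The dice-and-word-list idea can be sketched in a few lines of Python using the cryptographically secure `secrets` module; the short word list here is only a placeholder, a real passphrase should draw from a large standard list such as EFF's:

```python
import secrets

# Placeholder word list; a real one (e.g. the EFF list) has thousands of words.
WORDS = ["correct", "horse", "battery", "staple", "orchid", "granite"]

def passphrase(n_words=5, sep="-"):
    """Join randomly chosen words with a separator, using a secure RNG."""
    return sep.join(secrets.choice(WORDS) for _ in range(n_words))

print(passphrase())
```

The strength of such a passphrase comes from the number of words and the size of the list, not from the words themselves.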
&lt;h2 id="two-factor-authentication">Two-Factor Authentication&lt;/h2>
&lt;p>Two-factor authentication (commonly 2FA or MFA for multi-factor) means identifying yourself using components from at least two different categories:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>something you know (e.g. a password)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>something you have (e.g. a key or phone)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>something you are (e.g. fingerprints, biometrics).
Thus, if someone else has your password, they still can&amp;rsquo;t access your account without also having, e.g., your phone.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Many large sites (e.g. banks, email) with security concerns have 2FA as an option, but it might need to be turned on. For example, here are the instructions for &lt;a href="https://www.google.com/landing/2step/" target="_blank" rel="noopener">gmail&lt;/a>.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Common implementations are to send a code by email or text when you log in.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Some places also support apps or devices that generate rolling codes (e.g. &lt;a href="https://support.google.com/accounts/answer/1066447?co=GENIE.Platform%3DAndroid&amp;amp;hl=en" target="_blank" rel="noopener">Google Authenticator&lt;/a>). These codes rotate regularly, for example, every minute. Once synced with a website during 2FA setup, you can use the rolling code to authenticate.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Hardware tokens are also possible, for example &lt;a href="https://www.yubico.com/products/yubikey-hardware/" target="_blank" rel="noopener">Yubikeys&lt;/a>. You will want to check compatibility with services. For instance, the basic U2F key will not work as 2FA for Lastpass, and you also need a paid Lastpass plan.&lt;/p>
&lt;/li>
&lt;/ul>
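For the curious, the rolling codes described above are typically TOTP (RFC 6238): an HMAC of the current 30-second time step, truncated to six digits. A minimal sketch in Python's standard library:

```python
import hashlib
import hmac
import struct
import time

def totp(secret: bytes, t=None, step=30, digits=6):
    """Rolling code: HMAC-SHA1 of the current time step, truncated per RFC 6238."""
    counter = int((time.time() if t is None else t) // step)
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] % 16  # dynamic truncation: low nibble of last byte
    code = struct.unpack(">I", digest[offset:offset + 4])[0] % (2**31)
    return str(code % 10**digits).zfill(digits)

# RFC 6238 test vector: this secret at t=59 yields the code 287082
print(totp(b"12345678901234567890", t=59))
```

Both the app and the server compute the same code from a shared secret and the clock, which is why the codes agree without any network connection.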
&lt;h2 id="misc">Misc.&lt;/h2>
&lt;ul>
&lt;li>EFF has a &lt;a href="https://www.eff.org/https-everywhere" target="_blank" rel="noopener">browser extension&lt;/a> to use encrypted HTTPS when possible.&lt;/li>
&lt;li>EFF also has a &lt;a href="https://www.eff.org/privacybadger" target="_blank" rel="noopener">browser extension&lt;/a> to block ads from tracking you across websites.&lt;/li>
&lt;li>Don&amp;rsquo;t plug in unknown USB devices to your computer or your USB devices into unknown ports. (&lt;a href="https://www.reuters.com/article/us-nuclearpower-cyber-germany/german-nuclear-plant-infected-with-computer-viruses-operator-says-idUSKCN0XN2OS" target="_blank" rel="noopener">e.g. &amp;#x1f631;&lt;/a>):&lt;/li>
&lt;/ul>
&lt;blockquote>
&lt;p>As an example, Hypponen said he had recently spoken to a European aircraft maker that said it cleans the cockpits of its planes every week of malware designed for Android phones. The malware spread to the planes only because factory employees were charging their phones with the USB port in the cockpit.&lt;/p>
&lt;p>Because the plane runs a different operating system, nothing would befall it. But it would pass the virus on to other devices that plugged into the charger.&lt;/p>
&lt;/blockquote>
&lt;ul>
&lt;li>You can use incognito / private mode when browsing to bypass some paywalls (e.g. NYT&amp;rsquo;s 10-article/month limit)&lt;/li>
&lt;li>&lt;a href="https://objective-see.com/products/oversight.html" target="_blank" rel="noopener">OverSight&lt;/a> comes recommended by some folks as a way to notify about microphone and camera usage on Macs.&lt;/li>
&lt;/ul></description></item><item><title>Software Testing in R</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/software-testing-r/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/software-testing-r/</guid><description>&lt;p>&lt;a href="https://github.com/weecology/rtestpackage" target="_blank" rel="noopener">https://github.com/weecology/rtestpackage&lt;/a>&lt;/p>
&lt;p>This repo contains a description of how to set up the test environment for R.&lt;/p>
&lt;p>There are additional references from which the information was derived.&lt;/p></description></item><item><title>Statistics for Software Use</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/statistics-for-software-downloads/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/statistics-for-software-downloads/</guid><description>&lt;p>It can be helpful to know how frequently your software is being downloaded to assess its use and report on its impact. Below are instructions for how to do this for different languages. Keep in mind that downloads can be influenced by automated testing systems if they install the software from the central repository.&lt;/p>
&lt;h2 id="r">R&lt;/h2>
&lt;p>These instructions get downloads from the cloud CRAN mirror. This is a minimum estimate of total downloads.&lt;/p>
&lt;ul>
&lt;li>Install the &lt;a href="https://github.com/r-hub/cranlogs" target="_blank" rel="noopener">&lt;code>cranlogs&lt;/code> package&lt;/a> &lt;code>install.packages(&amp;quot;cranlogs&amp;quot;)&lt;/code>&lt;/li>
&lt;li>Run the &lt;code>cran_downloads&lt;/code> function with your package name and date ranges if desired&lt;/li>
&lt;/ul>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-r" data-lang="r">&lt;span class="line">&lt;span class="cl">&lt;span class="n">downloads&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">cranlogs&lt;/span>&lt;span class="o">::&lt;/span>&lt;span class="nf">cran_downloads&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">packages&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="nf">c&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s">&amp;#34;portalr&amp;#34;&lt;/span>&lt;span class="p">),&lt;/span> &lt;span class="n">from&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s">&amp;#34;2020-02-26&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">to&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s">&amp;#34;2020-08-16&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">total_downloads&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="nf">sum&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">downloads&lt;/span>&lt;span class="o">$&lt;/span>&lt;span class="n">count&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="python">Python&lt;/h2>
&lt;p>Python packages are often distributed using both PyPI and conda so you might need to get download statistics for both.&lt;/p>
&lt;h3 id="pypi">PyPI&lt;/h3>
&lt;h4 id="using-the-pypi-stats-websitehttpspypistatsorg">Using the &lt;a href="https://pypistats.org/" target="_blank" rel="noopener">PyPI Stats website&lt;/a>&lt;/h4>
&lt;ul>
&lt;li>PyPI Stats is easy to use and provides the last 6 months of data.&lt;/li>
&lt;li>E.g., &lt;a href="https://pypistats.org/packages/deepforest" target="_blank" rel="noopener">https://pypistats.org/packages/deepforest&lt;/a>&lt;/li>
&lt;/ul>
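PyPI Stats also exposes a JSON API that can be scripted; the endpoint path in this sketch is our assumption based on the site's public API documentation, so verify it before relying on it:

```python
import json
from urllib.request import urlopen

def stats_url(package):
    # Assumed endpoint: recent download counts for one package
    return f"https://pypistats.org/api/packages/{package}/recent"

def recent_downloads(package):
    # Expected to return a dict along the lines of
    # {"last_day": ..., "last_week": ..., "last_month": ...}
    with urlopen(stats_url(package)) as resp:
        return json.load(resp)["data"]
```

E.g., `recent_downloads("deepforest")` would fetch the last day/week/month counts for deepforest.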
&lt;h4 id="using-google-bigquery">Using Google BigQuery&lt;/h4>
&lt;ul>
&lt;li>Set up a Google BigQuery account&lt;/li>
&lt;li>Run a version of this query, with additional modifications as necessary&lt;/li>
&lt;/ul>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-Python" data-lang="Python">&lt;span class="line">&lt;span class="cl">&lt;span class="n">SELECT&lt;/span> &lt;span class="n">COUNT&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="n">downloads&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">FROM&lt;/span> &lt;span class="err">`&lt;/span>&lt;span class="n">the&lt;/span>&lt;span class="o">-&lt;/span>&lt;span class="n">psf&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">pypi&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">downloads2020&lt;/span>&lt;span class="o">*&lt;/span>&lt;span class="err">`&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">WHERE&lt;/span> &lt;span class="n">file&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">project&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="s1">&amp;#39;retriever&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="conda">Conda&lt;/h3>
&lt;ul>
&lt;li>Install the &lt;a href="https://www.anaconda.com/blog/get-python-package-download-statistics-with-condastats" target="_blank" rel="noopener">&lt;code>condastats&lt;/code> package&lt;/a> &lt;code>conda install -c conda-forge condastats&lt;/code>&lt;/li>
&lt;li>From the command line run a version of the following command&lt;/li>
&lt;/ul>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">condastats overall retriever --start_month 2020-01 --end_month 2020-08
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div></description></item><item><title>SSH</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/ssh/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/ssh/</guid><description>&lt;h2 id="what-is-ssh">What is SSH?&lt;/h2>
&lt;p>SSH (short for &amp;ldquo;secure shell&amp;rdquo;) is a computer protocol for encrypting access to computers over a network. Typical use cases are:&lt;/p>
&lt;ul>
&lt;li>accessing HiPerGator and Serenity servers from a laptop or desktop machine, while on the UF network&lt;/li>
&lt;li>cloning, pulling, or pushing to GitHub repositories&lt;/li>
&lt;/ul>
&lt;h2 id="what-software-do-i-need">What software do I need?&lt;/h2>
&lt;p>SSH comes pre-installed on macOS, Windows 10, and Linux.
On other versions of Windows, you might want to download &lt;a href="https://gitforwindows.org/" target="_blank" rel="noopener">Git Bash&lt;/a> which also contains some other useful tools.&lt;/p>
&lt;h2 id="what-is-an-ssh-key">What is an SSH key?&lt;/h2>
&lt;p>Access to servers through SSH involves authenticating yourself with a username and password (the same as if you were logging in directly to the machine).&lt;/p>
&lt;p>Alternatively, you can generate a digital &amp;ldquo;key&amp;rdquo; that provides access to the keyholder. Typically, you generate an SSH key for your computer, and then set up the appropriate file on the server(s) you wish to access. Then, instead of authenticating yourself with your username and password for the server, you use the SSH key, which is verified by the paired file previously copied to the server. (You can also provide a passphrase in order to use your SSH key, but as it&amp;rsquo;s stored on your laptop or desktop and requires you to be logged in, this security step is optional; opinions vary on this.)&lt;/p>
&lt;h2 id="setup-instructions-github">Setup Instructions (GitHub)&lt;/h2>
&lt;p>GitHub has fairly detailed &lt;a href="https://help.github.com/en/articles/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent" target="_blank" rel="noopener">instructions&lt;/a> on creating a new SSH key,
and further &lt;a href="https://help.github.com/en/github/authenticating-to-github/adding-a-new-ssh-key-to-your-github-account" target="_blank" rel="noopener">instructions&lt;/a> to enable it for use on GitHub.&lt;/p>
&lt;h2 id="setup-instructions-serenity--hipergator--generic">Setup Instructions (Serenity / HiPerGator / generic)&lt;/h2>
&lt;p>To setup an SSH key for use on Serenity, we assume you have gone through the steps described in the instructions for GitHub to create an SSH key. This creates both the key itself (usually located in &lt;code>~/.ssh/id_rsa&lt;/code>), and a paired file that verifies that the key is correct (usually located in &lt;code>~/.ssh/id_rsa.pub&lt;/code>).&lt;/p>
&lt;p>&lt;strong>If you are setting up an SSH key on HiPerGator, note that you probably already have one in the default location, which is used for communicating between different HiPerGator nodes. You may only need to follow the &lt;a href="https://help.github.com/en/github/authenticating-to-github/adding-a-new-ssh-key-to-your-github-account" target="_blank" rel="noopener">instructions&lt;/a> to enable its use for GitHub as well.&lt;/strong>&lt;/p>
&lt;ol>
&lt;li>Log in to the server you wish to use the SSH key on.&lt;/li>
&lt;li>Edit the &lt;code>~/.ssh/authorized_keys&lt;/code> file &lt;em>on the server&lt;/em>. It may already have contents, in which case, go to a new, blank line.&lt;/li>
&lt;li>Copy over the contents of the &lt;code>~/.ssh/id_rsa.pub&lt;/code> from your local computer. It will look something like:&lt;/li>
&lt;/ol>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-fallback" data-lang="fallback">&lt;span class="line">&lt;span class="cl">ssh-rsa *************long-string of random characters************* &amp;lt;email address&amp;gt;
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;strong>(If you see something like: &lt;code>-----BEGIN RSA PRIVATE KEY-----&lt;/code>, you are using the wrong file!!)&lt;/strong>&lt;/p>
&lt;ol start="4">
&lt;li>Save the modified &lt;code>~/.ssh/authorized_keys&lt;/code> file on the server.&lt;/li>
&lt;li>Try to connect using ssh. In the command line on your local computer:&lt;/li>
&lt;/ol>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="line">&lt;span class="cl">ssh &amp;lt;username&amp;gt;@&amp;lt;server&amp;gt;
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>You should authenticate without having to enter your password.&lt;/p></description></item><item><title>Workflow Tools - snakemake &amp; targets</title><link>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/workflow-management/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://deploy-preview-83--weecology-wiki.netlify.app/docs/computers-and-programming/workflow-management/</guid><description>&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;p>Workflow tools let you automate the running and rerunning of code with multiple steps.
For example, we use them to manage the image processing workflow for our &lt;a href="https://everglades.weecology.org/" target="_blank" rel="noopener">Everglades research&lt;/a>.&lt;/p>
&lt;h2 id="python---snakemake">Python - snakemake&lt;/h2>
&lt;h3 id="getting-started">Getting Started&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="https://snakemake.readthedocs.io/en/stable/tutorial/tutorial.html" target="_blank" rel="noopener">Official snakemake tutorial&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://farm.cse.ucdavis.edu/~ctbrown/2023-snakemake-book-draft/" target="_blank" rel="noopener">C. Titus Brown&amp;rsquo;s draft book on using snakemake for bioinformatics&lt;/a>&lt;/li>
&lt;/ul>
&lt;h3 id="handling-complex-inputs-with-input-functions">Handling complex inputs with input functions&lt;/h3>
&lt;p>In our workflows we deal with complex input-output structures, like having early phases of the pipeline work on one flight (file) at a time and later phases work on all of the files from a given site and year as a group.&lt;/p>
&lt;p>This can be accomplished by defining custom input functions.&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://www.embl.org/groups/bioinformatics-rome/blog/2022/10/guest-post-snakemake-input-functions-by-tim-booth/" target="_blank" rel="noopener">Introduction to Input Functions&lt;/a>&lt;/li>
&lt;li>See our &lt;a href="https://github.com/weecology/EvergladesTools/blob/main/Zooniverse/Snakefile" target="_blank" rel="noopener">Everglades workflow Snakefile&lt;/a>&lt;/li>
&lt;/ul>
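As a sketch, a hypothetical input function for a rule that aggregates per-flight outputs by site and year might look like the following (the names, paths, and lookup table are illustrative, not taken from our Snakefile):

```python
from types import SimpleNamespace

# Hypothetical lookup from (year, site) to the flights recorded there;
# in a real Snakefile this would be built from glob_wildcards output.
FLIGHTS_BY_GROUP = {
    ("2022", "StartMel"): ["flight01", "flight02"],
}

def site_year_inputs(wildcards):
    """Return every per-flight output belonging to one year-site group."""
    flights = FLIGHTS_BY_GROUP[(wildcards.year, wildcards.site)]
    return [
        f"processed/{wildcards.year}/{wildcards.site}/{flight}.csv"
        for flight in flights
    ]

# snakemake passes the rule's wildcards object; mimic that here.
print(site_year_inputs(SimpleNamespace(year="2022", site="StartMel")))
```

A rule then references the function by name (e.g. `input: site_year_inputs`), and snakemake calls it with the wildcards resolved for each job, so later aggregate phases can depend on all files from a group.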
&lt;h3 id="testing-snakemake-with-partial-wildcards">Testing snakemake with partial wildcards&lt;/h3>
&lt;p>When testing a big workflow it is often useful to run it on a subset of the data.
For example, our Everglades workflow runs on all years, sites, and flights at once, but we might want to test a single year-site combination when making a change.
To do this, replace your Wildcards object with the component lists for the main workflow. E.g.,&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="n">ORTHOMOSAICS&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">glob_wildcards&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;/&lt;/span>&lt;span class="si">{year}&lt;/span>&lt;span class="s2">/&lt;/span>&lt;span class="si">{site}&lt;/span>&lt;span class="s2">/&lt;/span>&lt;span class="si">{flight}&lt;/span>&lt;span class="s2">.tif&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">FLIGHTS&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">ORTHOMOSAICS&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">flight&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">SITES&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">ORTHOMOSAICS&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">site&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">YEARS&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">ORTHOMOSAICS&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">year&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The components are just lists, so you can then replace them with whatever pieces of the full workflow you want to test. E.g.:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="line">&lt;span class="cl">&lt;span class="n">ORTHOMOSAICS&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">glob_wildcards&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;/&lt;/span>&lt;span class="si">{year}&lt;/span>&lt;span class="s2">/&lt;/span>&lt;span class="si">{site}&lt;/span>&lt;span class="s2">/&lt;/span>&lt;span class="si">{flight}&lt;/span>&lt;span class="s2">.tif&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">TEST&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">glob_wildcards&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;/blue/ewhite/everglades/orthomosaics/2022/StartMel/&lt;/span>&lt;span class="si">{flight}&lt;/span>&lt;span class="s2">.tif&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">FLIGHTS&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">TEST&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">flight&lt;/span> &lt;span class="c1"># ORTHOMOSAICS.flight&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">SITES&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;StartMel&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="o">*&lt;/span> &lt;span class="nb">len&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">FLIGHTS&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="c1"># ORTHOMOSAICS.site&lt;/span>
&lt;/span>&lt;/span>&lt;span class="line">&lt;span class="cl">&lt;span class="n">YEARS&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;2022&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="o">*&lt;/span> &lt;span class="nb">len&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">FLIGHTS&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="c1">#ORTHOMOSAICS.year&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="r---targets">R - targets&lt;/h2></description></item></channel></rss>