Welcome to Software Carpentry Etherpad!
Software Carpentry Foundation:
http://software-carpentry.org/
Workshop website:
https://annawilliford.github.io/2016-10-15-UTA/
Vehicle Registration:
https://uta.nupark.com/events/Events/Register/9dc1ecda-c463-4788-badf-330861d04aa9
How to connect :
NetID: evt-sc
Password: Carpentry16
The account has been activated and is ready to use. Patrons using this account may access
the UTA wireless network by connecting to "UTA Auto Login" using the above credentials.
If this does not work, please try connecting to "UTA Web Login" with the network key
UTAsecret. Users will then have to open a web browser to authenticate through the
welcome page with the event account NetID username and password and click Submit.
Check software installation (follow steps below AFTER you complete installation - see workshop website for how to install)
Bash: open gitbash terminal, type `bash --version`. You should get an output indicating the version of bash shell
Text Editor: open terminal, type `npp new.txt` if using notepad++; type `edit new.txt` if using Text Wrangler
Git: open gitbash terminal, type `git --version`. You should get an output indicating the version of git
R: open gitbash terminal, type `R --version`. You should get an output indicating the version of R
RStudio: open application, should see 3 or 4 windows.
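The command-line checks above can also be run in one pass. This is only a minimal sketch, assuming a bash-like shell such as Git Bash; the helper name check_tool is made up for this example and is not part of the workshop material.

```shell
# check_tool is a hypothetical helper: report whether a command is on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: NOT found (see the workshop website for install steps)"
    fi
}

# The three command-line tools checked in the list above.
for tool in bash git R; do
    check_tool "$tool"
done
```

A tool reported as "found" still needs the --version check if you want to know which version you have.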
***************************************************************************************************************
SHELL
Shell Cheatsheet
History of Shell commands:
https://dl.dropboxusercontent.com/u/101820336/2016-10-15-UTA-SWC/ShellHistory.txt
Challenge 1: Navigating
Note where you are in your directory hierarchy. Now aimlessly (randomly) move away from this location at least 3 times (i.e., 3 'cd' commands). Determine where you are and navigate back to your starting location using 1 (!) command. Half the room uses relative paths and half uses absolute paths.
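One possible way through the challenge with an absolute path is sketched below. The directories visited are arbitrary choices for illustration, and the variable name start is an assumption. A relative-path answer depends on where you ended up, so it is not shown.

```shell
start="$(pwd)"            # remember the absolute path of the starting point
cd /                      # wander off: three 'cd' commands
cd "${TMPDIR:-/tmp}"
cd ..
pwd                       # where did we end up?
cd "$start"               # one command, absolute path: back where we began
pwd
```

Note that 'cd -' only returns to the previous directory, so it undoes one move, not three.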
Shell Data Files:
https://dl.dropboxusercontent.com/u/101820336/2016-10-15-UTA-SWC/SWC_Oct2016.zip
Commands you should definitely master!
- whoami # prints username
- pwd # print working directory path
- echo # print text to the terminal
- ls # list the contents of a directory
- cd # change directory
- mkdir # make directory
- touch # create empty file
- cat # view file/concatenate files
- less # controlled view of file
- mv # move/rename file
- cp # copy file
- rm # delete file
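A short practice run that touches most of the commands above. It is a sketch only: it works inside a throwaway directory made with mktemp, and the file and directory names are invented for the example.

```shell
work="$(mktemp -d)"                        # scratch directory so nothing real is touched
cd "$work"
pwd                                        # show where we are
mkdir notes                                # make a directory
touch notes/todo.txt                       # create an empty file
echo "learn the shell" > notes/todo.txt    # echo prints; '>' sends the text to the file
cat notes/todo.txt                         # view the file
cp notes/todo.txt notes/backup.txt         # copy
mv notes/backup.txt notes/old.txt          # rename
ls notes                                   # list the directory contents
rm notes/old.txt                           # delete (there is no recycle bin!)
ls notes
```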
Commands I didn't get to in depth, but that you should also try to master!
- wc # word count
- head/tail # display start/end of file
- cut # extract fields (columns) from file
- sort # sort file
- uniq # select unique lines only (input must be sorted first)
- grep # select rows based on content
- ssh # connect to a remote computer
- scp # transfer files to another computer
- awk # software for manipulating piped data more powerfully in shell
- for loops
- environment variables
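Several of the text tools in this list are most useful chained together with pipes. The sketch below uses a tiny made-up table (country, continent, year); it is not one of the workshop data files.

```shell
# Made-up sample data: country,continent,year (not the workshop files).
data="$(mktemp)"
cat > "$data" <<'EOF'
Albania,Europe,2007
Brazil,Americas,2007
Chile,Americas,2007
France,Europe,2007
Kenya,Africa,2007
EOF

wc -l < "$data"                 # count lines
head -n 2 "$data"               # first two lines
grep "Europe" "$data"           # rows that mention Europe
# take column 2, sort it, then collapse to unique values with counts
continent_counts="$(cut -d, -f2 "$data" | sort | uniq -c)"
echo "$continent_counts"
```

The sort before uniq matters: uniq only merges lines that are next to each other.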
Extra Challenge we didn't get to:
Challenge: Think of a couple questions you could answer using the data we provided you (see above). Answer one using the 'ByMeasure' data and the other using the 'ByCountry' data. You will write a script that loops through these files and extracts the answer from each, writing it to one output file. Repeat and master these tools!
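The general shape of such a script could look like the sketch below. The file names, columns and the question asked (which row has the largest value?) are all hypothetical stand-ins, since the real 'ByMeasure' and 'ByCountry' files may be laid out differently.

```shell
# Hypothetical stand-ins for the workshop files; real names and columns may differ.
dir="$(mktemp -d)"
printf 'country,value\nAlbania,5\nBrazil,9\n' > "$dir/measureA.csv"
printf 'country,value\nAlbania,7\nBrazil,3\n' > "$dir/measureB.csv"

out="$dir/answers.txt"
: > "$out"                                  # start with an empty output file
for f in "$dir"/measure*.csv; do
    # skip the header, sort by column 2 (numeric, descending), keep the top row
    top="$(tail -n +2 "$f" | sort -t, -k2,2nr | head -n 1)"
    echo "$(basename "$f"): $top" >> "$out" # append one answer per input file
done
cat "$out"
```

Using '>>' inside the loop appends each answer, so all results end up in the one output file.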
GIT/GITHUB
Linux nano: $ git config --global core.editor "nano -w"
Linux Gedit: $ git config --global core.editor "gedit -s"
Mac Text Wrangler: $ git config --global core.editor "edit -w"
Windows Notepad++: $ git config --global core.editor "'c:/program files (x86)/Notepad++/notepad++.exe' -multiInst -notabbar -nosession -noPlugin"
git config --global user.name "Gaurav Kolekar"
git config --global user.email "gaurav.kolekar01@gmail.com"
git config color.ui "auto"
git config --global color.ui "auto"
git config --list
Link to the md file:
https://www.dropbox.com/s/wjf24dxhtl3wjrk/git.md?dl=0
## CHALLENGE
Create another file called abc.txt and push it to Github.
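One way the challenge could go is sketched below. To keep it runnable anywhere, a local bare repository stands in for GitHub and a placeholder identity is set for the sandbox repository only; both are assumptions of this sketch. In the workshop repository the remote already points at GitHub and the push was 'git push origin master'.

```shell
# A local bare repository stands in for GitHub here (an assumption for the sketch).
sandbox="$(mktemp -d)"
git init --quiet --bare "$sandbox/remote.git"
git init --quiet "$sandbox/swc"
cd "$sandbox/swc"
git config user.name "Example User"            # placeholder identity, this repo only
git config user.email "user@example.com"
git remote add origin "$sandbox/remote.git"    # on GitHub this would be an https URL

touch abc.txt                                  # create the file
git add abc.txt                                # stage it
git commit --quiet -m "abc.txt added"          # record it
git push --quiet origin HEAD                   # publish the current branch
git log --oneline
```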
## All the commands from the first session
mkdir swc
clear
ls
cd swc/
git init
ls
ls -la
mkdir data images code
ls
ls
cd ..
cd ..
ls
cd Repos/swc/
clear
cp ../../git.md ./
ls
git status
git add git.md
git status
clear
git commit -m "git markdown added"
git push origin master
notepad++ README.md
git status
clear
git add README.md
git status
git commit -m "README.md added"
git remote add https://github.com/gauravkolekar/swc.git   # fails: the remote name is missing, corrected below
clear
git remote add origin https://github.com/gauravkolekar/swc.git
git push origin master
touch password.txt
ls
git status
clear
notepad++ .gitignore
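The last three commands set up an ignore rule for password.txt. A minimal sketch of what goes into .gitignore and how to check it, run in a throwaway repository (the repo variable and the temporary location are assumptions of the example):

```shell
repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"
touch password.txt                  # a file that must never be committed
git status --short                  # listed as untracked: ?? password.txt
echo "password.txt" > .gitignore    # one pattern per line
git status --short                  # now only .gitignore is listed
git check-ignore password.txt       # prints the path if it is ignored
```

The .gitignore file itself should be added and committed so that everyone who clones the repository gets the same rule.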
Jekyll-Now
https://github.com/barryclark/jekyll-now
Rename the file to the form year-month-day-title.extension (for example, 2016-10-15-my-first-post.md)
Copy the file to the _posts directory
---
layout: post
title: Blogging Like a Hacker
---
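The naming and front-matter steps above can be done from the shell. This is a sketch under assumptions: the site variable is a temporary stand-in for your own Jekyll-Now repository, and the title slug is just an example.

```shell
site="$(mktemp -d)"                  # stand-in for your Jekyll-Now repository
mkdir -p "$site/_posts"
title="blogging-like-a-hacker"       # example slug, pick your own
post="$site/_posts/$(date +%Y-%m-%d)-$title.md"

# front matter: the block between the two '---' lines
cat > "$post" <<'EOF'
---
layout: post
title: Blogging Like a Hacker
---
Post text goes here.
EOF
ls "$site/_posts"
```

After that, the usual git add, git commit and git push publish the post.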
R/RStudio
Morning: R basics and scripts
R data:
curl -O https://annawilliford.github.io/2016-01-30-UTA/data/gapminderData.csv
to download directly from R:
system("curl -O https://annawilliford.github.io/2016-01-30-UTA/data/gapminderData.csv")
or install and use the R curl package: https://cran.r-project.org/web/packages/curl/vignettes/intro.html
Clearing the R console with a command, rather than Ctrl-L, is actually not really intuitive. One way that works in RStudio is cat("\014"), which sends the same form-feed character that Ctrl-L does.
'Data type' is a fairly ambiguous term, so it is good to understand what it can mean. It is also important to understand these basic concepts in general.
- It can refer to the fundamental data types (numeric, string, etc.). These are the most basic ways a single piece of data is stored, and these terms are used across essentially all programming languages. See http://www.r-tutor.com/r-introduction/basic-data-types
- It can also refer to the basic ways that data of any type (numeric, string, etc.) are stored together for use. This is a distinct meaning and includes things like vectors, lists, matrices, etc. These structures are used in most programming languages and apply outside of R. See https://www.tutorialspoint.com/r/r_data_types.htm
0- vs. 1-indexing
- 0-indexing (includes Python), the first element of a vector, list, etc. is indexed as 0, and last element is n-1
- 1-indexing (includes R), the first element of a vector, list, etc. is indexed as 1, and last element is n
- Here is a more graphical way of looking at this. In this application, it is referring to nucleotide strings but the principles are the same. https://www.biostars.org/p/84686/
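Both conventions show up even inside the shell, which makes for a quick comparison. A small sketch, assuming bash (arrays are a bash feature, not plain sh); the array contents and the nucleotide string are made up for the example.

```shell
# bash arrays are 0-indexed (like Python)
menu=(chicken soup salad tea)
echo "${menu[0]}"              # first element: chicken
echo "${menu[3]}"              # last element, index n-1 = 3: tea

# cut counts characters from 1 (like R)
echo "ACGT" | cut -c1          # first character: A
echo "ACGT" | cut -c4          # last character, position n = 4: T
```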
menuItems<-c("chicken", "soup", "salad", "tea")
menuType<-factor(c("solid", "liquid", "solid", "liquid"))
menuCost<-c(4.99, 2.99, 3.29, 1.89)
myOrder<-list(menuItems, menuType, menuCost)
myOrder_DF<-data.frame(menuItems, menuType, menuCost)
You can subset data frames or matrices by specifying the rows or columns that you want in brackets (this needs the data frame myOrder_DF; the list myOrder has no rows and columns):
myOrder_DF[c(1,3), ] # This gives rows 1 and 3
myOrder_DF[, c(1,3)] # This gives columns 1 and 3
Inside the brackets it is always [rows, columns]
# My First R Script
# Location of file
filename <- "gapminderData.csv"
# read in data file
gapminder <- read.csv(filename)
# View data
View(gapminder)
# Select the rows of the country Albania
albaniaData <- gapminder[gapminder$country=="Albania", ]
# GDP per cap of Albania
albaniaGDP <- albaniaData$gdpPercap
Lesson 1: https://www.dropbox.com/s/3ymvdg0fvbacxje/Sunday_R1.R?dl=0
First R Script: https://www.dropbox.com/s/2su6gcq4suovqkp/First_Script.R?dl=0
Afternoon: Plotting and data analysis!!!!
CRAN task views: https://cran.r-project.org/web/views/
1) start code with a header
2) run import statements directly after the header
3) setwd() breaks stuff (the hard-coded path will not exist on anyone else's computer)
4) use # to section off code
5) if you have functions, put them up at the top; don't bury them in your code
6) be consistent
7) if you have a script >100 lines you're doing it wrong
8) have a directory structure that all projects share
9) peer review
10) use git and version control for your code
data("faithful")
head(faithful)
foo <- faithful[ ,2]
mode(foo)
help(mode)
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}
Mode(foo)
hist(foo)
dat <- read.csv(url("http://coleoguy.github.io/SWC/scores.csv"))
https://dl.dropboxusercontent.com/u/56656258/test.scores.R
ColorBrewer:
http://colorbrewer2.org/#type=qualitative&scheme=Paired&n=3
Heath's website:
http://coleoguy.github.io/SWC/index.html