Establishing your online presence with Git and GitHub Pages

Motivation

It is becoming ever more important to establish a web presence, even as an undergraduate student. Facebook, Twitter, and the like are methods for doing this, but many would agree that it is much more professional to give people a snazzy website. Historically, people gravitated towards large website-building services or had to learn HTML and other languages to build their own websites. Then there was the arduous task of finding, or paying for, someone to host your website. Fortunately, the increased expectation of creating and maintaining a website has been accompanied by increased ease in doing so, due in part to cloud-based hosting solutions and intuitive Markdown-based methods of building content.

GitHub has emerged as one of the best options for creating an awesome-looking website easily and cheaply (even free!). As the name implies, GitHub was founded simply as an online hub, or hosting service, for Git repositories. Git, put simply, is a flexible and powerful version control system: it manages and archives files along with their histories. It was built with complex project management in mind, in an effort to allow project developers to keep track of file identities and versions and to navigate seamlessly through these complex temporal and hierarchical file-system structures. If this isn’t making sense, the comic below more-or-less outlines what Git allows users to avoid.

[Comic: not_final.doc]

Overview of Lesson

Learners will complete the following broad tasks:

  1. Fork a website template repository on GitHub and use Git locally to clone this repository onto their computer.
  2. Make the changes needed to set up the website and then add content, using Markdown, while learning the basics of Git.
  3. Push local changes back to the remote repository on GitHub, which then automatically builds and displays the website.
  4. Populate their website with all materials and overviews of the lessons for the Software Carpentry workshop they participated in, so that they can reference this information later.

Forking and Cloning a Repository

GitHub Pages, GitHub’s service that builds and hosts websites, utilizes several fairly complex programming languages, thus saving the user the trouble of knowing them in great depth. One simply has to provide a properly formatted repository and GitHub does the rest. We will take advantage of this, plus open source website templates, to create personal websites. For this lesson, learners will be using the Jekyll Now template, but there are many options out there.

This template is contained in a repository owned by Daren Card, which is modified slightly from the original created by Barry Clark. Because the repository belongs to someone else, we cannot edit it directly. Instead, we will create our own copy and modify it to our liking. The copying takes place on GitHub using a process called ‘forking’, and the editing takes place in a simple text editor on the user’s local computer. Visit the modified Jekyll Now template repository from Daren and click the Fork button at the top-right. If you are already affiliated with another organization on GitHub, you may be prompted to select an account; choose your personal GitHub account.

We’ll make some pretty basic changes online to the repository settings. Under the top tabs is a brief description of the repository that we inherited as part of this template. Click to edit and give your repository a new, brief description. Also include the URL we will use to access your forthcoming website, which should be <username>.github.io, where <username> reflects your GitHub username.

From here, we will work further on our repository on our local computer. Before doing so, we must set up Git for the first time on our machine.

# Set the name Git will call us by
git config --global user.name "First Last"
# Set the email address to use; it must match the one used to create your GitHub account
git config --global user.email "email@domain.com"
# Add some useful colors to output
git config --global color.ui "auto"
# Specify the text editor to be used when committing. Choose the appropriate option:
# nano: git config --global core.editor "nano -w"
# text wrangler: git config --global core.editor "edit -w"
# notepad++: git config --global core.editor "'c:/program files (x86)/Notepad++/notepad++.exe' -multiInst -notabbar -nosession -noPlugin"
# kate: git config --global core.editor "kate"
git config --global core.editor "<choice>"
# now take a look at these settings (and more)
git config --list
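
To confirm a single setting rather than scanning the whole list, `git config --get` can be used. The following is a minimal sketch, assuming a throwaway repository created with `mktemp` (the `demo_dir` name is hypothetical) and repository-level settings, so that your real global configuration is left untouched.

```shell
# Work in a throwaway repository so nothing global is changed
# (demo_dir is a hypothetical name used only for this illustration)
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q

# Without --global, these settings apply to this repository only
git config user.name "First Last"
git config user.email "email@domain.com"

# Read back individual settings instead of listing everything
name_setting="$(git config --get user.name)"
email_setting="$(git config --get user.email)"
echo "$name_setting <$email_setting>"
```

Repository-level settings like these take precedence over the `--global` ones, which can be handy if one project needs a different email address.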

To work locally, we must download the repository from GitHub, which is called ‘cloning’. We could download it as a zipped file, but let’s instead learn our first bit of Git. Along the top of your files list you should see a URL. Be sure that the box to the left reads ‘HTTPS’ and not ‘SSH’. The latter requires some further steps, which we will leave for you to do later using instructions on setting up SSH keys. Copy the HTTPS link and type the following commands in your Terminal.

# Create a 'Repos' directory in your $HOME folder
mkdir -p ~/Repos
# Change into your new directory
cd ~/Repos
# Clone your website template repository from GitHub
git clone https://github.com/<username>/<username>.github.io.git

You should now see a local copy of your repository in the working directory.
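
Cloning also records where the copy came from, under the remote name ‘origin’. The sketch below illustrates this without a network connection by cloning an empty local repository that stands in for GitHub; the `work_dir`, `remote.git`, and `website` names are assumptions made purely for illustration.

```shell
# A scratch area; remote.git is a local stand-in for the repository on GitHub
work_dir="$(mktemp -d)"
cd "$work_dir"
git init -q --bare remote.git

# Clone it, just as we would clone from an HTTPS URL
# (Git warns that the clone is empty, which is expected here)
git clone -q remote.git website
cd website

# The clone remembers its source under the remote name 'origin'
remote_name="$(git remote)"
origin_url="$(git config --get remote.origin.url)"
echo "$remote_name -> $origin_url"
```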

Exploring Git

We should now spend some time explaining Git in more detail.

# Change into your repository directory and view its contents
cd <username>.github.io
ls

You should see the same exact list of files that was observed on GitHub. So what makes this a Git repository? It appears to be like any other directory. Let’s explore that question.

# List the full contents of the directory, including hidden files, as a single column
ls -1a
.
..
.git
.gitignore
404.md

You should now notice several entries with leading dots, indicating they are hidden. The first and second represent the working and upper-level directories, respectively. The next one, ‘.git’, is the key to a Git repository: it is a directory containing the information that Git (and GitHub, which is essentially Git running on a server, with some fancy add-ons) uses to manage this directory as a repository.

We’ve copied this directory, so let’s take a quick tangent and explain how one would initialize a Git repository in any directory.

# Change directory to the upper-level (notice the use of the ..)
cd ..
# Make a new directory called whatever you want (I'm using 'biology') and move into it
mkdir biology
cd biology
# Create some empty files/directories to make this appear like a normal directory
mkdir genetics ecology microbiology
touch zoology botany

So now we have essentially the same thing as our copied website template, but without the ‘fancy’ Git hidden files. Let’s create those now.

# Initialize your directory as a Git repository
git init
# View the directory contents
ls -1a
.
..
.git
botany
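
Another way to check whether a directory is a Git repository, without listing hidden files, is to ask Git itself. This is a minimal sketch, assuming a scratch copy of the ‘biology’ directory created under a temporary location (`scratch_dir` is a hypothetical name).

```shell
# Recreate the example directory in a scratch location
scratch_dir="$(mktemp -d)"
mkdir "$scratch_dir/biology"
cd "$scratch_dir/biology"
touch zoology botany

# Turn the directory into a repository
git init -q

# Ask Git where the repository data lives and whether we are inside a working tree
git_dir="$(git rev-parse --git-dir)"
inside_tree="$(git rev-parse --is-inside-work-tree)"
echo "$git_dir $inside_tree"
```

Run from the top level of a repository, the first command reports the `.git` directory and the second reports `true`; in a directory that is not a repository, both fail with an error.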

So we now have the hidden .git directory, but what about that .gitignore file? The .gitignore file simply lists the files and/or directories that you want Git to ignore when it tracks the files in your repository. Therefore, if we find plants boring (because they are!), we could have Git ignore the ‘botany’ file. We do this by simply adding the names of the files/directories we want to ignore to .gitignore.

# One way of adding 'botany' to our gitignore, sans text editor
echo 'botany' > .gitignore
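
To check that the ignore rule is doing its job, one option is to compare what `git status` reports with what `git check-ignore` says about each file. The sketch below assumes a scratch copy of the ‘biology’ directory (`scratch_dir` is a hypothetical name), so it can be run without touching your real files.

```shell
# Recreate the example directory in a scratch location and initialize it
scratch_dir="$(mktemp -d)"
mkdir "$scratch_dir/biology"
cd "$scratch_dir/biology"
touch zoology botany
git init -q

# Tell Git to ignore 'botany'
echo 'botany' > .gitignore

# Untracked files are listed with '??'; 'botany' should be absent from the list
status_output="$(git status --short)"
echo "$status_output"

# check-ignore exits successfully only for paths that match an ignore rule
if git check-ignore -q botany; then botany_ignored="yes"; else botany_ignored="no"; fi
if git check-ignore -q zoology; then zoology_ignored="yes"; else zoology_ignored="no"; fi
echo "botany ignored: $botany_ignored, zoology ignored: $zoology_ignored"
```

Note that `>` overwrites any existing .gitignore; `>>` would append a new rule to the end of the file instead.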

That gives some basic Git information, and we’ll explore more with real files inside of our website repository.

Setting Up Your Website

First change back into your website repository directory (for example, with cd ../<username>.github.io). We’ll begin by configuring a couple of files that establish the overall website design and that tell our visitors some basic information about ourselves. Let’s first edit the _config.yml file to set some overall design parameters. Open this file in a text editor and fill in some basic information.

name: <name>
description: <name>'s website
url: <username>.github.io

Let’s also replace that stock photo with something more personal. Go online and find a picture of yourself, or of something that represents you (and that you have permission to use). In many cases, you can right-click and select “Copy Image URL” or something similar. If you’re lazy, you can just use your GitHub image. Set this URL as your avatar.

avatar: http://domain.extension/images/picture.jpg

Now let’s edit the ‘About’ page on our website and provide any visitors with basic information about yourself. When you open the file in a text editor, you’ll see the following header. It simply specifies information about the page design, title, and relative link.

---
layout: page
title: About
permalink: /about/
---

Next you can modify the text accordingly to tell people about yourself. Notice the pound signs at the beginning of a couple of lines and how they translate to headings in the text rendered on the website. The syntax is called Markdown, and it allows us to take simple text and add some flair without a bunch of coding knowledge. This lesson overview is written in Markdown as well. You can use the brief guide at GitHub to get a feel for Markdown.
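
As a rough illustration of the pound-sign syntax, the sketch below writes a small page from the shell. The headings and sentences are hypothetical placeholders rather than the template's actual ‘About’ page, and the file is written to a scratch directory (`scratch_dir` is an assumed name) so your real about.md is not affected.

```shell
# Write a hypothetical page to a scratch directory (not your real about.md)
scratch_dir="$(mktemp -d)"
cat > "$scratch_dir/about.md" <<'EOF'
---
layout: page
title: About
permalink: /about/
---

# About me

I am a student interested in biology.

## Contact

You can reach me by email.
EOF

# Count the lines that Markdown will render as headings (lines starting with '#')
heading_count="$(grep -c '^#' "$scratch_dir/about.md")"
echo "$heading_count"
```

A single pound sign marks a top-level heading and two pound signs mark a second-level heading; ordinary lines are rendered as paragraphs.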

Integrating Our Changes and Visualizing the Results

Now we want to see what our changes have done to the website. This will introduce the most commonly used Git commands. Let’s start by viewing our Git status.

git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)

modified:   _config.yml
modified:   about.md

no changes added to commit (use "git add" and/or "git commit -a")

The output provides us with some useful information, like the branch name (something we won’t really discuss today). It also lists our two modified files and tells us that neither has been staged for commit. Let’s progress by both adding and committing the changes we’ve made. You should also view the Git status after each step.

# Hopefully you are noticing the intuitive verbs that Git uses
# This command adds (stages) your changed files for the next commit
git add _config.yml about.md
# This command commits your staged changes and records a useful message about what they are
git commit -m "added some basic website info"
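
To see how the status changes after each step, the cycle can be rehearsed in a scratch repository. This is a minimal sketch under stated assumptions: `demo_dir`, the placeholder file contents, and the repository-level name and email are illustrative only, and `git status --short` is used as a compact alternative to the full status output.

```shell
# Scratch repository with one committed file (names and contents are placeholders)
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git config user.name "First Last"
git config user.email "email@domain.com"
echo "name: placeholder" > _config.yml
git add _config.yml
git commit -q -m "initial commit"

# Edit the file: the change exists but is not yet staged
echo "name: First Last" > _config.yml
status_modified="$(git status --short)"

# Stage the change
git add _config.yml
status_staged="$(git status --short)"

# Commit: nothing is left to report
git commit -q -m "added some basic website info"
status_clean="$(git status --short)"

printf '%s\n' "modified: [$status_modified]" "staged: [$status_staged]" "clean: [$status_clean]"
```

In the short format, an `M` in the second column marks a file that is modified but not staged, an `M` in the first column marks a staged change, and an empty result means there is nothing to commit.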

You can also view your Git log and see info on the changes you’ve just made, including the author, the date, and a computer-generated label for the commit (that long alphanumeric string).

git log
commit a69c34a76c0fa85b7ffc86237fb34c8e0f6ae4c3
Author: darencard <dcard@uta.edu>
Date:   Fri Jan 22 22:43:55 2016 -0600

    added some basic website info

The steps you just took made your first Git commit, so Git has recorded these initial changes. Let’s make some more quick changes to your _config.yml. You’ll notice that there are fields like ‘email’ and ‘github’ (and others) where you can include your email address and GitHub handle for people to interact with you. Fill one or two in.

Another key attribute of Git is that it allows you to compare versions of files and note differences. Let’s compare our newly edited _config.yml file with the version in our last commit.

git diff
diff --git a/_config.yml b/_config.yml
index b89b129..5a4572f 100644
--- a/_config.yml
+++ b/_config.yml
@@ -21,12 +21,12 @@ footer-links:
   email:
   facebook:
   flickr:
-  github:
+  github: darencard
   instagram:
   linkedin:
   pinterest:
   rss: # just type anything here for a working RSS icon
-  twitter:
+  twitter: darencard
   stackoverflow: # your stackoverflow profile, e.g. "users/50476/bart-kiers"
   youtube: # channel/<your_long_string> or user/<user-name>
   googleplus: # anything in your profile username that comes after plus.google.com/

As you can see, Git gives you pretty intuitive information on the files being compared (plus abbreviated labels) and the specific changes made to the file. Now go ahead and practice adding and committing your new changes. If you view your log again you’ll see this second commit.

Now let’s say that you made a mistake in editing the _config.yml file. In this context, it would be very easy to open the file back up and make a couple of quick edits. However, what if this file were much more complex, like a long shell script, or what if you went through a series of edits, still had an error, and just wanted to revert to the last working version you had? Git allows you to do this.

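# Restore _config.yml as it was one commit before the latest
git checkout HEAD~1 _config.yml
# Restore _config.yml as it was two commits before the latest (or beyond)
git checkout HEAD~2 _config.yml
# Restore _config.yml from a specific commit using the full commit label
# (without the file name, Git would instead move you off your branch to that commit)
git checkout a69c34a76c0fa85b7ffc86237fb34c8e0f6ae4c3 _config.yml
# Return _config.yml to its most recently committed state
git checkout HEAD _config.yml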
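
Restoring a file from an earlier commit can be tried safely in a scratch repository before doing it on real work. The sketch below is one possible rehearsal; `demo_dir`, the file contents, and the commit messages are hypothetical, and the `--` separator is an optional addition that makes explicit that what follows is a file name.

```shell
# Scratch repository with two committed versions of one file
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git config user.name "First Last"
git config user.email "email@domain.com"

echo "twitter:" > _config.yml
git add _config.yml
git commit -q -m "first version"

echo "twitter: darencard" > _config.yml
git add _config.yml
git commit -q -m "second version"

# Bring back the file as it was one commit before the latest
git checkout HEAD~1 -- _config.yml
restored_content="$(cat _config.yml)"
echo "restored: $restored_content"

# Return the file to its most recently committed state
git checkout HEAD -- _config.yml
latest_content="$(cat _config.yml)"
echo "latest: $latest_content"
```

Only the named file changes; the commit history itself is untouched, so the newer version is never lost.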

You can use similar relative or absolute commit syntax to compare committed files using a command we’ve already seen.

# Compare two previous commits
git diff HEAD~1 HEAD~2 _config.yml
# Compare our newly edited, uncommitted files to an exact commit by label
git diff a69c34a76c0fa85b7ffc86237fb34c8e0f6ae4c3
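
Both kinds of comparison, uncommitted edits against the last commit and one commit against another, can be rehearsed in a scratch repository. This is a sketch under stated assumptions: `demo_dir`, the file contents, and the commit messages are placeholders chosen for illustration.

```shell
# Scratch repository with one committed version of a file
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git config user.name "First Last"
git config user.email "email@domain.com"

echo "github:" > _config.yml
git add _config.yml
git commit -q -m "first version"

# Edit the file and compare the uncommitted change with the last commit
echo "github: darencard" > _config.yml
uncommitted_diff="$(git diff -- _config.yml)"

# Commit the edit, then compare the two commits with each other
git add _config.yml
git commit -q -m "second version"
committed_diff="$(git diff HEAD~1 HEAD -- _config.yml)"

echo "$committed_diff"
```

Listing the older commit first shows the changes in the order they were made, so removed lines start with `-` and added lines start with `+`.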

Integrating Our Changes with Our Website

These previous steps demonstrate the basics of Git, which can be used locally to keep track of changes to files without having to make separate file copies for each change. Let’s take this full circle and reintegrate our changes remotely, on GitHub’s servers, and thus customize our website.

# Let's make sure we are on our 'master' branch (and not sitting on an older commit)
git checkout master
# Push our changes to GitHub where it can be integrated into our website
# 'origin' refers to the remote server and 'master' to the branch
git push origin master
Counting objects: 7, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (7/7), done.
Writing objects: 100% (7/7), 585 bytes | 0 bytes/s, done.
Total 7 (delta 5), reused 0 (delta 0)
To https://github.com/darencard/darencard.github.io
0ed1225..84634d5  master -> master
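
The push step can also be rehearsed offline. In the sketch below, a local bare repository stands in for GitHub; `work_dir`, `remote.git`, `website`, and the file contents are assumptions for illustration, and the branch is explicitly named ‘master’ because newer Git installations may pick a different default name.

```shell
# A scratch area; remote.git is a local stand-in for the repository on GitHub
work_dir="$(mktemp -d)"
cd "$work_dir"
git init -q --bare remote.git
git clone -q remote.git website
cd website
git config user.name "First Last"
git config user.email "email@domain.com"

# Make one commit locally
echo "name: First Last" > _config.yml
git add _config.yml
git commit -q -m "added some basic website info"

# Make sure the branch is called 'master', as in this lesson
git branch -M master

# Push the commit to the stand-in remote
git push -q origin master

# The remote's 'master' branch should now point at the same commit as our local HEAD
local_commit="$(git rev-parse HEAD)"
remote_commit="$(git --git-dir=../remote.git rev-parse master)"
remote_log="$(git --git-dir=../remote.git log --oneline master)"
echo "$remote_log"
```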

Now if we refresh our GitHub repository page we should see the changes, including the commit messages. You can click on these files and view their contents, and you can even compare versions within GitHub, just like we’ve already done locally. Now view your website using your URL and you’ll see the changes reflected (sometimes GitHub takes a few minutes to render changes).

Populating Your Website with Workshop Content

Part of the goal of this exercise, especially during a workshop, is to provide the learner with a compact package containing everything they need to go through the workshop again or to refresh their memory. Distributing this package in the form of a website repository not only provides all the files the user needs, but also allows the user to view the lessons from anywhere at any time through a static website. If you look at the repository you originally forked, you’ll see it contains the raw data that was used for the other portions of the Software Carpentry workshop you attended. It also contains the Markdown files with the lesson overviews (which you are currently reading). Unfortunately, the repository does not contain the output files generated as part of the workshop, which may be beneficial to have if one wants to make comparisons with later work.

Let’s begin by creating a .zip archive containing the output from the other portions of the workshop.

# Copy all output into a new directory 'output_data'
# It would probably be best to organize files into subdirectories for Linux, Python, SQL
# Zip the output_data directory
zip -r output_data.zip output_data
# Move output_data.zip into the 'data' directory of your website repository

Now we can add, commit, and push these changes.

git add *
git commit -m "Added output zip"
git push origin master

Now if we look at our website, we can view all of the lesson writeups, complete with both the raw and output data.

Using a Pull Request to Help a Friend

Let’s say that you have a friend who also took the Software Carpentry workshop, but who had to leave a little bit early. Therefore, he or she didn’t get a chance to properly add the output data to their website repository and has since deleted it. They ask you to help them out. Normally, one may just email the file over, but GitHub offers an alternative method of getting this data to your friend, with the added benefit of placing it directly into the correct location within your friend’s repository. This method is called a Pull Request.

To perform a Pull Request, you must first navigate to your friend’s website repository. Click the ‘New pull request’ button above the file list.

On the next page you will see a few pieces of important information. The bar along the top displays the forks and branches that are to be merged. The ‘base fork’ should be your friend’s repository, where you are trying to send your updates. The ‘base’ refers to the branch on that repository you are merging into, which will almost always be ‘master’. The ‘head fork’ is your updated repository with the changes you are trying to send to your friend. In this case, you are comparing your ‘master’ branch with your friend’s repository. Below you will see information on the number of updated commits and files, which should match your expectations. When working with text files, like scripts, you should also see an intuitive graphic showing the change, with subtractions indicated in red with a ‘-’ sign and additions indicated in green with a ‘+’ sign. GitHub makes some checks to be sure that you can merge these forks together (see “Able to merge” in the above bar), but you should use this page to make sure you are happy with the changes you are sending to your friend’s repository. Once satisfied, you can press the ‘Create pull request’ button near the top of the page.

On the next page you will see a text box with a subject line, much like an email message. This is what you use to provide information on the Pull Request you are creating, so that the other user can easily grasp the basics of what you are contributing. In the subject/header line you should write a brief, but informative, phrase about the changes. If you would like to add more information, you can use the large text block below to provide more details. In this text block, it is possible to use Markdown to format your information, and you can use the preview button to see what this will look like. Take a minute to create an informative message and comment to your friend and click ‘Create pull request’ near the bottom of the page.

Finally, you will see a confirmation/summary page about the Pull Request you just completed. GitHub again gives you the option of leaving a comment to help others understand your changes.

Before your changes can be incorporated, your friend must actually ‘Merge’ your Pull Request. He or she must log into his or her account and should see a notification that a Pull Request is pending. This will take your friend to a page like that below, where he or she can view the changes you proposed.

The ‘Conversation’ tab contains the description of the pull request you gave when you created it, and users can use this space to converse about potential changes over a series of messages, if needed. The ‘Commits’ and ‘Files changed’ tabs can be used to view the commits made and the actual file changes, using an environment that is similar to what you interacted with while creating your Pull Request. When your friend is satisfied that he or she wants to accept your Pull Request, he or she can click ‘Merge pull request’. If the changes aren’t appropriate after some conversation, he or she can instead click ‘Close pull request’. Upon accepting a pull request, your changed file(s) will be incorporated into your friend’s repository, and his or her website should update to reflect the changes. Basically, your friend’s website will now contain an active link to the output data that he or she was missing.

Additional Resources