Version Control
Writing code is a complex and continuous process, so your code will change over time. To keep track of these changes, you need a tool to manage the versions of your code. This is where version control systems come into play. A version control system records the changes that you make to your code and lets you revert to a previous state if something goes wrong. It also allows you to collaborate with other developers: several people can work on the same codebase at once, and the tool helps you combine their changes and resolve any conflicts. You can also see who changed what in the codebase, which is very useful when you are working in a team.
There are many version control systems in the industry. Some of them are centralized and some of them are distributed. Centralized version control systems have a central server that stores the codebase and its history. Developers check out the code from the server and send their changes back to it. However, this approach has some drawbacks. For instance, if the server goes down, no one can fetch the codebase or record new changes. If the server is slow, developers have to wait for it to respond. And if the server is compromised or its data is lost, the only complete copy of the history is affected.
Distributed version control systems are more popular in the industry. In a distributed version control system, every developer has a full copy of the codebase, including its history, on their local machine. You can therefore work offline, commit your changes to your local repository, and push them to the remote repository when you are online. Most operations are faster than in a centralized system because they run locally, and the approach is more reliable because every local copy also serves as a backup of the codebase.
Git
Git is a version control system that is widely used in the industry, and this book focuses on it because of its popularity. Git is a distributed version control system, which means that every developer has a copy of the repository on their local machine, can commit to it while offline, and can push changes to the remote repository when online.
Everything in git starts with a repository. A repository is a place where you store all of the changes and versions of your project. A repository can be located on your local machine and/or on a remote server. You can create a repository by running the git init command in your project directory. This command makes the current directory a git repository. When you run it, you will notice that it makes no changes other than creating a .git folder in your project directory. Git stores everything in this .git folder. It is a hidden folder, and you should not modify it directly unless you really know what you are doing.
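As a small sketch of what git init leaves behind, the commands below initialize a repository and look inside it. The throwaway directory created with mktemp is an assumption made for illustration; any empty directory behaves the same way.

```shell
# Work in a throwaway directory so that nothing existing is touched
cd "$(mktemp -d)"

# Turn the directory into a git repository
git init

# The only new entry is the hidden .git folder
ls -a

# Ask git where it keeps the repository data (prints ".git" here)
git rev-parse --git-dir
```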
After creating a repository, you can start tracking the files in your project directory. The directory that you use for your project is called the working directory in git. The working directory is separate from the repository. Git itself may create, modify, or delete files in the working directory according to the commands that you run, and some commands also affect the repository. Therefore, we have two different types of data storage behind the scenes: the working directory and the repository.
The synchronization between the working directory and the repository is one of the main concepts in git. When you make changes in your working directory, you should stage them before committing. Staging is the process of preparing your changes to be committed. A commit is a snapshot of your project that records the staged changes. It is like a save point in a game. We can add a file to the staging area by running the git add <file> command. After collecting all the changes that we want to commit, we can run the git commit -m "Commit message" command to commit them. The -m flag attaches a message to the commit. This message should describe what you did in this commit, so it is very important to write meaningful commit messages.
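To illustrate why staging is a separate step, here is a minimal sketch in which two files are created but only one of them is staged and committed. The file names, the temporary directory, and the placeholder identity set with git config are assumptions for the example, not part of the project above.

```shell
cd "$(mktemp -d)"
git init
# Placeholder identity (hypothetical) so that the commit works on a fresh machine
git config user.name "Example User"
git config user.email "user@example.com"

# Create two files but stage only one of them
echo "notes" > notes.txt
echo "draft" > draft.txt
git add notes.txt

# List what is staged for the next commit (only notes.txt)
git diff --staged --name-only

# The commit records notes.txt; draft.txt stays untracked
git commit -m "Added a notes file."
git status --short
```

The last command should still list draft.txt as untracked, because it was never added to the staging area.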
We can see the current status of the repository and the changes that we made by running the git status command. This command shows us which files are modified, staged, or not staged. We can also see the history of the repository by running the git log command.
# Create a directory
mkdir git-done-right
cd git-done-right
# Initialize a git repository
git init
# Create a file (you may also create your files first and run the init command afterwards)
echo "Hello, World!" > hello.txt
# Add the file to the staging area (we want to include all the changes in this file in the next commit)
git add hello.txt
# Let's check the status of the repository
git status
# Commit the changes
git commit -m "Initial commit."
# Check the status again
git status
# Make some changes (append an empty line, then a new sentence)
echo "" >> hello.txt
echo "How are you?" >> hello.txt
# Check the status again
git status
# Add the changes to the staging area
git add hello.txt
# Commit the changes
git commit -m "Added a new line."
# See all the commits
git log
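For longer histories, the default output of these commands can get noisy. The sketch below repeats a shortened version of the walkthrough in a temporary directory and uses the more compact --short and --oneline forms. The temporary directory and the placeholder identity are assumptions for the example.

```shell
cd "$(mktemp -d)"
git init
# Placeholder identity (hypothetical) so that the commits work on a fresh machine
git config user.name "Example User"
git config user.email "user@example.com"

echo "Hello, World!" > hello.txt
git add hello.txt
git commit -m "Initial commit."

echo "How are you?" >> hello.txt

# One line per file; " M" marks a file that is modified but not yet staged
git status --short

git add hello.txt
git commit -m "Added a new line."

# One line per commit: abbreviated hash followed by the message
git log --oneline
```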
We can imagine the changes in our repository as a chain in which every commit is a link. We can go back to a previous commit by running the git checkout <commit-hash> command, where commit-hash is the hash of the commit that you want to go back to. You can see the commit hashes by running the git log command. After going back to a previous commit, you can return to the latest commit by running the git checkout master command. Here, master is the name of the branch that we are currently working on (depending on your Git configuration, the default branch may be named main instead). We will talk about branches later. The git checkout command modifies the working directory to match the commit that you specified, so you should be careful when using it. Git refuses to switch if doing so would overwrite uncommitted changes, but it is still safest to commit your work before checking out another commit.
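The round trip described above can be sketched as follows. The example rebuilds the two-commit history in a temporary directory and reads the branch name from git instead of assuming master or main. HEAD~1 is git's notation for the parent of the latest commit and is used here only to avoid copying a hash by hand; the placeholder identity is likewise an assumption.

```shell
cd "$(mktemp -d)"
git init
# Placeholder identity (hypothetical) so that the commits work on a fresh machine
git config user.name "Example User"
git config user.email "user@example.com"

echo "Hello, World!" > hello.txt
git add hello.txt
git commit -m "Initial commit."
echo "How are you?" >> hello.txt
git add hello.txt
git commit -m "Added a new line."

# Remember the branch name (master or main, depending on configuration)
branch="$(git symbolic-ref --short HEAD)"

# Look up the hash of the first commit (the parent of the latest one)
first_commit="$(git rev-parse HEAD~1)"

# Go back: hello.txt now contains only the first line
git checkout "$first_commit"
cat hello.txt

# Return to the latest commit on the branch: both lines are back
git checkout "$branch"
cat hello.txt
```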
We have two commits right now, and our commit history looks like this:
gitGraph
    commit id: "A"
    commit id: "B"
Let's imagine that we want to do some experiments with our code and decide later whether to keep them or not. In this case, we can use branches. A branch is a pointer to a specific commit. At this point, this may be confusing, so keep in mind that each commit has a parent commit, which is the commit that came before it.
flowchart RL
    A-- parent -->NULL
    B-- parent -->A
However, instead of drawing the parent pointer explicitly as above, we will use git diagrams to represent branches. In these diagrams, each commit is represented by a circle, and its parent is the commit located to its left and connected to it by an edge.
When we create a branch, git creates a new pointer to the commit that we are currently on. This pointer is called a branch. We can create a branch by running the git branch <branch-name> command. After creating a branch, we can switch to it by running the git checkout <branch-name> command.
gitGraph
    commit id: "A"
    commit id: "B"
    branch experiment-1
    commit id: "C"
    checkout main
    branch experiment-2
    commit id: "D"
    checkout main
    merge experiment-2
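The history in the diagram above can be reproduced with a sketch like the following. The file names and the placeholder identity are assumptions for illustration, and the name of the main line is read from git because it may be master or main depending on configuration. The git merge command, which the diagram shows as its last step, brings the commits of another branch into the current one.

```shell
cd "$(mktemp -d)"
git init
# Placeholder identity (hypothetical) so that the commits work on a fresh machine
git config user.name "Example User"
git config user.email "user@example.com"
main_branch="$(git symbolic-ref --short HEAD)"

# Commits A and B on the main line
echo "A" > a.txt && git add a.txt && git commit -m "A"
echo "B" > b.txt && git add b.txt && git commit -m "B"

# experiment-1: a new pointer at B, then commit C on it
git branch experiment-1
git checkout experiment-1
echo "C" > c.txt && git add c.txt && git commit -m "C"

# Back to the main line, then experiment-2 with commit D
git checkout "$main_branch"
git branch experiment-2
git checkout experiment-2
echo "D" > d.txt && git add d.txt && git commit -m "D"

# We decide to keep experiment-2: merge it into the main line
git checkout "$main_branch"
git merge experiment-2

# Show all branches as a graph; C stays only on experiment-1
git log --oneline --graph --all
```

Because the main line has no new commits of its own in this sketch, git by default performs the merge as a fast-forward: it simply moves the main branch pointer to D instead of creating a separate merge commit.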
At this point, our repository is stored entirely on our local machine. If we want to share our code with others, we should store it in a place that they can access. This place is called a remote repository. A remote repository is a repository stored on a server that other developers can reach. The server may be publicly accessible or private.
In order to work with remote repositories, the first step is to introduce the remote repository to our local repository. We can do this by running the git remote add origin <remote-url> command. This command registers a remote repository in our local repository. We may define multiple remote repositories for one local repository, which is why we provide a name for each of them. This name is called an alias. In this case, we are using origin as the alias. This is a common convention in the industry, but you can use any name that you want. After adding the remote repository, we can push our changes to it by running the git push origin master command. Here, master is the name of the branch that we are pushing. We will talk about it later.
# Add remote repository
git remote add origin https://github.com/cebecifaruk/git-done-right
# Push the changes to the remote repository
git push origin master
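If you want to try these commands without a hosting account, a bare repository on your own disk can stand in for the server. The sketch below is an illustration under that assumption: the paths, the placeholder identity, and the use of a local directory in place of a URL are all choices made for the example.

```shell
# A bare repository in a temporary directory stands in for the server
remote_dir="$(mktemp -d)/git-done-right.git"
git init --bare "$remote_dir"

# A local repository with one commit
cd "$(mktemp -d)"
git init
# Placeholder identity (hypothetical) so that the commit works on a fresh machine
git config user.name "Example User"
git config user.email "user@example.com"
echo "Hello, World!" > hello.txt
git add hello.txt
git commit -m "Initial commit."
branch="$(git symbolic-ref --short HEAD)"

# Register the stand-in remote under the alias origin and list the remotes
git remote add origin "$remote_dir"
git remote -v

# Push the current branch, then list the branches that the remote now has
git push origin "$branch"
git ls-remote origin
```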
The push command sends the commits of the pushed branch that the remote repository does not have yet. Afterwards, the remote repository holds everything needed to rebuild the working directory of the project. This means that we can also get all the commits on a different machine or in a different directory. In order to "download" a project from a remote repository, we could take these steps:
# Create a directory
mkdir git-done-right-clone
# Change the directory
cd git-done-right-clone
# Initialize a git repository
git init
# Add the remote repository
git remote add origin <remote-url>
# Pull the changes from the remote repository
git pull origin master
However, instead of doing all these steps manually, git provides a single command that does them for us: git clone <remote-url>. This command performs all the steps mentioned above and creates a new directory with the name of the repository.
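As a sketch of cloning that runs without network access, the example below first prepares a bare repository on disk to play the role of the remote and then clones it. The paths and the placeholder identity are assumptions for illustration; with a real server you would pass its URL to git clone instead.

```shell
# Stand-in remote: a bare repository with one commit pushed to it
remote_dir="$(mktemp -d)/git-done-right.git"
git init --bare "$remote_dir"
cd "$(mktemp -d)"
git init
# Placeholder identity (hypothetical) so that the commit works on a fresh machine
git config user.name "Example User"
git config user.email "user@example.com"
echo "Hello, World!" > hello.txt
git add hello.txt
git commit -m "Initial commit."
git push "$remote_dir" HEAD

# Clone it elsewhere; the new directory is named after the repository
cd "$(mktemp -d)"
git clone "$remote_dir"
cd git-done-right

# The clone already has origin configured and contains the full history
git remote -v
git log --oneline
```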
Git Behind the Scenes
GitHub
GitHub is a web-based platform built on top of git where you can store your repositories and collaborate with other developers. It is a very popular platform in the industry, and it provides many features that are not available in git itself.
You can use GitHub individually by creating a personal account. For a company or a group, you do not need to share a single account: you can create an organization in GitHub instead. An organization is, at the end of the day, just a group of users. You can add users to your organization and manage their permissions. You can also create teams in your organization. A team is a smaller group of users within the organization.
In addition, GitHub is like a social media platform for developers. You can follow other developers, and you can star and watch their repositories. Starring a repository means that you like it. Watching a repository means that you want to get notifications about it. You can also fork a repository, which creates a copy of the repository under your own account. You can make changes in the forked repository and then create a pull request to the original repository. A pull request is a request to merge your changes into the original repository. The owner of the original repository can accept or reject your pull request.
In GitHub, you can track the tasks that you need to do as issues. An issue may be a bug, a feature request, a question, and so on. You can assign issues to other developers and to yourself. Issues are organized per repository. If you are working on multiple repositories, you can use projects to organize your issues. A project is like a kanban board where you arrange your issues in columns.
Besides storing your documentation in the repository itself, you can also store it in the repository's wiki. A wiki is a place where you can write your documentation in markdown format. You can create pages, link them to each other, and organize them in a tree structure.
Monorepo vs Polyrepo
Best Practices
Your commit messages should describe what you did, not merely state that something changed. For instance, "README file updated" or "x file changed" are not good commit messages. Instead, you can write "Added a new section about x" in place of "README file updated", or "Added a new function to do y" in place of "x file changed". The message should give an idea of what you did. There is no need to list every create, update, or delete operation, because those are already recorded in the commit itself.
Please also be consistent across your commit messages. For instance, if you use the imperative mood, use it for all of your commit messages. If you use the past tense, use it for all of them. The messages should be consistent syntactically as well. For instance, you can start each message with a capital letter and end it with a period, as long as you do so in every commit message.
It is very important to keep your git history clean. You should only commit files that belong to your project. For instance, you should not commit your node_modules folder, IDE files, temporary files, and so on. Git stores your entire history, so every unnecessary file that you commit keeps increasing the size of your repository. Please be careful with binary files in particular. Git handles them poorly: their changes cannot be shown as a meaningful diff, and every changed version adds to the repository size. Commit binary files only if you really need them.
Another good practice is to use a .gitignore file to list the files that you do not want to track. For instance, you do not need to track the node_modules folder. You can ignore it by adding node_modules to the .gitignore file. Instead of writing the file by hand, you can also use gitignore.io to generate a .gitignore file for your project.
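A minimal sketch of the effect of a .gitignore file is shown below. The file names inside the temporary project (index.js and node_modules/dep.js) are hypothetical and exist only to give git something to ignore and something to report.

```shell
cd "$(mktemp -d)"
git init

# A dependency folder that we do not want to track, next to a real source file
mkdir node_modules
echo "module.exports = {};" > node_modules/dep.js
echo "console.log('hi');" > index.js

# Tell git to ignore the folder
echo "node_modules" > .gitignore

# Only .gitignore and index.js are reported as untracked
git status --short

# Ask git which rule causes a given path to be ignored
git check-ignore -v node_modules/dep.js
```

The .gitignore file itself is a regular project file, so it is usually committed together with the rest of the project.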