
Rebuilt an easier-to-use Git version in Java

Abstract


In this project, I built a Java version-control system modeled on Git, entirely from scratch. It implements a broad subset of Git's functionality: add, commit, status, log, checkout, branches, and merge, with merge conflicts handled by labeling them. Whereas Git stores its history as a Merkle directed acyclic graph (DAG), this program relies on Java serialization for file input/output and long-term data persistence.

Gitlet Architecture


CS 61B TA

The working directory is the directory on your local computer where your terminal is currently located, or where your current files are saved. Running the command java gitlet.Main init creates a repository inside the working directory and sets up the components Gitlet needs to track changes: the staging area, the commits, the master pointer, and the HEAD pointer. A coherent set of files can then be committed together, and each commit is a snapshot of the entire project at a specific moment in time.
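As a rough illustration of what init might set up, here is a minimal sketch. The class name InitSketch, the subdirectory names (commits, blobs, staging, branches), and the choice to store HEAD as a text file naming the current branch are all assumptions made for this example, not the project's actual layout. A full init would also create and store the initial commit.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of repository initialization; the layout is assumed. */
final class InitSketch {

    /**
     * Creates a .gitlet directory inside workingDir.
     * Returns false (and changes nothing) if a repository already exists.
     */
    static boolean init(Path workingDir) throws IOException {
        Path repo = workingDir.resolve(".gitlet");
        if (Files.exists(repo)) {
            return false;
        }
        // Assumed layout: one directory per kind of stored object.
        Files.createDirectories(repo.resolve("commits"));
        Files.createDirectories(repo.resolve("blobs"));
        Files.createDirectories(repo.resolve("staging"));
        Files.createDirectories(repo.resolve("branches"));
        // The master pointer would hold the initial commit's ID; left empty here.
        Files.write(repo.resolve("branches").resolve("master"), new byte[0]);
        // HEAD records which branch is currently checked out.
        Files.write(repo.resolve("HEAD"), "master".getBytes(StandardCharsets.UTF_8));
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Use a temporary directory so the demo never touches real files.
        Path dir = Files.createTempDirectory("gitlet-init-sketch");
        System.out.println(init(dir)); // first call creates the repository
        System.out.println(init(dir)); // second call finds it already present
    }
}
```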

Blob: holds the contents and the name of a file

Commit: consists of a log message, a timestamp, a mapping of file names to blob references, a parent reference, and (for merges) a second parent reference
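The two object types above could be modeled roughly as follows. This is a sketch only: the class names BlobSketch and CommitSketch, the field names, and the use of String IDs for references are assumptions for illustration, and the real classes may differ.

```java
import java.io.Serializable;
import java.util.Date;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical blob: one file's name and its contents. */
final class BlobSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    final String fileName;
    final byte[] contents;

    BlobSketch(String fileName, byte[] contents) {
        this.fileName = fileName;
        this.contents = contents.clone(); // defensive copy
    }
}

/** Hypothetical commit: metadata plus a snapshot of tracked files. */
final class CommitSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    final String message;
    final Date timestamp;
    /** Maps each tracked file name to the ID of the blob holding its contents. */
    final TreeMap<String, String> blobIds;
    /** ID of the parent commit; null for the initial commit. */
    final String parentId;
    /** ID of the second parent; null unless this commit is a merge. */
    final String secondParentId;

    CommitSketch(String message, Date timestamp, Map<String, String> blobIds,
                 String parentId, String secondParentId) {
        this.message = message;
        this.timestamp = new Date(timestamp.getTime());
        this.blobIds = new TreeMap<>(blobIds); // sorted copy for stable ordering
        this.parentId = parentId;
        this.secondParentId = secondParentId;
    }

    boolean isMerge() {
        return secondParentId != null;
    }
}
```

Storing references as IDs rather than as direct object pointers keeps each serialized commit small, since writing one commit does not drag its whole ancestry along with it.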

Persistence


Each time the gitlet.Main program is run, its main method is invoked with the corresponding arguments, carries out the command, and then exits. Nothing keeps running afterwards, so no in-memory state survives between commands. Hence, we need to store the staging area, the commits, the blobs, the commit metadata, and all the relevant pointers on disk.
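A minimal sketch of saving and restoring state with Java serialization is shown below. The class name PersistenceSketch, its helper methods, and the example staging map are hypothetical and only illustrate the general technique; they are not the project's own utilities.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.TreeMap;

/** Hypothetical helpers for writing and reading serialized objects. */
final class PersistenceSketch {

    /** Serializes obj into file, replacing any previous contents. */
    static void writeObject(Path file, Serializable obj) throws IOException {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(obj);
        }
    }

    /** Reads back an object of the expected type from file. */
    static <T extends Serializable> T readObject(Path file, Class<T> type)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(Files.newInputStream(file))) {
            return type.cast(in.readObject());
        }
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("gitlet-persistence-sketch");
        Path stagingFile = dir.resolve("staging");

        // An assumed staging area: file name -> ID of the staged blob.
        TreeMap<String, String> staged = new TreeMap<>();
        staged.put("wug.txt", "placeholder-blob-id");

        writeObject(stagingFile, staged);                       // end of one run
        TreeMap<?, ?> restored = readObject(stagingFile, TreeMap.class); // next run
        System.out.println(restored.equals(staged));
    }
}
```

Each command would load what it needs at startup and write back anything it changed before exiting.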

Both Git and Gitlet accomplish this the same way: by using a cryptographic hash function called SHA-1 (Secure Hash Algorithm 1), which produces a 160-bit integer hash from any sequence of bytes. The SHA-1 hash value, rendered as a 40-character hexadecimal string, makes a convenient file name for storing our data in our .gitlet directory. It also gives us a convenient way to compare two files (blobs) to see if they have the same contents: if their SHA-1s are the same, we simply assume the files are the same.
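The hashing step can be sketched with the standard java.security.MessageDigest class. The class name Sha1Sketch and the method sha1Hex are names chosen for this example, not necessarily those used in the project.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper that renders a SHA-1 digest as 40 hex characters. */
final class Sha1Sketch {

    static String sha1Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(data); // 20 bytes = 160 bits
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b)); // two hex digits per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every compliant Java platform is required to provide SHA-1.
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) {
        String first = sha1Hex("same contents".getBytes(StandardCharsets.UTF_8));
        String second = sha1Hex("same contents".getBytes(StandardCharsets.UTF_8));
        System.out.println(first.length());       // 40
        System.out.println(first.equals(second)); // true: equal bytes, equal hash
    }
}
```

Because the hash depends only on the bytes, two blobs with identical contents map to the same file name, so the contents need to be stored only once.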

Classes and Data Structures