FlightGear Git: splitting FGData: Difference between revisions

Re-organise, clarify the difference between repository and project.
(Re-organise, clarify the difference between repository and project.)
Line 1: Line 1:
= To Split or not to Split, that's the question =
After much discussion on the mailing list, it was decided to put the existing attempt to split fgdata on hold until further notice. The main reason for postponing the split was that, while it was considered a well-intentioned initiative, the end result of the splitting process left the FlightGear fgdata project in a less than desirable state. For this reason, before another splitting attempt is undertaken, the pros and cons of each step should be carefully evaluated. This article discusses some of our options and formulates a plan of approach that can be presented to, and discussed in further depth on, the developers mailing list. Several reasons have been put forth to split fgdata:
After much discussion on the mailing list, it was decided to put the existing attempt to split fgdata on hold until further notice. The main reason for postponing the split was that, while it was considered a well-intentioned initiative, the end result of the splitting process left the FlightGear fgdata project in a less than desirable state. For this reason, before another splitting attempt is undertaken, the pros and cons of each step should be carefully evaluated. This article discusses some of our options and formulates a plan of approach that can be presented to, and discussed in further depth on, the developers mailing list. Several reasons have been put forth to split fgdata:


== Reasons to split fgdata ==
== Reasons to split fgdata ==
* '''Advantages'''
=== Advantages ===
** Aircraft authors can get commit access to their own aircraft, without granting them global fgdata access.
** Aircraft authors can get commit access to their own aircraft, without granting them global fgdata access.
** When pulling fgdata, one won't have to download several gigs of aircraft data. People will have to pull the base package, but any additional aircraft will be optional.
** When pulling fgdata, one won't have to download several gigs of aircraft data. People will have to pull the base package, but any additional aircraft will be optional.
Line 11: Line 9:
** fgdata size decreases from 5.6 GB to 1 GB (see statistics below).
** fgdata size decreases from 5.6 GB to 1 GB (see statistics below).


=== Disadvantages ===
It should also be noted, however, that a split is not without potential problems:
It should also be noted, however, that a split is not without potential problems:


* '''Disadvantages'''
** It will be harder to keep a local, up-to-date copy of all aircraft. A single "git pull" will no longer fetch all the latest updates.
** It will be harder to keep a local, up-to-date copy of all aircraft. A single "git pull" will no longer fetch all the latest updates.
*** Might be fixed by using Git submodules.<ref>[http://book.git-scm.com/5_submodules.html Git Community Book: Submodules]</ref>
*** Might be fixed by using Git submodules.<ref>[http://book.git-scm.com/5_submodules.html Git Community Book: Submodules]</ref>
Line 24: Line 22:
One of the most prominent reasons brought forth in favor of splitting fgdata is the relatively large size of the initial clone of the git repository, combined with the relatively slow download speed of Gitorious and the observation that interrupted downloads cannot be resumed. Before discussing possible ways around this problem, a few observations should be made with respect to the actual size of the downloaded git package:
One of the most prominent reasons brought forth in favor of splitting fgdata is the relatively large size of the initial clone of the git repository, combined with the relatively slow download speed of Gitorious and the observation that interrupted downloads cannot be resumed. Before discussing possible ways around this problem, a few observations should be made with respect to the actual size of the downloaded git package:


* '''Statistics''': To obtain proper GIT repository size statistics, make sure to only check the size of the ".git" folder - which contains the history that belongs to the archive and needs to be downloaded. Once you check out a branch as a "working copy" locally, the total size of your actual file system folder increases (likely doubles), since the check-out creates a working ''copy'' of all files by ''extracting'' data from the ''compressed'' archive.
=== Statistics ===
** Size of original fgdata GIT repository: 5.6GB
To obtain proper GIT repository size statistics, make sure to only check the size of the ".git" folder - which contains the history that belongs to the archive and needs to be downloaded. Once you check out a branch as a "working copy" locally, the total size of your actual file system folder increases (likely doubles), since the check-out creates a working ''copy'' of all files by ''extracting'' data from the ''compressed'' archive.
** Size of fgdata core GIT repository without aircraft: 1GB
* Size of original fgdata GIT repository: 5.6GB
** Total size of all aircraft repositories: 3.1GB
* Size of fgdata core GIT repository without aircraft: 1GB
** Number of aircraft: 385
* Total size of all aircraft repositories: 3.1GB
* Number of aircraft: 385
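
The measurement described above can be sketched in Python. This is only an illustration, not an existing FlightGear tool: <code>dir_size_bytes</code> and <code>repo_size_report</code> are hypothetical helper names, and the sketch assumes a regular (non-bare) clone whose history lives in a ".git" directory.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files below path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def repo_size_report(clone_path):
    """Return (history_bytes, working_copy_bytes) for a non-bare clone.

    history_bytes is the size of the ".git" directory, i.e. the part that
    has to be downloaded; working_copy_bytes is everything else on disk.
    """
    history = dir_size_bytes(os.path.join(clone_path, ".git"))
    everything = dir_size_bytes(clone_path)
    return history, everything - history
```

Calling <code>repo_size_report</code> on the path of a local fgdata clone returns both numbers, so the download size can be reported separately from the space taken by the checked-out files.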


It should be noted that interrupted downloads are a potential problem; however there are a number of viable workarounds for these:
It should be noted that interrupted downloads are a potential problem; however there are a number of viable workarounds for these:
** Download an initial clone using a more robust download system, such as BitTorrent.
* Download an initial clone using a more robust download system, such as BitTorrent.
** Download a snapshot without full project history.
* Download a snapshot without full project history.
** Clone the repository from a faster mirror, such as the mapserver.
* Clone the repository from a faster mirror, such as the mapserver.
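
One way to obtain a snapshot without full project history using Git itself is a shallow clone (<code>git clone --depth</code>). The following is a minimal sketch, assuming the server permits shallow clones; the function names are hypothetical, and any URL passed in is up to the caller since no actual project address is implied here.

```python
import subprocess


def shallow_clone_command(url, dest, depth=1):
    """Build the git command line for a clone truncated to `depth` commits."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return ["git", "clone", "--depth", str(depth), url, dest]


def shallow_clone(url, dest, depth=1):
    """Run the shallow clone; raises CalledProcessError if git fails."""
    subprocess.run(shallow_clone_command(url, dest, depth), check=True)
```

A shallow clone transfers far less data than a full clone, which makes an interrupted download less costly, but it does not by itself make the download resumable.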


It should further be noted that git's merging and update algorithms are sufficiently efficient to deal with our ever increasing repository, so no immediate problems are to be expected in this area. Given these considerations, it appears that there are sufficient alternatives to circumvent the initial clone problem, and that the size of the git repository as such poses no immediate problem. That said, there are a number of additional reasons that make it desirable to split the fgdata repository into smaller, more manageable chunks. Splitting off the aircraft directory from the rest is a logical first step, and the main question is how to proceed with this. There are a number of possible alternatives: 1) Split off all aircraft and keep them all in a single, but separate repository. 2) Move each aircraft to its own repository, and 3), organize aircraft by logical units. Here are the advantages and disadvantages of keeping all aircraft in a single repository:
It should further be noted that git's merging and update algorithms are sufficiently efficient to deal with our ever increasing repository, so no immediate problems are to be expected in this area. Given these considerations, it appears that there are sufficient alternatives to circumvent the initial clone problem, and that the size of the git repository as such poses no immediate problem. That said, there are a number of additional reasons that make it desirable to split the fgdata repository into smaller, more manageable chunks.


=== Keeping all aircraft under a single project ===
== Options ==
Splitting off the aircraft directory from the rest is a logical first step, and the main question is how to proceed with this. There are a number of possible options:
* Split off all aircraft and keep them all in a single, but separate repository.
* Move each aircraft to its own repository
* Organize aircraft by logical units.
Here are the advantages and disadvantages of each option:
 
=== Single repository ===
* '''Advantages'''
* '''Advantages'''
** The current fgdata-developers team can access any single aircraft, for easy/quick fixes. For example, when something is found to be wrong and has been copied among several aircraft (which happens due to copy&paste), or when something about the sim itself changes and aircraft must be adapted to run on an upcoming release.
** The current fgdata-developers team can access any single aircraft, for easy/quick fixes. For example, when something is found to be wrong and has been copied among several aircraft (which happens due to copy&paste), or when something about the sim itself changes and aircraft must be adapted to run on an upcoming release.
** When an aircraft developer decides to leave, the repo can easily be taken over by other developers. If the author set up his own repository, we'd have to create a new repository (and thus change all references/links).
** When an aircraft developer decides to leave, the aircraft can easily be taken over by other developers.
** It allows us to use [http://flightgear-bugs.googlecode.com the bug tracker] for aircraft. Most developers won't clone aircraft repos from all kinds of places just to help fix bugs.
** It allows us to use [http://flightgear-bugs.googlecode.com the bug tracker] for aircraft. Most developers won't clone aircraft repos from all kinds of places just to help fix bugs.
* '''Disadvantages'''
* '''Disadvantages'''
** Authors won't be able to choose their own license.
** Authors won't be able to choose their own license.
*** The FlightGear Aircraft project has been set to "License: Other/Multiple". This allows (in theory, we first need to agree on this) any aircraft author to add whatever license file to his/her aircraft and still put it under the project.
** The aircraft/ directory makes up the largest part of the current repository. A single aircraft repository would make contributing to the base package easier, but it won't have much effect on aircraft developers.
*** However, the use of a ''common'' license for all ''community aircraft'' has many advantages and has also contributed to the success of the FG project (new aircraft can easily be based on existing aircraft and can copy elements from other aircraft without triggering a complicated mixed-license situation). The GPL specifically has many advantages for the FG project (see [[Changing the FlightGear License]] for details on why FG wants to stick to the GPL). Authors preferring another license can of course already use one (and continue to do so), except that their aircraft will not be in the community repository.


=== Organizing Aircraft by Logical units ===
=== Per-aircraft repository, single project ===
Each aircraft would get a repository under a "FGAircraft" project at Gitorious.
* '''Advantages'''
* '''Advantages'''
** Logical ordering units remain manageable both in terms of the number of them as well as their size
** The current fgdata-developers team can access any single aircraft, for easy/quick fixes. For example, when something is found to be wrong and has been copied among several aircraft (which happens due to copy&paste), or when something about the sim itself changes and aircraft must be adapted to run on an upcoming release.
* '''Disadvantages'''
** When an aircraft developer decides to leave, the repo can easily be taken over by other developers. If the author set up his own repository, we'd have to create a new repository (and thus change all references/links).
** It is difficult to come up with a good set of criteria to define the aircraft categories.
** It allows us to use [http://flightgear-bugs.googlecode.com the bug tracker] for aircraft. Most developers won't clone aircraft repos from all kinds of places just to help fix bugs.
** Authors can choose their own license.
*** The FlightGear Aircraft project can be set to "License: Other/Multiple". This allows (in theory, we first need to agree on this) any aircraft author to add whatever license file to his/her aircraft and still put it under the project.
*** However, the use of a ''common'' license for all ''community aircraft'' has many advantages and has also contributed to the success of the FG project (new aircraft can easily be based on existing aircraft and can copy elements from other aircraft without triggering a complicated mixed-license situation). The GPL specifically has many advantages for the FG project (see [[Changing the FlightGear License]] for details on why FG wants to stick to the GPL). Authors preferring another license can of course already use one (and continue to do so), except that their aircraft will not be in the community repository.


=== Assigning each aircraft to its own project ===
=== Per-aircraft project ===
* '''Advantages'''
* '''Advantages'''
** Each aircraft developer can get commit rights to his or her own project.
** Each aircraft developer can get admin rights to his or her own project.
* '''Disadvantages'''
* '''Disadvantages'''
** It will become increasingly difficult to maintain abandoned aircraft, or conduct maintenance.
** It will become increasingly difficult to maintain abandoned aircraft, or conduct maintenance.
** With over 500 individual repositories it will become increasingly difficult to keep track of new developments.  
** With over 500 individual repositories it will become increasingly difficult to keep track of new developments.  


=== Organizing aircraft by logical units ===
* '''Advantages'''
** Logical ordering units remain manageable both in terms of the number of them as well as their size
* '''Disadvantages'''
** It is difficult to come up with a good set of criteria to define the aircraft categories.


== Considerations ==
Given these considerations, it can be concluded that it is desirable to separate the aircraft from the main repository. It should also be pointed out that separating out the aircraft and moving them all into a single repository is the sole action addressing the most urgent reason for the split, namely giving the opportunity to be more liberal in granting aircraft developers commit rights, without having to consider the integrity of the base package as such. It should furthermore be noted that while it is technically possible to remove an entire subdirectory from a project without losing its history, it is undesirable to do so. Every split done in this manner would force every user to reclone the entire repository in question. Additionally, it is considerably more difficult to combine repositories again once they have been split. Therefore, we should be cautious in performing split operations. For this reason, the most feasible action appears to be to just separate the aircraft directory from fgdata and move it in its entirety to a new subproject. There it can live until a new and agreed upon classification scheme for separate repositories has been developed.
Given these considerations, it can be concluded that it is desirable to separate the aircraft from the main repository. It should also be pointed out that separating out the aircraft and moving them all into a single repository is the sole action addressing the most urgent reason for the split, namely giving the opportunity to be more liberal in granting aircraft developers commit rights, without having to consider the integrity of the base package as such. It should furthermore be noted that while it is technically possible to remove an entire subdirectory from a project without losing its history, it is undesirable to do so. Every split done in this manner would force every user to reclone the entire repository in question. Additionally, it is considerably more difficult to combine repositories again once they have been split. Therefore, we should be cautious in performing split operations. For this reason, the most feasible action appears to be to just separate the aircraft directory from fgdata and move it in its entirety to a new subproject. There it can live until a new and agreed upon classification scheme for separate repositories has been developed.
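
For illustration, one history-preserving way to extract a subdirectory with stock Git is <code>git filter-branch --subdirectory-filter</code>. The sketch below only shows that general technique; it is not necessarily the procedure used in the earlier split attempt. The helper names are hypothetical, and the directory name passed in (for example "Aircraft") is an assumption about the repository layout.

```python
import subprocess


def split_subdirectory_command(subdir):
    """Build a git command that rewrites every branch so that `subdir`
    becomes the repository root, keeping only commits that touched it."""
    return [
        "git", "filter-branch", "--prune-empty",
        "--subdirectory-filter", subdir,
        "--", "--all",
    ]


def split_subdirectory(clone_path, subdir):
    """Run the rewrite inside a fresh, disposable clone at clone_path."""
    subprocess.run(split_subdirectory_command(subdir), cwd=clone_path, check=True)
```

Because the rewrite produces new commit IDs for the whole history, existing clones no longer share history with the result, which is the reason a reclone would be required after such a split.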


Line 68: Line 83:
* As more and more people gain commit rights, some rules and guidelines are in order. We currently have a largely unwritten code of conduct that has proven to work well. With new people entering the scene, however, it will become necessary to put it in writing.
* As more and more people gain commit rights, some rules and guidelines are in order. We currently have a largely unwritten code of conduct that has proven to work well. With new people entering the scene, however, it will become necessary to put it in writing.


= A new plan for splitting fgdata =
== A new plan for splitting fgdata ==


# Create a new git repository: fgaircraft. This repository will -for the time being- contain all current and new FlightGear aircraft except the default c172p.
# Create a new git repository: fgaircraft. This repository will -for the time being- contain all current and new FlightGear aircraft except the default c172p.
Line 93: Line 108:
# In addition to these rules, anybody contributing to the FlightGear project is encouraged to work with personal clones and submit merge requests.
# In addition to these rules, anybody contributing to the FlightGear project is encouraged to work with personal clones and submit merge requests.


===Comments on the new plan===
=== Comments on the new plan ===


(HHS comments)
(HHS comments)
