Difference between revisions of "FlightGear Git: splitting FGData"

From FlightGear wiki

Revision as of 19:09, 11 December 2011

To Split or not to Split, that's the question

After much discussion on the mailing list, it was decided to put the existing attempt to split fgdata on hold until further notice. The main reason for postponing the split was that, although the initiative was well intended, the splitting process itself left the FlightGear fgdata project in a less than desirable state. For this reason, the pros and cons of each step should be carefully evaluated before another splitting attempt is undertaken. This article discusses some of our options and formulates a plan of approach that can be presented to, and discussed in further depth on, the developers mailing list. Several reasons have been put forth to split fgdata:

Reasons to split fgdata

  • Advantages
    • Aircraft authors can get commit access to their own aircraft, without granting them global fgdata access.
    • When pulling fgdata, one won't have to download several gigs of aircraft data. People will have to pull the base package, but any additional aircraft will be optional.
    • It will be easier for aircraft authors to check the history of their aircraft.
    • Committing will go faster, because Git will no longer have to check those thousands of files to see whether they were edited. NOTE: Can't reproduce even on really old, slow (7.2k SATA) disks.
    • fgdata size decreases from 5.6 GB to 1 GB (see statistics below).

It should also be noted, however, that a split is not without potential problems:

  • Disadvantages
    • It will be harder to keep a local up to date copy of all aircraft. No more "git pull" to fetch all the latest updates.
      • Might be fixed by using Git submodules.[1]
    • How to deal with licenses? Until now there was a COPYING file in fgdata. When aircraft are split into separate repositories, they'll likely need to include a license reference themselves.
    • Need a concept for release management: maintaining version numbers, release branches, release tags, etc.
    • Quite a few unmaintained aircraft got adopted after one of the developers accidentally tripped over them. Need a plan for how this is supposed to work with split aircraft repositories, otherwise the project would axe one of the substantial principles which contributed to its success.
    • Need an idea about how to substitute the previous "starter" package, which was offered via HTTP for those who'd like to have the entire repository.
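
The submodule idea mentioned in the list above could be sketched as follows. This is only an illustration, not an agreed setup: the helper names, the "Aircraft" directory name, and the repository URL passed in are assumptions. The functions only build the git command lines (`git submodule add` and `git submodule update --init --recursive` are standard git subcommands) and do not run anything.

```python
def add_aircraft_submodule(url, name, aircraft_dir="Aircraft"):
    """Command that registers one aircraft repository as a submodule.

    `url`, `name` and the default directory "Aircraft" are illustrative
    assumptions, not actual FlightGear repository names.
    """
    return ["git", "submodule", "add", url, f"{aircraft_dir}/{name}"]


def update_everything_commands():
    """Commands that replace the single 'git pull' of the unsplit repository.

    The first command updates the superproject; the second checks out every
    registered submodule at the commit recorded by the superproject.
    """
    return [
        ["git", "pull"],
        ["git", "submodule", "update", "--init", "--recursive"],
    ]
```

Note that with this approach someone still has to update the commit recorded in the superproject whenever an aircraft repository changes, so it only partly restores the convenience of a single "git pull".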

One of the most prominent reasons brought forth in favor of splitting fgdata is related to the relatively large size of the initial clone of the git repository, the relatively slow download speed of gitorious, and the observation that interrupted downloads cannot be resumed. Before discussing possible alternatives to this problem, a few observations should be made with respect to the actual size of the downloaded git package:

  • Statistics: To obtain proper GIT repository size statistics, make sure to only check the size of the ".git" folder - which contains the history that belongs to the archive and needs to be downloaded. Once you check out a branch as a "working copy" locally, the total size of your actual file system folder increases (likely doubles), since the check-out creates a working copy of all files by extracting data from the compressed archive.
    • Size of original fgdata GIT repository: 5.6GB
    • Size of fgdata core GIT repository without aircraft: 1GB
    • Total size of all aircraft repositories: 3.1GB
    • Number of aircraft: 385
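
To make the measurement described above concrete, here is a minimal sketch that reports the size of the ".git" folder separately from the checked-out working copy. The function names are assumptions made for this example, and the sketch sums file sizes rather than disk blocks, so its numbers can differ slightly from what a tool such as "du" reports.

```python
import os


def directory_size_bytes(path):
    """Sum the sizes of all regular files below `path` (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def repo_size_report(repo_path):
    """Return (history_bytes, working_copy_bytes) for a non-bare clone.

    history_bytes is the size of the ".git" folder, i.e. what has to be
    downloaded; working_copy_bytes is everything else in the folder.
    """
    history = directory_size_bytes(os.path.join(repo_path, ".git"))
    everything = directory_size_bytes(repo_path)
    return history, everything - history
```

Calling `repo_size_report` on a local fgdata clone would return the two numbers that the statistics above distinguish between.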

It should be noted that interrupted downloads are a potential problem; however there are a number of viable workarounds for these:

    • Download an initial clone using a more robust download system, such as BitTorrent.
    • Download a snapshot without full project history.
    • Clone the repository from a faster mirror, such as the mapserver.
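
One way to obtain a snapshot without the full project history is a shallow clone. The sketch below only builds the command line; it assumes that a shallow clone is an acceptable form of "snapshot" here, and the helper names and any URL passed in are hypothetical. `git clone --depth <n>` is a standard git option that limits the downloaded history to the last n commits.

```python
import subprocess


def shallow_clone_command(url, dest, depth=1):
    """Build a 'git clone' command that fetches only the last `depth` commits."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return ["git", "clone", "--depth", str(depth), url, dest]


def run(command):
    """Run a command built above; raises CalledProcessError if git fails."""
    subprocess.run(command, check=True)
```

Cloning from a faster mirror uses the same call with the mirror's URL. A shallow clone transfers far less data, but it is mainly useful for people who only want current files and do not need to inspect older history.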

It should further be noted that git's merging and update algorithms are sufficiently efficient to deal with our ever increasing repository, so no immediate problems are to be expected in this area. Given these considerations, it appears that there are sufficient alternatives to circumvent the initial clone problem, and that the size of the git repository as such poses no immediate problem. That said, there are a number of additional reasons that make it desirable to split the fgdata repository into smaller, more manageable chunks. Splitting off the aircraft directory from the rest is a logical first step, and the main question is how to proceed with this. There are a number of possible alternatives: 1) split off all aircraft and keep them all in a single but separate repository, 2) move each aircraft to its own repository, or 3) organize aircraft by logical units. Here are the advantages and disadvantages of keeping all aircraft in a single repository:

Keeping all aircraft under a single project

  • Advantages
    • The current fgdata-developers team can access any single aircraft for easy/quick fixes, for example when something is found to be wrong and has been copied among several aircraft (which happens due to copy&paste), or when something about the sim itself changes and aircraft must be adapted to run on an upcoming release.
    • When an aircraft developer decides to leave, the repo can easily be taken over by other developers. If the author set up his own repository, we'd have to create a new repository (and thus change all references/links).
    • It allows us to use the bug tracker for aircraft. Most developers won't clone aircraft repos from all kinds of places just to help fix bugs.
  • Disadvantages
    • Authors won't be able to choose their own license.
      • The FlightGear Aircraft project has been set to "License: Other/Multiple". This allows (in theory, we first need to agree on this) any aircraft author to add whatever license file to his/her aircraft and still put it under the project.

Organizing Aircraft by Logical units

  • Advantages
    • Logical units remain manageable, both in their number and in their size.
  • Disadvantages
    • It is difficult to come up with a good set of criteria to define the aircraft categories.

Assigning each aircraft to its own project

  • Advantages
    • Each aircraft developer can get commit rights to his or her own project.
  • Disadvantages
    • It will become increasingly difficult to maintain abandoned aircraft or to conduct maintenance across aircraft.
    • With over 500 individual repositories it will become increasingly difficult to keep track of new developments.

Given these considerations, it can be concluded that it is desirable to separate the aircraft from the main repository. It should also be pointed out that separating out the aircraft and moving them all into a single repository is the sole action addressing the most urgent reason for the split, namely giving the opportunity to be more liberal in granting aircraft developers commit rights, without having to consider the integrity of the base package as such. It should furthermore be noted that while it is technically possible to remove an entire subdirectory from a project without losing its history, it is undesirable to do so. Every split done in this manner would force every user to reclone the entire repository in question. Additionally, it is considerably more difficult to combine repositories again once they have been split. Therefore, we should be cautious in performing split operations. For this reason, the most feasible action appears to be to just separate the aircraft directory from fgdata and move it in its entirety to a new subproject. There it can live until a new and agreed upon classification scheme for separate repositories has been developed.
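
For reference, the history-preserving split discussed above is typically done by rewriting history with `git filter-branch`. The sketch below builds the two command lines involved; it is not a record of how the earlier split was performed, and the helper names and the directory name "Aircraft" are assumptions. Because rewriting history assigns new IDs to the affected commits, existing clones no longer match the rewritten repository, which is why every user would have to reclone.

```python
import shlex


def extract_subdirectory_command(subdir):
    """Command for a fresh clone that keeps ONLY `subdir`, with its history.

    After it runs, `subdir` becomes the root of the rewritten repository.
    """
    return ["git", "filter-branch", "--prune-empty",
            "--subdirectory-filter", subdir, "--", "--all"]


def remove_subdirectory_command(subdir):
    """Command that removes `subdir` from every commit of the repository."""
    # The index filter is executed by a shell once per commit, so quote the path.
    index_filter = "git rm -r --cached --ignore-unmatch " + shlex.quote(subdir)
    return ["git", "filter-branch", "--prune-empty",
            "--index-filter", index_filter, "--", "--all"]
```

Under these assumptions, `extract_subdirectory_command("Aircraft")` would be run in one clone to produce the new aircraft repository, and `remove_subdirectory_command("Aircraft")` in another clone to slim down fgdata. Simply deleting the directory in an ordinary commit avoids the reclone, but it does not shrink the history that has to be downloaded.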

Finally, the following considerations should be taken into account.

  • FlightGear's release distributions are steadily increasing in size. With the proposed fgdata split, we should consider removing all aircraft except the default Cessna 172P from the base package. All others can simply be downloaded from the website. Considering that the c172p is an integral part of FlightGear, it should be the ONLY aircraft that remains in the base package and that is not moved to the new fgaircraft repository.
  • It should be emphasized that GIT is a distributed revision control system, and that our current use of git is insufficient. Aircraft developers should be encouraged to set up their own personal clone on gitorious and to post more merge requests.
  • As more and more people gain commit rights, some rules and guidelines are in order. We currently have a largely unwritten code of conduct that has proven to work well. With new people entering the scene, however, it will become necessary to put it in writing.

A new plan for splitting fgdata

  1. Create a new git repository: fgaircraft. This repository will -for the time being- contain all current and new FlightGear aircraft except the default c172p.
  2. Provide images of the git repository to alleviate the initial clone problem. Preferably, this should be done using BitTorrent or another resumable download source, so that people with limited bandwidth can monitor their data consumption more flexibly.
  3. After a few days of testing, remove all aircraft from the main fgdata repository.
  4. Formalize the status of the new aircraft repositories by formalizing our existing, but unwritten, code of conduct and granting new aircraft authors who agree to this code of conduct commit rights.
  5. Because the new system can allow us to grant a potentially large number of aircraft developers commit access, we deem it desirable to formulate an explicit set of rules with regard to the contributors' rights and obligations. It should be noted that the following rules are not really new, but merely formalizations of our existing code of conduct. By explicitly formulating them, our main goal is to avoid misunderstanding, and to provide clear guidelines in the unlikely event of misuse of commit rights.
  6. Post these rules in a relatively prominent location on the main FlightGear.org website.

The rules describing the rights and obligations of committers are as follows:

  1. Authors who obtain commit rights to fgaircraft retain the rights to handle their own work.
  2. The FlightGear fgaircraft administrators are allowed to update aircraft files, but only
    1. to enable the use of new FlightGear features (such as adding a necessary include file or set a few property switches),
    2. to fix small and obvious bugs (such as fixing a misspelled property or file name), or
    3. to apply changes necessary to keep aircraft compatible with an upcoming FlightGear release.
  3. The FlightGear fgaircraft repository maintainers reserve the right to revoke commit access of an individual committer. They can only exercise their right after consensus has been reached among the maintainers, and only in cases of
    1. misuse of commit rights by the offending developer, or
    2. prolonged inactivity of the committer (more than a year of inactivity)
  4. Aircraft that have not been maintained by the prime committer for a prolonged period of time are considered to have been abandoned and may be assigned to a different committer. The obvious exception to this rule is formed by aircraft that have a high level of completeness that are maintained by committers who are still very active in other areas of FlightGear's development.
  5. Commit rights will be given after the authors have shown a reasonable level of competence in both aircraft development and GIT usage. While the aircraft developer is still in the process of obtaining the required skills, he or she can seek a mentor who will handle any merge requests.
  6. If the aircraft developer is uncomfortable working with GIT, he or she can also opt to choose a "mentor" who will handle the merge requests for them.
  7. In addition to these rules, anybody contributing to the FlightGear project is encouraged to work with personal clones and submit merge requests.

Comments on the new plan

(HHS comments) That sounds like the situation we already have - the only difference: the aircraft are just split from FGData, and the maintainer of an aircraft project can get the rights to commit without any magic. But with that there are a lot of rules with exceptions. From my daily real-life job I know that many rules with exceptions may make things more complicated and provoke discussions and conflicts. So my question is: how can we see that "a reasonable level of competence in both aircraft development and GIT usage" is there? Does he have to be checked? Which level of competence is needed for making an aircraft?

(hvengel comments)

Having well defined rules is a good thing. Some (perhaps many) of the aircraft devs have little or no experience with normal software development work flows, and they will need well defined rules to help them do the right thing. Reading through the proposed rules I don't see too much room for things to be incorrectly used. However, because the level of experience with software work flows will be all over the place (i.e. some aircraft devs are also experienced software developers/engineers and some have absolutely no experience with this type of thing), the rules should be as simple, clear and complete as possible. Nothing should be open to interpretation.

I don't see the concern over determining GIT and/or aircraft development competence to be a significant issue. I am sure that there are a significant number of aircraft devs who are currently using merge requests who would be given commit rights right from the get go. Newer aircraft devs can work with a mentor with commit rights while they are learning GIT and aircraft development, and the mentor can determine when they are ready for commit rights. However, I think having a documented process for this might be a good idea.


It is good to see that the existing rules are finally formalized, so we get a clear guideline. It is also good to see that people should be more encouraged to submit merge requests. But this also means that those with commit rights should be more encouraged to review those merge requests and commit them!