
sphenix-software-l - [Sphenix-software-l] git and github gymnastics - some info that might be useful.

sphenix-software-l AT lists.bnl.gov

Subject: sPHENIX discussion of software

  • From: Martin Purschke <purschke AT bnl.gov>
  • To: "sphenix-software-l AT lists.bnl.gov" <sphenix-software-l AT lists.bnl.gov>
  • Subject: [Sphenix-software-l] git and github gymnastics - some info that might be useful.
  • Date: Wed, 23 Mar 2016 21:41:21 -0400


Dear software aficionados,

One of my quests for the day was to bring some existing git
repositories, the rcdaq one among them, into github. With some CVS-based
repos earlier we had just taken a snapshot and imported it, but since
there already was a multi-year git history with those projects, I did
not want to lose that.

It turns out that a straightforward import is rather easy. I'll show the
command in a minute, but this is not a lot different from making a fresh
repository: In this case you make a fresh repo in github and clone the
empty repository. In your local copy you add files, modify them, and
they acquire some history locally, and at some point you push that to
github.

The only difference between that scenario and an existing repository is
that the latter wasn't "born" in github, but existed already.

So you make a fresh repository in github, usually with the same name,
and obtain the URL, say, the fictitious
https://github.com/sPHENIX-Collaboration/mwc.git ("my wonderful code").
In your existing top-level repository, which up to this point didn't have an
upstream place to push to, you now simply register github as the remote
named "origin":

> git remote add origin https://github.com/sPHENIX-Collaboration/mwc.git

Now your repository is in the same position as one that was created
in github to begin with; you push, and so propagate your pre-existing
history to github.
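
For illustration, the whole sequence can be tried out without touching
github at all. This is only a sketch under assumptions: a local bare
repository (the made-up name "github-standin.git") plays the role of the
freshly created, empty github repository, and the demo identity, file
name and tag are invented. In real life you would use the github URL
instead of the local path.

```shell
#!/bin/sh
set -e

# Hypothetical scratch area; the bare repository stands in for the empty github repo.
work=$(mktemp -d)
git init -q --bare "$work/github-standin.git"

# Demo identity so that the commits below work on any machine (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

# An "existing" repository with some local history and no remote yet.
git init -q "$work/mwc"
cd "$work/mwc"
echo "first" > README
git add README
git commit -q -m "first commit"
echo "second" >> README
git commit -q -am "second commit"
git tag v1.0

# Register the new, empty repository as the remote named "origin".
git remote add origin "$work/github-standin.git"

# Push all branches with their full history and set up tracking, then the tags.
git push -q -u origin --all
git push -q origin --tags
```

Pushing with "--all" and then "--tags" is one way to make sure that
every branch and tag arrives, not just the branch you happen to be on.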

While that sounds easy enough, I immediately ran into organizational
issues. We currently have a number of top-level repositories in our
sPHENIX github area, such as "coresoftware", "analysis", "macros",
"tutorials", and so on. Those are individual (and independent)
repositories; you can clone them independently.

The situation on my small server, which has up to this point hosted the
16 or so individual git repositories, is that there are physical,
filesystem-level directories. In there I keep related but otherwise
independent git repositories, which remain independently cloneable. As I
explained in the HCal meeting yesterday, the various RCDAQ plugins are -
and should be - independent repositories. The whole point is that the
system is NOT a monolithic thing; in this way I can make my core code
available externally without violating any licenses.

It would appear that this organizational structure has no equivalent in
github. Everything that is a repository appears at the top level,
period. There is no concept of a "folder" which in turn hosts multiple
repositories.

If I were to replicate my structure "flat" at the top level, this would
increase the number of top-level (repository) entries from the current
6 to 23. I'm guessing that's not a good idea.

So why not just lump all the independent repositories together into one,
as we have done with "coresoftware"?

> $ ls -l coresoftware/
> drwxr-sr-x 6 purschke rhphenix 2048 Mar 4 16:31 generators
> drwxr-sr-x 6 purschke rhphenix 2048 Mar 4 16:31 offline
> drwxr-sr-x 3 purschke rhphenix 2048 Mar 4 19:36 online_distribution
> drwxr-sr-x 3 purschke rhphenix 2048 Mar 4 16:31 simulation

So maybe at this point it doesn't matter much that, if I'm only
interested in "online_distribution", I also check out generators,
offline, and simulation with it - it's only a couple of megabytes each.
As PHENIX's history has shown, though, the sizes will grow, and
dramatically so.
But coming back to my existing structure, I couldn't do that without
giving up a lot of functionality. Today I can hand out RCDAQ to the RD51
folks, together with the SRS plugin they are usually interested in,
without worrying about the code from, say, CAEN. The CAEN code is free
to download from their website, but NOT free to re-distribute.

The argument that one can clone just a partial repository without also
cloning the rest of it falls flat - this is simply not possible without
breaking the git ancestry. The methods I've seen described involve some
tricks using svn's - yes, that's subversion's - ability to work with git
repositories and so get a partial clone - except that it's not a clone.
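
To illustrate what "breaking the ancestry" means, here is a small
sketch. It is an assumption on my part that git's own filter-branch
command is a fair stand-in for the tricks mentioned above; the
repository and file names are invented. It extracts one subdirectory
from a combined repository and shows that the resulting commit has a
different ID than the original one, so the extracted repository no
longer shares history with its parent.

```shell
#!/bin/sh
set -e

# Demo identity and quiet mode for newer git versions (assumptions).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org
export FILTER_BRANCH_SQUELCH_WARNING=1

# A made-up combined repository with two subdirectories.
work=$(mktemp -d)
git init -q "$work/combined"
cd "$work/combined"
mkdir offline online_distribution
echo "reco" > offline/reco.txt
echo "daq" > online_distribution/daq.txt
git add .
git commit -q -m "add offline and online_distribution"
full_id=$(git rev-parse HEAD)

# Work on a copy, and rewrite it so that only online_distribution remains.
git clone -q "$work/combined" "$work/online_only"
cd "$work/online_only"
git filter-branch --subdirectory-filter online_distribution >/dev/null 2>&1
split_id=$(git rev-parse HEAD)

# The two IDs differ: the extracted history is a rewrite, not a clone.
echo "combined repo commit: $full_id"
echo "extracted commit:     $split_id"
```

The extracted repository contains only the files from
online_distribution, but since every commit has been rewritten, git
treats it as unrelated to the combined one.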

I'm not really offering a solution, but I think it's worth investigating
whether there's a way to organize things more sensibly in folders with the
paid version. We'll need that (mainly the ability to have private
repositories) in any case once we commit code such as the aforementioned
CAEN interfaces, or our commercial Jungo code, or if we think that
future sPHENIX PPGs will use github as the versioning tool.

So much for now,

Martin


--
Martin L. Purschke, Ph.D. ; purschke AT bnl.gov
; http://www.phenix.bnl.gov/~purschke
;
Brookhaven National Laboratory ; phone: +1-631-344-5244
Physics Department Bldg 510 C ; fax: +1-631-344-3253
Upton, NY 11973-5000 ; skype: mpurschke
-----------------------------------------------------------------------



