The GPL, the contributors, the ChangeLog and CVS
Posted 18 Oct 2000 at 16:30 UTC by Raphael 
According to section 2a of the GNU GPL, every person modifying a
program must clearly state what files have been modified and when. This
requirement makes sense when a single person takes an existing program
and redistributes a modified version of it, but it becomes more complex
when multiple developers add their own code as well as patches from
other contributors into a public CVS repository.
Section 2a of the General Public License
says that if you distribute a modified version of a program,
a) You must cause the modified files to carry prominent
notices stating that you changed the files and the date of any change.
This ensures that anybody who gets a copy of the modified software
knows that it differs from the previous version (so they will not blame
the original author if something goes wrong with the modified version).
Also, they know who has modified the software, which could be important
if a copyright problem occurs. This obviously applies to the situation
in which someone takes a piece of code, modifies it, and distributes the
"work based on the program".
But the GPL is a bit vague about what is "distributing a modified
work", especially when multiple developers are involved (1). If you are the original author of the code,
modifying it and re-distributing it is no problem. But if you are a
contributor and you distribute a patch (not the whole code), then I do
not think that you are required to add your name and a description of
your changes in every modified file, because you are only distributing
your own code, not the original code (in fact, that depends on the
format of the patch). Also, the fact that other people have to apply
the patch to create the modified version of the code makes it obvious
that the modifications come from you. So the modifications can easily
be traced back to their author if necessary.
Things get more interesting when multiple authors are committing
their changes to a CVS repository (or any other shared revision control
system). Many projects use public CVS repositories for
collaborative development: the GIMP, GNOME, KDE, many projects on SourceForge, and so on. Anyone
who has at least read access to the repository can take a look at the
CVS logs and see who has modified what files, so all information
required by the GPL is there. This is a bit different when the code is
taken out of CVS and distributed as a tarball or package, but most
projects also keep a ChangeLog file containing a copy of the CVS commit
messages so the traceability of the changes is preserved.
However, the ChangeLog file is not a perfect solution: if someone
takes some source files from one project and uses them in a different
project (both GPL), it is likely that the ChangeLog for these files will
not be preserved. As a consequence, it will be hard to know the
modification history of these files, and this can be a problem if there
is a need to get in touch with all authors for modifying the license
(cf. the change of license for Mozilla or the refusal to add the Qt
exception clause in KDE).
An even more common problem is when a developer who has write access
to CVS applies a patch submitted by another contributor. In the best
case, the ChangeLog entry will mention which files were affected, but
sometimes it consists only of the laconic message "applied patch
provided by J. Random Hacker". In this case, it is hard to know exactly
what was modified if you do not have access to the CVS log or to the
original patch.
This causes problems regarding compliance with section 2 of the
GPL, as well as some practical problems if someone wants to get in touch
with all developers who worked on a piece of code (e.g. for a change of
licensing terms as described above). I don't know if there is an easy
way to improve this situation, which is why I am posting this article...
Is a ChangeLog enough? What if the CVS repository crashes and the logs
are gone? Should we stick to the spirit or to the word of the GPL?
Should we require each contributor to add a comment to every source file
that they modify (this would be boring, both for the contributors and
for those who have to read the code later)?
Another problem that I have not mentioned so far relates to the
copyright owners: in some projects, every developer who creates a file
adds her own copyright to that file. In other projects, the copyright
is always given to the original author of the project (e.g. Spencer
Kimball and Peter Mattis for the GIMP). In some others, the copyright
is given to a formal or informal group (e.g. the Apache group, the KDE
team, the FSF, ...). The Free Software Foundation and other formal
associations such as the old X Consortium have always been very careful
about copyrights and contributions from external developers: when you
submit a significant amount of code that is to be integrated into a
project with the copyright assigned to them, you have to sign some
papers, or have your employer sign some papers, certifying that the
copyright is transferred to them. I don't think that any of the
informal groups are
taking similar precautions, although this is necessary in order to be on
the safe side (legally speaking). Requiring such paperwork before
accepting patches would certainly reduce the number of contributions,
but not doing this puts the developers at risk: the employer of a
contributor could sue the development team for having taken some code
that legally belongs to the company (because the contributor was
contractually bound to that company when he wrote the code). Currently,
everyone (except the FSF) prefers to take this risk in order to get more
contributors, but that could be dangerous if some companies decide to be
nasty.
-Raphaël
____________________
(1) It is interesting to note that the GPL
and (almost?) all other software licenses consider the point of view of
a single copyright owner and do not say much about what happens when
multiple developers own different parts of the code. Probably because
the current laws do not make this easy to handle.
work in joint, posted 18 Oct 2000 at 20:39 UTC by Fyndo »
(Journeyer)
Just on the issue of multiple people owning different parts of the code,
the default law for a bunch of people working on a copyrightable work
together is that they all share equal rights/ownership in the entire
thing, but need to share with the other owners any licensing fees and
the like.
When the only thing you have is a CVS log, you might be interested in
the cvs2cl.pl CVS-log-message-to-ChangeLog conversion script by Karl
Fogel. You should run this little utility once before you 'release' or
'distribute' your code. It creates very, very nice ChangeLog entries.
The ChangeLog file that it generates shows precisely who changed what
file. And if everybody provides a good commit log entry, you never have
to worry about who made what change, when, to which part of the code.
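As a rough illustration of what such a log-to-ChangeLog conversion does (this is not cvs2cl.pl itself, and the record fields `date`, `author`, `files` and `message` are assumptions made up for the example), a minimal Python sketch could look like this:

```python
from collections import OrderedDict


def format_changelog(commits):
    """Render commit records as ChangeLog-style text, newest first.

    Each record is a dict with the hypothetical keys 'date' (ISO string),
    'author', 'files' (list of paths) and 'message'.
    """
    groups = OrderedDict()
    # ISO dates sort correctly as plain strings; newest entries come first.
    for commit in sorted(commits, key=lambda c: c["date"], reverse=True):
        key = (commit["date"], commit["author"])
        groups.setdefault(key, []).append(commit)

    lines = []
    for (date, author), entries in groups.items():
        # One header line per (date, author) pair, then one item per commit.
        lines.append(f"{date}  {author}")
        lines.append("")
        for entry in entries:
            files = ", ".join(entry["files"])
            lines.append(f"\t* {files}: {entry['message']}")
        lines.append("")
    return "\n".join(lines)
```

The point of the sketch is only that every entry names the date, the committer and the affected files, which is the information section 2a asks for.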
Regardless of what the laws say on how copyright works (I don't even
know myself), many people contributing a patch just don't care what
license it falls under and would find it silly that someone even thinks
they need to provide one. (i.e., unless otherwise stated, any submitted
patch should be considered public domain so that it can be integrated
into any codebase no matter what license)
If someone doesn't feel this way, how can they rightfully submit a patch
without including a statement otherwise? Coming back later to say "hey,
you applied my patch 2 years ago, I want credit up there with the people
who made the project work in the first place and refuse to allow that
file/project to fall under license foo" is silly.
thoughts?
- There has been at least one occasion where I've "fixed[1]" gnu
software, submitted the patch, but never followed through on all the
release stuff that GNU requires.
- Getting your boss to sign a release form for a 4 line patch is more
trouble than it's worth... Yes, I'm lazy.
- It'd be nice if there was some minimum size of code that was always
"fair use". I may be wrong, but can't you use 4 secs of any recorded
song without violating copyright? There's a similar "fair use" limit for
copyrighted text as well, isn't there?
- I'd put the limit at 4k.
- Booker C. Bense
[1]- it was a performance tweak to remote tar on the Cray YMP.
I have a prediction -
Large chunks of the GPL will be deemed unenforceable in court.
This will be one of them.
This is a good thing. The goal of open source (well, my
goal for doing open source) is to avoid all the crap copyright
introduces. Where the GPL does not succeed in doing that, I will be
happy to see it tossed into the courtroom trashcan.
-Bram Cohen
Unfortunately, the primary support paradigm for open-source software has
been centered around this premise: "The source is available, so if
you want to know what's going on, read it." This is an
inefficient, and crappy, I might add, position to put someone in who has
to explain to a customer what changes were made between two packages that
may only differ in one revision number, but may have a buttload of
differing patches.
As someone who spends more time doing support hacks and translating
business-to-geek-to-business than actually looking at code, it's
disconcerting to have to always pull up emacs/vi/more/less to see what,
exactly, is going on. It's important to keep accurate and constant
changelogs, both in CVS, as well as in packaging spec files for
deb/rpm/whatever.
One of the ongoing criticisms of open-source software is that it doesn't
have support. Now, companies such as VA Linux, Red Hat, LinuxCare,
Lineo, etc. have proved this false, but only from the standpoint of
blame. This just answers the question of "Who do I sue
when a failure of open-source software costs me money?"
Support means much more than this question, although to your customers
it is probably 90% of the road to get there.
Those who make the pretense to support Linux should all have a database
similar to Freshmeat entries, where each and every change to the
software is documented and announced. Now, these don't have to be
individually done, but a batch announcement along these lines is very
effective, and is easily understandable by the customer while
simultaneously giving them that "warm and fuzzy" feeling:
Changed foo.c to eliminate ongoing memory leak. Bugzilla #11111
Added changes to bar.h, bar.c, and baz.c to add support for the
Microsoft CTRL-ALT-DELETE Widget 3.51
(Where the bugzilla bug and the home for the MS widget code are
hyperlinked, so further research can be done at the customer's
leisure.)
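To make the hyperlinking idea concrete, here is a small Python sketch that turns "Bugzilla #NNNNN" references in a changelog entry into HTML links. The host name in `BUG_URL` is a placeholder invented for the example, and the function name is hypothetical; only the `show_bug.cgi?id=` path follows the usual Bugzilla convention.

```python
import html
import re

# Hypothetical Bugzilla host; replace with the real installation.
BUG_URL = "https://bugzilla.example.org/show_bug.cgi?id={id}"

_BUG_REF = re.compile(r"Bugzilla #(\d+)")


def link_bugs(entry, url_template=BUG_URL):
    """Return the entry as HTML with each 'Bugzilla #N' turned into a link."""
    # Escape the free text first so that '<' or '&' in an entry stays harmless.
    escaped = html.escape(entry)

    def to_link(match):
        url = url_template.format(id=match.group(1))
        return f'<a href="{html.escape(url)}">{match.group(0)}</a>'

    return _BUG_REF.sub(to_link, escaped)
```

Something like this could run over RPM .spec changelogs or release announcements, provided committers stick to one agreed way of writing bug references.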
Maybe I'm talking out of my ass, and maybe it is an unrealistic
expectation to set WRT open-source developers, but I put the onus on the
above companies to lead by example. Red Hat and Mozilla both have
public Bugzilla DB's, and this is commendable, but linking RH's Bugzilla
to the RPM .spec changelogs would truly be an awesome
cross-referencing tool for people who use RH Linux seriously, not to
mention a way to enhance the usefulness of both sources of information
greatly...
As always, just my $0.02.
rcs2log?, posted 20 Oct 2000 at 05:28 UTC by mibus »
(Journeyer)
mjw: what's wrong with rcs2log?
What you have described is not a barrier caused by the GPL. Instead
it is caused by the copyright assignment requirement used for many (not
all) GNU projects.
Your complaints are equally valid for any project that requires
copyright assignment, be it GPL or some other licence. I agree that
copyright assignment does deter a number of contributors (who wants to
fill out a form, possibly post it by international mail and wait for a
response before being allowed to apply your patch?) but they usually
have their reasons and you should respect that. Remember that you can
always fork the project if it is a real problem.
As for splork's comment, I am more likely to contribute to GPL
covered programs, because I know that my contributions will remain free
and no one will profit from them at the expense of others (note that
this is different from preventing people from profiting from it altogether).
I don't expect the patch to be in the public domain for people to do as
they wish with it. In the case where the submitter doesn't specify a
licence for a patch, I would have thought the default would be the same
licence as the original code.
cvs2cl vs rcs2log, posted 20 Oct 2000 at 10:20 UTC by mjw »
(Master)
mibus:
Nothing is wrong with rcs2log. I just didn't know that it existed.
I quickly looked at rcs2log and it probably does everything you want.
But cvs2cl has a few more options such as XML output, the use of regular
expressions to select the files, showing of tags and branches in the
ChangeLog and the option to put ChangeLogs in subdirs (although most of
that is probably easy to emulate with rcs2log).
For other people that didn't know about rcs2log:
rcs2log is a shell script included with Emacs and written by Paul
Eggert.
splork writes:
Coming back later to say "hey, you applied my patch 2
years ago, I want credit up there with the people who made the project
work in the first place and refuse to allow that file/project to fall
under license foo" is silly.
I agree that it would be silly, but I mentioned the traceability of
patches not because all contributors should be credited, but because the
project could be in trouble later if it is difficult to know who wrote
what part of the code. And it is not the fault of the license (GPL or
other) as you and bbense seem to imply; no,
this is because of the copyright law and other laws.
Let's consider a small variation on your scenario: it is not the
contributor who complains two years later, but her current or previous
employer. And that employer decides to sue the project maintainer(s)
for distributing some code that belongs to them.
Many companies and universities have (abusive) contracts stating that
all code written by their employees belongs to the company or that the
company shares the rights with the employee. Depending on the country
or state in which you live, such a clause may not be enforceable or may
be restricted to the code that is written using some equipment provided
by the employer (e.g. company PC). But the fact is that many developers
are bound by such contracts. As a result, the code they write in their
spare time may legally belong (in whole or in part) to their
employer. This is usually not a problem as long as the employer
tolerates that and plays fair. But what if the free software project is
identified later as a competitor to some product sold by that
company?
If a developer submits some patches to a project in good faith and
the only thing that is mentioned in the ChangeLog is "applied patch by
someone@company.com", there is a risk that this company discovers later
that the code was written by one of their employees and tries to prevent
the code from being distributed. The only way to prevent that seems to
be: first, describe exactly what files were affected by the patch (so
that you could remove them in the worst case, or at least rewrite or
remove the tainted parts); second, ask all contributors to get a release
statement from their employer. This only applies if the contribution is
significant, but as far as I know there is no well-defined threshold for
the number of lines of code that could be considered significant.
I would like to know if there is a better way than requiring some
paperwork, but it looks like the law requires this...
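For the first point (recording exactly which files a patch touched), the list can be taken from the patch itself before it is committed. A minimal Python sketch, assuming the patch is in unified diff format and using a hypothetical function name:

```python
def files_in_patch(patch_text):
    """List the files a unified diff modifies, in order of appearance.

    Simplification: any line starting with '+++ ' is treated as a file
    header, so an added line whose content begins with '++ ' would be
    misread. Deleted files (new name /dev/null) are skipped.
    """
    files = []
    for line in patch_text.splitlines():
        if line.startswith("+++ "):
            # Header looks like '+++ path<TAB>timestamp'; keep only the path.
            path = line[4:].split("\t")[0].strip()
            if path != "/dev/null" and path not in files:
                files.append(path)
    return files
```

The resulting list can be pasted into the ChangeLog entry next to the contributor's name, instead of a bare "applied patch by someone@company.com".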
jamesh: Yes, assuming it is under the
same license as the original code is better and more natural. I'm the
same way about preferring to contribute patches to projects under a
license that lets the code live.
Some posters are confusing two separate things: what the GPL requires,
and what the FSF requires for contributions to software that it owns.
These are two separate things.
The FSF requires copyright assignments or disclaimers, plus employer
disclaimers. The GPL, strictly speaking, doesn't require these; the
FSF is just being extra cautious, because they are a potential lawsuit
magnet. Perhaps the FSF is being over-cautious, but the Linux
community is definitely erring in the opposite direction. If you
are a programmer working in the US, and you send in patches to a
free software program without telling your management and getting
approval, you're putting us all at risk of losing the work to cease
and desist orders from your company some time down the road. Read
those papers you signed when you were hired; it is quite likely that
they say the company owns every program you write, even on your own
time, and even every idea you think up. Such clauses may not be legal,
but can we afford to fight your employer in court?
The FSF avoids these fights by asking for a disclaimer from your
employer. This disclaimer would not be necessary for a change that
is too small to be covered by copyright, so for the person who had a
four line change, it shouldn't be a problem.
Another issue is the enforceability of the GPL. Only the copyright
owner has standing to sue. If there are hundreds of owners, it may be
very difficult to stop violators: if only a subset of the owners
wants to sue, the violator could try to get the case thrown out for lack
of standing. Since the FSF owns all of gcc (for example), they have
clear rights to sue.
Linus doesn't like the idea of copyright assignment and that's OK.
But I'd feel better if he at least got employer disclaimers from
major contributors.
I would like to expand a bit on jbuck's
last comment, by giving another example showing that the requirements of
the FSF are not related to the requirements of the GPL. The disclaimers
are necessary to protect the FSF and other receivers of the software,
regardless of the license that is used for the software.
The X Consortium had exactly the same policy as the FSF, although it
used a different license: the X Consortium License (a.k.a. the MIT X
License), which is close to the BSD license. A few years ago (1993-94) I wrote a small
program with a friend of mine, while we were still at the university.
This program, called xsession(1), was
included in the X11R6 contrib distribution. But before including our
program in the distribution, someone from the X Consortium (I forgot his
name) asked us to sign some papers certifying that we were the copyright
owners for the program and that we allowed the X Consortium to
distribute it. We also had to ask some representative of the university
to sign a disclaimer (because the university could have claimed some
rights on the program).
The X Consortium was extra cautious although it was only distributing
the software. They did not ask us to transfer the copyright to them.
The code was released under a license that is simpler than the GPL (2). Yet they asked us and our employer to
sign some papers, because they wanted to be legally safe.
(1) In case you are wondering, xsession was a
(very) simple session manager, allowing you to switch easily between
several window managers. This was useful because some old window
managers were crashing from time to time, and our program was restarting
them automatically instead of sending you back to the console (or the
login screen of the X terminal). This program is outdated now, although
I still use it on some machines.
(2) Whether you consider the X Consortium
License better or worse than the GPL is up to you. Simpler does not
necessarily mean better. Personally (after several good and bad
experiences with various licenses) I prefer the extra protections
offered by the GPL.
- Yes, I realize that I painted the GPL with the FSF brush, I'm sorry.
It's not what I meant. I fully understand why they do that, however it
doesn't make me any more likely to bother with it in the future.
However, I am really interested in this quote.
The FSF avoids these fights by asking for a disclaimer from your
employer. This disclaimer would not be necessary for a change that
is too small to be covered by copyright, so for the person who had a
four line change, it shouldn't be a problem.
- As far as I know, there is no "change that is too small to be covered
by copyright". The 4 line change I submitted lo these many years ago
did require a copyright assignment. That was in 1991; have the laws
changed since?
- Booker C. Bense