Linux Kernel Development: Getting Started
Copyright under GPLv2
Guan Xuetao, May 2012
http://mprc.pku.edu.cn/~guanxuetao/linux/
Slide 2
Abstract/Goal
Linux development is fast-paced and [as they say in Oregon] things
are different here. This tutorial introduces some of the Linux
culture and how to succeed when working with the Linux development
community.
The kernel's development process
- Differs greatly from proprietary development methods
- May come across as strange and intimidating to new developers
- But there are good reasons and solid experience behind it
Slide 3
Number crunching
One of the largest and most active free software projects in
existence:
> 15 million   SLOC in Jan. 2012
38083          Files in v3.3
~1000          Contributors for each release
190            Subtrees in linux-next
> 100          Patches applied per day
~80            Days per new major kernel release
1              Linus Torvalds
Slide 4
Topics
- Open source development style, values, culture
- Linux rapid development cycle
- How to get your change into the Linux kernel
- Communication methods
- Some best known practices
- Getting involved
Slide 5
Open source development style, values, culture
Slide 6
Development Style, Values, and Culture Learning curve, things
are different Meritocracy good ideas & good code are rewarded
Chance to work on a real OS any parts of it that interest you
Massive amounts of open communication via email, IRC, etc. (i.e.,
not private)
Slide 7
Linux Culture (1) Work in open, not behind closed doors (in
smoke-filled rooms) Community allegiance is very high Do what is
right for Linux Meritocracy: good ideas and good code are rewarded
Often driven by ideals and pragmatism, bottom-up development Not
driven by marketing requirements Don't just take, give back too:
Modifications are & remain GPL (if distributed) Payment in
kind, self-interest Improve software quality, features
used/understood more
Slide 8
Linux Culture (2) Committed to following and using standards
(e.g., POSIX, IETF) Committed to compatibility with other system
software Informal design/development: Not much external high-level
project planning or design docs (maybe some internally at
companies); can appear to be chaotic New ideas best presented as
code, not specifications or requirements RERO: Release Early,
Release Often -- for comments, help, testing, community acceptance
Possible downsides: flames, embarrassment
Slide 9
Linux Culture (3) Development community is highly technical
Motivated and committed, but since many are volunteers, treat them
with respect and ask/influence them, don't tell Continuous code
review (including security) Continuous improvement Have fun!! :)
Follow the culture
Slide 10
Linux Development Values Scratch your own itch Weekenders ->
big business Code, not talk Pragmatism, not theory Thick skin Code
producer makes [most] decisions Pride, principles, ethics, honesty
Performance Hardware & software vendor neutral Technical merit,
not politics, who, or money Maintainability & aesthetics: clean
implementation, not ugly hacks (coding style) Peer review of
patches (technical & style) Contributions earn respect
Slide 11
Some Things to Avoid
- Patents, binary modules, NDA
- Proprietary benchmarks
- Huge patch files
- Adding more IOCTLs
- Marketing
- Design documents
- Mention of accomplishments outside of the open source world
- No patch rationale
- "How do I intercept a system call (or replace a syscall table
  entry)?"
- Making demands instead of requests
  - "This {driver / feature} must be merged, it's important to our
    company."
- Date or release version requirements
Slide 12
Linux rapid development cycle
linux/Documentation/development-process/2.Process
Slide 13
How the development process works
- A loosely time-based release process
- A new major kernel release happens every two or three months
- A rolling development model

Release  Date           Patches  Files changed  Insertions  Deletions
v3.0     Jul. 21, 2011
v3.1     Oct. 24, 2011  9380     9181           726251      602017
v3.2     Jan. 4, 2012   12695    12608          1645447     1417264
v3.3     Mar. 18, 2012  11416    10698          598640      431219

From v3.0 to v3.3: 241 days, 139 patches per day
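The closing figure on this slide can be rechecked from the release dates and per-release patch counts it lists. The snippet below is only an arithmetic sketch. The variable names are illustrative and the numbers are copied from the slide, not measured from a git tree.

```python
from datetime import date

# (release, date, patches merged since the previous release), from the slide.
releases = [
    ("v3.0", date(2011, 7, 21), None),   # starting point, no count given
    ("v3.1", date(2011, 10, 24), 9380),
    ("v3.2", date(2012, 1, 4), 12695),
    ("v3.3", date(2012, 3, 18), 11416),
]

days = (releases[-1][1] - releases[0][1]).days
patches = sum(count for _, _, count in releases if count is not None)
rate = patches / days

print(f"{patches} patches over {days} days = about {rate:.0f} per day")
```

The three release counts add up to 33491 patches over 241 days, which rounds to the 139 patches per day quoted on the slide.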
Slide 14
Typical development cycle
- Merge window
  - The major changes or new features are pulled
  - Deemed to be sufficiently stable (already accepted by the
    development community)
  - At a rate approaching 1,000 per day
  - Lasts for approximately two weeks
- The first -rc kernel is released and the merge window closes
- New -rc kernels are released about once a week
  - Over the next six to ten weeks
  - Only patches which fix problems should be submitted
  - The patch rate will slow over time
- The final 3.x release
  - Normally comes somewhere between -rc6 and -rc9
- The whole process starts over again
Slide 15
The v3.3 development cycle (all dates in 2012)

Release   Date     Note                 Patches
v3.2      Jan. 4   Stable release
v3.3-rc1  Jan. 19  Merge window closed  9460
v3.3-rc2  Jan. 31                       515
v3.3-rc3  Feb. 8                        288
v3.3-rc4  Feb. 18                       370
v3.3-rc5  Feb. 25                       198
v3.3-rc6  Mar. 3                        215
v3.3-rc7  Mar. 10                       241
v3.3      Mar. 18  Stable release       129
Sum       74 days                       11416

Bug convergence: patch counts fall off sharply once the merge window
closes.
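To put a number on the convergence in the v3.3 cycle, the per-rc patch counts from this slide can be summed and split into "merge window" and "stabilization" parts. This is a small sketch using only the slide's figures. The dictionary and variable names are illustrative.

```python
# Patches that arrived in each step of the v3.3 cycle, as listed on the slide.
patches_per_step = {
    "v3.3-rc1": 9460,   # everything pulled during the merge window
    "v3.3-rc2": 515,
    "v3.3-rc3": 288,
    "v3.3-rc4": 370,
    "v3.3-rc5": 198,
    "v3.3-rc6": 215,
    "v3.3-rc7": 241,
    "v3.3": 129,
}

total = sum(patches_per_step.values())
merge_window = patches_per_step["v3.3-rc1"]
stabilization = total - merge_window
share = 100 * merge_window / total

print(f"total: {total}")
print(f"merge window: {merge_window} ({share:.1f}%)")
print(f"fixes after -rc1: {stabilization}")
```

Roughly 83% of the cycle's patches land before -rc1, and the remaining weeks carry fewer than 2000 patches in total.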
Slide 16
The stable/longterm kernel trees
- The "stable" team: Greg Kroah-Hartman
  - The ongoing maintenance of stable kernels
  - The 3.x.y numbering scheme
  - Cc: stable
- To be considered for an update release, a patch must
  1. fix a significant bug
  2. already be merged into the mainline for the next development
     kernel
- The "long term" kernels
  - Purely a matter of a maintainer having the need and the time to
    maintain that release
- The current long term kernels and their maintainers

Kernel      Maintainer          Note
2.6.27      Willy Tarreau       Deep-frozen stable kernel
2.6.32      Greg Kroah-Hartman  Picked up by Ubuntu until 2015
2.6.34      Paul Gortmaker      Wind River
2.6.35      Andi Kleen          Embedded flag kernel
3.0         Greg Kroah-Hartman  For 2 years at the minimum
3.0.x-rtx   Steven Rostedt      Real-time
3.2         Greg Kroah-Hartman  Ubuntu 12.04
Slide 17
The lifecycle of a patch Design Specifying the problem Early
discussion Early review Posted to the relevant mailing list Wider
review To ready for mainline inclusion Show up in the maintainer's
subsystem tree Show up into the -next trees Follow through
Persistent in updating the patch to the current kernel so that it
applies cleanly Keep sending it for review and merging Merging into
the mainline Stable release Long-term maintenance
Slide 18
How patches get into the kernel
[Diagram: patches flow from contributors through subsystem
maintainers to the mainline]
Mainline (kernel.org): Linus Torvalds
Subsystem maintainers:
- ARM: Russell King
- asm-generic: Arnd Bergmann
- Networking: David S. Miller
- Networking drivers: [email protected]
- Wireless networking: John W. Linville
- Documentation: Randy Dunlap
- Staging: Greg Kroah-Hartman
- AKPM: Andrew Morton
- -next tree: Stephen Rothwell
Contributors
Slide 19
The lieutenant system built around the chain of trust Kernel
series maintainers Linus Torvalds Benevolent dictator Avoid sending
patches directly to Linus Subsystem maintainers Gatekeepers,
integrators, tiebreakers or overrulers Coordinate subsystems and
maintain consistency Share the load Lower-level trees maintainers
Chain of repositories Don't have absolute authority Driver
maintainers Other contributors Find the right maintainer
Slide 20
The linux-next tree & staging tree
- The linux-next tree
  - Maintained by Stephen Rothwell
  - Up to 190 trees
  - The primary tree for next-cycle patch merging
  - A snapshot of what the mainline is expected to look like after
    the next merge window closes
  - Subsystem trees are collected for testing and review
  - Recreated and tested automatically every day
- The staging tree
  - Maintained by Greg Kroah-Hartman
  - Where drivers or filesystems that are on their way to being
    added to the kernel tree live
  - A TODO file should be present
  - Code in staging which is not seeing regular progress will
    eventually be removed
Slide 21
Summary for development cycle Rapid development cycle, no
timelines/schedules Only online documentation has a chance of being
up-to-date Accommodate large changes and high rate of change
without regressions Open discussion (mailing lists, archives, not
private) RERO, facilitates testing on a large variety of platforms
Maintainers available and accessible, don't disappear for long
periods of time Test suites Bug tracking
Slide 22
How to get your change into the linux kernel
linux/Documentation/SubmittingPatches
Slide 23
Creating and sending your change
1. diff -up
2. Describe your changes
3. Separate your changes
4. Style check your changes
5. Select e-mail destinations
6. Select your CC (e-mail carbon copy) list
7. Just plain text
8. E-mail size
9. Name your kernel version
10. Don't get discouraged. Re-submit.
11. Include PATCH in the subject
12. Sign your work
13. When to use Acked-by: and Cc:
14. Using Reported-by:, Tested-by: and Reviewed-by:
15. The canonical patch format
16. Sending git pull requests
Slide 24
1/16: diff -up
- Make patches against linux-next or Linus's tree
  - Unless they only apply to some other tree or patchset
- To create patches (unified diff format)
  - diff -uprN
  - git diff
- Files to leave out of a patch are listed in
  - linux/Documentation/dontdiff
  - .gitignore
- Make sure your patch does not include any extra files which do not
  belong in a patch submission.
- Make sure to review your patch after generating it with diff, to
  ensure accuracy.
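For readers who have not looked closely at a unified diff, the sketch below produces one for a two-version toy file. It uses Python's standard `difflib` purely as a stand-in for `diff -u` or `git diff`, which is what you would actually run in a kernel tree. The file name `foo.c` and its contents are hypothetical.

```python
import difflib

# Two versions of a hypothetical source file, as lists of lines.
old = ["int foo(void)\n", "{\n", "\treturn 0;\n", "}\n"]
new = ["int foo(void)\n", "{\n", "\treturn 1;\n", "}\n"]

# The a/ and b/ prefixes mirror the directory level that 'patch -p1' strips.
patch = "".join(
    difflib.unified_diff(old, new, fromfile="a/foo.c", tofile="b/foo.c")
)
print(patch)
```

The output starts with the `---` and `+++` file headers, followed by an `@@` hunk header and the changed lines marked with `-` and `+`. Reviewing a patch before sending it means checking these parts for stray files and unintended changes.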
Slide 25
2/16: Describe your changes
- Describe the technical detail of the change(s) your patch includes.
- Be as specific as possible. The WORST descriptions possible include
  things like:
  - "update driver X"
  - "bug fix for driver X"
  - "this patch includes updates for subsystem X. Please apply."
- The description serves as the commit log
- If the patch fixes a logged bug entry, refer to that bug entry by
  number and URL.
Slide 26
3/16: Separate your changes
- Separate each logical change into its own patch file
- Patch series: an ordered sequence of multiple, related patches
- For example:
  - Bug fixes and performance enhancements for a single driver
  - An API update and a new driver which uses that new API
- The series MUST stay bisectable
  - git bisect
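The bisectability requirement is easier to appreciate with the search itself in view. The sketch below imitates the idea behind `git bisect` with a plain binary search over a hypothetical 16-patch series. It is not how git implements bisection, and the patch names and the `is_bad` test are made up for illustration.

```python
def first_bad(commits, is_bad):
    """Binary-search for the first bad commit.

    Assumes commits[0] is good, commits[-1] is bad, and every commit in
    between builds and can be tested: that is the "bisectable" property.
    Returns (index of first bad commit, number of commits tested).
    """
    lo, hi = 0, len(commits) - 1   # lo: known good, hi: known bad
    tested = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tested += 1
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return hi, tested


# Hypothetical series where patch-11 introduced the regression.
commits = [f"patch-{i:02d}" for i in range(1, 17)]
culprit, tested = first_bad(commits, lambda c: c >= "patch-11")
print(commits[culprit], "found after testing", tested, "commits")
```

If a patch in the middle of the series does not even compile, the test at that point cannot answer good or bad and the search stalls, which is why each patch has to leave the tree in a working state.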
Slide 27
4/16: Style check your changes Coding Style
Documentation/CodingStyle Keep it looking like Linux code for
readability, maintainability, debugging, etc. Check your patches
with the patch style checker prior to submission, and justify all
violations ./scripts/checkpatch.pl
Slide 28
5/16: Select e-mail destinations
- Best choice: look through the MAINTAINERS file and the source code
  - ./scripts/get_maintainer.pl
- Next choice: if no maintainer is listed, or the maintainer does not
  respond, send to the primary Linux kernel developer's mailing list
  - [email protected]
- With no other choice: Linus Torvalds is the final arbiter of all
  changes accepted into the Linux kernel. (Avoid sending him e-mail.)
  - [email protected]
- Do not send more than 15 patches at once to the vger mailing
  lists!!!
Slide 29
6/16: Select your CC list
- Unless you have a reason NOT to do so, CC [email protected]
- Other mailing lists for specific subsystems, such as USB,
  framebuffer devices, the VFS, the SCSI subsystem, etc.
  - http://vger.kernel.org/vger-lists.html
- If changes affect userland-kernel interfaces
  - Send a man-pages patch to the MAN-PAGES maintainer (as listed in
    the MAINTAINERS file)
  - Or at least a notification of the change
- For small patches you may want to CC the Trivial Patch Monkey
  - [email protected]
Slide 30
7/16: Just plain text No MIME, no links, no compression, no
attachments. Submit e-mail "inline" It is important for a kernel
developer to be able to "quote" your changes, using standard e-mail
tools. Be careful of your email client (and your editor) Be wary of
your editor's word-wrap corrupting your patch, if you choose to
cut-n-paste your patch. Documentation/email-clients.txt hints about
configuring your e-mail client so that it sends your patches
untouched git send-email Exception: If your mailer is mangling
patches then someone may ask you to re-send them using MIME.
Slide 31
8/16: E-mail size
- If uncompressed size > 300kB
  - Store your patch on an Internet-accessible server, and provide
    instead a URL (link) pointing to your patch
- Acceptable SLOC for a patch: < 100 lines
Slide 32
9/16: Name your kernel version Make patches against linux-next
or Linus's tree It is important to note, either in the subject line
or in the patch description, the kernel version to which this patch
applies. If the patch does not apply cleanly to the latest kernel
version, the subsystem maintainer will not apply it.
Slide 33
10/16: Don't get discouraged. Re-submit. After you have
submitted your change, be patient and wait. If your patch is
dropped, it could be due to: Your patch did not apply cleanly to
the latest kernel version. Your patch was not sufficiently
discussed on linux-kernel. A style issue. An e-mail formatting
issue. A technical problem with your change. There are tons of
e-mail, and yours got lost in the shuffle. You are being annoying.
When in doubt, solicit comments on linux-kernel mailing list.
Slide 34
11/16: Include PATCH in the subject Prefix your subject line
with [PATCH] Due to high e-mail traffic to Linus, and to
linux-kernel To distinguish patches from other e-mail discussions
If you want to send a patch series --thread option: to make
threaded mails --cover-letter option: to prefix with [PATCH 00/14]
If patches are modified and re-submitted: Prefix with [PATCH
v2]
Slide 35
12/16: Sign your work
- The sign-off: a simple line at the end of the explanation for the
  patch, to certify that you wrote it or otherwise have the right to
  pass it on as an open-source patch
- The rules are pretty simple: if you can certify the Developer's
  Certificate of Origin 1.1 (see next slide for DCO), then you just
  add a line saying
    Signed-off-by: Random J Developer
  - Using your real name
  - No pseudonyms or anonymous contributions
- For a subsystem or branch maintainer: sometimes you need to
  slightly modify patches you receive in order to merge them (if you
  stick strictly to DCO); note the change in brackets between the
  sign-offs
    Signed-off-by: Random J Developer
    [[email protected]: struct foo moved from foo.c to foo.h]
    Signed-off-by: Lucky K Maintainer
- Note that under no circumstances can you change the author's
  identity (the From header), as it is the one which appears in the
  changelog.
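Before sending, it can help to check mechanically that a changelog really ends in well-formed sign-off lines. The following is a minimal sketch, not a replacement for `scripts/checkpatch.pl`. The regular expression is a simplification, and the example.org addresses in the sample message are placeholders invented for this illustration.

```python
import re

# "Signed-off-by: Real Name <address>"; deliberately simple, an assumption
# about the usual shape rather than a full e-mail address grammar.
SIGNOFF = re.compile(
    r"^Signed-off-by: (?P<name>[^<]+?) <(?P<email>[^<>@\s]+@[^<>@\s]+)>$"
)


def signoffs(message):
    """Return the names from all well-formed Signed-off-by lines, in order."""
    names = []
    for line in message.splitlines():
        match = SIGNOFF.match(line)
        if match:
            names.append(match.group("name"))
    return names


# Hypothetical changelog; the addresses are placeholders.
message = """\
foo: move struct foo to the header

Other files need the definition.

Signed-off-by: Random J Developer <random@developer.example.org>
[lucky@maintainer.example.org: struct foo moved from foo.c to foo.h]
Signed-off-by: Lucky K Maintainer <lucky@maintainer.example.org>
"""

print(signoffs(message))
```

The order of the returned names reflects the delivery path: the author first, then each person who passed the patch along.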
Slide 36
Developer's Certificate of Origin 1.1 By making a contribution
to this project, I certify that: (a) The contribution was created
in whole or in part by me and I have the right to submit it under
the open source license indicated in the file; or (b) The
contribution is based upon previous work that, to the best of my
knowledge, is covered under an appropriate open source license and
I have the right under that license to submit that work with
modifications, whether created in whole or in part by me, under the
same open source license (unless I am permitted to submit under a
different license), as indicated in the file; or (c) The
contribution was provided directly to me by some other person who
certified (a), (b) or (c) and I have not modified it. (d) I
understand and agree that this project and the contribution are
public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.
Slide 37
Special note to back-porters
To facilitate tracking, insert an indication of the origin of a patch
at the top of the commit message.
Example 1 in 2.6-stable:
    Date: Tue May 13 19:10:30 2008 +0000

    SCSI: libiscsi regression in 2.6.25: fix nop timer handling

    commit 4cf1043593db6a337f10e006c23c69e5fc93e722 upstream
Example 2 in 2.4:
    Date: Tue May 13 22:12:27 2008 +0200

    wireless, airo: waitbusy() won't delay

    [backport of 2.6 commit b7acbdfbd1f277c1eb23f344f899cfa4cd0bf36a]
Slide 38
13/16: When to use Acked-by: and Cc:
- Signed-off-by:
  - The signer was involved in the development of the patch
  - Or he/she was in the patch's delivery path
- Acked-by:
  - The acker has at least reviewed the patch and has indicated
    acceptance
  - He/she can acknowledge just a part of the patch
  - Patch mergers will sometimes manually convert an acker's "yep,
    looks good to me" into an Acked-by:
- Cc:
  - A person (among the potentially interested parties) has had the
    opportunity to comment on a patch, but has not provided such
    comments
  - The only tag which might be added without an explicit action by
    the person it names
Slide 39
14/16: Using Reported-by:, Tested-by: and Reviewed-by:
Reported-by: To credit the reporter for their contribution Please
note that this tag should not be added without the reporter's
permission, especially if the problem was not reported in a public
forum. Tested-by: To inform maintainers that some testing has been
performed To provide a means to locate testers for future patches
To ensure credit for the testers Reviewed-by: The patch has been
reviewed and found acceptable according to the Reviewer's Statement
(see next slide) To give credit to reviewers and to inform
maintainers of the degree of review which has been done on the
patch Any interested reviewer (who has done the work) can offer a
Reviewed-by tag for a patch.
Slide 40
Reviewer's statement of oversight By offering my Reviewed-by:
tag, I state that: (a) I have carried out a technical review of
this patch to evaluate its appropriateness and readiness for
inclusion into the mainline kernel. (b) Any problems, concerns, or
questions relating to the patch have been communicated back to the
submitter. I am satisfied with the submitter's response to my
comments. (c) While there may be things that could be improved with
this submission, I believe that it is, at this time, (1) a
worthwhile modification to the kernel, and (2) free of known issues
which would argue against its inclusion. (d) While I have reviewed
the patch and believe it to be sound, I do not (unless explicitly
stated elsewhere) make any warranties or guarantees that it will
achieve its stated purpose or function properly in any given
situation.
Slide 41
15/16: The canonical patch format
- The canonical patch subject line (see next slide)
    Subject: [patch 2/5] ext2: improve scalability of bitmap searching
    Subject: [PATCHv2 001/207] x86: fix eflags tracking
- The canonical patch message body
  - A "from" line specifying the patch author.
  - An empty line.
  - The body of the explanation, which will be copied to the
    permanent changelog to describe this patch.
  - The "Signed-off-by:" lines, described above, which will also go
    in the changelog.
  - A marker line containing simply "---".
  - Any additional comments not suitable for the changelog.
  - The actual patch (diff output).
Slide 42
The Subject line format
    Subject: [PATCH tag] subsystem: summary phrase
- The tag
  - Version descriptor: "v1", "v2", "v3"
  - Request for comments: "RFC"
  - Sequence number: 1/4, 2/4, 3/4, 4/4
    - Zero-padded: to sort the emails alphabetically (and
      numerically) by subject line
- The "subsystem"
  - To identify which area or subsystem of the kernel is being
    patched
- The "summary phrase"
  - A globally-unique identifier for that patch
  - Says what the patch changes and why the patch might be necessary
  - Should not be a filename
  - Should not be the same for every patch in a whole patch series
  - No more than 70-75 characters
  - git log --oneline
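The subject-line layout described here is regular enough to check with a few lines of code. The parser below is a rough sketch. The regular expression and the 75-character cutoff are one reading of this slide's rules, not a tool the kernel community ships, and `check_subject` is a name chosen for this example.

```python
import re

SUBJECT = re.compile(
    r"^\[(?P<tag>[^\]]*patch[^\]]*)\]\s+"   # [PATCH v2 1/4], [patch 2/5], ...
    r"(?P<subsystem>[^:\s][^:]*):\s+"       # ext2:, x86:, ...
    r"(?P<summary>\S.*)$",                  # the summary phrase
    re.IGNORECASE,
)
MAX_SUMMARY = 75  # upper end of the "70-75 characters" guideline


def check_subject(subject):
    """Split a subject into its parts, or return None if it does not fit."""
    match = SUBJECT.match(subject)
    if match is None:
        return None
    parts = match.groupdict()
    parts["summary_fits"] = len(parts["summary"]) <= MAX_SUMMARY
    return parts


print(check_subject("[PATCHv2 001/207] x86: fix eflags tracking"))
print(check_subject("update driver X"))
```

The first call returns the tag, subsystem and summary phrase of one of the canonical examples, while the second returns `None` because the subject has neither a `[PATCH]` prefix nor a subsystem.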
Slide 43
The from line and marker line
- The "from" line
    From: Original Author
  - Must be the very first line in the message body
  - If the "from" line is missing, then the "From:" line from the
    email header will be used to determine the patch author in the
    changelog.
- The marker line "---"
  - Serves the essential purpose of marking for patch handling tools
    where the changelog message ends
  - Additional comments may follow the marker line:
    - Diffstat: to show what files have changed, and the number of
      inserted and deleted lines per file
    - Patch changelogs: to describe what has changed between the v1
      and v2 version of the patch
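To see what the marker line does for patch handling tools, here is a small sketch that splits a message body at the first line consisting solely of `---`. It only illustrates the rule on this slide and is not the logic of any real tool. The message text and the example.org address are invented.

```python
def split_at_marker(body):
    """Return (changelog, rest), split at the first line that is exactly '---'.

    Everything before the marker goes to the permanent changelog; the rest
    (extra comments, diffstat, the diff itself) does not.
    """
    lines = body.splitlines()
    for i, line in enumerate(lines):
        if line == "---":
            return "\n".join(lines[:i]), "\n".join(lines[i + 1:])
    return body, ""


# Hypothetical message body; the author and address are placeholders.
body = (
    "From: Original Author <author@example.org>\n"
    "\n"
    "foo: fix return value\n"
    "\n"
    "Signed-off-by: Original Author <author@example.org>\n"
    "---\n"
    "v2: rebased on the latest -rc\n"
    "\n"
    "--- a/foo.c\n"
    "+++ b/foo.c\n"
)

changelog, rest = split_at_marker(body)
print(changelog)
```

The v2 note lands in `rest`, so it reaches reviewers without ending up in the commit history. The `--- a/foo.c` diff header is not mistaken for the marker because the line must match `---` exactly.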
Slide 44
16/16: Sending "git pull" requests
- The proper format
    Please pull from
        git://jdelvare.pck.nerim.net/jdelvare-2.6 i2c-for-linus
    to get these changes:
- Do write the git repo address and branch name alone on the same
  line
- To generate the diffstat
    git diff -M --stat --summary
  - -M enables rename detection
  - --summary enables a summary of new/deleted or renamed files
- git request-pull
Slide 45
Summary for submitting patches (1) Patch current mainline from
kernel.org or linux-next or target-specific trees Send patches to
subsystem maintainer, driver maintainer, & mailing list Each
patch (re-)submission should include feature justification and
explanation, not just the patch Use the DCO (Signed-off-by: Your
Name [email protected]) Patches should be encapsulated (self-contained)
as much as possible, not touching other code (when that
makes sense)
Slide 46
Summary for submitting patches (2) ONE patch per email, logical
progression of patches, not mega-patches, not attached and not
zipped (cannot review/reply) Don't do multiple things in one patch
(like fix a bug and do some cleanup) Check your email client: send
a patch to yourself and see that it still applies (doesn't damage
whitespace, line breaks, content changed) before going public with
it Patch must apply with 'patch -p1'; i.e., use expected directory
levels Don't use PGP or GPG with patches, they mess up patch
scripts
Slide 47
Communication methods
Slide 48
Communications
- Communicating is hard, let's go shopping
- Writing ideas/thoughts down is good (but too wordy may be ignored)
- Participate constructively
- Mailing lists & archives (newsgroups)
- Working in open/public (technical readers/writers) vs.
  embarrassment
- Discussion and decisions on lists, no meetings required
- Work through consensus (with exceptions)
- Project web pages, IRC channels
- Developer conferences
Slide 49
Mailing List Etiquette
- Use Reply-to-All, threaded (Message-ID, References)
    > > Try A or B.
    > I prefer A, sound OK?
    yes
- Be prompt with replies (being responsive is important)
- No encoded or zipped attachments (inline preferred, text/plain
  attachments OK); others are often ignored
- No HTML or commercial email, no auto-replies (OOO/vacation)
- ALL CAPS == SHOUTING; rude, don't do it
- Use < 80-column width lines (70-72 is good) for text (not for
  patches)
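The line-width rule is easy to apply to your own prose with a few lines of code. This sketch re-wraps a draft reply to 72 columns using Python's standard `textwrap` module. The helper name `wrap_for_list` and the draft text are made up, and the function must only be run on fresh prose, never on quoted text or on a patch, since re-wrapping would corrupt both.

```python
import textwrap


def wrap_for_list(text, width=72):
    """Re-wrap each blank-line-separated paragraph to at most `width` columns."""
    paragraphs = text.split("\n\n")
    return "\n\n".join(textwrap.fill(p, width=width) for p in paragraphs)


draft = (
    "I tested the series on two machines and both boot fine, but the second "
    "patch needs a better changelog before it can go further.\n\n"
    "Could you also say which kernel version the series applies to?"
)

wrapped = wrap_for_list(draft)
print(wrapped)
```

Most mail clients can do this on their own when configured for plain text. The point of the sketch is only to show what "70-72 columns" means in practice.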
Slide 50
Mailing List Etiquette (2)
- Keep it technical and professional.
- If attacked (flamed), stick with technical points, don't get
  involved with attacks, & move on.
- Trim replies (body) to relevant bits (don't modify To:/Cc:
  recipient list).
- Don't cross-post to closed mailing lists.
- Non-English speakers
- Resources:
  - http://www.arm.linux.org.uk/armlinux/mletiquette.php
  - RFC 1855: Netiquette Guidelines:
    http://www.ietf.org/rfc/rfc1855.txt
Slide 51
No top-posting
A: http://en.wikipedia.org/wiki/Top_post
Q: Where do I find info about this thing called top-posting?
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?
A: No.
Q: Should I include quotations after my reply?
Slide 52
Some Good Terms to Use Simpler Deletes N lines of code Faster
(with data) Smaller (with data) Here's the code.... Series of small
patches.... Tested... (how many configs) Builds on 8
architectures
Slide 53
Some Best Known Practices
Slide 54
Some Best Known Practices (1) Send patches directly to their
intended maintainer for merging (they don't troll mailing lists
looking for patches to merge) Copy patches to the appropriate
mailing list(s), not private (don't work in isolation) Subscribe to
relevant mailing lists (or use one representative for this) Listen
to review feedback and promptly respond to it
Slide 55
Some Best Known Practices (2) Some maintainers do not
acknowledge when they merge a patch; you just have to keep watching
Use correct 'diff' directory level (linux/ top-level directory) and
options (-up) Use source code to convey ideas Generate patch files
against the latest development tree branch (-rcN) or mainline
kernel if there is no current development branch Make focused
patches or a series of patches, not large patches that cover many
areas or that just synchronize a (CVS) repository with the kernel
source tree
Slide 56
Some Best Known Practices (3) Use an email client that supports
inserting patches inline (not as attachments) Begin with small
patches: use kernel-janitors mailing list e.g. For larger patches
or complete drivers or features, use the kernel-mentors mailing
list (for beginner feedback, comments and corrections) Don't post
private email replies to a public mailing list (without permission)
Don't introduce gratuitous whitespace changes in patches
Slide 57
Some Best Known Practices (4) Back up your patch with
performance data (if applicable) Don't add binary IOCTLs unless
there are no other acceptable options; use sysfs (/sys) or
private-fs or debug-fs or relayfs or netlink if possible Make Linux
drivers that are native Linux drivers, not a shim from another OS
Don't introduce kernel drivers if the same functionality can be
done reasonably in userspace Try to be processor- and
distro-agnostic (except for CPU-specific code) Don't be afraid to
accept patches from others
Slide 58
Some Best Known Practices (5) Keep your patch(es) updated for
the current kernel version Resubmit patches if they are not
receiving comments Release early, release often Open, public
discussion on mailing lists One patch per email Large patches
should be split into logical pieces and mailed as a patch series
Make testing tools available & easy to use; your device(s) will
get better testing
Slide 59
Getting involved
Andrew Morton gives this advice for aspiring kernel developers
(http://lwn.net/Articles/283982/):
The #1 project for all kernel beginners should surely be "make sure
that the kernel runs perfectly at all times on all machines which you
can lay your hands on". Usually the way to do this is to work with
others on getting things fixed up (this can require persistence!) but
that's fine - it's a part of kernel development.
Slide 60
Getting Involved in Kernel Development
- Testing, feedback/results
- Learn some basics at http://kernelnewbies.org
- Find some small tasks that are identified at
  http://kernelnewbies.org/KernelJanitors
- Focus on an area that you are interested in (many to choose from)
- Can just fix compile/build/sparse problems as an introduction
- Fix bugs in the bugzilla database at http://bugzilla.kernel.org
- Add to kernel documentation
Slide 61
Reference
Linux-mentoring
Linux Kernel Development: Working in the Community: Social/Cultural
Engineering Issues
By Randy Dunlap [email protected]
Various versions at: http://www.xenotime.net/linux/mentor/