Design and Architecture

Q. Where can I find information on how to write a script?

A. Some of it is on this page, but the most up-to-date information is on the Mozilla RelEng readthedocs page.

Q. What does mozharness do?  What problems does it solve for me?

A."Mozharness is a configuration-driven python script harness with full logging that allows production infrastructure and individual developers to use the same scripts."
 
designed to be able to run in multiple environments/platforms/configurations
Since mozharness allows for complex configuration, scripts don't need to hardcode as much behavior, and running a script on a different-but-similar system has a better chance of running (with a new configuration file).
designed to be easier to debug or replicate problems
The logs are often comprehensive enough to debug problems without running.
When they're not enough, mozharness is designed to be able to run scripts standalone, without a full buildbot setup.
designed to be able to iterate over small sections of code more easily
When writing or debugging large pieces of automation, it can be time consuming to run through the entire thing over and over, when you really want to test and iterate on a section. Mozharness's actions allow you to run each action individually, or skip specific actions, or run a subset of the actions easily.

Q. What are the parts of mozharness and how do they all fit together?

A. Mozharness lives at http://hg.mozilla.org/build/mozharness/; its primary parts are as follows:
 
There are three core classes:
 
; mozharness.base.script.BaseScript
: All scripts derive from this class. It provides the logic for actions, creates self.log_obj and self.config (from MultiFileLogger and BaseConfig+ReadOnlyDict), and supplies methods for basic scripting with logging.
: For instance, self.mkdir_p(path) will essentially os.makedirs(path), but will also log that we're doing so (via self.log_obj). self.run_command() and self.get_output_from_command() let the script call other executables easily, while keeping the information in the logs.
: BaseScript.run() makes the script go.
; mozharness.base.config.BaseConfig
: This combines the initial script config, the config file, and the command-line options into a single config, which becomes the BaseScript's self.config. This config is locked (via ReadOnlyDict) at the end of BaseScript.__init__() and dumped to the logs for ease of debugging and more predictable behavior.
; mozharness.base.log.LogMixin
: Almost every mozharness object inherits this mixin at some level, to provide the various log-level methods.
: If self.log_obj isn't set, we fall back to print statements. If it is set to a subclass of BaseLogger (or an otherwise compatible logging object with a log_message() method), we log through that object.
: By default, BaseScript uses MultiFileLogger as its log_obj, which creates a file per logging level.
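To make this concrete, here is a minimal sketch of a script built on these classes. The script, action names, config keys, and commands are all hypothetical; only the BaseScript methods shown (mkdir_p(), rmtree(), run_command(), the log-level methods, run()) are the ones described above:

    # A minimal, hypothetical mozharness script (not a real one from the repo).
    from mozharness.base.script import BaseScript

    class ExampleScript(BaseScript):
        def __init__(self):
            # all_actions/default_actions/config feed into BaseConfig;
            # the merged, locked config becomes self.config.
            BaseScript.__init__(
                self,
                all_actions=['clobber', 'build', 'upload'],
                default_actions=['clobber', 'build'],
                config={'work_dir': 'build'},
            )
        def clobber(self):
            # rmtree() is a logged helper, like mkdir_p()/run_command()
            self.rmtree(self.config['work_dir'])
        def build(self):
            self.mkdir_p(self.config['work_dir'])
            self.info("Building in %s" % self.config['work_dir'])
            self.run_command(['make'], cwd=self.config['work_dir'])
        def upload(self):
            # not in default_actions; enable with --upload or --add-action upload
            self.info("Uploading...")

    if __name__ == '__main__':
        ExampleScript().run()  # parses config, then runs the chosen actions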
There are also a number of other classes and mixins (in mozharness.base.* and mozharness.mozilla.*) that provide useful functionality but aren't required for all script/job types; any list of them here would most likely fall out of date.
 

Q. What classes of mozharness scripts are there (or should there be)?  Tests, tools, etc.?

A. Right now mozharness is slated to replace buildbotcustom as MoCo releng's source of automation logic.

There are/will be scripts in mozharness for builds, tests, repacks, and releases. We may also put machine-maintenance-type scripts here: anything we want to run on the releng build+test farms. At some point we will be able to consider these simply compute farms, with mozharness as what runs on them.
 
Since mozharness is just a python harness, almost anything that can be written in Python could potentially be added to mozharness.

Q. I want to write a mozharness script for a new automation problem. How do I do this?

A. For your first mozharness script, it's probably easiest to use existing, similar scripts as examples; that's definitely easier than writing an entirely new type of script.

Split your script into logical chunks, or actions.

Figure out the different configurations that might be in play; these should be pre-definable in either command-line options or config files. (Complex data structures are best suited to config files; commonly overridden options should be available via the command line.)
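For example, a mozharness config file is just a Python (or JSON) file defining a config dict, and command-line options are declared via config_options. Everything below is made up for illustration:

    # example_config.py: a hypothetical config file. Complex data
    # structures, like this repo list, belong here.
    config = {
        "work_dir": "build",
        "repos": [{
            "repo": "http://hg.mozilla.org/build/tools",
            "dest": "tools",
        }],
    }

    # In the script, commonly overridden options get command-line flags:
    config_options = [[
        ["--work-dir"],
        {"action": "store",
         "dest": "work_dir",
         "default": "build",
         "help": "Specify the script's working directory"},
    ]]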

Coding

Q. What are all the possible mozharness actions, and what dependencies exist between them?

A. This is very much script- or script-family-specific. Any mozharness script can define any arbitrary list of all_actions. Each action has a matching method of the same name (after substituting underscores for dashes).

We attempt to make the actions standalone, so you could potentially run action #4 repeatedly without having to re-run actions 1-3 each time. However, action #4 probably needs actions 1-3 to have run at some point (to clone repos, download installers, etc.), or for the user to have done those steps manually beforehand.
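For example, once the earlier actions have run at least once, you can iterate on a single action like this (the script, config, and action names are hypothetical; --cfg specifies the config file):

    # First run: all default actions
    python scripts/example_script.py --cfg configs/example_config.py
    # Then iterate on just the "package" action without re-running the others:
    python scripts/example_script.py --cfg configs/example_config.py --package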
 
If you're asking which actions have been defined in existing scripts, a
    grep -r 'all_actions\s*=' .
might be a good start. This will change over time, so a static list here would quickly become obsolete.
 
Some actions with pre-written methods inside of mozharness.base.* or mozharness.mozilla.* include:
 
# clobber
# pull
# list-locales (l10n)
# pull-locale-source (l10n)
# passphrase (android signing)
# create-virtualenv (python)
# read-buildbot-config (buildbot interface)
# download-and-extract (tests)
# install (tests)
 
For questions like this, script-family-level documentation may ultimately be more useful than a mozharness-level document like this one: for instance, "What actions do TestMixin- (or LocalesMixin-, or ...?) based mozharness scripts already define, and what do they do?"

Q. I want to use a python library in a mozharness script.  How can I do that? 

A. If it's a python built-in, you can import it as normal. We currently support python 2.5.1 -> 2.7.x, but we're moving towards having python 2.7.3 everywhere.
 
If it's an external package, you most likely need mozharness to install it into your virtualenv (see PythonMixin.create_virtualenv()).
 
If there's a commandline tool that's also installed (say, mozinstall), you can call it from self.run_command(). For instance, TestMixin.install() calls mozinstall like this: http://hg.mozilla.org/build/mozharness/file/3334bfde4eed/mozharness/mozilla/testing/testbase.py#l229 with a preflight check to make sure mozinstall is in the virtualenv, first: http://hg.mozilla.org/build/mozharness/file/3334bfde4eed/mozharness/mozilla/testing/testbase.py#l213.
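A simplified sketch of that pattern (the real version, with its preflight checks, is at the links above; the attribute names here are illustrative):

    def install(self):
        # query_python_path() returns the path to a binary inside the
        # virtualenv that the create-virtualenv action populated.
        mozinstall = self.query_python_path("mozinstall")
        # self.installer_path is assumed to have been set by an earlier
        # action (e.g. download-and-extract).
        self.run_command([mozinstall, self.installer_path],
                         halt_on_failure=True)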
 
If the external package doesn't have a commandline tool, you need to import it. However, since the module won't be available at the beginning of the script run, a bare |import module| will throw an exception before you create the virtualenv.
 
I solved this in SUTDeviceHandler by importing the module in a method that isn't called until after the virtualenv is created.  http://hg.mozilla.org/build/mozharness/file/b8104340b600/mozharness/mozilla/testing/device.py#l444
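The pattern looks roughly like this (the method, module, and config names are illustrative; see the linked device.py for the real code):

    # A top-level "import devicemanagerSUT" would raise ImportError before
    # create-virtualenv runs; importing inside a method that is only
    # called afterwards avoids that.
    def query_devicemanager(self):
        try:
            import devicemanagerSUT
        except ImportError:
            self.fatal("Can't import devicemanagerSUT; did create-virtualenv run?")
        return devicemanagerSUT.DeviceManagerSUT(self.config["device_ip"])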

Q. I want to use a python library in a patch I'd like to contribute to the core mozharness classes. How can I do that?

A. Same as above. A core mozharness class will usually need to handle things in a more generic way than standalone scripts, but that's true of most core classes.

Q. I've just finished my mozharness script, and think it should live inside mozilla-central. Is this possible?

A. It's certainly possible, with the following caveats:

This is complicated by the fact that only a subset of mozharness jobs require the tree, and most of those require a specific branch or revision in production. If the scripts and their logic lived in the tree, we would either need external logic to check out the right branch and revision, or the script would have to update itself to a potentially conflicting version and reload itself mid-run.

Configuration

Q. What should/can I put in a production configuration file? A test configuration file?

A. This actually is fairly script-dependent.

If we were to run, say, http://hg.mozilla.org/build/mozharness/file/bb6851655e8d/scripts/configtest.py in production, we wouldn't necessarily need a config file at all. We'd just need Python 2 (>=2.5) with json or simplejson support.
 
As you add functionality, the requirements grow.
 
* If you need to upload
* If you need to create a virtualenv
* If your production script will run a different set of actions than developers do
* If you're running on the test pool or build pool
* If the script needs mercurial
* If your script will run in ScriptFactory

Each of these adds its own set of required config variables. This list may not be complete and may change, but we can update this documentation over time.

Q. How do I inherit from one configuration file in another one?

A. Currently you can't, and I'd like to discourage that approach.

The upside of flat config files is that it's easy to annotate/blame/diff to see exactly who changed what, when. This is highly useful. The downside is that there are more files to update when changing something that's shared among multiple config files.

Inheriting/including/programmatically creating config has the reverse upside/downside. However, RelEng has gone so far down this route that we're forever trying to figure out what will change if we pull *this* string, or who changed what (and when) that affected this seemingly unrelated thing. We're moving towards dynamically creating flat configs (and checking those flat configs into revision control), which should hopefully give us a combination of upsides with fewer downsides.

If we have to choose one upside/downside combination, it's my (Aki's) strong belief that optimizing for debugging + reading (read: diff/blame/annotate) is more important than optimizing for writing, since we're going to read these configs a lot more than we write them.
 
However, when https://bugzilla.mozilla.org/show_bug.cgi?id=779294 is resolved, you should be able to specify multiple config files on the command line, and there will be rules as to what to do with conflicting config entries. Hopefully this will allow us to easily override things in certain cases, without potentially duplicating large swaths of configs in dozens of files.
 
Also, I don't mind having tools to easily compare config files, or build new ones programmatically, as long as the final flat configs are checked in for easy reading, diff, annotate, blame, etc.

Q. Is there a 'master' production configuration file that I can use?

A. No, and this doesn't necessarily make sense when you consider that "production configuration file" could refer to an Android multilocale build, Android signing, an l10n repack, a b2g build, a Firefox windows desktop unittest, or any other of a number of wildly disparate tasks.

There isn't a lot in common between http://hg.mozilla.org/build/mozharness/file/b8104340b600/configs/marionette/windows_config.py and http://hg.mozilla.org/build/mozharness/file/b8104340b600/configs/multi_locale/release_mozilla-release_android.json, for example.
 
We are starting to see "families" of scripts forming that do have a significant amount of overlap, however. We can consolidate, share, or logically split these up to take advantage of the multiple-config-file support in bug 779294 when that's done. We deliberately didn't try to do this at the outset; now that obvious patterns and use cases are emerging, people can spend time making this easier and more logical.

Q. What is http://hg.mozilla.org/build/mozharness/file/tip/configs/users for?

A. In theory people could check in their configuration files here without affecting production configs. This might not be the best place, and we may very well just remove this location. But if this is useful for people to keep their configs in a shareable, easy-to-access-from-mozharness location, we can keep it.
One benefit of keeping development configs here is that if someone makes a change (e.g., obsoletes a config variable, changes a config variable, or adds a new required config variable), it's easy to search for config files that need updating and either fix them or contact the owner.

Q. Where should production configuration go? Can it go in, say, mozilla-central?

A. It should go in mozharness/configs.

The problem with putting a production configuration file in mozilla-central is that mozharness needs the config file at the beginning of the script run.
So to run the mozharness script, you need something that knows how to check out mozilla-central (or is it another branch? which branch?), get it to the right revision (which revision?), and then call the mozharness script.

Mozharness scripts currently run under ScriptFactory in buildbot, which essentially clones a repo (hg.m.o/build/mozharness in this case), then runs a command line. If the config file isn't on the slave at this point, we've got problems.

Could we allow for this? Possibly. Unless we have a strong set of reasons to change this, however, production configs should go in mozharness/configs.
(The config file can, however, point at external files for additional configuration. See: buildprops.json, talos.json, etc. But the config file itself should live in mozharness/configs/.)

Non-production configs can live wherever.

Interaction with Other Components

Q. How do mozharness and buildbot interact? What do I need to know about buildbot to write a good mozharness script or config? What do I need to know about build or test slaves?

A. As a caveat, for those unfamiliar with our infrastructure, we see mozharness as an easier way to contribute than anything touching buildbot. However, we also want to avoid having multiple projects bottleneck on one or two release engineers staging everything.

We have hooks in buildbotcustom for easier definitions of mozharness tests, at least.

The marionette configs are a good example of how to define additional mozharness tests:
http://hg.mozilla.org/build/buildbot-configs/file/cdde16c17256/mozilla-tests/config.py#l1075

When mozharness unittests and talos merge into mozilla-central ( https://bugzilla.mozilla.org/show_bug.cgi?id=713055 and https://bugzilla.mozilla.org/show_bug.cgi?id=793022 ), the layout of mozilla-tests/config.py may change considerably.

For other non-test automation, you will probably also have to add hooks in buildbotcustom/misc.py. This will likely be a lot more involved, and would best start with a discussion with releng.

Q. How do mozharness scripts and Treeherder interact? How do I get the output of my test run to show up correctly in Treeherder?

A. Currently, Treeherder gets its Buildbot data from http://builddata.pub.build.mozilla.org/buildjson/builds-4hr.js.gz. That json file is generated from buildbot. There is currently no way to inject additional or different data into that file; however, you can use treeherder-client to submit data directly to Treeherder.

In general, when adding a new test, you can add it to Treeherder via something like this:
https://github.com/mozilla/treeherder-service/commit/f976b93b1bdee5d89a7b355f2ab74d93ca6eb9a6

A new build platform:
https://github.com/mozilla/treeherder-service/commit/c370f477e2fa77d1c9efbb772fbe000a19106543

Testing

Q. How can I test my mozharness code changes in Treeherder?

A. You can modify testing/mozharness/mozharness.json to point to your own user repository, and push it to try. Read more here.

NOTE: Use hg branch names instead of bookmarks with mozharness.json
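A hypothetical mozharness.json pointing at a user repo might look like the following (the exact keys may differ; check the in-tree file for the current schema):

    {
        "repo": "https://hg.mozilla.org/users/example_mozilla.com/mozharness",
        "revision": "my-hg-branch-name"
    }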

Q. I want to run a mozharness script in a manner with parity to production. How do I do this?

A. This depends on what you mean by "in a manner with parity to production".

As we add more mozharness scripts and script families to production, we'll have better answers for these questions, but as we're adding swaths of new automation, we're having some growing pains.

* You can test locally, which I encourage. But this isn't necessarily going to test all production platforms, and especially not all production platform configurations.
* You can have buildduty loan you a machine, which would definitely make sure the script can run on production platform configurations. If this covered all production platforms, there would be a strong likelihood of things working when rolling out, but it wouldn't necessarily cover the buildbot <-> script interaction.
* We are also going to help some people set up their own staging environments, which would help test the buildbot portion of the automation as well: https://wiki.mozilla.org/ReleaseEngineering/How_To/Setup_Personal_Development_Master

A project that may help at some point in the future is mozharness try, which would allow for running scripts in an environment similar to production:
https://bugzilla.mozilla.org/show_bug.cgi?id=791924

A lot of testing can be done without having these full environments set up, but not all of it. It depends on the scope and level of risk in the change.
(A dirty not-so-secret: our best attempts at staging everything outside production have still missed some testing for large-scale projects.
We try to have full staging runs that are "with parity" by doing the same things, but in ways that only affect staging reporting or staging uploads. This requires a setup like production, but separate. In practice, this tends to miss small testing elements: for instance, while we have a staging Treeherder instance, it's only for testing Treeherder changes, since it still uses production buildbot data. And since our staging pool of machines is so much smaller, we don't always get full end-to-end testing on all platforms and branches on all affected builders/testers.

For full, 100% parity, the job would likely need to be in production itself, with everything that entails. However, this would most likely negatively affect other people. Usually when testing, you want to avoid affecting production in any way.)

Q. How can I test my mozharness scripts/configs?

A. Absent the above, you can also test locally.

unit.sh runs python nosetests on mozharness. Best practice says we should add tests for whatever new code we write. In practice this hasn't always happened, but it will become more important as the code base grows and more changes come in from multiple developers.

(It's not always possible to write tests that can run locally: for example, if they require a platform or environment you don't have. Someone suggested returning pre-cooked output from "fake" external scripts or tools for testing; we need to investigate this.)

As you write a script, if you're splitting actions into individual chunks as you should, you can test each one and make sure it does what you expect before moving on to the next: does this action clobber the directory as expected? Does this one populate the directory tree as expected? Does this one build or package or install or whatever it does, the way it should?

(You can run an action individually with --ACTIONNAME. You can skip an action with --no-ACTIONNAME. You can add an action to the list of default actions with --add-action ACTIONNAME. You can run two or more actions with --ACTIONNAME1 --ACTIONNAME2.)
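For example (hypothetical script, config, and action names; the flags themselves are the ones described above):

    # Run the default actions, but skip clobber:
    python scripts/example_script.py --cfg configs/example_config.py --no-clobber
    # Add the upload action to the default actions:
    python scripts/example_script.py --cfg configs/example_config.py --add-action upload
    # Run exactly two actions:
    python scripts/example_script.py --cfg configs/example_config.py --clobber --build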