A simple queue worker that produces Mozilla code repository telemetry.
The bin/process-queue-messages script reads messages about Mozilla source
code pushes from the Mozilla
Pulse messaging service.
It adds some data about the code review system used by the commit author and
submits the data to telemetry.mozilla.org
where we can build nifty dashboards.
You will need to create an account on Mozilla Pulse to collect messages about hgpush events.
These programs were designed to run on Heroku and follow the Heroku Architectural Principles. They read their settings from environment variables.
See the file dotenv.example.txt in the project root for the values that must be present in your local and/or Heroku execution environments. To set them up locally:
$ cp dotenv.example.txt .env
$ vim .env
# Add your personal environment's configuration
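A filled-in .env holds the settings as plain KEY=value lines, for example (the values below are only illustrative, borrowed from the testing notes later in this document; dotenv.example.txt is the authoritative list of variables):

PULSE_QUEUE_NAME=hgpush-inbound-test-queue
PULSE_QUEUE_ROUTING_KEY=integration/mozilla-inbound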
Run the following command to check that everything works. It won't send any data:

$ PYTHONPATH=. bin/process-queue-messages --no-send

$ PYTHONPATH=. bin/process-queue-messages

Read all push messages from the hgpush event queue, figure out which review system was used for each, and send the result to telemetry.mozilla.org.
Use --help for full command usage info.
Use --debug for full command debug output.
Use --no-send to gather all the data and build a payload, but do not
send any real pings. All push event messages remain in queues, too. This is
great for testing changes or diagnosing problems against a live queue.
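For background, Pulse is a hosted AMQP (RabbitMQ) broker, so the message flow can be reproduced with a few lines of kombu. The sketch below only illustrates what such a consumer looks like, not how bin/process-queue-messages is implemented; the exchange name, queue name, and credentials are assumptions to replace with your own Pulse account details.

# Sketch: consume hgpush messages from Mozilla Pulse with kombu.
# Exchange/queue names and credentials below are illustrative assumptions.
from kombu import Connection, Consumer, Exchange, Queue

PULSE_URL = "amqps://PULSE_USER:PULSE_PASSWORD@pulse.mozilla.org:5671"
hgpush_exchange = Exchange("exchange/hgpushes/v1", type="topic")
hgpush_queue = Queue(
    "queue/PULSE_USER/hgpush",  # Pulse queue names are namespaced by account
    exchange=hgpush_exchange,
    routing_key="integration/mozilla-inbound",
)

def handle_message(body, message):
    # 'body' is one hgpush-format message describing a push.
    print(body)
    message.ack()

with Connection(PULSE_URL) as connection:
    with Consumer(connection, queues=hgpush_queue, callbacks=[handle_message], accept=["json"]):
        # Wait briefly for messages; raises a timeout error if none arrive.
        connection.drain_events(timeout=10)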
$ PYTHONPATH=. bin/dump-telemetry SOME_COMMIT_SHA1

Calculate and print the ping that would have been sent to telemetry.mozilla.org for a given changeset ID. This command does not send any data to telemetry.mozilla.org. Useful for debugging troublesome changesets and testing service connectivity.
Use --help for full command usage info.
Use --debug for full command debug output.
$ PYTHONPATH=. bin/backfill-pushlog REPO_URL STARTING_PUSHID ENDING_PUSHID

Read the Mercurial repository pushlog at REPO_URL, fetch all pushes from STARTING_PUSHID to ENDING_PUSHID, then calculate and publish their telemetry. This can be used to back-fill pushes missed during service gaps.
Use --help for full command usage info.
Use --debug for full command debug output.
Use --no-send to gather all the data and build a payload, but do not
send any real pings.
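For context, hg.mozilla.org repositories publish their pushlog as JSON, which is the kind of data this script walks over. A rough sketch of fetching a range of pushes directly (the repository URL and push IDs are example values only):

# Sketch: fetch a range of pushes from a Mercurial pushlog's JSON API.
# The repo URL and push IDs are example values only.
import requests

repo_url = "https://hg.mozilla.org/integration/mozilla-inbound"
params = {"startID": 1000, "endID": 1010, "version": 2}
response = requests.get(f"{repo_url}/json-pushes", params=params, timeout=30)
response.raise_for_status()

pushes = response.json()["pushes"]  # version 2 wraps the data in a "pushes" key
for push_id, push in sorted(pushes.items(), key=lambda item: int(item[0])):
    # Each push records the pusher, a timestamp, and the changesets it contained.
    print(push_id, push["user"], push["changesets"])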
Use pyenv to install the Python version listed in the project's .python-version file:
$ pyenv install

Set up a virtual environment, e.g. with pyenv virtualenv.
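For example, with the pyenv-virtualenv plugin (the environment name here is arbitrary):

$ pyenv virtualenv hgpush-telemetry
$ pyenv activate hgpush-telemetry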
Then install the project development dependencies:

$ pip install -r requirements.txt -r dev-requirements.txt

Code formatting is done with black.
requirements.txt and dev-requirements.txt are updated using hashin.
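For example, to add or upgrade a pinned dependency with hashes (the package name and version are placeholders):

$ hashin some-package==1.2.3 -r requirements.txt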
Push event messages are read from a Pulse message queue. You can inspect a live hgpush message queue with Pulse Inspector.
Messages use the hgpush message format.
Push events are generated from the mercurial repo pushlogs.
Pings (telemetry data) are sent to TMO using the hgpush ping schema. Make sure you match the schema or your pings will be dropped!
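One way to catch schema mismatches before they reach the ingestion pipeline is to validate payloads locally with the jsonschema library. The schema file name and payload fields below are illustrative; the hgpush ping schema itself is the source of truth.

# Sketch: validate a ping against a local copy of the hgpush ping schema.
# The schema file name and payload fields are illustrative.
import json
import jsonschema

with open("hgpush.schema.json") as schema_file:
    schema = json.load(schema_file)

ping = {"changesets": ["deadbeef" * 5]}  # stand-in payload, not the real field set
jsonschema.validate(instance=ping, schema=schema)  # raises ValidationError on mismatch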
The unit test suite can be run with py.test.
Manual testing can be done with:
$ PYTHONPATH=. bin/dump-telemetry --debug <SOME_CHANGESET_SHA>

and
$ PYTHONPATH=. bin/process-queue-messages --no-send --debug

If you need a message queue with a lot of traffic for testing you may want to
listen for messages on integration/mozilla-inbound. To switch the message
queue, set the following environment variables:
PULSE_QUEUE_NAME=hgpush-inbound-test-queue
PULSE_QUEUE_ROUTING_KEY=integration/mozilla-inbound
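For example, a dry run against that queue could look like this (assuming the variables are not already set in your .env):

$ PULSE_QUEUE_NAME=hgpush-inbound-test-queue PULSE_QUEUE_ROUTING_KEY=integration/mozilla-inbound PYTHONPATH=. bin/process-queue-messages --no-send --debug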
After deploying a schema change, check these monitors:
- Graph of all pings for the last 8 days (successes and failures)
- List of the last 10 ingested pings (both successful and rejected)
- Reason for the last 10 ping rejections
You can also write custom monitors using hand-crafted CEP dashboards.
Ask in #datapipeline on IRC if you need help with this.